npm - @hegemonart/get-design-done - Versions diffs - 1.46.0 → 1.48.0 - Mend

@hegemonart/get-design-done 1.46.0 → 1.48.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (35) hide show

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +94 -0
package/README.md +4 -0
package/SKILL.md +2 -1
package/agents/brief-auditor.md +147 -0
package/agents/copy-auditor.md +215 -0
package/agents/design-auditor.md +13 -3
package/agents/design-debt-crawler.md +269 -0
package/agents/design-fixer.md +2 -0
package/agents/quality-gate-runner.md +11 -10
package/dist/claude-code/.claude/skills/brief/SKILL.md +17 -0
package/dist/claude-code/.claude/skills/live/SKILL.md +98 -0
package/dist/claude-code/.claude/skills/quality-gate/SKILL.md +2 -2
package/hooks/gdd-a11y-gate.js +119 -0
package/hooks/hooks.json +8 -0
package/package.json +1 -1
package/reference/brief-quality-rubric.md +98 -0
package/reference/copy-quality.md +135 -0
package/reference/debt-categories.md +148 -0
package/reference/live-mode-integration.md +80 -0
package/reference/registry.json +28 -0
package/reference/schemas/events.schema.json +1 -1
package/reference/schemas/live-session.schema.json +64 -0
package/scripts/lib/live/bandit-feed.cjs +64 -0
package/scripts/lib/live/events.cjs +86 -0
package/scripts/lib/live/harness-mode.cjs +93 -0
package/scripts/lib/live/postcheck.cjs +158 -0
package/scripts/lib/live/runtime.cjs +233 -0
package/scripts/lib/live/scope-guard.cjs +145 -0
package/scripts/lib/live/session-store.cjs +364 -0
package/scripts/lib/manifest/skills.json +8 -0
package/skills/brief/SKILL.md +17 -0
package/skills/live/SKILL.md +98 -0
package/skills/quality-gate/SKILL.md +2 -2

package/reference/brief-quality-rubric.md ADDED Viewed

@@ -0,0 +1,98 @@
+# Brief Quality Rubric
+The five anti-patterns `agents/brief-auditor.md` grades `.design/BRIEF.md` against. Each entry pairs a
+definition with a good and bad example, the detection signal the auditor greps for, and a severity note.
+This rubric is advisory: a flagged brief still proceeds to explore. The point is to surface vagueness
+while the cost of fixing it is one sentence, not a redesign.
+A brief is the contract every later stage checks against. A vague brief produces an unverifiable cycle,
+because verify has nothing concrete to test. The auditor reads the brief once and writes findings to
+`.design/BRIEF-AUDIT.md`; the brief skill then offers `/gdd:discuss brief` when any anti-pattern fires.
+---
+## AP-1: Vague verbs without a metric
+**Definition:** The problem or goal uses a soft verb (improve, optimize, streamline, enhance, modernize,
+refresh) with no number, threshold, or observable change attached. The verb hides the actual target.
+- **Bad:** "Improve the checkout flow."
+- **Good:** "Cut checkout abandonment from 38 percent to under 25 percent on mobile."
+**Detection signal:** Match soft verbs (`improve`, `optimize`, `streamline`, `enhance`, `modernize`,
+`refresh`) in the Problem or Success Metrics sections, then check the same sentence for a digit, a
+percent sign, or a unit. A soft verb with no adjacent quantity is a hit.
+**Severity:** Major. A goal with no metric cannot be verified, so the whole cycle inherits the ambiguity.
+---
+## AP-2: Missing audience
+**Definition:** The brief never names who the design is for. No role, device, context, or skill level is
+stated, so every later trade-off (density, reading level, input model) is a guess.
+- **Bad:** "Build a dashboard for tracking orders."
+- **Good:** "Build an order dashboard for warehouse leads on a shared floor tablet, glanceable at arm's length."
+**Detection signal:** Read the Audience section. Flag when it is empty, a placeholder (`TBD`, `users`,
+`everyone`, `all users`), or names no role plus context. A single generic noun with no qualifier is a hit.
+**Severity:** Major. Audience drives density, tone, and accessibility floor; without it the design optimizes
+for no one.
+---
+## AP-3: Immeasurable success criteria
+**Definition:** Success is described in feelings rather than observables (looks modern, feels clean, is
+intuitive, delights users). There is no event, count, or threshold a verifier could check.
+- **Bad:** "Users should feel the app is fast and modern."
+- **Good:** "First contentful paint under 1.5 seconds; task completion rate above 90 percent in five tests."
+**Detection signal:** Scan Success Metrics for subjective adjectives (`modern`, `clean`, `intuitive`,
+`delightful`, `nice`, `beautiful`) with no paired number or pass/fail condition. Subjective-only criteria
+are a hit.
+**Severity:** Major. Verify cannot grade a feeling; immeasurable criteria collapse the verify gate.
+---
+## AP-4: Scope creep
+**Definition:** The Scope section lists more than the cycle can deliver, or mixes unrelated surfaces into
+one brief, so the in-scope line stops constraining anything.
+- **Bad:** "Redesign onboarding, billing, settings, the marketing site, and add dark mode."
+- **Good:** "In scope: the three-step onboarding flow. Out of scope: billing, settings, marketing site."
+**Detection signal:** Count distinct surfaces or top-level features named as in-scope. More than three
+unrelated surfaces in one brief, or an in-scope list with no matching out-of-scope line, is a hit.
+**Severity:** Minor. Wide scope is recoverable by splitting, but unsplit it inflates every later estimate.
+---
+## AP-5: Missing anti-goals
+**Definition:** The brief states what to build but never what to avoid. With no anti-goals, explore widens
+to fill the vacuum and the design picks up patterns the team never wanted.
+- **Bad:** (Scope lists features only; no "we are deliberately not doing X" line anywhere.)
+- **Good:** "Anti-goals: no new navigation paradigm, no carousel, do not touch the existing auth screens."
+**Detection signal:** Look for an explicit non-goal, anti-goal, or out-of-scope statement framed as a
+prohibition (`do not`, `avoid`, `no new`, `out of scope`). A brief with zero prohibition statements is a hit.
+**Severity:** Minor. Anti-goals prevent drift; their absence is a warning, not a blocker.
+---
+## How findings are scored
+The auditor reports a count of fired anti-patterns and lists each with its section and the matched text.
+It does not compute a pass/fail gate and it does not block the brief to explore transition. Major findings
+(AP-1, AP-2, AP-3) carry more weight in the summary line than Minor findings (AP-4, AP-5), so the user
+knows which gaps most threaten a verifiable cycle. When any anti-pattern fires, the brief skill surfaces a
+one-line pointer offering `/gdd:discuss brief` to refine before moving on.

package/reference/copy-quality.md ADDED Viewed

@@ -0,0 +1,135 @@
+---
+name: copy-quality
+type: heuristic
+version: 1.0.0
+phase: 48
+tags: [copy, microcopy, ux-writing, ctas, errors, empty-states, aria, alt-text, i18n, voice, pillar-1]
+last_updated: 2026-06-03
+---
+# Copy Quality - Microcopy Rubric
+This file is the rubric the Copy pillar (Pillar 1 of `agents/design-auditor.md`) scores against, and the detailed reference that `agents/copy-auditor.md` applies. The design-auditor's Pillar 1 carries the 1-4 score and a one-line finding; this file holds the per-category criteria, the failure patterns, and the i18n lens that produce that score.
+Voice and tone are owned by [`brand-voice.md`](./brand-voice.md): the five axes (formal/casual, serious/playful, expert/approachable, reverent/irreverent, authoritative/collaborative), the 12 archetypes, and the tone-by-context table. This file does not restate those. It checks whether the implemented strings honor the voice the product declared, and whether each microcopy surface does its specific job.
+Cultural and locale meaning is owned by [`rtl-cjk-cultural.md`](./rtl-cjk-cultural.md); engineering primitives for localized strings are owned by [`i18n.md`](./i18n.md). This file consumes the i18n text-expansion table as a copy-length lens (see Internationalization Lens below).
+## Why Copy Is a Pillar
+Generic or default copy is the most common quiet failure in shipped UI. A "Submit" button, a "No data" empty state, and a raw stack-trace error each pass a functional test and each fail the user. Specific, purposeful language is a craft signal: it shows the team pictured a real person reading the screen. The Copy pillar measures that gap between functional text and intentional text.
+## Category Rubrics
+Each category below lists what good looks like and the failure patterns to grep for. A category is "weak" when the failure patterns dominate; "absent" when the surface exists with no considered copy at all.
+### Button and CTA labels
+Good labels are verb-first and name the object: "Export CSV", "Add member", "Send invite", "Delete project". The user reads the label and knows the outcome before clicking. Primary actions carry the specific verb; secondary actions ("Cancel", "Back") may stay plain because they are reversible and conventional.
+Failure patterns: bare "Submit", "OK", "Go", "Click here", "Button", or a noun with no verb ("Form", "Settings" on an action). A primary action labeled "Save" with no object on a screen that saves several distinct things is weak, not broken.
+```bash
+grep -rEn ">(Submit|Click Here|OK|Go|Button|Done)<" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
+```
+### Error messages
+Good error copy names the cause and the recovery, and never blames the user. State what happened, then what to do: "We could not save your changes. Check your connection and try again." For validation, point at the field and the fix: "Enter a date in the future." The tone matches the stakes (see the tone-by-context table in `brand-voice.md`); data-loss and payment failures are calm and serious, never playful.
+Failure patterns: raw codes or stack traces shown to users ("Error 500", "undefined is not a function"), blame language ("You entered an invalid value", "Your input was wrong"), and dead ends that state the failure with no next step.
+```bash
+grep -rEn "went wrong|Error [0-9]|invalid input|you entered|try again" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
+```
+### Empty states
+Good empty states orient and offer the first action. They explain why the screen is empty and give one clear thing to do: "No projects yet. Create your first project to get started." A first-run empty state is an onboarding moment, not an error.
+Failure patterns: "No data", "Nothing here", a bare zero, or an empty container with no copy. An empty state that explains the emptiness but offers no action is weak.
+```bash
+grep -rEn "No data|No results|Nothing here|No items|empty" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
+```
+### Loading and skeleton copy
+Good loading copy reassures and signals progress without nagging: "Getting your data...", "Almost there." Skeleton screens are preferred to spinner-plus-text for content that has a known shape; when text is shown, it is brief and specific to what is loading. Long operations name the step ("Uploading 3 of 12 files").
+Failure patterns: "Loading..." with no context on a multi-second operation, a spinner with no label on a process that can fail, or jokey loading copy on a high-stakes action.
+```bash
+grep -rEn "Loading|Please wait|spinner|Skeleton" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
+```
+### ARIA text quality
+ARIA strings are copy the screen-reader user hears. They must describe purpose, not implementation. `aria-label="Close dialog"` is good; `aria-label="button"` or `aria-label="icon"` is noise. Live-region announcements (`aria-live`, `role="status"`, `role="alert"`) carry the same cause-plus-recovery standard as visible errors. Labels must not duplicate adjacent visible text in a way that makes the reader say it twice.
+Failure patterns: `aria-label` set to the element type, `aria-label` that restates a visible label verbatim, missing `aria-label` on icon-only controls, and silent live regions on async state changes.
+```bash
+grep -rEn "aria-label=\"(button|icon|link|image|click)\"" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
+```
+### Alt-text quality
+Good alt text conveys the function or meaning of an image in context. A logo links home: `alt="Acme home"`, not `alt="logo"`. A decorative image takes empty alt (`alt=""`) so the reader skips it. An informative chart names its takeaway, not its file. Alt text never starts with "image of" or "picture of"; the reader already knows it is an image.
+Failure patterns: `alt="image"`, `alt="photo"`, filename alt (`alt="IMG_2043.jpg"`), missing alt on informative images, and non-empty alt on purely decorative images.
+```bash
+grep -rEn "alt=\"(image|photo|picture|img|logo)\"|alt=\"[^\"]*\\.(png|jpg|jpeg|svg|webp)\"" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
+```
+### Form labels, helper, and validation copy
+Every input has a persistent visible label (placeholders are not labels). Helper text sets expectations before the user types ("Use 8 or more characters"). Validation copy is specific and arrives at the right time: inline after blur for format rules, on submit for cross-field rules. Required and optional states are stated in words, not color alone.
+Failure patterns: placeholder-as-label, helper text that only appears as an error after failure, generic "This field is required" with no field name when several are blank, and validation copy that says what is wrong but not how to fix it.
+```bash
+grep -rEn "placeholder=|required|This field|is invalid" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
+```
+### Voice and tone alignment
+Read the declared voice from `.design/DESIGN-CONTEXT.md` (axis positions and archetype, if recorded) and check the implemented strings against it. A Caregiver health app with a curt, blaming error message is off-voice even if the error is technically clear. A formal finance app with "Oops!" on a failed transfer is off-voice. Use the tone-by-context table in `brand-voice.md` as the per-surface contract: error, empty state, success, onboarding, loading, destructive confirmation each have a recommended register.
+Failure patterns: one tone for marketing copy and a flatly different tone for in-product copy, playful language on high-stakes actions, and copy that contradicts the declared archetype.
+## Internationalization Lens
+Apply this lens to copy-heavy components (anything where text drives width: buttons, nav, tabs, chips, table headers, banners). It folds the [`i18n.md`](./i18n.md) text-expansion table into the copy score.
+Two probes:
+1. **Hardcoded user-facing strings.** Strings rendered to users should flow through the project's i18n layer, not sit as literals in JSX. A literal English string in a component is a copy defect for any product that ships, or plans to ship, more than one locale.
+   ```bash
+   grep -rEn ">[A-Z][a-z]+ [a-z]+.*<|aria-label=\"[A-Z][a-z]+ " src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
+   ```
+2. **Expansion-overflow at +40%.** German expands English by about +30%, Russian by about +40% (see the expansion table in `i18n.md`). A fixed-width control sized to English is the worst case. The lens: for each copy-bearing component with a width constraint, ask whether the label survives a +40% expansion without clipping or truncating mid-word. Containers sized to `EN base x 1.4` clear the worst row of that table. Flag fixed pixel widths on text controls and single-line labels with no wrap or ellipsis strategy.
+   ```bash
+   grep -rEn "w-\[[0-9]+px\]|width:\s*[0-9]+px|truncate|whitespace-nowrap" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
+   ```
+A copy-heavy component that hardcodes strings, or that clips at +40%, drops the Copy pillar by one point and appears in the findings with the `i18n_readiness` lens tag (see `audit-scoring.md` Lens-Tags).
+## Scoring Guide (Copy Pillar, 1-4)
+This scale matches the 1-4 definitions in `agents/design-auditor.md`. The Copy pillar score is a single 1-4 value; the categories above supply the evidence.
+| Score | Label | Criteria |
+|-------|-------|----------|
+| 4 | Exemplary | CTAs are verb-object specific; error messages name cause and recovery without blame; empty states orient and offer a first action; ARIA and alt text describe purpose; form copy guides before and during input; voice matches the declared axes and archetype; copy-heavy components survive a +40% expansion and route strings through i18n. |
+| 3 | Solid | Most copy is intentional; one or two generic labels remain (a plain "Save" on a single-purpose form); error and empty states are present and human but plain; minor voice drift only. |
+| 2 | Present but weak | Mix of intentional and generic copy; some empty states missing; errors show raw codes or blame; ARIA or alt text restates element type; hardcoded strings or +40% overflow risk in copy-heavy components. |
+| 1 | Absent or broken | Majority generic ("Submit", "OK", "Cancel"); empty states absent or "No data"; errors are developer-facing; ARIA labels are noise; alt text is "image" or filenames; no voice considered. |
+## How findings feed the audit
+`agents/copy-auditor.md` runs these probes, scores the pillar, and writes `.design/COPY-AUDIT.md` as a supplement. `agents/design-auditor.md` folds that score and the top finding into Pillar 1 of `.design/DESIGN-AUDIT.md`. Neither file changes the separate 7-category 0-10 system in `audit-scoring.md`; the Copy pillar is a qualitative 1-4 signal, not a weighted metric.

package/reference/debt-categories.md ADDED Viewed

@@ -0,0 +1,148 @@
+---
+name: debt-categories
+type: reference
+version: 1.0.0
+phase: 48
+tags: [debt, taxonomy, audit, crawler, priority-scoring, retroactive]
+last_updated: 2026-06-03
+---
+# Debt Categories
+The taxonomy `agents/design-debt-crawler.md` uses to classify and rank design debt
+found across an entire codebase. Each class has a definition, the detection signal
+that surfaces it, and the typical fix shape. The priority model at the end converts
+every finding into one comparable score so the catalog reads top to bottom by impact.
+This taxonomy is detection-oriented, not a style guide. The reason behind each rule
+lives in `reference/anti-patterns.md` (the BAN-NN and SLOP-NN catalog) and the domain
+references `reference/color.md`, `reference/typography.md`, and `reference/spatial.md`.
+This file is the catalog of what to look for and how to weigh it.
+---
+## Debt Classes
+### color-literal
+**Definition:** A raw color value written directly in source instead of a token
+reference. Includes `#rgb` / `#rrggbb` / `#rrggbbaa` hex, `rgb()` / `rgba()`, and
+`hsl()` / `hsla()` function literals that flow to a rendered surface.
+**Detection signal:** Grep for `#[0-9a-fA-F]{3,8}`, `rgb\(`, `rgba\(`, `hsl\(`,
+`hsla\(` in source files. Exclude token-definition files (the palette source itself)
+and comments. A literal inside a `var(--token: #hex)` definition is the token, not debt.
+**Fix shape:** Replace the literal with the matching semantic token. If no token
+exists for that role, the fix is to define one first.
+### untokenized-component
+**Definition:** A component file that renders visual surface (color, spacing, type
+size) using inline values or arbitrary utility classes rather than referencing the
+design system's tokens or scale.
+**Detection signal:** A component file (`.tsx` / `.jsx` / `.vue` / `.svelte`) that
+contains color-literal or arbitrary-bracket utility hits (`\[[0-9]+px\]`,
+`\[#[0-9a-f]+\]`) and zero `var(--` references or scale-class references. The ratio
+of literal uses to token uses inside one file is the strength signal.
+**Fix shape:** Route the component's visual values through tokens and the spacing
+or type scale; remove the arbitrary brackets.
+### anti-pattern
+**Definition:** A confirmed instance of a banned construct (BAN-NN) or an AI-slop
+tell (SLOP-NN) from `reference/anti-patterns.md`.
+**Detection signal:** Run `gdd-detect <path> --json`. Each finding carries its
+`ruleId`, `file`, `line`, and a link back to the matching paragraph. The detector
+covers the statically matchable BAN rules; the two subjective rules it cannot match
+(BAN-04 keyboard-action animation, BAN-10 nested equal radius) are noted as a
+manual-review item, not auto-counted.
+**Fix shape:** Apply the rule's documented rewrite. Hard bans take precedence over
+SLOP tells when both touch the same element.
+### contrast
+**Definition:** A foreground and background pairing that falls below the WCAG 2.1 AA
+contrast floor (4.5:1 for body text, 3:1 for large text and non-text indicators).
+**Detection signal:** Resolve text-color and background-color pairs that share an
+element or selector, compute the ratio, and flag pairs under the threshold. Pairs
+built from unresolvable runtime values are a manual-review item.
+**Fix shape:** Adjust the token or the role assignment so the pair clears AA. Never
+rely on color alone to carry meaning; pair it with text or an icon.
+### density-spacing
+**Definition:** Spacing values that sit off the project's modular scale (for example
+the 4 / 8 / 12 / 16 / 24 / 32 series), or inconsistent density between sibling
+components that should share rhythm.
+**Detection signal:** Collect every padding, margin, and gap value, then flag values
+that are not on the declared scale and clusters where neighboring components use
+different step counts for the same structural role.
+**Fix shape:** Snap off-grid values to the nearest scale step; align sibling density.
+### typography-drift
+**Definition:** Font sizes, weights, or families that drift from a systematic type
+scale: arbitrary pixel sizes, more than the agreed family count, or weight choices
+that break the heading-to-body hierarchy.
+**Detection signal:** Tally the distinct font-size and font-weight values and the
+family count. A long tail of one-off sizes, more than two families, or `font-weight`
+under 400 on small text are the drift markers.
+**Fix shape:** Map each one-off value onto the nearest scale step; cap families at two.
+### a11y-text
+**Definition:** Text-content accessibility debt: missing `alt` on meaningful images,
+icon-only controls without an accessible name, placeholder used as the only label,
+and generic or developer-facing copy in empty and error states.
+**Detection signal:** Grep for `<img` without `alt`, interactive elements without
+`aria-label` or visible text, inputs with `placeholder` and no `<label>`, and
+empty-state strings such as "No data" or raw error codes.
+**Fix shape:** Add the accessible name or label; rewrite generic copy to be specific
+and actionable. Copy-quality detail lives in `reference/copy-quality.md`.
+---
+## Priority Scoring Model
+Every finding gets one priority score so the catalog ranks by impact, not by the
+order files were walked. Three ordinal factors combine, each on a 1 to 3 scale.
+**Visible-delta** (how much a user notices the fix):
+| Value | Meaning |
+|-------|---------|
+| 3 | Changes a primary surface a user sees on first load |
+| 2 | Changes a secondary or interior surface |
+| 1 | Invisible at rest; shows only in an edge state or to assistive tech |
+**Effort** (how cheap the fix is; cheaper scores higher so quick wins float up):
+| Value | Meaning |
+|-------|---------|
+| 3 | Mechanical one-line swap (literal to token) |
+| 2 | Localized edit across a single component |
+| 1 | Needs a new token, scale decision, or cross-file refactor |
+**Prevalence** (how many instances share this root cause):
+| Value | Meaning |
+|-------|---------|
+| 3 | Ten or more instances of the same finding |
+| 2 | Three to nine instances |
+| 1 | One or two instances |
+**Combine** by multiplying the three factors:
+```
+priority = visible-delta × effort × prevalence
+```
+The product ranges from 1 (low) to 27 (high). Sort the catalog by `priority`
+descending so the highest-impact, lowest-cost, most-widespread debt sits at the top.
+On ties, break by visible-delta first, then prevalence. Record the three factors next
+to the score in each catalog row so the ranking is auditable, not opaque.

package/reference/live-mode-integration.md ADDED Viewed

@@ -0,0 +1,80 @@
+---
+name: live-mode-integration
+type: meta-rules
+version: 1.0.0
+phase: 47
+tags: [live, preview-mcp, hot-swap, session, bandit, scope-guard, degraded-mode]
+last_updated: 2026-06-03
+---
+# Live Mode Integration
+This file is the meta-rules companion to the `gdd-live` skill (`source/skills/live/SKILL.md`). It describes how `/gdd:live` turns a running dev server into a live design surface: the user picks a DOM element, the agent generates N variants in one batch, the variants hot-swap in place, the user accepts or discards, and the whole session persists. For the SKILL.md structural contract (line cap, description budget, frontmatter), see `./skill-authoring-contract.md`. The variants themselves are grounded in the Phase 45 domain indexes (`./spatial.md`, `./interaction.md`, `./color.md`, `./typography.md`, `./motion.md`).
+There is NO bundled browser automation. The skill drives the Claude Preview MCP at runtime; the modules under `scripts/lib/live/` are pure and dependency-free.
+## The live loop
+The loop is seven stages, each owned by one module under `scripts/lib/live/`:
+1. BOOT. Probe the Preview MCP (see below), resolve the harness live mode, and detect the dev server (Vite, Next, Bun, or static). Open or create the session.
+2. INJECT. Read `RUNTIME_JS` from `scripts/lib/live/runtime.cjs` and evaluate it in the page via `preview_eval`. The runtime is an idempotent IIFE on `window.__gddLive`; re-injecting after a navigation rebinds the same singleton.
+3. PICK. The runtime arms a one-shot click handler. The clicked element is reported back as `{selector, tagName, classList, boundingRect, computedStyle subset, variant}` (the `pickReportShape` contract). The selector strategy prefers id, then a data-testid, then a tag plus class plus nth-of-type path.
+4. GENERATE. Load the matching domain index first, then generate all N variants (default 3) in ONE batch. Each variant is written atomically to its source file and made live, either through HMR or through the runtime swap helper.
+5. POST-CHECK. Each variant runs through `scripts/lib/live/postcheck.cjs`, which invokes `gdd-detect`. Findings show inline. A finding flags a variant; it never auto-rejects it.
+6. ACCEPT or DISCARD. Accept applies the chosen variant as the canonical source edit, reverts the others, and feeds the bandit. Discard reverts everything and leaves the source untouched.
+7. PERSIST. Every stage is written to the session file as it happens.
+## The Preview MCP surface
+Live mode drives the built-in Claude Preview MCP, the same connection documented in `../connections/preview.md`. The tools it uses:
+- `preview_list` and `ToolSearch` for the availability probe.
+- `preview_eval` to inject `RUNTIME_JS` and to read the pick report back.
+- `preview_inspect` and `preview_click` to confirm the picked element and read its computed styles.
+- `preview_navigate` to move between routes (the runtime re-inject is idempotent across navigations).
+- `preview_screenshot` for evidence, and as the only output in degraded mode.
+The `data-gdd-variant` attribute is the hot-swap marker. The runtime stamps `data-gdd-variant="N"` on the picked element when it applies variant N, and reads it back to know which variant is live.
+## The six live events
+Live mode emits six telemetry event types through `scripts/lib/live/events.cjs` onto the standard `.design/telemetry/events.jsonl` stream:
+| Event | Emitted when |
+|---|---|
+| `live_boot` | A session opens (BOOT completes). |
+| `live_pick` | The user picks an element. |
+| `live_generate` | A batch of N variants is generated. |
+| `live_postcheck` | A variant clears or trips its post-check. |
+| `live_accept` | The user accepts a variant. |
+| `live_discard` | The user discards the batch. |
+These telemetry events are distinct from the on-disk session log. The session file records the four lifecycle kinds (`pick`, `generate`, `accept`, `discard`); the telemetry stream carries the six `live_*` types above.
+## The session file
+`scripts/lib/live/session-store.cjs` owns one JSON document per session at `.design/live-sessions/<session-id>.json`, conforming to `./schemas/live-session.schema.json`. It carries `{schema_version, session_id, status, started_at, ended_at, url, dev_server, events[]}`. `status` is one of `in_progress`, `completed`, or `abandoned`; only an `in_progress` session is resumable. Writes are atomic (write to a temp file, then rename), so an interrupted step never leaves a half-written session.
+On `--resume <session-id>`, the skill loads the named session and offers two choices: continue from the last recorded event, or start fresh. It never silently replays completed events.
+## The bandit dev-time feed
+When the user accepts a variant, `scripts/lib/live/bandit-feed.cjs` feeds that outcome to the design-variants bandit described in `./design-variants.md`. This is a dev-time signal: it records which design pattern the developer chose during live iteration. It is advisory and separate from the user-outcome A/B loop and from the routing bandit. The user always wins; the bandit only learns.
+## Degraded mode
+Live mode is gated on `capability_matrix.mcp_support` in `scripts/lib/manifest/harnesses.json`. `scripts/lib/live/harness-mode.cjs` projects each harness onto a mode: `liveModeFor(harnessId)` returns `puppeteer` (full live mode) when `mcp_support` is true, and `degraded` otherwise. `degradedHarnesses()` lists every screenshot-only harness.
+In degraded mode there is no Preview MCP, so there is no runtime injection and no hot-swap. The skill states this up front, then generates variants against the file the user names and captures static screenshots for comparison. Degraded mode never errors; it is a documented fallback.
+## Scope guard
+`scripts/lib/live/scope-guard.cjs` maps the picked selector to its owning source files and rejects any write that falls outside them. A variant that would need a change beyond that scope (a shared token, a parent layout, a new dependency) stops the loop and surfaces the decision to the user, rather than silently widening the blast radius. The guard is the hard boundary on what live mode may edit.
+## See also
+- `./skill-authoring-contract.md` for the SKILL.md authoring rules.
+- `./design-variants.md` for the bandit the accept step feeds.
+- `../connections/preview.md` for the full Preview MCP connection spec.
+- The Phase 45 domain indexes (`./spatial.md`, `./interaction.md`, `./color.md`, `./typography.md`, `./motion.md`) the generate step loads.

package/reference/registry.json CHANGED Viewed

@@ -856,6 +856,13 @@
       "phase": 46,
       "description": "Phase 46 skill-metadata single source of truth: scripts/lib/manifest/skills.json record shape, the generate-skill-frontmatter.cjs forward/extract/check modes, the skills.json to generator to build:skills to skills/+dist build chain, the six managed frontmatter keys (canonical order, always-quoted strings) vs extra_frontmatter passthrough, the 20..1024 description budget (validate-skill-length.cjs + lint-agentskills-spec.cjs R4, lint:agentskills CI gate), and the pin metadata catalogue consumer."
     },
+    {
+      "name": "live-mode-integration",
+      "path": "reference/live-mode-integration.md",
+      "type": "meta-rules",
+      "phase": 47,
+      "description": "Phase 47 live-mode contract for /gdd:live: the seven-stage pick to generate to accept loop, the Claude Preview MCP surface (preview_eval injects scripts/lib/live/runtime.cjs RUNTIME_JS, hot-swap via the data-gdd-variant marker), the six live_* telemetry events vs the four on-disk session kinds, the .design/live-sessions session file (scripts/lib/live/session-store.cjs + live-session.schema.json), the dev-time bandit feed (bandit-feed.cjs into design-variants), degraded screenshot-only mode gated on capability_matrix.mcp_support (harness-mode.cjs), and the scope guard (scope-guard.cjs)."
+    },
     {
       "name": "capability-gap-stage-gate",
       "path": "reference/capability-gap-stage-gate.md",
@@ -1093,6 +1100,27 @@
       "type": "domain-index",
       "phase": 45,
       "description": "Phase 45 domain-index: UX-writing + voice - brand-voice, style-vocabulary, anti-patterns."
+    },
+    {
+      "name": "copy-quality",
+      "path": "reference/copy-quality.md",
+      "type": "heuristic",
+      "phase": 48,
+      "description": "Phase 48 copy-quality pillar rubric: microcopy (button/CTA labels, error messages, empty states, ARIA/alt text, loading copy), voice alignment, i18n overflow lens."
+    },
+    {
+      "name": "debt-categories",
+      "path": "reference/debt-categories.md",
+      "type": "taxonomy",
+      "phase": 48,
+      "description": "Phase 48 design-debt taxonomy: debt classes (color-literal, untokenized-component, anti-pattern, contrast, density-spacing, typography-drift, a11y-text) + visible-delta x effort x prevalence priority scoring."
+    },
+    {
+      "name": "brief-quality-rubric",
+      "path": "reference/brief-quality-rubric.md",
+      "type": "output-contract",
+      "phase": 48,
+      "description": "Phase 48 brief-quality rubric: 5 anti-patterns (vague verbs, missing audience, immeasurable success criteria, scope creep, missing anti-goals) the brief-auditor surfaces."
     }
   ]
 }

package/reference/schemas/events.schema.json CHANGED Viewed

@@ -10,7 +10,7 @@
     "type": {
       "type": "string",
       "minLength": 1,
-      "description": "Free-form event type identifier. Pre-registered seeds: state.mutation, state.transition, stage.entered, stage.exited, hook.fired, error, capability_gap, kfm-candidate, router_pick, verify_outcome, rollout_started, rollout_advanced, rollout_stuck, budget_forecast, project_cap_warning, project_cap_halt."
+      "description": "Free-form event type identifier. Pre-registered seeds: state.mutation, state.transition, stage.entered, stage.exited, hook.fired, error, capability_gap, kfm-candidate, router_pick, verify_outcome, rollout_started, rollout_advanced, rollout_stuck, budget_forecast, project_cap_warning, project_cap_halt, live_session_start, live_pick, live_generate, live_accept, live_discard, live_session_end."
     },
     "timestamp": {
       "type": "string",

package/reference/schemas/live-session.schema.json ADDED Viewed

@@ -0,0 +1,64 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "$id": "https://github.com/hegemonart/get-design-done/reference/schemas/live-session.schema.json",
+  "title": "Live Session",
+  "description": "A single `/gdd:live` session record persisted at .design/live-sessions/<session-id>.json. Captures the pick -> generate -> accept/discard loop as an append-only event log so a session survives a crash or --resume. Written atomically by scripts/lib/live/session-store.cjs.",
+  "type": "object",
+  "required": ["schema_version", "session_id", "status", "started_at", "ended_at", "events"],
+  "properties": {
+    "schema_version": {
+      "type": "string",
+      "description": "Schema version of this session record (e.g. \"47.0\")."
+    },
+    "session_id": {
+      "type": "string",
+      "minLength": 1,
+      "description": "Stable id; also the basename of the on-disk file (no path separators)."
+    },
+    "status": {
+      "type": "string",
+      "enum": ["in_progress", "completed", "abandoned"],
+      "description": "Lifecycle status. Only in_progress sessions are resumable."
+    },
+    "started_at": {
+      "type": "string",
+      "format": "date-time",
+      "description": "ISO-8601 timestamp the session was created."
+    },
+    "ended_at": {
+      "type": ["string", "null"],
+      "format": "date-time",
+      "description": "ISO-8601 timestamp the session was closed; null while in_progress."
+    },
+    "url": {
+      "type": "string",
+      "description": "The page the element was picked from (optional)."
+    },
+    "dev_server": {
+      "description": "Dev-server descriptor (url/port/command, or a plain string). Free-form by design.",
+      "type": ["string", "object", "null"]
+    },
+    "events": {
+      "type": "array",
+      "description": "Append-only log of session events, oldest first.",
+      "items": {
+        "type": "object",
+        "required": ["kind", "at"],
+        "properties": {
+          "kind": {
+            "type": "string",
+            "enum": ["pick", "generate", "accept", "discard"],
+            "description": "Event kind."
+          },
+          "at": {
+            "type": "string",
+            "format": "date-time",
+            "description": "ISO-8601 timestamp the event was recorded."
+          }
+        },
+        "additionalProperties": true
+      }
+    }
+  },
+  "additionalProperties": false
+}

package/scripts/lib/live/bandit-feed.cjs ADDED Viewed

@@ -0,0 +1,64 @@
+'use strict';
+/**
+ * scripts/lib/live/bandit-feed.cjs — Phase 47 (Live Mode) → Phase 38 design-arms bridge.
+ *
+ * When a user ACCEPTS a generated variant in /gdd:live, that acceptance is a weak,
+ * dev-time positive signal for the variant's design pattern. We fold it into the Phase 38
+ * `design_arms` posterior store as a WON observation — but at a discounted weight
+ * (DEV_TIME_WEIGHT = 0.5) because a developer's pick during authoring is advisory until
+ * real production A/B + user-research data arrives (Phase 38 D-03: advisory, never
+ * directive). The full posterior math + atomic persistence lives in the design-arms
+ * store; we only reuse it (variantKey + observe) and never duplicate the Beta math.
+ *
+ * Pure, dependency-free CommonJS. Persistence + `baseDir`/`armsPath` test-injection are
+ * inherited from the store. Ships in the npm package; requires only the in-repo store.
+ */
+const { variantKey, observe } = require('../ds-arms/design-arms-store.cjs');
+/**
+ * The dev-time acceptance weight. Half a win — a developer's accept is a discounted,
+ * advisory signal (Phase 38 D-03) until production user-outcome data shifts the arm.
+ */
+const DEV_TIME_WEIGHT = 0.5;
+/**
+ * Record an accepted Live Mode variant as a discounted WON observation on its design arm.
+ *
+ * @param {object} args
+ * @param {string} args.componentType   The component class (e.g. 'button', 'card').
+ * @param {string|object} args.pattern  The variant's design pattern (string or object;
+ *                                       hashed to the arm key via the store's variantKey).
+ * @param {string} [args.label]         Human-readable arm label, stored on first observe.
+ * @param {number} [args.weight]        Override the dev-time weight (defaults to DEV_TIME_WEIGHT).
+ * @param {string} [args.projectRoot]   Repo root — forwarded to the store as `baseDir` so the
+ *                                       arms file resolves under it (testable / hermetic).
+ * @param {string} [args.armsPath]      Explicit arms-file path override (forwarded to the store).
+ * @returns {object} The updated arm posterior (as returned by the store's observe()).
+ */
+function recordAccepted(args = {}) {
+  const { componentType, pattern, label, projectRoot, armsPath } = args;
+  if (typeof componentType !== 'string' || componentType.length === 0) {
+    throw new TypeError('recordAccepted: componentType is required (non-empty string)');
+  }
+  if (pattern == null) {
+    throw new TypeError('recordAccepted: pattern is required');
+  }
+  const weight = typeof args.weight === 'number' && args.weight > 0 ? args.weight : DEV_TIME_WEIGHT;
+  const hash = variantKey(componentType, pattern);
+  // Forward store options only when provided so the store keeps its own defaults otherwise.
+  const opts = {};
+  if (projectRoot) opts.baseDir = projectRoot;
+  if (armsPath) opts.armsPath = armsPath;
+  return observe(
+    componentType,
+    hash,
+    { won: true, weight, source: 'dev_time', label },
+    opts,
+  );
+}
+module.exports = { recordAccepted, DEV_TIME_WEIGHT };