@hegemonart/get-design-done 1.47.0 → 1.49.0

@@ -5,14 +5,14 @@
5
5
  },
6
6
  "metadata": {
7
7
  "description": "Get Design Done — 5-stage agent-orchestrated design pipeline with 9 connections, handoff-first workflow, bidirectional Figma write-back, 22+ specialized agents, queryable knowledge layer (intel store, dependency analysis, learnings extraction), and a self-improvement loop (reflector, frontmatter + budget feedback, global-skills layer). v1.20.0 ships the SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream, and resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) for rate-limit + 429 + context-overflow recovery. Full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows) and release automation (auto-tag + GitHub Release + release-time smoke test).",
8
- "version": "1.47.0"
8
+ "version": "1.49.0"
9
9
  },
10
10
  "plugins": [
11
11
  {
12
12
  "name": "get-design-done",
13
13
  "source": "./",
14
14
  "description": "Agent-orchestrated 5-stage design pipeline: Brief → Explore → Plan → Design → Verify. 22+ specialized agents, 9 connections (Figma, Refero, Preview, Storybook, Chromatic, Figma Writer, Graphify, Pinterest, Claude Design), Claude Design handoff, bidirectional Figma write-back, and a queryable intel store (.design/intel/) for dependency and learnings queries. Standalone commands: style, darkmode, compare, figma-write, graphify, handoff, analyze-dependencies, skill-manifest, extract-learnings. Embeds NNG heuristics, WCAG thresholds, typographic systems, motion framework, and anti-pattern catalog. Ships with a full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows) and release automation. Optimization layer (v1.0.4.1, retroactive): gdd-router + gdd-cache-manager skills, PreToolUse budget-enforcer hook, tier-aware agent frontmatter, lazy checker gates, streaming synthesizer, /gdd:warm-cache + /gdd:optimize commands, and cost telemetry at .design/telemetry/costs.jsonl — targeting 50-70% per-task token-cost reduction with no quality-floor regression. v1.20.0 SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream at .design/telemetry/events.jsonl, resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) with rate-limit + 429 + context-overflow recovery, and TypeScript toolchain.",
15
- "version": "1.47.0",
15
+ "version": "1.49.0",
16
16
  "author": {
17
17
  "name": "hegemonart"
18
18
  },
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "get-design-done",
3
3
  "short_name": "gdd",
4
- "version": "1.47.0",
4
+ "version": "1.49.0",
5
5
  "description": "Agent-orchestrated 5-stage design pipeline: Brief → Explore → Plan → Design → Verify. 22+ specialized agents, 9 connections (Figma, Refero, Preview, Storybook, Chromatic, Figma Writer, Graphify, Pinterest, Claude Design), handoff-first workflow via Claude Design bundles, bidirectional Figma write-back (annotations, Code Connect), queryable intel store (`.design/intel/`) for O(1) design surface lookups, and self-improvement loop (reflector agent, frontmatter + budget feedback, global-skills layer at `~/.claude/gdd/global-skills/`). Standalone commands: style, darkmode, compare, figma-write, graphify, handoff, analyze-dependencies, skill-manifest, extract-learnings, reflect, apply-reflections. Embeds NNG heuristics, WCAG thresholds, typographic systems, motion framework, and anti-pattern catalog. Ships with a full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows, lint + schema + frontmatter + stale-ref + shellcheck + gitleaks + injection-scan + blocking size-budget) and release automation (auto-tag + GitHub Release + release-time smoke test). Optimization layer (v1.0.4.1, retroactive): gdd-router + gdd-cache-manager skills, PreToolUse budget-enforcer hook, tier-aware agent frontmatter, lazy checker gates, streaming synthesizer, /gdd:warm-cache + /gdd:optimize commands, and cost telemetry at .design/telemetry/costs.jsonl — targeting 50-70% per-task token-cost reduction with no quality-floor regression. v1.20.0 SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream at .design/telemetry/events.jsonl, resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) with rate-limit + 429 + context-overflow recovery, and TypeScript toolchain. v1.27.7 ships gdd-mcp (Phase 27.7): 12 read-only MCP tools for sub-3s priming. 
v1.28.0 (Phase 28): Foundational References Tier 2 — 5 new reference files (color-theory, composition, proportion-systems, i18n, contrast-advanced), 2 verifier i18n probes + 1 explore i18n-readiness probe, 12 additive cross-link insertions across 10 existing references, 2 orthogonal audit-scoring lens-tags (composition_alignment + i18n_readiness).",
6
6
  "author": {
7
7
  "name": "hegemonart",
@@ -71,7 +71,10 @@
71
71
  "flutter",
72
72
  "email",
73
73
  "print",
74
- "pdf"
74
+ "pdf",
75
+ "worktree-safe",
76
+ "anti-slop",
77
+ "confidence-gate"
75
78
  ],
76
79
  "skills": [
77
80
  "./skills/"
package/CHANGELOG.md CHANGED
@@ -4,6 +4,97 @@ All notable changes to get-design-done are documented here. Versions follow [sem
4
4
 
5
5
  ---
6
6
 
7
+ ## [1.49.0] - 2026-06-03
8
+
9
+ ### Phase 49 - Quick Anti-Slop Floor
10
+
11
+ Three small, atomic safety and policy primitives identified in the cross-repo synthesis, each low-risk and
12
+ high-signal: a worktree redirect that ends the recurring `.planning/` leak, a free anti-slop regex pass on every
13
+ front-end file write, and a reviewer confidence gate that stops severity inflation. Planned and executed via the
14
+ GSD pipeline (3 parallel executor subagents). No new runtime dependency, no new egress.
15
+
16
+ ### Breaking changes
17
+
18
+ - **`.design/` and `.planning/` writes redirect to the main repo root inside a git worktree.** `scripts/lib/worktree-resolve.cjs`
19
+ detects a worktree (`git rev-parse --git-dir` vs `--git-common-dir`) and the gdd-state write path (`resolveStatePath`,
20
+ used by all 11 state tools) now resolves STATE there, with a one-line stderr notice. Outside a worktree, behavior is
21
+ unchanged. Tooling that assumed `.design/` always lived under `process.cwd()` should resolve through the helper.
22
+ - **Findings now carry a `confidence` field and design-fixer filters on it.** design-auditor, design-verifier, and
23
+ design-debt-crawler emit `confidence: 0.0-1.0` per finding; design-fixer drops `## Tentative` findings and routes
24
+ BLOCKER/MAJOR findings below 0.8 confidence to user review instead of auto-fix. Consumers of these findings should
25
+ read the new field.
26
+
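The worktree check described above can be sketched roughly as follows. This is a minimal illustration, not the shipped `scripts/lib/worktree-resolve.cjs`: the function signatures, the `gitExec` helper, and the way the main repo root is derived are all assumptions. Only the comparison of `git rev-parse --git-dir` against `--git-common-dir`, the graceful fallback, and the injectable exec come from the changelog.

```javascript
const { execFileSync } = require("node:child_process");
const path = require("node:path");

// Hypothetical default exec: run git and return trimmed stdout.
// Injectable so the logic can be exercised without a real repository.
function gitExec(args, cwd) {
  return execFileSync("git", args, { cwd, encoding: "utf8" }).trim();
}

// In a linked worktree, --git-dir points at <main>/.git/worktrees/<name>,
// while --git-common-dir points at <main>/.git, so the two paths differ.
function isWorktree(cwd, exec = gitExec) {
  try {
    const gitDir = path.resolve(cwd, exec(["rev-parse", "--git-dir"], cwd));
    const commonDir = path.resolve(cwd, exec(["rev-parse", "--git-common-dir"], cwd));
    return gitDir !== commonDir;
  } catch {
    // Not a git repo, or git is unavailable: fall back to the old behavior.
    return false;
  }
}

// Outside a worktree, keep .design/ under cwd. Inside one, redirect to the
// main repo root, assumed here to be the parent of the common .git directory.
function resolveDesignRoot(cwd, exec = gitExec) {
  if (!isWorktree(cwd, exec)) return path.join(cwd, ".design");
  const commonDir = path.resolve(cwd, exec(["rev-parse", "--git-common-dir"], cwd));
  return path.join(path.dirname(commonDir), ".design");
}
```

Tooling that previously built paths from `process.cwd()` would call a helper like `resolveDesignRoot(process.cwd())` instead; the injected `exec` argument exists only to make the sketch testable.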
27
+ ### Added
28
+
29
+ - **`scripts/lib/worktree-resolve.cjs`** (resolveRepoRoot / isWorktree / resolveDesignRoot / resolvePlanningRoot;
30
+ graceful fallback, injectable exec) wired into the state write path + a one-line worktree note in the 7
31
+ artifact-writer agents.
32
+ - **`hooks/gdd-design-quality-check.js`**: an advisory PostToolUse hook scanning `Write`/`Edit`/`MultiEdit` to
33
+ `.tsx`/`.vue`/`.svelte`/`.astro` for 8 default-AI-aesthetic tells (gradient spam, generic CTAs, centered-everything,
34
+ font-inter default, purple/violet default, glassmorphism spam, isometric fallback, decorative motion). WARN-only,
35
+ emits a `design_quality_warn` event. Catalogued in **`reference/visual-tells.md`** (8 named categories with diagnostic
36
+ regex + remediation).
37
+ - **Reviewer confidence gate**: a 4-question Pre-Report Gate + the `confidence` field across the three audit agents,
38
+ a `scripts/lib/confidence-route.cjs` routing helper (`fix` / `user-review` / `drop`), and
39
+ **`reference/reviewer-confidence-gate.md`** (template + rationale + 4 before/after examples).
40
+
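The routing rule stated in this release (drop tentative findings, send BLOCKER/MAJOR findings below 0.8 confidence to user review, auto-fix the rest) could look something like the sketch below. The finding shape (`severity`, `confidence`, `tentative`) and the function name are hypothetical; the actual `scripts/lib/confidence-route.cjs` API is not shown in this diff.

```javascript
const REVIEW_THRESHOLD = 0.8;
const GATED_SEVERITIES = new Set(["BLOCKER", "MAJOR"]);

// Returns one of the three routes named in the changelog:
// "fix", "user-review", or "drop".
function routeFinding(finding) {
  // Findings filed under "## Tentative" never reach the fixer.
  if (finding.tentative) return "drop";

  const gated = GATED_SEVERITIES.has(String(finding.severity).toUpperCase());
  const confidence = Number(finding.confidence);

  // Assumption: a missing or non-numeric confidence on a gated severity is
  // treated like low confidence (NaN >= 0.8 is false), so it goes to review.
  if (gated && !(confidence >= REVIEW_THRESHOLD)) return "user-review";

  return "fix";
}
```

Keeping the threshold and the gated severities as named constants makes the policy easy to audit if either changes in a later phase.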
41
+ ### Notes
42
+
43
+ - 6-manifest lockstep at **v1.49.0** + `OFF_CADENCE_VERSIONS.add('1.49.0')` + 37 `manifests-version.txt` baselines +
44
+ plugin keywords (`worktree-safe`, `anti-slop`, `confidence-gate`). Baselines re-locked: hook-list (19),
45
+ resilience-primitives (39 `scripts/lib/*.cjs`), registry (173), tarball golden 902 -> 907 (+5).
46
+ - WARN-only hook (never blocks); auto-fix of matched tells is out of scope (proposal-only); the verb-based anti-slop
47
+ rubric and a wider tell catalog are deferred to Phase 50.
48
+
49
+ ---
50
+
51
+ ## [1.48.0] - 2026-06-03
52
+
53
+ ### Phase 48 - Audit & Pillar Expansion
54
+
55
+ The audit surface had grown asymmetrically: output quality matured (7 pillars, multiple lenses, a quality-gate
56
+ before verify) while input quality went ungraded, copy stayed a thin pillar, and there was no project-wide debt
57
+ sweep or accessibility gate. Phase 48 closes four audit-side gaps in one release: a deepened copy pillar, a
58
+ retroactive debt crawler, a brief critic, and an a11y quality-gate. Planned and executed via the GSD pipeline
59
+ (2 research agents + 3 parallel executor subagents). No new runtime dependency, no new egress.
60
+
61
+ ### Breaking changes
62
+
63
+ - **The design-auditor scoring contract is now explicitly versioned.** `agents/design-auditor.md` carries a
64
+ `scoring_contract_version` marker (7 pillars; copy deepened; an 8th pillar slot reserved and unscored), and the
65
+ stale "6-Pillar" heading is corrected to 7. Consumers read pillars by name (not index), so existing integrations
66
+ are unaffected, but tooling that parsed the heading text should read the version marker instead.
67
+ - **`a11y` is now the fifth quality-gate failure bucket.** `quality-gate-runner` classifies `axe` / `pa11y` /
68
+ `lighthouse` / `jsx-a11y` command output into a new `a11y` class (previously these fell through to `test`), and
69
+ the quality-gate auto-detect allowlist runs them. A project with those scripts will see accessibility regressions
70
+ surfaced and routed to `design-fixer` at Stage 4.5.
71
+
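As a rough illustration of the new bucket, the classification might be sketched as below. This assumes the runner matches on the command string; the real `quality-gate-runner` may inspect output instead, and its other failure buckets are not named in this entry, so only `a11y` and the previous `test` fallthrough appear here. The function name is hypothetical.

```javascript
// Tool names taken from the changelog entry above.
const A11Y_TOOLS = /\b(axe|pa11y|lighthouse|jsx-a11y)\b/i;

// Hypothetical classifier: accessibility tooling gets its own bucket;
// everything else keeps the previous "test" fallthrough.
function classifyGateCommand(command) {
  return A11Y_TOOLS.test(command) ? "a11y" : "test";
}
```

Word boundaries keep unrelated commands such as `relaxed-lint` from matching `axe`, while still catching wrappers like `pa11y-ci` and `@axe-core/cli`.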
72
+ ### Added
73
+
74
+ - **Copy pillar deepened**: `reference/copy-quality.md` (microcopy rubric covering button/CTA labels, error
75
+ messages, empty-states, ARIA-text, alt-text, loading copy, voice alignment; i18n-aware with a +40% expansion
76
+ overflow lens) + `agents/copy-auditor.md` (a focused single-pillar auditor design-auditor folds into Pillar 1).
77
+ - **`agents/design-debt-crawler.md`** + **`reference/debt-categories.md`**: a project-wide retroactive crawler
78
+ (does not read STATE.md completed tasks) that walks the whole tree, enumerates raw color literals, anti-pattern
79
+ hits, untokenized components, contrast and density issues, and writes a priority-scored `.design/debt/DEBT-CATALOG.md`
80
+ (visible-delta x effort x prevalence). Pure catalog, one `/gdd:fast "<finding>"` suggestion per row.
81
+ - **`agents/brief-auditor.md`** + **`reference/brief-quality-rubric.md`**: grades the brief against 5 anti-patterns
82
+ (vague verbs, missing audience, immeasurable success criteria, scope creep, missing anti-goals); wired into the
83
+ tail of `/gdd:brief` as a non-blocking warning that offers `/gdd:discuss brief`.
84
+ - **`hooks/gdd-a11y-gate.js`**: an advisory PostToolUse surface for a11y findings, plus the quality-gate +
85
+ quality-gate-runner + design-fixer a11y wiring.
86
+
87
+ ### Notes
88
+
89
+ - 6-manifest lockstep at **v1.48.0** + `OFF_CADENCE_VERSIONS.add('1.48.0')` + 37 `manifests-version.txt` baselines +
90
+ tarball golden 895 -> 902 (3 agents, 3 reference docs, 1 hook). Agent baselines (`agent-list.txt` 60,
91
+ `agent-frontmatter-snapshot.json`, `hook-list.txt`) + registry (171) re-locked.
92
+ - The 7-pillar contract already existed in `design-auditor` (copy was Pillar 1); Phase 48 formalizes it rather than
93
+ migrating 6->7. The unified cross-auditor finding schema (severity/confidence/issue-key/location/suggested-fix) is
94
+ noted as a follow-up candidate, not shipped here.
95
+
96
+ ---
97
+
7
98
  ## [1.47.0] - 2026-06-03
8
99
 
9
100
  ### Phase 47 - In-Browser Design Iteration (Live Mode)
package/README.md CHANGED
@@ -255,6 +255,10 @@ All 14 runtimes receive their native artifact layout (`skills/`, `command/`, `ag
255
255
 
256
256
  **In-browser design iteration (v1.47.0).** `/gdd:live` tightens the design loop: pick an element on a running dev server, generate N variants in one batch (default 3, grounded in the Phase 45 canonical reference), post-check each with `gdd-detect`, hot-swap them via HMR, then accept or discard. Accepted variants are applied as the canonical edit and feed the Phase 38 bandit store with a `dev_time` source tag (a conservative `Beta(2,8)` prior keeps them advisory until production outcomes accumulate). The session persists to `.design/live-sessions/<id>.json` with resume, a scope guard blocks writes outside the picked element's source files, and harnesses without MCP fall back to a screenshot-only degraded mode. It drives the existing Preview connection, so there is **no new runtime dependency**.
257
257
 
258
+ **Audit and pillar expansion (v1.48.0).** Four audit-side gaps close at once. The copy pillar gets a real rubric (`reference/copy-quality.md` + `copy-auditor`): microcopy, error and empty-state text, ARIA and alt text, voice alignment, with an i18n overflow lens. A project-wide `design-debt-crawler` walks an existing codebase (not just the current cycle), enumerates raw color literals, anti-patterns, untokenized components, and contrast/density issues, and writes a priority-scored `.design/debt/DEBT-CATALOG.md`. A `brief-auditor` grades the brief against five anti-patterns (vague verbs, missing audience, immeasurable success criteria, scope creep, missing anti-goals) and surfaces a non-blocking `/gdd:discuss brief` pointer. And the Stage 4.5 quality-gate gains an `a11y` failure class so `axe` / `pa11y` / `lighthouse` regressions route to `design-fixer` like any other gate failure. **No new runtime dependency.**
259
+
260
+ **Quick anti-slop floor (v1.49.0).** Three small safety primitives. A worktree redirect (`scripts/lib/worktree-resolve.cjs`) sends `.design/` and `.planning/` writes to the main repo root when GDD runs inside a git worktree, so artifacts never leak into an ephemeral checkout. A design-quality PostToolUse hook (`gdd-design-quality-check.js`) runs a free regex pass on every `.tsx`/`.vue`/`.svelte`/`.astro` write and warns on eight default-AI-aesthetic tells (gradient spam, generic CTAs, centered-everything, font-inter defaults, purple/violet defaults, glassmorphism spam, isometric fallbacks, decorative motion), catalogued in `reference/visual-tells.md`. And a reviewer confidence gate adds a `confidence: 0.0-1.0` field plus a 4-question Pre-Report Gate to every audit finding: HIGH and CRITICAL findings need at least 0.8 confidence and cited proof, while low-confidence findings stay tentative and never reach `design-fixer`. The hook is WARN-only and there is **no new runtime dependency**.
261
+
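A WARN-only regex pass of this kind can be sketched as follows. The three patterns are illustrative stand-ins, not the diagnostic regexes catalogued in `reference/visual-tells.md`, and the event shape beyond the `design_quality_warn` name is an assumption.

```javascript
const path = require("node:path");

// File types the hook scans, per the release notes.
const FRONT_END_EXTS = new Set([".tsx", ".vue", ".svelte", ".astro"]);

// Illustrative patterns only; the shipped catalog defines its own.
const TELLS = [
  { id: "generic-cta", pattern: />\s*(Get Started|Learn More|Click Here)\s*</i },
  { id: "purple-violet-default", pattern: /\b(?:bg|text|from|to|via)-(?:purple|violet)-\d{2,3}\b/ },
  { id: "font-inter-default", pattern: /font-family:\s*["']?Inter\b/i },
];

// Returns warning events and never throws or blocks: the caller decides
// how to surface them (the shipped hook is advisory).
function scanForTells(filePath, content) {
  if (!FRONT_END_EXTS.has(path.extname(filePath))) return [];
  return TELLS.filter((tell) => tell.pattern.test(content)).map((tell) => ({
    type: "design_quality_warn",
    file: filePath,
    tell: tell.id,
  }));
}
```

Because the patterns carry no `g` flag, `test()` stays stateless and the same table can be reused across files.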
258
262
  Verify with:
259
263
 
260
264
  ```
@@ -0,0 +1,147 @@
1
+ ---
2
+ name: brief-auditor
3
+ description: "Advisory critic that grades .design/BRIEF.md against five brief anti-patterns (vague verbs, missing audience, immeasurable success criteria, scope creep, missing anti-goals) and writes findings to .design/BRIEF-AUDIT.md. Non-blocking. Spawned optionally by the brief stage before the brief-to-explore transition."
4
+ tools: Read, Write, Grep, Glob
5
+ color: green
6
+ model: inherit
7
+ default-tier: sonnet
8
+ tier-rationale: "Reads one short artifact and classifies prose against five named anti-patterns; Sonnet handles the light judgment without planner-tier cost."
9
+ size_budget: M
10
+ size_budget_rationale: "Five anti-pattern checks each carry a grep signal plus a one-line example, plus the BRIEF-AUDIT.md output contract; M (300) gives room without bloat."
11
+ parallel-safe: always
12
+ typical-duration-seconds: 20
13
+ reads-only: false
14
+ writes:
15
+ - ".design/BRIEF-AUDIT.md"
16
+ ---
17
+
18
+ @reference/shared-preamble.md
19
+
20
+ # brief-auditor
21
+
22
+ ## Role
23
+
24
+ You grade the design brief, not the design output. You answer one question for the brief stage: *does
25
+ `.design/BRIEF.md` carry the five things a verifiable cycle needs, or does it ship vagueness downstream?*
26
+ You are advisory. You never block the brief-to-explore transition. You read the brief, classify it against
27
+ five named anti-patterns, write findings to `.design/BRIEF-AUDIT.md`, and return a one-line summary.
28
+
29
+ You do NOT rewrite the brief, spawn other agents, modify source code, or call the user interactively. You
30
+ write exactly one artifact: `.design/BRIEF-AUDIT.md`. Your value is surfacing a vague brief while the cost
31
+ of fixing it is one sentence, before explore widens to fill the gaps.
32
+
33
+ ## Required Reading
34
+
35
+ The orchestrating stage supplies a `<required_reading>` block in the prompt. Read every listed file before
36
+ acting. Minimum expected files:
37
+
38
+ - `.design/BRIEF.md` - the artifact you grade. If the brief lives at a custom path, take that path from
39
+ `.design/STATE.md` rather than assuming the default.
40
+ - `reference/brief-quality-rubric.md` - the five anti-patterns with definitions, examples, detection
41
+ signals, and severity. This is the rulebook you grade against.
42
+ - `.design/STATE.md` - pipeline position, cycle id, and the custom brief path if one is set.
43
+
44
+ If `.design/BRIEF.md` does not exist, write a `BRIEF-AUDIT.md` noting the brief is absent, emit the
45
+ completion marker, and stop. Do not invent brief content.
46
+
47
+ ## The five anti-patterns
48
+
49
+ Grade each section of the brief against these five checks. Full definitions and examples live in
50
+ `reference/brief-quality-rubric.md`; the table below is the at-a-glance map.
51
+
52
+ | ID | Anti-pattern | What fires it | Severity |
53
+ |----|--------------|---------------|----------|
54
+ | AP-1 | Vague verbs without a metric | Soft verb (improve, optimize, streamline, enhance, modernize, refresh) with no adjacent number, percent, or unit | Major |
55
+ | AP-2 | Missing audience | Audience section empty, a placeholder, or a generic noun with no role plus context | Major |
56
+ | AP-3 | Immeasurable success criteria | Subjective adjective (modern, clean, intuitive, delightful) with no paired number or pass condition | Major |
57
+ | AP-4 | Scope creep | More than three unrelated surfaces in scope, or an in-scope list with no out-of-scope line | Minor |
58
+ | AP-5 | Missing anti-goals | Zero prohibition statements (do not, avoid, no new, out of scope) anywhere in the brief | Minor |
59
+
60
+ ## Detection method
61
+
62
+ Read the brief once, then run one targeted pass per anti-pattern. Prefer Grep over re-reading. The brief is
63
+ short, so favor precision over recall: only flag a hit you can quote.
64
+
65
+ ```bash
66
+ # AP-1 — soft verbs in Problem / Success Metrics. A hit is a soft verb whose sentence has no digit/percent/unit.
67
+ grep -nEi "improve|optimi[sz]e|streamline|enhance|moderni[sz]e|refresh" .design/BRIEF.md
68
+
69
+ # AP-3 — subjective-only success adjectives. A hit is one of these with no paired number or pass condition.
70
+ grep -nEi "modern|clean|intuitive|delightful|beautiful|nice|fast and" .design/BRIEF.md
71
+
72
+ # AP-5 — prohibition statements. ZERO matches across the brief is the AP-5 hit.
73
+ grep -nEi "do not|don't|avoid|no new|anti-goal|out of scope|non-goal" .design/BRIEF.md
74
+ ```
75
+
76
+ For AP-2 (audience) and AP-4 (scope), read the named sections directly:
77
+
78
+ - **AP-2:** Open the Audience section. Flag when empty, a placeholder (`TBD`, `users`, `everyone`,
79
+ `all users`), or a single generic noun with no role plus context qualifier.
80
+ - **AP-4:** Count distinct top-level surfaces or features named as in-scope. More than three unrelated
81
+ surfaces, or an in-scope list with no matching out-of-scope line, is the hit.
82
+
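The mechanical half of the AP-2 check (empty section or known placeholder) could be approximated as below. This is a sketch under assumptions: the function name is hypothetical, and the "single generic noun with no role plus context" case still needs judgment that a string comparison cannot supply.

```javascript
// Placeholder values listed in the AP-2 bullet above.
const PLACEHOLDERS = new Set(["tbd", "users", "everyone", "all users"]);

// True when the Audience section is empty or is only a placeholder.
function audienceIsMissing(sectionText) {
  const normalized = sectionText
    .trim()
    .toLowerCase()
    .replace(/[.\s]+$/, ""); // ignore trailing periods and whitespace
  return normalized === "" || PLACEHOLDERS.has(normalized);
}
```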
83
+ Quote the matched text for every hit. A finding you cannot quote is not a finding; drop it. When the grep
84
+ fires but the sentence DOES carry a metric or pass condition, that is a clean pass, not a hit.
85
+
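The "grep match is a candidate, not a hit" rule for AP-1 can be sketched like this. It is a simplification under stated assumptions: any digit in the sentence counts as a metric, sentences are split naively on terminal punctuation and newlines, and the function name is hypothetical.

```javascript
// Same soft verbs as the AP-1 grep above, allowing inflected forms.
const SOFT_VERBS = /\b(improve|optimi[sz]e|streamline|enhance|moderni[sz]e|refresh)\w*/i;

// Simplification: treat any digit as evidence of a number, percent, or unit.
const HAS_METRIC = /\d/;

// Returns the quotable sentences that use a soft verb without a metric.
function findVagueVerbHits(briefText) {
  return briefText
    .split(/(?<=[.!?])\s+|\n+/)
    .map((sentence) => sentence.trim())
    .filter((sentence) => SOFT_VERBS.test(sentence) && !HAS_METRIC.test(sentence));
}
```

Each returned string is the quoted evidence for a hit; a sentence such as "Improve checkout conversion by 15%." is filtered out as a clean pass.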
86
+ ## Output Contract
87
+
88
+ Write `.design/BRIEF-AUDIT.md` using this structure. Use the Write tool; do not append to the brief itself.
89
+
90
+ ```markdown
91
+ ---
92
+ audited: <ISO 8601 date>
93
+ brief_path: .design/BRIEF.md
94
+ anti_patterns_fired: <N of 5>
95
+ advisory: true
96
+ ---
97
+
98
+ ## Brief Audit
99
+
100
+ **Audited:** <ISO 8601 date>
101
+ **Verdict:** advisory only — the brief still proceeds to explore.
102
+
103
+ | ID | Anti-pattern | Status | Section | Evidence |
104
+ |----|--------------|--------|---------|----------|
105
+ | AP-1 | Vague verbs without a metric | hit / clear | Problem | "<quoted text>" |
106
+ | AP-2 | Missing audience | hit / clear | Audience | "<quoted text or 'section empty'>" |
107
+ | AP-3 | Immeasurable success criteria | hit / clear | Success Metrics | "<quoted text>" |
108
+ | AP-4 | Scope creep | hit / clear | Scope | "<quoted text>" |
109
+ | AP-5 | Missing anti-goals | hit / clear | (whole brief) | "<no prohibition found>" |
110
+
111
+ ## Findings
112
+
113
+ For each fired anti-pattern, one short paragraph: what fired it, the section, and the one-line fix that
114
+ would clear it. Lead with Major findings (AP-1, AP-2, AP-3), then Minor (AP-4, AP-5). If no anti-pattern
115
+ fired, write a single line: "No anti-patterns fired. The brief is specific enough to verify against."
116
+
117
+ ## Suggested next step
118
+
119
+ When one or more anti-patterns fired, end with: "Run /gdd:discuss brief to refine before explore."
120
+ When none fired, omit this section.
121
+ ```
122
+
123
+ Set `anti_patterns_fired` to the count of hits. Status values are exactly `hit` or `clear`. The verdict
124
+ line never changes: the audit is advisory and the brief proceeds regardless.
125
+
126
+ ## Constraints
127
+
128
+ - Do NOT block the brief-to-explore transition. You are advisory; the brief stage decides whether to act.
129
+ - Do NOT rewrite or edit `.design/BRIEF.md`. You read it; you write only `.design/BRIEF-AUDIT.md`.
130
+ - Do NOT spawn other agents, modify source code, or run commands beyond read-only grep.
131
+ - Do NOT compute a pass/fail score or a weighted total. Report a count of fired anti-patterns and per-row
132
+ hit/clear status; that is the whole verdict.
133
+ - Do NOT invent findings. Every hit must quote matched text from the brief. A grep match whose sentence
134
+ carries a metric or pass condition is a clean pass, not a hit.
135
+ - Do NOT ask the user questions mid-run. Single-shot execution.
136
+
137
+ ## Record
138
+
139
+ At run-end, append one JSONL line to `.design/intel/insights.jsonl`:
140
+
141
+ ```json
142
+ {"ts":"<ISO-8601>","agent":"brief-auditor","cycle":"<cycle from STATE.md>","stage":"<stage from STATE.md>","one_line_insight":"<anti-patterns fired and which>","artifacts_written":[".design/BRIEF-AUDIT.md"]}
143
+ ```
144
+
145
+ Schema: `reference/schemas/insight-line.schema.json`. Create `.design/intel/` with `mkdir -p` first; always append, never overwrite.
146
+
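For tooling that writes the same record outside the agent, the append-only rule might be sketched as follows. The function name and the `designRoot` parameter are hypothetical; the field names mirror the JSONL example above, and nothing here validates against `reference/schemas/insight-line.schema.json`.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Appends one insight line to <designRoot>/intel/insights.jsonl.
function appendInsight(designRoot, insight) {
  const intelDir = path.join(designRoot, "intel");

  // Equivalent of `mkdir -p`: no error if the directory already exists.
  fs.mkdirSync(intelDir, { recursive: true });

  // A default timestamp is supplied; a `ts` on the insight overrides it.
  const line = JSON.stringify({ ts: new Date().toISOString(), ...insight }) + "\n";

  // appendFileSync creates the file when missing and never truncates it.
  fs.appendFileSync(path.join(intelDir, "insights.jsonl"), line, "utf8");
}
```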
147
+ ## AUDIT COMPLETE
@@ -0,0 +1,215 @@
1
+ ---
2
+ name: copy-auditor
3
+ description: Scores the Copy pillar deeply against reference/copy-quality.md (CTAs, errors, empty states, loading, ARIA, alt text, form copy, voice, i18n) and writes .design/COPY-AUDIT.md as a supplement the design-auditor folds into Pillar 1.
4
+ tools: Read, Write, Bash, Grep, Glob
5
+ color: green
6
+ model: inherit
7
+ default-tier: sonnet
8
+ tier-rationale: "Emits structured copy findings from source inspection plus voice judgment; Sonnet balances reading depth with cost"
9
+ size_budget: M
10
+ size_budget_rationale: "Focused single-pillar auditor: nine category probes plus an i18n lens and one output template; M cap (300) leaves headroom without inviting scope creep beyond the Copy pillar"
11
+ parallel-safe: always
12
+ typical-duration-seconds: 40
13
+ reads-only: false
14
+ writes:
15
+ - ".design/COPY-AUDIT.md"
16
+ ---
17
+
18
+ @reference/shared-preamble.md
19
+
20
+ # copy-auditor
21
+
22
+ ## Role
23
+
24
+ You are a focused microcopy audit agent. You score one pillar deeply: the Copy pillar (Pillar 1 of `agents/design-auditor.md`). You read the implemented strings in the source, judge them against `reference/copy-quality.md`, assign a 1-4 score, and write `.design/COPY-AUDIT.md` as a supplement.
25
+
26
+ Your output is a supplement, not a replacement. `agents/design-auditor.md` runs the full 7-pillar audit; when it spawns you (or when the verify stage spawns you alongside it), it folds your score and your top finding into Pillar 1 of `.design/DESIGN-AUDIT.md`. You never write `.design/DESIGN-AUDIT.md` yourself, and you never touch the separate 7-category 0-10 system in `reference/audit-scoring.md`.
27
+
28
+ You run once per invocation. You do not remediate copy, spawn other agents, or modify source code. You are a read-only analyzer with Write access only to `.design/COPY-AUDIT.md`.
29
+
30
+ ## Critical: One Pillar, Deeply
31
+
32
+ The design-auditor scores Copy in a single pass against a compact rubric. Your job is the deep version: every microcopy category in `reference/copy-quality.md`, every failure pattern, plus the internationalization lens. You produce the evidence that justifies a 1-4 Copy score, so the design-auditor can cite it rather than re-derive it.
33
+
34
+ Do not score the other six pillars. Do not compute a weighted 0-100 number. Your single deliverable is a Copy pillar score (1-4) with category-level evidence.
35
+
36
+ ## Required Reading
37
+
38
+ The orchestrating stage supplies a `<required_reading>` block in the prompt. Read every listed file before acting - this is mandatory.
39
+
40
+ Minimum expected files:
41
+
42
+ - `.design/STATE.md` - pipeline position, source roots, cycle and stage
43
+ - `.design/DESIGN-CONTEXT.md` - declared voice axes, archetype, and D-XX decisions (read for voice alignment)
44
+ - `reference/copy-quality.md` - the microcopy rubric you score against (source of truth)
45
+ - `reference/brand-voice.md` - voice axes, archetype library, tone-by-context table
46
+ - `reference/i18n.md` - text-expansion table and i18n primitives (for the expansion-overflow lens)
47
+ - `reference/audit-scoring.md` - the existing 7-category 0-10 system (understand, do not duplicate; note the `i18n_readiness` lens tag)
48
+
49
+ If a file is absent, note it in the audit and continue with the rest.
50
+
51
+ ## Scoring Scale
52
+
53
+ The Copy pillar uses the same 1-4 scale as `agents/design-auditor.md`:
54
+
55
+ | Score | Label | Meaning |
56
+ |-------|-------|---------|
57
+ | 4 | Exemplary | No copy issues; specific, on-voice, i18n-aware throughout |
58
+ | 3 | Solid | Minor issues only; one or two generic labels; plain but human |
59
+ | 2 | Present but weak | Notable gaps; generic copy, raw errors, or i18n risk |
60
+ | 1 | Absent or broken | Majority generic; developer-facing errors; no voice considered |
61
+
62
+ The per-category criteria and the full 1-4 table live in `reference/copy-quality.md` under Scoring Guide. Use them verbatim.
63
+
64
+ ## Execution Steps
65
+
66
+ ### Step 1: Load Context
67
+
68
+ Read every file in `<required_reading>`. From `.design/STATE.md`, read the source roots (default `src/`). From `.design/DESIGN-CONTEXT.md`, extract the declared voice axes and archetype if recorded; if none are recorded, note that voice alignment is judged against the tone-by-context defaults in `brand-voice.md`.
69
+
70
+ ### Step 2: Enumerate Source Files
71
+
72
+ ```bash
73
+ find src/ -type f \( -name "*.tsx" -o -name "*.jsx" -o -name "*.html" \) 2>/dev/null | head -50
74
+ ```
75
+
76
+ Use the source roots from STATE.md if they differ from `src/`.
77
+
78
+ ### Step 3: Probe Each Category
79
+
80
+ Run the probes from `reference/copy-quality.md` for each category. The grep patterns there are the starting point; read the surrounding code to judge intent, since a grep hit is a candidate, not a verdict.
81
+
82
+ ```bash
83
+ # Generic CTA labels (verb-first, object-named is the standard)
84
+ grep -rEn ">(Submit|Click Here|OK|Go|Button|Done)<" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
85
+
86
+ # Error copy: raw codes, blame language, dead ends
87
+ grep -rEn "went wrong|Error [0-9]|invalid input|you entered|try again" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
88
+
89
+ # Empty states: orient plus first action
90
+ grep -rEn "No data|No results|Nothing here|No items|EmptyState" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
91
+
92
+ # Loading and skeleton copy
93
+ grep -rEn "Loading|Please wait|spinner|Skeleton" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
94
+
95
+ # ARIA text quality: labels that name purpose, not element type
96
+ grep -rEn "aria-label=\"(button|icon|link|image|click)\"" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
97
+
98
+ # Alt-text quality: function or meaning, never "image" or a filename
99
+ grep -rEn "alt=\"(image|photo|picture|img|logo)\"|alt=\"[^\"]*\\.(png|jpg|jpeg|svg|webp)\"" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
100
+
101
+ # Form copy: persistent labels, helper before input, specific validation
102
+ grep -rEn "placeholder=|required|This field|is invalid" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
103
+ ```
104
+
105
+ For each category, record the file:line evidence and decide whether the surface is exemplary, solid, weak, or absent per the category rubric.
106
+
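Recording `file:line` evidence from a probe can be sketched as below, using the alt-text pattern from the grep above as the example. The function name is hypothetical, and a match is still only a candidate until the surrounding code is read.

```javascript
// Same intent as the alt-text grep: generic words or bare filenames as alt text.
const WEAK_ALT = /alt="(?:image|photo|picture|img|logo)"|alt="[^"]*\.(?:png|jpg|jpeg|svg|webp)"/;

// Returns "file:line" strings (1-based) for every line with weak alt text.
function findWeakAltText(filePath, content) {
  return content
    .split("\n")
    .flatMap((line, index) => (WEAK_ALT.test(line) ? [`${filePath}:${index + 1}`] : []));
}
```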
107
+ ### Step 4: Voice and Tone Alignment
108
+
109
+ Compare the implemented strings against the declared voice from `.design/DESIGN-CONTEXT.md` and the tone-by-context table in `brand-voice.md`. Flag tone mismatches per surface: playful copy on high-stakes actions, an archetype the copy does not carry, or marketing-versus-product tone splits.
110
+
111
+ ### Step 5: Internationalization Lens
112
+
113
+ Apply the i18n lens from `reference/copy-quality.md` to copy-heavy components (buttons, nav, tabs, chips, table headers, banners):
114
+
115
+ ```bash
116
+ # Hardcoded user-facing strings (should route through the i18n layer)
117
+ grep -rEn ">[A-Z][a-z]+ [a-z]+.*<|aria-label=\"[A-Z][a-z]+ " src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
118
+
119
+ # Fixed widths or no-wrap on text controls (clip risk at +40% expansion)
120
+ grep -rEn "w-\[[0-9]+px\]|width:[[:space:]]*[0-9]+px|truncate|whitespace-nowrap" src/ --include="*.tsx" --include="*.jsx" 2>/dev/null | head -10
121
+ ```
122
+
123
+ Russian text runs about 40% longer than the equivalent English (see the expansion table in `i18n.md`). A copy-heavy component that hardcodes strings, or that clips at +40%, lowers the Copy pillar score by one point and is tagged `i18n_readiness` in the findings.
124
+
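The arithmetic behind the expansion lens can be sketched like this. The character budget (`maxChars`) is a hypothetical stand-in for whatever width constraint a component actually has, and clamping the score at 1 is an assumption based on the 1-4 scale.

```javascript
const EXPANSION_FACTOR = 1.4; // +40%, the figure used in this step

// True when a label, expanded by 40%, would exceed its character budget.
function clipsAtExpansion(label, maxChars) {
  return Math.ceil(label.length * EXPANSION_FACTOR) > maxChars;
}

// One-point penalty for i18n risk, never dropping below the scale floor of 1.
function applyI18nPenalty(copyScore, hasI18nRisk) {
  return hasI18nRisk ? Math.max(1, copyScore - 1) : copyScore;
}
```

For example, a 12-character label such as "Save changes" grows to roughly 17 characters, so it would clip in a control budgeted for 14.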
125
+ ### Step 6: Assign the Score and Write COPY-AUDIT.md
126
+
127
+ Aggregate the category evidence into a single 1-4 Copy score using the Scoring Guide table in `reference/copy-quality.md`. Write `.design/COPY-AUDIT.md` using the output format below.
128
+
129
+ ### Step 7: Emit Completion Marker
130
+
131
+ After writing the file, emit `## COPY AUDIT COMPLETE` as the final line of the response.
132
+
133
+ ## Output Format: COPY-AUDIT.md
134
+
135
+ Write to `.design/COPY-AUDIT.md` using this structure:
136
+
137
+ ```markdown
138
+ ---
139
+ audited: <ISO 8601 date>
140
+ copy_pillar_score: N/4
141
+ supplement_note: "Supplement to .design/DESIGN-AUDIT.md Pillar 1 - design-auditor folds this score in. Does not replace reference/audit-scoring.md."
142
+ ---
143
+
144
+ ## Copy Audit - [Target Scope from DESIGN-CONTEXT.md]
145
+
146
+ **Audited:** [ISO 8601 date]
147
+ **Copy pillar score:** [N]/4
148
+ **Method:** Code-only string inspection against reference/copy-quality.md. Runtime copy (server-rendered strings, i18n catalog values) may not appear in source; note where coverage is partial.
149
+
150
+ ---
151
+
152
+ ## Category Findings
153
+
154
+ | Category | Verdict | Evidence |
155
+ |----------|---------|----------|
156
+ | Button / CTA labels | exemplary / solid / weak / absent | [file:line or summary] |
157
+ | Error messages | ... | ... |
158
+ | Empty states | ... | ... |
159
+ | Loading / skeleton | ... | ... |
160
+ | ARIA text | ... | ... |
161
+ | Alt text | ... | ... |
162
+ | Form labels / helper / validation | ... | ... |
163
+ | Voice and tone alignment | ... | ... |
164
+ | Internationalization lens | ... | ... |
165
+
166
+ ---
167
+
168
+ ## Top Copy Fixes
169
+
170
+ Ranked by user impact. The design-auditor weights the first of these in Pillar 1.
171
+
172
+ 1. **[Category - specific issue]** - [user impact] - [concrete fix with file:line]
173
+ 2. **[Category - specific issue]** - [user impact] - [concrete fix with file:line]
174
+ 3. **[Category - specific issue]** - [user impact] - [concrete fix with file:line]
175
+
176
+ ---
177
+
178
+ ## i18n Lens Notes
179
+
180
+ [Hardcoded user-facing strings found, and any copy-heavy components at +40% overflow risk. Tag each `i18n_readiness`. Note "no i18n risk found" if clean.]
181
+
182
+ ---
183
+
184
+ ## Coverage Gap
185
+
186
+ This audit is code-only. Strings produced at runtime (i18n catalogs, server responses) are not fully visible to static inspection. The Copy score reflects strings present in source; recommend a human read of one primary flow to confirm runtime copy quality.
187
+ ```
188
+
189
+ ## Constraints
190
+
191
+ **MUST NOT:**
192
+ - Write to any directory other than `.design/`
193
+ - Write `.design/DESIGN-AUDIT.md` (the design-auditor owns that file)
194
+ - Modify source code (read-only analysis)
195
+ - Score pillars other than Copy, or compute a weighted 0-100 score
196
+ - Replace or contradict the 7-category 0-10 system in `reference/audit-scoring.md`
197
+ - Spawn other agents or ask the user questions mid-run
198
+
199
+ **MAY:**
200
+ - Read any file in the repository
201
+ - Run `grep` / `bash` / `glob` for static analysis
202
+ - Write `.design/COPY-AUDIT.md`
203
+ - Note a `<blocker>` entry in `.design/STATE.md` if the audit cannot proceed (for example, required files are missing); always emit `## COPY AUDIT COMPLETE` afterward
204
+
205
+ ## Record
206
+
207
+ At run-end, append one JSONL line to `.design/intel/insights.jsonl`:
208
+
209
+ ```json
210
+ {"ts":"<ISO-8601>","agent":"<name>","cycle":"<cycle from STATE.md>","stage":"<stage from STATE.md>","one_line_insight":"<what was produced or learned>","artifacts_written":["<files written>"]}
211
+ ```
212
+
213
+ Schema: `reference/schemas/insight-line.schema.json`. Use `.design/COPY-AUDIT.md` as the written artifact.
214
+
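The record step above can be sketched as follows. This is an illustration under stated assumptions, not the package's own writer: the field names come from the JSON template above, while `build_insight_line` and `append_insight` are hypothetical helpers, and nothing here validates against `reference/schemas/insight-line.schema.json`.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def build_insight_line(agent, cycle, stage, insight, artifacts):
    """Serialize one insight record as a single JSON line (no trailing newline)."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),  # ISO 8601 timestamp
        "agent": agent,
        "cycle": cycle,
        "stage": stage,
        "one_line_insight": insight,
        "artifacts_written": list(artifacts),
    }
    # Compact separators keep the record on one physical line, as JSONL requires.
    return json.dumps(record, separators=(",", ":"))


def append_insight(path, line):
    """Append one record to the JSONL file, creating parent directories if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

Opening the file in append mode preserves earlier records, and `json.dumps` escapes any newline inside a field, so each run adds exactly one line.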
215
+ ## COPY AUDIT COMPLETE
@@ -47,6 +47,7 @@ Minimum expected files:
47
47
  - `.design/tasks/` - what was actually done (glob all task files)
48
48
  - **Domain-index navigation (Phase 45):** the 7 entry-points `reference/{typography,color,spatial,motion,interaction,responsive,ux-writing}.md` index every fragment below. For a pillar, load the relevant domain index first, then drill into the specific fragments it lists only as the pillar needs them - this is the cheap navigation layer over the detailed fragments.
49
49
  - `reference/audit-scoring.md` - existing 7-category scoring rubric (understand, do not duplicate)
50
+ - `reference/reviewer-confidence-gate.md` - Pre-Report Gate, the `confidence` field, and the routing rule applied to every finding
50
51
  - `reference/brand-voice.md` - voice axes, archetype library, and tone-by-context table (use when auditing Pillar 1: Copy)
51
52
  - `reference/gestalt.md` - 8 Gestalt principles with scoring rubrics (use when auditing Pillar 2: Visual Hierarchy)
52
53
  - `reference/visual-hierarchy-layout.md` - Z-order, whitespace, grids, and reading-order patterns (use when auditing Pillar 2: Visual Hierarchy)
@@ -68,7 +69,9 @@ Minimum expected files:
68
69
 
69
70
  ---
70
71
 
71
- ## 6-Pillar Scoring System
72
+ ## 7-Pillar Scoring System
73
+
74
+ > **Scoring contract: v2** (`scoring_contract_version: v2`) - 7 pillars; Copy deepened in Phase 48 via `reference/copy-quality.md` and `agents/copy-auditor.md`; 8th pillar slot reserved and unscored. `design-verifier` reads the pillar count and looks up slot 7 (Micro-Polish) by name, so do not renumber existing pillars.
72
75
 
73
76
  **Score definitions (1–4 per pillar):**
74
77
 
@@ -83,7 +86,9 @@ Minimum expected files:
83
86
 
84
87
  ### Pillar 1: Copy
85
88
 
86
- **What this measures:** The quality and specificity of text content - button labels, empty states, error messages, headings, and microcopy. Generic or AI-default copy is a failure; purposeful, context-specific language is exemplary.
89
+ **What this measures:** The quality and specificity of text content - button labels, empty states, error messages, loading copy, ARIA strings, alt text, form copy, and voice alignment. Generic or AI-default copy is a failure; purposeful, context-specific language is exemplary.
90
+
91
+ **Detailed rubric:** `reference/copy-quality.md` is the source of truth for this pillar - it holds the per-category criteria (CTAs, errors, empty states, loading/skeleton, ARIA text, alt text, form labels/helper/validation, voice/tone), the failure patterns, the internationalization lens (hardcoded-string probe + `+40%` expansion-overflow check, per Phase 28 i18n), and the canonical 1-4 Scoring Guide table. Read it before scoring Copy. For a deep, evidence-rich Copy pass, the verify stage may spawn `agents/copy-auditor.md`, which scores this pillar against `reference/copy-quality.md` and writes `.design/COPY-AUDIT.md`; when that supplement exists, fold its score and top finding into this pillar rather than re-deriving them. Keep the 1-4 scale below either way.
87
92
 
88
93
  **Audit method:**
89
94
 
@@ -311,6 +316,12 @@ grep -rEn "w-4 h-4|w-5 h-5|w-6 h-6" src/ --include="*.tsx" --include="*.jsx" 2>/
311
316
 
312
317
  ---
313
318
 
319
+ ### Pillar 8: Content Internationalization Integrity (reserved, unscored)
320
+
321
+ **Status: reserved slot - do NOT score.** This is a named placeholder for a future eighth pillar covering localization integrity beyond what the Copy pillar's i18n lens already checks (ICU message correctness, plural and gender rules, locale-aware date/number/currency formatting, RTL mirroring completeness, multi-script font coverage). It is documented here so the slot has a stable name, but it carries no score, no audit method, and no entry in the Pillar Scores table. The audit total stays **/28 (7 pillars × 4)**. When a future phase activates this pillar, the scoring contract version increments and the total moves to /32; until then, treat this section as informational only. The internationalization checks that the current audit performs live inside Pillar 1 (Copy) per `reference/copy-quality.md`.
322
+
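The arithmetic behind the /28 and /32 totals can be sketched as below. This is a hypothetical illustration: `audit_total` and the `pillar_N` keys are placeholders, and `None` is an assumed way to mark a reserved, unscored slot.

```python
MAX_PER_PILLAR = 4  # each scored pillar is rated 1-4


def audit_total(scores):
    """Return (earned, possible) over scored pillars; None marks a reserved slot."""
    scored = [value for value in scores.values() if value is not None]
    for value in scored:
        if not 1 <= value <= MAX_PER_PILLAR:
            raise ValueError(f"pillar score out of range: {value}")
    return sum(scored), len(scored) * MAX_PER_PILLAR


# Seven scored pillars plus the reserved eighth slot.
scores = {f"pillar_{n}": 3 for n in range(1, 8)}
scores["pillar_8"] = None
earned, possible = audit_total(scores)  # possible stays at 7 * 4 = 28
```

If a later phase activates the eighth pillar, giving it a score moves the possible total to 8 × 4 = 32 with no other change to the calculation.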
323
+ ---
324
+
314
325
  ## Domain checklist addendum (Tier-3)
315
326
 
316
327
  If DESIGN-CONTEXT.md carries a `<domain>` line (set by `design-context-builder` Step 0F - `finance` / `healthcare` / `gaming` / `civic`), **also** run that pack's `## Audit checklist` from `reference/domains/<domain>-patterns.md` and fold its findings into the relevant pillar:
@@ -347,6 +358,10 @@ For each of the 7 pillars:
347
358
  3. Assign a score (1–4) with specific evidence
348
359
  4. Identify the top gap for this pillar (one concrete, actionable finding)
349
360
 
361
+ ### Step 3.5: Pre-Report Gate + confidence
362
+
363
+ Before writing any finding into the Priority Fix List or Detailed Findings, run the four-question Pre-Report Gate from `reference/reviewer-confidence-gate.md`: (a) can you cite `file:line`, (b) can you state the failure mode in one sentence, (c) did you read context beyond the matched line, and (d) is the implied severity defensible? Stamp every finding with a `confidence` value (`0.0-1.0`): `>= 0.8` when all four pass, `0.5` up to but not including `0.8` for partial evidence, and `< 0.5` for an unconfirmed pattern match (common for the code-only Visual Hierarchy and Color pillars, where runtime rendering cannot be seen). Move every `< 0.5` finding into a `## Tentative` section instead of the Priority Fix List, so a low-confidence guess never escalates to remediation. Confidence is independent of the 1-4 pillar scores and does not change them.
364
+
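The routing rule in this step can be sketched as a small filter. This is a minimal illustration, assuming the thresholds stated above: `Finding`, `confidence_band`, `route_findings`, and the band labels are hypothetical and do not come from `reference/reviewer-confidence-gate.md`.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    pillar: str
    issue: str
    confidence: float  # 0.0-1.0, stamped after the Pre-Report Gate


def confidence_band(confidence: float) -> str:
    """Map a confidence value to one of the three bands described above."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence >= 0.8:
        return "confirmed"   # all four gate questions pass
    if confidence >= 0.5:
        return "partial"     # partial evidence
    return "tentative"       # unconfirmed pattern match


def route_findings(findings):
    """Split findings into (priority_candidates, tentative) at the 0.5 threshold."""
    priority = [f for f in findings if f.confidence >= 0.5]
    tentative = [f for f in findings if f.confidence < 0.5]
    return priority, tentative
```

The sketch only separates the two lists; ranking the priority candidates by impact and keeping the top three remains a separate step.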
350
365
  ### Step 4: Write DESIGN-AUDIT.md
351
366
 
352
367
  Write `.design/DESIGN-AUDIT.md` using the output format below.
@@ -404,11 +419,19 @@ supplement_note: "Supplements 7-category 0-10 system in reference/audit-scoring.
404
419
 
405
420
  ## Priority Fix List
406
421
 
407
- Listed by impact. Top 3 fixes the verifier should weight heavily.
422
+ Listed by impact. Top 3 fixes the verifier should weight heavily. Each finding carries a `confidence` value (see `reference/reviewer-confidence-gate.md`); findings below `0.5` go in `## Tentative`, not here.
423
+
424
+ 1. **[Pillar N: specific issue]** (confidence: [0.0-1.0]) - [user impact] - [concrete fix with file reference]
425
+ 2. **[Pillar N: specific issue]** (confidence: [0.0-1.0]) - [user impact] - [concrete fix with file reference]
426
+ 3. **[Pillar N: specific issue]** (confidence: [0.0-1.0]) - [user impact] - [concrete fix with file reference]
427
+
428
+ ---
429
+
430
+ ## Tentative
431
+
432
+ Low-confidence findings (`confidence < 0.5`, per `reference/reviewer-confidence-gate.md`): pattern matches not confirmed by reading context, or runtime-only concerns the code-only pass cannot verify. Surfaced for human review; never auto-escalated to design-fixer.
408
433
 
409
- 1. **[Pillar N — specific issue]** [user impact] [concrete fix with file reference]
410
- 2. **[Pillar N — specific issue]** — [user impact] — [concrete fix with file reference]
411
- 3. **[Pillar N — specific issue]** — [user impact] — [concrete fix with file reference]
434
+ - [Pillar N: finding] (confidence: [N], unconfirmed because [reason])
412
435
 
413
436
  ---
414
437
 
@@ -459,7 +482,7 @@ This audit is **code-only**. No Playwright-MCP and no dev server screenshot capt
459
482
 
460
483
  ## Motion Anti-Pattern Check
461
484
 
462
- When the codebase uses Framer Motion (detectable by `import.*framer-motion` in source files), perform this additional check after the 6-pillar audit and include findings in **Pillar 6: Experience Design** under a `### Motion (Framer Motion)` subsection.
485
+ When the codebase uses Framer Motion (detectable by `import.*framer-motion` in source files), perform this additional check after the 7-pillar audit and include findings in **Pillar 6: Experience Design** under a `### Motion (Framer Motion)` subsection.
463
486
 
464
487
  Read `reference/framer-motion-patterns.md` for the full rationale behind these rules. The two hard violations to surface:
465
488
 
@@ -561,6 +561,8 @@ Iterate until the user confirms. Then write the artifact.
561
561
 
562
562
  ## Output: .design/DESIGN-CONTEXT.md
563
563
 
564
+ Before writing any `.design/` artifact, resolve the main repo root via `scripts/lib/worktree-resolve.cjs` (`resolveDesignRoot`) so that a run started inside a worktree writes to the main checkout instead of leaking artifacts into the worktree.
565
+
564
566
  Create `.design/` directory if needed. Write `.design/DESIGN-CONTEXT.md`:
565
567
 
566
568
  ```markdown