@hegemonart/get-design-done 1.48.0 → 1.49.0
- package/.claude-plugin/marketplace.json +2 -2
- package/.claude-plugin/plugin.json +5 -2
- package/CHANGELOG.md +44 -0
- package/README.md +2 -0
- package/agents/design-auditor.md +17 -4
- package/agents/design-context-builder.md +2 -0
- package/agents/design-debt-crawler.md +28 -5
- package/agents/design-executor.md +2 -0
- package/agents/design-fixer.md +4 -1
- package/agents/design-planner.md +2 -0
- package/agents/design-reflector.md +2 -0
- package/agents/design-research-synthesizer.md +2 -0
- package/agents/design-verifier.md +7 -15
- package/hooks/gdd-design-quality-check.js +340 -0
- package/hooks/hooks.json +9 -0
- package/package.json +5 -2
- package/reference/registry.json +14 -0
- package/reference/reviewer-confidence-gate.md +108 -0
- package/reference/visual-tells.md +237 -0
- package/scripts/lib/confidence-route.cjs +60 -0
- package/scripts/lib/worktree-resolve.cjs +221 -0
- package/sdk/mcp/gdd-state/server.js +37 -4
- package/sdk/mcp/gdd-state/tools/shared.ts +61 -0
package/.claude-plugin/marketplace.json
CHANGED
|
@@ -5,14 +5,14 @@
|
|
|
5
5
|
},
|
|
6
6
|
"metadata": {
|
|
7
7
|
"description": "Get Design Done — 5-stage agent-orchestrated design pipeline with 9 connections, handoff-first workflow, bidirectional Figma write-back, 22+ specialized agents, queryable knowledge layer (intel store, dependency analysis, learnings extraction), and a self-improvement loop (reflector, frontmatter + budget feedback, global-skills layer). v1.20.0 ships the SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream, and resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) for rate-limit + 429 + context-overflow recovery. Full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows) and release automation (auto-tag + GitHub Release + release-time smoke test).",
|
|
8
|
-
"version": "1.
|
|
8
|
+
"version": "1.49.0"
|
|
9
9
|
},
|
|
10
10
|
"plugins": [
|
|
11
11
|
{
|
|
12
12
|
"name": "get-design-done",
|
|
13
13
|
"source": "./",
|
|
14
14
|
"description": "Agent-orchestrated 5-stage design pipeline: Brief → Explore → Plan → Design → Verify. 22+ specialized agents, 9 connections (Figma, Refero, Preview, Storybook, Chromatic, Figma Writer, Graphify, Pinterest, Claude Design), Claude Design handoff, bidirectional Figma write-back, and a queryable intel store (.design/intel/) for dependency and learnings queries. Standalone commands: style, darkmode, compare, figma-write, graphify, handoff, analyze-dependencies, skill-manifest, extract-learnings. Embeds NNG heuristics, WCAG thresholds, typographic systems, motion framework, and anti-pattern catalog. Ships with a full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows) and release automation. Optimization layer (v1.0.4.1, retroactive): gdd-router + gdd-cache-manager skills, PreToolUse budget-enforcer hook, tier-aware agent frontmatter, lazy checker gates, streaming synthesizer, /gdd:warm-cache + /gdd:optimize commands, and cost telemetry at .design/telemetry/costs.jsonl — targeting 50-70% per-task token-cost reduction with no quality-floor regression. v1.20.0 SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream at .design/telemetry/events.jsonl, resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) with rate-limit + 429 + context-overflow recovery, and TypeScript toolchain.",
|
|
15
|
-
"version": "1.
|
|
15
|
+
"version": "1.49.0",
|
|
16
16
|
"author": {
|
|
17
17
|
"name": "hegemonart"
|
|
18
18
|
},
|
package/.claude-plugin/plugin.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "get-design-done",
|
|
3
3
|
"short_name": "gdd",
|
|
4
|
-
"version": "1.
|
|
4
|
+
"version": "1.49.0",
|
|
5
5
|
"description": "Agent-orchestrated 5-stage design pipeline: Brief → Explore → Plan → Design → Verify. 22+ specialized agents, 9 connections (Figma, Refero, Preview, Storybook, Chromatic, Figma Writer, Graphify, Pinterest, Claude Design), handoff-first workflow via Claude Design bundles, bidirectional Figma write-back (annotations, Code Connect), queryable intel store (`.design/intel/`) for O(1) design surface lookups, and self-improvement loop (reflector agent, frontmatter + budget feedback, global-skills layer at `~/.claude/gdd/global-skills/`). Standalone commands: style, darkmode, compare, figma-write, graphify, handoff, analyze-dependencies, skill-manifest, extract-learnings, reflect, apply-reflections. Embeds NNG heuristics, WCAG thresholds, typographic systems, motion framework, and anti-pattern catalog. Ships with a full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows, lint + schema + frontmatter + stale-ref + shellcheck + gitleaks + injection-scan + blocking size-budget) and release automation (auto-tag + GitHub Release + release-time smoke test). Optimization layer (v1.0.4.1, retroactive): gdd-router + gdd-cache-manager skills, PreToolUse budget-enforcer hook, tier-aware agent frontmatter, lazy checker gates, streaming synthesizer, /gdd:warm-cache + /gdd:optimize commands, and cost telemetry at .design/telemetry/costs.jsonl — targeting 50-70% per-task token-cost reduction with no quality-floor regression. v1.20.0 SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream at .design/telemetry/events.jsonl, resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) with rate-limit + 429 + context-overflow recovery, and TypeScript toolchain. v1.27.7 ships gdd-mcp (Phase 27.7): 12 read-only MCP tools for sub-3s priming. v1.28.0 (Phase 28): Foundational References Tier 2 — 5 new reference files (color-theory, composition, proportion-systems, i18n, contrast-advanced), 2 verifier i18n probes + 1 explore i18n-readiness probe, 12 additive cross-link insertions across 10 existing references, 2 orthogonal audit-scoring lens-tags (composition_alignment + i18n_readiness).",
|
|
6
6
|
"author": {
|
|
7
7
|
"name": "hegemonart",
|
|
@@ -71,7 +71,10 @@
|
|
|
71
71
|
"flutter",
|
|
72
72
|
"email",
|
|
73
73
|
"print",
|
|
74
|
-
"pdf"
|
|
74
|
+
"pdf",
|
|
75
|
+
"worktree-safe",
|
|
76
|
+
"anti-slop",
|
|
77
|
+
"confidence-gate"
|
|
75
78
|
],
|
|
76
79
|
"skills": [
|
|
77
80
|
"./skills/"
|
package/CHANGELOG.md
CHANGED
|
@@ -4,6 +4,50 @@ All notable changes to get-design-done are documented here. Versions follow [sem
|
|
|
4
4
|
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
+
## [1.49.0] - 2026-06-03
|
|
8
|
+
|
|
9
|
+
### Phase 49 - Quick Anti-Slop Floor
|
|
10
|
+
|
|
11
|
+
Three small, atomic safety and policy primitives identified in the cross-repo synthesis, each low-risk and
|
|
12
|
+
high-signal: a worktree redirect that ends the recurring `.planning/` leak, a free anti-slop regex pass on every
|
|
13
|
+
front-end file write, and a reviewer confidence gate that stops severity inflation. Planned and executed via the
|
|
14
|
+
GSD pipeline (3 parallel executor subagents). No new runtime dependency, no new egress.
|
|
15
|
+
|
|
16
|
+
### Breaking changes
|
|
17
|
+
|
|
18
|
+
- **`.design/` and `.planning/` writes redirect to the main repo root inside a git worktree.** `scripts/lib/worktree-resolve.cjs`
|
|
19
|
+
detects a worktree (`git rev-parse --git-dir` vs `--git-common-dir`) and the gdd-state write path (`resolveStatePath`,
|
|
20
|
+
used by all 11 state tools) now resolves STATE there, with a one-line stderr notice. Outside a worktree, behavior is
|
|
21
|
+
unchanged. Tooling that assumed `.design/` always lived under `process.cwd()` should resolve through the helper.
|
|
22
|
+
- **Findings now carry a `confidence` field and design-fixer filters on it.** design-auditor, design-verifier, and
|
|
23
|
+
design-debt-crawler emit `confidence: 0.0-1.0` per finding; design-fixer drops `## Tentative` findings and routes
|
|
24
|
+
BLOCKER/MAJOR findings below 0.8 confidence to user review instead of auto-fix. Consumers of these findings should
|
|
25
|
+
read the new field.
|
|
26
|
+
|
|
27
|
+
### Added
|
|
28
|
+
|
|
29
|
+
- **`scripts/lib/worktree-resolve.cjs`** (resolveRepoRoot / isWorktree / resolveDesignRoot / resolvePlanningRoot;
|
|
30
|
+
graceful fallback, injectable exec) wired into the state write path + a one-line worktree note in the 7
|
|
31
|
+
artifact-writer agents.
|
|
32
|
+
- **`hooks/gdd-design-quality-check.js`**: an advisory PostToolUse hook scanning `Write`/`Edit`/`MultiEdit` to
|
|
33
|
+
`.tsx`/`.vue`/`.svelte`/`.astro` for 8 default-AI-aesthetic tells (gradient spam, generic CTAs, centered-everything,
|
|
34
|
+
font-inter default, purple/violet default, glassmorphism spam, isometric fallback, decorative motion). WARN-only,
|
|
35
|
+
emits a `design_quality_warn` event. Catalogued in **`reference/visual-tells.md`** (8 named categories with diagnostic
|
|
36
|
+
regex + remediation).
|
|
37
|
+
- **Reviewer confidence gate**: a 4-question Pre-Report Gate + the `confidence` field across the three audit agents,
|
|
38
|
+
a `scripts/lib/confidence-route.cjs` routing helper (`fix` / `user-review` / `drop`), and
|
|
39
|
+
**`reference/reviewer-confidence-gate.md`** (template + rationale + 4 before/after examples).
|
|
40
|
+
|
|
41
|
+
### Notes
|
|
42
|
+
|
|
43
|
+
- 6-manifest lockstep at **v1.49.0** + `OFF_CADENCE_VERSIONS.add('1.49.0')` + 37 `manifests-version.txt` baselines +
|
|
44
|
+
plugin keywords (`worktree-safe`, `anti-slop`, `confidence-gate`). Baselines re-locked: hook-list (19),
|
|
45
|
+
resilience-primitives (39 `scripts/lib/*.cjs`), registry (173), tarball golden 902 -> 907 (+5).
|
|
46
|
+
- WARN-only hook (never blocks); auto-fix of matched tells is out of scope (proposal-only); the verb-based anti-slop
|
|
47
|
+
rubric and a wider tell catalog are deferred to Phase 50.
|
|
48
|
+
|
|
49
|
+
---
|
|
50
|
+
|
|
7
51
|
## [1.48.0] - 2026-06-03
|
|
8
52
|
|
|
9
53
|
### Phase 48 - Audit & Pillar Expansion
|
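Reviewer note on the worktree redirect in the 1.49.0 entry above: the detection it describes compares `git rev-parse --git-dir` with `git rev-parse --git-common-dir`. The two resolve to the same directory in a main checkout and differ in a linked worktree, where the common dir is the main checkout's `.git`. The sketch below illustrates that idea only. It is not the shipped `scripts/lib/worktree-resolve.cjs`: the function bodies, the return shape, and the `exec` parameter signature are assumptions, and the stderr notice mentioned in the changelog is left out.

```javascript
const { execFileSync } = require('node:child_process');
const path = require('node:path');

// Default runner: asks git one rev-parse question and returns trimmed stdout.
function gitExec(args, cwd) {
  return execFileSync('git', args, {
    cwd,
    encoding: 'utf8',
    stdio: ['ignore', 'pipe', 'ignore'],
  }).trim();
}

// Hypothetical helper: reports whether cwd is a linked worktree and which
// directory should hold .design/ and .planning/. `exec` is injectable for tests.
function resolveRepoRoot(cwd = process.cwd(), exec = gitExec) {
  try {
    const gitDir = path.resolve(cwd, exec(['rev-parse', '--git-dir'], cwd));
    const commonDir = path.resolve(cwd, exec(['rev-parse', '--git-common-dir'], cwd));
    if (gitDir === commonDir) {
      // Main checkout: keep the pre-1.49.0 behavior (paths relative to cwd).
      return { worktree: false, root: cwd };
    }
    // Linked worktree: the common dir is <main checkout>/.git.
    return { worktree: true, root: path.dirname(commonDir) };
  } catch {
    // Graceful fallback: git missing or not a repository.
    return { worktree: false, root: cwd };
  }
}

function resolveDesignRoot(cwd, exec) {
  return path.join(resolveRepoRoot(cwd, exec).root, '.design');
}
```

Passing a fake `exec` lets the worktree branch be exercised without creating a real worktree, which is presumably why the changelog calls out "injectable exec".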
package/README.md
CHANGED
|
@@ -257,6 +257,8 @@ All 14 runtimes receive their native artifact layout (`skills/`, `command/`, `ag
|
|
|
257
257
|
|
|
258
258
|
**Audit and pillar expansion (v1.48.0).** Four audit-side gaps close at once. The copy pillar gets a real rubric (`reference/copy-quality.md` + `copy-auditor`): microcopy, error and empty-state text, ARIA and alt text, voice alignment, with an i18n overflow lens. A project-wide `design-debt-crawler` walks an existing codebase (not just the current cycle), enumerates raw color literals, anti-patterns, untokenized components, and contrast/density issues, and writes a priority-scored `.design/debt/DEBT-CATALOG.md`. A `brief-auditor` grades the brief against five anti-patterns (vague verbs, missing audience, immeasurable success criteria, scope creep, missing anti-goals) and surfaces a non-blocking `/gdd:discuss brief` pointer. And the Stage 4.5 quality-gate gains an `a11y` failure class so `axe` / `pa11y` / `lighthouse` regressions route to `design-fixer` like any other gate failure. **No new runtime dependency.**
|
|
259
259
|
|
|
260
|
+
**Quick anti-slop floor (v1.49.0).** Three small safety primitives. A worktree redirect (`scripts/lib/worktree-resolve.cjs`) sends `.design/` and `.planning/` writes to the main repo root when GDD runs inside a git worktree, so artifacts never leak into an ephemeral checkout. A design-quality PostToolUse hook (`gdd-design-quality-check.js`) runs a free regex pass on every `.tsx`/`.vue`/`.svelte`/`.astro` write and warns on eight default-AI-aesthetic tells (gradient spam, generic CTAs, centered-everything, font-inter defaults, purple/violet defaults, glassmorphism spam, isometric fallbacks, decorative motion), catalogued in `reference/visual-tells.md`. And a reviewer confidence gate adds a `confidence: 0.0-1.0` field plus a 4-question Pre-Report Gate to every audit finding: HIGH and CRITICAL findings need at least 0.8 confidence and cited proof, low-confidence findings stay tentative and never reach `design-fixer`. The hook is WARN-only and there is **no new runtime dependency**.
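Reviewer note: the "free regex pass" in the paragraph above can be pictured roughly as follows. This is a sketch under stated assumptions, not the contents of `hooks/gdd-design-quality-check.js`. The category names and the `design_quality_warn` event name come from the text; the three regexes, the thresholds, and the `scanForTells` function are hypothetical stand-ins for the catalog in `reference/visual-tells.md`, which this diff excerpt does not show.

```javascript
const path = require('node:path');

// Extensions the README says the hook scans.
const FRONTEND_EXTENSIONS = new Set(['.tsx', '.vue', '.svelte', '.astro']);

// Hypothetical patterns for three of the eight categories (illustration only).
const TELLS = [
  { category: 'gradient spam', pattern: /bg-gradient-to-\w+/g, threshold: 3 },
  { category: 'font-inter default', pattern: /font-family:\s*['"]?Inter\b/gi, threshold: 1 },
  { category: 'generic CTAs', pattern: />\s*(Get Started|Learn More)\s*</gi, threshold: 1 },
];

// Returns advisory warnings for one written file. It only reports:
// the caller decides how to surface them, and nothing here blocks the write.
function scanForTells(filePath, content) {
  if (!FRONTEND_EXTENSIONS.has(path.extname(filePath))) return [];
  const warnings = [];
  for (const { category, pattern, threshold } of TELLS) {
    const hits = (content.match(pattern) || []).length;
    if (hits >= threshold) {
      warnings.push({ event: 'design_quality_warn', category, hits });
    }
  }
  return warnings;
}
```

Because the function returns data instead of throwing, a caller can log the warnings and still exit successfully, which matches the WARN-only behavior the README describes.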
|
|
261
|
+
|
|
260
262
|
Verify with:
|
|
261
263
|
|
|
262
264
|
```
|
package/agents/design-auditor.md
CHANGED
|
@@ -47,6 +47,7 @@ Minimum expected files:
|
|
|
47
47
|
- `.design/tasks/` - what was actually done (glob all task files)
|
|
48
48
|
- **Domain-index navigation (Phase 45):** the 7 entry-points `reference/{typography,color,spatial,motion,interaction,responsive,ux-writing}.md` index every fragment below. For a pillar, load the relevant domain index first, then drill into the specific fragments it lists only as the pillar needs them - this is the cheap navigation layer over the detailed fragments.
|
|
49
49
|
- `reference/audit-scoring.md` - existing 7-category scoring rubric (understand, do not duplicate)
|
|
50
|
+
- `reference/reviewer-confidence-gate.md` - Pre-Report Gate, the `confidence` field, and the routing rule applied to every finding
|
|
50
51
|
- `reference/brand-voice.md` - voice axes, archetype library, and tone-by-context table (use when auditing Pillar 1: Copy)
|
|
51
52
|
- `reference/gestalt.md` - 8 Gestalt principles with scoring rubrics (use when auditing Pillar 2: Visual Hierarchy)
|
|
52
53
|
- `reference/visual-hierarchy-layout.md` - Z-order, whitespace, grids, and reading-order patterns (use when auditing Pillar 2: Visual Hierarchy)
|
|
@@ -357,6 +358,10 @@ For each of the 7 pillars:
|
|
|
357
358
|
3. Assign a score (1–4) with specific evidence
|
|
358
359
|
4. Identify the top gap for this pillar (one concrete, actionable finding)
|
|
359
360
|
|
|
361
|
+
### Step 3.5: Pre-Report Gate + confidence
|
|
362
|
+
|
|
363
|
+
Before writing any finding into the Priority Fix List or Detailed Findings, run the four-question Pre-Report Gate from `reference/reviewer-confidence-gate.md`: (a) can you cite `file:line`, (b) can you state the failure mode in one sentence, (c) did you read context beyond the matched line, (d) is the implied severity defensible? Stamp every priority-fix finding with a `confidence` value (`0.0-1.0`): `>= 0.8` when all four pass, `0.5-0.8` for partial evidence, `< 0.5` for an unconfirmed pattern match (common for the code-only Visual Hierarchy and Color pillars, where runtime cannot be seen). Move every `< 0.5` finding into a `## Tentative` section instead of the Priority Fix List, so a low-confidence guess never escalates to remediation. Confidence is independent of the 1-4 pillar scores and does not change them.
|
|
364
|
+
|
|
360
365
|
### Step 4: Write DESIGN-AUDIT.md
|
|
361
366
|
|
|
362
367
|
Write `.design/DESIGN-AUDIT.md` using the output format below.
|
|
@@ -414,11 +419,19 @@ supplement_note: "Supplements 7-category 0-10 system in reference/audit-scoring.
|
|
|
414
419
|
|
|
415
420
|
## Priority Fix List
|
|
416
421
|
|
|
417
|
-
Listed by impact. Top 3 fixes the verifier should weight heavily.
|
|
422
|
+
Listed by impact. Top 3 fixes the verifier should weight heavily. Each finding carries a `confidence` value (see `reference/reviewer-confidence-gate.md`); findings below `0.5` go in `## Tentative`, not here.
|
|
423
|
+
|
|
424
|
+
1. **[Pillar N: specific issue]** (confidence: [0.0-1.0]) [user impact] [concrete fix with file reference]
|
|
425
|
+
2. **[Pillar N: specific issue]** (confidence: [0.0-1.0]) [user impact] [concrete fix with file reference]
|
|
426
|
+
3. **[Pillar N: specific issue]** (confidence: [0.0-1.0]) [user impact] [concrete fix with file reference]
|
|
427
|
+
|
|
428
|
+
---
|
|
429
|
+
|
|
430
|
+
## Tentative
|
|
431
|
+
|
|
432
|
+
Low-confidence findings (`confidence < 0.5`, per `reference/reviewer-confidence-gate.md`): pattern matches not confirmed by reading context, or runtime-only concerns the code-only pass cannot verify. Surfaced for human review; never auto-escalated to design-fixer.
|
|
418
433
|
|
|
419
|
-
|
|
420
|
-
2. **[Pillar N — specific issue]** — [user impact] — [concrete fix with file reference]
|
|
421
|
-
3. **[Pillar N — specific issue]** — [user impact] — [concrete fix with file reference]
|
|
434
|
+
- [Pillar N: finding] (confidence: [N], unconfirmed because [reason])
|
|
422
435
|
|
|
423
436
|
---
|
|
424
437
|
|
|
package/agents/design-context-builder.md
CHANGED
|
@@ -561,6 +561,8 @@ Iterate until the user confirms. Then write the artifact.
|
|
|
561
561
|
|
|
562
562
|
## Output: .design/DESIGN-CONTEXT.md
|
|
563
563
|
|
|
564
|
+
Before writing any `.design/` artifact, resolve the main repo root via `scripts/lib/worktree-resolve.cjs` (`resolveDesignRoot`) so a worktree run writes to the main checkout and does not leak.
|
|
565
|
+
|
|
564
566
|
Create `.design/` directory if needed. Write `.design/DESIGN-CONTEXT.md`:
|
|
565
567
|
|
|
566
568
|
```markdown
|
package/agents/design-debt-crawler.md
CHANGED
|
@@ -60,6 +60,7 @@ listed file before acting. Minimum expected files:
|
|
|
60
60
|
|
|
61
61
|
- @reference/debt-categories.md
|
|
62
62
|
- @reference/anti-patterns.md
|
|
63
|
+
- @reference/reviewer-confidence-gate.md
|
|
63
64
|
|
|
64
65
|
`reference/debt-categories.md` is the taxonomy you classify against and the source of
|
|
65
66
|
the priority-scoring model. `reference/anti-patterns.md` is the BAN-NN and SLOP-NN
|
|
@@ -157,6 +158,19 @@ grep -rEn "No data|No results|Nothing here|went wrong|error occurred" src/ \
|
|
|
157
158
|
Flag meaningful images without `alt`, icon-only controls without an accessible name,
|
|
158
159
|
placeholder used as the only label, and generic empty or error copy.
|
|
159
160
|
|
|
161
|
+
### Step 2.5: Pre-Report Gate + confidence
|
|
162
|
+
|
|
163
|
+
Before cataloging any finding, run the four-question Pre-Report Gate from
|
|
164
|
+
`reference/reviewer-confidence-gate.md`: (a) can you cite `file:line`, (b) can you state the
|
|
165
|
+
failure mode in one sentence, (c) did you read context beyond the matched line (the token
|
|
166
|
+
definition, the call site), and (d) is the class assignment defensible? Stamp every catalog
|
|
167
|
+
row with a `confidence` value (`0.0-1.0`): `>= 0.8` when all four pass, `0.5-0.8` when evidence
|
|
168
|
+
is partial, `< 0.5` for a pattern match you could not confirm (for example an unresolved
|
|
169
|
+
contrast pair or a literal that may be inside a token definition). Move every `< 0.5` finding
|
|
170
|
+
into a `## Tentative` section instead of the ranked findings table, so a low-confidence guess
|
|
171
|
+
never escalates to remediation. Confidence is independent of priority: a high-priority debt
|
|
172
|
+
item can still be low confidence and belongs in `## Tentative` until confirmed.
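Reviewer note: the split this step asks for (ranked table versus `## Tentative`) is a partition on the `0.5` threshold, with priority used only to order the rows that survive. A minimal sketch, assuming a hypothetical finding shape of `{ location, finding, priority, confidence }`; the `partitionFindings` name and the handling of a missing `confidence` are assumptions of this note, not part of the agent prompt.

```javascript
// Threshold quoted in the prompt: findings below 0.5 go to "## Tentative".
const TENTATIVE_BELOW = 0.5;

function partitionFindings(findings) {
  const ranked = [];
  const tentative = [];
  for (const f of findings) {
    // Assumption: a missing or non-numeric confidence counts as unconfirmed.
    const confirmed = typeof f.confidence === 'number' && f.confidence >= TENTATIVE_BELOW;
    (confirmed ? ranked : tentative).push(f);
  }
  // Priority orders the ranked table only; it never rescues a low-confidence row.
  ranked.sort((a, b) => b.priority - a.priority);
  return { ranked, tentative };
}
```

A finding with priority 27 and confidence 0.3 therefore lands in `tentative` even though it would outrank every row in the table, which is the "confidence is independent of priority" rule in the step above.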
|
|
173
|
+
|
|
160
174
|
### Step 3: Group and score
|
|
161
175
|
|
|
162
176
|
Group findings by the seven debt classes. For each finding, assign the three priority
|
|
@@ -217,12 +231,21 @@ note: "Project-scoped retroactive debt catalog. Does NOT read STATE.md completed
|
|
|
217
231
|
|
|
218
232
|
## Findings (ranked by priority)
|
|
219
233
|
|
|
220
|
-
| Priority | Class | Location | Finding | V × E × P | Suggested command |
|
|
221
|
-
|
|
222
|
-
| 18 | color-literal | src/Card.tsx:42 | Raw #1a73e8 instead of token | 3×3×2 | `/gdd:fast "replace #1a73e8 with semantic token in Card.tsx"` |
|
|
223
|
-
| 12 | anti-pattern | src/Hero.tsx:8 | BAN-02 gradient text on heading | 3×2×2 | `/gdd:fast "remove BAN-02 gradient text in Hero.tsx"` |
|
|
234
|
+
| Priority | Class | Location | Finding | V × E × P | Confidence | Suggested command |
|
|
235
|
+
|----------|-------|----------|---------|-----------|------------|-------------------|
|
|
236
|
+
| 18 | color-literal | src/Card.tsx:42 | Raw #1a73e8 instead of token | 3×3×2 | 0.9 | `/gdd:fast "replace #1a73e8 with semantic token in Card.tsx"` |
|
|
237
|
+
| 12 | anti-pattern | src/Hero.tsx:8 | BAN-02 gradient text on heading | 3×2×2 | 0.85 | `/gdd:fast "remove BAN-02 gradient text in Hero.tsx"` |
|
|
238
|
+
|
|
239
|
+
(One row per finding with `confidence >= 0.5`. The Suggested command column always carries a `/gdd:fast "<finding>"` string. Findings below `0.5` go in `## Tentative` below, not in this table.)
|
|
240
|
+
|
|
241
|
+
---
|
|
242
|
+
|
|
243
|
+
## Tentative
|
|
244
|
+
|
|
245
|
+
Findings with `confidence < 0.5` (pattern matches not confirmed by reading context, per
|
|
246
|
+
`reference/reviewer-confidence-gate.md`). Listed for human review; never auto-escalated.
|
|
224
247
|
|
|
225
|
-
|
|
248
|
+
- [class] [location]: [finding] (confidence: [N], unconfirmed because [reason])
|
|
226
249
|
|
|
227
250
|
---
|
|
228
251
|
|
|
package/agents/design-executor.md
CHANGED
|
@@ -395,6 +395,8 @@ Apply these rules automatically during execution. Track all deviations in the ta
|
|
|
395
395
|
|
|
396
396
|
## Task Output - .design/tasks/task-NN.md
|
|
397
397
|
|
|
398
|
+
Before writing any `.design/` artifact, resolve the main repo root via `scripts/lib/worktree-resolve.cjs` (`resolveDesignRoot`) so a worktree run writes to the main checkout and does not leak.
|
|
399
|
+
|
|
398
400
|
After completing the task's implementation work, write `.design/tasks/task-NN.md` (where NN = task_id from prompt context). Create `.design/tasks/` directory first if it does not exist.
|
|
399
401
|
|
|
400
402
|
Format (locked - do not alter structure):
|
package/agents/design-fixer.md
CHANGED
|
@@ -48,6 +48,8 @@ The orchestrating stage supplies a `<required_reading>` block in the prompt. Rea
|
|
|
48
48
|
|
|
49
49
|
**Invariant:** read all listed files FIRST, before making any changes.
|
|
50
50
|
|
|
51
|
+
**Worktree-root invariant:** before writing any `.design/` artifact (for example a `<blocker>` entry to `.design/STATE.md`), resolve the main repo root via `scripts/lib/worktree-resolve.cjs` so a worktree run writes to the canonical `.design/` and does not leak artifacts into the worktree checkout.
|
|
52
|
+
|
|
51
53
|
---
|
|
52
54
|
|
|
53
55
|
## Prompt Context Fields
|
|
@@ -88,7 +90,8 @@ Parse every entry in that section. The `G-NN` identifier, severity classificatio
|
|
|
88
90
|
4. Filter by severity based on `auto_mode`:
|
|
89
91
|
- Always include: `BLOCKER`, `MAJOR`
|
|
90
92
|
- Include only if `auto_mode=true`: `MINOR`, `COSMETIC`
|
|
91
|
-
5.
|
|
93
|
+
5. **Confidence routing filter (Phase 49, see `reference/reviewer-confidence-gate.md`).** Drop any gap that sits under a `## Tentative` heading: those never reach you. Then drop any `BLOCKER` or `MAJOR` gap whose `confidence` field is below `0.8` and route it to user review instead of auto-fix, since a high-severity gap without strong evidence is exactly the inflated-severity case the gate exists to catch. A gap missing its `confidence` field is treated as below the floor. The shared decision lives in `scripts/lib/confidence-route.cjs` (`route({ severity, confidence, tentative })` returns `'fix' | 'user-review' | 'drop'`); fix only the gaps it routes to `'fix'`.
|
|
94
|
+
6. Build an ordered list: BLOCKER first, then MAJOR, then (if included) MINOR, COSMETIC.
|
|
92
95
|
|
|
93
96
|
If no in-scope gaps are found (e.g., verifier found only MINOR gaps and `auto_mode=false`), emit `## FIX COMPLETE` immediately with "No in-scope gaps to fix."
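Reviewer note: the confidence routing filter in step 5 reduces to a small decision function. The sketch below follows the rule as worded in the prompt (Tentative gaps are dropped, `BLOCKER`/`MAJOR` below `0.8` go to user review, a missing `confidence` counts as below the floor). It is not the shipped `scripts/lib/confidence-route.cjs`. The name `routeFinding` is a hypothetical stand-in for `route`, and letting `MINOR`/`COSMETIC` gaps through regardless of confidence is an assumption, since the prompt states the floor only for the two high severities.

```javascript
const HIGH_SEVERITY = new Set(['BLOCKER', 'MAJOR']);
const HIGH_SEVERITY_FLOOR = 0.8;

// Returns 'fix' | 'user-review' | 'drop' for a single gap.
function routeFinding({ severity, confidence, tentative = false }) {
  // Anything under a "## Tentative" heading never reaches the fixer.
  if (tentative) return 'drop';
  // A missing confidence field is treated as below the floor.
  const score = typeof confidence === 'number' ? confidence : 0;
  if (HIGH_SEVERITY.has(severity) && score < HIGH_SEVERITY_FLOOR) {
    return 'user-review';
  }
  return 'fix';
}
```

The `auto_mode` severity filter from step 4 is a separate concern and would run before this function in the fixer's pipeline.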
|
|
94
97
|
|
package/agents/design-planner.md
CHANGED
|
@@ -227,6 +227,8 @@ Before finalizing task list:
|
|
|
227
227
|
|
|
228
228
|
## Output Format
|
|
229
229
|
|
|
230
|
+
Before writing any `.design/` artifact, resolve the main repo root via `scripts/lib/worktree-resolve.cjs` (`resolveDesignRoot`) so a worktree run writes to the main checkout and does not leak.
|
|
231
|
+
|
|
230
232
|
Write `.design/DESIGN-PLAN.md` with this exact structure:
|
|
231
233
|
|
|
232
234
|
```markdown
|
package/agents/design-reflector.md
CHANGED
|
@@ -62,6 +62,8 @@ Minimum expected inputs (skip gracefully if absent, note what's missing):
|
|
|
62
62
|
|
|
63
63
|
## Output
|
|
64
64
|
|
|
65
|
+
Before writing any `.design/` artifact, resolve the main repo root via `scripts/lib/worktree-resolve.cjs` (`resolveDesignRoot`) so a worktree run writes to the main checkout and does not leak.
|
|
66
|
+
|
|
65
67
|
Write `.design/reflections/<cycle-slug>.md`. If `--dry-run` is set in the spawning prompt, print proposals to stdout only - do not write the file.
|
|
66
68
|
|
|
67
69
|
If the capability-gap pattern scan emitted any events during this run, include a `## Capability gaps emitted` heading listing each `event_id` with the source signal kind (`intel` | `posterior` | `trajectory`) and the `suggested_kind` (`agent` | `skill`) per event. Plan 29-03 reads these events from `.design/gep/events.jsonl` to cluster recurring `capability_gap` events for `/gdd:apply-reflections`.
|
package/agents/design-research-synthesizer.md
CHANGED
|
@@ -161,6 +161,8 @@ Read .design/STATE.md
|
|
|
161
161
|
|
|
162
162
|
## Output
|
|
163
163
|
|
|
164
|
+
Before writing any `.design/` artifact, resolve the main repo root via `scripts/lib/worktree-resolve.cjs` (`resolveDesignRoot`) so a worktree run writes to the main checkout and does not leak.
|
|
165
|
+
|
|
164
166
|
Single file: `.design/DESIGN-CONTEXT.md`.
|
|
165
167
|
|
|
166
168
|
## Record
|
package/agents/design-verifier.md
CHANGED
|
@@ -33,6 +33,7 @@ The orchestrating stage supplies a `<required_reading>` block in the prompt. Rea
|
|
|
33
33
|
- `.design/DESIGN-CONTEXT.md` - goals, must-haves, brand direction, references
|
|
34
34
|
- `.design/tasks/` - what was actually done (glob all task files)
|
|
35
35
|
- `reference/audit-scoring.md` - scoring rubric for category weights
|
|
36
|
+
- `reference/reviewer-confidence-gate.md` - Pre-Report Gate, the `confidence` field, and the gap routing rule
|
|
36
37
|
- `reference/heuristics.md` - NNG heuristics H-01..H-10 scoring guide
|
|
37
38
|
- `reference/review-format.md` - visual UAT presentation format
|
|
38
39
|
- `reference/accessibility.md` - WCAG checklist for accessibility scoring
|
|
@@ -40,6 +41,8 @@ The orchestrating stage supplies a `<required_reading>` block in the prompt. Rea
|
|
|
40
41
|
- `connections/chromatic.md` - Chromatic CLI connection spec (probe, baseline management, fallback)
|
|
41
42
|
- `connections/storybook.md` - Storybook HTTP probe and a11y integration details
|
|
42
43
|
|
|
44
|
+
**Worktree-root invariant:** before writing `.design/DESIGN-VERIFICATION.md` (or any `.design/` artifact), resolve the main repo root via `scripts/lib/worktree-resolve.cjs` so a worktree run writes to the canonical `.design/` and does not leak artifacts into the worktree checkout.
|
|
45
|
+
|
|
43
46
|
## Prompt Context Fields
|
|
44
47
|
|
|
45
48
|
The stage embeds these fields in its prompt:
|
|
@@ -440,6 +443,8 @@ Classify each gap:
|
|
|
440
443
|
- `MINOR` - noticeable issue; fix if time allows
|
|
441
444
|
- `COSMETIC` - polish only; defer to later
|
|
442
445
|
|
|
446
|
+
**Pre-Report Gate (Phase 49, see `reference/reviewer-confidence-gate.md`).** Before emitting each gap, answer the four questions: (a) can you cite `file:line`, (b) can you state the failure mode in one sentence, (c) did you read context beyond the modified file, (d) is the severity defensible? Stamp every gap with a `confidence` field (`0.0-1.0`): `>= 0.8` when all four pass, `0.5-0.8` when evidence is partial, `< 0.5` for an unconfirmed hunch. A BLOCKER or MAJOR requires `confidence >= 0.8` plus a `file:line` citation plus a one-sentence failure mode; below that, lower the severity or move it to `## Tentative`. Confidence is independent of severity. Move every `< 0.5` gap into a `## Tentative` section so it is surfaced but never reaches `design-fixer`.
|
|
447
|
+
|
|
443
448
|
For each gap, emit an entry in the locked gap format:
|
|
444
449
|
|
|
445
450
|
```
|
|
@@ -452,6 +457,7 @@ For each gap, emit an entry in the locked gap format:
|
|
|
452
457
|
- Actual: [what is true]
|
|
453
458
|
- Location: [file:line or UI element]
|
|
454
459
|
- Suggested fix: [one-line hint]
|
|
460
|
+
- confidence: [0.0-1.0]
|
|
455
461
|
```
|
|
456
462
|
|
|
457
463
|
Order gaps: BLOCKER first, then MAJOR, MINOR, COSMETIC. Number sequentially (G-01, G-02, ...).
|
|
@@ -464,21 +470,7 @@ If zero gaps found: skip this section entirely - do NOT emit `## GAPS FOUND`.
|
|
|
464
470
|
|
|
465
471
|
**Skip if `chromatic` is `not_configured` or `unavailable` in STATE.md `<connections>`.**
|
|
466
472
|
|
|
467
|
-
If `.design/chromatic-results.json` exists:
|
|
468
|
-
1. Read .design/chromatic-results.json
|
|
469
|
-
2. Check if this is a first run (all entries have status: "new"):
|
|
470
|
-
→ First run: emit "Baseline established - no regressions detected (first run creates baseline)."
|
|
471
|
-
3. For subsequent runs, narrate changes:
|
|
472
|
-
For each story entry in results:
|
|
473
|
-
- status "unchanged" → PASS <StoryTitle>:<StoryName>
|
|
474
|
-
- status "changed" → CHANGED <StoryTitle>:<StoryName> (visual change detected - review on chromatic.com)
|
|
475
|
-
- status "new" → NEW <StoryTitle>:<StoryName> (first snapshot - not a regression)
|
|
476
|
-
- status "error" → ERROR <StoryTitle>:<StoryName> - investigate
|
|
477
|
-
4. Emit summary: "Total: N stories. X unchanged. Y changed. Z new. W errors."
|
|
478
|
-
5. If Y > 0 (changed stories): flag as "VISUAL REGRESSION CANDIDATES - review required on chromatic.com before merging"
|
|
479
|
-
6. Append narration to DESIGN-VERIFICATION.md ## Visual Regression section (create section if absent)
|
|
480
|
-
|
|
481
|
-
If .design/chromatic-results.json does not exist: skip; emit no note.
|
|
473
|
+
If `.design/chromatic-results.json` exists, read it and narrate. First run (all entries `status: "new"`): emit "Baseline established - no regressions detected (first run creates baseline)." Subsequent runs, per story entry: `unchanged` → PASS, `changed` → CHANGED (review on chromatic.com), `new` → NEW (first snapshot, not a regression), `error` → ERROR (investigate). Emit summary "Total: N stories. X unchanged. Y changed. Z new. W errors." If any changed (Y > 0), flag "VISUAL REGRESSION CANDIDATES - review required on chromatic.com before merging". Append the narration to the DESIGN-VERIFICATION.md `## Visual Regression` section (create it if absent). If the file does not exist: skip; emit no note.
|
|
482
474
|
|
|
483
475
|
---
|
|
484
476
|
|