@hegemonart/get-design-done 1.57.1 → 1.57.2
- package/.claude-plugin/marketplace.json +26 -41
- package/.claude-plugin/plugin.json +23 -48
- package/CHANGELOG.md +91 -0
- package/README.md +166 -511
- package/SKILL.md +2 -0
- package/agents/README.md +33 -36
- package/agents/a11y-mapper.md +3 -3
- package/agents/component-benchmark-harvester.md +6 -6
- package/agents/component-benchmark-synthesizer.md +3 -3
- package/agents/compose-executor.md +3 -3
- package/agents/cost-forecaster.md +2 -2
- package/agents/design-auditor.md +7 -7
- package/agents/design-authority-watcher.md +15 -15
- package/agents/design-context-builder.md +4 -4
- package/agents/design-context-checker-gate.md +1 -1
- package/agents/design-discussant.md +2 -2
- package/agents/design-doc-writer.md +1 -1
- package/agents/design-executor.md +2 -2
- package/agents/design-figma-writer.md +2 -2
- package/agents/design-fixer.md +7 -7
- package/agents/design-integration-checker-gate.md +1 -1
- package/agents/design-integration-checker.md +1 -1
- package/agents/design-paper-writer.md +3 -3
- package/agents/design-pencil-writer.md +1 -1
- package/agents/design-planner.md +21 -0
- package/agents/design-reflector.md +39 -39
- package/agents/design-research-synthesizer.md +1 -0
- package/agents/design-start-writer.md +1 -1
- package/agents/design-update-checker.md +5 -5
- package/agents/design-verifier-gate.md +1 -1
- package/agents/design-verifier.md +52 -48
- package/agents/ds-generator.md +2 -2
- package/agents/ds-migration-planner.md +4 -4
- package/agents/email-executor.md +9 -9
- package/agents/experiment-result-ingester.md +3 -3
- package/agents/flutter-executor.md +5 -5
- package/agents/gdd-graph-refresh.md +3 -3
- package/agents/gdd-intel-updater.md +2 -2
- package/agents/motion-mapper.md +2 -2
- package/agents/motion-verifier.md +4 -4
- package/agents/pdf-executor.md +8 -8
- package/agents/perf-analyzer.md +17 -17
- package/agents/pr-commenter.md +9 -9
- package/agents/prototype-gate.md +2 -2
- package/agents/quality-gate-runner.md +1 -1
- package/agents/rollout-coordinator.md +3 -3
- package/agents/swift-executor.md +4 -4
- package/agents/ticket-sync-agent.md +6 -6
- package/agents/user-research-synthesizer.md +2 -2
- package/connections/connections.md +44 -45
- package/connections/cursor.md +73 -0
- package/connections/preview.md +3 -3
- package/dist/claude-code/.claude/skills/cache-manager/SKILL.md +3 -3
- package/dist/claude-code/.claude/skills/cache-manager/cache-policy.md +1 -1
- package/dist/claude-code/.claude/skills/design/SKILL.md +19 -0
- package/dist/claude-code/.claude/skills/explore/SKILL.md +11 -0
- package/dist/claude-code/.claude/skills/figma-write/SKILL.md +13 -2
- package/dist/claude-code/.claude/skills/paper-write/SKILL.md +54 -0
- package/dist/claude-code/.claude/skills/pencil-write/SKILL.md +54 -0
- package/dist/claude-code/.claude/skills/report-issue/SKILL.md +2 -2
- package/dist/claude-code/.claude/skills/router/SKILL.md +2 -2
- package/dist/claude-code/.claude/skills/verify/verify-procedure.md +10 -11
- package/dist/claude-code/.claude/skills/warm-cache/SKILL.md +1 -1
- package/hooks/first-run-nudge.cjs +171 -0
- package/hooks/gdd-intel-trigger.js +243 -0
- package/hooks/gdd-mcp-circuit-breaker.js +62 -7
- package/hooks/gdd-precompact-snapshot.js +50 -29
- package/hooks/gdd-protected-paths.js +150 -18
- package/hooks/gdd-risk-gate.js +93 -1
- package/hooks/gdd-sessionstart-recap.js +59 -24
- package/hooks/hooks.json +13 -4
- package/hooks/inject-using-gdd.cjs +188 -0
- package/hooks/update-check.cjs +511 -0
- package/package.json +9 -2
- package/reference/STATE-TEMPLATE.md +10 -13
- package/reference/audit-scoring.md +1 -1
- package/reference/cache-tier-doctrine.md +46 -0
- package/reference/config-schema.md +9 -9
- package/reference/i18n.md +1 -1
- package/reference/intel-schema.md +37 -2
- package/reference/meta-rules.md +4 -4
- package/reference/model-tiers.md +2 -2
- package/reference/registry.json +101 -94
- package/reference/runtime-models.md +11 -1
- package/reference/shared-preamble.md +13 -14
- package/reference/skill-graph.md +24 -1
- package/scripts/bootstrap.cjs +373 -0
- package/scripts/injection-patterns.cjs +58 -0
- package/scripts/lib/apply-reflections/incubator-proposals.cjs +57 -26
- package/scripts/lib/install/converters/codex-plugin.cjs +5 -2
- package/scripts/lib/install/converters/cursor.cjs +20 -0
- package/scripts/lib/issue-reporter/report-flow.cjs +1 -1
- package/scripts/lib/manifest/skills.json +80 -13
- package/scripts/lib/state/query-surface.cjs +67 -9
- package/scripts/lib/state/state-store.cjs +68 -26
- package/sdk/cli/commands/stage.ts +17 -0
- package/sdk/cli/index.js +14 -0
- package/skills/cache-manager/SKILL.md +3 -3
- package/skills/cache-manager/cache-policy.md +1 -1
- package/skills/design/SKILL.md +19 -0
- package/skills/explore/SKILL.md +11 -0
- package/skills/figma-write/SKILL.md +13 -2
- package/skills/paper-write/SKILL.md +54 -0
- package/skills/pencil-write/SKILL.md +54 -0
- package/skills/report-issue/SKILL.md +2 -2
- package/skills/router/SKILL.md +2 -2
- package/skills/verify/verify-procedure.md +10 -11
- package/skills/warm-cache/SKILL.md +1 -1
- package/hooks/first-run-nudge.sh +0 -82
- package/hooks/inject-using-gdd.sh +0 -72
- package/hooks/update-check.sh +0 -251
- package/scripts/lib/audit-aggregator/index.cjs +0 -219
- package/scripts/lib/hedge-ensemble.cjs +0 -217
package/SKILL.md
CHANGED
|
@@ -37,6 +37,8 @@ Each stage produces artifacts in `.design/` inside the current project.
|
|
|
37
37
|
| `compare` | `get-design-done:gdd-compare` | Delta between DESIGN.md baseline and DESIGN-VERIFICATION.md → .design/COMPARE-REPORT.md |
|
|
38
38
|
| `figma-write <mode>` | `get-design-done:gdd-figma-write` | Write design decisions to Figma (annotate/tokenize/mappings) |
|
|
39
39
|
| `figma-extract <file-url-or-key>` | `get-design-done:gdd-figma-extract` | Off-context Figma design-system extraction → compact local digest (DESIGN.md + tokens.json + components.json), zero raw JSON in context |
|
|
40
|
+
| `paper-write <mode>` | `get-design-done:gdd-paper-write` | Write design decisions back into paper.design via MCP (annotate/tokenize/roundtrip) |
|
|
41
|
+
| `pencil-write <mode>` | `get-design-done:gdd-pencil-write` | Write design decisions into git-tracked `.pen` spec files (annotate/roundtrip) |
|
|
40
42
|
| `graphify <subcommand>` | `get-design-done:gdd-graphify` | Manage Graphify knowledge graph (build/query/status/diff) |
|
|
41
43
|
| `discuss [topic] [--all] [--spec] [--cycle <name>]` | `get-design-done:gdd-discuss` | Adaptive design interview - spawns design-discussant; appends D-XX decisions to STATE.md |
|
|
42
44
|
| `list-assumptions [--area]` | `get-design-done:gdd-list-assumptions` | Surface implicit design assumptions baked into the codebase |
|
package/agents/README.md
CHANGED
|
@@ -33,24 +33,24 @@ The `design-` prefix prevents name collisions with agents from other Claude Code
|
|
|
33
33
|
|
|
34
34
|
## Frontmatter Schema
|
|
35
35
|
|
|
36
|
-
Every agent file begins with a YAML frontmatter block. All fields except `model` are required.
|
|
36
|
+
Every agent file begins with a YAML frontmatter block. All fields except `model` are required. See `reference/model-tiers.md` for the per-agent `default-tier` / `tier-rationale` assignment rationale.
|
|
37
37
|
|
|
38
38
|
| Field | Type | Accepted values | Purpose |
|
|
39
39
|
|-------|------|-----------------|---------|
|
|
40
40
|
| `name` | kebab-case string | unique within plugin | Identifier passed to the `Task` tool - must match the filename without `.md` |
|
|
41
41
|
| `description` | string | free-form | One sentence: what the agent does + when it is spawned |
|
|
42
|
-
| `description_i18n` | object | `{ <locale>: "<description>" }` | **
|
|
42
|
+
| `description_i18n` | object | `{ <locale>: "<description>" }` | **Opt-in.** Localized descriptions keyed by locale (en/ru/uk/de/fr/zh/ja). `scripts/lib/i18n/index.cjs` `descriptionFor(frontmatter, locale)` resolves it via the fallback chain and falls back to the English `description` when a locale is absent. Backward-compatible - omit it and nothing changes. |
|
|
43
43
|
| `tools` | comma-separated list | `Read`, `Write`, `Edit`, `Bash`, `Grep`, `Glob`, `Task`, `WebFetch`, `TodoWrite`, `mcp__*` | Claude tools the agent may use - list only what is needed |
|
|
44
44
|
| `color` | enum | `yellow`, `green`, `blue`, `red` | Terminal display color for the agent's output |
|
|
45
45
|
| `model` | enum (optional) | `inherit`, `sonnet`, `haiku` | Omit to use the project's configured profile default. Use `inherit` to bypass the profile and use the highest available model (quality-tier work) |
|
|
46
|
-
| `default-tier` | enum | `haiku`, `sonnet`, `opus` |
|
|
47
|
-
| `tier-rationale` | string | free-form, one line, quoted |
|
|
46
|
+
| `default-tier` | enum | `haiku`, `sonnet`, `opus` | The model tier the router + budget-enforcer hook select when `.design/budget.json.tier_overrides` has no entry for this agent. Paired with `reference/model-tiers.md` - the per-agent map in that file is the source of truth; this field is the per-agent replica the hook reads. Required on all agents. |
|
|
47
|
+
| `tier-rationale` | string | free-form, one line, quoted | One-sentence justification for the `default-tier` choice. Surfaces in `/gdd:optimize` output when the advisor suggests a tier move. Required on all agents. |
|
|
48
48
|
| `parallel-safe` | enum | `always`, `never`, `conditional-on-touches`, `auto` | Whether stages may dispatch this agent in parallel with siblings. `conditional-on-touches` means safe only when `Touches:` do not overlap |
|
|
49
|
-
| `typical-duration-seconds` | int | e.g. `30`, `60`, `120` | Expected wall-clock duration. Used by parallelism planner to decide whether savings clear `min_estimated_savings_seconds`. **Extensible** -
|
|
49
|
+
| `typical-duration-seconds` | int | e.g. `30`, `60`, `120` | Expected wall-clock duration. Used by parallelism planner to decide whether savings clear `min_estimated_savings_seconds`. **Extensible** - the `design-reflector` may propose `measured-duration-seconds` from telemetry without replacing this field. |
|
|
50
50
|
| `reads-only` | bool | `true`/`false` | True when the agent never writes any file |
|
|
51
51
|
| `writes` | list | e.g. `[".design/DESIGN-PLAN.md"]` | Files / globs the agent may write. `[]` for read-only agents |
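The `description_i18n` row in the table above describes a locale lookup that falls back to the English `description`. A minimal sketch of that behavior follows. It is not the package's `scripts/lib/i18n/index.cjs`: the function name `descriptionForSketch` is hypothetical, and the chain used here (exact locale, then base language, then English) is an assumption, since the diff does not spell the chain out.

```javascript
// Hypothetical sketch, NOT the package's descriptionFor implementation.
// Assumed fallback chain: exact locale -> base language -> English `description`.
function descriptionForSketch(frontmatter, locale) {
  const localized = frontmatter.description_i18n || {};
  const candidates = [];
  if (locale) {
    candidates.push(locale);
    // "de-AT" or "de_AT" -> "de"
    const base = locale.split(/[-_]/)[0];
    if (base !== locale) candidates.push(base);
  }
  for (const key of candidates) {
    if (typeof localized[key] === "string" && localized[key].length > 0) {
      return localized[key];
    }
  }
  // Locale absent, or field omitted entirely: the English description wins.
  return frontmatter.description;
}

// Illustrative frontmatter; the strings are made up for the example.
const agent = {
  description: "Maps accessibility patterns.",
  description_i18n: { de: "Erfasst Barrierefreiheitsmuster." },
};

console.log(descriptionForSketch(agent, "de-AT"));
console.log(descriptionForSketch(agent, "ja"));
```

Because the lookup only ever adds candidates ahead of the English string, an agent that omits `description_i18n` behaves exactly as before, which is the backward-compatibility property the row claims.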
|
|
52
52
|
|
|
53
|
-
> **Frontmatter is extensible.** New fields can be added
|
|
53
|
+
> **Frontmatter is extensible.** New fields can be added without removing existing ones. The `design-reflector` agent may propose updates to `typical-duration-seconds` and `default-tier` based on measured telemetry - those proposals go through `/gdd:apply-reflections`, never auto-applied.
|
|
54
54
|
|
|
55
55
|
Example frontmatter block:
|
|
56
56
|
|
|
@@ -67,9 +67,9 @@ color: blue
|
|
|
67
67
|
|
|
68
68
|
## Runtime-neutral reasoning class (alias for default-tier)
|
|
69
69
|
|
|
70
|
-
**
|
|
70
|
+
**Introduced in v1.26.0.** Agents may carry an optional `reasoning-class: high|medium|low` field as a runtime-neutral alias for `default-tier`. The alias exists because `default-tier`'s enum (`opus|sonnet|haiku`) hard-codes Anthropic model names, while the multi-runtime installer ships agents to 14 runtimes whose authors do not all use those names. `reasoning-class` describes the *reasoning density* the agent needs without naming a vendor's model lineup.
|
|
71
71
|
|
|
72
|
-
**This field is additive, not a replacement.** `default-tier: opus|sonnet|haiku` remains the authoritative, required field for v1.26 and is the source of truth that `hooks/budget-enforcer.ts`, `skills/router/SKILL.md`, and `agents/gdd-intel-updater.md` read. Both fields may coexist on the same agent during the transition window. The long-term winner - which field is canonical and which is deprecated - is data-gated
|
|
72
|
+
**This field is additive, not a replacement.** `default-tier: opus|sonnet|haiku` remains the authoritative, required field for v1.26 and is the source of truth that `hooks/budget-enforcer.ts`, `skills/router/SKILL.md`, and `agents/gdd-intel-updater.md` read. Both fields may coexist on the same agent during the transition window. The long-term winner - which field is canonical and which is deprecated - is data-gated by future measurement of adoption rates; no deprecation lands in v1.26.
|
|
73
73
|
|
|
74
74
|
### Frontmatter shape
|
|
75
75
|
|
|
@@ -77,7 +77,7 @@ color: blue
|
|
|
77
77
|
|-------|------|-----------------|----------|---------|
|
|
78
78
|
| `reasoning-class` | enum | `high`, `medium`, `low` | optional | Runtime-neutral name for the reasoning-density tier this agent needs. Equivalent to `default-tier` per the equivalence table below. |
|
|
79
79
|
|
|
80
|
-
### Equivalence
|
|
80
|
+
### Equivalence
|
|
81
81
|
|
|
82
82
|
| `reasoning-class` | `default-tier` | Typical role classes |
|
|
83
83
|
|-------------------|----------------|----------------------|
|
|
@@ -100,34 +100,33 @@ tier-rationale: "Authors DESIGN-PLAN.md — the contract every downstream agent
|
|
|
100
100
|
---
|
|
101
101
|
```
|
|
102
102
|
|
|
103
|
-
When both are present, the values MUST be equivalent per the table above. Mismatched dual annotations (e.g. `default-tier: opus` paired with `reasoning-class: medium`) are a validation error - `scripts/validate-frontmatter.ts`
|
|
103
|
+
When both are present, the values MUST be equivalent per the table above. Mismatched dual annotations (e.g. `default-tier: opus` paired with `reasoning-class: medium`) are a validation error - `scripts/validate-frontmatter.ts` enforces equivalence at lint time. If only one of the two is present, the validator accepts it and downstream consumers use the equivalence table to derive the missing field.
|
|
104
104
|
|
|
105
105
|
### How runtime-aware tooling reads either field
|
|
106
106
|
|
|
107
107
|
Downstream consumers (`skills/router/SKILL.md`, `hooks/budget-enforcer.ts`, `scripts/lib/budget-enforcer.cjs`, `agents/gdd-intel-updater.md`) accept either field individually and map between them via the equivalence table:
|
|
108
108
|
|
|
109
109
|
- **`default-tier` only** - consumers read `default-tier` directly. This is the v1.26 baseline state for all 26 shipped agents.
|
|
110
|
-
- **`reasoning-class` only** - consumers map `high → opus`, `medium → sonnet`, `low → haiku` and feed the resulting tier into `tier-resolver.cjs`
|
|
111
|
-
- **Both present** - consumers prefer `default-tier` for now (v1.26 canonical), with `reasoning-class` carried through to telemetry (`gdd-intel-updater` writes both fields to `.design/intel/agent-tiers.json`
|
|
110
|
+
- **`reasoning-class` only** - consumers map `high → opus`, `medium → sonnet`, `low → haiku` and feed the resulting tier into `tier-resolver.cjs` for runtime-correct model resolution. Consumers that have not yet been updated to read `reasoning-class` natively still see a valid `default-tier` semantically (via the alias), so no consumer breaks when an agent author chooses the runtime-neutral name.
|
|
111
|
+
- **Both present** - consumers prefer `default-tier` for now (v1.26 canonical), with `reasoning-class` carried through to telemetry (`gdd-intel-updater` writes both fields to `.design/intel/agent-tiers.json`) so adoption can be measured for the future deprecation gate.
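The three cases above (only `default-tier`, only `reasoning-class`, both present) can be sketched as a single resolver. This is an illustration under stated assumptions, not the code in `tier-resolver.cjs` or `scripts/validate-frontmatter.ts`: the function name `resolveTierSketch` and the error messages are hypothetical. The mapping itself (`high → opus`, `medium → sonnet`, `low → haiku`) and the rule that mismatched dual annotations are an error come from the text.

```javascript
// Hypothetical sketch of the default-tier / reasoning-class resolution rules.
const CLASS_TO_TIER = new Map([
  ["high", "opus"],
  ["medium", "sonnet"],
  ["low", "haiku"],
]);
const VALID_TIERS = new Set(CLASS_TO_TIER.values());

function resolveTierSketch(frontmatter) {
  const tier = frontmatter["default-tier"];
  const reasoningClass = frontmatter["reasoning-class"];

  if (tier === undefined && reasoningClass === undefined) {
    throw new Error("agent declares neither default-tier nor reasoning-class");
  }
  if (tier !== undefined && !VALID_TIERS.has(tier)) {
    throw new Error(`unknown default-tier: ${tier}`);
  }
  if (reasoningClass !== undefined && !CLASS_TO_TIER.has(reasoningClass)) {
    throw new Error(`unknown reasoning-class: ${reasoningClass}`);
  }

  if (tier !== undefined && reasoningClass !== undefined) {
    // Both present: they must agree, and default-tier is preferred.
    if (CLASS_TO_TIER.get(reasoningClass) !== tier) {
      throw new Error(
        `default-tier ${tier} is not equivalent to reasoning-class ${reasoningClass}`
      );
    }
    return tier;
  }
  // Exactly one present: read default-tier directly or derive it from the alias.
  return tier !== undefined ? tier : CLASS_TO_TIER.get(reasoningClass);
}

console.log(resolveTierSketch({ "reasoning-class": "medium" }));
```

Keeping the mapping in one table means the validator and the consumers cannot drift apart on what "equivalent" means.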
|
|
112
112
|
|
|
113
113
|
### Rollout policy for v1.26
|
|
114
114
|
|
|
115
|
-
- The 26 existing agents continue to carry `default-tier` only - **no per-agent retrofit lands in v1.26**. New agents
|
|
116
|
-
- Validators, intel-updater, router, and budget-enforcer accept either field starting in v1.26
|
|
117
|
-
- Adoption is measured by `gdd-intel-updater` over `agents/*.md` changes; if alias adoption stays below 50%
|
|
115
|
+
- The 26 existing agents continue to carry `default-tier` only - **no per-agent retrofit lands in v1.26**. New agents MAY carry `reasoning-class` instead of, or alongside, `default-tier`.
|
|
116
|
+
- Validators, intel-updater, router, and budget-enforcer accept either field starting in v1.26.
|
|
117
|
+
- Adoption is measured by `gdd-intel-updater` over `agents/*.md` changes; if alias adoption stays below 50% at the deprecation review, `default-tier` remains canonical and the alias is deprecated. If alias wins majority share, the reverse. **No deprecation in v1.26.**
|
|
118
118
|
|
|
119
119
|
### Cross-references
|
|
120
120
|
|
|
121
121
|
- `reference/model-tiers.md` - tier-selection guide and per-agent map for `default-tier`. The same role-class rationale applies to `reasoning-class` via the equivalence table.
|
|
122
|
-
- `reference/runtime-models.md`
|
|
123
|
-
- `scripts/validate-frontmatter.ts`
|
|
124
|
-
- `.planning/phases/26-headless-model-resolver/CONTEXT.md` D-10, D-11 - decision lineage for additive-alias and equivalence-enforced semantics.
|
|
122
|
+
- `reference/runtime-models.md` - per-runtime tier→model adapter that consumes the resolved tier (whether sourced from `default-tier` or via `reasoning-class` alias).
|
|
123
|
+
- `scripts/validate-frontmatter.ts` - validator that accepts the optional field and enforces equivalence when both are present.
|
|
125
124
|
|
|
126
125
|
---
|
|
127
126
|
|
|
128
127
|
## Peer-CLI delegation (delegate_to)
|
|
129
128
|
|
|
130
|
-
|
|
129
|
+
The **optional** frontmatter field `delegate_to:` lets an agent OPT IN to running on a peer CLI (Codex via ASP; Gemini/Cursor/Copilot/Qwen via ACP) instead of the in-process Anthropic SDK call.
|
|
131
130
|
|
|
132
131
|
| Property | Value |
|
|
133
132
|
|----------|-------|
|
|
@@ -135,22 +134,21 @@ Phase 27 introduces an **optional** frontmatter field `delegate_to:` that lets a
|
|
|
135
134
|
| Required | NO - optional, additive |
|
|
136
135
|
| Default | absent = use local Anthropic call (existing behavior) |
|
|
137
136
|
| Valid values | `gemini-research`, `gemini-exploration`, `codex-execute`, `cursor-debug`, `cursor-plan`, `copilot-review`, `copilot-research`, `qwen-write`, or `none` (explicit opt-out) |
|
|
138
|
-
| Validator | `scripts/validate-frontmatter.ts`
|
|
137
|
+
| Validator | `scripts/validate-frontmatter.ts` - checks format + cross-references the capability matrix in `scripts/lib/peer-cli/registry.cjs`. Mismatched `<peer>-<role>` values that aren't in the matrix → validation error. |
|
|
139
138
|
|
|
140
139
|
**Behavior at runtime:**
|
|
141
|
-
- When session-runner spawns an agent with `delegate_to: gemini-research`, it tries `peer-cli/registry.dispatch('research', tier, prompt, opts)` first. On null result (peer absent OR peer error
|
|
140
|
+
- When session-runner spawns an agent with `delegate_to: gemini-research`, it tries `peer-cli/registry.dispatch('research', tier, prompt, opts)` first. On null result (peer absent OR peer error) it transparently falls back to the local Anthropic call. The skill never sees the peer failure.
|
|
142
141
|
- `delegate_to: none` explicitly skips registry dispatch (security-sensitive agents).
|
|
143
142
|
- Absent field = same as not setting it = local Anthropic call (unchanged behavior).
|
|
144
143
|
|
|
145
|
-
**Opt-in gating:** Even with `delegate_to:` set on an agent, dispatch only fires if the peer is in `.design/config.json#peer_cli.enabled_peers` allowlist (populated by the install-time nudge
|
|
144
|
+
**Opt-in gating:** Even with `delegate_to:` set on an agent, dispatch only fires if the peer is in `.design/config.json#peer_cli.enabled_peers` allowlist (populated by the install-time nudge; default empty). This keeps cost surprises off - users explicitly authorize each peer.
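The runtime behavior and opt-in gating described above can be sketched as one decision function. Everything here is a simplified illustration: `runAgentSketch`, the `agent` object shape, the `localCall` callback, and the `{ peer }` options object are hypothetical, and the real session-runner dispatch is presumably asynchronous. Only the `dispatch(role, tier, prompt, opts)` call shape, the `null`-means-fallback rule, the `none` opt-out, and the `peer_cli.enabled_peers` allowlist are taken from the text.

```javascript
// Hypothetical, synchronous sketch of delegate_to handling.
function runAgentSketch(agent, config, registry, localCall) {
  const target = agent.delegate_to;

  // Absent field or explicit opt-out: local call, registry never consulted.
  if (!target || target === "none") return localCall(agent.prompt);

  // "<peer>-<role>", e.g. "gemini-research" -> peer "gemini", role "research".
  const sep = target.indexOf("-");
  if (sep < 0) return localCall(agent.prompt);
  const peer = target.slice(0, sep);
  const role = target.slice(sep + 1);

  // Opt-in gating: the peer must be on the allowlist (default empty).
  const enabled = (config.peer_cli && config.peer_cli.enabled_peers) || [];
  if (!enabled.includes(peer)) return localCall(agent.prompt);

  // The opts shape is an assumption made for this sketch.
  const result = registry.dispatch(role, agent.tier, agent.prompt, { peer });

  // null covers both "peer absent" and "peer error": fall back transparently.
  return result === null ? localCall(agent.prompt) : result;
}
```

The caller receives a result in every branch, which is what lets the skill stay unaware of peer failures.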
|
|
146
145
|
|
|
147
|
-
**Telemetry:** Peer calls emit `peer_call_started` / `peer_call_complete` / `peer_call_failed` events in `events.jsonl`, tagged with `runtime_role: "peer"` and `peer_id
|
|
146
|
+
**Telemetry:** Peer calls emit `peer_call_started` / `peer_call_complete` / `peer_call_failed` events in `events.jsonl`, tagged with `runtime_role: "peer"` and `peer_id`. Cost rows in `costs.jsonl` carry the same tags so reflector cross-runtime arbitrage extends naturally.
|
|
148
147
|
|
|
149
148
|
**Cross-references:**
|
|
150
|
-
- `scripts/lib/peer-cli/registry.cjs`
|
|
151
|
-
- `scripts/lib/peer-cli/adapters/{codex,gemini,cursor,copilot,qwen}.cjs`
|
|
152
|
-
- `reference/peer-cli-capabilities.md`
|
|
153
|
-
- `.planning/phases/27-peer-cli-delegation/CONTEXT.md` D-06, D-07, D-11 - decision lineage.
|
|
149
|
+
- `scripts/lib/peer-cli/registry.cjs` - capability matrix + dispatch.
|
|
150
|
+
- `scripts/lib/peer-cli/adapters/{codex,gemini,cursor,copilot,qwen}.cjs` - per-peer thin adapters.
|
|
151
|
+
- `reference/peer-cli-capabilities.md` - full capability matrix doc.
|
|
154
152
|
|
|
155
153
|
---
|
|
156
154
|
|
|
@@ -184,7 +182,7 @@ Every agent terminates its response with a completion marker - a specific `##` h
|
|
|
184
182
|
| Execution agent | `## EXECUTION COMPLETE` |
|
|
185
183
|
| Verification agent | `## VERIFICATION COMPLETE` |
|
|
186
184
|
|
|
187
|
-
**Design-pipeline-specific markers (proposed - confirm
|
|
185
|
+
**Design-pipeline-specific markers (proposed - confirm when the first stage agent is written):**
|
|
188
186
|
|
|
189
187
|
| Stage | Proposed marker |
|
|
190
188
|
|-------|-----------------|
|
|
@@ -278,7 +276,7 @@ Constraints: do not modify any file other than .design/example-output.md.
|
|
|
278
276
|
|
|
279
277
|
---
|
|
280
278
|
|
|
281
|
-
## Mandatory Record Step
|
|
279
|
+
## Mandatory Record Step
|
|
282
280
|
|
|
283
281
|
Every agent **must** end its run by appending one JSONL line to `.design/intel/insights.jsonl`. This feeds `/gdd:reflect`, `/gdd:extract-learnings`, and the decision-injector relevance counter.
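A minimal sketch of the append step, assuming a Node.js environment. The helper name `appendInsightSketch` and the record fields (`agent`, `note`) are hypothetical placeholders; the actual field set is defined by the package's insight-line schema, which this diff does not show.

```javascript
// Hypothetical sketch: append exactly one JSON object per line to insights.jsonl.
const fs = require("node:fs");
const path = require("node:path");

function appendInsightSketch(projectRoot, insight) {
  const file = path.join(projectRoot, ".design", "intel", "insights.jsonl");
  // Create .design/intel/ on first use.
  fs.mkdirSync(path.dirname(file), { recursive: true });
  // JSON.stringify without an indent argument never emits raw newlines,
  // so each record occupies exactly one line.
  fs.appendFileSync(file, JSON.stringify(insight) + "\n", "utf8");
  return file;
}
```

Appending rather than rewriting keeps earlier records intact when several agents finish in the same session.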
|
|
284
282
|
|
|
@@ -345,9 +343,9 @@ Global ceiling: **no single agent file exceeds 600 lines** under any circumstanc
|
|
|
345
343
|
|
|
346
344
|
---
|
|
347
345
|
|
|
348
|
-
## Cache-Aligned Ordering Convention
|
|
346
|
+
## Cache-Aligned Ordering Convention
|
|
349
347
|
|
|
350
|
-
Every agent body under `agents/*.md` is structured in this exact order so that Anthropic's 5-minute prompt cache (and the plugin's `/gdd:warm-cache` pre-warmer) can key on the longest possible identical prefix across spawns. The rule
|
|
348
|
+
Every agent body under `agents/*.md` is structured in this exact order so that Anthropic's 5-minute prompt cache (and the plugin's `/gdd:warm-cache` pre-warmer) can key on the longest possible identical prefix across spawns. The rule:
|
|
351
349
|
|
|
352
350
|
1. **Shared-preamble import** - the first non-blank line of the body MUST be `@reference/shared-preamble.md`. This pulls the framework identity, required-reading discipline, writes protocol, deviation handling, and hook awareness into the prompt. Identical bytes across all 26 agents → one cache entry warms them all.
|
|
353
351
|
2. **Agent-specific role + tools contract + output format** - unique to the agent but stable across every invocation of that same agent. Cache hits on the per-agent tail after the first call of the session.
|
|
@@ -358,11 +356,10 @@ Every agent body under `agents/*.md` is structured in this exact order so that A
|
|
|
358
356
|
See `reference/shared-preamble.md` (the imported file) and `reference/model-tiers.md` (tier assignment + override precedence) for the two paired references.
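The first ordering rule (the first non-blank body line must be `@reference/shared-preamble.md`) lends itself to a simple lint check. The sketch below is an assumption about how such a check could look, not a script shipped in the package. `bodyStartsWithPreamble` is a hypothetical name, and the frontmatter is assumed to be delimited by `---` lines as in the agent files shown in this diff.

```javascript
// Hypothetical lint sketch for the cache-aligned ordering rule.
const PREAMBLE_IMPORT = "@reference/shared-preamble.md";

function bodyStartsWithPreamble(fileText) {
  const lines = fileText.split(/\r?\n/);
  let i = 0;

  // Skip the YAML frontmatter block if the file opens with one.
  if (lines[0].trim() === "---") {
    i = 1;
    while (i < lines.length && lines[i].trim() !== "---") i++;
    i++; // step past the closing delimiter
  }

  // Skip blank lines between the frontmatter and the body.
  while (i < lines.length && lines[i].trim() === "") i++;

  return i < lines.length && lines[i].trim() === PREAMBLE_IMPORT;
}

const sample = [
  "---",
  "name: example-agent",
  "---",
  "",
  "@reference/shared-preamble.md",
  "",
  "## Role",
].join("\n");

console.log(bodyStartsWithPreamble(sample));
```

An exact string match is used deliberately here: the text says the preamble must be byte-identical across agents for the cache prefix to be shared, so a near match would defeat the purpose.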
|
|
359
357
|
|
|
360
358
|
**Cross-references.**
|
|
361
|
-
- `reference/shared-preamble.md` - the preamble file itself
|
|
362
|
-
- `reference/model-tiers.md` - tier-selection guide + per-agent map
|
|
363
|
-
- `skills/warm-cache/SKILL.md` - the command that primes Layer A cache across the roster
|
|
364
|
-
- `skills/cache-manager/SKILL.md` - Layer B (explicit manifest) cache; independent of this ordering rule
|
|
365
|
-
- `.planning/phases/10.1-optimization-layer-cost-governance/10.1-CONTEXT.md` §D-08, §D-16, §D-17 - decision lineage.
|
|
359
|
+
- `reference/shared-preamble.md` - the preamble file itself.
|
|
360
|
+
- `reference/model-tiers.md` - tier-selection guide + per-agent map.
|
|
361
|
+
- `skills/warm-cache/SKILL.md` - the command that primes Layer A cache across the roster.
|
|
362
|
+
- `skills/cache-manager/SKILL.md` - Layer B (explicit manifest) cache; independent of this ordering rule.
|
|
366
363
|
|
|
367
364
|
---
|
|
368
365
|
|
package/agents/a11y-mapper.md
CHANGED
|
@@ -20,7 +20,7 @@ writes:
|
|
|
20
20
|
|
|
21
21
|
## Role
|
|
22
22
|
|
|
23
|
-
You produce a static accessibility inventory. You do NOT run a browser audit - that is
|
|
23
|
+
You produce a static accessibility inventory. You do NOT run a browser audit - that is reserved for live verification. You never modify source code and do not spawn agents.
|
|
24
24
|
|
|
25
25
|
## Required Reading
|
|
26
26
|
|
|
@@ -109,7 +109,7 @@ scope: static-only
|
|
|
109
109
|
- 4.1.2 Name, Role, Value — [status]
|
|
110
110
|
|
|
111
111
|
## Scope note
|
|
112
|
-
Static scan only. Runtime contrast, focus-trap, and screen-reader behavior require a live audit
|
|
112
|
+
Static scan only. Runtime contrast, focus-trap, and screen-reader behavior require a live audit.
|
|
113
113
|
|
|
114
114
|
## Micro-polish a11y findings
|
|
115
115
|
|
|
@@ -149,7 +149,7 @@ Run the matching extractor over the same source roots you scanned above:
|
|
|
149
149
|
node scripts/lib/design-context/extract-a11y.mjs <source_root> [<source_root>...] > .design/fragments/a11y-mapper.json
|
|
150
150
|
```
|
|
151
151
|
|
|
152
|
-
`extract-a11y.mjs` walks the source roots with regex (zero-dep) and returns a Fragment whose `nodes[]` have `id`, `type` (`a11y-pattern`), and `name` filled, with stub `summary` you must replace. Patterns map to the ARIA, keyboard, focus, landmark, and skip-link signals you inventoried above. This is a static scan only; runtime behavior stays out
|
|
152
|
+
`extract-a11y.mjs` walks the source roots with regex (zero-dep) and returns a Fragment whose `nodes[]` have `id`, `type` (`a11y-pattern`), and `name` filled, with stub `summary` you must replace. Patterns map to the ARIA, keyboard, focus, landmark, and skip-link signals you inventoried above. This is a static scan only; runtime behavior stays out.
|
|
153
153
|
|
|
154
154
|
### 2. LLM phase (fill summary, tags, complexity)
|
|
155
155
|
|
|
package/agents/component-benchmark-harvester.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: component-benchmark-harvester
|
|
3
|
-
description: Given a component name, harvests design-system excerpts from 18 sources (design-corpora.md) and emits raw, source-attributed output to .
|
|
3
|
+
description: Given a component name, harvests design-system excerpts from 18 sources (design-corpora.md) and emits raw, source-attributed output to .design/benchmarks/raw/<component>.md. Spawned by /gdd:benchmark.
|
|
4
4
|
tools: Read, Write, WebFetch, Bash, Grep, Glob
|
|
5
5
|
color: yellow
|
|
6
6
|
default-tier: sonnet
|
|
@@ -9,7 +9,7 @@ parallel-safe: conditional-on-touches
|
|
|
9
9
|
typical-duration-seconds: 120
|
|
10
10
|
reads-only: false
|
|
11
11
|
writes:
|
|
12
|
-
- ".
|
|
12
|
+
- ".design/benchmarks/raw/"
|
|
13
13
|
---
|
|
14
14
|
|
|
15
15
|
@reference/shared-preamble.md
|
|
@@ -21,7 +21,7 @@ writes:
|
|
|
21
21
|
You are the harvesting agent for the component benchmark corpus. Given a component name
|
|
22
22
|
(e.g. "button", "modal-dialog"), you systematically gather per-source excerpts from the
|
|
23
23
|
18 design systems catalogued in `connections/design-corpora.md` and emit a consolidated
|
|
24
|
-
raw harvest file at `.
|
|
24
|
+
raw harvest file at `.design/benchmarks/raw/<component>.md`.
|
|
25
25
|
|
|
26
26
|
The raw harvest is **input to `component-benchmark-synthesizer`** - it is not the final
|
|
27
27
|
spec. Focus on breadth and attribution; the synthesizer does convergence analysis.
|
|
@@ -54,7 +54,7 @@ paragraphs. For WAI-ARIA APG keyboard contracts, quote verbatim.
|
|
|
54
54
|
|
|
55
55
|
## Step 3 - Write raw harvest file
|
|
56
56
|
|
|
57
|
-
Write `.
|
|
57
|
+
Write `.design/benchmarks/raw/<component>.md` with this structure:
|
|
58
58
|
|
|
59
59
|
```markdown
|
|
60
60
|
# <Component Name> — Raw Benchmark Harvest
|
|
@@ -95,7 +95,7 @@ This pre-analysis seeds the synthesizer's convergence analysis.
|
|
|
95
95
|
|
|
96
96
|
## Output Contract
|
|
97
97
|
|
|
98
|
-
- File: `.
|
|
98
|
+
- File: `.design/benchmarks/raw/<component>.md`
|
|
99
99
|
- One `###` block per harvested source (≥4 blocks minimum for a useful spec)
|
|
100
100
|
- WAI-ARIA APG keyboard contracts quoted verbatim
|
|
101
101
|
- Convergence pre-analysis section present
|
|
@@ -119,5 +119,5 @@ Schema: `reference/schemas/insight-line.schema.json`. Use an empty `artifacts_wr
|
|
|
119
119
|
## HARVEST COMPLETE
|
|
120
120
|
Component: <name>
|
|
121
121
|
Sources harvested: <N>
|
|
122
|
-
Raw file: .
|
|
122
|
+
Raw file: .design/benchmarks/raw/<component>.md
|
|
123
123
|
```
|
|
package/agents/component-benchmark-synthesizer.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: component-benchmark-synthesizer
|
|
3
|
-
description: Reads a raw harvest file from .
|
|
3
|
+
description: Reads a raw harvest file from .design/benchmarks/raw/<component>.md and emits a canonical component spec at reference/components/<name>.md using the locked TEMPLATE.md shape. Spawned by /gdd:benchmark after harvesting.
|
|
4
4
|
tools: Read, Write, Grep, Glob
|
|
5
5
|
color: green
|
|
6
6
|
default-tier: sonnet
|
|
@@ -19,7 +19,7 @@ writes:
|
|
|
19
19
|
## Role
|
|
20
20
|
|
|
21
21
|
You are the synthesis agent for the component benchmark corpus. You read a raw harvest
|
|
22
|
-
file at `.
|
|
22
|
+
file at `.design/benchmarks/raw/<component>.md` (produced by `component-benchmark-harvester`)
|
|
23
23
|
and emit a canonical component spec at `reference/components/<name>.md`, following
|
|
24
24
|
`reference/components/TEMPLATE.md` exactly.
|
|
25
25
|
|
|
@@ -32,7 +32,7 @@ that signal in the spec so future agents know what is non-negotiable.
|
|
|
32
32
|
The orchestrating skill supplies a `<required_reading>` block in the prompt. Read every
|
|
33
33
|
listed file before acting. Minimum expected inputs:
|
|
34
34
|
|
|
35
|
-
- `.
|
|
35
|
+
- `.design/benchmarks/raw/<component>.md` - the raw harvest to synthesize
|
|
36
36
|
- `reference/components/TEMPLATE.md` - the locked spec shape you must follow
|
|
37
37
|
- `reference/anti-patterns.md` - for cross-linking anti-pattern entries
|
|
38
38
|
|
|
package/agents/compose-executor.md
CHANGED
|
@@ -88,9 +88,9 @@ State the file(s) written in the task output (`.design/tasks/task-NN.md`), mirro
|
|
|
88
88
|
|
|
89
89
|
---
|
|
90
90
|
|
|
91
|
-
## Emulator is OPTIONAL
|
|
91
|
+
## Emulator is OPTIONAL
|
|
92
92
|
|
|
93
|
-
You do **NOT** require a running Android emulator to produce Compose code. Code generation is purely static - no emulator, no `adb`, no Android SDK is needed at generation time (
|
|
93
|
+
You do **NOT** require a running Android emulator to produce Compose code. Code generation is purely static - no emulator, no `adb`, no Android SDK is needed at generation time (the default suite stays green on any machine). Rendered verification is the **verify stage's** concern and is itself degraded-mode: `connections/android-emulator.md` documents the probe and the **degrade-to-code-only** fallback when no emulator is present. Never block on a missing emulator.
|
|
94
94
|
|
|
95
95
|
---
|
|
96
96
|
|
|
@@ -121,7 +121,7 @@ This agent MUST NOT:
|
|
|
121
121
|
|
|
122
122
|
- Run `git clean` (any flags) - absolute prohibition.
|
|
123
123
|
- Re-derive the token→Compose mapping - consume `emitCompose` / `reference/native-platforms.md`.
|
|
124
|
-
- Require a running emulator, `adb`, or the Android SDK to generate code
|
|
124
|
+
- Require a running emulator, `adb`, or the Android SDK to generate code.
|
|
125
125
|
- Hardcode themed `Color(0x…)` / magic `dp` where a token exists.
|
|
126
126
|
- Modify `.design/DESIGN-PLAN.md` or `.design/DESIGN-CONTEXT.md`, or re-plan task scope.
|
|
127
127
|
- Spawn other agents via the `Task` tool, or ask clarifying questions (single-shot - note choices in the output).
|
|
package/agents/cost-forecaster.md
CHANGED
|
@@ -20,7 +20,7 @@ writes:
|
|
|
20
20
|
|
|
21
21
|
You forecast GDD's design-cycle spend so the user sees a cost trajectory **before** the bill arrives.
|
|
22
22
|
You are **report-only**: you read telemetry, run a pure model, and narrate. You never edit
|
|
23
|
-
`budget.json`, never spend, and never block a spawn - the
|
|
23
|
+
`budget.json`, never spend, and never block a spawn - the budget-enforcer hook is the only
|
|
24
24
|
thing that halts.
|
|
25
25
|
|
|
26
26
|
**Read `reference/cost-governance.md` first** - it is the contract for the model, the scenarios, and
|
|
@@ -62,7 +62,7 @@ the `project_cap` semantics.
|
|
|
62
62
|
payload `{ scenario, perCycle, projectedTotal, cyclesToCap }` (PII-free). Append only - never
|
|
63
63
|
rewrite the stream.
|
|
64
64
|
|
|
65
|
-
## Scenarios (from `cost-forecast.cjs
|
|
65
|
+
## Scenarios (from `cost-forecast.cjs`)
|
|
66
66
|
|
|
67
67
|
| `--scenario` | per-cycle rate | reads as |
|
|
68
68
|
|---|---|---|
|
package/agents/design-auditor.md
CHANGED
|
@@ -7,7 +7,7 @@ model: inherit
|
|
|
7
7
|
default-tier: sonnet
|
|
8
8
|
tier-rationale: "Emits structured findings from code inspection; Sonnet balances depth with cost"
|
|
9
9
|
size_budget: XXL
|
|
10
|
-
size_budget_rationale: "
|
|
10
|
+
size_budget_rationale: "Component Conformance addendum adds ~54 lines for spec-grep detection + conformance scoring algorithm"
|
|
11
11
|
parallel-safe: always
|
|
12
12
|
typical-duration-seconds: 45
|
|
13
13
|
reads-only: false
|
|
@@ -31,7 +31,7 @@ You run once per verify session. You do NOT remediate gaps, spawn other agents,
|
|
|
31
31
|
|
|
32
32
|
**This audit SUPPLEMENTS the existing 7-category 0-10 scoring system in `reference/audit-scoring.md`. It does NOT replace it.**
|
|
33
33
|
|
|
34
|
-
The existing system (7 categories: Accessibility, Visual Hierarchy, Typography, Color, Layout & Spacing, Anti-Pattern Compliance, Interaction & Motion - each scored 0–10 with weighted totals) continues to be the primary quantitative score used by design-verifier in its
|
|
34
|
+
The existing system (7 categories: Accessibility, Visual Hierarchy, Typography, Color, Layout & Spacing, Anti-Pattern Compliance, Interaction & Motion - each scored 0–10 with weighted totals) continues to be the primary quantitative score used by design-verifier in its Stage 1 evaluation. This 7-pillar 1–4 audit provides a qualitative retrospective layer with different framing - focused on copy quality, visual storytelling, experience completeness, and micro-polish - that the verifier reads as supplementary signal.
|
|
35
35
|
|
|
36
36
|
Do not compute a weighted 0–100 score. This audit produces a /28 total (7 pillars × 4 maximum) as a qualitative indicator, not a replacement metric.
|
|
37
37
|
|
|
@@ -45,7 +45,7 @@ Minimum expected files:
|
|
|
45
45
|
- `.design/DESIGN-CONTEXT.md` - goals, brand direction, design decisions (D-XX)
|
|
46
46
|
- `.design/DESIGN-PLAN.md` - planned tasks and acceptance criteria
|
|
47
47
|
- `.design/tasks/` - what was actually done (glob all task files)
|
|
48
|
-
- **Domain-index navigation
|
|
48
|
+
- **Domain-index navigation:** the 7 entry-points `reference/{typography,color,spatial,motion,interaction,responsive,ux-writing}.md` index every fragment below. For a pillar, load the relevant domain index first, then drill into the specific fragments it lists only as the pillar needs them - this is the cheap navigation layer over the detailed fragments.
|
|
49
49
|
- `reference/audit-scoring.md` - existing 7-category scoring rubric (understand, do not duplicate)
|
|
50
50
|
- `reference/anti-slop-rubric.md` - the five verb axes scored after the pillar pass (see Anti-slop scoring section)
|
|
51
51
|
- `reference/visual-tells.md` - default-AI tell catalog; each tell names the verb axis it degrades
|
|
@@ -73,7 +73,7 @@ Minimum expected files:
|
|
|
73
73
|
|
|
74
74
|
## 7-Pillar Scoring System
|
|
75
75
|
|
|
76
|
-
> **Scoring contract: v2** (`scoring_contract_version: v2`) - 7 pillars; copy deepened
|
|
76
|
+
> **Scoring contract: v2** (`scoring_contract_version: v2`) - 7 pillars; copy deepened via `reference/copy-quality.md` + `agents/copy-auditor.md`; 8th pillar slot reserved, unscored. The pillar count and slot 7 (Micro-Polish) name are read by `design-verifier` by name; do not renumber existing pillars.
|
|
77
77
|
|
|
78
78
|
**Score definitions (1–4 per pillar):**
|
|
79
79
|
|
|
@@ -90,7 +90,7 @@ Minimum expected files:
|
|
|
90
90
|
|
|
91
91
|
**What this measures:** The quality and specificity of text content - button labels, empty states, error messages, loading copy, ARIA strings, alt text, form copy, and voice alignment. Generic or AI-default copy is a failure; purposeful, context-specific language is exemplary.
|
|
92
92
|
|
|
93
|
-
**Detailed rubric:** `reference/copy-quality.md` is the source of truth for this pillar - it holds the per-category criteria (CTAs, errors, empty states, loading/skeleton, ARIA text, alt text, form labels/helper/validation, voice/tone), the failure patterns, the internationalization lens (hardcoded-string probe + `+40%` expansion-overflow check
|
|
93
|
+
**Detailed rubric:** `reference/copy-quality.md` is the source of truth for this pillar - it holds the per-category criteria (CTAs, errors, empty states, loading/skeleton, ARIA text, alt text, form labels/helper/validation, voice/tone), the failure patterns, the internationalization lens (hardcoded-string probe + `+40%` expansion-overflow check), and the canonical 1-4 Scoring Guide table. Read it before scoring Copy. For a deep, evidence-rich Copy pass, the verify stage may spawn `agents/copy-auditor.md`, which scores this pillar against `reference/copy-quality.md` and writes `.design/COPY-AUDIT.md`; when that supplement exists, fold its score and top finding into this pillar rather than re-deriving them. Keep the 1-4 scale below either way.
|
|
94
94
|
|
|
95
95
|
**Audit method:**
|
|
96
96
|
|
|
@@ -285,7 +285,7 @@ grep -rEn "confirm\b|Confirm\b|areYouSure|destructive|danger" src/ --include="*.
|
|
|
285
285
|
Collect findings from the micro-polish sections of the mapper outputs (`.design/map/motion.md`, `.design/map/tokens.md`, `.design/map/visual-hierarchy.md`, `.design/map/a11y.md`). If those files are not yet available, run targeted grep passes:
|
|
286
286
|
|
|
287
287
|
```bash
|
|
288
|
-
# BAN-NN anti-patterns
|
|
288
|
+
# BAN-NN anti-patterns: run the deterministic detector instead of hand-grepping each
|
|
289
289
|
# rule. One pass, --json, every BAN rule (transition:all, will-change:all, gradient text, bounce
|
|
290
290
|
# easing, scale(0), naked outline:none, pure-black dark mode, disabled zoom, tinted image outline),
|
|
291
291
|
# each finding linked to its reference/anti-patterns.md paragraph. Offline + zero-LLM.
|
|
@@ -495,7 +495,7 @@ This audit is **code-only**. No Playwright-MCP and no dev server screenshot capt
|
|
|
495
495
|
- **Color (Pillar 3):** Color palette harmony and dark mode visual quality require a rendered view. Code analysis checks for token usage and known anti-patterns but cannot assess harmony.
|
|
496
496
|
- **Typography (Pillar 4):** Font rendering and scale legibility require visual inspection. Code analysis checks class usage but cannot assess the rendered result.
|
|
497
497
|
|
|
498
|
-
**Recommendation:** Run design-verifier
|
|
498
|
+
**Recommendation:** Run design-verifier Stage 4 (Visual UAT) to supplement these code-only findings with human visual inspection.
|
|
499
499
|
```
|
|
500
500
|
|
|
501
501
|
---
|
|
package/agents/design-authority-watcher.md
CHANGED
|
@@ -28,18 +28,18 @@ You are the network-fetching agent for the authority-watcher phase. You read the
|
|
|
28
28
|
The orchestrating skill supplies a `<required_reading>` block in the prompt. Read every listed file before acting - this is mandatory. Minimum expected inputs (skip gracefully if absent, note what is missing):
|
|
29
29
|
|
|
30
30
|
- `reference/authority-feeds.md` - the curated whitelist you fetch from.
|
|
31
|
-
- `.design/authority-snapshot.json` - prior snapshot (absent = first run
|
|
31
|
+
- `.design/authority-snapshot.json` - prior snapshot (absent = first run).
|
|
32
32
|
- `.design/STATE.md` - for cycle slug if present (non-fatal if absent).
|
|
33
33
|
|
|
34
34
|
## Flags
|
|
35
35
|
|
|
36
36
|
Flags are supplied by the orchestrating skill in the prompt (the skill parses `/gdd:watch-authorities` user arguments):
|
|
37
37
|
|
|
38
|
-
- `--refresh` → re-seed snapshot from current feed state without surfacing anything (
|
|
39
|
-
- `--since <ISO8601 date>` → surface entries whose `published` date is newer than the given boundary regardless of snapshot state (
|
|
38
|
+
- `--refresh` → re-seed snapshot from current feed state without surfacing anything (recovery mode; behaves identically to first run).
|
|
39
|
+
- `--since <ISO8601 date>` → surface entries whose `published` date is newer than the given boundary regardless of snapshot state (first-run escape hatch + backlog surfacing).
|
|
40
40
|
- `--feed <feed-id>` → fetch only the single named feed (debugging / spot-check).
|
|
41
41
|
|
|
42
|
-
The `--schedule` flag is handled by the skill
|
|
42
|
+
The `--schedule` flag is handled by the skill, not by this agent. If you receive it, ignore.
|
|
43
43
|
|
|
44
44
|
## Step 1 - Load Whitelist
|
|
45
45
|
|
|
@@ -76,7 +76,7 @@ published = block.created_at // used only for --since filtering
|
|
|
76
76
|
|
|
77
77
|
Parse the structured reply into entries with the same field names as the arena branch.
|
|
78
78
|
|
|
79
|
-
**Polite-crawl:** between requests to the **same host** (by `URL.host`), sleep 250ms
|
|
79
|
+
**Polite-crawl:** between requests to the **same host** (by `URL.host`), sleep 250ms. Distinct hosts may fetch back-to-back without delay. A per-feed inline `min-delay-ms:` override in the whitelist (if present) supersedes the default.
|
|
80
80
|
|
|
81
81
|
**Errors are non-fatal.** On WebFetch or parse failure, push `{ feed-id, error: "<one-sentence>" }` into `fetch_notes` and continue. A single failing feed must not block the other ~25.
|
|
82
82
|
|
|
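The polite-crawl rule above can be read as a minimum spacing between requests to one host. The sketch below is one possible interpretation, not the package's code: `delayBeforeFetch` and the `lastFetchByHost` map are hypothetical names, and the caller is assumed to sleep for the returned number of milliseconds before fetching.

```javascript
// Hypothetical helper: returns how long to wait before fetching `url` so that
// requests to the same host (keyed by URL.host) are at least `minDelayMs` apart.
// `minDelayMs` defaults to 250 and stands in for a per-feed `min-delay-ms:` override.
function delayBeforeFetch(url, lastFetchByHost, now, minDelayMs = 250) {
  const host = new URL(url).host;
  const last = lastFetchByHost.get(host);
  // A host not seen before can be fetched immediately.
  const wait = last === undefined ? 0 : Math.max(0, last + minDelayMs - now);
  // Record the time at which this request will actually go out.
  lastFetchByHost.set(host, now + wait);
  return wait;
}
```

Keeping the function pure with respect to time (the caller passes `now`) makes the spacing logic testable without real sleeps.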
@@ -90,7 +90,7 @@ hash = sha256(title + "\n" + summary)
|
|
|
90
90
|
|
|
91
91
|
Use `Bash` to invoke `printf '%s\n%s' "$title" "$summary" | shasum -a 256 | awk '{print $1}'` (or the Node `crypto.createHash('sha256').update(title+"\n"+summary).digest('hex')` equivalent). Output MUST be a 64-char lowercase hex string - the schema at `reference/schemas/authority-snapshot.schema.json` enforces `^[0-9a-f]{64}$`.
|
|
92
92
|
|
|
93
|
-
**New-entry rule
|
|
93
|
+
**New-entry rule:**
|
|
94
94
|
- Entry is new if its `id` is not present in `prior.feeds[feed-id].entries`, OR
|
|
95
95
|
- Entry is new if its `id` IS present but the `hash` differs from the stored one (content changed).
|
|
96
96
|
|
|
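As a minimal sketch of the hash and new-entry rule in this hunk: the hash line uses the Node `crypto` call quoted above, while `entryHash`, `isNewEntry`, and the assumption that prior entries arrive as an array of `{ id, hash }` records are illustrative, not taken from the package.

```javascript
const crypto = require('node:crypto');

// sha256(title + "\n" + summary), hex-encoded: 64 lowercase hex characters.
function entryHash(title, summary) {
  return crypto.createHash('sha256').update(title + '\n' + summary).digest('hex');
}

// Hypothetical helper. `priorEntries` is assumed to be an array of { id, hash }.
function isNewEntry(entry, priorEntries) {
  const stored = priorEntries.find((e) => e.id === entry.id);
  if (!stored) return true;          // id not present in the prior snapshot
  return stored.hash !== entry.hash; // id present, but the content changed
}
```

An entry whose `id` and `hash` both match the stored record is treated as already seen.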
@@ -100,9 +100,9 @@ Use `Bash` to invoke `printf '%s\n%s' "$title" "$summary" | shasum -a 256 | awk
|
|
|
100
100
|
|
|
101
101
|
## Step 5 - Classify
|
|
102
102
|
|
|
103
|
-
Apply the decision table below to each new entry. Emit `{ ...entry, classification, rationale }` where `rationale` is a ≤1-sentence deterministic trace of which rule matched (e.g., "title matched `/added|updated|removed/i` → spec-change"). Entries classified `skip` go into `skipped_entries` and do NOT appear in the report body
|
|
103
|
+
Apply the decision table below to each new entry. Emit `{ ...entry, classification, rationale }` where `rationale` is a ≤1-sentence deterministic trace of which rule matched (e.g., "title matched `/added|updated|removed/i` → spec-change"). Entries classified `skip` go into `skipped_entries` and do NOT appear in the report body.
|
|
104
104
|
|
|
105
|
-
**Classification decision table
|
|
105
|
+
**Classification decision table:**
|
|
106
106
|
|
|
107
107
|
| Source kind | Default classification |
|
|
108
108
|
|---|---|
|
|
@@ -115,7 +115,7 @@ Apply the decision table below to each new entry. Emit `{ ...entry, classificati
|
|
|
115
115
|
|
|
116
116
|
The skip row is evaluated LAST and overrides the kind-based row - a component-system release titled "Sponsored: shipping our new sponsor tier" still ends up `skip`.
|
|
117
117
|
|
|
118
|
-
### OpenRouter catalog drift
|
|
118
|
+
### OpenRouter catalog drift
|
|
119
119
|
|
|
120
120
|
Beyond the design-authority feeds above, the **OpenRouter model catalog** (`.design/cache/openrouter-models.json`, fetched by `scripts/lib/openrouter/catalog-fetcher.cjs`) is a **weekly-diff feed**. Diff the prior vs current catalog via `scripts/lib/authority-watcher/index.cjs#diffOpenRouterCatalog(prevModels, currModels, { overrides })`, which classifies each delta as `new-model` / `pricing-change` / `deprecated` / `withdrawn`. To keep the report actionable and quiet, **surface ONLY `deprecated`/`withdrawn` entries whose id matches a configured `.design/config.json#openrouter_tier_overrides` pin** - i.e. the user pinned a model that is going away. `new-model` and `pricing-change` deltas are classified (returned, `surfaced:false`) but never surfaced as alerts (noise control). When OpenRouter is not configured (no catalog), this feed is silently skipped.
|
|
121
121
|
|
|
@@ -125,7 +125,7 @@ For each feed, merge the newly-fetched entries into `feeds[feed-id].entries`:
|
|
|
125
125
|
- Preserve the prior entries for ids not seen this run (stale entries persist until pruned).
|
|
126
126
|
- For ids seen this run, overwrite the prior record with `{ id, hash }` from the fresh fetch.
|
|
127
127
|
- Append order: existing retained entries first (oldest → newest), then new arrivals.
|
|
128
|
-
- **Prune: keep only the last 200 entries per feed
|
|
128
|
+
- **Prune: keep only the last 200 entries per feed.** This is a hard cap; the schema at `reference/schemas/authority-snapshot.schema.json` rejects >200 via `maxItems:200`, so pruning MUST happen before the write call.
|
|
129
129
|
|
|
130
130
|
Set `feeds[feed-id].last_fetched_at` to the current ISO8601 UTC timestamp. Set top-level `generated_at` to the same. Serialize with 2-space indentation.
|
|
131
131
|
|
|
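The merge and prune rules above could be sketched as follows. This is an illustration under assumptions: `mergeEntries` is a hypothetical name, entries are assumed to be `{ id, hash }` records, and "overwrite" is read as replacing the record in its existing position.

```javascript
const MAX_ENTRIES = 200; // hard cap, mirroring the schema's maxItems:200

// Hypothetical helper: merge freshly fetched entries into the prior list.
function mergeEntries(prior, fetched) {
  // Index this run's entries by id, keeping only the { id, hash } pair.
  const fresh = new Map(fetched.map((e) => [e.id, { id: e.id, hash: e.hash }]));
  const priorIds = new Set(prior.map((e) => e.id));
  // Retained entries keep their order; ids seen this run take the fresh hash.
  const retained = prior.map((e) => fresh.get(e.id) ?? { id: e.id, hash: e.hash });
  // New arrivals are appended after the retained entries.
  const arrivals = [...fresh.values()].filter((e) => !priorIds.has(e.id));
  // Prune before writing: keep only the last MAX_ENTRIES entries.
  return [...retained, ...arrivals].slice(-MAX_ENTRIES);
}
```

Because pruning keeps the tail of the list, the oldest retained entries are the first to be dropped once a feed exceeds the cap.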
@@ -177,7 +177,7 @@ N entries surfaced across M feeds. K skipped.
|
|
|
177
177
|
```
|
|
178
178
|
|
|
179
179
|
**Rules:**
|
|
180
|
-
- Classification sections ordered by weight: `spec-change` → `heuristic-update` → `pattern-guidance` → `craft-tip
|
|
180
|
+
- Classification sections ordered by weight: `spec-change` → `heuristic-update` → `pattern-guidance` → `craft-tip`.
|
|
181
181
|
- Omit a section entirely when its count is zero (signal density).
|
|
182
182
|
- The **Skipped** footer line is ALWAYS present - even when K=0 - for Plan 13.2-04 diff-test determinism.
|
|
183
183
|
- If `fetch_notes` is non-empty, append a `Fetch notes:` block after the Skipped line, one bullet per note:
|
|
@@ -188,7 +188,7 @@ N entries surfaced across M feeds. K skipped.
|
|
|
188
188
|
```
|
|
189
189
|
- Entry line format is exact: `- **[Title](url)** — feed: <feed-title> — *<rationale>*`. Em-dash (`—`), italicized rationale, no trailing period unless the rationale itself ends one.
|
|
190
190
|
|
|
191
|
-
## Step 7.5 - Emit `kfm-candidate` events
|
|
191
|
+
## Step 7.5 - Emit `kfm-candidate` events
|
|
192
192
|
|
|
193
193
|
After classifying the new entries (Step 5) but BEFORE writing the snapshot (Step 6), evaluate every NEW entry against the failure-mode-article whitelist. The whitelist patterns (case-insensitive) are:
|
|
194
194
|
|
|
@@ -221,9 +221,9 @@ Event payload shape - validates against `reference/schemas/events.schema.json` d
|
|
|
221
221
|
|
|
222
222
|
**Excerpt cap.** `raw_excerpt` MUST be ≤500 chars (the schema rejects longer). Truncate with a single-char ellipsis when the source summary exceeds 500.
|
|
223
223
|
|
|
224
|
-
**One event per matched entry.** Do NOT emit duplicates within a single run; if `event_id` is already present in the stream from a prior run, the writer's dedup logic handles it
|
|
224
|
+
**One event per matched entry.** Do NOT emit duplicates within a single run; if `event_id` is already present in the stream from a prior run, the writer's dedup logic handles it.
|
|
225
225
|
|
|
226
|
-
**No catalogue writes.** This step ONLY emits events. The
|
|
226
|
+
**No catalogue writes.** This step ONLY emits events. The reflector consumes them into `.design/reflections/incubator/kfm-<slug>/CATALOGUE-ENTRY.md` drafts; the user reviews via `/gdd:apply-reflections` and accepts/rejects. Authority-watcher NEVER writes to `reference/known-failure-modes.md` directly.
|
|
227
227
|
|
|
228
228
|
Programmatic helper available at `scripts/lib/authority-watcher/index.cjs` - `classifyArticles(articles) → events`. Callers in test harnesses use the helper directly; the agent emits events via the Bash equivalent.
|
|
229
229
|
|
|
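A minimal sketch of the excerpt cap described in this hunk, assuming the 500-character limit is counted in Unicode code points (as a JSON Schema `maxLength` would count it). `capExcerpt` is a hypothetical name, not the helper in `scripts/lib/authority-watcher/index.cjs`.

```javascript
const MAX_EXCERPT = 500;

// Hypothetical helper: return the summary unchanged when it fits, otherwise
// cut it so that the result, including a single-character ellipsis, is exactly
// MAX_EXCERPT code points long.
function capExcerpt(summary) {
  const chars = Array.from(summary); // split by code point, not UTF-16 unit
  if (chars.length <= MAX_EXCERPT) return summary;
  return chars.slice(0, MAX_EXCERPT - 1).join('') + '\u2026';
}
```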
@@ -238,7 +238,7 @@ When `X > 0`, the suffix `X kfm-candidate events emitted` is appended; when `X =
|
|
|
238
238
|
|
|
239
239
|
## Do Not
|
|
240
240
|
|
|
241
|
-
- Do NOT modify `agents/design-reflector.md`. Reflector integration
|
|
241
|
+
- Do NOT modify `agents/design-reflector.md`. Reflector integration lives in `skills/reflect/SKILL.md` only.
|
|
242
242
|
- Do NOT fetch URLs that are not listed in `reference/authority-feeds.md`. The whitelist is the allow-list.
|
|
243
243
|
- Do NOT spawn subagents - you have no `Task` tool for a reason.
|
|
244
244
|
- Do NOT commit on behalf of the user. `.design/authority-snapshot.json` and `.design/authority-report.md` both live under gitignored `.design/`.
|
|
package/agents/design-context-builder.md
CHANGED
|
@@ -218,7 +218,7 @@ Proceed to Step 0E regardless of whether Step 0D ran or was skipped.
|
|
|
218
218
|
|
|
219
219
|
Detect the **project type** so the pipeline routes the brief to the correct executor. Reuse the Step 0C / Step 1 grep/glob idiom (file reads only, < 1 second, no skip condition).
|
|
220
220
|
|
|
221
|
-
**Enum (7 values
|
|
221
|
+
**Enum (7 values):** `web` (DEFAULT) · `native-ios` · `native-android` · `flutter` · `email` · `print` · `conversational`. (The first six are the *rendered-output* set; `conversational` is the *interaction-surface* type - a chat/voice UI is still rendered code, so it routes to `design-executor` but loads the conversational patterns.)
|
|
222
222
|
|
|
223
223
|
**Detection signals + precedence** (first match wins; brief overrides - if the user explicitly says "iOS app" / "Android app" / "Flutter app" / "email" / "newsletter" / "email template" / "print" / "PDF" / "print-ready" / "brochure" / "flyer" / "poster", honor that):
|
|
224
224
|
|
|
@@ -246,7 +246,7 @@ Precedence: an explicit brief override (the user says "email" / "newsletter" / "
|
|
|
246
246
|
| print | pdf-executor |
|
|
247
247
|
| conversational | design-executor (loads `reference/conversational-ui.md`) |
|
|
248
248
|
|
|
249
|
-
<!--
|
|
249
|
+
<!-- Output types complete: native (native-ios/native-android/flutter) + email + print. The enum + routing table above are the full set. The routing table remains structurally extensible for future output types. -->
|
|
250
250
|
|
|
251
251
|
Record the detected type in DESIGN-CONTEXT.md as a `<project_type>` line (e.g. `<project_type>native-ios</project_type>`) so downstream stages route correctly. The native specifics (token→theme bridge) live in `reference/native-platforms.md`; the email specifics (table layout, inline styles, MSO/dark-mode constraints) live in `reference/email-design.md`; the print specifics (the `@page` box model, bleed/crop marks, CMYK awareness, font embedding, 300dpi raster) live in `reference/print-design.md`; the conversational specifics (voice-flow reprompts, multi-turn dialogue, prompt-as-UX, chatbot empty-states, error recovery) live in `reference/conversational-ui.md` - do not inline any of them here.
|
|
252
252
|
|
|
@@ -270,7 +270,7 @@ grep -iE "trading|portfolio|brokerage|fintech|patient|clinical|EHR|HIPAA|HUD|mul
|
|
|
270
270
|
node -e "const p=require('./package.json');const d={...p.dependencies,...p.devDependencies};console.log(Object.keys(d).join(' '))" 2>/dev/null # match against the dependency column
|
|
271
271
|
```
|
|
272
272
|
|
|
273
|
-
**Confidence rule
|
|
273
|
+
**Confidence rule:**
|
|
274
274
|
|
|
275
275
|
- **≥2 distinct signals, OR any dependency match → auto-apply** the pack. Record it + note it in the brief ("Detected finance domain - loaded finance-patterns.md").
|
|
276
276
|
- **Exactly 1 weak keyword signal → suggest**, don't impose ("This looks like a healthcare project - load healthcare-patterns.md? [y/N]").
|
|
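The two confidence-rule bullets visible in this hunk can be expressed as a small decision function. This is a sketch under assumptions: `packDecision` and its input shape are hypothetical, and the `'none'` result for zero signals is a guess, since that case is not shown in the hunk.

```javascript
// Hypothetical helper: decide how to treat a detected domain pack.
// `keywordSignals` is a list of matched keyword signals (duplicates allowed);
// `dependencyMatch` is true when any package.json dependency matched.
function packDecision({ keywordSignals, dependencyMatch }) {
  const distinct = new Set(keywordSignals).size;
  // >= 2 distinct signals, OR any dependency match: auto-apply the pack.
  if (dependencyMatch || distinct >= 2) return 'auto-apply';
  // Exactly one weak keyword signal: suggest, don't impose.
  if (distinct === 1) return 'suggest';
  // Assumption: with no signals, no pack is loaded.
  return 'none';
}
```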
@@ -417,7 +417,7 @@ The NOT is equally important:
|
|
|
417
417
|
|
|
418
418
|
### Area 5 - Visual References (cost-aware - free source first)
|
|
419
419
|
|
|
420
|
-
This area pulls real product references, resolving sources **cost-aware
|
|
420
|
+
This area pulls real product references, resolving sources **cost-aware: try the free source before any paid one.** Check `.design/STATE.md` `<connections>` for `lazyweb:` / `mobbin:` / `refero:` / `pinterest:` status before proceeding. Tool names may vary - verify via ToolSearch before calling. **Two or more references are required.**
|
|
421
421
|
|
|
422
422
|
**Tier 1 - Lazyweb (FREE - tried first; if `lazyweb: available`)**
|
|
423
423
|
|
|
package/agents/design-context-checker-gate.md
CHANGED
|
@@ -85,7 +85,7 @@ You MAY:
|
|
|
85
85
|
|
|
86
86
|
## Why this agent exists
|
|
87
87
|
|
|
88
|
-
|
|
88
|
+
Lazy Checker Spawning: cheap Haiku gate agents at `agents/*-gate.md` decide whether to spawn the full checker. If false, the full checker is skipped and logged as `lazy_skipped: true` in telemetry. This gate is the context-checker-specific instance of that pattern - the full `design-context-checker` runs a 6-dimension rubric against `.design/DESIGN-CONTEXT.md`. If the builder made no changes to that file in this cycle (a no-op re-run of discover, for example), the prior verdict still holds and the spawn is wasted cost.
|
|
89
89
|
|
|
90
90
|
## Record
|
|
91
91
|
|
|
package/agents/design-discussant.md
CHANGED
|
@@ -92,7 +92,7 @@ Rewrite STATE.md after each confirmed area so a crash does not lose work.
|
|
|
92
92
|
After each question-answer exchange, append one JSON object to `.design/learnings/question-quality.jsonl` (create file if it doesn't exist):
|
|
93
93
|
|
|
94
94
|
```json
|
|
95
|
-
{"ts":"<iso-timestamp>","question_id":"Q-NN","question_text":"<verbatim question>","answer_summary":"<one sentence>","quality":"high|medium|low|skipped","evidence":"<why — e.g. user said skip, answer < 10 words, answer overridden by
|
|
95
|
+
{"ts":"<iso-timestamp>","question_id":"Q-NN","question_text":"<verbatim question>","answer_summary":"<one sentence>","quality":"high|medium|low|skipped","evidence":"<why — e.g. user said skip, answer < 10 words, answer overridden by a later decision>","cycle":"<active-cycle-slug>"}
|
|
96
96
|
```
|
|
97
97
|
|
|
98
98
|
**Quality classification** (automatic, no user interaction):
|
|
@@ -101,7 +101,7 @@ After each question-answer exchange, append one JSON object to `.design/learning
|
|
|
101
101
|
- `medium` - answer ≥ 10 words but contains "maybe", "probably", "I think", "not sure", "I guess"
|
|
102
102
|
- `high` - specific, actionable, no hedging language
|
|
103
103
|
|
|
104
|
-
Write quality log after every exchange. This data feeds `design-reflector`'s question-quality analysis
|
|
104
|
+
Write quality log after every exchange. This data feeds `design-reflector`'s question-quality analysis.
|
|
105
105
|
|
|
106
106
|
## Constraints
|
|
107
107
|
|
|
package/agents/design-doc-writer.md
CHANGED
|
@@ -7,7 +7,7 @@ model: sonnet
|
|
|
7
7
|
default-tier: sonnet
|
|
8
8
|
tier-rationale: "Produces polished prose documentation; Sonnet's style quality is sufficient"
|
|
9
9
|
size_budget: XL
|
|
10
|
-
size_budget_rationale: "
|
|
10
|
+
size_budget_rationale: "Record contract added ~11 lines; base doc-writer body is 250-line tier"
|
|
11
11
|
parallel-safe: always
|
|
12
12
|
typical-duration-seconds: 45
|
|
13
13
|
reads-only: false
|