@hegemonart/get-design-done 1.57.1 → 1.57.3
- package/.claude-plugin/marketplace.json +26 -41
- package/.claude-plugin/plugin.json +23 -48
- package/CHANGELOG.md +139 -0
- package/README.md +166 -511
- package/SKILL.md +4 -6
- package/agents/README.md +33 -36
- package/agents/a11y-mapper.md +3 -3
- package/agents/component-benchmark-harvester.md +6 -6
- package/agents/component-benchmark-synthesizer.md +3 -3
- package/agents/compose-executor.md +3 -3
- package/agents/cost-forecaster.md +2 -2
- package/agents/design-auditor.md +7 -7
- package/agents/design-authority-watcher.md +15 -15
- package/agents/design-context-builder.md +4 -4
- package/agents/design-context-checker-gate.md +1 -1
- package/agents/design-discussant.md +2 -2
- package/agents/design-doc-writer.md +1 -1
- package/agents/design-executor.md +2 -2
- package/agents/design-figma-writer.md +2 -2
- package/agents/design-fixer.md +7 -7
- package/agents/design-integration-checker-gate.md +1 -1
- package/agents/design-integration-checker.md +1 -1
- package/agents/design-paper-writer.md +3 -3
- package/agents/design-pencil-writer.md +1 -1
- package/agents/design-planner.md +21 -0
- package/agents/design-reflector.md +39 -39
- package/agents/design-research-synthesizer.md +1 -0
- package/agents/design-start-writer.md +1 -1
- package/agents/design-update-checker.md +5 -5
- package/agents/design-verifier-gate.md +1 -1
- package/agents/design-verifier.md +52 -48
- package/agents/ds-generator.md +2 -2
- package/agents/ds-migration-planner.md +4 -4
- package/agents/email-executor.md +9 -9
- package/agents/experiment-result-ingester.md +3 -3
- package/agents/flutter-executor.md +5 -5
- package/agents/gdd-graph-refresh.md +3 -3
- package/agents/gdd-intel-updater.md +2 -2
- package/agents/motion-mapper.md +2 -2
- package/agents/motion-verifier.md +4 -4
- package/agents/pdf-executor.md +8 -8
- package/agents/perf-analyzer.md +17 -17
- package/agents/pr-commenter.md +9 -9
- package/agents/prototype-gate.md +2 -2
- package/agents/quality-gate-runner.md +1 -1
- package/agents/rollout-coordinator.md +3 -3
- package/agents/swift-executor.md +4 -4
- package/agents/ticket-sync-agent.md +6 -6
- package/agents/user-research-synthesizer.md +2 -2
- package/connections/connections.md +44 -45
- package/connections/cursor.md +72 -0
- package/connections/preview.md +3 -3
- package/hooks/first-run-nudge.cjs +171 -0
- package/hooks/gdd-intel-trigger.js +243 -0
- package/hooks/gdd-mcp-circuit-breaker.js +62 -7
- package/hooks/gdd-precompact-snapshot.js +50 -29
- package/hooks/gdd-protected-paths.js +150 -18
- package/hooks/gdd-risk-gate.js +93 -1
- package/hooks/gdd-sessionstart-recap.js +59 -24
- package/hooks/hooks.json +13 -4
- package/hooks/inject-using-gdd.cjs +188 -0
- package/hooks/update-check.cjs +511 -0
- package/package.json +9 -3
- package/reference/STATE-TEMPLATE.md +10 -13
- package/reference/audit-scoring.md +1 -1
- package/reference/cache-tier-doctrine.md +46 -0
- package/reference/config-schema.md +9 -9
- package/reference/i18n.md +1 -1
- package/reference/intel-schema.md +37 -2
- package/reference/meta-rules.md +4 -4
- package/reference/model-tiers.md +2 -2
- package/reference/registry.json +101 -94
- package/reference/runtime-models.md +11 -1
- package/reference/shared-preamble.md +13 -14
- package/reference/skill-graph.md +22 -3
- package/scripts/bootstrap.cjs +373 -0
- package/scripts/injection-patterns.cjs +58 -0
- package/scripts/lib/apply-reflections/incubator-proposals.cjs +57 -26
- package/scripts/lib/install/converters/codex-plugin.cjs +5 -2
- package/scripts/lib/install/converters/cursor.cjs +20 -0
- package/scripts/lib/issue-reporter/report-flow.cjs +1 -1
- package/scripts/lib/manifest/skills.json +75 -28
- package/scripts/lib/state/query-surface.cjs +67 -9
- package/scripts/lib/state/state-store.cjs +68 -26
- package/scripts/lib/worktree-resolve.cjs +4 -16
- package/sdk/cli/commands/stage.ts +17 -0
- package/sdk/cli/index.js +14 -0
- package/skills/README.md +46 -0
- package/skills/bootstrap-ds/SKILL.md +1 -1
- package/skills/cache-manager/SKILL.md +3 -3
- package/skills/cache-manager/cache-policy.md +1 -1
- package/skills/compare/SKILL.md +1 -1
- package/skills/design/SKILL.md +19 -0
- package/skills/explore/SKILL.md +11 -0
- package/skills/figma-write/SKILL.md +13 -2
- package/skills/new-cycle/SKILL.md +1 -1
- package/skills/paper-write/SKILL.md +54 -0
- package/skills/peer-cli-customize/SKILL.md +0 -1
- package/skills/peers/SKILL.md +1 -1
- package/skills/pencil-write/SKILL.md +54 -0
- package/skills/reflect/procedures/capability-gap-scan.md +0 -1
- package/skills/report-issue/SKILL.md +2 -2
- package/skills/report-issue/report-issue-procedure.md +0 -1
- package/skills/router/SKILL.md +2 -2
- package/skills/synthesize/SKILL.md +1 -1
- package/skills/turn-closeout/SKILL.md +1 -1
- package/skills/verify/verify-procedure.md +10 -11
- package/skills/warm-cache/SKILL.md +1 -1
- package/dist/claude-code/.claude/skills/add-backlog/SKILL.md +0 -48
- package/dist/claude-code/.claude/skills/analyze-dependencies/SKILL.md +0 -95
- package/dist/claude-code/.claude/skills/apply-reflections/SKILL.md +0 -109
- package/dist/claude-code/.claude/skills/apply-reflections/apply-reflections-procedure.md +0 -170
- package/dist/claude-code/.claude/skills/audit/SKILL.md +0 -79
- package/dist/claude-code/.claude/skills/bandit-status/SKILL.md +0 -94
- package/dist/claude-code/.claude/skills/benchmark/SKILL.md +0 -65
- package/dist/claude-code/.claude/skills/bootstrap-ds/SKILL.md +0 -43
- package/dist/claude-code/.claude/skills/brief/SKILL.md +0 -145
- package/dist/claude-code/.claude/skills/budget/SKILL.md +0 -45
- package/dist/claude-code/.claude/skills/cache-manager/SKILL.md +0 -66
- package/dist/claude-code/.claude/skills/cache-manager/cache-policy.md +0 -126
- package/dist/claude-code/.claude/skills/check-update/SKILL.md +0 -98
- package/dist/claude-code/.claude/skills/compare/SKILL.md +0 -82
- package/dist/claude-code/.claude/skills/compare/compare-rubric.md +0 -171
- package/dist/claude-code/.claude/skills/complete-cycle/SKILL.md +0 -81
- package/dist/claude-code/.claude/skills/connections/SKILL.md +0 -71
- package/dist/claude-code/.claude/skills/connections/connections-onboarding.md +0 -608
- package/dist/claude-code/.claude/skills/context/SKILL.md +0 -137
- package/dist/claude-code/.claude/skills/continue/SKILL.md +0 -24
- package/dist/claude-code/.claude/skills/darkmode/SKILL.md +0 -76
- package/dist/claude-code/.claude/skills/darkmode/darkmode-audit-procedure.md +0 -258
- package/dist/claude-code/.claude/skills/debug/SKILL.md +0 -41
- package/dist/claude-code/.claude/skills/debug/debug-feedback-loops.md +0 -119
- package/dist/claude-code/.claude/skills/design/SKILL.md +0 -99
- package/dist/claude-code/.claude/skills/design/design-procedure.md +0 -304
- package/dist/claude-code/.claude/skills/discover/SKILL.md +0 -78
- package/dist/claude-code/.claude/skills/discover/discover-procedure.md +0 -222
- package/dist/claude-code/.claude/skills/discuss/SKILL.md +0 -96
- package/dist/claude-code/.claude/skills/do/SKILL.md +0 -45
- package/dist/claude-code/.claude/skills/explore/SKILL.md +0 -107
- package/dist/claude-code/.claude/skills/explore/explore-procedure.md +0 -267
- package/dist/claude-code/.claude/skills/export/SKILL.md +0 -30
- package/dist/claude-code/.claude/skills/extract-learnings/SKILL.md +0 -114
- package/dist/claude-code/.claude/skills/fast/SKILL.md +0 -91
- package/dist/claude-code/.claude/skills/figma-extract/SKILL.md +0 -64
- package/dist/claude-code/.claude/skills/figma-write/SKILL.md +0 -39
- package/dist/claude-code/.claude/skills/graphify/SKILL.md +0 -49
- package/dist/claude-code/.claude/skills/health/SKILL.md +0 -99
- package/dist/claude-code/.claude/skills/health/health-mcp-detection.md +0 -44
- package/dist/claude-code/.claude/skills/health/health-skill-length-report.md +0 -69
- package/dist/claude-code/.claude/skills/help/SKILL.md +0 -87
- package/dist/claude-code/.claude/skills/instinct/SKILL.md +0 -111
- package/dist/claude-code/.claude/skills/list-assumptions/SKILL.md +0 -61
- package/dist/claude-code/.claude/skills/list-pins/SKILL.md +0 -27
- package/dist/claude-code/.claude/skills/live/SKILL.md +0 -98
- package/dist/claude-code/.claude/skills/locale/SKILL.md +0 -51
- package/dist/claude-code/.claude/skills/map/SKILL.md +0 -89
- package/dist/claude-code/.claude/skills/migrate/SKILL.md +0 -70
- package/dist/claude-code/.claude/skills/migrate-context/SKILL.md +0 -123
- package/dist/claude-code/.claude/skills/new-addendum/SKILL.md +0 -81
- package/dist/claude-code/.claude/skills/new-cycle/SKILL.md +0 -37
- package/dist/claude-code/.claude/skills/new-cycle/milestone-completeness-rubric.md +0 -87
- package/dist/claude-code/.claude/skills/new-project/SKILL.md +0 -53
- package/dist/claude-code/.claude/skills/new-skill/SKILL.md +0 -90
- package/dist/claude-code/.claude/skills/next/SKILL.md +0 -68
- package/dist/claude-code/.claude/skills/note/SKILL.md +0 -48
- package/dist/claude-code/.claude/skills/openrouter-status/SKILL.md +0 -86
- package/dist/claude-code/.claude/skills/optimize/SKILL.md +0 -97
- package/dist/claude-code/.claude/skills/override/SKILL.md +0 -86
- package/dist/claude-code/.claude/skills/pause/SKILL.md +0 -77
- package/dist/claude-code/.claude/skills/peer-cli-add/SKILL.md +0 -88
- package/dist/claude-code/.claude/skills/peer-cli-add/peer-cli-protocol.md +0 -161
- package/dist/claude-code/.claude/skills/peer-cli-customize/SKILL.md +0 -90
- package/dist/claude-code/.claude/skills/peers/SKILL.md +0 -96
- package/dist/claude-code/.claude/skills/pin/SKILL.md +0 -37
- package/dist/claude-code/.claude/skills/plan/SKILL.md +0 -105
- package/dist/claude-code/.claude/skills/plan/plan-procedure.md +0 -278
- package/dist/claude-code/.claude/skills/plant-seed/SKILL.md +0 -48
- package/dist/claude-code/.claude/skills/pr-branch/SKILL.md +0 -32
- package/dist/claude-code/.claude/skills/progress/SKILL.md +0 -107
- package/dist/claude-code/.claude/skills/quality-gate/SKILL.md +0 -90
- package/dist/claude-code/.claude/skills/quality-gate/threat-modeling.md +0 -101
- package/dist/claude-code/.claude/skills/quick/SKILL.md +0 -44
- package/dist/claude-code/.claude/skills/reapply-patches/SKILL.md +0 -32
- package/dist/claude-code/.claude/skills/recall/SKILL.md +0 -75
- package/dist/claude-code/.claude/skills/reflect/SKILL.md +0 -85
- package/dist/claude-code/.claude/skills/reflect/procedures/capability-gap-scan.md +0 -120
- package/dist/claude-code/.claude/skills/report-issue/SKILL.md +0 -53
- package/dist/claude-code/.claude/skills/report-issue/report-issue-procedure.md +0 -120
- package/dist/claude-code/.claude/skills/resume/SKILL.md +0 -93
- package/dist/claude-code/.claude/skills/review-backlog/SKILL.md +0 -46
- package/dist/claude-code/.claude/skills/review-decisions/SKILL.md +0 -42
- package/dist/claude-code/.claude/skills/roi/SKILL.md +0 -54
- package/dist/claude-code/.claude/skills/rollout-status/SKILL.md +0 -35
- package/dist/claude-code/.claude/skills/router/SKILL.md +0 -89
- package/dist/claude-code/.claude/skills/router/capability-gap-emitter.md +0 -65
- package/dist/claude-code/.claude/skills/router/router-pick-emitter.md +0 -78
- package/dist/claude-code/.claude/skills/router/router-rules.md +0 -84
- package/dist/claude-code/.claude/skills/scan/SKILL.md +0 -92
- package/dist/claude-code/.claude/skills/scan/scan-procedure.md +0 -732
- package/dist/claude-code/.claude/skills/settings/SKILL.md +0 -87
- package/dist/claude-code/.claude/skills/ship/SKILL.md +0 -48
- package/dist/claude-code/.claude/skills/sketch/SKILL.md +0 -78
- package/dist/claude-code/.claude/skills/sketch-wrap-up/SKILL.md +0 -92
- package/dist/claude-code/.claude/skills/skill-manifest/SKILL.md +0 -79
- package/dist/claude-code/.claude/skills/spike/SKILL.md +0 -67
- package/dist/claude-code/.claude/skills/spike-wrap-up/SKILL.md +0 -86
- package/dist/claude-code/.claude/skills/start/SKILL.md +0 -67
- package/dist/claude-code/.claude/skills/start/start-procedure.md +0 -115
- package/dist/claude-code/.claude/skills/state/SKILL.md +0 -106
- package/dist/claude-code/.claude/skills/stats/SKILL.md +0 -51
- package/dist/claude-code/.claude/skills/style/SKILL.md +0 -71
- package/dist/claude-code/.claude/skills/style/style-doc-procedure.md +0 -150
- package/dist/claude-code/.claude/skills/synthesize/SKILL.md +0 -94
- package/dist/claude-code/.claude/skills/timeline/SKILL.md +0 -66
- package/dist/claude-code/.claude/skills/todo/SKILL.md +0 -64
- package/dist/claude-code/.claude/skills/turn-closeout/SKILL.md +0 -95
- package/dist/claude-code/.claude/skills/undo/SKILL.md +0 -31
- package/dist/claude-code/.claude/skills/unlock-decision/SKILL.md +0 -54
- package/dist/claude-code/.claude/skills/unpin/SKILL.md +0 -31
- package/dist/claude-code/.claude/skills/update/SKILL.md +0 -56
- package/dist/claude-code/.claude/skills/using-gdd/SKILL.md +0 -78
- package/dist/claude-code/.claude/skills/verify/SKILL.md +0 -113
- package/dist/claude-code/.claude/skills/verify/verify-procedure.md +0 -512
- package/dist/claude-code/.claude/skills/warm-cache/SKILL.md +0 -81
- package/dist/claude-code/.claude/skills/watch-authorities/SKILL.md +0 -82
- package/dist/claude-code/.claude/skills/zoom-out/SKILL.md +0 -26
- package/hooks/first-run-nudge.sh +0 -82
- package/hooks/inject-using-gdd.sh +0 -72
- package/hooks/run-hook.cmd +0 -35
- package/hooks/update-check.sh +0 -251
- package/scripts/lib/audit-aggregator/index.cjs +0 -219
- package/scripts/lib/hedge-ensemble.cjs +0 -217
- package/skills/discover/SKILL.md +0 -78
- package/skills/discover/discover-procedure.md +0 -222
- package/skills/new-cycle/milestone-completeness-rubric.md +0 -87
- package/skills/scan/SKILL.md +0 -92
- package/skills/scan/scan-procedure.md +0 -732
@@ -92,7 +92,7 @@ Rewrite STATE.md after each confirmed area so a crash does not lose work.
 After each question-answer exchange, append one JSON object to `.design/learnings/question-quality.jsonl` (create file if it doesn't exist):
 
 ```json
-{"ts":"<iso-timestamp>","question_id":"Q-NN","question_text":"<verbatim question>","answer_summary":"<one sentence>","quality":"high|medium|low|skipped","evidence":"<why — e.g. user said skip, answer < 10 words, answer overridden by
+{"ts":"<iso-timestamp>","question_id":"Q-NN","question_text":"<verbatim question>","answer_summary":"<one sentence>","quality":"high|medium|low|skipped","evidence":"<why — e.g. user said skip, answer < 10 words, answer overridden by a later decision>","cycle":"<active-cycle-slug>"}
 ```
 
 **Quality classification** (automatic, no user interaction):
@@ -101,7 +101,7 @@ After each question-answer exchange, append one JSON object to `.design/learning
 - `medium` - answer ≥ 10 words but contains "maybe", "probably", "I think", "not sure", "I guess"
 - `high` - specific, actionable, no hedging language
 
-Write quality log after every exchange. This data feeds `design-reflector`'s question-quality analysis
+Write quality log after every exchange. This data feeds `design-reflector`'s question-quality analysis.
 
 ## Constraints
 
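The logging contract in the two hunks above can be sketched as a small CommonJS helper. This is a minimal illustration, not the plugin's code: `classifyQuality` and `appendQualityRecord` are hypothetical names, and only the `medium` and `high` rules are visible in the diff, so the `skipped` and `low` branches below are assumptions loosely based on the `evidence` example text.

```javascript
"use strict";
const fs = require("fs");
const path = require("path");

// Hedging phrases quoted in the `medium` rule shown in the hunk.
const HEDGES = ["maybe", "probably", "i think", "not sure", "i guess"];

// Hypothetical classifier: returns one of the four `quality` values.
function classifyQuality(answer, { skipped = false } = {}) {
  if (skipped) return "skipped"; // assumption: caller knows the user skipped
  const text = answer.trim().toLowerCase();
  const words = text.split(/\s+/).filter(Boolean).length;
  if (words < 10) return "low"; // assumption: this rule is not shown in the diff
  // >= 10 words: hedging language means `medium`, otherwise `high`.
  return HEDGES.some((hedge) => text.includes(hedge)) ? "medium" : "high";
}

// Appends one JSON object per line, creating the file and its directory if needed.
function appendQualityRecord(file, record) {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.appendFileSync(file, JSON.stringify(record) + "\n");
}
```

Calling `appendQualityRecord(".design/learnings/question-quality.jsonl", record)` once per exchange keeps the file append-only, which matches the JSONL shape the hunk describes.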
@@ -7,7 +7,7 @@ model: sonnet
 default-tier: sonnet
 tier-rationale: "Produces polished prose documentation; Sonnet's style quality is sufficient"
 size_budget: XL
-size_budget_rationale: "
+size_budget_rationale: "Record contract added ~11 lines; base doc-writer body is 250-line tier"
 parallel-safe: always
 typical-duration-seconds: 45
 reads-only: false
@@ -6,7 +6,7 @@ color: yellow
 default-tier: sonnet
 tier-rationale: "Follows an Opus-authored plan; executes rather than plans"
 size_budget: XXL
-size_budget_rationale: "
+size_budget_rationale: "Benchmark Spec Pre-Flight section for type:components adds ~17 lines"
 parallel-safe: conditional-on-touches
 typical-duration-seconds: 60
 reads-only: false
@@ -34,7 +34,7 @@ The orchestrating stage supplies a `<required_reading>` block in the prompt. Rea
 - `.design/STATE.md` - pipeline state (decisions, blockers, must-haves)
 - `.design/DESIGN-PLAN.md` - full task list (your task is identified by task_id)
 - `.design/DESIGN-CONTEXT.md` - brand decisions, constraints, locked choices
-- The reference file(s) relevant to the task type (e.g., `reference/typography.md` for a typography task). The 7 domain-index entry-points `reference/{typography,color,spatial,motion,interaction,responsive,ux-writing}.md`
+- The reference file(s) relevant to the task type (e.g., `reference/typography.md` for a typography task). The 7 domain-index entry-points `reference/{typography,color,spatial,motion,interaction,responsive,ux-writing}.md` are the navigation start: load the index, drill into the fragments it lists only as the task needs them.
 
 **Invariant:** read all listed files FIRST, before making any changes.
 
@@ -153,8 +153,8 @@ Build a numbered operation list based on mode. Do not execute yet.
 
 ```
 Proposed annotations (N operations):
-1. Layer "Button/Primary" → add comment: "Background: brand-primary-500 (#1A73E8) per
-2. Layer "Typography/H1" → add comment: "Font: Inter 32/40 per
+1. Layer "Button/Primary" → add comment: "Background: brand-primary-500 (#1A73E8) per color decision"
+2. Layer "Typography/H1" → add comment: "Font: Inter 32/40 per typography decision"
 ... (one line per annotation)
 ```
 
package/agents/design-fixer.md
CHANGED
@@ -23,7 +23,7 @@ You fix design gaps atomically. One agent invocation = fix all in-scope gaps fro
 
 You have zero session memory. Every invocation starts fresh. The orchestrating stage supplies all context via the `<required_reading>` block and prompt context fields - you rely entirely on those inputs.
 
-**Scope of work:** You apply targeted source-code fixes for gaps listed in `.design/DESIGN-VERIFICATION.md ##
+**Scope of work:** You apply targeted source-code fixes for gaps listed in `.design/DESIGN-VERIFICATION.md ## Stage 5 — Gaps`. You commit one fix per gap. You do nothing else.
 
 **Accessibility failures route here too.** When the quality-gate skill classifies a failure into the `a11y` bucket (sourced from axe / pa11y / lighthouse / jsx-a11y runs), it spawns you with that failure exactly like a `lint`, `type`, `test`, or `visual` failure. Treat an `a11y` classified failure as a normal in-scope fix: read the cited rule, apply the minimal source change that clears the violation (a missing label, an aria attribute, a contrast token), confirm the fix, and commit one fix per gap. No special handling beyond the standard fix sequence below.
 
@@ -43,7 +43,7 @@ You have zero session memory. Every invocation starts fresh. The orchestrating s
 The orchestrating stage supplies a `<required_reading>` block in the prompt. Read every listed file before acting - this is mandatory. Minimum expected files:
 
 - `.design/STATE.md` - pipeline state, blockers, decisions
-- `.design/DESIGN-VERIFICATION.md` - gaps to fix (##
+- `.design/DESIGN-VERIFICATION.md` - gaps to fix (## Stage 5 - Gaps section)
 - `.design/DESIGN-CONTEXT.md` - locked D-XX decisions; do not contradict them
 
 **Invariant:** read all listed files FIRST, before making any changes.
@@ -64,11 +64,11 @@ The stage embeds the following fields in the prompt:
 
 ## Gap Input Format
 
-Gaps are produced by design-verifier
+Gaps are produced by design-verifier Stage 5 and written to the `## Stage 5 — Gaps` section of `.design/DESIGN-VERIFICATION.md`. The format is locked:
 
 ```
 ### [BLOCKER|MAJOR|MINOR|COSMETIC] G-NN: [title]
--
+- Stage: [1|2|3|4]
 - Description: [what is broken]
 - Expected: [what should be true]
 - Actual: [what is true]
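The locked gap format in the hunk above lends itself to a line-based parser. The sketch below is an illustration under assumptions, not the fixer's real parsing logic: `parseGaps` is a hypothetical name, and because the template does not make clear whether the square brackets around the severity are literal or placeholder notation, the pattern accepts both.

```javascript
"use strict";

// Heading of one gap entry, e.g. "### BLOCKER G-03: Missing focus ring".
// Brackets around the severity are optional (assumption, see lead-in).
const GAP_HEADING = /^### \[?(BLOCKER|MAJOR|MINOR|COSMETIC)\]? (G-\d+): (.+)$/;
// A "- Key: value" field line under a gap heading.
const GAP_FIELD = /^- ([A-Za-z][\w ]*): (.*)$/;

// Hypothetical parser: returns one object per gap entry found in the text.
function parseGaps(markdown) {
  const gaps = [];
  let current = null;
  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();
    const heading = GAP_HEADING.exec(line);
    if (heading) {
      current = { severity: heading[1], id: heading[2], title: heading[3], fields: {} };
      gaps.push(current);
      continue;
    }
    if (/^#{1,6} /.test(line)) {
      current = null; // any other heading closes the current entry
      continue;
    }
    const field = GAP_FIELD.exec(line);
    if (current && field) current.fields[field[1].toLowerCase()] = field[2];
  }
  return gaps;
}
```

Keeping the fields in a plain lookup keyed by lowercase name means extra fields, such as a `confidence` line, are picked up without changing the parser.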
@@ -85,12 +85,12 @@ Parse every entry in that section. The `G-NN` identifier, severity classificatio
 ### Step 1 - Read gaps and filter by scope
 
 1. Read `.design/DESIGN-VERIFICATION.md`.
-2. Locate the `##
+2. Locate the `## Stage 5 — Gaps` section (or `## GAPS FOUND` if verifier used that heading).
 3. Parse all gap entries in locked G-NN format.
 4. Filter by severity based on `auto_mode`:
 - Always include: `BLOCKER`, `MAJOR`
 - Include only if `auto_mode=true`: `MINOR`, `COSMETIC`
-5. **Confidence routing filter (
+5. **Confidence routing filter (see `reference/reviewer-confidence-gate.md`).** Drop any gap that sits under a `## Tentative` heading: those never reach you. Then drop any `BLOCKER` or `MAJOR` gap whose `confidence` field is below `0.8` and route it to user review instead of auto-fix, since a high-severity gap without strong evidence is exactly the inflated-severity case the gate exists to catch. A gap missing its `confidence` field is treated as below the floor. The shared decision lives in `scripts/lib/confidence-route.cjs` (`route({ severity, confidence, tentative })` returns `'fix' | 'user-review' | 'drop'`); fix only the gaps it routes to `'fix'`.
 6. Build an ordered list: BLOCKER first, then MAJOR, then (if included) MINOR, COSMETIC.
 
 If no in-scope gaps are found (e.g., verifier found only MINOR gaps and `auto_mode=false`), emit `## FIX COMPLETE` immediately with "No in-scope gaps to fix."
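Steps 4 to 6 of the hunk above amount to a filter, a routing decision, and a sort. The following is a sketch reconstructed from that prose only; it is not the contents of `scripts/lib/confidence-route.cjs`. The `route` signature and return values come from the diff, while `selectGaps`, the constant names, and the choice to send `MINOR`/`COSMETIC` gaps to `'fix'` regardless of confidence are assumptions.

```javascript
"use strict";

const CONFIDENCE_FLOOR = 0.8; // floor stated in step 5
const SEVERITY_ORDER = ["BLOCKER", "MAJOR", "MINOR", "COSMETIC"];

// Sketch of the routing decision described in step 5.
function route({ severity, confidence, tentative }) {
  if (tentative) return "drop"; // gaps under "## Tentative" never reach the fixer
  const highSeverity = severity === "BLOCKER" || severity === "MAJOR";
  // A missing confidence field counts as below the floor.
  const belowFloor = typeof confidence !== "number" || confidence < CONFIDENCE_FLOOR;
  if (highSeverity && belowFloor) return "user-review";
  return "fix"; // assumption: lower severities are not gated on confidence
}

// Hypothetical helper combining steps 4, 5 and 6.
function selectGaps(gaps, autoMode) {
  return gaps
    .filter((g) => autoMode || g.severity === "BLOCKER" || g.severity === "MAJOR")
    .filter((g) => route(g) === "fix")
    .sort((a, b) => SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity));
}
```

An empty result from `selectGaps` corresponds to the "No in-scope gaps to fix." exit described at the end of the step.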
@@ -125,7 +125,7 @@ f. **Record status.** Note `G-NN: fixed` in your running tracker.
 - **Rule 3 - Blocking issue:** If something prevents applying this specific fix (missing import, wrong file structure), resolve the blocking issue first, then apply the fix → continue.
 - **Rule 4 - Architectural change required:** If resolving the gap requires a new DB table, major schema change, switching libraries, or breaking API changes → DO NOT force a fix. Classify as unresolvable and proceed to Step 3 for this gap.
 
-### Step 2.5 - Confidence x risk routing
+### Step 2.5 - Confidence x risk routing
 
 Step 1's confidence filter (`scripts/lib/confidence-route.cjs`) already dropped tentative and low-confidence gaps. Step 2.5 adds the action-risk dimension: a fix that is correct can still be dangerous to APPLY (touching STATE.md, a schema, a hook, a large diff). Score the write, then combine score and confidence into one routing decision per gap.
 
@@ -88,7 +88,7 @@ You MAY:
 
 ## Why this agent exists
 
-
+Lazy Checker Spawning: cheap Haiku gate agents at `agents/*-gate.md` decide whether to spawn the full checker. If false, the full checker is skipped and logged as `lazy_skipped: true` in telemetry. This gate is the integration-checker-specific instance of that pattern - the full `design-integration-checker` is a LARGE-size post-verification spawn that grep-walks the codebase for D-XX decision application. If no decision or anchor doc moved in the diff, the wiring result is unchanged from the last verify and the spawn is wasted cost.
 
 ## Record
 
@@ -21,7 +21,7 @@ writes: []
 
 You are a post-execution design decision wiring verifier. You confirm that each D-XX design decision recorded in `.design/DESIGN-CONTEXT.md` is actually reflected in the source code - not just described in planning documents.
 
-You are spawned by the verify stage **AFTER** design-verifier completes. You supplement the verifier's gap list with decision-wiring status: decisions that are documented but not applied in code are gaps that escaped
+You are spawned by the verify stage **AFTER** design-verifier completes. You supplement the verifier's gap list with decision-wiring status: decisions that are documented but not applied in code are gaps that escaped Stages 1–4 verification.
 
 You run once per verify session. You are read-only - no Write tool. Your findings are returned inline and incorporated into the verify stage's gap-response loop.
 
@@ -60,11 +60,11 @@ STOP.
 
 Read `.design/DESIGN-CONTEXT.md`. Build a numbered operation list per mode. Do NOT execute yet.
 
-**annotate mode** - extract confirmed
+**annotate mode** - extract confirmed decisions, map to canvas nodes:
 ```
 Proposed annotations (N operations):
-1. Node "Button/Primary" → add_comment: "bg: brand-primary-500 per
-2. Node "Typography/H1" → add_comment: "font: Inter 32/40 per
+1. Node "Button/Primary" → add_comment: "bg: brand-primary-500 per color decision"
+2. Node "Typography/H1" → add_comment: "font: Inter 32/40 per typography decision"
 ```
 
 **tokenize mode** - extract CSS literal values, map to paper.design style updates:
@@ -49,7 +49,7 @@ Parse mode: `annotate | roundtrip` (required). If absent, list modes and STOP.
 **annotate mode** - read `.design/DESIGN-DEBT.md`, map findings to .pen components:
 ```
 Proposed annotations (N operations):
-1. Button.pen → add comment: "DEBT: padding token mismatch —
+1. Button.pen → add comment: "DEBT: padding token mismatch — decision says 8px, impl uses 10px"
 2. Modal.pen → add comment: "DEBT: missing focus-trap per accessibility audit"
 ```
 
package/agents/design-planner.md
CHANGED
@@ -24,6 +24,27 @@ You are the design-planner agent. Spawned by the `plan` stage after optional res
 
 Do not start design work, generate code, or modify any file outside `.design/`. Your output is the plan that the `design` stage will execute.
 
+## Output Contract
+
+Emit a single top-of-response fenced ```json block conforming to `reference/output-contracts/planner-decision.schema.json` BEFORE any prose. The envelope captures the typed plan summary so downstream stages can consume it without re-parsing markdown.
+
+The envelope shape:
+
+```json
+{
+  "schema_version": "1.0.0",
+  "plan_id": "<dated slug, e.g. 2026-06-04-dashboard>",
+  "tasks": [
+    { "task_id": "T-1", "summary": "<one line>", "touches": ["<glob>"], "dependencies": [], "parallel_safe": true, "estimated_minutes": 30 }
+  ],
+  "waves": [
+    { "wave": "A", "task_ids": ["T-1"] }
+  ]
+}
+```
+
+After the envelope, continue with the human-readable plan body in prose + markdown tables (the existing format). The DESIGN-PLAN.md file you write continues to include both; the envelope at the top, then the prose. The `parse-contract.cjs#parsePlannerDecision` consumer reads only the envelope; orchestrators and reviewers read the prose.
+
 ---
 
 ## Required Reading
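The Output Contract added in the hunk above implies a consumer that reads only the leading JSON fence and ignores the prose. The sketch below shows one way such a reader could work; it is not the actual `parse-contract.cjs#parsePlannerDecision`, whose behavior the diff does not show. `extractEnvelope` is a hypothetical name, and the check that every wave references a known `task_id` is an added assumption rather than something the schema is known to require.

```javascript
"use strict";

// Three backticks, built at runtime so this example nests cleanly in markdown.
const FENCE = "`".repeat(3);
// Matches a json fence only at the very start of the response (leading whitespace allowed).
const LEADING_JSON_FENCE = new RegExp(
  "^\\s*" + FENCE + "json\\r?\\n([\\s\\S]*?)\\r?\\n" + FENCE
);

// Hypothetical reader for the planner envelope.
function extractEnvelope(response) {
  const match = LEADING_JSON_FENCE.exec(response);
  if (!match) throw new Error("response does not start with a json fence");
  const envelope = JSON.parse(match[1]);
  for (const key of ["schema_version", "plan_id", "tasks", "waves"]) {
    if (!(key in envelope)) throw new Error("envelope is missing " + key);
  }
  // Assumption: every task_id listed in a wave should exist in `tasks`.
  const known = new Set(envelope.tasks.map((task) => task.task_id));
  for (const wave of envelope.waves) {
    for (const id of wave.task_ids) {
      if (!known.has(id)) throw new Error("wave " + wave.wave + " references unknown task " + id);
    }
  }
  return envelope;
}
```

Anchoring the pattern at the start of the string is what enforces the "BEFORE any prose" rule: a response that opens with text and only later contains a JSON fence is rejected.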
@@ -5,7 +5,7 @@ tools: Read, Write, Bash, Grep, Glob
 color: purple
 model: inherit
 default-tier: opus
-tier-rationale: "
+tier-rationale: "Strategic reflector; reads telemetry + proposes plugin-level changes"
 size_budget: XL
 parallel-safe: never
 typical-duration-seconds: 60
@@ -24,7 +24,7 @@ You are a post-cycle reflection agent. You analyze what happened in a design cyc
 
 ## Event-Stream Mode (Phase 20 onwards)
 
-The reflector
+The reflector reads proposals from `.design/telemetry/events.jsonl` - the append-only event stream. It filters entries where `type === 'reflection.proposal'`. Each matching line is a JSON object whose `payload` carries fields like `{ source: <skill|hook>, proposal_kind: <string>, rationale: <string>, ... }` emitted by the producing skill or hook.
 
 Read flow:
 
@@ -33,17 +33,17 @@ Read flow:
 3. Collect every entry where `type === 'reflection.proposal'`. Render each payload into the appropriate Proposals section below.
 4. Cross-reference the event's `stage`, `cycle`, and `_meta.source` fields when citing evidence.
 
-Legacy grep-based parsing of skill outputs is preserved as a fallback for skills that haven't yet migrated to emit `reflection.proposal` events
+Legacy grep-based parsing of skill outputs is preserved as a fallback for skills that haven't yet migrated to emit `reflection.proposal` events. If no `reflection.proposal` events are present in the stream, run the legacy harvest across `.design/learnings/*.md` and `.design/intel/` exactly as before - both paths produce the same Proposals section format.
 
-## Capability-gap pattern scan
+## Capability-gap pattern scan
 
-During the reflection pass, also run the capability-gap pattern scan to detect recurring patterns lacking a dedicated executable owner. The scan emits `capability_gap` events with `source: "reflector_pattern"` for
+During the reflection pass, also run the capability-gap pattern scan to detect recurring patterns lacking a dedicated executable owner. The scan emits `capability_gap` events with `source: "reflector_pattern"` for downstream aggregation.
 
 ```
 node -e "console.log(JSON.stringify(require('./scripts/lib/reflector/capability-gap-scan.cjs').runCapabilityGapScan(), null, 2))"
 ```
 
-The scan reads three signal sources: `.design/intel/*.md` `Touches:` clusters, `.design/telemetry/posterior.json` high-usage arms with no specialized agent, and recent `.design/gep/events.jsonl` decision sequences. MCP-probe failures (`outcome === 'connection-error'`, `agent === 'mcp-probe'`, or `mcp_probe: true`) do NOT trigger gap events
+The scan reads three signal sources: `.design/intel/*.md` `Touches:` clusters, `.design/telemetry/posterior.json` high-usage arms with no specialized agent, and recent `.design/gep/events.jsonl` decision sequences. MCP-probe failures (`outcome === 'connection-error'`, `agent === 'mcp-probe'`, or `mcp_probe: true`) do NOT trigger gap events. See @skills/reflect/procedures/capability-gap-scan.md for the full contract.
 
 Cite the returned `emittedEventIds` in the run summary under a `## Capability gaps emitted` heading. The threshold knob is `reflector.capability_gap_threshold` in `.design/config.json` (default `N=3`, integer ≥ 1).
 
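The event-stream read flow and its legacy fallback, as described in the reflector hunks above, can be sketched as a pure function over the JSONL text. This is an illustration only: `collectProposals` and `needsLegacyHarvest` are hypothetical names, and silently skipping malformed lines is an assumption, since the diff does not say how the reflector treats them.

```javascript
"use strict";

// Returns every event whose type is 'reflection.proposal', in stream order.
function collectProposals(jsonlText) {
  const proposals = [];
  for (const line of jsonlText.split(/\r?\n/)) {
    if (!line.trim()) continue; // ignore blank lines
    let event;
    try {
      event = JSON.parse(line);
    } catch {
      continue; // assumption: malformed lines are skipped, not fatal
    }
    if (event && event.type === "reflection.proposal") proposals.push(event);
  }
  return proposals;
}

// The legacy grep-based harvest runs only when the stream holds no proposals.
function needsLegacyHarvest(proposals) {
  return proposals.length === 0;
}
```

A caller would read `.design/telemetry/events.jsonl` with `fs.readFileSync(path, "utf8")` and pass the text in; keeping file access outside the function makes the filter easy to exercise on fixtures.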
@@ -54,10 +54,10 @@ The orchestrating stage supplies a `<required_reading>` block in the prompt. Rea
 Minimum expected inputs (skip gracefully if absent, note what's missing):
 - `.design/STATE.md` - cycle identity, decisions, session history
 - `.design/DESIGN-VERIFICATION.md` - cycle outcome scores + gaps
-- `.design/learnings/*.md` - structured learnings from
-- `.design/telemetry/costs.jsonl` - per-agent-spawn cost data
-- `.design/agent-metrics.json` - aggregated agent performance data
-- `.design/learnings/question-quality.jsonl` - discussant answer quality log
+- `.design/learnings/*.md` - structured learnings from extract
+- `.design/telemetry/costs.jsonl` - per-agent-spawn cost data
+- `.design/agent-metrics.json` - aggregated agent performance data
+- `.design/learnings/question-quality.jsonl` - discussant answer quality log
 - `.design/cycles/<slug>/CYCLE-SUMMARY.md` - if present
 
 ## Output
@@ -66,13 +66,13 @@ Before writing any `.design/` artifact, resolve the main repo root via `scripts/
|
|
|
66
66
|
|
|
67
67
|
Write `.design/reflections/<cycle-slug>.md`. If `--dry-run` is set in the spawning prompt, print proposals to stdout only - do not write the file.
|
|
68
68
|
|
|
69
|
-
If the capability-gap pattern scan emitted any events during this run, include a `## Capability gaps emitted` heading listing each `event_id` with the source signal kind (`intel` | `posterior` | `trajectory`) and the `suggested_kind` (`agent` | `skill`) per event.
|
|
69
|
+
If the capability-gap pattern scan emitted any events during this run, include a `## Capability gaps emitted` heading listing each `event_id` with the source signal kind (`intel` | `posterior` | `trajectory`) and the `suggested_kind` (`agent` | `skill`) per event. Downstream consumers read these events from `.design/gep/events.jsonl` to cluster recurring `capability_gap` events for `/gdd:apply-reflections`.
|
|
70
70
|
|
|
71
71
|
Terminate with `## REFLECTION COMPLETE`.
|
|
72
72
|
|
|
73
73
|
## Reflection Sections
|
|
74
74
|
|
|
75
|
-
Write these sections in order. If source data is missing, write the section heading and a single note: "Source not found - requires
|
|
75
|
+
Write these sections in order. If source data is missing, write the section heading and a single note: "Source not found - requires upstream artifacts."
|
|
76
76
|
|
|
77
77
|
---
|
|
78
78
|
|
|
@@ -94,7 +94,7 @@ After listing standard surprises, apply the **Four Principles Checks** from `ref
|
|
|
94
94
|
|
|
95
95
|
Scan STATE.md `<decisions>` block for D-XX codes. Cross-reference `.design/learnings/` files from prior cycles if present. Flag decisions that: (a) appeared in multiple sessions of the same cycle, or (b) appear under the same keyword in learnings from ≥2 prior cycles. These are candidates for `reference/` additions.
|
|
96
96
|
|
|
97
|
-
**Per-author patterns (
|
|
97
|
+
**Per-author patterns (team mode).** When decisions carry the `[author= co-author=]` attribution suffix (see `reference/multi-author-model.md`), parse it with `scripts/lib/collab/attribution.cjs` (`parseDecisionsBlock` + `groupByAuthor`) and add a brief **Per-author patterns** sub-note: who locks decisions early, whose decisions get reverted or unlocked most, and any author whose decisions cluster around a recurring keyword. Skip silently when no decision is attributed (single-author projects).
|
|
98
98
|
|
|
99
99
|
### 3. Agent Performance
|
|
100
100
|
|
|
@@ -122,29 +122,29 @@ Read `.design/telemetry/costs.jsonl` (if exists). Aggregate per agent:
|
|
|
122
122
|
- Sustained underspend: < 40% of allocation for ≥3 cycles → `[BUDGET]` proposal to lower cap
|
|
123
123
|
- Consistent cap breaches: `cap_hit: true` ≥3 times → `[BUDGET]` proposal
|
|
124
124
|
|
|
125
|
-
If `.design/budget.json` doesn't exist: note "budget.json not found -
|
|
125
|
+
If `.design/budget.json` doesn't exist: note "budget.json not found - budget governance required."
|
|
126
126
|
|
|
127
|
-
### 7. Cross-runtime cost arbitrage
|
|
127
|
+
### 7. Cross-runtime cost arbitrage
|
|
128
128
|
|
|
129
|
-
**Why this exists:**
|
|
129
|
+
**Why this exists:** gdd ships to 14 runtimes (claude, codex, gemini, qwen, …). The same `(agent, tier)` pair can cost dramatically different amounts depending on which runtime executed the spawn - runtime-author pricing varies, and the user may already be paying for one runtime via subscription while paying per-token in another. This section surfaces those arbitrage opportunities as **structured, measurable signals** - never hand-wavy assumptions.
|
|
130
130
|
|
|
131
|
-
**Data source:** `.design/telemetry/events.jsonl` - filter entries where `type === 'cost.update'`. Each cost row is tagged with `payload.runtime`
|
|
131
|
+
**Data source:** `.design/telemetry/events.jsonl` - filter entries where `type === 'cost.update'`. Each cost row is tagged with `payload.runtime`, so spawns from different runtimes can be compared like for like. The reflector reads cost events from this stream alongside Section 6's `costs.jsonl` rollup; `events.jsonl` is authoritative for runtime attribution.
|
|
132
132
|
|
|
133
133
|
**The rule:**
|
|
134
134
|
|
|
135
|
-
For each `(agent, tier)` pair observed in the last 5 cycles (
|
|
135
|
+
For each `(agent, tier)` pair observed in the last 5 cycles (default window):
|
|
136
136
|
|
|
137
137
|
1. Bucket cost events by `(agent, tier, runtime, cycle)` and sum within each bucket. Sum-then-average is critical: a cycle that ran 4 design-verifier spawns in claude and 1 in codex must NOT inflate claude's per-cycle average by a factor of 4. Sum the 4 spawns into one cycle-sum, then average across the cycles where the runtime appeared.
|
|
138
138
|
2. Compute `avg_cost_per_cycle` per `(agent, tier, runtime)` triple, restricted to the recency window.
|
|
139
139
|
3. For each pair that has ≥2 runtimes in the window, find the cheapest and most expensive runtime. Compute `delta_pct = (max_avg - min_avg) / min_avg`.
|
|
140
|
-
4. If `delta_pct > 0.5` (50%,
|
|
140
|
+
4. If `delta_pct > 0.5` (50%, starting heuristic), emit a structured `cost_arbitrage` proposal.
|
|
141
141
|
|
|
142
142
|
**Important guardrails (failure modes the rule must avoid):**
|
|
143
143
|
|
|
144
144
|
- **Mixed-runtime cycles must not crash or double-count.** A single cycle where some agent spawns ran in CC and others in Codex is normal - runtime attribution is per-spawn (`payload.runtime`), never per-cycle.
|
|
145
145
|
- **Single-runtime-only history is silent.** If only one runtime has events for an `(agent, tier)` pair in the window, no arbitrage can be computed - emit nothing rather than a misleading "no comparison available" proposal.
|
|
146
146
|
- **Zero-cost denominators are skipped.** A runtime that averaged $0 in the window would produce `delta_pct: Infinity`; skip the pair rather than emit a useless signal.
|
|
147
|
-
- **The 50% threshold is a starting heuristic.** Bandit-style learning over arbitrage outcomes (was the proposal applied? did costs drop?) is
|
|
147
|
+
- **The 50% threshold is a starting heuristic.** Bandit-style learning over arbitrage outcomes (was the proposal applied? did costs drop?) belongs to the bandit posterior, NOT here. This section's job is to surface measurement signals; tier-selection learning is a separate data product.
|
|
148
148
|
|
|
149
149
|
**Helper:** `scripts/lib/cost-arbitrage.cjs` exports `analyze(events, options) → proposals[]` implementing the above rule deterministically. The executor agent following this skill loads `events.jsonl`, parses each line as JSON (skipping malformed lines), and passes the array of envelopes to `analyze()`. No re-derivation of the rule in prose - call the helper.
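For readers who want to see the sum-then-average rule concretely, the sketch below walks through steps 1-4 and the single-runtime and zero-cost guardrails. It is an illustration only, not the contents of `scripts/lib/cost-arbitrage.cjs`: the function name `sketchCostArbitrage`, the payload fields `agent`, `tier`, `cycle`, and `cost_usd`, and the shape of the returned objects are assumptions (only `type === 'cost.update'` and `payload.runtime` come from the text). It also assumes the events have already been restricted to the recency window.

```javascript
// Hypothetical sketch of the rule above; NOT the shipped helper.
// Assumed envelope: { type, payload: { agent, tier, runtime, cycle, cost_usd } }.
function sketchCostArbitrage(events, threshold = 0.5) {
  // Step 1: sum cost within each (agent, tier, runtime, cycle) bucket.
  const cycleSums = new Map();
  for (const ev of events) {
    if (!ev || ev.type !== 'cost.update' || !ev.payload) continue;
    const { agent, tier, runtime, cycle, cost_usd } = ev.payload;
    if (typeof cost_usd !== 'number') continue;
    const key = JSON.stringify([agent, tier, runtime, cycle]);
    cycleSums.set(key, (cycleSums.get(key) || 0) + cost_usd);
  }
  // Step 2: average the cycle sums per (agent, tier, runtime) triple.
  const perPair = new Map();
  for (const [key, sum] of cycleSums) {
    const [agent, tier, runtime] = JSON.parse(key);
    const pairKey = JSON.stringify([agent, tier]);
    if (!perPair.has(pairKey)) perPair.set(pairKey, new Map());
    const acc = perPair.get(pairKey).get(runtime) || { total: 0, cycles: 0 };
    acc.total += sum;
    acc.cycles += 1;
    perPair.get(pairKey).set(runtime, acc);
  }
  // Steps 3-4: compare cheapest vs most expensive runtime per pair.
  const proposals = [];
  for (const [pairKey, runtimes] of perPair) {
    if (runtimes.size < 2) continue; // single-runtime history is silent
    const avgs = [...runtimes]
      .map(([runtime, a]) => ({ runtime, avg: a.total / a.cycles }))
      .sort((a, b) => a.avg - b.avg);
    const min = avgs[0];
    const max = avgs[avgs.length - 1];
    if (min.avg === 0) continue; // zero-cost denominator is skipped
    const deltaPct = (max.avg - min.avg) / min.avg;
    if (deltaPct > threshold) {
      const [agent, tier] = JSON.parse(pairKey);
      proposals.push({
        type: 'cost_arbitrage', // output field names are illustrative
        agent,
        tier,
        cheapest_runtime: min.runtime,
        priciest_runtime: max.runtime,
        delta_pct: deltaPct,
      });
    }
  }
  return proposals;
}
```

Because spawns are summed per cycle before averaging, two spawns in one runtime during a single cycle count as one cycle-sum rather than two samples, which is the behaviour step 1 calls critical.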
|
|
150
150
|
|
|
@@ -165,17 +165,17 @@ For each `(agent, tier)` pair observed in the last 5 cycles (D-09 default window
|
|
|
165
165
|
}
|
|
166
166
|
```
|
|
167
167
|
|
|
168
|
-
Render each `cost_arbitrage` entry into the Proposals section as a `[BUDGET]`-tagged proposal carrying the structured payload verbatim - `/gdd:apply-reflections` will route it to the runtime-routing layer (
|
|
168
|
+
Render each `cost_arbitrage` entry into the Proposals section as a `[BUDGET]`-tagged proposal carrying the structured payload verbatim - `/gdd:apply-reflections` will route it to the runtime-routing layer (tier-resolver / runtime-detect) rather than to `.design/budget.json`.
|
|
169
169
|
|
|
170
170
|
---
|
|
171
171
|
|
|
172
|
-
### 8. Bandit-arbitrage analysis
|
|
172
|
+
### 8. Bandit-arbitrage analysis
|
|
173
173
|
|
|
174
|
-
**Why this exists:**
|
|
174
|
+
**Why this exists:** The bandit posterior + delegate dimension is wired into production. The posterior accumulates per-`(agent, bin, delegate, tier)` win-rates from real spawns. Once the posterior has enough data, the bandit's best-arm tier for an agent may differ from that agent's frontmatter `default-tier:` - a measurement signal that the frontmatter is stale. This section surfaces that signal as a `[FRONTMATTER]` proposal.
|
|
175
175
|
|
|
176
176
|
**Data sources:**
|
|
177
177
|
|
|
178
|
-
- `.design/telemetry/posterior.json` - the bandit posterior file written by
|
|
178
|
+
- `.design/telemetry/posterior.json` - the bandit posterior file written by `bandit-router.cjs` + production callers. Path matches `bandit-router.cjs`'s `DEFAULT_POSTERIOR_PATH`. If the file does not exist, skip this section with note "posterior.json not found - bandit wiring required."
|
|
179
179
|
- `agents/*.md` - read each agent's frontmatter `default-tier:` value. The reflector already parses frontmatter in Section 3 ("Agent Performance"); reuse that parse pass and build a `{agent: defaultTier}` map keyed by the agent's `name:` field.
|
|
180
180
|
|
|
181
181
|
**The rule:**
|
|
@@ -185,7 +185,7 @@ For each `(agent, bin)` slice in the posterior (defaulting to `delegate='none'`
|
|
|
185
185
|
1. Compute per-tier posterior mean = `α / (α + β)` and stddev = `sqrt(αβ / ((α+β)² · (α+β+1)))`.
|
|
186
186
|
2. Identify `posterior_best_tier = argmax(mean)` across the tiers present in the slice.
|
|
187
187
|
3. Gates (all must hold to emit):
|
|
188
|
-
- `sum(arm.count)` across the slice's tier rows >= 3 (
|
|
188
|
+
- `sum(arm.count)` across the slice's tier rows >= 3 ("3+ cycles" proxy).
|
|
189
189
|
- `(best_mean - second_best_mean) / second_best_mean >= 0.5` (50% delta heuristic).
|
|
190
190
|
- `stddev(best_tier) < 0.05` (credible interval narrow enough).
|
|
191
191
|
- `frontmatter[agent].default-tier !== posterior_best_tier` (the actual stale signal).
|
|
@@ -196,7 +196,7 @@ For each `(agent, bin)` slice in the posterior (defaulting to `delegate='none'`
|
|
|
196
196
|
- **Single-tier-only history is silent.** If only one tier has been pulled for `(agent, bin)`, no comparison is possible - emit nothing rather than a misleading "winner" proposal.
|
|
197
197
|
- **Wide credible intervals are silent.** Bandit posteriors are noisy early on; the 0.05 stddev gate ensures we only surface signals where the bandit is confident.
|
|
198
198
|
- **The 50% threshold is a starting heuristic.** Same discipline as cost-arbitrage Section 7 - bandit-learning over which arbitrage proposals were APPLIED (and whether the posterior subsequently shifted) is a separate (future) phase.
|
|
199
|
-
- **delegateFilter='none' is the
|
|
199
|
+
- **delegateFilter='none' is the current default.** Arbitrage analysis on the 5 peer-delegate slices is left for a future plan; current peer data is too sparse to credibly disagree with frontmatter.
|
|
200
200
|
|
|
201
201
|
**Helper:** `scripts/lib/bandit-arbitrage.cjs` exports `analyze(posterior, options) → proposals[]` implementing the above rule deterministically. The executor agent following this skill loads the posterior via `bandit-router.loadPosterior()`, builds the `{agent: defaultTier}` map from `agents/*.md` frontmatter, and passes both to `analyze()`. No re-derivation of the rule in prose - call the helper.
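The posterior statistics and the four gates visible in the rule above can be sketched as follows. This is a hedged illustration, not the body of `scripts/lib/bandit-arbitrage.cjs`: the names `betaStats` and `sketchBanditGate`, the arm shape `{ tier, alpha, beta, count }`, and the returned object are assumptions, and any gates not shown in the rule above are not covered.

```javascript
// Beta(alpha, beta) mean and standard deviation, per step 1 of the rule.
function betaStats(alpha, beta) {
  const n = alpha + beta;
  return {
    mean: alpha / n,
    stddev: Math.sqrt((alpha * beta) / (n * n * (n + 1))),
  };
}

// Hypothetical gate check for one (agent, bin) slice.
// Assumed arm shape: { tier, alpha, beta, count }.
function sketchBanditGate(arms, defaultTier) {
  if (arms.length < 2) return null; // single-tier history is silent
  const totalCount = arms.reduce((sum, arm) => sum + arm.count, 0);
  const ranked = arms
    .map((arm) => ({ tier: arm.tier, ...betaStats(arm.alpha, arm.beta) }))
    .sort((x, y) => y.mean - x.mean);
  const [best, second] = ranked;
  const passes =
    totalCount >= 3 && // "3+ cycles" proxy
    second.mean > 0 && // guard against division by zero (assumption of this sketch)
    (best.mean - second.mean) / second.mean >= 0.5 && // 50% delta heuristic
    best.stddev < 0.05 && // credible interval narrow enough
    defaultTier !== best.tier; // the actual stale signal
  return passes
    ? { posterior_best_tier: best.tier, best_mean: best.mean }
    : null;
}
```

For example, an arm with `alpha = 90, beta = 10` has mean 0.9 and a standard deviation just under 0.03, so it clears the 0.05 gate; a fresh `alpha = 1, beta = 1` arm has a standard deviation near 0.29 and stays silent.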
|
|
202
202
|
|
|
@@ -221,14 +221,14 @@ Render each `bandit_arbitrage` entry into the Proposals section as a `[FRONTMATT
|
|
|
221
221
|
|
|
222
222
|
---
|
|
223
223
|
|
|
224
|
-
### 9. Capability gaps observed
|
|
224
|
+
### 9. Capability gaps observed
|
|
225
225
|
|
|
226
|
-
**Why this exists:**
|
|
226
|
+
**Why this exists:** Capability-gap detectors emit `capability_gap` events to `.design/gep/events.jsonl` whenever `/gdd:fast`, `gdd-router`, or the reflector pattern-detection pass identifies a lookup-fail with no dedicated owner. This section surfaces those events as clusters in the cycle markdown and evaluates the Stage-0 → Stage-1 gate per `reference/capability-gap-stage-gate.md`.
|
|
227
227
|
|
|
228
228
|
**Data sources:**
|
|
229
229
|
|
|
230
|
-
- `.design/gep/events.jsonl` - the
|
|
231
|
-
- `.design/config.json` (optional) - `capability_gap_gate.{K, M, stddev_threshold}` overrides. Defaults: `K=3`, `M=10`, `stddev_threshold=0.05
|
|
230
|
+
- `.design/gep/events.jsonl` - the causal event chain. Rows where `type === 'capability_gap'` (or `outcome === 'capability_gap'`) are aggregated by `payload.context_hash`.
|
|
231
|
+
- `.design/config.json` (optional) - `capability_gap_gate.{K, M, stddev_threshold}` overrides. Defaults: `K=3`, `M=10`, `stddev_threshold=0.05`.
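The aggregation by `payload.context_hash` described above might look like the sketch below. It is not the shipped `aggregateCapabilityGaps`: the function name `sketchClusterGaps`, the choice to take raw JSONL text as input, and the decision to skip malformed lines are assumptions made for illustration.

```javascript
// Hypothetical sketch: count capability_gap rows per payload.context_hash.
// Input is the raw text of an events.jsonl file (one JSON object per line).
function sketchClusterGaps(jsonlText) {
  const clusters = new Map();
  for (const line of jsonlText.split('\n')) {
    if (!line.trim()) continue;
    let row;
    try {
      row = JSON.parse(line);
    } catch {
      continue; // skipping malformed lines is an assumption of this sketch
    }
    if (!row || typeof row !== 'object') continue;
    const isGap =
      row.type === 'capability_gap' || row.outcome === 'capability_gap';
    const hash = row.payload && row.payload.context_hash;
    if (!isGap || !hash) continue;
    clusters.set(hash, (clusters.get(hash) || 0) + 1);
  }
  return clusters; // Map of context_hash -> occurrence count
}
```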
|
|
232
232
|
|
|
233
233
|
**The mechanism:**
|
|
234
234
|
|
|
@@ -247,19 +247,19 @@ node scripts/lib/reflections-cycle-writer.cjs \
|
|
|
247
247
|
|
|
248
248
|
Append stdout to the cycle markdown body (after Section 8 / before the Proposals header). If `--history=<path>` is wired by a future cycle-aggregator, add the flag. For Stage 0 (this phase), per-cycle cluster aggregation alone is the deliverable - gate evaluation surfaces additively when history is present.
|
|
249
249
|
|
|
250
|
-
**Important discipline
|
|
250
|
+
**Important discipline:**
|
|
251
251
|
|
|
252
|
-
- This section NEVER auto-flips `capability_gap_gate.stage` or any other runtime state. The output is markdown only; the user opts in via
|
|
253
|
-
- The shim is read-only with respect to `.design/config.json`. The only state-mutating writer is the user-driven opt-in path
|
|
254
|
-
- `evidence_refs[]` content is rendered as-is in the markdown table examples column -
|
|
252
|
+
- This section NEVER auto-flips `capability_gap_gate.stage` or any other runtime state. The output is markdown only; the user opts in via the apply-reflections extension.
|
|
253
|
+
- The shim is read-only with respect to `.design/config.json`. The only state-mutating writer is the user-driven opt-in path.
|
|
254
|
+
- `evidence_refs[]` content is rendered as-is in the markdown table examples column - evidence refs are trusted-content (file:line or event-id strings from the capability-gap schema).
|
|
255
255
|
|
|
256
|
-
**Helper:** `scripts/lib/reflector-capability-gap-aggregator.cjs` exports `aggregateCapabilityGaps`, `renderGapsSection`, `evaluateStageGate`. The shim wraps these for invocation from the agent prompt; tests in `tests/reflector-capability-gap-aggregation.test.cjs` cover the helper directly with synthetic fixtures
|
|
256
|
+
**Helper:** `scripts/lib/reflector-capability-gap-aggregator.cjs` exports `aggregateCapabilityGaps`, `renderGapsSection`, `evaluateStageGate`. The shim wraps these for invocation from the agent prompt; tests in `tests/reflector-capability-gap-aggregation.test.cjs` cover the helper directly with synthetic fixtures.
|
|
257
257
|
|
|
258
258
|
---
|
|
259
259
|
|
|
260
260
|
## Atomic instincts
|
|
261
261
|
|
|
262
|
-
|
|
262
|
+
Alongside the prose reflection, emit atomic instinct units. For each pattern you observed this cycle that is small enough to state as a single trigger plus a one-line response, emit a structured instinct unit. The narrative below stays for human reading; this section is the machine-readable twin. Both are emitted for one minor version so readers and tooling migrate together.
|
|
263
263
|
|
|
264
264
|
Emit 0 to N units. Each unit follows `reference/instinct-format.md` exactly: YAML frontmatter (`id`, `trigger`, `confidence` from 0.3 to 0.9, `domain` from the format's enum, `scope`, `project_id`, `source`, `cycles_seen`, `first_seen`, `last_seen`) plus a short body. Set `source: design-reflector`. Set `confidence` from the strength of the evidence - a pattern seen once this cycle stays near 0.3 to 0.5; a pattern that recurs across prior learnings earns more. Do not exceed 0.9.
|
|
265
265
|
|
|
@@ -337,7 +337,7 @@ For each keyword cluster meeting threshold:
|
|
|
337
337
|
|
|
338
338
|
## Discussant Question Quality (generates [QUESTION] proposals)
|
|
339
339
|
|
|
340
|
-
Read `.design/learnings/question-quality.jsonl` (if exists). If it doesn't exist: skip and note "question-quality.jsonl not found - requires at least one discuss session with
|
|
340
|
+
Read `.design/learnings/question-quality.jsonl` (if exists). If it doesn't exist: skip and note "question-quality.jsonl not found - requires at least one discuss session with the discussant."
|
|
341
341
|
|
|
342
342
|
Aggregate per `question_id` across all entries:
|
|
343
343
|
- Compute: `(count_skipped + count_low) / total_asks`
|
|
@@ -355,9 +355,9 @@ For each flagged question, emit a `[QUESTION]` proposal:
|
|
|
355
355
|
|
|
356
356
|
## Budget Analysis (generates [BUDGET] proposals)
|
|
357
357
|
|
|
358
|
-
Read `.design/telemetry/costs.jsonl` (if exists). If it doesn't exist: skip and note "costs.jsonl not found -
|
|
358
|
+
Read `.design/telemetry/costs.jsonl` (if exists). If it doesn't exist: skip and note "costs.jsonl not found - telemetry required."
|
|
359
359
|
|
|
360
|
-
Read `.design/budget.json` to get per-agent cap allocations. If it doesn't exist: skip budget analysis and note "budget.json not found -
|
|
360
|
+
Read `.design/budget.json` to get per-agent cap allocations. If it doesn't exist: skip budget analysis and note "budget.json not found - budget governance required."
|
|
361
361
|
|
|
362
362
|
Aggregate per agent across cycles:
|
|
363
363
|
- **Sustained overspend**: `est_cost_usd` > (budget allocation × 1.2) in ≥3 consecutive cycles → propose raising cap
|
|
@@ -6,7 +6,7 @@ color: green
|
|
|
6
6
|
|
|
7
7
|
model: haiku
|
|
8
8
|
default-tier: haiku
|
|
9
|
-
tier-rationale: "Formatting + light synthesis over a bounded ~3KB input; Haiku is the correct tier
|
|
9
|
+
tier-rationale: "Formatting + light synthesis over a bounded ~3KB input; Haiku is the correct tier (writers/formatters with fixed schemas)."
|
|
10
10
|
|
|
11
11
|
parallel-safe: always
|
|
12
12
|
typical-duration-seconds: 10
|
|
@@ -1,11 +1,11 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: design-update-checker
|
|
3
|
-
description: Cold-path enrichment agent for /gdd:check-update --prompt. Reads .design/update-cache.json plus a release body supplied in the prompt, classifies the delta (major|minor|patch|off-cadence), and returns a 3-5-line human-friendly "what this release changes for you" summary. Does not write any file. Haiku-tier summarizer
|
|
3
|
+
description: Cold-path enrichment agent for /gdd:check-update --prompt. Reads .design/update-cache.json plus a release body supplied in the prompt, classifies the delta (major|minor|patch|off-cadence), and returns a 3-5-line human-friendly "what this release changes for you" summary. Does not write any file. Haiku-tier summarizer.
|
|
4
4
|
tools: Read, Grep, Glob
|
|
5
5
|
color: yellow
|
|
6
6
|
model: haiku
|
|
7
7
|
default-tier: haiku
|
|
8
|
-
tier-rationale: "Pure summarization + classification over a bounded 500-char input; Haiku is correct tier
|
|
8
|
+
tier-rationale: "Pure summarization + classification over a bounded 500-char input; Haiku is the correct tier for verifiers/checkers/summarizers."
|
|
9
9
|
parallel-safe: always
|
|
10
10
|
typical-duration-seconds: 8
|
|
11
11
|
reads-only: true
|
|
@@ -30,7 +30,7 @@ You have zero session memory. One invocation = one release summarized. Everythin
|
|
|
30
30
|
|
|
31
31
|
Before producing output you MUST read:
|
|
32
32
|
|
|
33
|
-
1. `.design/update-cache.json` - canonical delta classification, `current_tag`, `latest_tag`, `changelog_excerpt` (up to 500 chars of the release body). Written by `hooks/update-check.sh
|
|
33
|
+
1. `.design/update-cache.json` - canonical delta classification, `current_tag`, `latest_tag`, `changelog_excerpt` (up to 500 chars of the release body). Written by `hooks/update-check.sh`.
|
|
34
34
|
2. Any `release_body` string supplied in the spawning prompt context - may be fuller than the 500-char cache excerpt. Prefer it over `changelog_excerpt` when both are present.
|
|
35
35
|
|
|
36
36
|
If `.design/update-cache.json` does not exist, return exactly:
|
|
@@ -49,7 +49,7 @@ From the spawning prompt (`/gdd:check-update --prompt` in plan 13.3-04) you rece
|
|
|
49
49
|
|-------|------|---------|--------|
|
|
50
50
|
| `current_tag` | string | `v1.0.7` | installed plugin version |
|
|
51
51
|
| `latest_tag` | string | `v1.0.7.3` | GitHub Releases latest tag |
|
|
52
|
-
| `delta` | enum | `major` \| `minor` \| `patch` \| `off-cadence` | pre-classified by the hot-path hook
|
|
52
|
+
| `delta` | enum | `major` \| `minor` \| `patch` \| `off-cadence` | pre-classified by the hot-path hook |
|
|
53
53
|
| `release_body` | string (markdown) | release-notes markdown, up to a few thousand chars | optional; fuller than the cached excerpt |
|
|
54
54
|
|
|
55
55
|
When `release_body` is absent, fall back to `changelog_excerpt` from the cache. When both are missing, emit the fallback message above.
|
|
@@ -87,7 +87,7 @@ Bullets must name concrete user impact (new command, changed behavior, fixed bug
|
|
|
87
87
|
|
|
88
88
|
## Classification boundary
|
|
89
89
|
|
|
90
|
-
Do NOT reclassify the delta. The hot-path script (`hooks/update-check.sh`) already wrote `delta` into `.design/update-cache.json`
|
|
90
|
+
Do NOT reclassify the delta. The hot-path script (`hooks/update-check.sh`) already wrote `delta` into `.design/update-cache.json` (4-segment semver compare). Echo the incoming delta verbatim.
|
|
91
91
|
|
|
92
92
|
If the incoming delta looks wrong given the release body (e.g. body headlines a breaking change but `delta` is `patch`), note the discrepancy in a single inline line after the bullets - but do **not** override the field. Example:
|
|
93
93
|
|
|
@@ -92,7 +92,7 @@ You MAY:
|
|
|
92
92
|
|
|
93
93
|
## Why this agent exists
|
|
94
94
|
|
|
95
|
-
|
|
95
|
+
Lazy Checker Spawning: cheap Haiku gate agents at `agents/*-gate.md` decide whether to spawn the full checker. The gate agent reads the diff of changed files, applies a heuristic (design-system paths touched? copy strings touched? token files touched?), and returns `{spawn: true|false, rationale: '...'}`. If false, the full checker is skipped and logged as `lazy_skipped: true` in telemetry. This gate is the verifier-specific instance of that pattern - full `design-verifier` is an XL-size spawn and the most expensive single agent in the pipeline, so gating it behind a cheap Haiku diff-scan yields the largest single cost win among the lazy-spawn savings.
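The gate itself is a prompt-driven agent, but the shape of its decision can be illustrated in code. The sketch below is purely hypothetical: the function name `sketchGateDecision`, the path patterns, and the rationale strings are invented for illustration; only the `{spawn, rationale}` return shape and the three heuristic questions come from the text.

```javascript
// Hypothetical illustration of the gate's path heuristic; the real gate is an
// agent prompt, and these path patterns are invented for the example.
function sketchGateDecision(changedPaths) {
  const checks = [
    { label: 'design-system paths', re: /(^|\/)design-system\// },
    { label: 'copy strings', re: /(^|\/)(locales|i18n)\// },
    { label: 'token files', re: /(^|\/)tokens?\.[a-z]+$/ },
  ];
  const touched = checks
    .filter((check) => changedPaths.some((path) => check.re.test(path)))
    .map((check) => check.label);
  return touched.length > 0
    ? { spawn: true, rationale: `touched: ${touched.join(', ')}` }
    : { spawn: false, rationale: 'no design-relevant paths in diff' };
}
```

A `spawn: false` result corresponds to the case the text logs as `lazy_skipped: true`.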
|
|
96
96
|
|
|
97
97
|
## Record
|
|
98
98
|
|