@hegemonart/get-design-done 1.31.5 → 1.33.0

Files changed (38)
  1. package/.claude-plugin/marketplace.json +2 -2
  2. package/.claude-plugin/plugin.json +1 -1
  3. package/CHANGELOG.md +63 -0
  4. package/NOTICE +81 -5
  5. package/README.md +25 -0
  6. package/SKILL.md +4 -0
  7. package/hooks/hooks.json +9 -0
  8. package/hooks/inject-using-gdd.sh +72 -0
  9. package/hooks/run-hook.cmd +35 -0
  10. package/package.json +2 -2
  11. package/reference/schemas/events.schema.json +63 -1
  12. package/reference/schemas/pressure-scenario.schema.json +69 -0
  13. package/scripts/lib/health-mirror/index.cjs +79 -1
  14. package/scripts/lib/skill-behavior/runner.cjs +187 -0
  15. package/scripts/lib/skill-behavior/stub-invoker.cjs +95 -0
  16. package/scripts/lib/skill-behavior/telemetry.cjs +379 -0
  17. package/sdk/mcp/gdd-mcp/server.js +42 -0
  18. package/skills/audit/SKILL.md +13 -0
  19. package/skills/brief/SKILL.md +25 -0
  20. package/skills/design/SKILL.md +17 -0
  21. package/skills/discuss/SKILL.md +13 -0
  22. package/skills/explore/SKILL.md +17 -0
  23. package/skills/health/SKILL.md +6 -0
  24. package/skills/plan/SKILL.md +25 -0
  25. package/skills/router/SKILL.md +4 -0
  26. package/skills/router/router-pick-emitter.md +78 -0
  27. package/skills/using-gdd/SKILL.md +78 -0
  28. package/skills/verify/SKILL.md +17 -0
  29. package/scripts/lib/cli/index.ts +0 -29
  30. package/scripts/lib/error-classifier.cjs +0 -29
  31. package/scripts/lib/event-stream/index.ts +0 -29
  32. package/scripts/lib/gdd-errors/index.ts +0 -29
  33. package/scripts/lib/gdd-state/index.ts +0 -29
  34. package/scripts/lib/iteration-budget.cjs +0 -29
  35. package/scripts/lib/jittered-backoff.cjs +0 -29
  36. package/scripts/lib/lockfile.cjs +0 -29
  37. package/scripts/mcp-servers/gdd-mcp/server.ts +0 -35
  38. package/scripts/mcp-servers/gdd-state/server.ts +0 -34
package/scripts/lib/skill-behavior/telemetry.cjs
@@ -0,0 +1,379 @@
1
+ /**
2
+ * telemetry.cjs — reflector-telemetry layer for the pressure-scenario harness
3
+ * (Plan 33-05). The third leg of Phase 33: it CONSUMES the 33-01 runner result
4
+ * ({ scenario, target_skill, pass, compliance_hits, violation_hits }), records a
5
+ * scenario-failure event to a JSONL artifact, detects SUSTAINED failure, and on
6
+ * sustained failure produces a PROPOSE-ONLY reflector content-edit draft via the
7
+ * same incubator/apply-reflections surface the shipped reflector-kfm-proposer
8
+ * uses.
9
+ *
10
+ * Why this module exists: behavior tests only matter if a sustained failure
11
+ * prompts a content fix. This closes that loop — a failing run is recorded; when
12
+ * a scenario fails ≥3 of its last 10 runs (D-07 threshold), the reflector
13
+ * proposes a skill-content edit for human review via /gdd:apply-reflections. The
14
+ * proposal NEVER auto-edits a skill (Phase 11/29 propose-only SC; Phase 33
15
+ * out-of-scope: "Auto-applying reflector-proposed skill edits — propose-only").
16
+ *
17
+ * Decisions honored:
18
+ * * D-07 — telemetry → .design/telemetry/skill-behavior.jsonl (runtime
19
+ * artifact, gitignored, local); sustained-failure signal = ≥3 of the last 10
20
+ * runs failing for a scenario; reflector consumption is STUB-tested (no live
21
+ * runs — all paths + the clock are injectable so tests use a tmp dir).
22
+ * * D-06 — this module is exercised by the DEFAULT suite (no API key / no LLM).
23
+ *
24
+ * Injectability / purity:
25
+ * The JSONL path, the incubator root, `fs`, and the clock (`now`) are ALL
26
+ * injectable via opts so every test writes to an os.tmpdir() dir and NOTHING
27
+ * touches the real .design/ tree. The runner (33-01) does NOT stamp a `ts`;
28
+ * the timestamp is stamped HERE via the injected `now`.
29
+ *
30
+ * Pattern references (style mirrored, NOT imported):
31
+ * * scripts/lib/event-chain.cjs — house JSONL append (defensive mkdir -p +
32
+ * append, never-throw) + findRepoRoot + line-by-line read idiom.
33
+ * * scripts/lib/reflector-kfm-proposer.cjs — shouldPropose-style stability gate
34
+ * + proposeKfmDraft writing a proposal-only draft under
35
+ * .design/reflections/incubator/<slug>/CATALOGUE-ENTRY.md.
36
+ *
37
+ * Public API:
38
+ * recordRun(result, opts) → event | null (append on pass:false)
39
+ * readRuns(scenario, opts) → Array<event> (tail JSONL, filter)
40
+ * isSustainedFailure(scenario, opts) → boolean (≥3 of last 10 failed)
41
+ * maybeProposeReflection(scenario, opts) → { action:'drafted', path, slug }
42
+ * | { action:'skipped', reason }
43
+ *
44
+ * Pure CommonJS, deps = node:fs + node:path ONLY. No npm dependencies.
45
+ */
46
+
47
+ 'use strict';
48
+
49
+ const nodeFs = require('node:fs');
50
+ const path = require('node:path');
51
+
52
+ // -------------------------------------------------------------------
53
+ // Constants
54
+ // -------------------------------------------------------------------
55
+
56
+ const EVENT_TYPE = 'skill_behavior_failure';
57
+ const DEFAULT_JSONL_REL = '.design/telemetry/skill-behavior.jsonl';
58
+ const DEFAULT_INCUBATOR_REL = '.design/reflections/incubator';
59
+ const SUSTAINED_WINDOW = 10; // D-07: look at the last N runs
60
+ const SUSTAINED_THRESHOLD = 3; // D-07: ≥3 failures of the last 10 == sustained
61
+ const INCUBATOR_PREFIX = 'skill-edit-';
62
+
63
+ // -------------------------------------------------------------------
64
+ // Helpers
65
+ // -------------------------------------------------------------------
66
+
67
+ /**
68
+ * Walk up from a start dir until a package.json is found (repo root). Mirrors
69
+ * the reflector-kfm-proposer / event-chain findRepoRoot idiom.
70
+ *
71
+ * @param {string} [startDir]
72
+ * @returns {string}
73
+ */
74
+ function findRepoRoot(startDir) {
75
+ let dir = startDir || __dirname;
76
+ for (let i = 0; i < 12; i++) {
77
+ if (nodeFs.existsSync(path.join(dir, 'package.json'))) return dir;
78
+ const parent = path.dirname(dir);
79
+ if (parent === dir) break;
80
+ dir = parent;
81
+ }
82
+ return path.resolve(__dirname, '..', '..', '..');
83
+ }
84
+
85
+ /**
86
+ * Resolve the JSONL emit path: explicit opts.jsonlPath wins (absolute or
87
+ * relative to cwd); otherwise <repoRoot>/.design/telemetry/skill-behavior.jsonl.
88
+ */
89
+ function resolveJsonlPath(opts) {
90
+ const o = opts || {};
91
+ if (o.jsonlPath) {
92
+ return path.isAbsolute(o.jsonlPath)
93
+ ? o.jsonlPath
94
+ : path.resolve(o.repoRoot || process.cwd(), o.jsonlPath);
95
+ }
96
+ return path.join(o.repoRoot || findRepoRoot(), DEFAULT_JSONL_REL);
97
+ }
98
+
99
+ /**
100
+ * Resolve the incubator draft root: explicit opts.incubatorRoot wins; otherwise
101
+ * <repoRoot>/.design/reflections/incubator.
102
+ */
103
+ function resolveIncubatorRoot(opts) {
104
+ const o = opts || {};
105
+ if (o.incubatorRoot) {
106
+ return path.isAbsolute(o.incubatorRoot)
107
+ ? o.incubatorRoot
108
+ : path.resolve(o.repoRoot || process.cwd(), o.incubatorRoot);
109
+ }
110
+ return path.join(o.repoRoot || findRepoRoot(), DEFAULT_INCUBATOR_REL);
111
+ }
112
+
113
+ /**
114
+ * Kebab-case slug from a free-text scenario name (mirrors the reflector-kfm
115
+ * deriveSlug semantics — ASCII-only, dash-collapsed, ≤40 chars).
116
+ */
117
+ function deriveSlug(text) {
118
+ const raw = typeof text === 'string' ? text : '';
119
+ let s = raw.toLowerCase();
120
+ s = s.replace(/[^\x20-\x7e]+/g, '');
121
+ s = s.replace(/[^a-z0-9]+/g, '-');
122
+ s = s.replace(/-+/g, '-');
123
+ s = s.replace(/^-+|-+$/g, '');
124
+ if (s.length > 40) s = s.slice(0, 40);
125
+ s = s.replace(/-+$/g, '');
126
+ return s || 'unnamed';
127
+ }
128
+
129
+ // -------------------------------------------------------------------
130
+ // recordRun — emit a scenario-failure event to the JSONL artifact
131
+ // -------------------------------------------------------------------
132
+
133
+ /**
134
+ * Append ONE scenario-failure event to the JSONL artifact when a 33-01 runner
135
+ * result has pass:false. The timestamp is stamped HERE via the injected clock
136
+ * (the runner does not emit a `ts`). On a passing result, returns null (the
137
+ * sustained-failure detector reads failures only).
138
+ *
139
+ * Never throws on a missing .design/ tree — mkdir -p the parent defensively and
140
+ * swallow write errors (mirrors event-chain.cjs).
141
+ *
142
+ * EVENT SHAPE:
143
+ * { event_type:'skill_behavior_failure', scenario, target_skill?, pass:false,
144
+ * compliance_hits, violation_hits, ts }
145
+ *
146
+ * @param {{ scenario:string, target_skill?:string, pass:boolean,
147
+ * compliance_hits?:number, violation_hits?:number }} result
148
+ * @param {{ jsonlPath?:string, fs?:typeof import('node:fs'),
149
+ * now?:() => number|string, repoRoot?:string }} [opts]
150
+ * @returns {object | null} the event (returned even if the write failed), or null on a passing or non-object result
151
+ */
152
+ function recordRun(result, opts) {
153
+ const o = opts || {};
154
+ const fs = o.fs || nodeFs;
155
+ const now = typeof o.now === 'function' ? o.now : () => new Date().toISOString();
156
+
157
+ if (!result || typeof result !== 'object') return null;
158
+ // Detector reads FAILURES only — a passing run emits nothing.
159
+ if (result.pass !== false) return null;
160
+
161
+ const event = {
162
+ event_type: EVENT_TYPE,
163
+ scenario: result.scenario,
164
+ pass: false,
165
+ compliance_hits: Number.isFinite(result.compliance_hits) ? result.compliance_hits : 0,
166
+ violation_hits: Number.isFinite(result.violation_hits) ? result.violation_hits : 0,
167
+ ts: now(),
168
+ };
169
+ // Preserve target_skill when the runner supplied it (useful for the proposal).
170
+ if (result.target_skill !== undefined) event.target_skill = result.target_skill;
171
+
172
+ const jsonlPath = resolveJsonlPath(o);
173
+ try {
174
+ fs.mkdirSync(path.dirname(jsonlPath), { recursive: true });
175
+ fs.appendFileSync(jsonlPath, JSON.stringify(event) + '\n', { flag: 'a' });
176
+ } catch (err) {
177
+ // Defensive: telemetry must never crash a run. Mirror event-chain.cjs.
178
+ try {
179
+ process.stderr.write(
180
+ `[skill-behavior-telemetry] write failed: ${err && err.message ? err.message : String(err)}\n`,
181
+ );
182
+ } catch (_e) {
183
+ /* swallow */
184
+ }
185
+ }
186
+ return event;
187
+ }
188
+
189
+ // -------------------------------------------------------------------
190
+ // readRuns — tail the JSONL, filter by scenario
191
+ // -------------------------------------------------------------------
192
+
193
+ /**
194
+ * Read the JSONL artifact and return every recorded event for `scenario`, in
195
+ * file order (oldest → newest). Defensive on a missing file: returns []. Invalid
196
+ * JSON lines are skipped.
197
+ *
198
+ * @param {string} scenario
199
+ * @param {{ jsonlPath?:string, fs?:typeof import('node:fs'), repoRoot?:string }} [opts]
200
+ * @returns {Array<object>}
201
+ */
202
+ function readRuns(scenario, opts) {
203
+ const o = opts || {};
204
+ const fs = o.fs || nodeFs;
205
+ const jsonlPath = resolveJsonlPath(o);
206
+ if (!fs.existsSync(jsonlPath)) return [];
207
+
208
+ let raw;
209
+ try {
210
+ raw = fs.readFileSync(jsonlPath, 'utf8');
211
+ } catch (_e) {
212
+ return [];
213
+ }
214
+
215
+ const out = [];
216
+ for (const line of raw.split('\n')) {
217
+ if (line.trim() === '') continue;
218
+ let rec;
219
+ try {
220
+ rec = JSON.parse(line);
221
+ } catch (_e) {
222
+ continue; // skip malformed line
223
+ }
224
+ if (rec && rec.scenario === scenario) out.push(rec);
225
+ }
226
+ return out;
227
+ }
228
+
229
+ // -------------------------------------------------------------------
230
+ // isSustainedFailure — ≥3 of the last 10 runs failed for a scenario (D-07)
231
+ // -------------------------------------------------------------------
232
+
233
+ /**
234
+ * Sustained-failure detector. Considers the LAST 10 runs for `scenario` and
235
+ * returns true iff ≥3 of them failed (D-07). Accepts EITHER an in-memory
236
+ * opts.window (array of `{ pass }` objects — for unit tests) OR reads the
237
+ * on-disk JSONL tail via readRuns().
238
+ *
239
+ * Boundary: 2/10 → false, 3/10 → true; strictly windowed to the last 10 (older
240
+ * failures excluded).
241
+ *
242
+ * Note: recordRun only persists FAILURE events, so the on-disk path counts each
243
+ * recorded row as a failure. The in-memory window path inspects `pass` so tests
244
+ * can mix pass/fail entries to exercise the windowing math precisely.
245
+ *
246
+ * @param {string} scenario
247
+ * @param {{ window?:Array<{pass:boolean}>, jsonlPath?:string,
248
+ * fs?:typeof import('node:fs'), window_size?:number,
249
+ * threshold?:number, repoRoot?:string }} [opts]
250
+ * @returns {boolean}
251
+ */
252
+ function isSustainedFailure(scenario, opts) {
253
+ const o = opts || {};
254
+ const windowSize = Number.isInteger(o.window_size) && o.window_size > 0 ? o.window_size : SUSTAINED_WINDOW;
255
+ const threshold = Number.isInteger(o.threshold) && o.threshold > 0 ? o.threshold : SUSTAINED_THRESHOLD;
256
+
257
+ let runs;
258
+ if (Array.isArray(o.window)) {
259
+ runs = o.window;
260
+ } else {
261
+ runs = readRuns(scenario, o);
262
+ }
263
+
264
+ // Strictly the LAST `windowSize` runs.
265
+ const tail = runs.slice(-windowSize);
266
+ // A row counts as a failure unless pass === true. On-disk rows are all failures
267
+ // (recordRun only persists pass:false), so a missing `pass` defaults to failed
268
+ // for the disk path; the in-memory window always carries an explicit `pass`.
269
+ const failures = tail.filter((r) => r && r.pass !== true).length;
270
+ return failures >= threshold;
271
+ }
272
+
273
+ // -------------------------------------------------------------------
274
+ // maybeProposeReflection — propose-only reflector content-edit draft
275
+ // -------------------------------------------------------------------
276
+
277
+ /**
278
+ * Reflector consumption point (mirrors reflector-kfm-proposer's shouldPropose +
279
+ * proposeKfmDraft idiom): gate on isSustainedFailure(scenario); if NOT sustained
280
+ * return { action:'skipped', reason:'below_sustained_threshold' }; if sustained,
281
+ * write a PROPOSE-ONLY draft under the (injectable) incubator root at
282
+ * <incubatorRoot>/skill-edit-<slug>/CATALOGUE-ENTRY.md (slug = deriveSlug(scenario)) naming the failing
283
+ * scenario/skill + the sustained-failure signal + a TODO for the content edit,
284
+ * and return { action:'drafted', path, slug }.
285
+ *
286
+ * This draft lands in the SAME incubator tree that
287
+ * scripts/lib/apply-reflections/incubator-proposals.cjs surfaces in
288
+ * /gdd:apply-reflections — so a maintainer reviews + accepts/rejects the proposed
289
+ * skill edit there. It NEVER auto-edits a skill (Phase 11/29 propose-only SC;
290
+ * Phase 33 out-of-scope).
291
+ *
292
+ * @param {string} scenario
293
+ * @param {{ window?:Array<{pass:boolean}>, jsonlPath?:string,
294
+ * incubatorRoot?:string, fs?:typeof import('node:fs'),
295
+ * now?:() => number|string, target_skill?:string,
296
+ * repoRoot?:string }} [opts]
297
+ * @returns {{ action:'drafted', path:string, slug:string }
298
+ * | { action:'skipped', reason:string }}
299
+ */
300
+ function maybeProposeReflection(scenario, opts) {
301
+ const o = opts || {};
302
+ const fs = o.fs || nodeFs;
303
+ const now = typeof o.now === 'function' ? o.now : () => new Date().toISOString();
304
+
305
+ // Stability gate — the ≥3/10 sustained-failure threshold (analogous to the
306
+ // reflector-kfm ≥K gate).
307
+ if (!isSustainedFailure(scenario, o)) {
308
+ return { action: 'skipped', reason: 'below_sustained_threshold' };
309
+ }
310
+
311
+ const slug = `${INCUBATOR_PREFIX}${deriveSlug(scenario)}`;
312
+ const incubatorRoot = resolveIncubatorRoot(o);
313
+ const draftDir = path.join(incubatorRoot, slug);
314
+ const draftPath = path.join(draftDir, 'CATALOGUE-ENTRY.md');
315
+
316
+ // Best-effort target_skill: prefer an injected hint, else the latest recorded
317
+ // failure event for this scenario (recordRun stamps target_skill).
318
+ let targetSkill = o.target_skill;
319
+ if (!targetSkill && !Array.isArray(o.window)) {
320
+ const recorded = readRuns(scenario, o);
321
+ const last = recorded.length ? recorded[recorded.length - 1] : null;
322
+ if (last && last.target_skill) targetSkill = last.target_skill;
323
+ }
324
+
325
+ const body = [
326
+ `# Skill-edit proposal — ${scenario}`,
327
+ '',
328
+ `**Source:** skill-behavior-telemetry (pressure-scenario harness)`,
329
+ `**Failing scenario:** ${scenario}`,
330
+ `**Target skill:** ${targetSkill || 'TODO: <skill that failed under pressure>'}`,
331
+ `**Signal:** sustained failure — ≥${SUSTAINED_THRESHOLD} of the last ${SUSTAINED_WINDOW} runs failed (D-07).`,
332
+ '',
333
+ `Drafted ${now()}. **PROPOSE-ONLY** — review via \`/gdd:apply-reflections\`.`,
334
+ 'This draft NEVER auto-edits a skill (Phase 11/29 propose-only SC; Phase 33 out-of-scope).',
335
+ '',
336
+ '## Rationalization signal',
337
+ '',
338
+ `The "${scenario}" pressure scenario is failing repeatedly: the target skill is`,
339
+ 'not holding under pressure (an agent is rationalizing past its HARD-GATE /',
340
+ 'rationalization table). A content edit is proposed to close the loophole.',
341
+ '',
342
+ '## Proposed content edit',
343
+ '',
344
+ `- TODO: identify which rationalization the "${scenario}" scenario exploits.`,
345
+ '- TODO: add / strengthen the counter-rationalization row in the target skill',
346
+ " (the '| Thought | Reality |' table) OR tighten its <HARD-GATE> wording.",
347
+ '- TODO: re-run `npm run test:behavior` for this scenario to confirm GREEN.',
348
+ '',
349
+ ].join('\n');
350
+
351
+ try {
352
+ fs.mkdirSync(draftDir, { recursive: true });
353
+ fs.writeFileSync(draftPath, body);
354
+ } catch (err) {
355
+ // A draft-write failure must not crash the harness; surface as skipped.
356
+ return { action: 'skipped', reason: `draft_write_failed: ${err && err.message ? err.message : String(err)}` };
357
+ }
358
+
359
+ return { action: 'drafted', path: draftPath, slug };
360
+ }
361
+
362
+ // -------------------------------------------------------------------
363
+ // Exports
364
+ // -------------------------------------------------------------------
365
+
366
+ module.exports = {
367
+ recordRun,
368
+ readRuns,
369
+ isSustainedFailure,
370
+ maybeProposeReflection,
371
+ // Exposed for tests / higher-level integration.
372
+ EVENT_TYPE,
373
+ DEFAULT_JSONL_REL,
374
+ DEFAULT_INCUBATOR_REL,
375
+ SUSTAINED_WINDOW,
376
+ SUSTAINED_THRESHOLD,
377
+ _deriveSlug: deriveSlug,
378
+ _findRepoRoot: findRepoRoot,
379
+ };
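The D-07 gate documented in `isSustainedFailure` (2/10 is not sustained, 3/10 is, and failures older than the last 10 runs are ignored) can be illustrated with a small standalone sketch. This is not the module itself: it does not import `telemetry.cjs`, and the names `sustained`, `WINDOW`, and `THRESHOLD` are hypothetical stand-ins that restate the in-memory `opts.window` path only.

```javascript
'use strict';

// Standalone restatement of the D-07 gate, for illustration only.
// `sustained`, `WINDOW`, and `THRESHOLD` are hypothetical names, not package exports.
const WINDOW = 10;
const THRESHOLD = 3;

function sustained(runs, windowSize = WINDOW, threshold = THRESHOLD) {
  // Only the last `windowSize` runs are considered.
  const tail = runs.slice(-windowSize);
  // Anything that is not an explicit pass counts as a failure.
  const failures = tail.filter((r) => r && r.pass !== true).length;
  return failures >= threshold;
}

const pass = { pass: true };
const fail = { pass: false };

// 2 failures within the last 10 runs: below the threshold.
const twoOfTen = [fail, fail, ...Array(8).fill(pass)];
// 3 failures within the last 10 runs: sustained.
const threeOfTen = [fail, fail, fail, ...Array(7).fill(pass)];
// 3 old failures followed by 10 passes: the failures fall outside the window.
const staleFailures = [fail, fail, fail, ...Array(10).fill(pass)];

console.log(sustained(twoOfTen)); // false
console.log(sustained(threeOfTen)); // true
console.log(sustained(staleFailures)); // false
```

Because the on-disk JSONL holds failure rows only, this windowing is precise only when passing runs are also supplied, which is presumably why the in-memory path exists for tests.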
package/sdk/mcp/gdd-mcp/server.js
@@ -251,8 +251,50 @@ var require_health_mirror = __commonJS({
251
251
  }
252
252
  checks.push({ name: "figma_extract", status, detail });
253
253
  }
254
+ {
255
+ const skillPresent = fileExists(
256
+ path.join(rootDir, "skills", "using-gdd", "SKILL.md")
257
+ );
258
+ const hookWired = skillPresent && sessionStartWiresInject(rootDir);
259
+ let detail;
260
+ let status;
261
+ if (!skillPresent) {
262
+ detail = "skill-discipline: missing using-gdd";
263
+ status = "warn";
264
+ } else if (!hookWired) {
265
+ detail = "skill-discipline: hook not wired";
266
+ status = "warn";
267
+ } else {
268
+ detail = "skill-discipline: ready";
269
+ status = "ok";
270
+ }
271
+ checks.push({ name: "skill_discipline", status, detail });
272
+ }
254
273
  return { checks };
255
274
  }
275
+ function sessionStartWiresInject(rootDir) {
276
+ try {
277
+ const p = path.join(rootDir, "hooks", "hooks.json");
278
+ let hooks;
279
+ try {
280
+ hooks = JSON.parse(fs.readFileSync(p, "utf8"));
281
+ } catch {
282
+ return false;
283
+ }
284
+ const sessionStart = hooks && hooks.hooks && Array.isArray(hooks.hooks.SessionStart) ? hooks.hooks.SessionStart : [];
285
+ for (const entry of sessionStart) {
286
+ const inner = entry && Array.isArray(entry.hooks) ? entry.hooks : [];
287
+ for (const h of inner) {
288
+ if (h && typeof h.command === "string" && /inject-using-gdd/.test(h.command)) {
289
+ return true;
290
+ }
291
+ }
292
+ }
293
+ return false;
294
+ } catch {
295
+ return false;
296
+ }
297
+ }
256
298
  function figmaVariablesBlockedLocally(rootDir) {
257
299
  try {
258
300
  const rawRoot = path.join(rootDir, ".figma-extract-cache", "raw");
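The `skill_discipline` check added in this hunk reduces to two booleans (skill file present, SessionStart hook wired) mapped onto three detail strings. A minimal sketch of that mapping follows, assuming an in-memory object in place of `hooks/hooks.json`. The helper names `wiresInject` and `skillDisciplineCheck` and the sample command strings are hypothetical; the object shape is inferred from the traversal in `sessionStartWiresInject`, not from a published schema.

```javascript
'use strict';

// Hypothetical pure version of the hook scan: takes already-parsed JSON
// instead of reading hooks/hooks.json from disk.
function wiresInject(hooksJson) {
  const sessionStart =
    hooksJson && hooksJson.hooks && Array.isArray(hooksJson.hooks.SessionStart)
      ? hooksJson.hooks.SessionStart
      : [];
  // True when any SessionStart entry has a command mentioning inject-using-gdd.
  return sessionStart.some(
    (entry) =>
      Boolean(entry) &&
      Array.isArray(entry.hooks) &&
      entry.hooks.some(
        (h) => Boolean(h) && typeof h.command === 'string' && /inject-using-gdd/.test(h.command),
      ),
  );
}

// Maps the two booleans onto the three detail strings used by the check.
function skillDisciplineCheck(skillPresent, hookWired) {
  if (!skillPresent) {
    return { name: 'skill_discipline', status: 'warn', detail: 'skill-discipline: missing using-gdd' };
  }
  if (!hookWired) {
    return { name: 'skill_discipline', status: 'warn', detail: 'skill-discipline: hook not wired' };
  }
  return { name: 'skill_discipline', status: 'ok', detail: 'skill-discipline: ready' };
}

// Sample inputs; the command strings are made up for the example.
const wired = { hooks: { SessionStart: [{ hooks: [{ command: 'bash hooks/inject-using-gdd.sh' }] }] } };
const unwired = { hooks: { SessionStart: [{ hooks: [{ command: 'bash hooks/update-check.sh' }] }] } };

console.log(skillDisciplineCheck(true, wiresInject(wired)).detail); // skill-discipline: ready
console.log(skillDisciplineCheck(true, wiresInject(unwired)).detail); // skill-discipline: hook not wired
console.log(skillDisciplineCheck(false, false).detail); // skill-discipline: missing using-gdd
```

Note that the shipped code only evaluates the hook scan when the skill file exists, so a missing skill always reports `missing using-gdd` regardless of the hook state.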
package/skills/audit/SKILL.md
@@ -63,4 +63,17 @@ After the consolidated audit summary has been printed (and any reflection-propos
63
63
 
64
64
  Written by `hooks/update-check.sh`; suppressed mid-pipeline and when the latest release is dismissed.
65
65
 
66
+ ## Rationalizations — Thought to Reality
67
+
68
+ The excuses an agent reaches for to skip or thin out an audit, and the drift each one misses:
69
+
70
+ | Thought | Reality |
71
+ |---------|---------|
72
+ | "The audit passed last cycle, I can skip it this cycle." | Per-cycle audit catches drift the prior pass couldn't see; a skipped review is exactly where regressions accumulate unnoticed. |
73
+ | "`--quick` is fine, integration isn't the concern here." | Dropping the integration-checker hides orphaned decisions — wiring breaks even when the 6-pillar score looks healthy. |
74
+ | "I can eyeball the scores instead of spawning the auditor." | The auditor's rubric scores six pillars consistently; an eyeballed review drifts toward whatever the agent already believes. |
75
+ | "Reflection proposals are optional polish, skip the reflector." | The reflector turns this cycle's learnings into next-cycle improvements; skipping it lets the same mistakes repeat. |
76
+ | "I'll modify the source while I'm in here fixing findings." | Audit is read-only by contract; editing source mid-audit invalidates the very scores you're producing. |
77
+ | "Retroactive mode is overkill for a finished cycle." | Retroactive verification is the only check on tasks that shipped without per-task verify — skipping it leaves a completed cycle unaudited. |
78
+
66
79
  ## AUDIT COMPLETE
package/skills/brief/SKILL.md
@@ -92,4 +92,29 @@ Next: @get-design-done explore
92
92
  ━━━━━━━━━━━━━━━━━━━━━━━
93
93
  ```
94
94
 
95
+ ## Spec self-review (before transition)
96
+
97
+ Run this final spec-quality pass over `.design/BRIEF.md` before the brief→explore transition:
98
+ - Placeholder scan: no TBD / TODO / `<placeholder>` / lorem left in the artifact.
99
+ - Internal consistency: sections don't contradict each other.
100
+ - Scope check: nothing in the artifact exceeds (or silently drops) the agreed scope.
101
+ - Ambiguity check: every requirement/decision is specific enough to act on without a follow-up question.
102
+
103
+ <HARD-GATE>
104
+ Do NOT transition to explore (or invoke `/gdd:explore`) until the brief artifact (default `.design/BRIEF.md`) is committed AND the user has approved it. If this project uses a custom `.design` location, read the artifact path from `.design/STATE.md` rather than assuming the default.
105
+ </HARD-GATE>
106
+
107
+ ## Rationalizations — Thought to Reality
108
+
109
+ The excuses an agent invents to skip or shortcut the brief, and what each one actually costs the cycle:
110
+
111
+ | Thought | Reality |
112
+ |---------|---------|
113
+ | "This brief is too simple to need a problem statement." | Skipping the problem statement means guessing at requirements, then redesigning mid-design when the real problem surfaces. |
114
+ | "The user told me what to build, I can skip the interview." | Unasked constraints (a11y, brand, stack) become rework — the five questions exist because each one has blown a past cycle. |
115
+ | "I'll capture success metrics later in verify." | Verify has nothing to check against; an un-metricked brief produces an un-verifiable cycle. |
116
+ | "Scope is obvious, I don't need an in/out line." | Undeclared scope is scope creep waiting to happen — the explore scan widens to fill the vacuum. |
117
+ | "I can answer all five questions for the user from context." | AskUserQuestion one-at-a-time exists because batched/assumed answers smuggle in wrong premises that compound downstream. |
118
+ | "STATE.md bootstrap can wait." | Every later MCP mutation requires STATE.md to exist; skipping the bootstrap hard-blocks explore on entry. |
119
+
95
120
  ## BRIEF COMPLETE
package/skills/design/SKILL.md
@@ -78,4 +78,21 @@ Print the `=== Design stage complete ===` summary (tasks complete/total, deviati
78
78
 
79
79
  After all tasks finish, if STATE.md `<connections>` has `figma: available`, offer the user the figma-write opt-in prompt (modes: annotate / tokenize / mappings, with optional `--dry-run`). Spawn `design-figma-writer` with the selected mode on "yes"; skip silently on "no". NEVER auto-run without confirmation. Full prompt + dispatch logic: `./design-procedure.md` §Figma Write Dispatch.
80
80
 
81
+ <HARD-GATE>
82
+ Do NOT transition to verify (or invoke `/gdd:verify`) until `.design/DESIGN-SUMMARY.md` is committed. If this project uses a custom `.design` location, read the artifact path from `.design/STATE.md` rather than assuming the default.
83
+ </HARD-GATE>
84
+
85
+ ## Rationalizations — Thought to Reality
86
+
87
+ The excuses an agent uses to cut corners during design implementation, and the cost of each:
88
+
89
+ | Thought | Reality |
90
+ |---------|---------|
91
+ | "I can skip planning for this small task and just implement it." | According to cycle telemetry, tasks that skip planning blow their scope; the gate is for the typical case, not the exception. |
92
+ | "These two tasks touch nearby files but I'll run them in parallel anyway." | Overlapping `Touches:` in a parallel batch produce merge conflicts that silently drop one task's work — split into sequential sub-waves. |
93
+ | "Hardcoding this value is faster than wiring the token." | A hardcoded value is a stub the verifier catches as drift from the design tokens; you pay for it twice. |
94
+ | "I'll emit the `.stories.tsx` stub later when Storybook is back up." | The CSF stub must land with the component or the next cycle's visual-regression scope misses it entirely. |
95
+ | "This deviation is minor, I won't record a blocker." | An unrecorded deviation can't be resolved by a follow-up task, so it leaks into verify as an unexplained gap. |
96
+ | "Auto-mode means I can ignore the wave checkpoints." | Auto-mode skips prompts, not the wave structure; ignoring wave order still corrupts dependent-task ordering. |
97
+
81
98
  ## DESIGN COMPLETE
package/skills/discuss/SKILL.md
@@ -80,4 +80,17 @@ Cycle: <name or "default">
80
80
  - Do not run the interview yourself — always spawn the agent.
81
81
  - Do not touch files outside `.design/`.
82
82
 
83
+ ## Rationalizations — Thought to Reality
84
+
85
+ The shortcuts an agent takes during a discuss session, and what each one costs the decision record:
86
+
87
+ | Thought | Reality |
88
+ |---------|---------|
89
+ | "I'll ask all eight questions at once to save time." | Batched questions overwhelm the user; one-at-a-time keeps each decision clean and prevents coupled answers. |
90
+ | "I can run the interview inline instead of spawning the discussant." | The skill's contract is to always spawn the agent — running it yourself skips the discussant's mode handling and D-XX numbering. |
91
+ | "This answer is good enough, I'll record it as a decision without follow-up." | A vague answer ("modern", "clean") recorded as a D-XX locks in an undecided premise; reject and re-ask once. |
92
+ | "I'll batch all the new D-XX entries into STATE.md at the end." | Decisions written atomically per answer survive an interrupted session; batching loses everything if the session drops. |
93
+ | "The glossary term can wait until I write the summary." | CONTEXT.md is written immediately per term — a deferred glossary entry is a naming inconsistency the next cycle inherits. |
94
+ | "Every decision this session is worth an ADR." | ADRs require all three criteria (hard-to-reverse, surprising, real-tradeoff); auto-promoting routine choices buries the genuinely load-bearing ones. |
95
+
83
96
  ## DISCUSS COMMAND COMPLETE
package/skills/explore/SKILL.md
@@ -85,4 +85,21 @@ Full interview protocol + JSON line schema: `./explore-procedure.md` §Step 3.
85
85
 
86
86
  Print: "=== Explore complete ===\nSaved: .design/DESIGN.md, .design/DESIGN-DEBT.md, .design/DESIGN-CONTEXT.md\nNext: @get-design-done plan".
87
87
 
88
+ <HARD-GATE>
89
+ Do NOT transition to plan (or invoke `/gdd:plan`) until BOTH `.design/DESIGN.md` AND `.design/DESIGN-CONTEXT.md` are committed AND the user has approved them. If this project uses a custom `.design` location, read the artifact paths from `.design/STATE.md` rather than assuming the default.
90
+ </HARD-GATE>
91
+
92
+ ## Rationalizations — Thought to Reality
93
+
94
+ The shortcut excuses an agent reaches for during explore, and the drift each one introduces:
95
+
96
+ | Thought | Reality |
97
+ |---------|---------|
98
+ | "I already know this codebase, I can skip the inventory scan." | An unscanned codebase hides the tokens/components you'll duplicate — the grep pass exists to stop you reinventing what's there. |
99
+ | "The six connection probes are noise, I'll assume Figma is off." | A skipped probe means a wrong connection assumption silently breaks the design stage's tool dispatch. |
100
+ | "`--skip-interview` is fine, the brief covered it." | The interview locks the gray areas the brief left fuzzy; skipping it ships undecided D-XX into planning. |
101
+ | "I'll batch all the interview questions to save round-trips." | Batched questions overwhelm the user and smuggle in coupled assumptions — one-at-a-time keeps each decision clean. |
102
+ | "DESIGN-DEBT.md is optional, the scan was clean enough." | Unrecorded debt resurfaces as an unexplained constraint three stages later with no provenance. |
103
+ | "Prior sketches and project conventions don't apply this cycle." | Ignored conventions get overridden by defaults, producing inconsistency the audit will flag against the rest of the system. |
104
+
88
105
  ## EXPLORE COMPLETE
package/skills/health/SKILL.md
@@ -63,6 +63,12 @@ After the health table, the `gdd_health` MCP surface (`scripts/lib/health-mirror
63
63
 
64
64
  Token PRESENCE only is detected (D-10) — the token value is never read, logged, or shown. The Free-tier signal is read from the local raw-pull cache only; no network call is made.
65
65
 
66
+ ## Skill-discipline bootstrap (skill_discipline)
67
+
68
+ The `gdd_health` MCP surface also reports a `skill_discipline` check (Phase 32) confirming the using-gdd SessionStart bootstrap is live — detail is one of three exact strings:
69
+ - `skill-discipline: ready` — `skills/using-gdd/SKILL.md` exists AND `hooks/hooks.json` SessionStart wires `inject-using-gdd.sh` (status `ok`).
70
+ - `skill-discipline: missing using-gdd` (skill absent) or `skill-discipline: hook not wired` (skill present, no SessionStart inject) — both `warn`.
71
+
66
72
  ## Check MCP registration (gdd-mcp)
67
73
 
68
74
  After the health table, inspect whether `gdd-mcp` (Phase 27.7+) is registered with any installed harness and render a one-line status row. Dismissable via `.design/config.json#mcp_nudge=false`. Non-blocking: failure paths render `MCP server: unknown` rather than crash. Full detection procedure (dismissal check, detection via `scripts/lib/install/mcp-register.cjs`, row rendering for claude/codex/both/neither, fallback) lives in `./health-mcp-detection.md`.
package/skills/plan/SKILL.md
@@ -77,4 +77,29 @@ The next stage (design) calls `mcp__gdd_state__transition_stage` on entry — th
77
77
 
78
78
  Print: plan tasks (N waves, M total tasks), files written (`.design/DESIGN-PLAN.md`, plus `.design/DESIGN-RESEARCH.md` if research ran), next step `/get-design-done:design`.
79
79
 
80
+ ## Spec self-review (before transition)
81
+
82
+ Run this final spec-quality pass over `.design/DESIGN-PLAN.md` before the plan→design transition:
83
+ - Placeholder scan: no TBD / TODO / `<placeholder>` / lorem left in the artifact.
84
+ - Internal consistency: sections don't contradict each other.
85
+ - Scope check: nothing in the artifact exceeds (or silently drops) the agreed scope.
86
+ - Ambiguity check: every requirement/decision is specific enough to act on without a follow-up question.
87
+
88
+ <HARD-GATE>
89
+ Do NOT transition to design (or invoke `/gdd:design`) until `.design/DESIGN-PLAN.md` is committed AND the user has approved it. If this project uses a custom `.design` location, read the artifact path from `.design/STATE.md` rather than assuming the default.
90
+ </HARD-GATE>
91
+
92
+ ## Rationalizations — Thought to Reality
93
+
94
+ The reasons an agent gives to skip planning or rush DESIGN-PLAN.md, and what each one costs:
95
+
96
+ | Thought | Reality |
97
+ |---------|---------|
98
+ | "This change is small, I can design straight from DESIGN-CONTEXT.md." | According to cycle telemetry, tasks that skip planning blow their scope; the plan gate is for the typical case, not the exception you think you're in. |
99
+ | "Pattern mapping is brownfield ceremony, I'll skip it." | Step 1.5 is mandatory because an unmapped brownfield is where the executor silently re-implements an existing pattern. |
100
+ | "The plan-checker will just rubber-stamp it, skip the spawn." | The checker's 5 dimensions (including coverage, wave order, and must-have derivation) catch the gaps you can't see in your own plan. |
101
+ | "I'll let the planner infer wave ordering at design time." | Unordered waves serialize work that could parallelize — or worse, run dependent tasks concurrently and corrupt the tree. |
102
+ | "Research is overkill for this scope." | The complexity heuristic exists precisely because agents under-estimate scope; skipping research on a 3+-scope domain guarantees a mid-design surprise. |
103
+ | "I can record decisions in DESIGN-PLAN.md prose instead of D-XX." | Prose decisions never reach STATE.md, so verify's integration-checker can't trace them and flags them orphaned. |
104
+
80
105
  ## PLAN COMPLETE
package/skills/router/SKILL.md
@@ -79,6 +79,10 @@ If `.design/budget.json` is missing, assume defaults from `reference/config-sche
79
79
 
80
80
  When the router cannot resolve `intent-string` to a known agent (no `description` match, no `default-tier` rule, no path-selection fallback), emit ONE `capability_gap` event with `source: "router"` before returning the conservative-fallback JSON. Feeds Phase 29 Stage-0 telemetry — see `./capability-gap-emitter.md` for the synchronous Node snippet, semantic notes (suggested_kind = `"agent"`, MCP-probe exclusion per D-08, back-compat invariant on router output), and the opaque-extras payload routing through `appendChainEvent`.
81
81
 
82
+ ## Emitting router_pick on a resolved pick
83
+
84
+ When the router DID resolve a pick — it has the `path`/`complexity_class`/`resolved_models` decision and is about to return the decision JSON — emit ONE `router_pick` event (`source: "router"`) recording which skill/agent was auto-picked, as the last step before returning. Side-effect only; the output JSON contract is UNCHANGED. Feeds the D-02 under-reached-skill instrument (Phase 33 baselines per-skill pick rates) — see `./router-pick-emitter.md` for the synchronous Node snippet, the 7-field no-PII payload (context_hash only — never the raw prompt), and the opaque-extras routing through `appendChainEvent`.
85
+
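The "context_hash only, never the raw prompt" rule above can be sketched as follows. This is an illustration under stated assumptions, not the snippet from `./router-pick-emitter.md`: the helper name `buildRouterPick`, the `type` key, the choice of SHA-256, and the sample decision values are all hypothetical, and only the fields named in the paragraph above are shown rather than the full 7-field payload. The sketch deliberately stops before any call to `appendChainEvent`, whose signature is not shown in this diff.

```javascript
'use strict';

const crypto = require('node:crypto');

// Hypothetical helper: builds a router_pick-style payload that carries a hash
// of the prompt instead of the prompt text. Field names other than `source`,
// `path`, `complexity_class`, `resolved_models`, and `context_hash` are assumptions.
function buildRouterPick(decision, rawPrompt) {
  return {
    type: 'router_pick',
    source: 'router',
    path: decision.path,
    complexity_class: decision.complexity_class,
    resolved_models: decision.resolved_models,
    // One-way digest: the event can be correlated without storing the prompt.
    context_hash: crypto.createHash('sha256').update(String(rawPrompt)).digest('hex'),
  };
}

// Sample decision; these values are invented for the example.
const sampleDecision = { path: 'fast', complexity_class: 'S', resolved_models: { planner: 'example-model' } };
const samplePick = buildRouterPick(sampleDecision, 'redesign the pricing page hero');

console.log(Object.keys(samplePick).join(','));
console.log(samplePick.context_hash.length); // 64
```

Building the payload as a pure function keeps the emit a side effect at the very end, so the router's returned decision JSON stays untouched.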
82
86
  ## Non-Goals
83
87
 
84
88
  The router does not: (a) make a model call, (b) write files, (c) enforce budget caps (that's the hook's job), (d) learn from history (Phase 11 reflector territory per D-07).