@hegemonart/get-design-done 1.31.5 → 1.33.0 (npm)

Files changed (38)

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +63 -0
package/NOTICE +81 -5
package/README.md +25 -0
package/SKILL.md +4 -0
package/hooks/hooks.json +9 -0
package/hooks/inject-using-gdd.sh +72 -0
package/hooks/run-hook.cmd +35 -0
package/package.json +2 -2
package/reference/schemas/events.schema.json +63 -1
package/reference/schemas/pressure-scenario.schema.json +69 -0
package/scripts/lib/health-mirror/index.cjs +79 -1
package/scripts/lib/skill-behavior/runner.cjs +187 -0
package/scripts/lib/skill-behavior/stub-invoker.cjs +95 -0
package/scripts/lib/skill-behavior/telemetry.cjs +379 -0
package/sdk/mcp/gdd-mcp/server.js +42 -0
package/skills/audit/SKILL.md +13 -0
package/skills/brief/SKILL.md +25 -0
package/skills/design/SKILL.md +17 -0
package/skills/discuss/SKILL.md +13 -0
package/skills/explore/SKILL.md +17 -0
package/skills/health/SKILL.md +6 -0
package/skills/plan/SKILL.md +25 -0
package/skills/router/SKILL.md +4 -0
package/skills/router/router-pick-emitter.md +78 -0
package/skills/using-gdd/SKILL.md +78 -0
package/skills/verify/SKILL.md +17 -0
package/scripts/lib/cli/index.ts +0 -29
package/scripts/lib/error-classifier.cjs +0 -29
package/scripts/lib/event-stream/index.ts +0 -29
package/scripts/lib/gdd-errors/index.ts +0 -29
package/scripts/lib/gdd-state/index.ts +0 -29
package/scripts/lib/iteration-budget.cjs +0 -29
package/scripts/lib/jittered-backoff.cjs +0 -29
package/scripts/lib/lockfile.cjs +0 -29
package/scripts/mcp-servers/gdd-mcp/server.ts +0 -35
package/scripts/mcp-servers/gdd-state/server.ts +0 -34

package/scripts/lib/skill-behavior/telemetry.cjs ADDED

@@ -0,0 +1,379 @@
+/**
+ * telemetry.cjs — reflector-telemetry layer for the pressure-scenario harness
+ * (Plan 33-05). The third leg of Phase 33: it CONSUMES the 33-01 runner result
+ * ({ scenario, target_skill, pass, compliance_hits, violation_hits }), records a
+ * scenario-failure event to a JSONL artifact, detects SUSTAINED failure, and on
+ * sustained failure produces a PROPOSE-ONLY reflector content-edit draft via the
+ * same incubator/apply-reflections surface the shipped reflector-kfm-proposer
+ * uses.
+ *
+ * Why this module exists: behavior tests only matter if a sustained failure
+ * prompts a content fix. This closes that loop — a failing run is recorded; when
+ * a scenario fails ≥3 of its last 10 runs (D-07 threshold), the reflector
+ * proposes a skill-content edit for human review via /gdd:apply-reflections. The
+ * proposal NEVER auto-edits a skill (Phase 11/29 propose-only SC; Phase 33
+ * out-of-scope: "Auto-applying reflector-proposed skill edits — propose-only").
+ *
+ * Decisions honored:
+ *   * D-07 — telemetry → .design/telemetry/skill-behavior.jsonl (runtime
+ *     artifact, gitignored, local); sustained-failure signal = ≥3 of the last 10
+ *     runs failing for a scenario; reflector consumption is STUB-tested (no live
+ *     runs — all paths + the clock are injectable so tests use a tmp dir).
+ *   * D-06 — this module is exercised by the DEFAULT suite (no API key / no LLM).
+ *
+ * Injectability / purity:
+ *   The JSONL path, the incubator root, `fs`, and the clock (`now`) are ALL
+ *   injectable via opts so every test writes to an os.tmpdir() dir and NOTHING
+ *   touches the real .design/ tree. The runner (33-01) does NOT stamp a `ts`;
+ *   the timestamp is stamped HERE via the injected `now`.
+ *
+ * Pattern references (style mirrored, NOT imported):
+ *   * scripts/lib/event-chain.cjs — house JSONL append (defensive mkdir -p +
+ *     append, never-throw) + findRepoRoot + line-by-line read idiom.
+ *   * scripts/lib/reflector-kfm-proposer.cjs — shouldPropose-style stability gate
+ *     + proposeKfmDraft writing a proposal-only draft under
+ *     .design/reflections/incubator/<slug>/CATALOGUE-ENTRY.md.
+ *
+ * Public API:
+ *   recordRun(result, opts)              → event | null   (append on pass:false)
+ *   readRuns(scenario, opts)             → Array<event>   (tail JSONL, filter)
+ *   isSustainedFailure(scenario, opts)   → boolean         (≥3 of last 10 failed)
+ *   maybeProposeReflection(scenario, opts) → { action:'drafted', path, slug }
+ *                                            | { action:'skipped', reason }
+ *
+ * Pure CommonJS, deps = node:fs + node:path ONLY. No npm dependencies.
+ */
+'use strict';
+const nodeFs = require('node:fs');
+const path = require('node:path');
+// -------------------------------------------------------------------
+// Constants
+// -------------------------------------------------------------------
+const EVENT_TYPE = 'skill_behavior_failure';
+const DEFAULT_JSONL_REL = '.design/telemetry/skill-behavior.jsonl';
+const DEFAULT_INCUBATOR_REL = '.design/reflections/incubator';
+const SUSTAINED_WINDOW = 10; // D-07: look at the last N runs
+const SUSTAINED_THRESHOLD = 3; // D-07: ≥3 failures of the last 10 == sustained
+const INCUBATOR_PREFIX = 'skill-edit-';
+// -------------------------------------------------------------------
+// Helpers
+// -------------------------------------------------------------------
+/**
+ * Walk up from a start dir until a package.json is found (repo root). Mirrors
+ * the reflector-kfm-proposer / event-chain findRepoRoot idiom.
+ *
+ * @param {string} [startDir]
+ * @returns {string}
+ */
+function findRepoRoot(startDir) {
+  let dir = startDir || __dirname;
+  for (let i = 0; i < 12; i++) {
+    if (nodeFs.existsSync(path.join(dir, 'package.json'))) return dir;
+    const parent = path.dirname(dir);
+    if (parent === dir) break;
+    dir = parent;
+  }
+  return path.resolve(__dirname, '..', '..', '..');
+}
+/**
+ * Resolve the JSONL emit path: explicit opts.jsonlPath wins (absolute or
+ * relative to cwd); otherwise <repoRoot>/.design/telemetry/skill-behavior.jsonl.
+ */
+function resolveJsonlPath(opts) {
+  const o = opts || {};
+  if (o.jsonlPath) {
+    return path.isAbsolute(o.jsonlPath)
+      ? o.jsonlPath
+      : path.resolve(o.repoRoot || process.cwd(), o.jsonlPath);
+  }
+  return path.join(o.repoRoot || findRepoRoot(), DEFAULT_JSONL_REL);
+}
+/**
+ * Resolve the incubator draft root: explicit opts.incubatorRoot wins; otherwise
+ * <repoRoot>/.design/reflections/incubator.
+ */
+function resolveIncubatorRoot(opts) {
+  const o = opts || {};
+  if (o.incubatorRoot) {
+    return path.isAbsolute(o.incubatorRoot)
+      ? o.incubatorRoot
+      : path.resolve(o.repoRoot || process.cwd(), o.incubatorRoot);
+  }
+  return path.join(o.repoRoot || findRepoRoot(), DEFAULT_INCUBATOR_REL);
+}
+/**
+ * Kebab-case slug from a free-text scenario name (mirrors the reflector-kfm
+ * deriveSlug semantics — ASCII-only, dash-collapsed, ≤40 chars).
+ */
+function deriveSlug(text) {
+  const raw = typeof text === 'string' ? text : '';
+  let s = raw.toLowerCase();
+  s = s.replace(/[^\x20-\x7e]+/g, '');
+  s = s.replace(/[^a-z0-9]+/g, '-');
+  s = s.replace(/-+/g, '-');
+  s = s.replace(/^-+|-+$/g, '');
+  if (s.length > 40) s = s.slice(0, 40);
+  s = s.replace(/-+$/g, '');
+  return s || 'unnamed';
+}
+// -------------------------------------------------------------------
+// recordRun — emit a scenario-failure event to the JSONL artifact
+// -------------------------------------------------------------------
+/**
+ * Append ONE scenario-failure event to the JSONL artifact when a 33-01 runner
+ * result has pass:false. The timestamp is stamped HERE via the injected clock
+ * (the runner does not emit a `ts`). On a passing result, returns null (the
+ * sustained-failure detector reads failures only).
+ *
+ * Never throws on a missing .design/ tree — mkdir -p the parent defensively and
+ * swallow write errors (mirrors event-chain.cjs).
+ *
+ * EVENT SHAPE:
+ *   { event_type:'skill_behavior_failure', scenario, target_skill?, pass:false,
+ *     compliance_hits, violation_hits, ts }
+ *
+ * @param {{ scenario:string, target_skill?:string, pass:boolean,
+ *           compliance_hits?:number, violation_hits?:number }} result
+ * @param {{ jsonlPath?:string, fs?:typeof import('node:fs'),
+ *           now?:() => number|string, repoRoot?:string }} [opts]
+ * @returns {object | null} the appended event, or null on a passing result
+ */
+function recordRun(result, opts) {
+  const o = opts || {};
+  const fs = o.fs || nodeFs;
+  const now = typeof o.now === 'function' ? o.now : () => new Date().toISOString();
+  if (!result || typeof result !== 'object') return null;
+  // Detector reads FAILURES only — a passing run emits nothing.
+  if (result.pass !== false) return null;
+  const event = {
+    event_type: EVENT_TYPE,
+    scenario: result.scenario,
+    pass: false,
+    compliance_hits: Number.isFinite(result.compliance_hits) ? result.compliance_hits : 0,
+    violation_hits: Number.isFinite(result.violation_hits) ? result.violation_hits : 0,
+    ts: now(),
+  };
+  // Preserve target_skill when the runner supplied it (useful for the proposal).
+  if (result.target_skill !== undefined) event.target_skill = result.target_skill;
+  const jsonlPath = resolveJsonlPath(o);
+  try {
+    fs.mkdirSync(path.dirname(jsonlPath), { recursive: true });
+    fs.appendFileSync(jsonlPath, JSON.stringify(event) + '\n', { flag: 'a' });
+  } catch (err) {
+    // Defensive: telemetry must never crash a run. Mirror event-chain.cjs.
+    try {
+      process.stderr.write(
+        `[skill-behavior-telemetry] write failed: ${err && err.message ? err.message : String(err)}\n`,
+      );
+    } catch (_e) {
+      /* swallow */
+    }
+  }
+  return event;
+}
+// -------------------------------------------------------------------
+// readRuns — tail the JSONL, filter by scenario
+// -------------------------------------------------------------------
+/**
+ * Read the JSONL artifact and return every recorded event for `scenario`, in
+ * file order (oldest → newest). Defensive on a missing file: returns []. Invalid
+ * JSON lines are skipped.
+ *
+ * @param {string} scenario
+ * @param {{ jsonlPath?:string, fs?:typeof import('node:fs'), repoRoot?:string }} [opts]
+ * @returns {Array<object>}
+ */
+function readRuns(scenario, opts) {
+  const o = opts || {};
+  const fs = o.fs || nodeFs;
+  const jsonlPath = resolveJsonlPath(o);
+  if (!fs.existsSync(jsonlPath)) return [];
+  let raw;
+  try {
+    raw = fs.readFileSync(jsonlPath, 'utf8');
+  } catch (_e) {
+    return [];
+  }
+  const out = [];
+  for (const line of raw.split('\n')) {
+    if (line.trim() === '') continue;
+    let rec;
+    try {
+      rec = JSON.parse(line);
+    } catch (_e) {
+      continue; // skip malformed line
+    }
+    if (rec && rec.scenario === scenario) out.push(rec);
+  }
+  return out;
+}
+// -------------------------------------------------------------------
+// isSustainedFailure — ≥3 of the last 10 runs failed for a scenario (D-07)
+// -------------------------------------------------------------------
+/**
+ * Sustained-failure detector. Considers the LAST 10 runs for `scenario` and
+ * returns true iff ≥3 of them failed (D-07). Accepts EITHER an in-memory
+ * opts.window (array of `{ pass }` objects — for unit tests) OR reads the
+ * on-disk JSONL tail via readRuns().
+ *
+ * Boundary: 2/10 → false, 3/10 → true; strictly windowed to the last 10 (older
+ * failures excluded).
+ *
+ * Note: recordRun only persists FAILURE events, so the on-disk path counts each
+ * recorded row as a failure. The in-memory window path inspects `pass` so tests
+ * can mix pass/fail entries to exercise the windowing math precisely.
+ *
+ * @param {string} scenario
+ * @param {{ window?:Array<{pass:boolean}>, jsonlPath?:string,
+ *           fs?:typeof import('node:fs'), window_size?:number,
+ *           threshold?:number, repoRoot?:string }} [opts]
+ * @returns {boolean}
+ */
+function isSustainedFailure(scenario, opts) {
+  const o = opts || {};
+  const windowSize = Number.isInteger(o.window_size) && o.window_size > 0 ? o.window_size : SUSTAINED_WINDOW;
+  const threshold = Number.isInteger(o.threshold) && o.threshold > 0 ? o.threshold : SUSTAINED_THRESHOLD;
+  let runs;
+  if (Array.isArray(o.window)) {
+    runs = o.window;
+  } else {
+    runs = readRuns(scenario, o);
+  }
+  // Strictly the LAST `windowSize` runs.
+  const tail = runs.slice(-windowSize);
+  // A row counts as a failure when pass === false. On-disk rows are all failures
+  // (recordRun only persists pass:false), so a missing `pass` defaults to failed
+  // for the disk path; the in-memory window always carries an explicit `pass`.
+  const failures = tail.filter((r) => r && r.pass !== true).length;
+  return failures >= threshold;
+}
+// -------------------------------------------------------------------
+// maybeProposeReflection — propose-only reflector content-edit draft
+// -------------------------------------------------------------------
+/**
+ * Reflector consumption point (mirrors reflector-kfm-proposer's shouldPropose +
+ * proposeKfmDraft idiom): gate on isSustainedFailure(scenario); if NOT sustained
+ * return { action:'skipped', reason:'below_sustained_threshold' }; if sustained,
+ * write a PROPOSE-ONLY draft under the (injectable) incubator root at
+ * <incubatorRoot>/skill-edit-<scenario>/CATALOGUE-ENTRY.md naming the failing
+ * scenario/skill + the sustained-failure signal + a TODO for the content edit,
+ * and return { action:'drafted', path, slug }.
+ *
+ * This draft lands in the SAME incubator tree that
+ * scripts/lib/apply-reflections/incubator-proposals.cjs surfaces in
+ * /gdd:apply-reflections — so a maintainer reviews + accepts/rejects the proposed
+ * skill edit there. It NEVER auto-edits a skill (Phase 11/29 propose-only SC;
+ * Phase 33 out-of-scope).
+ *
+ * @param {string} scenario
+ * @param {{ window?:Array<{pass:boolean}>, jsonlPath?:string,
+ *           incubatorRoot?:string, fs?:typeof import('node:fs'),
+ *           now?:() => number|string, target_skill?:string,
+ *           repoRoot?:string }} [opts]
+ * @returns {{ action:'drafted', path:string, slug:string }
+ *           | { action:'skipped', reason:string }}
+ */
+function maybeProposeReflection(scenario, opts) {
+  const o = opts || {};
+  const fs = o.fs || nodeFs;
+  const now = typeof o.now === 'function' ? o.now : () => new Date().toISOString();
+  // Stability gate — the ≥3/10 sustained-failure threshold (analogous to the
+  // reflector-kfm ≥K gate).
+  if (!isSustainedFailure(scenario, o)) {
+    return { action: 'skipped', reason: 'below_sustained_threshold' };
+  }
+  const slug = `${INCUBATOR_PREFIX}${deriveSlug(scenario)}`;
+  const incubatorRoot = resolveIncubatorRoot(o);
+  const draftDir = path.join(incubatorRoot, slug);
+  const draftPath = path.join(draftDir, 'CATALOGUE-ENTRY.md');
+  // Best-effort target_skill: prefer an injected hint, else the latest recorded
+  // failure event for this scenario (recordRun stamps target_skill).
+  let targetSkill = o.target_skill;
+  if (!targetSkill && !Array.isArray(o.window)) {
+    const recorded = readRuns(scenario, o);
+    const last = recorded.length ? recorded[recorded.length - 1] : null;
+    if (last && last.target_skill) targetSkill = last.target_skill;
+  }
+  const body = [
+    `# Skill-edit proposal — ${scenario}`,
+    '',
+    `**Source:** skill-behavior-telemetry (pressure-scenario harness)`,
+    `**Failing scenario:** ${scenario}`,
+    `**Target skill:** ${targetSkill || 'TODO: <skill that failed under pressure>'}`,
+    `**Signal:** sustained failure — ≥${SUSTAINED_THRESHOLD} of the last ${SUSTAINED_WINDOW} runs failed (D-07).`,
+    '',
+    `Drafted ${now()}. **PROPOSE-ONLY** — review via \`/gdd:apply-reflections\`.`,
+    'This draft NEVER auto-edits a skill (Phase 11/29 propose-only SC; Phase 33 out-of-scope).',
+    '',
+    '## Rationalization signal',
+    '',
+    `The "${scenario}" pressure scenario is failing repeatedly: the target skill is`,
+    'not holding under pressure (an agent is rationalizing past its HARD-GATE /',
+    'rationalization table). A content edit is proposed to close the loophole.',
+    '',
+    '## Proposed content edit',
+    '',
+    `- TODO: identify which rationalization the "${scenario}" scenario exploits.`,
+    '- TODO: add / strengthen the counter-rationalization row in the target skill',
+    "  (the '| Thought | Reality |' table) OR tighten its <HARD-GATE> wording.",
+    '- TODO: re-run `npm run test:behavior` for this scenario to confirm GREEN.',
+    '',
+  ].join('\n');
+  try {
+    fs.mkdirSync(draftDir, { recursive: true });
+    fs.writeFileSync(draftPath, body);
+  } catch (err) {
+    // A draft-write failure must not crash the harness; surface as skipped.
+    return { action: 'skipped', reason: `draft_write_failed: ${err && err.message ? err.message : String(err)}` };
+  }
+  return { action: 'drafted', path: draftPath, slug };
+}
+// -------------------------------------------------------------------
+// Exports
+// -------------------------------------------------------------------
+module.exports = {
+  recordRun,
+  readRuns,
+  isSustainedFailure,
+  maybeProposeReflection,
+  // Exposed for tests / higher-level integration.
+  EVENT_TYPE,
+  DEFAULT_JSONL_REL,
+  DEFAULT_INCUBATOR_REL,
+  SUSTAINED_WINDOW,
+  SUSTAINED_THRESHOLD,
+  _deriveSlug: deriveSlug,
+  _findRepoRoot: findRepoRoot,
+};
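
The windowing rule in `isSustainedFailure` (D-07: at least 3 failures among the last 10 runs) can be illustrated with a minimal, self-contained sketch. The function name `isSustained` and the sample arrays are local to this example and are not part of the package; only the window size, the threshold, and the `pass !== true` test are taken from the diff above.

```javascript
'use strict';
// Sketch only: reproduces the documented D-07 rule without importing telemetry.cjs.
// `isSustained` and the sample data below are hypothetical names for illustration.
const SUSTAINED_WINDOW = 10;
const SUSTAINED_THRESHOLD = 3;

function isSustained(runs, windowSize = SUSTAINED_WINDOW, threshold = SUSTAINED_THRESHOLD) {
  // Only the LAST `windowSize` runs are considered; older rows are ignored.
  const tail = runs.slice(-windowSize);
  // Any row whose `pass` is not strictly true counts as a failure.
  const failures = tail.filter((r) => r && r.pass !== true).length;
  return failures >= threshold;
}

const pass = { pass: true };
const fail = { pass: false };

// Boundary cases from the doc comment: 2/10 is below threshold, 3/10 meets it.
const twoOfTen = [...Array(8).fill(pass), fail, fail];
const threeOfTen = [...Array(7).fill(pass), fail, fail, fail];
// Three old failures followed by ten passes: the failures fall outside the window.
const stale = [...Array(3).fill(fail), ...Array(10).fill(pass)];

console.log(isSustained(twoOfTen)); // false
console.log(isSustained(threeOfTen)); // true
console.log(isSustained(stale)); // false
```

Per the doc comment above, the shipped function accepts the same kind of array through `opts.window`, which is how tests can mix passing and failing entries. On the on-disk path only failure rows are persisted, so every row read from the JSONL counts as a failure.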

package/sdk/mcp/gdd-mcp/server.js CHANGED

@@ -251,8 +251,50 @@ var require_health_mirror = __commonJS({
         }
         checks.push({ name: "figma_extract", status, detail });
       }
+      {
+        const skillPresent = fileExists(
+          path.join(rootDir, "skills", "using-gdd", "SKILL.md")
+        );
+        const hookWired = skillPresent && sessionStartWiresInject(rootDir);
+        let detail;
+        let status;
+        if (!skillPresent) {
+          detail = "skill-discipline: missing using-gdd";
+          status = "warn";
+        } else if (!hookWired) {
+          detail = "skill-discipline: hook not wired";
+          status = "warn";
+        } else {
+          detail = "skill-discipline: ready";
+          status = "ok";
+        }
+        checks.push({ name: "skill_discipline", status, detail });
+      }
       return { checks };
     }
+    function sessionStartWiresInject(rootDir) {
+      try {
+        const p = path.join(rootDir, "hooks", "hooks.json");
+        let hooks;
+        try {
+          hooks = JSON.parse(fs.readFileSync(p, "utf8"));
+        } catch {
+          return false;
+        }
+        const sessionStart = hooks && hooks.hooks && Array.isArray(hooks.hooks.SessionStart) ? hooks.hooks.SessionStart : [];
+        for (const entry of sessionStart) {
+          const inner = entry && Array.isArray(entry.hooks) ? entry.hooks : [];
+          for (const h of inner) {
+            if (h && typeof h.command === "string" && /inject-using-gdd/.test(h.command)) {
+              return true;
+            }
+          }
+        }
+        return false;
+      } catch {
+        return false;
+      }
+    }
     function figmaVariablesBlockedLocally(rootDir) {
       try {
         const rawRoot = path.join(rootDir, ".figma-extract-cache", "raw");
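
To make the new `skill_discipline` check easier to follow, here is a small sketch of its decision logic run against in-memory data instead of the filesystem. The `hooks.json` objects below are assumptions inferred from the fields `sessionStartWiresInject` reads (`hooks.SessionStart[].hooks[].command`); the actual file contents and command strings are not shown in this diff. `wiresInject` and `skillDisciplineCheck` are hypothetical names.

```javascript
'use strict';
// Hypothetical hooks.json shapes; the command strings are illustrative assumptions.
const wired = {
  hooks: {
    SessionStart: [{ hooks: [{ command: 'bash hooks/inject-using-gdd.sh' }] }],
  },
};
const unwired = {
  hooks: {
    SessionStart: [{ hooks: [{ command: 'bash hooks/update-check.sh' }] }],
  },
};

// Same traversal as sessionStartWiresInject, but over an already-parsed object.
function wiresInject(config) {
  const sessionStart =
    config && config.hooks && Array.isArray(config.hooks.SessionStart)
      ? config.hooks.SessionStart
      : [];
  return sessionStart.some((entry) =>
    (entry && Array.isArray(entry.hooks) ? entry.hooks : []).some(
      (h) => h && typeof h.command === 'string' && /inject-using-gdd/.test(h.command),
    ),
  );
}

// Maps the two inputs to the three detail strings used in the hunk above.
function skillDisciplineCheck(skillPresent, config) {
  const name = 'skill_discipline';
  if (!skillPresent) {
    return { name, status: 'warn', detail: 'skill-discipline: missing using-gdd' };
  }
  if (!wiresInject(config)) {
    return { name, status: 'warn', detail: 'skill-discipline: hook not wired' };
  }
  return { name, status: 'ok', detail: 'skill-discipline: ready' };
}

console.log(skillDisciplineCheck(true, wired).detail); // skill-discipline: ready
console.log(skillDisciplineCheck(true, unwired).detail); // skill-discipline: hook not wired
console.log(skillDisciplineCheck(false, wired).detail); // skill-discipline: missing using-gdd
```

Note that a missing skill is reported first: in the shipped code `hookWired` is only evaluated when the skill file exists, so a missing skill is never reported as "hook not wired".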

package/skills/audit/SKILL.md CHANGED

@@ -63,4 +63,17 @@ After the consolidated audit summary has been printed (and any reflection-propos
 Written by `hooks/update-check.sh`; suppressed mid-pipeline and when the latest release is dismissed.
+## Rationalizations — Thought to Reality
+The excuses an agent reaches for to skip or thin out an audit, and the drift each one misses:
+| Thought | Reality |
+|---------|---------|
+| "The audit passed last cycle, I can skip it this cycle." | Per-cycle audit catches drift the prior pass couldn't see; a skipped review is exactly where regressions accumulate unnoticed. |
+| "`--quick` is fine, integration isn't the concern here." | Dropping the integration-checker hides orphaned decisions — wiring breaks even when the 6-pillar score looks healthy. |
+| "I can eyeball the scores instead of spawning the auditor." | The auditor's rubric scores six pillars consistently; an eyeballed review drifts toward whatever the agent already believes. |
+| "Reflection proposals are optional polish, skip the reflector." | The reflector turns this cycle's learnings into next-cycle improvements; skipping it lets the same mistakes repeat. |
+| "I'll modify the source while I'm in here fixing findings." | Audit is read-only by contract; editing source mid-audit invalidates the very scores you're producing. |
+| "Retroactive mode is overkill for a finished cycle." | Retroactive verification is the only check on tasks that shipped without per-task verify — skipping it leaves a completed cycle unaudited. |
 ## AUDIT COMPLETE

package/skills/brief/SKILL.md CHANGED

@@ -92,4 +92,29 @@ Next: @get-design-done explore
 ━━━━━━━━━━━━━━━━━━━━━━━
 ```
+## Spec self-review (before transition)
+Run this final spec-quality pass over `.design/BRIEF.md` before the brief→explore transition:
+- Placeholder scan: no TBD / TODO / `<placeholder>` / lorem left in the artifact.
+- Internal consistency: sections don't contradict each other.
+- Scope check: nothing in the artifact exceeds (or silently drops) the agreed scope.
+- Ambiguity check: every requirement/decision is specific enough to act on without a follow-up question.
+<HARD-GATE>
+Do NOT transition to explore (or invoke `/gdd:explore`) until the brief artifact (default `.design/BRIEF.md`) is committed AND the user has approved it. If this project uses a custom `.design` location, read the artifact path from `.design/STATE.md` rather than assuming the default.
+</HARD-GATE>
+## Rationalizations — Thought to Reality
+The excuses an agent invents to skip or shortcut the brief, and what each one actually costs the cycle:
+| Thought | Reality |
+|---------|---------|
+| "This brief is too simple to need a problem statement." | Skip the brief = guess at requirements, then redesign mid-design when the real problem surfaces. |
+| "The user told me what to build, I can skip the interview." | Unasked constraints (a11y, brand, stack) become rework — the five questions exist because each one has blown a past cycle. |
+| "I'll capture success metrics later in verify." | Verify has nothing to check against; an un-metricked brief produces an un-verifiable cycle. |
+| "Scope is obvious, I don't need an in/out line." | Undeclared scope is scope creep waiting to happen — the explore scan widens to fill the vacuum. |
+| "I can answer all five questions for the user from context." | AskUserQuestion one-at-a-time exists because batched/assumed answers smuggle in wrong premises that compound downstream. |
+| "STATE.md bootstrap can wait." | Every later MCP mutation requires STATE.md to exist; skipping the bootstrap hard-blocks explore on entry. |
 ## BRIEF COMPLETE
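
The placeholder scan in the "Spec self-review" checklist above is described as an instruction to the agent, not as shipped code. As a rough illustration of what such a scan could look like if automated, here is a sketch; `findPlaceholders` and the sample text are hypothetical, and only the four patterns (TBD, TODO, `<placeholder>`, lorem) come from the checklist.

```javascript
'use strict';
// Patterns taken from the checklist: TBD / TODO / <placeholder> / lorem.
// Matching is case-insensitive, which is an assumption of this sketch.
const PLACEHOLDER_RE = /\b(?:TBD|TODO)\b|<placeholder>|\blorem\b/i;

// Returns the 1-based line numbers (with text) that still contain a placeholder.
function findPlaceholders(markdown) {
  return markdown
    .split('\n')
    .map((text, i) => ({ line: i + 1, text }))
    .filter(({ text }) => PLACEHOLDER_RE.test(text));
}

const sample = [
  '# Brief',
  'Audience: TBD',
  'Goal: ship the redesigned onboarding flow',
  'Lorem ipsum dolor sit amet',
].join('\n');

console.log(findPlaceholders(sample).map((hit) => hit.line)); // [ 2, 4 ]
```

An empty result would correspond to the first checklist item passing; the other three items (consistency, scope, ambiguity) call for judgment and are not covered by a pattern match.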

package/skills/design/SKILL.md CHANGED

@@ -78,4 +78,21 @@ Print the `=== Design stage complete ===` summary (tasks complete/total, deviati
 After all tasks finish, if STATE.md `<connections>` has `figma: available`, offer the user the figma-write opt-in prompt (modes: annotate / tokenize / mappings, with optional `--dry-run`). Spawn `design-figma-writer` with the selected mode on "yes"; skip silently on "no". NEVER auto-run without confirmation. Full prompt + dispatch logic: `./design-procedure.md` §Figma Write Dispatch.
+<HARD-GATE>
+Do NOT transition to verify (or invoke `/gdd:verify`) until `.design/DESIGN-SUMMARY.md` is committed. If this project uses a custom `.design` location, read the artifact path from `.design/STATE.md` rather than assuming the default.
+</HARD-GATE>
+## Rationalizations — Thought to Reality
+The excuses an agent uses to cut corners during design implementation, and the cost of each:
+| Thought | Reality |
+|---------|---------|
+| "I can skip planning for this small task and just implement it." | Plan-skipped tasks blow scope per cycle telemetry; the gate is for the typical case, not the exception. |
+| "These two tasks touch nearby files but I'll run them in parallel anyway." | Overlapping `Touches:` in a parallel batch produce merge conflicts that silently drop one task's work — split into sequential sub-waves. |
+| "Hardcoding this value is faster than wiring the token." | A hardcoded value is a stub the verifier catches as drift from the design tokens; you pay for it twice. |
+| "I'll emit the `.stories.tsx` stub later when Storybook is back up." | The CSF stub must land with the component or the next cycle's visual-regression scope misses it entirely. |
+| "This deviation is minor, I won't record a blocker." | An unrecorded deviation can't be resolved by a follow-up task, so it leaks into verify as an unexplained gap. |
+| "Auto-mode means I can ignore the wave checkpoints." | Auto-mode skips prompts, not the wave structure; ignoring wave order still corrupts dependent-task ordering. |
 ## DESIGN COMPLETE

package/skills/discuss/SKILL.md CHANGED

@@ -80,4 +80,17 @@ Cycle: <name or "default">
 - Do not run the interview yourself — always spawn the agent.
 - Do not touch files outside `.design/`.
+## Rationalizations — Thought to Reality
+The shortcuts an agent takes during a discuss session, and what each one costs the decision record:
+| Thought | Reality |
+|---------|---------|
+| "I'll ask all eight questions at once to save time." | Batched questions overwhelm the user; one-at-a-time keeps each decision clean and prevents coupled answers. |
+| "I can run the interview inline instead of spawning the discussant." | The skill's contract is to always spawn the agent — running it yourself skips the discussant's mode handling and D-XX numbering. |
+| "This answer is good enough, I'll record it as a decision without follow-up." | A vague answer ("modern", "clean") recorded as a D-XX locks in an undecided premise; reject and re-ask once. |
+| "I'll batch all the new D-XX entries into STATE.md at the end." | Decisions written atomically per answer survive an interrupted session; batching loses everything if the session drops. |
+| "The glossary term can wait until I write the summary." | CONTEXT.md is written immediately per term — a deferred glossary entry is a naming inconsistency the next cycle inherits. |
+| "Every decision this session is worth an ADR." | ADRs require all three criteria (hard-to-reverse, surprising, real-tradeoff); auto-promoting routine choices buries the genuinely load-bearing ones. |
 ## DISCUSS COMMAND COMPLETE

package/skills/explore/SKILL.md CHANGED

@@ -85,4 +85,21 @@ Full interview protocol + JSON line schema: `./explore-procedure.md` §Step 3.
 Print: "=== Explore complete ===\nSaved: .design/DESIGN.md, .design/DESIGN-DEBT.md, .design/DESIGN-CONTEXT.md\nNext: @get-design-done plan".
+<HARD-GATE>
+Do NOT transition to plan (or invoke `/gdd:plan`) until BOTH `.design/DESIGN.md` AND `.design/DESIGN-CONTEXT.md` are committed AND the user has approved them. If this project uses a custom `.design` location, read the artifact paths from `.design/STATE.md` rather than assuming the default.
+</HARD-GATE>
+## Rationalizations — Thought to Reality
+The shortcut excuses an agent reaches for during explore, and the drift each one introduces:
+| Thought | Reality |
+|---------|---------|
+| "I already know this codebase, I can skip the inventory scan." | An unscanned codebase hides the tokens/components you'll duplicate — the grep pass exists to stop you reinventing what's there. |
+| "The six connection probes are noise, I'll assume Figma is off." | A skipped probe means a wrong connection assumption silently breaks the design stage's tool dispatch. |
+| "`--skip-interview` is fine, the brief covered it." | The interview locks the gray areas the brief left fuzzy; skipping it ships undecided D-XX into planning. |
+| "I'll batch all the interview questions to save round-trips." | Batched questions overwhelm the user and smuggle in coupled assumptions — one-at-a-time keeps each decision clean. |
+| "DESIGN-DEBT.md is optional, the scan was clean enough." | Unrecorded debt resurfaces as an unexplained constraint three stages later with no provenance. |
+| "Prior sketches and project conventions don't apply this cycle." | Ignored conventions get overridden by defaults, producing inconsistency the audit will flag against the rest of the system. |
 ## EXPLORE COMPLETE

package/skills/health/SKILL.md CHANGED

@@ -63,6 +63,12 @@ After the health table, the `gdd_health` MCP surface (`scripts/lib/health-mirror
 Token PRESENCE only is detected (D-10) — the token value is never read, logged, or shown. The Free-tier signal is read from the local raw-pull cache only; no network call is made.
+## Skill-discipline bootstrap (skill_discipline)
+The `gdd_health` MCP surface also reports a `skill_discipline` check (Phase 32) confirming the using-gdd SessionStart bootstrap is live — detail is one of three exact strings:
+- `skill-discipline: ready` — `skills/using-gdd/SKILL.md` exists AND `hooks/hooks.json` SessionStart wires `inject-using-gdd.sh` (status `ok`).
+- `skill-discipline: missing using-gdd` (skill absent) or `skill-discipline: hook not wired` (skill present, no SessionStart inject) — both `warn`.
 ## Check MCP registration (gdd-mcp)
 After the health table, inspect whether `gdd-mcp` (Phase 27.7+) is registered with any installed harness and render a one-line status row. Dismissable via `.design/config.json#mcp_nudge=false`. Non-blocking: failure paths render `MCP server: unknown` rather than crash. Full detection procedure (dismissal check, detection via `scripts/lib/install/mcp-register.cjs`, row rendering for claude/codex/both/neither, fallback) lives in `./health-mcp-detection.md`.

package/skills/plan/SKILL.md CHANGED

@@ -77,4 +77,29 @@ The next stage (design) calls `mcp__gdd_state__transition_stage` on entry — th
 Print: plan tasks (N waves, M total tasks), files written (`.design/DESIGN-PLAN.md`, plus `.design/DESIGN-RESEARCH.md` if research ran), next step `/get-design-done:design`.
+## Spec self-review (before transition)
+Run this final spec-quality pass over `.design/DESIGN-PLAN.md` before the plan→design transition:
+- Placeholder scan: no TBD / TODO / `<placeholder>` / lorem left in the artifact.
+- Internal consistency: sections don't contradict each other.
+- Scope check: nothing in the artifact exceeds (or silently drops) the agreed scope.
+- Ambiguity check: every requirement/decision is specific enough to act on without a follow-up question.
+<HARD-GATE>
+Do NOT transition to design (or invoke `/gdd:design`) until `.design/DESIGN-PLAN.md` is committed AND the user has approved it. If this project uses a custom `.design` location, read the artifact path from `.design/STATE.md` rather than assuming the default.
+</HARD-GATE>
+## Rationalizations — Thought to Reality
+The reasons an agent gives to skip planning or rush DESIGN-PLAN.md, and what each one costs:
+| Thought | Reality |
+|---------|---------|
+| "This change is small, I can design straight from DESIGN-CONTEXT.md." | Plan-skipped tasks blow scope per cycle telemetry; the plan gate is for the typical case, not the exception you think you're in. |
+| "Pattern mapping is brownfield ceremony, I'll skip it." | Step 1.5 is mandatory because an unmapped brownfield is where the executor silently re-implements an existing pattern. |
+| "The plan-checker will just rubber-stamp it, skip the spawn." | The checker's 5 dimensions (coverage, wave order, must-have derivation) catch the gaps you can't see in your own plan. |
+| "I'll let the planner infer wave ordering at design time." | Unordered waves serialize work that could parallelize — or worse, run dependent tasks concurrently and corrupt the tree. |
+| "Research is overkill for this scope." | The complexity heuristic exists precisely because agents under-estimate scope; skipping research on a 3+-scope domain guarantees a mid-design surprise. |
+| "I can record decisions in DESIGN-PLAN.md prose instead of D-XX." | Prose decisions never reach STATE.md, so verify's integration-checker can't trace them and flags them orphaned. |
 ## PLAN COMPLETE

package/skills/router/SKILL.md CHANGED

@@ -79,6 +79,10 @@ If `.design/budget.json` is missing, assume defaults from `reference/config-sche
 When the router cannot resolve `intent-string` to a known agent (no `description` match, no `default-tier` rule, no path-selection fallback), emit ONE `capability_gap` event with `source: "router"` before returning the conservative-fallback JSON. Feeds Phase 29 Stage-0 telemetry — see `./capability-gap-emitter.md` for the synchronous Node snippet, semantic notes (suggested_kind = `"agent"`, MCP-probe exclusion per D-08, back-compat invariant on router output), and the opaque-extras payload routing through `appendChainEvent`.
+## Emitting router_pick on a resolved pick
+When the router DID resolve a pick — it has the `path`/`complexity_class`/`resolved_models` decision and is about to return the decision JSON — emit ONE `router_pick` event (`source: "router"`) recording which skill/agent was auto-picked, as the last step before returning. Side-effect only; the output JSON contract is UNCHANGED. Feeds the D-02 under-reached-skill instrument (Phase 33 baselines per-skill pick rates) — see `./router-pick-emitter.md` for the synchronous Node snippet, the 7-field no-PII payload (context_hash only — never the raw prompt), and the opaque-extras routing through `appendChainEvent`.
 ## Non-Goals
 The router does not: (a) make a model call, (b) write files, (c) enforce budget caps (that's the hook's job), (d) learn from history (Phase 11 reflector territory per D-07).
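
The `router_pick` section above states that the payload carries a `context_hash` and never the raw prompt. The actual 7-field payload lives in `./router-pick-emitter.md`, which is not shown in this diff, so the following is only a sketch of the hashing idea. The choice of SHA-256, the function names, and every field name other than `context_hash` and `source: "router"` are assumptions.

```javascript
'use strict';
const { createHash } = require('node:crypto');

// Assumption: SHA-256 hex digest. The package may use a different hash or encoding.
function contextHash(prompt) {
  return createHash('sha256').update(prompt, 'utf8').digest('hex');
}

// Hypothetical payload builder: `type` and `picked` are illustrative field names,
// not the documented 7-field shape.
function buildRouterPick(prompt, pickedSkill) {
  return {
    type: 'router_pick',
    source: 'router',
    picked: pickedSkill,
    context_hash: contextHash(prompt),
  };
}

const event = buildRouterPick('redesign the pricing page', 'brief');

// The serialized event carries a 64-character hex digest, not the prompt text.
console.log(event.context_hash.length); // 64
console.log(JSON.stringify(event).includes('pricing')); // false
```

Hashing keeps events comparable (identical prompts yield identical hashes) without storing user text, which matches the "no-PII" intent described above.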