npm - @hegemonart/get-design-done - Versions diffs - 1.24.2 → 1.26.0 - Mend

@hegemonart/get-design-done 1.24.2 → 1.26.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (60) hide show

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +87 -0
package/README.de.md +679 -0
package/README.fr.md +679 -0
package/README.it.md +679 -0
package/README.ja.md +679 -0
package/README.ko.md +679 -0
package/README.md +399 -728
package/README.zh-CN.md +480 -133
package/SKILL.md +2 -0
package/agents/README.md +60 -0
package/agents/design-reflector.md +43 -0
package/agents/gdd-intel-updater.md +34 -1
package/agents/prototype-gate.md +122 -0
package/agents/quality-gate-runner.md +125 -0
package/hooks/budget-enforcer.ts +275 -11
package/hooks/gdd-decision-injector.js +183 -3
package/hooks/gdd-turn-closeout.js +238 -0
package/hooks/hooks.json +10 -0
package/package.json +5 -5
package/reference/STATE-TEMPLATE.md +41 -0
package/reference/config-schema.md +30 -0
package/reference/model-prices.md +40 -19
package/reference/prices/antigravity.md +21 -0
package/reference/prices/augment.md +21 -0
package/reference/prices/claude.md +42 -0
package/reference/prices/cline.md +23 -0
package/reference/prices/codebuddy.md +21 -0
package/reference/prices/codex.md +25 -0
package/reference/prices/copilot.md +21 -0
package/reference/prices/cursor.md +21 -0
package/reference/prices/gemini.md +25 -0
package/reference/prices/kilo.md +21 -0
package/reference/prices/opencode.md +23 -0
package/reference/prices/qwen.md +25 -0
package/reference/prices/trae.md +23 -0
package/reference/prices/windsurf.md +21 -0
package/reference/registry.json +107 -1
package/reference/runtime-models.md +446 -0
package/reference/schemas/runtime-models.schema.json +123 -0
package/scripts/install.cjs +8 -0
package/scripts/lib/budget-enforcer.cjs +446 -0
package/scripts/lib/cost-arbitrage.cjs +294 -0
package/scripts/lib/gdd-state/mutator.ts +454 -0
package/scripts/lib/gdd-state/parser.ts +351 -1
package/scripts/lib/gdd-state/types.ts +193 -0
package/scripts/lib/install/installer.cjs +188 -11
package/scripts/lib/install/parse-runtime-models.cjs +267 -0
package/scripts/lib/install/runtimes.cjs +43 -0
package/scripts/lib/quality-gate-detect.cjs +126 -0
package/scripts/lib/runtime-detect.cjs +96 -0
package/scripts/lib/tier-resolver.cjs +311 -0
package/scripts/validate-frontmatter.ts +138 -1
package/skills/quality-gate/SKILL.md +222 -0
package/skills/router/SKILL.md +79 -10
package/skills/sketch-wrap-up/SKILL.md +47 -2
package/skills/spike-wrap-up/SKILL.md +41 -2
package/skills/turn-closeout/SKILL.md +115 -0
package/skills/verify/SKILL.md +22 -0

package/.claude-plugin/marketplace.json CHANGED Viewed

@@ -5,14 +5,14 @@
   },
   "metadata": {
     "description": "Get Design Done — 5-stage agent-orchestrated design pipeline with 9 connections, handoff-first workflow, bidirectional Figma write-back, 22+ specialized agents, queryable knowledge layer (intel store, dependency analysis, learnings extraction), and a self-improvement loop (reflector, frontmatter + budget feedback, global-skills layer). v1.20.0 ships the SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream, and resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) for rate-limit + 429 + context-overflow recovery. Full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows) and release automation (auto-tag + GitHub Release + release-time smoke test).",
-    "version": "1.24.2"
+    "version": "1.26.0"
   },
   "plugins": [
     {
       "name": "get-design-done",
       "source": "./",
       "description": "Agent-orchestrated 5-stage design pipeline: Brief → Explore → Plan → Design → Verify. 22+ specialized agents, 9 connections (Figma, Refero, Preview, Storybook, Chromatic, Figma Writer, Graphify, Pinterest, Claude Design), Claude Design handoff, bidirectional Figma write-back, and a queryable intel store (.design/intel/) for dependency and learnings queries. Standalone commands: style, darkmode, compare, figma-write, graphify, handoff, analyze-dependencies, skill-manifest, extract-learnings. Embeds NNG heuristics, WCAG thresholds, typographic systems, motion framework, and anti-pattern catalog. Ships with a full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows) and release automation. Optimization layer (v1.0.4.1, retroactive): gdd-router + gdd-cache-manager skills, PreToolUse budget-enforcer hook, tier-aware agent frontmatter, lazy checker gates, streaming synthesizer, /gdd:warm-cache + /gdd:optimize commands, and cost telemetry at .design/telemetry/costs.jsonl — targeting 50-70% per-task token-cost reduction with no quality-floor regression. v1.20.0 SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream at .design/telemetry/events.jsonl, resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) with rate-limit + 429 + context-overflow recovery, and TypeScript toolchain.",
-      "version": "1.24.2",
+      "version": "1.26.0",
       "author": {
         "name": "hegemonart"
       },

package/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,7 +1,7 @@
 {
   "name": "get-design-done",
   "short_name": "gdd",
-  "version": "1.24.2",
+  "version": "1.26.0",
   "description": "Agent-orchestrated 5-stage design pipeline: Brief → Explore → Plan → Design → Verify. 22+ specialized agents, 9 connections (Figma, Refero, Preview, Storybook, Chromatic, Figma Writer, Graphify, Pinterest, Claude Design), handoff-first workflow via Claude Design bundles, bidirectional Figma write-back (annotations, Code Connect), queryable intel store (`.design/intel/`) for O(1) design surface lookups, and self-improvement loop (reflector agent, frontmatter + budget feedback, global-skills layer at `~/.claude/gdd/global-skills/`). Standalone commands: style, darkmode, compare, figma-write, graphify, handoff, analyze-dependencies, skill-manifest, extract-learnings, reflect, apply-reflections. Embeds NNG heuristics, WCAG thresholds, typographic systems, motion framework, and anti-pattern catalog. Ships with a full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows, lint + schema + frontmatter + stale-ref + shellcheck + gitleaks + injection-scan + blocking size-budget) and release automation (auto-tag + GitHub Release + release-time smoke test). Optimization layer (v1.0.4.1, retroactive): gdd-router + gdd-cache-manager skills, PreToolUse budget-enforcer hook, tier-aware agent frontmatter, lazy checker gates, streaming synthesizer, /gdd:warm-cache + /gdd:optimize commands, and cost telemetry at .design/telemetry/costs.jsonl — targeting 50-70% per-task token-cost reduction with no quality-floor regression. v1.20.0 SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream at .design/telemetry/events.jsonl, resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) with rate-limit + 429 + context-overflow recovery, and TypeScript toolchain.",
   "author": {
     "name": "hegemonart",

package/CHANGELOG.md CHANGED Viewed

@@ -4,6 +4,93 @@ All notable changes to get-design-done are documented here. Versions follow [sem
 ---
+## [1.26.0] — 2026-04-29
+Phase 26 Headless Model Resolver milestone — closes the model-selection gap left by Phase 24's distribution headlessness. `default-tier: opus|sonnet|haiku` frontmatter now actually does something on the 13 non-Claude runtimes the multi-runtime installer ships to. Three layers gain runtime-awareness without a breaking change: the agent frontmatter (additive `reasoning-class` alias), the router output (additive `resolved_models` field), and the cost telemetry (per-runtime price tables + runtime-tagged events.jsonl rows). The phase ships **structure** — adapter layer, resolvers, schemas, contracts — not editorial picks for which model each runtime treats as opus/sonnet/haiku; those come from runtime adapter authors with provenance citations baked into `reference/runtime-models.md`.
+### Added
+- **Per-runtime tier→model adapter source-of-truth** — `reference/runtime-models.md` ships the canonical map for all 14 runtimes (claude, codex, gemini, qwen, kilo, copilot, cursor, windsurf, antigravity, augment, trae, codebuddy, cline, opencode). Each row carries `tier_to_model` (`opus`/`sonnet`/`haiku`), `reasoning_class_to_model` (`high`/`medium`/`low`), and a `provenance` array (source URL + retrieval timestamp + last-validated cycle) per D-01. Schema lives at `reference/schemas/runtime-models.schema.json` with `$schema_version: 1` for forward-compatible bumps (D-03). Pure-JS strict validator at `scripts/lib/install/parse-runtime-models.cjs` — no `ajv` dependency at the parser layer; install-time validation catches typos before runtime. Canonical seed picks per D-02: `claude → claude-opus-4-7 / claude-sonnet-4-7 / claude-haiku-4-5`, `codex → gpt-5 / gpt-5-mini / gpt-5-nano`, `gemini → gemini-2.5-pro / gemini-2.5-flash / gemini-2.5-flash-lite`, `qwen → qwen3-max / qwen3-plus / qwen3-flash`. (Plan 26-01, commit `5541086`)
+- **`tier-resolver.cjs` + `runtime-detect.cjs`** — `scripts/lib/tier-resolver.cjs` exports `resolve(runtime, tier, opts?) → model-string | null` translating frontmatter tier vocabulary into the concrete model name a specific runtime understands. Fallback chain per D-04: (1) runtime-specific entry → use; (2) claude row → use with `tier_resolution_fallback` event; (3) null + `tier_resolution_failed` event. Never throws — null is a valid output the consumer must handle. `scripts/lib/runtime-detect.cjs` exports `detect()` which reads the same `*_CONFIG_DIR` / `*_HOME` env-var chain Phase 24's installer uses (D-05); the env-var → runtime-ID mapping is owned by `scripts/lib/install/runtimes.cjs` and re-derived here so adding a runtime in one place automatically extends detection. Returns null when no recognized env-var is set (e.g. CI matrix, bare Node script). (Plan 26-02, commits `4bf7dea`, `c0bbae3`)
+- **Installer emits `models.json` per runtime config-dir** — `scripts/lib/install/runtimes.cjs` gains a `tier_to_model` field; `installer.cjs` writes a `models.json` payload at install time per runtime config-dir per D-06: `{ tier_to_model, reasoning_class_to_model, runtime, schema_version: 1, generated_at: <ISO>, source: "reference/runtime-models.md" }`. `--dry-run` shows the same set without writing; `uninstall` removes the file (clean uninstall guarantee from Phase 24 carries forward). One file per config-dir means runtime harnesses can read it at session start without parsing markdown. (Plan 26-03, commit `2ab47cf`)
+- **Router emits `resolved_models` field** — `skills/router/SKILL.md` JSON output gains `resolved_models: { "agent_name": "concrete-model-id", … }` next to the existing `model_tier_overrides` per D-07. Strict superset over v1.25.0: existing consumers reading `model_tier_overrides` keep working unchanged (enum stays `opus|sonnet|haiku` for back-compat across all 14 runtimes); new consumers (budget-enforcer cost computation, Phase 22 cost telemetry, Phase 23.5 bandit posterior store) read `resolved_models` for runtime-correct cost. Output schema versioning table bumped: `resolved_models` lands at v1.26.0 (26-04), `complexity_class` (Phase 25) and `model_tier_overrides` (legacy) preserved unchanged. (Plan 26-04, commit `eb38d4e`)
+- **Per-runtime price tables + budget-enforcer shared backend** — `reference/model-prices.md` becomes a router that links to per-runtime sub-tables under `reference/prices/`: `claude.md` (Anthropic), `codex.md` (OpenAI Codex gpt-5 family), `gemini.md` (Google Gemini 2.5 family), `qwen.md` (Alibaba Qwen 3 family) carry confirmed prices; the remaining 10 runtimes ship as stubs with provenance citation TODOs per D-08. `scripts/lib/budget-enforcer.cjs` exports `computeCost({ model_id?, tier?, runtime, tokens_in, tokens_out, cache_hit? })` with the four-step lookup order (runtime price-table by model_id → runtime by tier → claude fallback by model_id → claude by tier → null + diagnostic reason). `hooks/budget-enforcer.ts` reaches into the shared backend via `createRequire` — same scheme as `rate-guard.cjs`. Cost telemetry events.jsonl rows tag `runtime` (Phase 22 event chain), so the cost-aggregator rolls up per-runtime AND per-tier for apples-to-apples comparison. (Plan 26-05, commit `57bf43e`)
+- **Reflector cross-runtime cost-arbitrage** — `scripts/lib/cost-arbitrage.cjs` and reflector wiring surface a structured proposal when one runtime's spend exceeds another's by >50% on the same `(agent, tier)` per D-09. Mixed-runtime cycle history (some agent spawns ran in CC, others in Codex within the same cycle) is handled without crash or per-runtime double-count. Reflector emits `runtime_arbitrage_signal` events with both runtime IDs, the agent/tier pair, the observed spread, and the recommended cheaper-runtime tag. The 50% threshold is a starting heuristic — bandit-style learning over arbitrage outcomes is Phase 23.5+ territory. (Plan 26-06, commit `5de824c`)
+- **`reasoning-class` runtime-neutral frontmatter alias** — `agents/README.md` documents `reasoning-class: high|medium|low` as an additive alias for `default-tier` per D-10. v1.26 ships the alias with full equivalence semantics (`high ↔ opus`, `medium ↔ sonnet`, `low ↔ haiku`) but does not deprecate `default-tier`. Both fields may coexist on the same agent; mismatched dual annotations are a validation error (D-11). Long-term winner is data-gated: alias adoption signal measured by `gdd-intel-updater` on `agents/*.md` changes; if alias share stays below 50% by Phase 28, `default-tier` is canonical and alias is deprecated; if alias wins majority, the reverse. Same evidence-gating discipline as Phase 23.5's deferred items. (Plan 26-07, commit `be3e590`)
+- **Frontmatter validator + intel-updater integration** — `scripts/validate-frontmatter.ts` accepts optional `reasoning-class` enum; if both `default-tier` and `reasoning-class` are present, equivalence is enforced (`high+opus` / `medium+sonnet` / `low+haiku` — mismatch is a validation error per D-11). `gdd-intel-updater` re-runs on changes under `agents/*.md` to keep `.design/intel/agent-tiers.json` current with **both** fields populated for downstream tooling. Tests assert tier↔class equivalence across all 26 agents. (Plan 26-08, commit `14afa72`)
+- **`docs/MULTI-RUNTIME-MODELS.md`** — Plan 26-09 ships an ops guide covering: how to add a new runtime tier-map (edit `reference/runtime-models.md`, follow schema, run the parser test), the `reasoning-class ↔ default-tier` equivalence table, the `tier-resolver.cjs` fallback chain (runtime entry → claude row + warn event → null + fail event), how cost telemetry rolls up (per-runtime + per-tier), and the future `budget.json#runtime_overrides.<runtime>.tier_to_model` per-runtime override hook.
+### Tests
+- `tests/runtime-models-schema.test.cjs` (new) — calls `parseRuntimeModels()` from the dependency-free pure-JS parser at `scripts/lib/install/parse-runtime-models.cjs` (no `ajv` pulled in — the parser does strict validation natively), asserts `$schema_version === 1`, all 14 runtime IDs from `runtimes.cjs` present, canonical seed picks correct (claude→claude-opus-4-7, codex→gpt-5, gemini→gemini-2.5-pro, qwen→qwen3-max), and provenance fields present per row.
+- `tests/router-resolved-models.test.cjs` (new) — content-level assertions on `skills/router/SKILL.md`: `resolved_models` mentioned in the JSON example, in the field docstring, and at v1.26.0 in the Output schema versioning table; `complexity_class` (Phase 25) still mentioned (no regression); `model_tier_overrides` still mentioned (back-compat).
+- `tests/budget-enforcer-runtime-aware.test.cjs` (new) — pure-function tests of `scripts/lib/budget-enforcer.cjs#computeCost()`: codex/gpt-5-mini path returns cost from `reference/prices/codex.md`; claude/opus path returns cost from `reference/prices/claude.md`; missing-runtime / missing-tier falls back to claude with the `fallback: true` flag; cache-hit path swaps `cached_input_per_1m` for `input_per_1m`.
+- `tests/phase-26-baseline.test.cjs` (new) — same shape as `phase-25-baseline.test.cjs`. Asserts all 9 plans landed (runtime-models source + tier-resolver + runtime-detect + installer models.json + router resolved_models + budget-enforcer + cost-arbitrage + reasoning-class alias + frontmatter validator extension) plus all 4 manifests align at 1.26.0 + CHANGELOG `## [1.26.0]` block exists.
+- `tests/semver-compare.test.cjs` `OFF_CADENCE_VERSIONS` gains `1.26.0` with the milestone summary.
+### Decisions
+D-01 through D-13 — see `.planning/phases/26-headless-model-resolver/CONTEXT.md` for the full register. Highlights:
+- **D-01** — `reference/runtime-models.md` is the single source of truth for all 14 runtimes; each row carries provenance (URL + retrieval timestamp + last-validated cycle) so the future authority-watcher can flag drift.
+- **D-04** — `tier-resolver.cjs` fallback chain is non-blocking: runtime entry → claude row + warning event → null + fail event. Never throws; null is a valid output the consumer must handle.
+- **D-05** — `runtime-detect.cjs` reuses Phase 24's env-var → runtime-ID mapping verbatim; single source of truth lives in `runtimes.cjs`. Adding a new runtime extends both detection and installation.
+- **D-07** — `resolved_models` is additive to `model_tier_overrides` — strict superset, same back-compat discipline as Phase 25's `complexity_class` next to `path`.
+- **D-08** — Cost telemetry split: `reference/model-prices.md` becomes a router; per-runtime sub-tables under `reference/prices/<runtime>.md`. events.jsonl rows tag `runtime`. Aggregation rolls up per-runtime AND per-tier.
+- **D-10** — `reasoning-class` is additive, NOT a replacement for `default-tier`. Both may coexist; equivalence is enforced. Deprecation is data-gated (Phase 28 measurement).
+- **D-12** — All 9 plans land together with one CHANGELOG block. 4 manifests bump in lockstep (`package.json` + `.claude-plugin/plugin.json` + `.claude-plugin/marketplace.json` × 2 slots + `tests/semver-compare.test.cjs` `OFF_CADENCE_VERSIONS`).
+- **D-13** — Plan boundary discipline: Wave A (26-01..26-03) builds the adapter; Wave B (26-04..26-06) wires the existing pipeline; Wave C (26-07..26-08) lands the runtime-neutral alias additively; Wave D (26-09) closes out.
+---
+## [1.25.0] — 2026-04-29
+Phase 25 Pipeline Hardening milestone — converts four pipeline gaps surfaced in the post-Phase-24 retrospective from side roads into first-class pipeline citizens: a prototype gate that makes sketches/spikes a read/write member of the decision graph, an S/M/L/XL complexity refinement to the router that distinguishes trivial from full-pipeline work, a Stage 4.5 quality gate that runs lint/typecheck/test between Design and Verify, and a Stop-hook turn closeout that closes the events.jsonl gap at turn-end. All four sub-features are additive — no state-machine break (5 stages stay 5 stages), no breaking router contract (`path: fast|quick|full` is preserved alongside the new `complexity_class`), and the existing budget-enforcer / verify-entry / decision-injector consumers gain the new fields without a code change to their existing call sites.
+### Added
+- **Prototype gate** — `agents/prototype-gate.md` (new Haiku-tier signal-scoring agent emitting `{recommend: 'sketch'|'spike'|'none', confidence, reasons}`) + STATE.md `<prototyping>` block schema with three child element types (`<sketch slug=… cycle=… decision=D-XX status=resolved/>`, `<spike slug=… cycle=… decision=D-XX verdict=yes|no|partial status=resolved/>`, `<skipped at=… cycle=… reason=…/>`) parsed/serialized through `scripts/lib/gdd-state/{types,parser,mutator}.ts`. Round-trips byte-identically; the block is omitted when null (no spurious empty-tag pairs). (Plan 25-01, commit `92e93bf`)
+- **Router S/M/L/XL complexity_class** — `skills/router/SKILL.md` heuristic table extended from 3 to 4 buckets (S = `/gdd:help`/`/gdd:stats`/single-Haiku skills; M = `/gdd:scan`/`/gdd:brief`/`/gdd:sketch`; L = `/gdd:explore`/`/gdd:discover`/standalone `/gdd:plan`/`/gdd:verify`; XL = `/gdd:next`/`/gdd:do`/autonomous flows). Router JSON output gains a `complexity_class` field next to the existing `path`; the canonical mapping is S→fast (short-circuited), M→fast, L→quick, XL→full. `hooks/budget-enforcer.ts` reads `complexity_class` when present and applies `.design/budget.json#class_caps_usd[<class>]` per-spawn cap; falls through to legacy cap behavior when the field is absent (strict superset). `reference/config-schema.md` documents the new optional `class_caps_usd: { S?, M?, L?, XL? }` field. (Plan 25-02, commit `a239171`)
+- **Quality gate (Stage 4.5)** — `skills/quality-gate/SKILL.md` (new) runs the project's own `lint`/`typecheck`/`test`/visual-regression scripts in parallel between `/gdd:design` and `/gdd:verify`. Three-tier detection chain (D-06): authoritative `.design/config.json#quality_gate.commands` > auto-detect from `package.json#scripts` > skip-with-notice. `agents/quality-gate-runner.md` (new Haiku-tier classifier) parses tool output and routes the fix-vs-block decision; `design-fixer` is reused for the bounded fix loop (D-08, default `max_iters: 3`). Timeout warns and proceeds (D-07 — non-blocking on slow suites); fail at `max_iters` marks STATE `<quality_gate>` block status="fail" and verify entry refuses (D-08). Six lifecycle events emitted to `events.jsonl` via the existing `appendEvent()` surface: `quality_gate_started`, `quality_gate_iteration`, `quality_gate_pass`, `quality_gate_fail`, `quality_gate_timeout`, `quality_gate_skipped` (D-09). STATE.md `<quality_gate>` block is a single-element schema (`<run started_at=… completed_at=… status=… iteration=N commands_run="lint,typecheck,test"/>`) — most-recent run only. (Plan 25-03, commit `037b25f`; Plan 25-07 wiring, commit `b64250c`)
+- **Turn closeout Stop hook** — `hooks/gdd-turn-closeout.js` (new) fires when the assistant turn ends. Reads STATE.md (1 file) + tails the last line of events.jsonl; if `position.status==in_progress` AND last event >60s ago, appends a `turn_end` event and surfaces a stage-completion or paused-mid-task nudge as `additionalContext`. ≤10ms typical latency budget per D-10; idempotent on the `(stage, task_progress)` tuple. `skills/turn-closeout/SKILL.md` (new) is the portable mirror for the 13 non-Claude runtimes that lack a Stop-hook surface — same logic, called from orchestrator skills (`/gdd:next`, `/gdd:design`, `/gdd:verify`) at exit per D-11. `hooks/hooks.json` Stop entry wires the JS hook into Claude Code's harness. (Plan 25-04, commit `675e879`; Plan 25-08 wiring, commit `f52a471`)
+- **sketch-wrap-up + spike-wrap-up dual writes** — `skills/sketch-wrap-up/SKILL.md` and `skills/spike-wrap-up/SKILL.md` now perform a coupled two-write on resolution: append the chosen-variant or verdict as a `D-XX` entry under `<decisions>` AND append a corresponding `<sketch …/>` or `<spike …/>` line under `<prototyping>`. The two entries reference each other via `decision="D-XX"` so the `decision-injector` retrieval path works through both `<decisions>` (read by all downstream stages) and `<prototyping>` (read by planner-specific context). (Plan 25-05, commit `953605e`)
+- **Decision injector surfaces prior prototyping outcomes** — `hooks/gdd-decision-injector.js` parses STATE.md `<prototyping>` entries (regex-based — the hook stays self-contained JS) and surfaces the top-N relevant entries when an agent reads any `.design/**.md` or `.planning/**.md` file ≥1500 bytes. Outcomes appear under a "### Prior prototyping outcomes" heading alongside the existing D-XX matches. Empty / missing block is a no-op. (Plan 25-06, commit `da1961c`)
+- **`scripts/lib/quality-gate-detect.cjs`** — Plan 25-09 promotes the doc-only auto-detection logic from `skills/quality-gate/SKILL.md` Step 1 (D-06) into a small testable JS module. Pure function, no I/O. Exports `detect({configCommands, scripts}) → {commands, tier, reason?}` plus the `ALLOWLIST` and `ALWAYS_EXCLUDED` arrays so downstream consumers and tests can pin to the canonical detection contract.
+### Tests
+- `tests/prototype-gate-state-block.test.cjs` (new, 6 tests) — `<prototyping>` block round-trips through parse → serialize byte-identically, parser populates `state.prototyping.{sketches,spikes,skipped}` correctly, and serializer omits the block when null (no spurious empty-tag pair).
+- `tests/router-complexity-class.test.cjs` (new, 10 tests) — JSON example contains `complexity_class` next to `path`; heuristic table contains S/M/L/XL bucket labels; canonical mapping rows present; `/gdd:help`→S, `/gdd:scan`→M, standalone `/gdd:plan`→L, `/gdd:next`→XL bucket assignments documented; S-class short-circuit semantics captured.
+- `tests/quality-gate-detection.test.cjs` (new, 12 tests) — Tier 1 (config override) / Tier 2 (auto-detect with `tsc` substitution + `test:e2e` exclusion) / Tier 3 (skip-with-notice) of the detection chain; SKILL.md timeout-and-failure semantics asserted at the content level.
+- `tests/turn-closeout-hook.test.cjs` (new, 8 tests) — spawns the Stop hook with 4 synthesized cwd states (no STATE.md, status=completed, fresh event, stale event with N/N) and asserts `{continue: true}` shape, `additionalContext` nudge, idempotence on the `(stage, task_progress)` tuple, and end-to-end latency.
+- `tests/phase-25-baseline.test.cjs` (new, 18 tests) — same shape as `phase-24-baseline.test.cjs`. Asserts all four sub-features land (prototype-gate agent + state block + wrap-up dual-writes + decision-injector wiring; router complexity_class + 4 buckets; quality-gate skill + agent + detection module + verify Step 2.5 + 6 lifecycle events; turn-closeout hook + portable mirror + hooks.json Stop wiring) plus all 4 manifests align at 1.25.0 + CHANGELOG `## [1.25.0]` block exists.
+- `tests/semver-compare.test.cjs` `OFF_CADENCE_VERSIONS` gains `1.25.0` with the milestone summary.
+### Decisions
+D-01 through D-13 — see `.planning/phases/25-pipeline-hardening/CONTEXT.md` for the full decision register. Highlights:
+- **D-01** — `<prototyping>` is a STATE.md block, not a stage. Keeps the 5-stage state machine intact (anti-renumber rule); sketches/spikes are checkpoints that fire mid-pipeline.
+- **D-04 / D-05** — `complexity_class` is additive to `path`. `path` enum stays `fast|quick|full` for back-compat; `complexity_class` enum is `S|M|L|XL` with canonical mapping S→fast (short-circuited), M→fast, L→quick, XL→full.
+- **D-06** — Quality-gate detection is a 3-tier chain: authoritative config > auto-detect from `package.json#scripts` > skip-with-notice. Never blocks if no commands resolve.
+- **D-07** — Quality-gate timeout warns and proceeds (does not block). Failures still block at the verify entry; slow runs surface a warning.
+- **D-10** — Stop-hook latency budget ≤10ms typical, idempotent on the `(stage, task_progress)` tuple, never blocks the user's next turn.
+- **D-12** — All 9 plans land together with one CHANGELOG block. 4 manifests bump in lockstep (`package.json` + `.claude-plugin/plugin.json` + `.claude-plugin/marketplace.json` × 2 slots + `tests/semver-compare.test.cjs` `OFF_CADENCE_VERSIONS`).
+---
 ## [1.24.2] — 2026-04-25
 Dependabot cleanup — patches the one real transitive vulnerability flagged on `main` and configures Dependabot to stop scanning inert framework-detection test fixtures. No behavior change for end users; security/quality patch on top of v1.24.1.