@hegemonart/get-design-done 1.39.1 → 1.39.2

@@ -5,14 +5,14 @@
5
5
  },
6
6
  "metadata": {
7
7
  "description": "Get Design Done — 5-stage agent-orchestrated design pipeline with 9 connections, handoff-first workflow, bidirectional Figma write-back, 22+ specialized agents, queryable knowledge layer (intel store, dependency analysis, learnings extraction), and a self-improvement loop (reflector, frontmatter + budget feedback, global-skills layer). v1.20.0 ships the SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream, and resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) for rate-limit + 429 + context-overflow recovery. Full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows) and release automation (auto-tag + GitHub Release + release-time smoke test).",
8
- "version": "1.39.1"
8
+ "version": "1.39.2"
9
9
  },
10
10
  "plugins": [
11
11
  {
12
12
  "name": "get-design-done",
13
13
  "source": "./",
14
14
  "description": "Agent-orchestrated 5-stage design pipeline: Brief → Explore → Plan → Design → Verify. 22+ specialized agents, 9 connections (Figma, Refero, Preview, Storybook, Chromatic, Figma Writer, Graphify, Pinterest, Claude Design), Claude Design handoff, bidirectional Figma write-back, and a queryable intel store (.design/intel/) for dependency and learnings queries. Standalone commands: style, darkmode, compare, figma-write, graphify, handoff, analyze-dependencies, skill-manifest, extract-learnings. Embeds NNG heuristics, WCAG thresholds, typographic systems, motion framework, and anti-pattern catalog. Ships with a full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows) and release automation. Optimization layer (v1.0.4.1, retroactive): gdd-router + gdd-cache-manager skills, PreToolUse budget-enforcer hook, tier-aware agent frontmatter, lazy checker gates, streaming synthesizer, /gdd:warm-cache + /gdd:optimize commands, and cost telemetry at .design/telemetry/costs.jsonl — targeting 50-70% per-task token-cost reduction with no quality-floor regression. v1.20.0 SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream at .design/telemetry/events.jsonl, resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) with rate-limit + 429 + context-overflow recovery, and TypeScript toolchain.",
15
- "version": "1.39.1",
15
+ "version": "1.39.2",
16
16
  "author": {
17
17
  "name": "hegemonart"
18
18
  },
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "get-design-done",
3
3
  "short_name": "gdd",
4
- "version": "1.39.1",
4
+ "version": "1.39.2",
5
5
  "description": "Agent-orchestrated 5-stage design pipeline: Brief → Explore → Plan → Design → Verify. 22+ specialized agents, 9 connections (Figma, Refero, Preview, Storybook, Chromatic, Figma Writer, Graphify, Pinterest, Claude Design), handoff-first workflow via Claude Design bundles, bidirectional Figma write-back (annotations, Code Connect), queryable intel store (`.design/intel/`) for O(1) design surface lookups, and self-improvement loop (reflector agent, frontmatter + budget feedback, global-skills layer at `~/.claude/gdd/global-skills/`). Standalone commands: style, darkmode, compare, figma-write, graphify, handoff, analyze-dependencies, skill-manifest, extract-learnings, reflect, apply-reflections. Embeds NNG heuristics, WCAG thresholds, typographic systems, motion framework, and anti-pattern catalog. Ships with a full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows, lint + schema + frontmatter + stale-ref + shellcheck + gitleaks + injection-scan + blocking size-budget) and release automation (auto-tag + GitHub Release + release-time smoke test). Optimization layer (v1.0.4.1, retroactive): gdd-router + gdd-cache-manager skills, PreToolUse budget-enforcer hook, tier-aware agent frontmatter, lazy checker gates, streaming synthesizer, /gdd:warm-cache + /gdd:optimize commands, and cost telemetry at .design/telemetry/costs.jsonl — targeting 50-70% per-task token-cost reduction with no quality-floor regression. v1.20.0 SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream at .design/telemetry/events.jsonl, resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) with rate-limit + 429 + context-overflow recovery, and TypeScript toolchain. v1.27.7 ships gdd-mcp (Phase 27.7): 12 read-only MCP tools for sub-3s priming. 
v1.28.0 (Phase 28): Foundational References Tier 2 — 5 new reference files (color-theory, composition, proportion-systems, i18n, contrast-advanced), 2 verifier i18n probes + 1 explore i18n-readiness probe, 12 additive cross-link insertions across 10 existing references, 2 orthogonal audit-scoring lens-tags (composition_alignment + i18n_readiness).",
6
6
  "author": {
7
7
  "name": "hegemonart",
package/CHANGELOG.md CHANGED
@@ -4,6 +4,36 @@ All notable changes to get-design-done are documented here. Versions follow [sem
4
4
 
5
5
  ---
6
6
 
7
+ ## [1.39.2] - 2026-06-01
8
+
9
+ ### Phase 39.2 — Long-Horizon Cost Governance
10
+
11
+ Closes the split Phase 39 (39.1 shipped DS migration). Phase 10.1 per-task caps and Phase 26 per-runtime telemetry track *cost*, but neither **forecasts** it, caps it at the *project* level, or shows whether the spend actually *shipped* anything. 39.2 adds a per-cycle spend **forecast**, a **`project_cap`** hard-halt, and an **ROI dashboard**. **No new runtime dependency, no new egress**: three pure helpers plus an additive, disabled-by-default branch on the existing budget-enforcer hook.
12
+
13
+ ### Added
14
+
15
+ - **`scripts/lib/budget/cost-forecast.cjs`** — pure, dep-free per-cycle forecast: `forecast()` (best/typical/worst from the mean ± k·σ of historical per-cycle rates) + `cyclesToCap()` ("hit your cap in Y cycles"). Deterministic.
16
+ - **`scripts/lib/budget/roi.cjs`** — pure ROI join: `computeRoi()` (per-cycle cost ⋈ shipped/reverted commits → cost-per-shipped-commit + stick rate) + `roiTableMarkdown()`.
17
+ - **`scripts/lib/budget/project-cap.cjs`** — pure cap classifier: `classifyProjectBudget(spend, cap)` → `ok`/`warn-50`/`warn-80`/`halt`; **disabled when `cap ≤ 0`** (the non-breaking default).
18
+ - **`agents/cost-forecaster.md`** — groups `costs.jsonl` by cycle, runs the model, supports `--scenario best|typical|worst`, emits a `budget_forecast` event. Report-only (sonnet, size_budget M).
19
+ - **`skills/budget/SKILL.md`** (`/gdd:budget [--cycles N] [--scenario …]`) — forecast + "at the current rate you'll hit your $X project cap in Y cycles."
20
+ - **`skills/roi/SKILL.md`** (`/gdd:roi [--since <date>] [--window-days 14]`) — the ROI table; "shipped" = a commit surviving ≥ 14 days (catches revert-after-bug-discovery).
21
+ - **`reference/cost-governance.md`** — the contract (forecast model, `project_cap` semantics, ROI signal, events). Registered.
22
+
23
+ ### Changed
24
+
25
+ - **`hooks/budget-enforcer.ts`** — an **additive** `project_cap` branch (delegates the threshold math to `project-cap.cjs`): warns at 50% + 80%, hard-halts at 100% under `enforce`. **Disabled by default** (`project_cap_usd: 0`) so existing users see zero behavior change. **Graceful** — it blocks the *next* PreToolUse:Agent spawn, letting the current stage finish.
26
+ - **`reference/schemas/budget.schema.json`** — + `project_cap_usd` (≥ 0; 0/absent = disabled) + `project_cap_enforcement_mode` (enforce|warn|log).
27
+ - **`reference/schemas/events.schema.json`** — free-form `type` seed += `budget_forecast` / `project_cap_warning` / `project_cap_halt` (schema-seed only; `KNOWN_EVENT_TYPES` count unchanged).
28
+
29
+ ### Notes
30
+
31
+ - **No new runtime dependency, no new egress** — three pure text/arithmetic helpers + a local `package.json`/`costs.jsonl` read; the hook only ever *blocks*, never spends.
32
+ - 6-manifest lockstep at **v1.39.2** + `OFF_CADENCE_VERSIONS.add('1.39.2')` + the 31 live-pinned `manifests-version.txt` baselines forward-propagated 1.39.1 → 1.39.2.
33
+ - Inventory relock: registry-diff 157 → 158 (+`cost-governance`), skill-list 77 → 79 (+`budget`, +`roi`), agent-list +`cost-forecaster` + both frontmatter-snapshots, event-schema-snapshot sha256 re-locked (the seed-list edit, LF-normalized), tarball golden 700 → 707 (+7). Root `SKILL.md` command table += `budget` + `roi`.
34
+
35
+ ---
36
+
7
37
  ## [1.39.1] - 2026-06-01
8
38
 
9
39
  ### Phase 39.1 — DS Migration Workflows
package/README.md CHANGED
@@ -182,6 +182,10 @@ GDD now tracks a design past "PR merged" to **actually live**. [`/gdd:rollout-st
182
182
 
183
183
  When a design system ships a breaking major — shadcn/ui v1→v2, Tailwind v3→v4, MUI v5→v6, or the Material 2/3 token rename — GDD detects the skew from the in-repo `package.json`, consults a curated rule library ([`reference/migrations/`](reference/migrations/)), and produces an **impact-scored, proposal-only** migration plan via [`ds-migration-planner`](agents/ds-migration-planner.md). Each affected component is scored by visual-delta × usage × tests-affected, and the planner emits codemod scaffolds to `.design/migration/` through the pure [`codemod-gen`](scripts/lib/migration/codemod-gen.cjs) — which produces jscodeshift/ast-grep template **text only** (it never imports or runs a codemod engine). [`design-verifier`](agents/design-verifier.md) then treats an in-flight migration as a contract: visual-diff within threshold, component API surface unchanged, tests green, and an unmigrated high-impact rule is a gap. **Proposal-only, no new runtime dependency, no new egress.**
184
184
 
185
+ ### Long-horizon cost governance (v1.39.2)
186
+
187
+ GDD already tracks cost per task and per runtime — now it **forecasts** it, **caps** it at the project level, and shows whether the spend **shipped**. [`/gdd:budget`](skills/budget/SKILL.md) groups `costs.jsonl` by cycle and (via [`cost-forecaster`](agents/cost-forecaster.md) → the pure [`cost-forecast`](scripts/lib/budget/cost-forecast.cjs)) projects the next N cycles in **best / typical / worst** scenarios — "at the current rate you'll hit your $X project cap in Y cycles." A new `budget.json.project_cap_usd` adds a **project-level hard cap**: the [`budget-enforcer`](hooks/budget-enforcer.ts) hook warns at 50% + 80% and **gracefully halts** the next agent spawn at 100% (via the pure [`project-cap`](scripts/lib/budget/project-cap.cjs) classifier) — **disabled by default**, so existing users are unaffected. [`/gdd:roi`](skills/roi/SKILL.md) joins per-cycle cost with commits that shipped (survived ≥ 14 days) vs reverted into a cost-per-shipped-commit table ([`roi`](scripts/lib/budget/roi.cjs)). **No new runtime dependency, no new egress** — the hook only ever blocks, never spends.
188
+
185
189
  ### Previous releases
186
190
 
187
191
  - **v1.26.0** — Headless Model Resolver (per-runtime tier→model map, `resolved_models` router field, per-runtime price tables, `reasoning-class` runtime-neutral alias).
package/SKILL.md CHANGED
@@ -102,6 +102,8 @@ Each stage produces artifacts in `.design/` inside the current project.
102
102
  | `export <cycle> --format html\|pdf\|notion [--pseudonymize] [--pr]` | `get-design-done:gdd-export` | Phase 35.5 — package a finished cycle's design output into a stakeholder-shareable artifact (self-contained HTML / Paged.js-print PDF / Notion page); redacts always, `--pseudonymize` masks identity for external sharing, `--pr` posts the HTML preview via pr-commenter |
103
103
  | `bootstrap-ds [--primary <color>] [--secondary <color>] [--tone <tags>] [--framework <t>]` | `get-design-done:gdd-bootstrap-ds` | Phase 37.2 — bootstrap a design system for a GREENFIELD project (no DS): brand input → OKLCH token system (color tints + modular type + 4pt/8pt spacing + radius/motion) in 3 variants to pick, then button/input/card proof scaffolding via `ds-generator` |
104
104
  | `rollout-status [<cycle>] [--all] [--stuck]` | `get-design-done:gdd-rollout-status` | Phase 38.5 — track a shipped cycle's production rollout (unrolled / staging-only / canary-N% / prod-100%) by reading the feature-flag service via `rollout-coordinator`; surfaces STUCK rollouts; feeds `design_arms` by deployed %. Read-only — never advances or rolls back |
105
+ | `budget [--cycles N] [--scenario best\|typical\|worst]` | `get-design-done:gdd-budget` | Phase 39.2 — forecast design-cycle spend (best/typical/worst from telemetry variance) via `cost-forecaster`; "at the current rate you'll hit your $X project cap in Y cycles." Read-only — never spends, edits `budget.json`, or halts (the budget-enforcer hook halts) |
106
+ | `roi [--since <date>] [--window-days 14]` | `get-design-done:gdd-roi` | Phase 39.2 — ROI table joining per-cycle cost with commits that shipped (survived ≥14d) vs reverted → cost-per-shipped-commit + stick rate. Read-only markdown report |
105
107
 
106
108
  ## Handoff Routing
107
109
 
@@ -0,0 +1,91 @@
1
+ ---
2
+ name: cost-forecaster
3
+ description: Forecasts GDD spend over the next N design cycles. Reads .design/telemetry/costs.jsonl (grouping est_cost_usd by cycle) plus the configured .design/budget.json caps, runs the pure scripts/lib/budget/cost-forecast.cjs model (best/typical/worst from the variance of historical per-cycle rates), and reports "at the current rate you'll hit your project_cap in Y cycles." Supports --scenario best|typical|worst. Report-only — it never writes budget.json, never spends, never halts (the budget-enforcer hook halts). Spawned by /gdd:budget.
4
+ tools: Read, Bash, Grep, Glob
5
+ color: green
6
+ default-tier: sonnet
7
+ tier-rationale: "Groups a JSONL ledger by cycle and runs a pure projection helper, then narrates the result; bounded arithmetic + reporting, no design judgment — sonnet-tier."
8
+ size_budget: M
9
+ size_budget_rationale: "Honest tier sized to the ~95-line body. DELEGATES the projection math to scripts/lib/budget/cost-forecast.cjs and the contract to reference/cost-governance.md — the rollout-coordinator → rollout-status.cjs precedent."
10
+ parallel-safe: false
11
+ typical-duration-seconds: 30
12
+ reads-only: true
13
+ required_reading:
14
+ - "reference/cost-governance.md"
15
+ writes:
16
+ - ".design/telemetry/events.jsonl (a budget_forecast event only — append, no mutation)"
17
+ ---
18
+
19
+ # cost-forecaster
20
+
21
+ You forecast GDD's design-cycle spend so the user sees a cost trajectory **before** the bill arrives.
22
+ You are **report-only**: you read telemetry, run a pure model, and narrate. You never edit
23
+ `budget.json`, never spend, and never block a spawn — the Phase 25 budget-enforcer hook is the only
24
+ thing that halts.
25
+
26
+ **Read `reference/cost-governance.md` first** — it is the contract for the model, the scenarios, and
27
+ the `project_cap` semantics.
28
+
29
+ ## Inputs
30
+
31
+ - **`.design/telemetry/costs.jsonl`** — one row per agent spawn: `{ ts, agent, tier, est_cost_usd,
32
+ cycle, phase, ... }`. The **`cycle`** field is the grouping key.
33
+ - **`.design/budget.json`** — `project_cap_usd` (the ceiling to forecast against; `0`/absent ⇒ no
34
+ project cap configured, so report the trajectory without a "cycles to cap" line).
35
+ - **`--scenario best|typical|worst`** (default `typical`) and **`--cycles N`** (default `5`).
36
+
37
+ ## Procedure
38
+
39
+ 1. **Group spend by cycle.** Read `costs.jsonl`; sum `est_cost_usd` per distinct `cycle` value, in
40
+ chronological order. This yields the array of per-cycle USD totals. If there are 0 cycles, say so
41
+ and stop (nothing to forecast).
42
+ 2. **Run the model.** Call the pure helper — do the math in the lib, never by hand:
43
+
44
+ ```bash
45
+ node -e '
46
+ const { forecast, cyclesToCap } = require("./scripts/lib/budget/cost-forecast.cjs");
47
+ const perCycle = JSON.parse(process.argv[1]); // e.g. [10.2, 12.0, 8.4]
48
+ const f = forecast(perCycle, { nCycles: Number(process.argv[2]||5), scenario: process.argv[3]||"typical" });
49
+ const cap = Number(process.argv[4]||0);
50
+ const toCap = cap > 0 ? cyclesToCap(perCycle.reduce((a,b)=>a+b,0), cap, f.perCycle) : null;
51
+ console.log(JSON.stringify({ ...f, toCap }));
52
+ ' "$PER_CYCLE_JSON" "$N" "$SCENARIO" "$PROJECT_CAP"
53
+ ```
54
+
55
+ 3. **Report.** Print a short markdown summary:
56
+ - the chosen scenario + its per-cycle rate, and the best/typical/worst band (`low`/`high`);
57
+ - the projected total over the next N cycles;
58
+ - if `project_cap_usd > 0`: **"at the `<scenario>` rate (~$X/cycle) you'll reach your
59
+ $`<cap>` project cap in `<toCap>` cycles"** (or "never, spend is trending flat/down" when
60
+ `toCap` is `Infinity`).
61
+ 4. **Emit one event.** Append a `budget_forecast` event to `.design/telemetry/events.jsonl` with
62
+ payload `{ scenario, perCycle, projectedTotal, cyclesToCap }` (PII-free). Append only — never
63
+ rewrite the stream.
64
+
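The grouping in step 1 can be sketched as follows. This is a minimal illustration, not the agent's actual code: `groupSpendByCycle` is a hypothetical helper name, it takes the ledger as a string so it needs no file I/O, and it assumes rows are already in chronological order so that first-seen cycle order is chronological. Skipping rows that lack a `cycle` value is also an assumption.

```javascript
// Hypothetical helper (not part of the package): sums est_cost_usd per cycle
// from the raw text of a costs.jsonl ledger.
function groupSpendByCycle(jsonlText) {
  const totals = new Map(); // a Map preserves first-seen (insertion) order
  for (const line of jsonlText.split(/\r?\n/)) {
    if (!line.trim()) continue;
    let row;
    try {
      row = JSON.parse(line);
    } catch {
      continue; // tolerate malformed lines, as the hook's ledger replay does
    }
    // Assumption: rows without a cycle value are ignored.
    if (row === null || typeof row !== "object" || row.cycle == null) continue;
    const cost = Number(row.est_cost_usd ?? 0);
    if (!Number.isFinite(cost)) continue;
    const key = String(row.cycle);
    totals.set(key, (totals.get(key) ?? 0) + cost);
  }
  return [...totals.values()]; // per-cycle USD totals, in first-seen order
}

const sampleLedger = [
  '{"cycle":"c1","est_cost_usd":1.5}',
  "not json",
  '{"cycle":"c1","est_cost_usd":2}',
  '{"cycle":"c2","est_cost_usd":2}',
].join("\n");

console.log(JSON.stringify(groupSpendByCycle(sampleLedger)));
```

The resulting array is the shape that the `node -e` call in step 2 receives as `$PER_CYCLE_JSON`.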
65
+ ## Scenarios (from `cost-forecast.cjs`, D-05)
66
+
67
+ | `--scenario` | per-cycle rate | reads as |
68
+ |---|---|---|
69
+ | `best` | `max(0, mean − k·σ)` | spend trending down / favorable variance |
70
+ | `typical` | `mean` | steady state (default) |
71
+ | `worst` | `mean + k·σ` | spend trending up / unfavorable variance |
72
+
73
+ `k = 1`. The projection is linear on the chosen rate. Always show the band, not just the point —
74
+ a wide best↔worst gap is itself the signal that spend is volatile.
75
+
76
+ ## Record
77
+
78
+ At run-end, print a `## Cost forecast` summary — the scenario, the per-cycle rate + band, the
79
+ projected next-N-cycle total, and the cycles-to-cap line (when a `project_cap_usd` is set). Then
80
+ append one JSONL line to `.design/intel/insights.jsonl` (per `reference/schemas/insight-line.schema.json`)
81
+ recording the forecast `{ scenario, perCycle, projectedTotal, cyclesToCap }`. Close with:
82
+
83
+ ```
84
+ ## COST FORECAST COMPLETE
85
+ ```
86
+
87
+ ## Boundaries
88
+
89
+ - Forecast is **cycle-scoped**, never per-agent-call.
90
+ - You **report**; you do not act. Setting or raising `project_cap_usd` is the user's call.
91
+ - No network. No external services. Pure local telemetry + a pure helper.
@@ -207,6 +207,27 @@ const tierResolverOpenRouter = nodeRequire(
207
207
  '../scripts/lib/tier-resolver-openrouter.cjs',
208
208
  ) as TierResolverOpenRouterModule;
209
209
 
210
+ // Phase 39.2 D-04: project-level cap classifier (pure). Keeping the threshold
211
+ // math in scripts/lib/budget/project-cap.cjs (out of this hook) mirrors how the
212
+ // hook already delegates cost computation to scripts/lib/budget-enforcer.cjs,
213
+ // and makes the 50/80/100 thresholds unit-testable. The hook only reads the
214
+ // running project spend and asks this module what to do.
215
+ interface ProjectCapClassification {
216
+ enabled: boolean;
217
+ pct: number;
218
+ level: 'ok' | 'warn-50' | 'warn-80' | 'halt';
219
+ cap: number;
220
+ spend: number;
221
+ }
222
+ interface ProjectCapModule {
223
+ classifyProjectBudget(spendUsd: number, capUsd: number): ProjectCapClassification;
224
+ shouldHalt(c: ProjectCapClassification | null, enforcementMode: string): boolean;
225
+ capMessage(c: ProjectCapClassification | null): string | null;
226
+ }
227
+ const projectCap = nodeRequire(
228
+ '../scripts/lib/budget/project-cap.cjs',
229
+ ) as ProjectCapModule;
230
+
210
231
  /**
211
232
  * Plan 33.6-03 (SC#6 opt-in). OpenRouter is consulted ONLY when the user opts
212
233
  * in — either `.design/config.json#openrouter_enabled === true` OR
@@ -380,6 +401,15 @@ const PHASE_TOTALS_PATH = join(
380
401
  'telemetry',
381
402
  'phase-totals.json',
382
403
  );
404
+ // Phase 39.2 D-04: optional fast-path for the running project spend, mirroring
405
+ // PHASE_TOTALS_PATH. When absent the hook replays costs.jsonl (the project cap
406
+ // is opt-in, so this replay only happens for users who set project_cap_usd).
407
+ const PROJECT_TOTALS_PATH = join(
408
+ process.cwd(),
409
+ '.design',
410
+ 'telemetry',
411
+ 'project-totals.json',
412
+ );
383
413
  const STATE_PATH = join(process.cwd(), '.design', 'STATE.md');
384
414
 
385
415
  /** Defaults per D-12 — mirror scripts/bootstrap.sh budget.json bootstrap. */
@@ -392,6 +422,7 @@ const BUDGET_DEFAULTS: Required<
392
422
  | 'auto_downgrade_on_cap'
393
423
  | 'cache_ttl_seconds'
394
424
  | 'enforcement_mode'
425
+ | 'project_cap_usd'
395
426
  >
396
427
  > = {
397
428
  per_task_cap_usd: 2.0,
@@ -400,6 +431,11 @@ const BUDGET_DEFAULTS: Required<
400
431
  auto_downgrade_on_cap: true,
401
432
  cache_ttl_seconds: 3600,
402
433
  enforcement_mode: 'enforce',
434
+ // Phase 39.2 D-04: project-level cap is DISABLED by default (0). Existing
435
+ // users — who have no project_cap_usd in budget.json — see zero behavior
436
+ // change. project_cap_enforcement_mode stays optional and falls back to
437
+ // enforcement_mode at the use-site.
438
+ project_cap_usd: 0,
403
439
  };
404
440
 
405
441
  /**
@@ -504,6 +540,40 @@ export function currentPhaseSpend(phase: string): number {
504
540
  return sum;
505
541
  }
506
542
 
543
+ // ── cumulative project spend (Phase 39.2 D-04) ───────────────────────────────
544
+
545
+ /**
546
+ * Total project spend = sum of est_cost_usd across the WHOLE costs.jsonl ledger.
547
+ * Fast path: a `project-totals.json` (`{ total: number }`, written by the
548
+ * aggregator) mirrors the WR-02 phase-totals optimization. Falls back to a full
549
+ * ledger replay otherwise. Returns 0 on any error. Only ever consulted when
550
+ * project_cap_usd > 0, so the replay cost is paid only by opt-in users.
551
+ */
552
+ export function currentProjectSpend(): number {
553
+ if (existsSync(PROJECT_TOTALS_PATH)) {
554
+ try {
555
+ const data = JSON.parse(readFileSync(PROJECT_TOTALS_PATH, 'utf8')) as { total?: number };
556
+ return Number(data.total ?? 0);
557
+ } catch {
558
+ // fall through to replay
559
+ }
560
+ }
561
+ if (!existsSync(TELEMETRY_PATH)) return 0;
562
+ const lines = readFileSync(TELEMETRY_PATH, 'utf8')
563
+ .split(/\r?\n/)
564
+ .filter(Boolean);
565
+ let sum = 0;
566
+ for (const line of lines) {
567
+ try {
568
+ const row = JSON.parse(line) as { est_cost_usd?: number };
569
+ sum += Number(row.est_cost_usd ?? 0);
570
+ } catch {
571
+ // tolerant — skip malformed lines
572
+ }
573
+ }
574
+ return sum;
575
+ }
576
+
507
577
  // ── cycle + phase reader (STATE.md frontmatter) ─────────────────────────────
508
578
 
509
579
  /**
@@ -985,6 +1055,82 @@ export async function main(): Promise<void> {
985
1055
  const estCost = Number(toolInput._est_cost_usd ?? 0);
986
1056
  const phaseSpend = currentPhaseSpend(phase);
987
1057
 
1058
+ // ── Phase 39.2 D-04: project-level cap ─────────────────────────────────────
1059
+ //
1060
+ // The 50%/80% warnings always fire; only the 100% halt is gated by
1061
+ // project_cap_enforcement_mode (falling back to enforcement_mode when unset).
1062
+ // No-op when project_cap_usd <= 0 (the opt-in default), so existing users see
1063
+ // zero change. Checked here, before the per-task/per-phase branches, so a
1064
+ // project-level breach halts the NEXT spawn regardless of the per-scope caps —
1065
+ // the graceful halt (the current stage's in-flight spawns already ran).
1066
+ if (budget.project_cap_usd > 0) {
1067
+ const projectSpend = currentProjectSpend();
1068
+ const projClass = projectCap.classifyProjectBudget(
1069
+ projectSpend + estCost,
1070
+ budget.project_cap_usd,
1071
+ );
1072
+ const projMode = budget.project_cap_enforcement_mode ?? budget.enforcement_mode;
1073
+ if (projClass.level === 'warn-50' || projClass.level === 'warn-80') {
1074
+ try {
1075
+ appendEvent({
1076
+ type: 'project_cap_warning',
1077
+ timestamp: new Date().toISOString(),
1078
+ sessionId: getSessionId(),
1079
+ ...(cycle !== undefined && cycle !== 'unknown' ? { cycle } : {}),
1080
+ payload: {
1081
+ pct: projClass.pct,
1082
+ spend: projClass.spend,
1083
+ cap: projClass.cap,
1084
+ level: projClass.level,
1085
+ },
1086
+ } as unknown as HookFiredEvent);
1087
+ } catch {
1088
+ // fail-open — event-stream errors never block the hook.
1089
+ }
1090
+ process.stderr.write(`gdd-budget-enforcer WARN: ${projectCap.capMessage(projClass)}\n`);
1091
+ } else if (projClass.level === 'halt') {
1092
+ try {
1093
+ appendEvent({
1094
+ type: 'project_cap_halt',
1095
+ timestamp: new Date().toISOString(),
1096
+ sessionId: getSessionId(),
1097
+ ...(cycle !== undefined && cycle !== 'unknown' ? { cycle } : {}),
1098
+ payload: {
1099
+ pct: projClass.pct,
1100
+ spend: projClass.spend,
1101
+ cap: projClass.cap,
1102
+ enforcementMode: projMode,
1103
+ },
1104
+ } as unknown as HookFiredEvent);
1105
+ } catch {
1106
+ // fail-open.
1107
+ }
1108
+ if (projectCap.shouldHalt(projClass, projMode)) {
1109
+ writeTelemetry({
1110
+ agent,
1111
+ tier: toolInput._tier_override ?? toolInput._default_tier ?? 'sonnet',
1112
+ tokens_in: Number(toolInput._tokens_in_est ?? 0),
1113
+ tokens_out: Number(toolInput._tokens_out_est ?? 0),
1114
+ cache_hit: false,
1115
+ est_cost_usd: estCost,
1116
+ enforcement_mode: projMode,
1117
+ block_reason: 'project_cap',
1118
+ _cyclePhase: cyclePhase,
1119
+ });
1120
+ emitHookFired('block', cycle);
1121
+ const response: ToolOutput = {
1122
+ continue: false,
1123
+ suppressOutput: false,
1124
+ message: `Project budget cap reached: $${projClass.spend.toFixed(2)} of $${budget.project_cap_usd.toFixed(2)} (${projClass.pct.toFixed(0)}%). Raise project_cap_usd in .design/budget.json, or set project_cap_enforcement_mode to "warn" to keep going. (Graceful halt — the current stage's earlier spawns already completed; this blocks the next one.)`,
1125
+ };
1126
+ process.stdout.write(JSON.stringify(response));
1127
+ return;
1128
+ }
1129
+ // warn / log mode: surface the 100% breach but allow the spawn.
1130
+ process.stderr.write(`gdd-budget-enforcer WARN: ${projectCap.capMessage(projClass)}\n`);
1131
+ }
1132
+ }
1133
+
988
1134
  // Phase 25 / D-05: per-spawn cap is class-specific when
989
1135
  // complexity_class is present and class_caps_usd[class] is defined.
990
1136
  // Falls back to per_task_cap_usd for backwards compatibility — when
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@hegemonart/get-design-done",
3
- "version": "1.39.1",
3
+ "version": "1.39.2",
4
4
  "description": "A design-quality pipeline for AI coding agents: brief, plan, implement, and verify UI work against your design system.",
5
5
  "author": "Hegemon",
6
6
  "homepage": "https://github.com/hegemonart/get-design-done",
@@ -0,0 +1,93 @@
1
+ # Cost Governance — Forecast, Project Cap, and ROI
2
+
3
+ Phase 39.2 contract. GDD already tracks cost (Phase 10.1 per-task caps, Phase 26 per-runtime
4
+ telemetry, Phase 27.5 bandit cost-arbitrage) — but it never *forecasts* spend, never imposes a
5
+ *project-level* hard cap, and never shows whether the spend actually *shipped* anything. This file is
6
+ the contract for the three pieces that close those gaps: the **forecast model**, the **`project_cap`
7
+ hard-halt**, and the **ROI dashboard**. All three are read-only/report-only except the hook, which
8
+ only ever *blocks* a spawn — it never spends, edits config, or mutates telemetry.
9
+
10
+ ## Telemetry inputs
11
+
12
+ - **`.design/telemetry/costs.jsonl`** (OPT-09) — one row per agent spawn:
13
+ `{ ts, agent, tier, tokens_in, tokens_out, cache_hit, est_cost_usd, cycle, phase }`.
14
+ The **`cycle`** field is the join key: grouping `est_cost_usd` by `cycle` gives per-cycle USD totals.
15
+ - **`.design/telemetry/events.jsonl`** — the event stream; this phase appends three new `type`s
16
+ (below).
17
+ - **Cycle identity** — `.design/STATE.md` frontmatter `cycle:`. There is no `CYCLES.md`; per-cycle
18
+ commit counts are computed on demand from `git log` (the `/gdd:stats` precedent).
19
+
20
+ ## Forecast model (`scripts/lib/budget/cost-forecast.cjs`, pure)
21
+
22
+ Group `costs.jsonl` by `cycle` → an array of per-cycle USD totals. From the **mean** `m` and
23
+ **population standard deviation** `σ` of those rates, the three scenarios are:
24
+
25
+ | Scenario | Per-cycle rate | Meaning |
26
+ |---|---|---|
27
+ | `best` | `max(0, m − k·σ)` | spend trends down / variance favorable |
28
+ | `typical` | `m` | steady state |
29
+ | `worst` | `m + k·σ` | spend trends up / variance unfavorable |
30
+
31
+ `k = 1` by default. The projection over the next `N` cycles is linear: `projectedTotal = rate · N`.
32
+ `cyclesToCap(currentSpend, cap, rate)` returns the integer number of cycles until `currentSpend`
33
+ reaches `cap` at that rate — `Infinity` when `rate ≤ 0`, `0` when already at/over the cap. This powers
34
+ the `/gdd:budget` warning **"at the current rate you'll hit your $X project cap in Y cycles."**
35
+
36
+ The math is a pure, dep-free, deterministic core (no fs, no clock, no randomness) — `agents/cost-forecaster.md`
37
+ and `/gdd:budget` read the telemetry and hand the grouped totals in. `--scenario best|typical|worst`
38
+ selects the rate.
39
+
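The forecast model described above can be restated as a short sketch. The function names `sketchForecast` and `sketchCyclesToCap` are hypothetical, and the return shape is an assumption; the real `cost-forecast.cjs` may differ. Rounding the cycle count up with `Math.ceil`, and checking "already at the cap" before "rate ≤ 0", are also assumptions that the contract text does not spell out.

```javascript
// Illustrative restatement of the documented model, not the package's code.
function sketchForecast(perCycleTotals, { nCycles = 5, scenario = "typical", k = 1 } = {}) {
  const n = perCycleTotals.length;
  const mean = n === 0 ? 0 : perCycleTotals.reduce((a, b) => a + b, 0) / n;
  // Population standard deviation: divide by n, not n - 1.
  const variance =
    n === 0 ? 0 : perCycleTotals.reduce((acc, x) => acc + (x - mean) ** 2, 0) / n;
  const sigma = Math.sqrt(variance);
  const rates = {
    best: Math.max(0, mean - k * sigma),
    typical: mean,
    worst: mean + k * sigma,
  };
  // Assumption: an unrecognized scenario falls back to the typical rate.
  const rate = rates[scenario] ?? rates.typical;
  return { scenario, rate, low: rates.best, high: rates.worst, projectedTotal: rate * nCycles };
}

function sketchCyclesToCap(currentSpend, cap, rate) {
  if (currentSpend >= cap) return 0; // already at or over the cap
  if (!(rate > 0)) return Infinity; // flat or falling spend never reaches the cap
  return Math.ceil((cap - currentSpend) / rate); // whole cycles, rounded up
}

const worstCase = sketchForecast([8, 12], { nCycles: 5, scenario: "worst" });
console.log(worstCase.rate, worstCase.projectedTotal, sketchCyclesToCap(20, 100, worstCase.rate));
```

With per-cycle totals of 8 and 12, the mean is 10 and the population σ is 2, so the band runs from 8 to 12 and the worst-case five-cycle projection is 60.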
40
+ ## Project cap (`scripts/lib/budget/project-cap.cjs` + `hooks/budget-enforcer.ts`)
41
+
42
+ A **project-level** hard cap, distinct from the existing per-task and per-phase caps. Config lives in
43
+ `.design/budget.json`:
44
+
45
+ | Key | Type | Default | Meaning |
46
+ |---|---|---|---|
47
+ | `project_cap_usd` | number ≥ 0 | `0` (disabled) | Total project spend ceiling (USD). |
48
+ | `project_cap_enforcement_mode` | `enforce` \| `warn` \| `log` | falls back to `enforcement_mode` | How a breach is handled. |
49
+
50
+ **Disabled by default.** A cap of `0` (or absent / non-finite) means *no project cap* — existing
51
+ users see zero behavior change. The classifier `classifyProjectBudget(spend, cap)` returns a level:
52
+
53
+ | Running spend vs cap | Level | Hook behavior |
54
+ |---|---|---|
55
+ | `< 50%` | `ok` | nothing |
56
+ | `≥ 50%` and `< 80%` | `warn-50` | emit `project_cap_warning`, print, allow |
57
+ | `≥ 80%` and `< 100%` | `warn-80` | emit `project_cap_warning`, print, allow |
58
+ | `≥ 100%` | `halt` | emit `project_cap_halt`; under `enforce`, block the spawn |
59
+
60
+ The cap is enforced in the **PreToolUse:Agent** hook, so the halt is **graceful**: it blocks the
61
+ *next* agent spawn, letting the current pipeline stage finish. Under `warn`/`log` mode a `halt`-level
62
+ breach prints/records but still allows the spawn (advisory). Running project spend is the sum of
63
+ `est_cost_usd` across all `costs.jsonl` rows (a `project-totals.json` fast-path mirrors the Phase 10.1
64
+ `phase-totals.json` optimization).
65
+
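The threshold table above can be sketched as a pure classifier. The names `sketchClassifyProjectBudget` and `sketchShouldHalt` are hypothetical stand-ins; the field names follow the `ProjectCapClassification` interface declared in the hook, but the bodies are an assumption about how `project-cap.cjs` might behave, not its actual source.

```javascript
// Illustrative classifier, assuming pct is expressed on a 0-100+ scale
// (the hook formats it with a trailing "%").
function sketchClassifyProjectBudget(spendUsd, capUsd) {
  const spend = Number.isFinite(spendUsd) ? spendUsd : 0;
  // A cap of 0, a negative cap, or a non-finite cap means "no project cap".
  if (!Number.isFinite(capUsd) || capUsd <= 0) {
    return { enabled: false, pct: 0, level: "ok", cap: 0, spend };
  }
  const pct = (spend * 100) / capUsd;
  let level = "ok";
  if (pct >= 100) level = "halt";
  else if (pct >= 80) level = "warn-80";
  else if (pct >= 50) level = "warn-50";
  return { enabled: true, pct, level, cap: capUsd, spend };
}

// Only the "enforce" mode turns a halt-level breach into a blocked spawn;
// "warn" and "log" stay advisory.
function sketchShouldHalt(classification, enforcementMode) {
  return (
    Boolean(classification) &&
    classification.level === "halt" &&
    enforcementMode === "enforce"
  );
}

for (const spend of [49, 50, 80, 100]) {
  console.log(spend, sketchClassifyProjectBudget(spend, 100).level);
}
```

Keeping the thresholds in a function with no I/O is what makes them unit-testable, which is the rationale the hook's own comment gives for delegating to `project-cap.cjs`.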
66
+ ## ROI dashboard (`scripts/lib/budget/roi.cjs`, pure + `/gdd:roi`)
67
+
68
+ Joins per-cycle cost with what actually shipped. **"Shipped"** = a commit that **survived ≥ 14 days**
69
+ in `main` (the ROADMAP default — a longer window catches revert-after-bug-discovery); a commit
70
+ reverted inside that window counts as `reverted`. `/gdd:roi` shells `git log` per cycle for the
71
+ shipped/reverted counts and reads per-cycle cost from `costs.jsonl`; `roi.cjs` computes:
72
+
73
+ - `costPerShipped = costUsd / max(shipped, 1)` — USD per commit that stuck.
74
+ - `stickRate = shipped / max(shipped + reverted, 1)` — fraction of commits that survived.
75
+
76
+ Output is a markdown table (cycle · cost · shipped · reverted · $/shipped · stick rate) plus a TOTAL
77
+ row. Markdown only — no GUI.
78
+
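The two ROI formulas above can be sketched as follows. `sketchComputeRoi` is a hypothetical name and the input row shape (`cycle`, `costUsd`, `shipped`, `reverted`) is an assumption. The contract does not say how the TOTAL row is derived; this sketch assumes it sums the columns and applies the same two formulas.

```javascript
// Illustrative ROI join, not the package's roi.cjs.
function sketchRoiRow({ cycle, costUsd, shipped, reverted }) {
  return {
    cycle,
    costUsd,
    shipped,
    reverted,
    // max(..., 1) avoids division by zero when nothing shipped or was reverted.
    costPerShipped: costUsd / Math.max(shipped, 1),
    stickRate: shipped / Math.max(shipped + reverted, 1),
  };
}

function sketchComputeRoi(rows) {
  const perCycle = rows.map(sketchRoiRow);
  // Assumption: TOTAL is the column sums run through the same formulas.
  const total = sketchRoiRow(
    rows.reduce(
      (acc, r) => ({
        cycle: "TOTAL",
        costUsd: acc.costUsd + r.costUsd,
        shipped: acc.shipped + r.shipped,
        reverted: acc.reverted + r.reverted,
      }),
      { cycle: "TOTAL", costUsd: 0, shipped: 0, reverted: 0 },
    ),
  );
  return { perCycle, total };
}

const roi = sketchComputeRoi([
  { cycle: "c1", costUsd: 30, shipped: 3, reverted: 1 },
  { cycle: "c2", costUsd: 12, shipped: 0, reverted: 0 },
]);
console.log(JSON.stringify(roi.total));
```

Note the effect of the `max(shipped, 1)` guard: a cycle that shipped nothing reports its full cost as the cost per shipped commit, with a stick rate of 0.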
79
+ ## Events
80
+
81
+ Three new free-form `type`s on `.design/telemetry/events.jsonl`:
82
+
83
+ | Type | Emitted by | Payload (PII-free) |
84
+ |---|---|---|
85
+ | `budget_forecast` | `cost-forecaster` / `/gdd:budget` | `{ scenario, perCycle, projectedTotal, cyclesToCap }` |
86
+ | `project_cap_warning` | budget-enforcer hook | `{ pct, spend, cap, level }` at `warn-50` / `warn-80` |
87
+ | `project_cap_halt` | budget-enforcer hook | `{ pct, spend, cap, enforcementMode }` at `halt` |
88
+
89
+ ## Boundaries
90
+
91
+ Forecast is **cycle-scoped** (not per-agent-call). The cap **halts**; it never spends or auto-tunes.
92
+ ROI is **markdown**, not a GUI. Nothing here writes `budget.json` — the user sets the cap; GDD only
93
+ reads, forecasts, warns, and (at 100% under `enforce`) blocks the next spawn.
@@ -1021,6 +1021,13 @@
1021
1021
  "type": "heuristic",
1022
1022
  "phase": 39.1,
1023
1023
  "description": "Phase 39.1 migration rule library — Material Design token migration (M3→next), grounded in the real M2→M3 token-system patterns (md.sys.color/typescale roles, @material/web mwc-→md-) — no fabricated M4 spec. Rules + Detection + Impact; codemod-gen-consumable."
1024
+ },
1025
+ {
1026
+ "name": "cost-governance",
1027
+ "path": "reference/cost-governance.md",
1028
+ "type": "heuristic",
1029
+ "phase": 39.2,
1030
+ "description": "Phase 39.2 cost-governance contract: the per-cycle forecast model (best/typical/worst from mean ± k·σ, cyclesToCap) via scripts/lib/budget/cost-forecast.cjs; the project_cap hard-halt (disabled by default, graceful PreToolUse:Agent block, warn 50/80 + halt 100) via scripts/lib/budget/project-cap.cjs + hooks/budget-enforcer.ts; the ROI dashboard (shipped = surviving >=14d, cost-per-shipped-commit) via scripts/lib/budget/roi.cjs; and the budget_forecast/project_cap_warning/project_cap_halt events. Agent agents/cost-forecaster.md; skills /gdd:budget + /gdd:roi. Read/report-only — the hook only blocks, never spends."
1024
1031
  }
1025
1032
  ]
1026
1033
  }
@@ -37,6 +37,16 @@
37
37
  "type": "string",
38
38
  "enum": ["enforce", "warn", "log"],
39
39
  "description": "D-11 enforcement policy. enforce = block + auto-downgrade; warn = print warnings but allow spawn; log = advisory-only telemetry without gating."
40
+ },
41
+ "project_cap_usd": {
42
+ "type": "number",
43
+ "minimum": 0,
44
+ "description": "Phase 39.2 D-04 — project-level hard cap (USD) across the whole project's costs.jsonl. 0 or absent = DISABLED (no project-level enforcement; zero behavior change for existing users). When > 0, hooks/budget-enforcer.ts warns at 50% + 80% of this cap and (under project_cap_enforcement_mode=enforce) hard-halts the next PreToolUse:Agent spawn at 100%. Distinct from per_task_cap_usd / per_phase_cap_usd."
45
+ },
46
+ "project_cap_enforcement_mode": {
47
+ "type": "string",
48
+ "enum": ["enforce", "warn", "log"],
49
+ "description": "Phase 39.2 D-04 — enforcement policy for project_cap_usd specifically. enforce = hard-halt at 100%; warn = print at 100% but allow; log = advisory telemetry only. Falls back to enforcement_mode when absent."
40
50
  }
41
51
  }
42
52
  }
@@ -10,7 +10,7 @@
10
10
  "type": {
11
11
  "type": "string",
12
12
  "minLength": 1,
13
- "description": "Free-form event type identifier. Pre-registered seeds: state.mutation, state.transition, stage.entered, stage.exited, hook.fired, error, capability_gap, kfm-candidate, router_pick, verify_outcome, rollout_started, rollout_advanced, rollout_stuck."
13
+ "description": "Free-form event type identifier. Pre-registered seeds: state.mutation, state.transition, stage.entered, stage.exited, hook.fired, error, capability_gap, kfm-candidate, router_pick, verify_outcome, rollout_started, rollout_advanced, rollout_stuck, budget_forecast, project_cap_warning, project_cap_halt."
14
14
  },
15
15
  "timestamp": {
16
16
  "type": "string",
@@ -58,6 +58,14 @@ export interface DesignBudgetJson {
58
58
  * D-11 enforcement policy. enforce = block + auto-downgrade; warn = print warnings but allow spawn; log = advisory-only telemetry without gating.
59
59
  */
60
60
  enforcement_mode?: 'enforce' | 'warn' | 'log';
61
+ /**
62
+ * Phase 39.2 D-04 — project-level hard cap (USD) across the whole project's costs.jsonl. 0 or absent = DISABLED (no project-level enforcement; zero behavior change for existing users). When > 0, hooks/budget-enforcer.ts warns at 50% + 80% of this cap and (under project_cap_enforcement_mode=enforce) hard-halts the next PreToolUse:Agent spawn at 100%. Distinct from per_task_cap_usd / per_phase_cap_usd.
63
+ */
64
+ project_cap_usd?: number;
65
+ /**
66
+ * Phase 39.2 D-04 — enforcement policy for project_cap_usd specifically. enforce = hard-halt at 100%; warn = print at 100% but allow; log = advisory telemetry only. Falls back to enforcement_mode when absent.
67
+ */
68
+ project_cap_enforcement_mode?: 'enforce' | 'warn' | 'log';
61
69
  [k: string]: unknown;
62
70
  }
63
71
 
@@ -106,7 +114,7 @@ export type Event = {
106
114
  [k: string]: unknown;
107
115
  } & {
108
116
  /**
109
- * Free-form event type identifier. Pre-registered seeds: state.mutation, state.transition, stage.entered, stage.exited, hook.fired, error, capability_gap.
117
+ * Free-form event type identifier. Pre-registered seeds: state.mutation, state.transition, stage.entered, stage.exited, hook.fired, error, capability_gap, kfm-candidate, router_pick, verify_outcome, rollout_started, rollout_advanced, rollout_stuck, budget_forecast, project_cap_warning, project_cap_halt.
110
118
  */
111
119
  type: string;
112
120
  /**
@@ -581,6 +589,62 @@ export interface ClaudePluginJson {
581
589
 
582
590
  export type PluginSchema = ClaudePluginJson;
583
591
 
592
+ // ---- pressure-scenario.schema.json ----
593
+ /**
594
+ * Contract for a Phase-33 skill-behavior pressure-scenario manifest. The runner (scripts/lib/skill-behavior/runner.cjs) loads manifests conforming to this schema, spawns a subagent against `setup_prompt` under the named `pressures`, and validates the response against the `expected_compliance` / `expected_violations` regex sources (compiled with new RegExp(source)). The 5-value `pressures` enum and the required-field set come verbatim from ROADMAP Phase-33 SC#2.
595
+ */
596
+ export interface PressureScenarioManifest {
597
+ /**
598
+ * Unique scenario identifier, e.g. "brief-time-pressure".
599
+ */
600
+ name: string;
601
+ /**
602
+ * The skill under test, e.g. "brief", "explore", "plan", "using-gdd".
603
+ */
604
+ target_skill: string;
605
+ /**
606
+ * One or more pressure vectors applied in the setup_prompt.
607
+ *
608
+ * @minItems 1
609
+ */
610
+ pressures: [
611
+ 'time' | 'sunk-cost' | 'authority' | 'exhaustion' | 'scope-minimization',
612
+ ...('time' | 'sunk-cost' | 'authority' | 'exhaustion' | 'scope-minimization')[],
613
+ ];
614
+ /**
615
+ * The prompt handed to the subagent — embeds the pressure(s) and asks it to act.
616
+ */
617
+ setup_prompt: string;
618
+ /**
619
+ * Regex SOURCE strings the response MUST match to count as compliant (the runner compiles each with new RegExp(source)).
620
+ *
621
+ * @minItems 1
622
+ */
623
+ expected_compliance: [string, ...string[]];
624
+ /**
625
+ * Regex SOURCE strings that, if matched, count as a violation (the runner compiles each with new RegExp(source)). May be empty.
626
+ */
627
+ expected_violations: string[];
628
+ /**
629
+ * Optional free-text scenario note (33-03 baselines reference it).
630
+ */
631
+ description?: string;
632
+ /**
633
+ * Optional A/B variant label, e.g. "trigger-only" | "what-clause" (33-04 description-format A/B).
634
+ */
635
+ variant?: string;
636
+ /**
637
+ * Optional array of A/B variant descriptors for a single-manifest A/B pair (33-04). Each item is an object, e.g. { label, description }.
638
+ */
639
+ variants?: {}[];
640
+ /**
641
+ * Optional body-only probe prompt the A/B scenario asks (33-04 description-format A/B).
642
+ */
643
+ body_probe?: string;
644
+ }
645
+
646
+ export type PressureScenarioSchema = PressureScenarioManifest;
647
+
584
648
  // ---- protected-paths.schema.json ----
585
649
  /**
586
650
  * Glob list describing paths the plugin refuses to Edit/Write or mutate via destructive Bash. User additions MERGE with this default list; users cannot reduce the default set.
@@ -622,6 +686,35 @@ export interface RateLimits {
622
686
 
623
687
  export type RateLimitsSchema = RateLimits;
624
688
 
689
+ // ---- recipe.schema.json ----
690
+ /**
691
+ * Shape of a declarative recipe loaded from recipes/<name>.json by scripts/lib/recipe-loader.cjs (Plan 31-5-03, RECIPE-01 / SC#14). The recipes/ directory ships EMPTY of recipes and is populated downstream by Phase 32 (skill-trigger recipes), Phase 33.6 (per-provider), Phase 26 (per-runtime/per-model), and Phase 23.5 (bandit-arm shape). This is a minimal, forward-compatible envelope: a recipe MUST carry name/version/steps; additionalProperties:true lets the populating phases extend the envelope without breaking the loader contract. Modelled on Storybloq's src/autonomous/recipes/ loader.ts pattern.
692
+ */
693
+ export interface Recipe {
694
+ /**
695
+ * The recipe identifier. Matches the filename stem (recipes/<name>.json).
696
+ */
697
+ name: string;
698
+ /**
699
+ * Recipe/schema version string for forward-compatibility. Lets the loader and downstream phases reason about envelope evolution.
700
+ */
701
+ version: string;
702
+ /**
703
+ * The ordered recipe body. Item shape is kept permissive for now — each step is an object carrying at least a `kind` OR an `id` string. Downstream phases (32/33.6/26/23.5) tighten the step contract per their domain.
704
+ */
705
+ steps: (
706
+ | {
707
+ kind: string;
708
+ }
709
+ | {
710
+ id: string;
711
+ }
712
+ )[];
713
+ [k: string]: unknown;
714
+ }
715
+
716
+ export type RecipeSchema = Recipe;
717
+
625
718
  // ---- runtime-models.schema.json ----
626
719
  /**
627
720
  * Parsed shape of reference/runtime-models.md — the per-runtime tier→model adapter source-of-truth shipped in Phase 26 (D-01..D-03). Consumed by scripts/lib/install/parse-runtime-models.cjs at install time and scripts/lib/tier-resolver.cjs at runtime. Strict enums catch typos at install time, not at runtime. Schema versioned via $schema_version for forward-compat (D-03).
@@ -0,0 +1,103 @@
1
+ 'use strict';
2
+ // Phase 39.2 — cost-forecast.cjs — PURE, dep-free per-cycle cost forecasting core.
3
+ //
4
+ // The /gdd:budget skill and agents/cost-forecaster.md read .design/telemetry/costs.jsonl, group the
5
+ // est_cost_usd by `cycle`, and hand the resulting per-cycle USD totals here. This module does ONLY
6
+ // the projection math — it never touches the filesystem, the clock, or randomness, so it is trivially
7
+ // unit-testable (the build-html.cjs / codemod-gen.cjs purity precedent).
8
+ //
9
+ // Scenario derivation (D-05): from the variance of the historical per-cycle rates,
10
+ // typical = mean
11
+ // worst = mean + k·stddev
12
+ // best = max(0, mean − k·stddev)
13
+ // with k = 1 by default. Projection over the next N cycles is linear on the chosen rate.
14
+ //
15
+ // No `require` — pure. Deterministic.
16
+
17
+ /** Coerce to a finite number or throw. Negative values pass through; callers clamp them where needed. */
18
+ function num(x, label) {
19
+ const n = Number(x);
20
+ if (!Number.isFinite(n)) throw new Error(`cost-forecast: ${label} must be a finite number (got ${x})`);
21
+ return n;
22
+ }
23
+
24
+ /** Population mean of an array of numbers (0 for empty). */
25
+ function mean(xs) {
26
+ if (!xs.length) return 0;
27
+ let s = 0;
28
+ for (const x of xs) s += x;
29
+ return s / xs.length;
30
+ }
31
+
32
+ /** Population standard deviation (0 for length < 2). */
33
+ function stddev(xs) {
34
+ if (xs.length < 2) return 0;
35
+ const m = mean(xs);
36
+ let acc = 0;
37
+ for (const x of xs) acc += (x - m) * (x - m);
38
+ return Math.sqrt(acc / xs.length);
39
+ }
40
+
41
+ /**
42
+ * Normalize the cycle-cost input into a clean array of non-negative per-cycle USD totals.
43
+ * Accepts either an array of numbers, or an array of { costUsd } / { est_cost_usd } objects.
44
+ */
45
+ function perCycleRates(cycleCosts) {
46
+ if (!Array.isArray(cycleCosts)) throw new Error('cost-forecast: cycleCosts must be an array');
47
+ return cycleCosts.map((c, i) => {
48
+ const v = typeof c === 'object' && c !== null
49
+ ? (c.costUsd !== undefined ? c.costUsd : c.est_cost_usd)
50
+ : c;
51
+ const n = num(v, `cycleCosts[${i}]`);
52
+ return n < 0 ? 0 : n;
53
+ });
54
+ }
55
+
56
+ /**
57
+ * Project the next `nCycles` of spend.
58
+ * @returns {{scenario, k, observedCycles, perCycle, projectedTotal, low, high}}
59
+ * perCycle — the per-cycle rate used for this scenario
60
+ * projectedTotal — perCycle * nCycles
61
+ * low/high — the best/worst per-cycle band (always returned for context)
62
+ */
63
+ function forecast(cycleCosts, opts) {
64
+ const o = opts || {};
65
+ const nCycles = o.nCycles === undefined ? 5 : Math.max(0, Math.trunc(num(o.nCycles, 'nCycles')));
66
+ const scenario = o.scenario === undefined ? 'typical' : String(o.scenario);
67
+ const k = o.k === undefined ? 1 : num(o.k, 'k');
68
+ if (!['best', 'typical', 'worst'].includes(scenario)) {
69
+ throw new Error(`cost-forecast: scenario must be best|typical|worst (got ${scenario})`);
70
+ }
71
+ const rates = perCycleRates(cycleCosts);
72
+ const m = mean(rates);
73
+ const sd = stddev(rates);
74
+ const low = Math.max(0, m - k * sd);
75
+ const high = m + k * sd;
76
+ const perCycle = scenario === 'best' ? low : scenario === 'worst' ? high : m;
77
+ return {
78
+ scenario,
79
+ k,
80
+ observedCycles: rates.length,
81
+ perCycle,
82
+ projectedTotal: perCycle * nCycles,
83
+ low,
84
+ high,
85
+ };
86
+ }
87
+
88
+ /**
89
+ * Integer count of full cycles until `currentSpend` reaches `cap` at `perCycleRate`.
90
+ * - rate <= 0 → Infinity (never reaches cap)
91
+ * - currentSpend >= cap → 0 (already at/over)
92
+ * Throws on non-finite inputs.
93
+ */
94
+ function cyclesToCap(currentSpend, cap, perCycleRate) {
95
+ const s = num(currentSpend, 'currentSpend');
96
+ const c = num(cap, 'cap');
97
+ const r = num(perCycleRate, 'perCycleRate');
98
+ if (s >= c) return 0;
99
+ if (r <= 0) return Infinity;
100
+ return Math.ceil((c - s) / r);
101
+ }
102
+
103
+ module.exports = { perCycleRates, mean, stddev, forecast, cyclesToCap };
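The scenario math in `cost-forecast.cjs` above is easy to check by hand. The sketch below is illustrative only: instead of requiring the module it re-declares small stand-ins (`popMean`, `popStddev`, `project`, `cyclesUntilCap`, all hypothetical names), and the eight per-cycle totals are made-up sample data chosen so that the population standard deviation is exactly 2.

```javascript
'use strict';
// Illustrative stand-ins for the mean / stddev / projection math shown above.
// Function names and sample data are assumptions, not part of the package.

function popMean(xs) {
  return xs.length ? xs.reduce((a, x) => a + x, 0) / xs.length : 0;
}

function popStddev(xs) {
  if (xs.length < 2) return 0;
  const m = popMean(xs);
  // Population variance: divide by N, not N - 1.
  return Math.sqrt(xs.reduce((a, x) => a + (x - m) * (x - m), 0) / xs.length);
}

function project(rates, scenario, nCycles, k = 1) {
  const m = popMean(rates);
  const sd = popStddev(rates);
  const perCycle =
    scenario === 'best' ? Math.max(0, m - k * sd) :
    scenario === 'worst' ? m + k * sd :
    m;
  // Projection is linear on the chosen per-cycle rate.
  return { perCycle, projectedTotal: perCycle * nCycles };
}

function cyclesUntilCap(spend, cap, rate) {
  if (spend >= cap) return 0;
  if (rate <= 0) return Infinity;
  return Math.ceil((cap - spend) / rate);
}

const sampleRates = [2, 4, 4, 4, 5, 5, 7, 9]; // hypothetical USD per cycle
const typical = project(sampleRates, 'typical', 5);
const worst = project(sampleRates, 'worst', 5);
const best = project(sampleRates, 'best', 5);

console.log({ typical, worst, best });
console.log(cyclesUntilCap(40, 100, worst.perCycle));
```

With this sample data the typical rate is $5 per cycle, the band is $3 to $7, and a project sitting at $40 of a $100 cap would reach the cap in 9 cycles at the worst-case rate.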
@@ -0,0 +1,55 @@
1
+ 'use strict';
2
+ // Phase 39.2 — project-cap.cjs — PURE, dep-free project-budget classifier.
3
+ //
4
+ // The Phase 25 budget-enforcer hook (hooks/budget-enforcer.ts) reads the running project spend and
5
+ // the configured project cap, and calls this classifier to decide whether to warn (50% / 80%) or
6
+ // hard-halt (100%). Keeping the decision math here (out of the .ts hook) mirrors how the hook already
7
+ // delegates cost computation to scripts/lib/budget-enforcer.cjs, and makes the thresholds unit-testable.
8
+ //
9
+ // project_cap is DISABLED by default (D-04): a cap of 0 / negative / non-finite means "no project cap"
10
+ // and always returns level 'ok' — so existing users (who have no project_cap_usd in budget.json) see
11
+ // zero behavior change. The halt is graceful: the hook fires on PreToolUse:Agent, so a 'halt' blocks
12
+ // the NEXT agent spawn, letting the current stage finish.
13
+ //
14
+ // No `require` — pure. Deterministic.
15
+
16
+ const WARN_50 = 50;
17
+ const WARN_80 = 80;
18
+ const HALT_100 = 100;
19
+
20
+ /**
21
+ * @param {number} spendUsd running project spend (USD)
22
+ * @param {number} capUsd configured project cap (USD); <= 0 / non-finite ⇒ disabled
23
+ * @returns {{enabled:boolean, pct:number, level:'ok'|'warn-50'|'warn-80'|'halt', cap:number, spend:number}}
24
+ */
25
+ function classifyProjectBudget(spendUsd, capUsd) {
26
+ const spend = Number(spendUsd);
27
+ const cap = Number(capUsd);
28
+ const enabled = Number.isFinite(cap) && cap > 0 && Number.isFinite(spend) && spend >= 0;
29
+ if (!enabled) {
30
+ return { enabled: false, pct: 0, level: 'ok', cap: Number.isFinite(cap) ? cap : 0, spend: Number.isFinite(spend) ? spend : 0 };
31
+ }
32
+ const pct = (spend / cap) * 100;
33
+ let level = 'ok';
34
+ if (pct >= HALT_100) level = 'halt';
35
+ else if (pct >= WARN_80) level = 'warn-80';
36
+ else if (pct >= WARN_50) level = 'warn-50';
37
+ return { enabled: true, pct, level, cap, spend };
38
+ }
39
+
40
+ /** True when a classification should hard-block the next spawn (enforce mode + level 'halt'). */
41
+ function shouldHalt(classification, enforcementMode) {
42
+ return !!classification && classification.level === 'halt' && enforcementMode === 'enforce';
43
+ }
44
+
45
+ /** A one-line human message for a non-'ok' level (null when ok). */
46
+ function capMessage(c) {
47
+ if (!c || !c.enabled || c.level === 'ok') return null;
48
+ const pct = c.pct.toFixed(0);
49
+ if (c.level === 'halt') {
50
+ return `project budget cap reached: $${c.spend.toFixed(2)} / $${c.cap.toFixed(2)} (${pct}%) — halting before the next agent spawn`;
51
+ }
52
+ return `project budget at ${pct}%: $${c.spend.toFixed(2)} / $${c.cap.toFixed(2)}`;
53
+ }
54
+
55
+ module.exports = { classifyProjectBudget, shouldHalt, capMessage, WARN_50, WARN_80, HALT_100 };
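As a rough illustration of the threshold ladder in `project-cap.cjs` above, the sketch below walks a few spend values through a simplified stand-in classifier. `classify`, `blocksNextSpawn`, and the $10 cap are hypothetical; the real hook wiring in `hooks/budget-enforcer.ts` is not shown in this diff and is not reproduced here.

```javascript
'use strict';
// Illustrative stand-in for the 50 / 80 / 100 threshold ladder shown above.
// `classify`, `blocksNextSpawn`, and the sample numbers are assumptions.

function classify(spendUsd, capUsd) {
  const enabled =
    Number.isFinite(capUsd) && capUsd > 0 &&
    Number.isFinite(spendUsd) && spendUsd >= 0;
  // A cap of 0 (or any invalid input) means the project cap is disabled.
  if (!enabled) return { enabled: false, level: 'ok' };
  const pct = (spendUsd / capUsd) * 100;
  const level =
    pct >= 100 ? 'halt' :
    pct >= 80 ? 'warn-80' :
    pct >= 50 ? 'warn-50' :
    'ok';
  return { enabled: true, pct, level };
}

// Only enforce mode turns a halt-level breach into a blocked spawn.
function blocksNextSpawn(result, mode) {
  return result.level === 'halt' && mode === 'enforce';
}

const levels = [3, 6, 9, 10, 12].map((spend) => classify(spend, 10).level);
console.log(levels);
console.log(classify(500, 0));
console.log(blocksNextSpawn(classify(12, 10), 'enforce'));
console.log(blocksNextSpawn(classify(12, 10), 'warn'));
```

The design point this highlights is that classification and enforcement are separate decisions: the same `halt` level blocks under `enforce` but stays advisory under `warn` or `log`.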
@@ -0,0 +1,73 @@
1
+ 'use strict';
2
+ // Phase 39.2 — roi.cjs — PURE, dep-free ROI join + table formatter.
3
+ //
4
+ // The /gdd:roi skill shells `git log` to count, per cycle, commits that SHIPPED (survived >= 14 days
5
+ // in main — the ROADMAP "shipped" definition, catching revert-after-bug-discovery) vs commits that
6
+ // were REVERTED, and reads per-cycle cost from .design/telemetry/costs.jsonl. It hands the joined rows
7
+ // here. This module does ONLY the arithmetic + markdown formatting — no fs, no clock, no git. Pure.
8
+ //
9
+ // No `require` — pure. Deterministic.
10
+
11
+ function num(x, label) {
12
+ const n = Number(x);
13
+ if (!Number.isFinite(n)) throw new Error(`roi: ${label} must be a finite number (got ${x})`);
14
+ return n;
15
+ }
16
+
17
+ /**
18
+ * @param {Array<{cycle, costUsd, commitsShipped, commitsReverted}>} cycles
19
+ * @returns {{rows, totals}}
20
+ * row — { cycle, costUsd, shipped, reverted, costPerShipped, stickRate }
21
+ * totals — aggregate across all cycles (same fields, cycle: 'TOTAL')
22
+ * costPerShipped = costUsd / max(shipped, 1) (USD per commit that stuck)
23
+ * stickRate = shipped / max(shipped + reverted, 1) (0..1)
24
+ */
25
+ function computeRoi(cycles) {
26
+ if (!Array.isArray(cycles)) throw new Error('roi: cycles must be an array');
27
+ const rows = cycles.map((c, i) => {
28
+ if (typeof c !== 'object' || c === null) throw new Error(`roi: cycles[${i}] must be an object`);
29
+ const costUsd = num(c.costUsd, `cycles[${i}].costUsd`);
30
+ const shipped = Math.max(0, Math.trunc(num(c.commitsShipped, `cycles[${i}].commitsShipped`)));
31
+ const reverted = Math.max(0, Math.trunc(num(c.commitsReverted, `cycles[${i}].commitsReverted`)));
32
+ return {
33
+ cycle: String(c.cycle),
34
+ costUsd,
35
+ shipped,
36
+ reverted,
37
+ costPerShipped: costUsd / Math.max(shipped, 1),
38
+ stickRate: shipped / Math.max(shipped + reverted, 1),
39
+ };
40
+ });
41
+ const totCost = rows.reduce((a, r) => a + r.costUsd, 0);
42
+ const totShipped = rows.reduce((a, r) => a + r.shipped, 0);
43
+ const totReverted = rows.reduce((a, r) => a + r.reverted, 0);
44
+ const totals = {
45
+ cycle: 'TOTAL',
46
+ costUsd: totCost,
47
+ shipped: totShipped,
48
+ reverted: totReverted,
49
+ costPerShipped: totCost / Math.max(totShipped, 1),
50
+ stickRate: totShipped / Math.max(totShipped + totReverted, 1),
51
+ };
52
+ return { rows, totals };
53
+ }
54
+
55
+ /** Format a USD value as $X.XX. */
56
+ function usd(n) {
57
+ return '$' + num(n, 'usd').toFixed(2);
58
+ }
59
+
60
+ /** Render the ROI result as a GitHub-flavored markdown table. Pure string output. */
61
+ function roiTableMarkdown(roi) {
62
+ if (!roi || !Array.isArray(roi.rows)) throw new Error('roi: roiTableMarkdown needs a computeRoi() result');
63
+ const head =
64
+ '| Cycle | Cost | Shipped | Reverted | $/shipped | Stick rate |\n' +
65
+ '|---|---:|---:|---:|---:|---:|';
66
+ const fmt = (r) =>
67
+ `| ${r.cycle} | ${usd(r.costUsd)} | ${r.shipped} | ${r.reverted} | ${usd(r.costPerShipped)} | ${(r.stickRate * 100).toFixed(0)}% |`;
68
+ const body = roi.rows.map(fmt).join('\n');
69
+ const foot = fmt(roi.totals);
70
+ return [head, body, foot].join('\n');
71
+ }
72
+
73
+ module.exports = { computeRoi, roiTableMarkdown, usd };
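A short worked example may help when reading the two ROI formulas in `roi.cjs` above. The sketch re-declares the per-row arithmetic as a hypothetical `roiRow` helper, with made-up cycles, rather than requiring the module; it is meant only to show how the `max(…, 1)` clamps behave.

```javascript
'use strict';
// Illustrative stand-in for the per-cycle ROI arithmetic shown above.
// `roiRow` and the sample cycles are assumptions, not package code.

function roiRow({ cycle, costUsd, commitsShipped, commitsReverted }) {
  return {
    cycle,
    // Denominators are clamped to 1 so a cycle with no commits never divides by zero.
    costPerShipped: costUsd / Math.max(commitsShipped, 1),
    stickRate: commitsShipped / Math.max(commitsShipped + commitsReverted, 1),
  };
}

const sampleCycles = [
  { cycle: 'c1', costUsd: 12, commitsShipped: 4, commitsReverted: 1 },
  { cycle: 'c2', costUsd: 5, commitsShipped: 0, commitsReverted: 2 },
];
const roiRows = sampleCycles.map(roiRow);
console.log(roiRows);
```

Cycle `c1` costs $3 per shipped commit with an 80% stick rate. Cycle `c2` shipped nothing, so the clamp reports its full $5 cost as `$/shipped` alongside a 0% stick rate; the two columns need to be read together for such rows.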
@@ -0,0 +1,45 @@
1
+ ---
2
+ name: gdd-budget
3
+ description: "Forecasts GDD design-cycle spend before the bill arrives. Reads .design/telemetry/costs.jsonl (cost per cycle) + .design/budget.json (the project_cap), runs the pure cost-forecast model via agents/cost-forecaster.md, and projects the next N cycles — surfacing 'at the current rate you'll hit your $X project cap in Y cycles.' Supports --scenario best|typical|worst and --cycles N. Read-only — it forecasts and warns; it never spends, edits budget.json, or halts (the budget-enforcer hook halts). Use to sanity-check spend trajectory before a long run."
4
+ argument-hint: "[--cycles N] [--scenario best|typical|worst]"
5
+ user-invocable: true
6
+ tools: Read, Bash, Grep, Glob, ToolSearch, Task
7
+ ---
8
+
9
+ # /gdd:budget
10
+
11
+ Closes the long-horizon cost gap: Phase 10.1 per-task caps + Phase 26 per-runtime telemetry track
12
+ *cost*, but nothing **forecasts** it. This skill projects the next N cycles of spend and tells you how
13
+ many cycles you have before you hit your `project_cap`. **Read-only** — it forecasts and warns; it
14
+ never spends, never edits `budget.json`, and never halts (the Phase 25 budget-enforcer hook is the
15
+ only thing that blocks a spawn). Contract: `../../reference/cost-governance.md`.
16
+
17
+ ## Invocation
18
+
19
+ | Command | Behavior |
20
+ |---|---|
21
+ | `/gdd:budget` | Typical-scenario forecast over the next 5 cycles + cycles-to-cap. |
22
+ | `/gdd:budget --cycles N` | Forecast over the next N cycles. |
23
+ | `/gdd:budget --scenario best\|typical\|worst` | Pick the projection rate: `best` (mean − k·σ, floored at 0), `typical` (mean), or `worst` (mean + k·σ). |
24
+
25
+ ## Steps
26
+
27
+ 1. **Check telemetry exists.** No `.design/telemetry/costs.jsonl` (or zero rows) → print
28
+ `budget: no cost telemetry yet — run a cycle first.` and exit.
29
+ 2. **Delegate to `cost-forecaster`** (via `Task`): it groups `est_cost_usd` by `cycle`, runs the pure
30
+ `scripts/lib/budget/cost-forecast.cjs` model for the requested `--scenario`/`--cycles`, reads
31
+ `project_cap_usd` from `.design/budget.json`, and computes cycles-to-cap.
32
+ 3. **Render.** Show: the scenario + its per-cycle rate, the best↔worst band, the projected total over
33
+ N cycles, and — when `project_cap_usd > 0` — **"at the `<scenario>` rate (~$X/cycle) you'll reach
34
+ your $`<cap>` project cap in `<Y>` cycles"** (or "not at this rate" when the per-cycle rate is 0, so the cap is never reached).
35
+ When no cap is set, show the trajectory and note that `project_cap_usd` is unset (so the hook won't
36
+ halt).
37
+ 4. **Do not act.** Never raise/lower the cap, never spend — GDD forecasts; the human sets the budget.
38
+
39
+ ## Output
40
+
41
+ End with:
42
+
43
+ ```
44
+ ## BUDGET COMPLETE
45
+ ```
@@ -0,0 +1,54 @@
1
+ ---
2
+ name: gdd-roi
3
+ description: "Shows whether GDD spend actually shipped anything. Joins per-cycle cost (.design/telemetry/costs.jsonl) with what each cycle shipped — commits that SURVIVED in main vs commits that were reverted — and reports cost-per-shipped-commit + a stick rate per cycle. 'Shipped' = a commit surviving >= the window (default 14 days), which catches revert-after-bug-discovery. Markdown table, not a GUI. Read-only — it reads git log + cost telemetry and reports. Use to see which cycles were worth their spend."
4
+ argument-hint: "[--since <date>] [--window-days 14]"
5
+ user-invocable: true
6
+ tools: Read, Bash, Grep, Glob
7
+ ---
8
+
9
+ # /gdd:roi
10
+
11
+ Closes the loop on cost: `/gdd:budget` forecasts *spend*, this shows the *return*. It joins per-cycle
12
+ cost with the commits that actually stuck, so you can see cost-per-shipped-commit and which cycles
13
+ were worth it. **Read-only** — it reads `git log` + cost telemetry and prints a table. Contract +
14
+ the "shipped" definition: `../../reference/cost-governance.md`.
15
+
16
+ ## Invocation
17
+
18
+ | Command | Behavior |
19
+ |---|---|
20
+ | `/gdd:roi` | ROI table across all cycles with cost telemetry (14-day stick window). |
21
+ | `/gdd:roi --since <date>` | Only cycles since `<date>`. |
22
+ | `/gdd:roi --window-days N` | "Shipped" = a commit surviving ≥ N days (default 14). |
23
+
24
+ ## Steps
25
+
26
+ 1. **Check telemetry exists.** No `.design/telemetry/costs.jsonl` (or zero rows) → print
27
+ `roi: no cost telemetry yet — run a cycle first.` and exit.
28
+ 2. **Per-cycle cost.** Group `est_cost_usd` in `costs.jsonl` by `cycle`.
29
+ 3. **Per-cycle shipped / reverted.** For each cycle, use `git log` to count, in that cycle's date
30
+ range: commits still present in `main` and older than the window = **shipped**; commits that a
31
+ later `revert` removed (or that were reverted within the window) = **reverted**. (A commit younger
32
+ than the window is "too new to score" — exclude it, don't count it as shipped.)
33
+ 4. **Join + compute** via the pure helper — never hand-compute:
34
+
35
+ ```bash
36
+ node -e '
37
+ const { computeRoi, roiTableMarkdown } = require("./scripts/lib/budget/roi.cjs");
38
+ const cycles = JSON.parse(process.argv[1]); // [{cycle,costUsd,commitsShipped,commitsReverted},...]
39
+ console.log(roiTableMarkdown(computeRoi(cycles)));
40
+ ' "$CYCLES_JSON"
41
+ ```
42
+
43
+ 5. **Render** the markdown table (cycle · cost · shipped · reverted · $/shipped · stick rate) plus the
44
+ TOTAL row. A high `$/shipped` with a low stick rate is the signal that a cycle burned budget
45
+ without lasting output.
46
+ 6. **Do not act.** Reporting only — never revert, re-run, or change budget.
47
+
48
+ ## Output
49
+
50
+ End with:
51
+
52
+ ```
53
+ ## ROI COMPLETE
54
+ ```