npm - @hegemonart/get-design-done - Versions diffs - 1.37.2 → 1.38.0 - Mend

@hegemonart/get-design-done 1.37.2 → 1.38.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (22)

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +23 -0
package/README.md +4 -0
package/agents/design-verifier.md +1 -1
package/agents/experiment-result-ingester.md +61 -0
package/agents/user-research-synthesizer.md +65 -0
package/connections/connections.md +7 -1
package/connections/growthbook.md +110 -0
package/connections/hotjar.md +110 -0
package/connections/launchdarkly.md +83 -0
package/connections/maze.md +130 -0
package/connections/statsig.md +83 -0
package/connections/usertesting.md +99 -0
package/package.json +1 -1
package/reference/design-variants.md +56 -0
package/reference/registry.json +7 -0
package/scripts/lib/ds-arms/design-arms-store.cjs +119 -0
package/skills/brief/SKILL.md +8 -0
package/skills/connections/SKILL.md +4 -4
package/skills/connections/connections-onboarding.md +58 -4
package/skills/design/SKILL.md +2 -1

package/.claude-plugin/marketplace.json CHANGED

@@ -5,14 +5,14 @@
   },
   "metadata": {
     "description": "Get Design Done — 5-stage agent-orchestrated design pipeline with 9 connections, handoff-first workflow, bidirectional Figma write-back, 22+ specialized agents, queryable knowledge layer (intel store, dependency analysis, learnings extraction), and a self-improvement loop (reflector, frontmatter + budget feedback, global-skills layer). v1.20.0 ships the SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream, and resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) for rate-limit + 429 + context-overflow recovery. Full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows) and release automation (auto-tag + GitHub Release + release-time smoke test).",
-    "version": "1.37.2"
+    "version": "1.38.0"
   },
   "plugins": [
     {
       "name": "get-design-done",
       "source": "./",
       "description": "Agent-orchestrated 5-stage design pipeline: Brief → Explore → Plan → Design → Verify. 22+ specialized agents, 9 connections (Figma, Refero, Preview, Storybook, Chromatic, Figma Writer, Graphify, Pinterest, Claude Design), Claude Design handoff, bidirectional Figma write-back, and a queryable intel store (.design/intel/) for dependency and learnings queries. Standalone commands: style, darkmode, compare, figma-write, graphify, handoff, analyze-dependencies, skill-manifest, extract-learnings. Embeds NNG heuristics, WCAG thresholds, typographic systems, motion framework, and anti-pattern catalog. Ships with a full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows) and release automation. Optimization layer (v1.0.4.1, retroactive): gdd-router + gdd-cache-manager skills, PreToolUse budget-enforcer hook, tier-aware agent frontmatter, lazy checker gates, streaming synthesizer, /gdd:warm-cache + /gdd:optimize commands, and cost telemetry at .design/telemetry/costs.jsonl — targeting 50-70% per-task token-cost reduction with no quality-floor regression. v1.20.0 SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream at .design/telemetry/events.jsonl, resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) with rate-limit + 429 + context-overflow recovery, and TypeScript toolchain.",
-      "version": "1.37.2",
+      "version": "1.38.0",
       "author": {
         "name": "hegemonart"
       },

package/.claude-plugin/plugin.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "get-design-done",
   "short_name": "gdd",
-  "version": "1.37.2",
+  "version": "1.38.0",
   "description": "Agent-orchestrated 5-stage design pipeline: Brief → Explore → Plan → Design → Verify. 22+ specialized agents, 9 connections (Figma, Refero, Preview, Storybook, Chromatic, Figma Writer, Graphify, Pinterest, Claude Design), handoff-first workflow via Claude Design bundles, bidirectional Figma write-back (annotations, Code Connect), queryable intel store (`.design/intel/`) for O(1) design surface lookups, and self-improvement loop (reflector agent, frontmatter + budget feedback, global-skills layer at `~/.claude/gdd/global-skills/`). Standalone commands: style, darkmode, compare, figma-write, graphify, handoff, analyze-dependencies, skill-manifest, extract-learnings, reflect, apply-reflections. Embeds NNG heuristics, WCAG thresholds, typographic systems, motion framework, and anti-pattern catalog. Ships with a full CI/CD pipeline (Node 22/24 × Linux/macOS/Windows, lint + schema + frontmatter + stale-ref + shellcheck + gitleaks + injection-scan + blocking size-budget) and release automation (auto-tag + GitHub Release + release-time smoke test). Optimization layer (v1.0.4.1, retroactive): gdd-router + gdd-cache-manager skills, PreToolUse budget-enforcer hook, tier-aware agent frontmatter, lazy checker gates, streaming synthesizer, /gdd:warm-cache + /gdd:optimize commands, and cost telemetry at .design/telemetry/costs.jsonl — targeting 50-70% per-task token-cost reduction with no quality-floor regression. v1.20.0 SDK foundation: gdd-state MCP server (11 typed tools), lockfile-safe STATE.md mutations, event stream at .design/telemetry/events.jsonl, resilience primitives (jittered-backoff, rate-guard, error-classifier, iteration-budget) with rate-limit + 429 + context-overflow recovery, and TypeScript toolchain. v1.27.7 ships gdd-mcp (Phase 27.7): 12 read-only MCP tools for sub-3s priming. 
v1.28.0 (Phase 28): Foundational References Tier 2 — 5 new reference files (color-theory, composition, proportion-systems, i18n, contrast-advanced), 2 verifier i18n probes + 1 explore i18n-readiness probe, 12 additive cross-link insertions across 10 existing references, 2 orthogonal audit-scoring lens-tags (composition_alignment + i18n_readiness).",
   "author": {
     "name": "hegemonart",

package/CHANGELOG.md CHANGED

@@ -4,6 +4,29 @@ All notable changes to get-design-done are documented here. Versions follow [sem
 ---
+## [1.38.0] - 2026-06-01
+### Phase 38 — Outcome-Driven Adaptation (A/B Variants + Inbound User-Research Signals)
+Closes the external-outcome loop. The bandit's reward was **internal** (lint/test/visual pass-fail); it couldn't learn "which design pattern wins with **users**" because user-outcome signals never entered the system. Phase 38 adds two external signal sources — **A/B experiments** + **user research** — feeding a new `design_arms` posterior + the brief/verify loop. **No new runtime dependency** (a pure Beta-posterior store + injectable `fetch`; the platforms are opt-in, read-only).
+### Added
+- **`scripts/lib/ds-arms/design-arms-store.cjs`** — a pure `design_arms` posterior class, **distinct from the routing bandit** (`bandit-router.cjs`). Keyed by `(component_type, variant_pattern_hash)` (inline FNV-1a — no `crypto`), conservative **Beta(2, 8)** prior (mean 0.2 — a pattern earns trust from real outcomes). `variantKey` / `pull` / `observe` / `all`; atomic persist to `.design/telemetry/design-arms.json`. node-builtins only.
+- **`reference/design-variants.md`** + **`design` `--variants N`** — N competing, hypothesis-tagged variants (`<variant id component pattern hypothesis>`, default N=2); the design stage consults the posterior to bias generation — **advisory, never directive (the user always wins, D-03)**. Registered.
+- **`connections/launchdarkly.md` + `statsig.md` + `growthbook.md`** — A/B **experiment-source** connections (read-only; never run experiments — D-04). **`agents/experiment-result-ingester.md`** maps variant→outcome by the primary metric + significance → `observe()` into `design_arms` → `experiment_result` event.
+- **`connections/usertesting.md` + `maze.md` + `hotjar.md`** + **`agents/user-research-synthesizer.md`** — user-research sources (read-only, indexed insights). **PII guard (D-05): every payload routes through `scripts/lib/pseudonymize.cjs` BEFORE any agent context** — enforced by `test/suite/phase-38-pii-guard.test.cjs`. Synthesizes ranked findings (finding · frequency · severity) into the brief **`<prior-research>`** block.
+- **Verify cross-check** — `design-verifier` asserts each `<prior-research>` finding is addressed or explicitly deferred (unaddressed `critical`/`serious` = a gap).
+### Notes
+- **No new runtime dependency, no new egress** — pure Beta store + injectable `fetchImpl` (hermetic tests); the A/B + research platforms are opt-in user-connected MCP/API, read-only.
+- The 6 outcome connections are **Active-table + onboarding entries** (27 → 33 onboarded), NOT pipeline-stage capability-matrix rows — outcome-ingest is post-pipeline (the Notion export-only precedent), so it does not occupy a stage column.
+- 6-manifest lockstep at **v1.38.0** + `OFF_CADENCE_VERSIONS.add('1.38.0')` + the 28 live-pinned `manifests-version.txt` baselines forward-propagated 1.37.2 → 1.38.0.
+- Inventory relock: connection-list 35 → 41 (+6), phase-20 agent-list 50 → 52 (+`experiment-result-ingester` + `user-research-synthesizer`) + both frontmatter-snapshots, registry-diff 151 → 152 (+`design-variants`), tarball golden 680 → 690 (+10). `design-verifier` augmented in place (stays at the 700 cap).
+---
 ## [1.37.2] - 2026-06-01
 ### Phase 37.2 — Greenfield Design-System Bootstrap (`/gdd:bootstrap-ds`) — completes Phase 37
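The `design_arms` posterior described in the 1.38.0 changelog entry above (a Beta(2, 8) prior keyed by an inline FNV-1a hash) can be illustrated with a minimal sketch. This is not the contents of `design-arms-store.cjs`: the `createStore` and `mean` names, the in-memory `Map`, and the way `variantKey` joins its inputs are assumptions made for illustration, and persistence is omitted.

```javascript
// Hypothetical sketch only; the real store's internals are not shown in this diff.
const PRIOR = { alpha: 2, beta: 8 }; // Beta(2, 8): prior mean 2 / (2 + 8) = 0.2

// 32-bit FNV-1a over the UTF-8 bytes of a string, returned as 8 hex digits.
function fnv1a32(text) {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (const byte of Buffer.from(text, 'utf8')) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned
  }
  return hash.toString(16).padStart(8, '0');
}

function createStore() {
  const arms = new Map(); // in-memory only; the real store persists to disk

  // Look up an arm, seeding it with the conservative prior on first use.
  const armFor = (componentType, patternHash) => {
    const id = `${componentType}:${patternHash}`;
    if (!arms.has(id)) arms.set(id, { ...PRIOR });
    return arms.get(id);
  };

  return {
    // Assumption: the key hashes the component type and pattern joined by "|".
    variantKey: (componentType, pattern) => fnv1a32(`${componentType}|${pattern}`),

    // A win increments alpha, a loss increments beta.
    observe(componentType, patternHash, { won }) {
      const arm = armFor(componentType, patternHash);
      if (won) arm.alpha += 1;
      else arm.beta += 1;
      return arm;
    },

    // Posterior mean alpha / (alpha + beta).
    mean(componentType, patternHash) {
      const { alpha, beta } = armFor(componentType, patternHash);
      return alpha / (alpha + beta);
    },
  };
}
```

With this prior, an unseen pattern starts at a mean of 0.2, and a single win moves it to 3 / 11, so a pattern has to accumulate real outcomes before it is favored.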

package/README.md CHANGED

@@ -170,6 +170,10 @@ Six more AI-native design tools join the connection layer (Phase 14's backlog, n
 `/gdd:bootstrap-ds` gives a brand-new project a coherent design system from a brand input (a primary color + optional secondary + tone tags + target framework). The pure [`token-scale`](scripts/lib/ds/token-scale.cjs) helper emits a 9-stop OKLCH color scale as native CSS `oklch()` (no color-conversion library), a modular type scale, a 4pt/8pt spacing scale, and radius/motion defaults; [`ds-generator`](agents/ds-generator.md) offers **3 variants** (conservative / balanced / bold) to pick from per [`reference/ds-bootstrap-rubric.md`](reference/ds-bootstrap-rubric.md), then scaffolds button/input/card proof components. Never invents a brand; never overwrites an existing DS. **No new runtime dependency.** This **completes Phase 37** (AI-native Wave 2 + greenfield bootstrap).
+### Outcome-driven adaptation (v1.38.0)
+GDD now learns **which design patterns win with users**, not just which pass lint. `/gdd:design --variants N` emits N competing, hypothesis-tagged variants; a new `design_arms` posterior ([`design-arms-store`](scripts/lib/ds-arms/design-arms-store.cjs) — Beta(2,8), distinct from the routing bandit) consults prior outcomes to bias generation (**advisory, never directive**). Two read-only external signal sources close the loop: **A/B experiments** ([LaunchDarkly / Statsig / GrowthBook](connections/launchdarkly.md) → [`experiment-result-ingester`](agents/experiment-result-ingester.md)) and **user research** ([UserTesting / Maze / Hotjar](connections/usertesting.md) → [`user-research-synthesizer`](agents/user-research-synthesizer.md)), the latter **pseudonymized before any agent context** (a tested PII guard). Findings populate the brief's `<prior-research>` block, which verify cross-checks. **No new runtime dependency.** Onboarding 27 → 33.
 ### Previous releases
 - **v1.26.0** — Headless Model Resolver (per-runtime tier→model map, `resolved_models` router field, per-runtime price tables, `reasoning-class` runtime-neutral alias).

package/agents/design-verifier.md CHANGED

@@ -182,7 +182,7 @@ Allow-list seed (skip): `console\.(log|error|warn|info|debug)`, dev-only `/* */`
 ## Phase 2 — Must-Have Check
-Read `.design/STATE.md` `<must_haves>`. Also read must-haves from DESIGN-PLAN.md acceptance criteria. For each M-XX must-have, determine verification method and verify:
+Read `.design/STATE.md` `<must_haves>`. Also read must-haves from DESIGN-PLAN.md acceptance criteria, **and the brief's `<prior-research>` findings (Phase 38)** — for each prior-research finding, assert the current design addresses it or note an explicit defer + rationale (an unaddressed `critical`/`serious` finding is a gap). For each M-XX must-have, determine verification method and verify:
 | Must-have type | Verification method |
 |---|---|

package/agents/experiment-result-ingester.md ADDED

@@ -0,0 +1,61 @@
+---
+name: experiment-result-ingester
+description: Ingests A/B experiment results (LaunchDarkly / Statsig / GrowthBook) and folds them into the design_arms posterior. Reads a finished experiment's payload, maps each variant to a win/lose by the primary metric + significance, calls observe() on scripts/lib/ds-arms/design-arms-store.cjs, and emits an experiment_result typed event. Read-only against the platform (never runs/creates experiments). Injectable fetch — hermetic. Degrades to a noop when no experiment-source is configured.
+tools: Read, Bash, Grep, Glob, ToolSearch
+color: green
+default-tier: sonnet
+tier-rationale: "Mechanical mapping of an experiment payload to win/lose + a posterior update via a pure store; no design judgment — sonnet-tier."
+size_budget: M
+size_budget_rationale: "Honest tier sized to the ~95-line body. The agent states the read→map→observe→emit flow and DELEGATES the posterior math to scripts/lib/ds-arms/design-arms-store.cjs and the per-platform probe to connections/{launchdarkly,statsig,growthbook}.md (the ticket-sync-agent→reference precedent)."
+parallel-safe: false
+typical-duration-seconds: 30
+reads-only: false
+writes:
+  - ".design/telemetry/design-arms.json"
+  - ".design/intel/insights.jsonl"
+---
+@reference/shared-preamble.md
+# experiment-result-ingester
+## Role
+Close the A/B side of the outcome loop: read a **finished** experiment's results from the configured experiment-source and teach the `design_arms` posterior which design pattern actually won with users. **Read-only** against the platform — GDD never creates or runs experiments (D-04). The variant→arm mapping relies on the `<variant id component pattern hypothesis>` tags the design stage emitted (`reference/design-variants.md`).
+## When invoked
+After an experiment tagged to a GDD cycle reaches a decision, or on demand. Gate on an experiment-source being `available` (per `connections/launchdarkly.md` / `connections/statsig.md` / `connections/growthbook.md`); none → print `experiment ingest: no experiment-source configured — skipped.` and stop (degrade-to-noop).
+## Step 1 — Read the experiment payload
+Probe the configured source (ToolSearch for an MCP, else the platform API key env). Read the experiment's variants + the **primary metric** per variant + the statistical decision (winner / no-significant-difference). Use an **injectable `fetchImpl`** so this is hermetic under test — never hard-code a live HTTP call in a way the test can't stub. Read-only scopes only.
+## Step 2 — Map variant → outcome
+For each variant in the experiment:
+- Resolve its GDD `component` + `pattern` from the variant tag (or the experiment's metadata mapping).
+- `won` = this variant is the **statistically significant winner** on the primary metric. A no-significant-difference experiment yields NO observation (do not reward noise) — skip, and note it.
+## Step 3 — Fold into the posterior
+```bash
+node -e "const s=require('./scripts/lib/ds-arms/design-arms-store.cjs'); \
+  const k=s.variantKey(COMPONENT, PATTERN); \
+  s.observe(COMPONENT, k, { won: WON, source: 'ab', label: PATTERN });"
+```
+One `observe` per variant with a decided outcome. `won:true` → `alpha += 1`; the losing variant(s) → `won:false` (`beta += 1`). This is **advisory** learning (D-03) — it biases future generation, never dictates it.
+## Step 4 — Emit the event
+Emit an `experiment_result` typed event into the Phase 22 chain (`.design/intel/insights.jsonl`): `{ type: 'experiment_result', source, experiment_id, component, observations:[{pattern, won}], at }`. No PII (experiment IDs + pattern slugs only).
+## Record
+Emit a `## Experiment ingest` summary: source, experiment, the per-variant win/lose, the posterior means before→after, and any skipped (no-significant-difference) variants. Close with:
+```
+## EXPERIMENT INGEST COMPLETE
+```
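The variant-to-outcome rule in Step 2 of the agent file above (only a statistically significant winner produces observations; a no-significant-difference experiment produces none) can be sketched as a pure function. The payload shape (`winner`, `variants[].id`, `component`, `pattern`) and the `mapOutcomes` name are hypothetical; each platform's real response format is not shown in this diff.

```javascript
// Hypothetical sketch: the payload shape below is an assumption, not a platform API.
// experiment = { id, winner: '<variant id>' | null, variants: [{ id, component, pattern }] }
function mapOutcomes(experiment) {
  // No significant winner: record nothing, so noise is never rewarded.
  if (!experiment.winner) {
    return { observations: [], skipped: experiment.variants.map((v) => v.id) };
  }
  // Decided experiment: the winner gets won=true, every other variant won=false.
  const observations = experiment.variants.map((v) => ({
    component: v.component,
    pattern: v.pattern,
    won: v.id === experiment.winner,
  }));
  return { observations, skipped: [] };
}
```

Each entry in `observations` would then correspond to one `observe` call in Step 3, while `skipped` feeds the summary of variants that produced no update.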

package/agents/user-research-synthesizer.md ADDED

@@ -0,0 +1,65 @@
+---
+name: user-research-synthesizer
+description: Synthesizes inbound user-research signals (UserTesting / Maze / Hotjar) into brief-grade insights. Reads test reports + session/heatmap aggregates, ALWAYS pseudonymizes them through scripts/lib/pseudonymize.cjs BEFORE any agent context (PII guard), then extracts top findings with frequency + severity for the brief <prior-research> block. Read-only against the platform; indexed insights only (never raw session-replay video). Degrades to a noop when no research source is configured.
+tools: Read, Bash, Grep, Glob, ToolSearch
+color: green
+default-tier: sonnet
+tier-rationale: "Synthesis of pre-collected research reports into ranked findings; bounded extraction, not open design judgment — sonnet-tier."
+size_budget: M
+size_budget_rationale: "Honest tier sized to the ~105-line body. The agent states the read→pseudonymize→synthesize→write-<prior-research> flow and DELEGATES the PII transform to scripts/lib/pseudonymize.cjs and per-platform detail to connections/{usertesting,maze,hotjar}.md."
+parallel-safe: false
+typical-duration-seconds: 45
+reads-only: false
+writes:
+  - ".design/BRIEF.md (the <prior-research> block)"
+  - ".design/telemetry/design-arms.json (optional qualitative signal)"
+  - ".design/intel/insights.jsonl"
+---
+@reference/shared-preamble.md
+# user-research-synthesizer
+## Role
+Close the qualitative side of the outcome loop: turn pre-collected user-research into **brief-grade insights** the next cycle can act on. Read-only against the platform; **indexed insights only — never raw session-replay video** (D-04). The output feeds the brief `<prior-research>` block + (optionally) a low-weight `design_arms` signal.
+## PII guard — non-negotiable (D-05)
+**Every research payload passes through `scripts/lib/pseudonymize.cjs` BEFORE it enters ANY agent context, log, or event.** Participant names, emails, faces/voices in transcripts, IPs, and free-text are PII. The flow is **read → pseudonymize → reason** — never read → reason → redact. There is no path where a raw research payload reaches the model. A static CI test asserts this routing.
+## When invoked
+On demand (`/gdd:research-sync`) or auto-suggested when the verify cross-check finds `<prior-research>` data > 14 days old. Gate on a research source being `available` (per `connections/usertesting.md` / `connections/maze.md` / `connections/hotjar.md`); none → `research synthesis: no research source configured — skipped.` (degrade-to-noop).
+## Step 1 — Read (read-only) + pseudonymize
+Probe the configured source (ToolSearch MCP, else the platform API key env; injectable `fetchImpl` for hermetic tests). Pull indexed insights — test-report findings, task success/time, misclick rates, survey responses, heatmap aggregates. **Immediately** pipe every payload through `pseudonymize.cjs`:
+```bash
+node -e "const {pseudonymize}=require('./scripts/lib/pseudonymize.cjs'); process.stdout.write(pseudonymize(require('fs').readFileSync(0,'utf8')))" < raw-payload.json > safe-payload.json
+```
+Only `safe-payload.json` is ever read into reasoning.
+## Step 2 — Synthesize brief-grade findings
+From the pseudonymized payload, extract the top findings, each with:
+- **finding** — a one-line observation in user terms ("users miss the secondary CTA on mobile").
+- **frequency** — how many participants / sessions exhibited it.
+- **severity** — `critical | serious | minor` (blocks the task / slows it / cosmetic).
+Rank by `severity × frequency`. Keep the top N (default 7) — a brief is a focus list, not a transcript dump.
+## Step 3 — Write the `<prior-research>` block + optional signal
+Write the ranked findings into the brief's `<prior-research>` block (consumed by `skills/brief/SKILL.md` + checked at verify). When a finding maps cleanly to a tested design pattern, optionally fold a **low-weight** qualitative signal into `design_arms` (`observe(component, key, { won, source: 'research', weight: 0.5 })`) — research corroborates A/B, it does not outweigh it.
+## Record
+Emit a `## Research synthesis` summary: source, # reports read, the ranked findings (finding / frequency / severity), and confirmation that pseudonymize ran first. Emit a `research_synthesized` event (no PII). Close with:
+```
+## RESEARCH SYNTHESIS COMPLETE
+```
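The ranking rule in Step 2 of the agent file above (rank by `severity × frequency`, keep the top N with a default of 7) can be sketched as follows. The agent file does not say how severity is turned into a number, so the 3 / 2 / 1 weights here are an assumption, as is the `rankFindings` name.

```javascript
// Hypothetical sketch: the numeric severity weights are an assumption.
const SEVERITY_WEIGHT = { critical: 3, serious: 2, minor: 1 };

// Score a finding as severity weight times how often it was observed.
function score(finding) {
  return SEVERITY_WEIGHT[finding.severity] * finding.frequency;
}

// Return the top N findings by score, highest first, without mutating the input.
function rankFindings(findings, topN = 7) {
  return [...findings].sort((a, b) => score(b) - score(a)).slice(0, topN);
}
```

Under these weights a frequent minor issue can outrank a rare critical one; a different weighting would change that trade-off.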

package/connections/connections.md CHANGED

@@ -2,7 +2,7 @@
 This directory contains connection specifications for external tools and MCPs that the get-design-done pipeline integrates with. Each connection has its own spec file. This file is the index.
-**Getting started:** run `/gdd:connections` for the interactive onboarding wizard — it probes all 27 connections, recommends setup based on your project type, and walks you through installing each one (auto-run for reversible MCP adds, copy-command for everything else). You can also run `/gdd:connections list` for a read-only status check or `/gdd:connections <name>` to jump to a single connection's setup.
+**Getting started:** run `/gdd:connections` for the interactive onboarding wizard — it probes all 33 connections, recommends setup based on your project type, and walks you through installing each one (auto-run for reversible MCP adds, copy-command for everything else). You can also run `/gdd:connections list` for a read-only status check or `/gdd:connections <name>` to jump to a single connection's setup.
 ---
@@ -43,6 +43,12 @@ This directory contains connection specifications for external tools and MCPs th
 | v0.dev | Active | [`connections/v0-dev.md`](connections/v0-dev.md) | **AI-native** (Wave 2, generator) — Vercel v0; MCP-first → REST + `V0_API_KEY`; component-generator `v0` impl (37.1) |
 | Plasmic | Active | [`connections/plasmic.md`](connections/plasmic.md) | **AI-native** (Wave 2, dual) — canvas read + code emission; component-generator `plasmic` impl (37.1) |
 | Builder.io | Active | [`connections/builder-io.md`](connections/builder-io.md) | **AI-native** (Wave 2, generator) — Visual Copilot, pull-only this phase; component-generator `builder-io` impl (37.1) |
+| LaunchDarkly | Active | [`connections/launchdarkly.md`](connections/launchdarkly.md) | **Outcome** (experiment-source) — read-only A/B results (`LAUNCHDARKLY_API_KEY`/MCP); `experiment-result-ingester` → `design_arms`; `GDD_DISABLE_LAUNCHDARKLY`; degrade-to-noop (38) |
+| Statsig | Active | [`connections/statsig.md`](connections/statsig.md) | **Outcome** (experiment-source) — read-only experiment/pulse results (`STATSIG_API_KEY`/MCP); → `design_arms`; `GDD_DISABLE_STATSIG`; degrade-to-noop (38) |
+| GrowthBook | Active | [`connections/growthbook.md`](connections/growthbook.md) | **Outcome** (experiment-source) — read-only results (`GROWTHBOOK_API_KEY`, self-hosted/cloud, /MCP); → `design_arms`; `GDD_DISABLE_GROWTHBOOK`; degrade-to-noop (38) |
+| UserTesting | Active | [`connections/usertesting.md`](connections/usertesting.md) | **Outcome** (user-research) — read-only test reports; **pseudonymize-first** → `user-research-synthesizer` → brief `<prior-research>`; `GDD_DISABLE_USERTESTING`; degrade-to-noop (38) |
+| Maze | Active | [`connections/maze.md`](connections/maze.md) | **Outcome** (user-research) — read-only usability metrics; **pseudonymize-first** → `user-research-synthesizer`; `GDD_DISABLE_MAZE`; degrade-to-noop (38) |
+| Hotjar | Active | [`connections/hotjar.md`](connections/hotjar.md) | **Outcome** (user-research) — read-only indexed insights (no raw video); **pseudonymize-first** → `user-research-synthesizer`; `GDD_DISABLE_HOTJAR`; degrade-to-noop (38) |
 ---

package/connections/growthbook.md ADDED

@@ -0,0 +1,110 @@
+# GrowthBook — Connection Specification
+This file is the connection specification for GrowthBook within the get-design-done pipeline. It lives in `connections/` alongside other connection specs (see [`connections/slack.md`](slack.md) for the structural sibling — an API/env-based connection with a three-value probe and degrade-to-noop).
+---
+GrowthBook is an open-source **experiment-source** for the outcome-learning layer (Phase 38). GDD **reads** A/B experiment results from GrowthBook and feeds each variant→outcome into the `design_arms` posterior, so shipped design decisions get reinforced or discounted by what actually performed in production. GDD never runs, creates, edits, or stops experiments — it is strictly **read-only** (D-04). Reads degrade to a noop when unconfigured or disabled; outcome learning simply pauses and the pipeline never blocks.
+GrowthBook ships in two deployment shapes, and GDD supports both: a **cloud** account (the hosted GrowthBook service) or a **self-hosted** instance you operate. GDD connects to whichever you point it at; it does not install, bundle, or host GrowthBook (mirrors the self-hosted-or-cloud split in [`connections/slack.md`](slack.md)'s sibling canvas specs).
+---
+## Setup
+**Prerequisites:** read-only access to a GrowthBook project's experiment results — either a GrowthBook **API key** (a read-only / viewer-scoped token, not a full-access token) **or** the GrowthBook MCP if it is installed in your runtime.
+**Token (env, never committed):**
+```bash
+export GROWTHBOOK_API_KEY="<read-only-api-key>"
+```
+**Optional host (self-hosted only):**
+```bash
+export GROWTHBOOK_API_HOST="https://growthbook.internal.example.com"
+```
+`GROWTHBOOK_API_HOST` is the **distinguishing signal** between cloud and self-hosted. Leave it unset for the hosted GrowthBook cloud (GDD targets the default cloud host); set it to your instance origin when self-hosting. GDD records the resolved host so downstream stages know which deployment produced a result.
+Use the narrowest scope GrowthBook offers (read-only / viewer). The key is a credential — never commit it (not in source, not in `.env`, not in config), never log it, rotate if exposed. GDD reads it from env only and never requests a write scope.
+**Verification:**
+```bash
+test -n "${GROWTHBOOK_API_KEY}" && echo "growthbook key present" || echo "growthbook key absent"
+```
+---
+## Availability Probe
+Probe is **MCP-first**, env-fallback, kill-switch-aware:
+1. If `GDD_DISABLE_GROWTHBOOK=1` → short-circuit to `not_configured` (treated as disabled; never probe further).
+2. Run `ToolSearch({ query: "growthbook" })`. If a GrowthBook MCP tool resolves → `growthbook: available`.
+3. Else check the env key: `test -n "${GROWTHBOOK_API_KEY}"`.
+   - Non-empty → `growthbook: available`
+   - Empty → `growthbook: not_configured`
+4. Source present (MCP or key) but a read errored at fetch time → `growthbook: unavailable`.
+5. When the env path is used, classify the deployment from `GROWTHBOOK_API_HOST`: unset → `deployment=cloud`; any other host → `deployment=self-hosted`. (Via MCP only, with no host exported, `deployment=unknown`.)
+Write the `growthbook` status to `.design/STATE.md` `<connections>` after probing, using the **three-value schema** plus the deployment marker:
+```xml
+<connections>
+growthbook: not_configured
+</connections>
+```
+When available, record the resolved deployment alongside the status, e.g. `growthbook: available (deployment=self-hosted)` or `growthbook: available (deployment=cloud)`.
+| Value | Meaning |
+|---|---|
+| `available` | GrowthBook MCP resolves OR `GROWTHBOOK_API_KEY` set, AND not disabled |
+| `unavailable` | source present but a result read errored |
+| `not_configured` | no MCP and no `GROWTHBOOK_API_KEY`, or `GDD_DISABLE_GROWTHBOOK=1` |
+| Field | Values | Meaning |
+|---|---|---|
+| `deployment` | `cloud` / `self-hosted` / `unknown` | Derived from `GROWTHBOOK_API_HOST`; `unknown` when only the MCP path is present and no host is exported |
+The kill-switch `GDD_DISABLE_GROWTHBOOK=1` forces `not_configured` regardless of MCP/key presence (mirrors the Phase 30 / 35.1 disable convention). `gsd-health` surfaces the state.
+---
+## Pipeline Integration
+GrowthBook contributes the **experiment-source** capability. The flow is read-only and one-directional (results in, never experiments out):
+1. The probe marks `growthbook: available` (and its `deployment`) in `.design/STATE.md`.
+2. The experiment-result ingester ([`agents/experiment-result-ingester.md`](../agents/experiment-result-ingester.md)) reads completed A/B results from GrowthBook — each experiment's variant identifiers plus their measured metric outcomes (the conversion/lift figures GrowthBook computed for that experiment).
+3. It maps each variant to the matching `design_arms` arm and records the outcome (win / loss / lift) against that arm's posterior, so the next design decision is informed by production evidence rather than priors alone.
+4. For each mapped result it emits an `experiment_result` event into the pipeline's event stream for downstream learning and audit.
+A variant that does not map to a known `design_arms` arm is recorded as unmatched and skipped — it never invents an arm. The ingester reads results only; it issues no experiment-creation, assignment, or mutation calls against GrowthBook (D-04). It surfaces `deployment` so an operator can tell cloud results from self-hosted ones in the event trail, since the two deployments can carry independent experiment sets.
+**Injectable fetch (hermetic tests):** the ingester takes an injectable `fetchImpl` (defaulting to the resolved MCP tool or global `fetch`). Tests pass a stub `fetchImpl` so `npm test` exercises the variant→outcome mapping with no real egress — no live GrowthBook call in CI, and no dependence on either the cloud host or a self-hosted origin. There is **no bundled GrowthBook SDK and no new dependency**; reads go through the MCP tool or the injectable `fetchImpl`.
+**Scope — read vs. never:**
+| GDD reads (read-only) | GDD never does (D-04) |
+|---|---|
+| Completed experiment results: variant ids + metric outcomes | Create, start, stop, or archive an experiment |
+| Per-variant lift / win-loss figures GrowthBook computed | Assign users, change traffic splits, or edit targeting |
+| Experiment / variant identifiers for `design_arms` mapping | Write any flag, feature, or experiment definition back |
+Everything GDD touches in GrowthBook is a `GET`-equivalent read. There is no GrowthBook code path in GDD that mutates state, which is what lets a reader/viewer-scoped key suffice and keeps the connection safe to leave attached.
+---
+## Fallback Behavior
+`not_configured` (no MCP, no key) or disabled (`GDD_DISABLE_GROWTHBOOK=1`) → the experiment-source **degrades to a noop**: the ingester is skipped, no `experiment_result` events are emitted, and the `design_arms` posterior simply does not get the outcome update this cycle. Design decisions still ship — they just rely on prior evidence instead of fresh experiment results.
+A read failure when a source *is* present (cloud unreachable, self-hosted origin down, or an errored response) → `growthbook: unavailable`; that cycle's ingestion is skipped (no error surfaced to the pipeline) and retried on the next probe. The ingester returns a skipped/empty result and never throws, so outcome learning is best-effort and **never blocks the pipeline** (mirrors the notify degrade-to-noop in [`connections/slack.md`](slack.md)). Always re-probe at stage entry — both the access path and the resolved deployment can change between sessions.
+---
+Do NOT edit the connection index here — the 38 wiring plan adds the Active-Connections row + the experiment-source matrix column.
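The probe order in the Availability Probe section of the GrowthBook spec above (kill-switch first, then MCP, then the env key, with the deployment derived from `GROWTHBOOK_API_HOST`) can be sketched as a pure function. The `probeGrowthBook` name and its `mcpResolved` / `readErrored` inputs are hypothetical stand-ins for the `ToolSearch` result and a failed read. The spec does not say what the deployment should be when the MCP resolves and a host is also exported; treating that case as `self-hosted` is an assumption.

```javascript
// Hypothetical sketch of the probe decision; inputs are passed in so it stays testable.
function probeGrowthBook({ env = {}, mcpResolved = false, readErrored = false } = {}) {
  // 1. Kill-switch wins over everything else.
  if (env.GDD_DISABLE_GROWTHBOOK === '1') return { status: 'not_configured' };

  // 2-3. MCP first, then the env key; neither present means not configured.
  const hasKey = Boolean(env.GROWTHBOOK_API_KEY);
  if (!mcpResolved && !hasKey) return { status: 'not_configured' };

  // 4. A source exists; a failed read downgrades it to unavailable.
  const status = readErrored ? 'unavailable' : 'available';

  // 5. Classify the deployment from the optional host variable.
  const hasHost = Boolean(env.GROWTHBOOK_API_HOST);
  let deployment;
  if (mcpResolved) {
    // Assumption: with an MCP and an exported host, report self-hosted.
    deployment = hasHost ? 'self-hosted' : 'unknown';
  } else {
    deployment = hasHost ? 'self-hosted' : 'cloud';
  }
  return { status, deployment };
}
```

Passing `env` in rather than reading `process.env` directly keeps the sketch hermetic, in the same spirit as the injectable `fetchImpl` the spec describes.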

package/connections/hotjar.md ADDED

@@ -0,0 +1,110 @@
+# Hotjar — Connection Specification
+This file is the connection specification for Hotjar within the get-design-done pipeline. It lives in `connections/` alongside other connection specs. See the connection index for the full connection capability matrix (the hotjar row is added at the Phase 38 wiring closeout).
+---
+Hotjar is a **user-research source** for the discover/plan stages. GDD reads **indexed insights only** — heatmap aggregates, indexed session-insight summaries, and survey results — and feeds findings, as brief-grade prior research, into a phase brief. The connection is strictly **read-only**: GDD never writes to Hotjar, and it **never reads raw session-replay video** (or any per-visitor recording). It pulls only the pre-indexed, aggregate insight surface.
+Session data is among the most PII-sensitive inputs the pipeline can touch — a single replay or un-aggregated event can carry names, emails, typed form values, IPs, and on-screen personal data. **CRITICAL (D-05): every Hotjar payload MUST pass through `scripts/lib/pseudonymize.cjs` BEFORE it reaches any agent context.** Pseudonymization is mandatory, not optional, and it is the single chokepoint between Hotjar data and any model prompt. This mirrors the redact-before-egress discipline used for the notification surfaces ([`connections/slack.md`](slack.md)), but inverted: redaction guards *outbound*; here pseudonymization guards *inbound* — research data must be scrubbed before it enters a prompt.
+---
+## Setup
+**Prerequisites:** a Hotjar account with API access, and a **read-only** API token scoped to insight/aggregate endpoints only (no recording-export scope).
+**Token (env, never committed):**
+```bash
+export HOTJAR_API_KEY="<your-read-only-token>"
+```
+Scope the token to **indexed insights / heatmap aggregates / survey results only** — never grant raw-recording or session-export scope, even if Hotjar offers it. GDD has no code path that downloads recordings, and the token must not be able to either. The key is a credential: never commit it (not in source, not in `.env`, not in config), never log it, and rotate it if exposed. GDD reads it from env only.
+**Verification:**
+```bash
+test -n "${HOTJAR_API_KEY}" && echo "hotjar token present" || echo "hotjar token absent"
+```
+---
+## Availability Probe
+Hotjar may be reached either through an MCP (if one is registered) or directly via its HTTP API with the env token. Probe **MCP-first**, then fall back to the env check.
+**Step H1 — MCP presence (preferred):**
+```
+ToolSearch({ query: "hotjar", max_results: 10 })
+```
+- Non-empty result → an MCP is registered → `hotjar: available`
+- Empty result → fall through to Step H2
+**Step H2 — token presence:**
+```bash
+test -n "${HOTJAR_API_KEY}"
+```
+- Non-empty → `hotjar: available`
+- Empty → `hotjar: not_configured`
+- Present (MCP or token) but a live insight fetch errored → `hotjar: unavailable`
+**Kill-switch:** Hotjar is forced to a noop when `GDD_DISABLE_HOTJAR=1` (env), regardless of MCP/token presence — the probe short-circuits to `not_configured` and no fetch is attempted. `gdd-health` surfaces the state (mirrors the Phase 30 / 35.x kill-switch pattern).
+**Write the `hotjar` status to `.design/STATE.md` `<connections>` after probing.** Three-value schema:
+| Value | Meaning |
+|---|---|
+| `available` | MCP registered, OR `HOTJAR_API_KEY` set — and not disabled |
+| `unavailable` | MCP/token present but a live insight fetch errored |
+| `not_configured` | no MCP and no `HOTJAR_API_KEY` (or `GDD_DISABLE_HOTJAR=1`) |
+```xml
+<connections>
+hotjar: not_configured
+</connections>
+```
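The probe order and the three-value schema above reduce to a small pure function. In this sketch `mcpTools` stands in for the `ToolSearch` result and `liveFetchOk` for the outcome of a live insight fetch; both parameter names and the function itself are assumptions made for illustration.

```javascript
// Hypothetical probe: mcpTools stands in for the ToolSearch result,
// liveFetchOk for the outcome of a live insight fetch (undefined = not tried).
function probeHotjar({ env = process.env, mcpTools = [], liveFetchOk } = {}) {
  // Kill-switch wins over everything else and short-circuits the probe.
  if (env.GDD_DISABLE_HOTJAR === '1') return 'not_configured';
  // Step H1 (MCP presence) first, then Step H2 (token presence).
  const hasSource = mcpTools.length > 0 || Boolean(env.HOTJAR_API_KEY);
  if (!hasSource) return 'not_configured';
  // A source exists but the live fetch errored.
  if (liveFetchOk === false) return 'unavailable';
  return 'available';
}
```

For example, `probeHotjar({ env: { HOTJAR_API_KEY: 'k', GDD_DISABLE_HOTJAR: '1' } })` yields `not_configured`, matching the kill-switch row of the table.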
+---
+## Pipeline Integration
+Hotjar feeds the **user-research** lane of discover/plan. The flow is strictly ordered, and the pseudonymize step is non-negotiable and comes **first**:
+1. **Fetch (read-only):** pull heatmap aggregates, indexed session-insight summaries, and survey results for the relevant page/flow. Aggregates only — never a raw recording.
+2. **Pseudonymize FIRST:** pass every fetched payload through `scripts/lib/pseudonymize.cjs` *before anything else touches it*. The scrubbed `{ payload, replacements }` is the only form allowed downstream. Nothing — no agent, no log, no event — sees the pre-scrub data.
+3. **Synthesize:** hand the pseudonymized payload to the `user-research-synthesizer` agent, which distills it into brief-grade insights (top friction points, drop-off zones, survey themes) without re-introducing any identifier.
+4. **Inject:** the synthesized, brief-grade insights land in the phase brief's `<prior-research>` block, where the plan stage reads them as prior evidence for design decisions.
+Stage flow: `heatmap / insight / survey aggregates → pseudonymize.cjs (FIRST) → user-research-synthesizer → brief-grade insights → brief <prior-research> block`.
+Adjacent methodology (sample sizing, heatmap/survey interpretation, over-claim guards) lives in the user-research reference doc and governs how the synthesizer reads these aggregates.
+The fetch path POSTs/GETs via an **injectable `fetchImpl`** (defaulting to the global `fetch`), so the test suite drives it with synthetic insight fixtures hermetically — no live Hotjar, no network. There is **no new dependency**: no `@hotjar/*` package, no SDK; just `fetch` + the existing pseudonymize primitive.
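The four-step ordering above can be sketched as one function whose collaborators are all injected. The function name and the collaborator signatures are assumptions; the only shape taken from the spec is that the pseudonymize step produces `{ payload, replacements }`.

```javascript
// Hypothetical orchestration of the four steps. All three collaborators are
// injected; their signatures are assumptions, except that pseudonymize is
// described in the spec as producing { payload, replacements }.
async function buildPriorResearch({ fetchInsights, pseudonymize, synthesize }) {
  const raw = await fetchInsights();          // 1. fetch (aggregates only)
  const { payload } = pseudonymize(raw);      // 2. scrub before anything else
  const insights = await synthesize(payload); // 3. synthesizer sees scrubbed data only
  return { priorResearch: insights };         // 4. content for the <prior-research> block
}
```

The raw payload never leaves the function's local scope in this sketch, which is one way to make the "pseudonymize first" rule hard to bypass by accident.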
+---
+## Fallback Behavior
+Hotjar is an **enhancement, never a hard requirement** (D-03). When `hotjar: not_configured`, `hotjar: unavailable`, or the kill-switch is on, the user-research lane **degrades to a noop**: the `<prior-research>` block is simply built without Hotjar-sourced signals (other research sources, if any, still contribute), and the pipeline continues. Discover/plan never block on Hotjar availability or on a fetch failure.
+A failed or skipped fetch returns a benign skipped result and never throws. The synthesizer treats absent Hotjar input as "source: missing" and proceeds — same graceful-degradation contract the other optional connections use.
+---
+## PII + Privacy
+Session-research data is highly PII-sensitive. This section is binding, not advisory.
+- **Pseudonymize before context (mandatory, D-05):** every Hotjar payload passes through `scripts/lib/pseudonymize.cjs` **before** it reaches any agent prompt, the synthesizer, or any persisted artifact. There is no bypass path; pseudonymization is the single inbound chokepoint. (Note this is *pseudonymization, not anonymization* — identity correlation is reduced, not eliminated.)
+- **Aggregates, not raw sessions:** GDD reads only indexed insights, heatmap aggregates, and survey results. It **never** fetches, stores, or forwards raw session-replay video or per-visitor recordings — there is no code path that can, and the token must not be scoped to allow it.
+- **No PII in logs or events:** the pseudonymized payload is what flows downstream; the pre-scrub payload is never written to logs, never emitted in pipeline events, and never persisted. The `HOTJAR_API_KEY` is likewise never logged.
+- **Least scope:** prefer the narrowest read-only token Hotjar offers; if recording-export scope cannot be excluded at the token, treat that token as unsafe for this connection.
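To make the pseudonymization idea concrete, here is an illustrative scrubber for one identifier type. It is not the implementation of `scripts/lib/pseudonymize.cjs`, whose internals are not shown in this spec; the function name, the token format, and the contents of `replacements` are assumptions, and only the `{ payload, replacements }` return shape follows the text above.

```javascript
const { createHash } = require('node:crypto');

// Illustrative only: replaces email addresses with a stable token derived
// from a hash, so the same address always maps to the same placeholder.
function pseudonymizeEmails(text) {
  const replacements = [];
  const payload = text.replace(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g, (email) => {
    const digest = createHash('sha256').update(email.toLowerCase()).digest('hex');
    const token = `<email:${digest.slice(0, 8)}>`;
    // Record only the token, never the original value.
    replacements.push({ kind: 'email', token });
    return token;
  });
  return { payload, replacements };
}
```

A stable token keeps repeated mentions correlated without exposing the address, which is the "pseudonymization, not anonymization" trade-off noted above. An unkeyed hash is guessable for known addresses, so a real scrubber would likely use a keyed or salted scheme.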
+---
+Do NOT edit the connection index here — the Phase 38 wiring plan adds the Active-Connections row + the user-research matrix column.

package/connections/launchdarkly.md ADDED

@@ -0,0 +1,83 @@
+# LaunchDarkly — Connection Specification
+This file is the connection specification for LaunchDarkly within the get-design-done pipeline. It lives in `connections/` alongside other connection specs (see [`connections/slack.md`](slack.md) for the structural sibling — an API/env-based connection with a three-value probe and degrade-to-noop).
+---
+LaunchDarkly is an **experiment-source** for the outcome-learning layer (Phase 38). GDD **reads** A/B experiment results from LaunchDarkly and feeds each variant→outcome into the `design_arms` posterior, so shipped design decisions get reinforced or discounted by what actually performed in production. GDD never runs, creates, edits, or stops experiments — it is strictly **read-only** (D-04). Reads degrade to a noop when unconfigured or disabled; outcome learning simply pauses and the pipeline never blocks.
+---
+## Setup
+**Prerequisites:** read-only access to a LaunchDarkly project's experiment results through one of: a LaunchDarkly **API key** (a reader/viewer-scoped token, not a writer token), an SDK key, or the LaunchDarkly MCP if it is installed in your runtime.
+**Token (env, never committed):**
+```bash
+export LAUNCHDARKLY_API_KEY="<reader-scoped-api-or-sdk-key>"
+```
+Use the narrowest scope LaunchDarkly offers (reader/viewer). The key is a credential — never commit it (not in source, not in `.env`, not in config), never log it, rotate if exposed. GDD reads it from env only and never requests a write scope.
+**Verification:**
+```bash
+test -n "${LAUNCHDARKLY_API_KEY}" && echo "launchdarkly key present" || echo "launchdarkly key absent"
+```
+---
+## Availability Probe
+Probe is **MCP-first**, env-fallback, kill-switch-aware:
+1. If `GDD_DISABLE_LAUNCHDARKLY=1` → short-circuit to `not_configured` (treated as disabled; never probe further).
+2. Run `ToolSearch({ query: "launchdarkly" })`. If a LaunchDarkly MCP tool resolves → `launchdarkly: available`.
+3. Else check the env key: `test -n "${LAUNCHDARKLY_API_KEY}"`.
+   - Non-empty → `launchdarkly: available`
+   - Empty → `launchdarkly: not_configured`
+4. Source present (MCP or key) but a read errored at fetch time → `launchdarkly: unavailable`.
+Write the `launchdarkly` status to `.design/STATE.md` `<connections>` after probing:
+```xml
+<connections>
+launchdarkly: not_configured
+</connections>
+```
+| Value | Meaning |
+|---|---|
+| `available` | LaunchDarkly MCP resolves OR `LAUNCHDARKLY_API_KEY` set, AND not disabled |
+| `unavailable` | source present but a result read errored |
+| `not_configured` | no MCP and no `LAUNCHDARKLY_API_KEY`, or `GDD_DISABLE_LAUNCHDARKLY=1` |
+The kill-switch `GDD_DISABLE_LAUNCHDARKLY=1` forces `not_configured` regardless of MCP/key presence (mirrors the Phase 30 / 35.1 disable convention). `gdd-health` surfaces the state.
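Recording the probe result in the `<connections>` block amounts to an upsert of one `name: status` line. The sketch below is a pure string transformation with a hypothetical name; it performs no file I/O or locking, and it assumes the block uses the one-entry-per-line layout shown above.

```javascript
// Hypothetical pure helper: returns STATE.md text with `name: status` set
// inside the <connections> block. It does no file I/O or locking.
function setConnectionStatus(stateText, name, status) {
  const allowed = ['available', 'unavailable', 'not_configured'];
  if (!allowed.includes(status)) throw new Error(`invalid status: ${status}`);
  const block = /<connections>\n([\s\S]*?)<\/connections>/;
  const match = stateText.match(block);
  if (!match) {
    // No block yet: append a fresh one.
    return `${stateText}\n<connections>\n${name}: ${status}\n</connections>\n`;
  }
  // Drop blank lines and any previous entry for this connection, then re-add it.
  const lines = match[1]
    .split('\n')
    .filter((line) => line.trim() !== '' && !line.startsWith(`${name}:`));
  lines.push(`${name}: ${status}`);
  return stateText.replace(block, () => `<connections>\n${lines.join('\n')}\n</connections>`);
}
```

Rejecting values outside the three-value schema keeps a typo from being persisted as a fourth state. Note that the updated entry moves to the end of the block in this sketch.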
+---
+## Pipeline Integration
+LaunchDarkly contributes the **experiment-source** capability. The flow is read-only and one-directional (results in, never experiments out):
+1. The probe marks `launchdarkly: available` in `.design/STATE.md`.
+2. The experiment-result ingester (`agents/experiment-result-ingester.md`) reads completed A/B results from LaunchDarkly — variant identifiers plus their measured metric outcomes.
+3. It maps each variant to the matching `design_arms` arm and records the outcome (win / loss / lift) against that arm's posterior, so the next design decision is informed by production evidence.
+4. For each mapped result it emits an `experiment_result` event into the pipeline's event stream for downstream learning and audit.
+The ingester reads results only; it issues no experiment-creation, assignment, or mutation calls against LaunchDarkly (D-04).
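Steps 3 and 4 above can be sketched as a single pass over the results. Everything about the data shapes here is an assumption: the arm posterior is modeled as Beta counts `{ alpha, beta }`, results carry only `win` or `loss` (lift is left out), and the event fields are illustrative rather than the package's actual `experiment_result` schema.

```javascript
// Hypothetical shapes: arms is a Map of armId -> { alpha, beta } (a Beta
// posterior), variantToArm is a Map of variantId -> armId, and each result
// is { variantId, outcome: 'win' | 'loss' }.
function applyResults(arms, variantToArm, results) {
  const events = [];
  for (const { variantId, outcome } of results) {
    const armId = variantToArm.get(variantId);
    const arm = armId === undefined ? undefined : arms.get(armId);
    if (!arm) continue; // unmapped variants are ignored, not guessed
    if (outcome === 'win') arm.alpha += 1;
    else if (outcome === 'loss') arm.beta += 1;
    else continue; // unknown outcomes leave the posterior untouched
    events.push({ type: 'experiment_result', armId, variantId, outcome });
  }
  return events;
}
```

Returning the events instead of emitting them directly keeps the mapping testable without an event stream.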
+**Injectable fetch (hermetic tests):** the ingester takes an injectable `fetchImpl` (defaulting to the resolved MCP tool or global `fetch`). Tests pass a stub `fetchImpl` so `npm test` exercises the variant→outcome mapping with no real egress — no live LaunchDarkly call in CI. There is **no bundled LaunchDarkly SDK and no new dependency**; reads go through the MCP tool or the injectable `fetchImpl`.
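A hermetic test of this kind might look like the following sketch. The reader function, the URL, and the response shape are placeholders chosen for illustration and do not describe the LaunchDarkly API or the ingester's real signature; only the idea of defaulting `fetchImpl` to the global `fetch` comes from the text above.

```javascript
// Hypothetical reader: the URL and response shape are placeholders, not the
// LaunchDarkly API. fetchImpl defaults to the global fetch.
async function readExperimentResults({ url, fetchImpl = globalThis.fetch }) {
  const res = await fetchImpl(url, { method: 'GET' });
  if (!res.ok) throw new Error(`read failed with status ${res.status}`);
  const body = await res.json();
  return body.variants.map((v) => ({ variantId: v.id, outcome: v.outcome }));
}

// A stub that satisfies the same contract without touching the network.
function stubFetch(fixture) {
  return async () => ({ ok: true, status: 200, json: async () => fixture });
}
```

A test then calls `readExperimentResults({ url, fetchImpl: stubFetch(fixture) })` and asserts on the mapped variants, so CI never needs a key or network access.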
+---
+## Fallback Behavior
+`not_configured` (no MCP, no key) or disabled (`GDD_DISABLE_LAUNCHDARKLY=1`) → the experiment-source **degrades to a noop**: the ingester is skipped, no `experiment_result` events are emitted, and the `design_arms` posterior simply does not get the outcome update this cycle. Design decisions still ship — they just rely on prior evidence instead of fresh experiment results.
+A read failure when a source *is* present → `launchdarkly: unavailable`; that cycle's ingestion is skipped (no error surfaced to the pipeline) and retried on the next probe. The ingester returns a skipped/empty result and never throws, so outcome learning is best-effort and **never blocks the pipeline** (mirrors the notify degrade-to-noop in [`connections/slack.md`](slack.md)).
+---
+Do NOT edit the connection index here — the Phase 38 wiring plan adds the Active-Connections row + the experiment-source matrix column.