npm - @event4u/agent-config - Versions diffs - 2.7.0 → 2.9.0 - Mend

@event4u/agent-config 2.7.0 → 2.9.0

Files changed (76)

package/.agent-src/personas/cmo.md +122 -0
package/.agent-src/personas/customer-success-lead.md +126 -0
package/.agent-src/personas/engineering-manager.md +133 -0
package/.agent-src/personas/finance-partner.md +129 -0
package/.agent-src/personas/growth-pm.md +134 -0
package/.agent-src/personas/people-strategist.md +126 -0
package/.agent-src/personas/revops.md +125 -0
package/.agent-src/personas/strategist.md +129 -0
package/.agent-src/skills/activation-design/SKILL.md +160 -0
package/.agent-src/skills/build-buy-partner/SKILL.md +145 -0
package/.agent-src/skills/churn-prevention/SKILL.md +156 -0
package/.agent-src/skills/comp-banding/SKILL.md +160 -0
package/.agent-src/skills/competitive-moat-analysis/SKILL.md +152 -0
package/.agent-src/skills/content-funnel-design/SKILL.md +170 -0
package/.agent-src/skills/contracts-cognition/SKILL.md +147 -0
package/.agent-src/skills/data-handling-judgment/SKILL.md +155 -0
package/.agent-src/skills/deal-qualification-meddic/SKILL.md +165 -0
package/.agent-src/skills/editorial-calendar/SKILL.md +161 -0
package/.agent-src/skills/expansion-playbook/SKILL.md +171 -0
package/.agent-src/skills/forecast-accuracy/SKILL.md +157 -0
package/.agent-src/skills/forecasting/SKILL.md +164 -0
package/.agent-src/skills/fundraising-narrative/SKILL.md +189 -0
package/.agent-src/skills/funnel-analysis/SKILL.md +26 -2
package/.agent-src/skills/gtm-launch/SKILL.md +165 -0
package/.agent-src/skills/hiring-loop-design/SKILL.md +167 -0
package/.agent-src/skills/market-entry-analysis/SKILL.md +144 -0
package/.agent-src/skills/messaging-architecture/SKILL.md +184 -0
package/.agent-src/skills/onboarding-design/SKILL.md +158 -0
package/.agent-src/skills/onboarding-program/SKILL.md +157 -0
package/.agent-src/skills/one-on-one-cadence/SKILL.md +161 -0
package/.agent-src/skills/org-design/SKILL.md +158 -0
package/.agent-src/skills/perf-feedback-craft/SKILL.md +157 -0
package/.agent-src/skills/pipeline-strategy/SKILL.md +159 -0
package/.agent-src/skills/positioning-strategy/SKILL.md +177 -0
package/.agent-src/skills/privacy-review/SKILL.md +160 -0
package/.agent-src/skills/retention-loops/SKILL.md +161 -0
package/.agent-src/skills/runway-cognition/SKILL.md +136 -0
package/.agent-src/skills/scenario-modeling/SKILL.md +139 -0
package/.agent-src/skills/subagent-orchestration/SKILL.md +1 -1
package/.agent-src/skills/throughput-vs-morale-tradeoff/SKILL.md +165 -0
package/.agent-src/skills/unit-economics-modeling/SKILL.md +54 -7
package/.agent-src/skills/vision-articulation/SKILL.md +146 -0
package/.agent-src/skills/voice-and-tone-design/SKILL.md +163 -0
package/.agent-src/templates/agents/agent-project-settings.example.yml +1 -1
package/.agent-src/templates/scripts/telemetry/settings.py +65 -0
package/.agent-src/templates/scripts/tier_usage_report.py +183 -0
package/.claude-plugin/marketplace.json +34 -2
package/AGENTS.md +1 -1
package/CHANGELOG.md +135 -153
package/README.md +3 -3
package/docs/architecture.md +37 -11
package/docs/archive/CHANGELOG-pre-2.7.0.md +185 -0
package/docs/catalog.md +38 -4
package/docs/contracts/adr-forecast-construction-shape.md +89 -0
package/docs/contracts/adr-gtm-context-spine.md +115 -0
package/docs/contracts/adr-wing4-context-spine.md +125 -0
package/docs/contracts/command-clusters.md +41 -0
package/docs/contracts/command-surface-tiers.md +30 -9
package/docs/contracts/context-spine.md +58 -12
package/docs/contracts/cross-wing-handoff.md +3 -3
package/docs/contracts/mcp-beta-criteria.md +129 -0
package/docs/contracts/persona-schema.md +20 -3
package/docs/guidelines/gtm-handoff.md +114 -0
package/docs/guidelines/wing4-handoff.md +127 -0
package/docs/mcp-server.md +1 -1
package/package.json +1 -1
package/scripts/_cli/cmd_doctor.py +527 -14
package/scripts/_cli/cmd_validate.py +10 -0
package/scripts/agent-config +19 -18
package/scripts/install.py +5 -0
package/scripts/lint_context_spine_usage.py +5 -1
package/scripts/mcp_server/__init__.py +1 -0
package/scripts/mcp_server/server.py +4 -3
package/scripts/schemas/persona.schema.json +5 -0
package/scripts/schemas/skill.schema.json +2 -2
package/scripts/skill_linter.py +284 -6

package/.agent-src/skills/fundraising-narrative/SKILL.md ADDED

@@ -0,0 +1,189 @@
+---
+name: fundraising-narrative
+description: "Use when shaping a capital-raise pitch — why-now / why-us / why-this framing, market-size reasoning, traction-story construction. Triggers on 'tighten the pitch', 'why-now is weak'."
+status: active
+tier: senior
+source: package
+domain: product
+context_spine: [product, customer-segment]
+---
+# fundraising-narrative
+## When to use
+- A founder is preparing a capital-raise pitch and the why-now is borrowed from a deck template instead of earned from the segment-shift the team actually rides.
+- A deck is landing as "interesting but not now" with investors and the team needs to diagnose whether the gap is *why-now*, *why-us*, or *why-this*.
+- A traction story is being built from screenshots instead of from a coherent leading-indicator arc that explains why the next stage is reachable.
+Do NOT use to manage the investor-CRM pipeline (out of scope), run
+the data-room (out of scope), or draft the internal vision anchor
+the org rallies behind (route to Wing-4 `vision-articulation` — the
+external pitch under capital constraint and the internal anchor
+are siblings, not the same artefact).
+## Cognition cluster
+- **Mental model 1 — First-principles thinking.** *Why-now* is
+  the load-bearing claim and the one most often borrowed. Build
+  it from the segment-shift up — what changed in the world, the
+  customer, the technology — not from a deck template. See
+  [`docs/contracts/mental-models.md`](../../../docs/contracts/mental-models.md) § 1.
+- **Mental model 9 — Hypothesis-driven development.** A pitch is
+  a falsifiable hypothesis: *if X is true, our round closes.*
+  Name the X. Investors who disagree with the hypothesis are not
+  rejecting taste — they are rejecting the falsifiable claim. See
+  `mental-models.md` § 9.
+- **Mental model 16 — Leading vs. lagging indicators.** Revenue
+  is lagging; activation, retention curve, and qualified-pipeline
+  velocity are leading. The traction story leads with the leading
+  signals; revenue is the receipt, not the argument. See
+  `mental-models.md` § 16.
+- **Mental model 30 — Inversion.** Run the round-failure
+  premortem before the deck locks: *which investor heard what we
+  did not say.* Inversion surfaces the claim the deck assumes the
+  room already shares and probably does not. See
+  `mental-models.md` § 30.
+- **Context-spine — product + customer-segment.** Read **product**
+  for the proofs the traction story can actually back; read
+  **customer-segment** for the TAM/SAM argument that survives a
+  bottom-up scrutiny. See
+  [`context-spine`](../../../docs/contracts/context-spine.md).
+## Procedure
+### Step 0: Inherit the positioning frame and vision anchor
+Identify the locked positioning anchors from
+[`positioning-strategy`](../positioning-strategy/SKILL.md) and the internal vision
+anchor from `vision-articulation` if it exists. The fundraising
+narrative is the *external pitch under capital constraint*; it
+inherits the internal frame, it does not re-invent it. A pitch
+that contradicts the internal anchor will fracture on the first
+hire after the round closes.
+### Step 1: Analyze the inherited why-now
+Read the current why-now claim. Three checks: *is this a market
+shift, a customer shift, or a technology shift?*  *Did the shift
+happen in the last 24 months?*  *Would the segment recognise the
+shift without prompting?* A why-now that fails two of three is
+template-borrowed. Name what the inherited deck is leaning on.
+### Step 2: Build why-now from first principles
+Strip the inherited claim. Rebuild from the segment-shift up:
+- **Market shift.** What changed in the buyer's environment that
+  was not true 24 months ago? (Regulation, budget cycle, channel
+  collapse, competitive exit.)
+- **Customer shift.** What changed in how the ICP measures the
+  problem? (New KPI, new buying committee, new procurement gate.)
+- **Technology shift.** What is feasible now that was not? (Cost
+  curve, model capability, infrastructure unlock.)
+Pick the *one* shift the segment would name without prompting.
+That is the why-now spine. The others are supporting context.
+### Step 3: Construct why-us under capital constraint
+Why-us is *unfair advantage under the next 18 months of capital*,
+not credentials. Three anchors:
+- **Earned access.** The audience the team can already reach that
+  the next funded peer cannot.
+- **Earned proof.** The reference customer or load-bearing
+  retention curve the team owns now.
+- **Capital fit.** What the round buys that competitors cannot buy
+  in the same window. *"More engineers"* is not capital fit;
+  *"distribution lead-time the round protects"* is.
+### Step 4: Build the traction story from leading indicators
+Order the traction story leading-first:
+1. **Activation curve.** Time-to-first-value trend across cohorts.
+2. **Retention curve.** Cohort retention at the load-bearing
+   milestone, ideally non-trivial — not week 1.
+3. **Pipeline velocity.** Qualified-pipeline movement, not raw
+   pipeline volume.
+4. **Revenue.** The receipt, last in the sequence — not first.
+A traction story that opens on revenue assumes the room already
+believes the leading signals; the deck must earn that belief.
+### Step 5: Validate against the round-failure premortem
+Validate the narrative on three checks:
+1. **Premortem coverage.** Run *"the round did not close because…"*
+   with five failure modes. Verify the deck explicitly neutralises
+   the top three or accepts them with mitigation; unnamed failure
+   modes are silent rejection routes.
+2. **Falsifiable hypothesis.** Confirm the pitch takes the form
+   *"if X, then our round closes."* A pitch that cannot be
+   disagreed with is also a pitch that cannot be agreed with.
+3. **Internal-external consistency.** Diff the external pitch
+   against the internal vision anchor. Contradictions kill the
+   first post-round hiring round; name them now.
+### Step 6: Hand back
+Hand the artefacts to the founder for delivery, to
+`messaging-architecture` for the post-round message-stack refresh
+(why-now often shifts the primary message), and to
+`vision-articulation` (Wing-4) for the internal-anchor diff if
+contradictions surfaced.
+## Related Skills
+**WHEN to use this**
+- The unit of work is the why-now / why-us / why-this triad under capital constraint, not a single deck slide.
+- A diagnosed pitch gap needs a structured rebuild, not slide-polish.
+- The traction story is being built screenshot-first; reorder it leading-first.
+**WHEN NOT to use this**
+- Internal vision-anchor authoring for org alignment — route to Wing-4 `vision-articulation`.
+- Message-stack work post-round — route to [`messaging-architecture`](../messaging-architecture/SKILL.md).
+- Positioning the category and segment — route to [`positioning-strategy`](../positioning-strategy/SKILL.md) first.
+- Investor-CRM pipeline or data-room operations — out of scope.
+## When the agent should load this
+- "Tighten the why-now for the seed round."
+- "Bau mir die Traction-Story für den Pitch."
+- "Investors keep saying 'interesting but not now' — diagnose."
+- "Run the round-failure premortem on the deck."
+- "Why-us reads as a credentials list — rebuild under capital constraint."
+## Output
+1. **`why-now-spine.md`** — the one market / customer / technology shift the segment names without prompting, with the 24-month evidence trail.
+2. **`why-us-anchors.md`** — earned-access · earned-proof · capital-fit, each with a load-bearing artefact citation.
+3. **`traction-arc.md`** — activation → retention → pipeline-velocity → revenue, leading-first ordering with the leading-indicator threshold per step.
+4. **`round-failure-premortem.md`** — five failure modes with neutraliser-or-accept verdict, internal-external consistency diff appended.
+## Gotcha
+- Why-now is the most-borrowed claim in pitches because it is the hardest to earn from first principles — the room can tell.
+- Capital-fit collapses to *"hire more"* when the team has not thought through what the round protects from competitors; protect-language is the discipline.
+- Internal-external contradictions read as charm in the room and as betrayal at the post-round all-hands.
+## Do NOT
+- Do NOT carry the internal vision anchor verbatim into the pitch — internal anchor is rally; external pitch is hypothesis under capital constraint.
+- Do NOT lead the traction story with revenue when the leading signals are the actual argument.
+- Do NOT make the why-now a template-shaped *"AI changes everything"* — the segment will know.
+- Do NOT manage CRM or data-room operations from this skill; out of scope.
+## Runnable example
+Mid-market HR analytics tool raising Series A, positioning locked (retention beats acquisition):
+- Why-now spine — *customer shift*: HR directors now own a board-quarter retention KPI (was true on 30 % of ICP boards 24 months ago, now 70 %; verified via 14 ICP board-decks reviewed).
+- Why-us anchors — *earned access*: 200-strong HR-director community already engaged. *Earned proof*: cohort-retention curve at week-12 holding at 78 % across 9 design-partner cohorts. *Capital-fit*: round protects 18 months of distribution lead-time before two funded peers reach the same segment.
+- Traction arc — activation (time-to-first-cohort-roll-up: 14 → 6 days across last 4 cohorts) → retention (78 % week-12 cohort) → pipeline-velocity (qualified-pipeline movement at 2.4× quarter-on-quarter) → revenue (the receipt).
+- Round-failure premortem — top three failure modes neutralised in deck; one accepted-with-mitigation (we are pre-revenue at enterprise tier — mitigated by 3 named pilot LOIs).
+- Hand-off → founder for delivery; `messaging-architecture` queued for post-round refresh.
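
Steps 1 and 4 of the `fundraising-narrative` skill above lend themselves to a small sketch. The snippet below is illustrative only and is not part of the package: the class name, field names, and function names are hypothetical, and "fails two of three" is read here as "fails at least two", which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WhyNowCheck:
    """Answers to the three Step 1 questions (field names are hypothetical)."""
    is_named_shift: bool         # is it a market, customer, or technology shift?
    within_24_months: bool       # did the shift happen in the last 24 months?
    segment_recognises_it: bool  # would the segment recognise it without prompting?


def is_template_borrowed(check: WhyNowCheck) -> bool:
    """Step 1 rule: a why-now that fails two of the three checks is template-borrowed.

    Assumption: "fails two of three" is treated as "fails at least two".
    """
    failures = [
        not check.is_named_shift,
        not check.within_24_months,
        not check.segment_recognises_it,
    ].count(True)
    return failures >= 2


# Step 4 ordering: leading indicators first, revenue (the receipt) last.
TRACTION_ORDER = ["activation", "retention", "pipeline-velocity", "revenue"]


def is_leading_first(sections: List[str]) -> bool:
    """Return True when the known traction sections appear in leading-first order."""
    known = [s for s in sections if s in TRACTION_ORDER]
    return known == sorted(known, key=TRACTION_ORDER.index)


if __name__ == "__main__":
    print(is_template_borrowed(WhyNowCheck(True, False, False)))  # True
    print(is_leading_first(["revenue", "activation"]))            # False
```

A deck outline that opens on revenue fails `is_leading_first`, which mirrors the skill's point that revenue is the receipt rather than the argument.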

package/.agent-src/skills/funnel-analysis/SKILL.md CHANGED

@@ -5,6 +5,7 @@ status: active
 tier: senior
 source: package
 domain: product
+context_spine: [product, customer-segment, funnel-stage]
 ---
 # funnel-analysis
@@ -17,6 +18,29 @@ domain: product
 Do NOT use for ranking features, valuation, or OKR decomposition (see Related Skills). Funnel analysis is a **diagnostic**, not a roadmap.
+## Cognition cluster
+- **Mental model 16 — Leading vs. lagging indicators.** Paid is
+  lagging; activation is leading; signup is upstream of both. A
+  funnel decision built on the lagging stage can only confirm the
+  miss; the leading stage names the binding fix. See
+  [`docs/contracts/mental-models.md`](../../../docs/contracts/mental-models.md) § 16.
+- **Mental model 13 — Occam's razor.** When a stage drops, the
+  simpler explanation usually wins: *"acquisition mix shifted"*
+  beats *"users no longer understand the product."* Pick the simpler
+  cause; it changes the move. See `mental-models.md` § 13.
+- **Mental model 3 — Pareto (80/20).** Drops are almost never
+  uniform across segments; ~20 % of the segment × stage cells carry
+  ~80 % of the loss. Segment before treating the average as
+  actionable. See `mental-models.md` § 3.
+- **Context-spine — product + customer-segment + funnel-stage.**
+  Read the **product** slot for what activation can actually mean
+  in-product (the activation event must be shippable), the
+  **customer-segment** slot for which segments' switch-events the
+  funnel is built for, and the **funnel-stage** slot for the
+  position of each stage relative to the buying journey. See
+  [`context-spine`](../../../docs/contracts/context-spine.md).
 ## Procedure
 ### Step 0: Inspect
@@ -40,7 +64,7 @@ Do NOT use for ranking features, valuation, or OKR decomposition (see Related Sk
 1. The right benchmark is **your own funnel one quarter ago**, not industry averages. Industry averages mix verticals so coarsely they're useless for action.
 2. For each stage: is current rate within ±2 percentage points of trailing-quarter median? If not, that stage is the primary suspect.
-3. If multiple stages drift simultaneously, the cause is upstream (acquisition mix change, broken instrumentation), not the stage itself.
+3. If multiple stages move off-band simultaneously, the cause is upstream (acquisition mix change, broken instrumentation), not the stage itself.
 ### Step 4: Segment the broken stage
@@ -98,4 +122,4 @@ Do NOT use for ranking features, valuation, or OKR decomposition (see Related Sk
 1. **`funnel-table.md`** — 5-stage funnel with cohort rates, 95% CI, and 12-week trend (sparkline or compact ASCII). One row per cohort week or month.
 2. **`segment-breakdown.md`** — table of the broken stage segmented by channel · device · plan · geo. Rates with CIs. Suspect segments highlighted.
-3. **`hypothesis-list.md`** — top 3 → for the broken segment-stage with cheapest-falsification experiment per cause and an explicit prediction for the next measurement.
+3. **`hypothesis-list.md`** — top 3 causes for the broken segment-stage with cheapest-falsification experiment per cause and an explicit prediction for the next measurement.
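
The benchmark rule in the `funnel-analysis` hunk above (current rate within ±2 percentage points of the trailing-quarter median; several off-band stages point upstream) can be sketched as follows. This is a minimal illustration, not code from the package: the function names, stage names, and sample rates are all made up.

```python
from statistics import median
from typing import Dict, List

BAND_PP = 2.0  # the ±2 percentage-point band named in the skill


def off_band_stages(current: Dict[str, float],
                    trailing: Dict[str, List[float]]) -> List[str]:
    """Stages whose current rate (in %) sits outside the band around the trailing median."""
    return [
        stage
        for stage, rate in current.items()
        if abs(rate - median(trailing[stage])) > BAND_PP
    ]


def diagnose(current: Dict[str, float],
             trailing: Dict[str, List[float]]) -> str:
    """One off-band stage is the primary suspect; several imply an upstream cause."""
    suspects = off_band_stages(current, trailing)
    if not suspects:
        return "in-band"
    if len(suspects) == 1:
        return f"primary suspect: {suspects[0]}"
    return "upstream cause (acquisition mix change or broken instrumentation)"


if __name__ == "__main__":
    # Hypothetical rates in percent; trailing holds the last quarter's observations.
    current = {"signup": 41.0, "activation": 22.5, "paid": 5.1}
    trailing = {
        "signup": [40.2, 41.5, 40.8],
        "activation": [27.0, 26.1, 28.3],
        "paid": [5.0, 5.4, 4.9],
    }
    print(diagnose(current, trailing))  # primary suspect: activation
```

Comparing against the team's own trailing median, rather than an industry average, follows the skill's stated benchmark choice.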

package/.agent-src/skills/gtm-launch/SKILL.md ADDED

@@ -0,0 +1,165 @@
+---
+name: gtm-launch
+description: "Use when sequencing a launch — alpha / beta / GA waves, audience-by-wave logic, narrative beats per wave, engineering-readiness gates. Triggers on 'plan the launch', 'sequence GA'."
+status: active
+tier: senior
+source: package
+domain: product
+context_spine: [product, customer-segment, channel-stage]
+---
+# gtm-launch
+## When to use
+- A product, feature, or major capability is approaching ship-readiness and the team needs a wave plan (alpha → beta → GA) keyed to audience and proof, not a date on a calendar.
+- A launch is being planned date-first; the team needs to invert and plan readiness-first so an unmet gate stops a wave instead of leaking past it.
+- A previous launch landed soft and the retro names "no audience-by-wave logic" or "narrative beats unclear per wave" as the cause.
+Do NOT use to write announcement copy (route to `release-comms`),
+lock the message stack (route to `messaging-architecture`), or plan
+post-launch retention loops (route to `retention-loops`).
+## Cognition cluster
+- **Mental model 10 — Reversible vs. irreversible decisions.** A GA
+  wave is largely irreversible: rolling back narrative and audience
+  expectations after public launch costs more than re-shipping the
+  product. Alpha and beta are reversible; treat them as the
+  decision-quality buffer. See
+  [`docs/contracts/mental-models.md`](../../../docs/contracts/mental-models.md) § 10.
+- **Mental model 29 — Premortem.** Before the wave plan locks, write
+  the post-mortem of the launch as if it failed. The premortem
+  surfaces the gates that need to hold; the wave plan is the inverse
+  of that list. See `mental-models.md` § 29.
+- **Mental model 16 — Leading vs. lagging indicators.** Engineering-
+  readiness signals (error rate, latency, support-load) are leading;
+  pipeline lift is lagging. A wave plan that gates on lagging signals
+  ships into a soft floor. See `mental-models.md` § 16.
+- **Context-spine — product + customer-segment + channel-stage.**
+  Read the **product** slot for shippable scope, the
+  **customer-segment** slot for who hears the launch on which wave,
+  and the **channel-stage** slot for where each wave's audience lives
+  in the awareness → decision arc. See
+  [`context-spine`](../../../docs/contracts/context-spine.md).
+## Procedure
+### Step 0: Inherit the message stack
+Identify the locked `primary-message.md`, `supporting-proofs.md`, and
+`audience-matrix.md` from [`messaging-architecture`](../messaging-architecture/SKILL.md).
+If the stack is missing or unstable, stop and route back. A launch
+plan without a locked message stack ships three different stories at
+three different surfaces.
+### Step 1: Run the premortem
+Write the launch post-mortem **as if it has already failed**. Three
+prompts: *"what did the segment hear that we did not say,"* *"what
+broke in the first 48 hours,"* *"what did the alternative say first
+and louder."* The premortem produces the failure-mode list the wave
+plan must neutralise.
+### Step 2: Define the gates per wave
+For each wave (alpha · beta · GA), define **entry gates** and **exit
+gates**:
+- *Alpha entry:* engineering-readiness signal threshold (error rate,
+  latency, instrumentation coverage). *Exit:* < N support tickets
+  per 100 sessions on the load-bearing flow.
+- *Beta entry:* alpha exit + audience-matrix proof exists for the
+  beta audience. *Exit:* leading indicator (activation, time-to-
+  first-value) clears threshold per `mental-models.md` § 16.
+- *GA entry:* beta exit + narrative beats locked for the public
+  segment. *Exit:* not applicable — GA is irreversible; the next
+  wave is *post-launch retention*, handed to `retention-loops`.
+### Step 3: Sequence the audience waves
+Audience waves are not seniority waves. They are **proof waves**.
+Each wave's audience is whichever segment generates the proof the
+*next* wave needs. Order:
+1. *Alpha audience* — the segment where the team can sit next to
+   the user. Proof: load-bearing flow does not break under real use.
+2. *Beta audience* — the segment whose adoption is the credibility
+   anchor for GA. Proof: a quotable reference and an activation
+   curve.
+3. *GA audience* — the full ICP segment from the `customer-segment`
+   slot. Proof: pipeline lift, narrative pickup, retention curve.
+### Step 4: Assign narrative beats per wave
+Each wave gets a narrative beat — the **one** thing the audience
+remembers. Alpha beat = trust signal (we are not winging it). Beta
+beat = proof signal (it works for someone like you). GA beat = the
+primary message from `messaging-architecture` Step 1. Beats stack;
+they do not contradict.
+### Step 5: Validate the plan against the premortem
+Validate each premortem failure mode against the wave plan: verify a
+specific gate or beat neutralises it. Any failure mode without an
+explicit neutraliser is a known leak — name it, do not bury it.
+Validation passes only when every premortem item is either
+neutralised or accepted-with-mitigation.
+### Step 6: Hand back
+Hand the artefacts to [`release-comms`](../release-comms/SKILL.md)
+for announcement-surface drafting, to
+[`editorial-calendar`](../editorial-calendar/SKILL.md) for cadence
+mapping, and to [`launch-readiness`](../launch-readiness/SKILL.md)
+for the merge-day checklist.
+## Related Skills
+**WHEN to use this**
+- The unit of work is the wave plan (alpha · beta · GA) with gates and beats, not a single announcement.
+- A launch needs readiness-gated sequencing instead of calendar-driven sequencing.
+- The team can name the message stack but not which audience hears which beat in which wave.
+**WHEN NOT to use this**
+- Writing the announcement copy or press surface — route to [`release-comms`](../release-comms/SKILL.md).
+- Locking the primary message and proofs — route to [`messaging-architecture`](../messaging-architecture/SKILL.md).
+- Pre-merge ops checklist (rollout, rollback, monitoring) — route to [`launch-readiness`](../launch-readiness/SKILL.md).
+- Post-launch retention design — route to [`retention-loops`](../retention-loops/SKILL.md).
+## When the agent should load this
+- "Plan the launch waves for the new pricing tier."
+- "Wir starten den GA — gib mir die Alpha-Beta-GA Sequenz."
+- "What are the entry gates for the beta wave?"
+- "Premortem the launch and rebuild the wave plan from the failure list."
+- "Sequence the audience waves around the proof we still need."
+## Output
+1. **`launch-premortem.md`** — three failure modes per prompt, ranked by carrying cost, each tagged with the wave that owns the neutraliser.
+2. **`wave-plan.md`** — three waves (alpha · beta · GA) with entry / exit gates, audience, leading-indicator threshold per wave.
+3. **`narrative-beats.md`** — one beat per wave (trust → proof → primary-message), with the line the team will not contradict on any surface during that wave.
+## Gotcha
+- Calendar-driven launches confuse a date with a gate. A date does not signal readiness; a gate does. The wave plan must hold even if the date slips two weeks.
+- "Friends-and-family alpha" is alpha-shaped theatre — it produces the wrong proof for the next wave. Recruit an alpha audience that exposes the load-bearing flow.
+- A premortem that produces only three failure modes was rushed; push for ten and keep the load-bearing three.
+## Do NOT
+- Do NOT write the announcement copy here — copy lives in `release-comms` downstream of locked beats.
+- Do NOT collapse alpha and beta to save calendar time — alpha and beta produce different proofs.
+- Do NOT lock GA without an explicit retention-loops handoff; an unowned post-launch fortnight is where most launches soften.
+## Runnable example
+Mid-market HR analytics tool launching workforce-analytics layer:
+- Premortem: (a) CFOs hear "another tool" not "retention saving"; (b) HRIS plug-in misconfigured under load; (c) reference customer quote not contractually approved by GA.
+- Wave plan — *Alpha:* 3 design-partner HR directors, gate = HRIS plug-in error-rate < 1 % under load. *Beta:* 10 HR leaders matching ICP, gate = activation curve hits 5 cohort-roll-ups per week. *GA:* full ICP segment, gate = quoted reference contractually approved.
+- Narrative beats — *Alpha:* "we sat next to you while it worked." *Beta:* "an HR director like you saved 7 hours last board-quarter." *GA:* primary message from `messaging-architecture` Step 1.
+- Hand-off → `release-comms` drafts the GA-wave surface; `retention-loops` owns the 30-day post-GA cohort.
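
Step 5 of the `gtm-launch` skill above (every premortem failure mode is either neutralised by a gate or beat, or accepted with mitigation) reduces to a simple completeness check. The sketch below is hypothetical and not taken from the package; the `FailureMode` fields and the sample entries are illustrative, and the first entry is deliberately left without a neutraliser to show how a leak surfaces.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FailureMode:
    """One premortem item (field names are hypothetical)."""
    description: str
    neutraliser: Optional[str] = None  # the gate or beat that neutralises it
    mitigation: Optional[str] = None   # set when accepted-with-mitigation


def known_leaks(modes: List[FailureMode]) -> List[str]:
    """Failure modes with neither a neutraliser nor a mitigation are known leaks."""
    return [m.description for m in modes if not (m.neutraliser or m.mitigation)]


def validation_passes(modes: List[FailureMode]) -> bool:
    """Validation passes only when every item is neutralised or accepted-with-mitigation."""
    return not known_leaks(modes)


if __name__ == "__main__":
    modes = [
        # Left unowned on purpose to demonstrate the leak path.
        FailureMode("CFOs hear 'another tool'"),
        FailureMode("HRIS plug-in misconfigured under load",
                    neutraliser="alpha gate: error-rate < 1 % under load"),
        FailureMode("reference quote not contractually approved",
                    neutraliser="GA gate: quoted reference approved"),
    ]
    print(known_leaks(modes))        # ["CFOs hear 'another tool'"]
    print(validation_passes(modes))  # False
```

Listing the leaks by name, rather than returning only a pass/fail flag, matches the skill's instruction to name a leak instead of burying it.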

package/.agent-src/skills/hiring-loop-design/SKILL.md ADDED

@@ -0,0 +1,167 @@
+---
+name: hiring-loop-design
+description: "Use when shaping an engineering hiring loop — stages, take-home vs live, calibration, bar-raiser, signal-vs-noise audit. Triggers on 'design our interview loop', 'audit our hiring bar'."
+status: active
+tier: senior
+source: package
+domain: process
+context_spine: [org-stage, product, customer-segment]
+---
+# hiring-loop-design
+## When to use
+- A first engineering hiring loop is being designed (early-stage company, first dedicated EM, first PM hire) and the question is *what stages, in what order, with what signal each*.
+- An existing loop is producing inconsistent outcomes (high false-positive rate, high false-negative rate, long time-to-hire) and the question is *which stage to fix*.
+- A new role family inside engineering is opening (first staff IC, first SRE, first ML eng) and the question is *what does the loop look like for this archetype*.
+Do NOT use this for non-engineering hiring as the primary surface (sales / GTM hiring is a different loop shape entirely), as a sourcing / recruiting-pipeline skill (separate surface area), or for applicant-tracking-system configuration.
+## Cognition cluster
+- **Mental model 1 — First principles.** Strip hiring to: *what signal does each stage produce that no other stage can produce?* Stages that duplicate signal waste candidate-time and interviewer-time. The strongest loops have one stage per signal, not five stages probing the same thing. See [`mental-models.md`](../../../docs/contracts/mental-models.md) § 1.
+- **Mental model 28 — Inversion.** *"What would make a great hire withdraw from this loop?"* — usually: 7+ stages, take-home > 6 hours, no role-context conversation, long calendar gaps, no senior-IC time, no offer-narrative. Inversion surfaces the canonical withdrawal causes; great candidates have options and use them.
+- **Mental model 21 — Second-order thinking.** A loose bar at L4 produces a chain: weaker L4 → harder L5 calibration → erodes ladder credibility → ICs leave. A single accept-the-no-vote ripples for 2+ years. The cost of a wrong hire dwarfs the cost of a missed hire; bar discipline is a multi-year compounding decision.
+- **Mental model — Base rates.** Most signals in interviews are noise; the most-confident signal-claim is usually the most-overfit to one observation. Calibrate against the base rate: *"out of 10 candidates who passed this stage with this signal, how many succeeded at 12 months?"* If unknown, the stage is unfalsifiable.
+- **Context-spine — org-stage + product + customer-segment.** Read **org-stage** for what's affordable (pre-seed: 3-stage loop, fast; growth: 5-stage with calibration; late: 5–6 with bar-raiser). Read **product** for what behaviors matter (deep-systems = system-design heavier; consumer = velocity + judgment heavier; regulated = ethics + judgment heavier). Read **customer-segment** for stakeholder-management exposure needed.
+## Cross-wing handoff
+- Composed downstream of Q1 `org-design` — the role-family shape determines the loop shape; hiring without a clear role definition is broken from stage 1.
+- Composed downstream of Q4 `perf-feedback-craft` — the calibration session is structurally a feedback exchange about a candidate; Q4's SBI + ladder-of-inference apply.
+- Hands off to Q3 `onboarding-program` — the loop's signal evidence becomes the day-1 ramp-evidence base.
+- Hands off to Q2 `comp-banding` for the offer construction step.
+## Procedure
+### Step 0: Define the role-shape before designing the loop
+For the role being hired, name:
+1. **Level** — L3 / L4 / L5 / L6 / staff / principal. Levels matter because signal evidence changes per level (L4 needs strong execution signal; L6 needs leverage / system-design signal).
+2. **Archetype** — IC-builder / IC-strategist / IC-system-designer / manager / staff-multiplier. Same level, different archetype = different loop shape.
+3. **First-90-day deliverable** — what should this person ship by day 90. Concrete enough to design loop signals against.
+A loop designed without a role definition produces noise. Force the role definition step.
+### Step 1: Map signal needs to stages
+For the role from Step 0, enumerate the signals that need direct evidence:
+1. **Coding / craft** — for IC roles. Live coding, take-home, or pair-programming.
+2. **System design** — for L5+. Two-hour bounded-scope problem.
+3. **Domain judgment** — for senior roles. Behavioral case with context-specific tradeoffs.
+4. **Communication / stakeholder** — for any role. Cross-functional collaboration exercise.
+5. **Leadership / multiplier** *(L6+)* — narrative of past leverage, mentoring decisions, ladder reasoning.
+6. **Values fit** — explicitly NOT culture fit. Concrete questions about handling pressure / disagreement / failure.
+One signal per stage. If two stages probe the same signal, kill one.
+### Step 2: Pick the stage shape per signal
+For each signal, pick the lightest-touch stage that produces the signal cleanly:
+1. **Recruiter / role-context call** *(30 min)* — role + company + light values. NOT a screen.
+2. **Hiring-manager screen** *(45 min)* — judgment + role-fit + reverse-context. Required.
+3. **Coding signal** — choose: (a) take-home ≤ 3 hours with explicit time-cap (most candidate-respectful for senior); (b) live coding 60 min (for L3–L4, faster signal); (c) pair-programming 90 min (best signal but heavy on interviewer time).
+4. **System design** *(2 hours, L5+)* — bounded scope; rubric set in advance.
+5. **Behavioral / domain judgment** *(45–60 min)* — structured by signal area; SBI-anchored prompts.
+6. **Leadership / values** *(45–60 min)* — narratives + specific past-decision probes.
+7. **Bar-raiser / cross-team** *(45 min, L5+)* — perspective from outside the hiring team to check ladder consistency.
+Take-home > 6 hours = candidate-hostile and produces survivorship-biased pools (only those with no other options finish). Keep ≤ 3 hours or skip.
+### Step 3: Design the rubric per stage
+Each stage gets a written rubric before the first interview runs:
+1. **Signal target** — what behavior demonstrates the signal at this level.
+2. **Anti-signal** — what behavior fails the signal (named, not inferred).
+3. **Strong-no-hire / no-hire / hire / strong-hire** — four-band scoring, not 5-band (5-band collapses to "3" for everything).
+4. **Evidence anchor** — what the interviewer writes down to support the rating; rating without evidence = inadmissible.
+Loops without rubrics produce gut-feel hires and hidden bias. Force rubrics before the loop opens.
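A rubric with these four parts maps naturally onto a small data structure. The following is a minimal sketch under stated assumptions: the class names `Band`, `Rubric`, and `Rating` are hypothetical and not defined by the skill, and the evidence check simply rejects blank text to mirror the "rating without evidence = inadmissible" rule.

```python
from dataclasses import dataclass
from enum import Enum


class Band(Enum):
    """Four-band scale; there is deliberately no neutral middle value."""
    STRONG_NO_HIRE = 1
    NO_HIRE = 2
    HIRE = 3
    STRONG_HIRE = 4


@dataclass(frozen=True)
class Rubric:
    stage: str
    signal_target: str  # behavior that demonstrates the signal at this level
    anti_signal: str    # named behavior that fails the signal


@dataclass(frozen=True)
class Rating:
    rubric: Rubric
    band: Band
    evidence: str       # what the interviewer wrote down to support the band

    def __post_init__(self) -> None:
        # A rating with no written evidence is treated as inadmissible.
        if not self.evidence.strip():
            raise ValueError("rating without evidence is inadmissible")


if __name__ == "__main__":
    rubric = Rubric("system design", "names explicit tradeoffs", "ignores stated constraints")
    print(Rating(rubric, Band.HIRE, "compared two storage options and justified the choice"))
```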
+### Step 4: Calibration session shape
+For every offer-eligible candidate, run a calibration session before sending offer:
+1. **All interviewers attend** — synchronous or async-with-deadline.
+2. **Evidence-first reading** — each interviewer reads their rubric + evidence aloud (or shares the doc) before opinions are stated.
+3. **Bar-raiser veto** *(L5+)* — the bar-raiser can no-hire even when the team votes yes; the bar-raiser cannot force a hire (the team can still no-hire over a bar-raiser yes).
+4. **Decision** — hire only if every stage rating is hire or strong-hire; any no-hire or strong-no-hire blocks the offer; gray-zone = no-hire by default (the cost of a wrong hire dwarfs the cost of a missed hire).
+A loop without calibration loosens; the bar erodes by 5% per quarter without explicit recalibration cycles.
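The decision rule can be written as a short function. This sketch reads the rule as "offer only when every stage rating is hire or above, and the bar-raiser has not vetoed"; that reading, the function name `calibrate`, and the string band labels are assumptions for illustration rather than something the skill specifies.

```python
BANDS = ("strong-no-hire", "no-hire", "hire", "strong-hire")
HIRE_BANDS = {"hire", "strong-hire"}


def calibrate(ratings: dict[str, str], bar_raiser_veto: bool = False) -> str:
    """Return 'hire' or 'no-hire' from per-stage ratings.

    ratings         -- stage name -> band label (one of BANDS)
    bar_raiser_veto -- True when the bar-raiser exercises a no-hire veto
    """
    if not ratings:
        raise ValueError("no ratings to calibrate")
    unknown = [band for band in ratings.values() if band not in BANDS]
    if unknown:
        raise ValueError(f"unknown band(s): {unknown}")
    if bar_raiser_veto:
        return "no-hire"  # veto overrides a unanimous team yes
    if all(band in HIRE_BANDS for band in ratings.values()):
        return "hire"
    return "no-hire"      # gray zone defaults to no-hire


if __name__ == "__main__":
    print(calibrate({"system design": "strong-hire", "behavioral": "hire"}))
    print(calibrate({"system design": "strong-hire", "behavioral": "no-hire"}))
```

The veto is one-directional in this sketch: there is no input that lets the bar-raiser turn a team no-hire into a hire.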
+### Step 5: Validate the loop design before opening it
+Before running the first candidate through, inspect three things:
+1. **Signal-stage mapping check** — confirm Step 1's signals each map to exactly one stage from Step 2; duplicate-signal stages fail and must be merged or dropped.
+2. **Rubric completeness** — assert every stage in Step 3 has a written rubric with signal + anti-signal + four-band scoring + evidence anchor; missing rubrics fail.
+3. **Candidate-time check** — verify total candidate hours ≤ 8, interviews and take-home combined (for example, 5 interview hours + 3 take-home); loops exceeding 8 hours produce survivorship bias and erode top-of-funnel.
+All three must pass. If any fails, return to the failing step.
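The three checks are mechanical enough to sketch as one validation pass. Everything named below (`validate_loop`, the dictionary keys, the field names) is hypothetical; only the rules themselves come from the step above. Stages that carry no Step 1 signal, such as a role-context call, are assumed to use `None` as their signal.

```python
from collections import Counter

FOUR_BANDS = ("strong-no-hire", "no-hire", "hire", "strong-hire")
RUBRIC_FIELDS = ("signal_target", "anti_signal", "bands", "evidence_anchor")
MAX_CANDIDATE_HOURS = 8


def validate_loop(signals, stages, rubrics, take_home_hours=0.0):
    """Return a list of failure messages; an empty list means all checks pass.

    signals -- signal names from Step 1
    stages  -- dicts with 'name', 'signal' (or None) and 'minutes'
    rubrics -- stage name -> dict holding the RUBRIC_FIELDS
    """
    failures = []

    # 1. Signal-stage mapping: each signal maps to exactly one stage.
    counts = Counter(s["signal"] for s in stages if s["signal"] is not None)
    for signal in signals:
        if counts[signal] != 1:
            failures.append(f"signal '{signal}' maps to {counts[signal]} stages, expected 1")

    # 2. Rubric completeness: every stage has all four rubric parts.
    for stage in stages:
        rubric = rubrics.get(stage["name"], {})
        missing = [field for field in RUBRIC_FIELDS if not rubric.get(field)]
        if missing:
            failures.append(f"stage '{stage['name']}' rubric missing: {', '.join(missing)}")
        elif tuple(rubric["bands"]) != FOUR_BANDS:
            failures.append(f"stage '{stage['name']}' does not use four-band scoring")

    # 3. Candidate-time check: interviews plus take-home within the budget.
    total_hours = sum(s["minutes"] for s in stages) / 60 + take_home_hours
    if total_hours > MAX_CANDIDATE_HOURS:
        failures.append(f"candidate time {total_hours:.2f}h exceeds {MAX_CANDIDATE_HOURS}h")

    return failures
```

Returning messages instead of a boolean keeps the "return to the failing step" instruction actionable, since each message names the check that failed.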
+### Step 6: Emit the hiring-loop design
+Produce the loop-design artifact for the hiring team. The artifact contains the role-shape, the stage-signal map, the per-stage rubrics, the calibration shape, and the candidate-time budget. The first three candidates after the loop opens trigger a retrospective check (signal-vs-actual review).
+## Related Skills
+**WHEN to use this**
+- New engineering role family being opened.
+- First-pass hiring loop design.
+- Audit of an existing loop with inconsistent outcomes.
+- Calibration / bar-raiser design.
+**WHEN NOT to use this**
+- Non-engineering hiring loops (GTM / sales / ops) — different loop shape; out of scope here.
+- Role / level decisions independent of hiring — route to [`comp-banding`](../comp-banding/SKILL.md) (Q2) for ladder design.
+- Feedback shape — route to [`perf-feedback-craft`](../perf-feedback-craft/SKILL.md) (Q4); S2 composes Q4 at the calibration session.
+- Onboarding after hire — route to [`onboarding-program`](../onboarding-program/SKILL.md) (Q3).
+## When the agent should load this
+- "Design our interview loop."
+- "Audit our hiring bar."
+- "Should we use a take-home?"
+- "Why are we mis-hiring at L5?"
+- "Wie sieht unser Hiring-Loop für Staff Engineer aus?" (German for "What does our hiring loop for Staff Engineer look like?")
+## Output
+1. **`role-shape.md`** — level + archetype + first-90-day deliverable definition.
+2. **`stage-signal-map.md`** — signal-needed × stage × time-budget × interviewer pool.
+3. **`per-stage-rubrics.md`** — signal + anti-signal + four-band scoring + evidence anchor per stage.
+4. **`calibration-shape.md`** — session format + bar-raiser rules + gray-zone default.
+5. **`candidate-time-budget.md`** — total candidate hours + take-home cap + scheduling shape.
+## Gotcha
+- "Culture fit" is a known bias-amplifier; use "values fit" with concrete questions about pressure / disagreement / failure.
+- Take-home longer than 3 hours = survivorship bias in your funnel; you get only those with no other options.
+- Loops that go above 5 onsite hours produce candidate withdrawal; great candidates have options.
+- A no-rubric loop produces "felt good in the room" hires that don't replicate.
+- Gray-zone calibration default must be no-hire; cost of false-positive dwarfs cost of false-negative.
+## Do NOT
+- Do NOT design a loop without a written role-shape; un-defined roles produce un-falsifiable signals.
+- Do NOT score on 5-band scales; everything collapses to "3" and the rubric becomes decorative.
+- Do NOT skip calibration on a "we all agree" basis; the disagreements are where the signal lives.
+## Runnable example
+Series-B SaaS opens its first staff-IC role; current loop is L4-shaped (45-min coding + 60-min behavioral + offer) and last two staff-level offers churned at 9 months.
+- Step 0 — Role-shape: L6 staff IC, archetype = system-designer with multiplier impact. First-90-day deliverable: scope and own one cross-team platform initiative.
+- Step 1 — Signal needs: system design (L6 anchor), domain judgment, leadership / multiplier, coding sanity-check, values fit. Five signals.
+- Step 2 — Loop: recruiter call (30) + HM screen (45) + system design (120) + domain judgment behavioral (60) + leadership narrative (60) + bar-raiser (45) + light coding sanity-check (60). Total candidate time: 7 hours + 0 take-home = 7 hours. Within 8-hour budget.
+- Step 3 — Rubrics drafted per stage; system-design rubric anchored to "drove three nontrivial architecture decisions with explicit tradeoffs" not "designed a great system".
+- Step 4 — Calibration: interviewers from all 7 stages, including the bar-raiser; evidence-first; gray-zone defaults to no-hire; bar-raiser can veto.
+- Step 5 — Validate: signal-stage 1:1 mapping (no duplication); rubrics complete; 7 hours fits the 8-hour budget. Pass.
+- Step 6 — Emit loop; first three candidates trigger retrospective; review whether system-design rubric is calibrated to actual L6 staff behavior or diverging toward L5.
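The candidate-time figure in the example can be re-derived from the stage durations listed in its Step 2. The snippet below only redoes that arithmetic; the variable names are illustrative.

```python
# Stage durations in minutes, as listed in Step 2 of the example.
stage_minutes = {
    "recruiter call": 30,
    "HM screen": 45,
    "system design": 120,
    "domain judgment behavioral": 60,
    "leadership narrative": 60,
    "bar-raiser": 45,
    "coding sanity-check": 60,
}
take_home_hours = 0   # the example uses no take-home
budget_hours = 8      # candidate-time budget from Step 5

interview_hours = sum(stage_minutes.values()) / 60
total_hours = interview_hours + take_home_hours
within_budget = total_hours <= budget_hours

print(f"{total_hours} hours total, within budget: {within_budget}")
# prints "7.0 hours total, within budget: True"
```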