@event4u/agent-config 1.34.0 → 1.36.0

Files changed (47)
  1. package/.agent-src/commands/memory/load.md +69 -0
  2. package/.agent-src/commands/memory/mine-session.md +151 -0
  3. package/.agent-src/commands/memory/promote.md +35 -0
  4. package/.agent-src/commands/memory/propose.md +10 -1
  5. package/.agent-src/commands/memory.md +5 -3
  6. package/.agent-src/commands/roadmap/process-full.md +20 -15
  7. package/.agent-src/contexts/authority/scope-mechanics.md +36 -0
  8. package/.agent-src/contexts/execution/autonomy-detection.md +7 -7
  9. package/.agent-src/contexts/execution/roadmap-process-loop.md +16 -10
  10. package/.agent-src/personas/discovery-lead.md +99 -0
  11. package/.agent-src/personas/product-owner.md +71 -52
  12. package/.agent-src/personas/revops-maintainer.md +100 -0
  13. package/.agent-src/personas/tech-writer.md +99 -0
  14. package/.agent-src/rules/autonomous-execution.md +25 -0
  15. package/.agent-src/rules/scope-control.md +12 -5
  16. package/.agent-src/skills/competitive-positioning/SKILL.md +152 -0
  17. package/.agent-src/skills/customer-research/SKILL.md +116 -0
  18. package/.agent-src/skills/decision-record/SKILL.md +78 -3
  19. package/.agent-src/skills/discovery-interview/SKILL.md +152 -0
  20. package/.agent-src/skills/launch-readiness/SKILL.md +156 -0
  21. package/.agent-src/skills/memory-consolidation/SKILL.md +216 -0
  22. package/.agent-src/skills/release-comms/SKILL.md +123 -0
  23. package/.agent-src/skills/roadmap-writing/SKILL.md +1 -1
  24. package/.agent-src/skills/stakeholder-tradeoff/SKILL.md +91 -3
  25. package/.agent-src/skills/voc-extract/SKILL.md +164 -0
  26. package/.agent-src/templates/roadmaps.md +14 -0
  27. package/.claude-plugin/marketplace.json +9 -1
  28. package/CHANGELOG.md +64 -0
  29. package/README.md +3 -3
  30. package/config/agent-settings.template.yml +35 -0
  31. package/docs/architecture.md +3 -3
  32. package/docs/catalog.md +14 -5
  33. package/docs/contracts/agent-memory-contract.md +15 -1
  34. package/docs/contracts/command-clusters.md +1 -1
  35. package/docs/contracts/context-spine.md +133 -0
  36. package/docs/contracts/file-ownership-matrix.json +388 -0
  37. package/docs/contracts/mental-models.md +336 -0
  38. package/docs/getting-started.md +1 -1
  39. package/docs/guidelines/agent-infra/engineering-memory-data-format.md +52 -0
  40. package/docs/guidelines/cross-role-handoff.md +127 -0
  41. package/package.json +1 -1
  42. package/scripts/check_memory.py +106 -4
  43. package/scripts/check_references.py +1 -0
  44. package/scripts/lint_context_spine_usage.py +133 -0
  45. package/scripts/lint_roadmap_complexity.py +87 -3
  46. package/scripts/mine_session.py +279 -0
  47. package/scripts/schemas/skill.schema.json +9 -0
@@ -0,0 +1,336 @@
+ ---
+ stability: stable
+ ---
+
+
+ # Mental Models — Top-30 Cross-Role Reference
+
+ > **Status:** active · **Stability:** stable · **Owner:** unified-senior-roles Block K4
+ > · **Hard cap:** 30 models · **R23 mitigation:** additions require removing one (zero-sum)
+
+ A ranked, citation-only reference. Senior skills cite a model by
+ its number when the cognition step it triggers needs framing prose
+ the skill would otherwise re-invent. The doc is not auto-loaded
+ and never appears in a prompt unless a skill names a row.
+
+ ## How this list was built
+
+ Council iter-1 (Anthropic `claude-sonnet-4-5` + OpenAI `gpt-4o`,
+ 2026-05-05) Q2 verdict: **Ranked Top-30**, cross-role bias.
+ Channel-specific (CAC/LTV-as-model, ad-auction, SEO keyword),
+ C-suite strategy (Blue Ocean, Porter's Five Forces), and sales
+ pipeline (BANT, MEDDIC) **explicitly cut** — they are domain
+ heuristics, not cross-role cognition tools. Additions require
+ removing one.
+
+ Each entry: title · domain · ≤ 8-line summary · one citation example
+ from a shipped skill (path is the proof of provenance, not a load
+ instruction).
+
+ ## The 30 models
+
+ ### 1. First-principles thinking
+
+ Strip the problem to assumptions you can defend from physics, contract,
+ or hard data. Re-derive the answer from those, not by analogy to past
+ decisions. The expensive part is identifying which "principle" is
+ actually load-bearing vs. inherited belief; the cheap part is the
+ re-derivation. Use when an inherited approach feels stale and you
+ suspect the real constraint moved.
+ *Cited by:* `.agent-src.uncompressed/skills/improve-before-implement/`
+ (challenges weak requirements before code is written).
+
+ ### 2. Jobs-to-be-Done (JTBD)
+
+ A user "hires" a product to make progress in a specific situation;
+ the job is the situation × motivation × expected outcome, not the
+ demographic. The unit of analysis is the **switch event** — what
+ caused them to fire the previous solution. JTBD reframes feature
+ requests as evidence of an unmet job, not feature gaps.
+ *Cited by:* `.agent-src.uncompressed/skills/po-discovery/`
+ (reframes fuzzy product asks via job-shape).
+
+ ### 3. Pareto principle (80/20)
+
+ Roughly 80% of effects come from 20% of causes. The lift is in
+ **identifying** the 20% — which user segment, which test failure,
+ which N+1 query — not in re-stating the ratio. Anti-pattern: using
+ 80/20 as permission to ignore the long tail without measuring it.
+ *Cited by:* `.agent-src.uncompressed/skills/performance-analysis/`
+ (N+1 detection prioritizes the 20% of queries causing 80% of latency).
+
+ ### 4. Second-order thinking
+
+ Ask "and then what?" until the chain breaks down. First-order picks
+ what looks best now; second-order weighs the consequences of the
+ consequences. Most "obvious" decisions die at second-order — the
+ optimization that ships now creates the maintenance debt that kills
+ velocity in 6 months.
+ *Cited by:* `.agent-src.uncompressed/skills/adversarial-review/`
+ (stress-tests a plan by walking past the immediate verdict).
+
+ ### 5. Opportunity cost
+
+ The real cost of any choice is the **next-best alternative you did
+ not pick**, not the dollar / time spent. A 2-week feature is not
+ "2 weeks expensive" — it is the highest-value 2-week feature you
+ chose not to ship instead. Naming the alternative makes the cost
+ legible; pretending there isn't one is the failure mode.
+ *Cited by:* `.agent-src.uncompressed/skills/rice-prioritization/`
+ (scores compete for capacity; non-shipped items are the cost basis).
+
+ ### 6. Theory of constraints
+
+ System throughput is bounded by exactly one constraint at a time.
+ Improving any non-constraint resource is local optimization with
+ zero system effect — usually negative, since it loads the actual
+ constraint harder. The discipline: identify, exploit, subordinate,
+ elevate, then find the next constraint.
+ *Cited by:* `.agent-src.uncompressed/skills/funnel-analysis/`
+ (identifies the single funnel stage that bounds conversion throughput).
+
+ ### 7. MVP (Minimum Viable Product)
+
+ The smallest thing that produces real evidence about whether the
+ hypothesis holds. MVP is a **measurement instrument**, not a
+ junior-grade product. The trap is shipping an MVP that cannot
+ distinguish "user behavior validates the hypothesis" from "user
+ behavior was driven by something else"; that is just a small
+ product.
+ *Cited by:* `.agent-src.uncompressed/skills/po-discovery/`
+ (scopes a discovery slice that produces a learning, not a feature).
+
+ ### 8. Build-Measure-Learn
+
+ A loop: build the smallest test, measure the actual signal, decide
+ to persevere or pivot. The skill is in **shortening the loop** —
+ weeks beat months because months let teams rationalize a failed
+ hypothesis. The loop is the artefact; any individual cycle is just
+ one iteration.
+ *Cited by:* `.agent-src.uncompressed/skills/test-driven-development/`
+ (the build-measure-learn loop applied to one function at a time).
+
+ ### 9. Hypothesis-driven development
+
+ State the hypothesis as a falsifiable sentence with a metric and a
+ threshold **before** writing the code. "If we do X, metric Y will
+ move by Z." Without the threshold, the team will declare any
+ movement victory. With it, the team learns from the misses.
+ *Cited by:* `.agent-src.uncompressed/skills/project-analysis-hypothesis-driven/`
+ (competing hypotheses + validation loops + evidence-based conclusions).
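
Editorial illustration (not part of the diff): the "If we do X, metric Y will move by Z" sentence in model 9 can be written down as data before any code ships. This is a minimal sketch; the `Hypothesis` class, its field names, and the "validated" / "missed" labels are hypothetical and do not come from the package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A falsifiable claim: doing `change` moves `metric` by at least `threshold`."""

    change: str       # the X in "if we do X"
    metric: str       # the Y that is expected to move
    baseline: float   # metric value before the change
    threshold: float  # the Z: minimum upward movement agreed up front

    def verdict(self, observed: float) -> str:
        # Compare the observed movement against the pre-agreed threshold,
        # so that "any movement" cannot be declared a victory afterwards.
        moved = observed - self.baseline
        return "validated" if moved >= self.threshold else "missed"


h = Hypothesis(change="shorter signup form", metric="activation rate",
               baseline=0.20, threshold=0.03)
print(h.verdict(0.24))
print(h.verdict(0.21))
```

Freezing the dataclass keeps the threshold from being edited after the measurement comes in, which is the failure the model warns about.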
+
+ ### 10. Reversible vs. irreversible decisions
+
+ One-way doors deserve a high bar; two-way doors deserve a low bar
+ plus speed. The bias under uncertainty: shipping a reversible
+ decision early is almost always cheaper than the meeting required
+ to decide. The skill is recognizing irreversibility — usually data
+ shape, public API surface, or hiring.
+ *Cited by:* `.agent-src.uncompressed/skills/decision-record/`
+ (records the reversal-criteria so the irreversibility verdict is auditable).
+
+ ### 11. DX as first-class concern
+
+ Developer experience is a leading indicator of throughput; it is not
+ a polish task. Slow tests, fragile local setup, and surprising tool
+ output compound — every developer pays the tax every day. Treat DX
+ issues like user-facing bugs, with severity and SLA. The compounding
+ math makes "fix it later" almost always wrong.
+ *Cited by:* `.agent-src.uncompressed/skills/test-performance/`
+ (test-suite latency is a developer-facing metric, optimized as such).
+
+ ### 12. Conway's Law
+
+ The systems an organization builds mirror its communication structure.
+ Re-orgs propagate to architecture; architecture changes that fight the
+ org chart fail. The lever is bidirectional: pick the architecture you
+ want, then engineer the communication paths that produce it.
+ *Cited by:* `.agent-src.uncompressed/skills/api-design/`
+ (bounded-context choices follow team boundaries, not the other way around).
+
+ ### 13. Occam's Razor
+
+ Among hypotheses that fit the evidence equally well, prefer the one
+ that introduces the fewest new entities. In debugging, the boring
+ explanation (typo, off-by-one, stale cache) is usually correct. The
+ trap is **assuming** instead of **falsifying** — Occam suggests
+ order of investigation, not a verdict.
+ *Cited by:* `.agent-src.uncompressed/skills/systematic-debugging/`
+ (reproduce → isolate → hypothesize, simplest hypothesis first).
+
+ ### 14. Meadows leverage points
+
+ Donella Meadows' ranking: the highest-leverage interventions in a
+ system are paradigm shifts, then goals, then rules — far above
+ parameter tweaks. Most "improvement" effort fights parameters
+ (numbers, delays) at the bottom of the ranking. Climb the ranking
+ before optimizing.
+ *Cited by:* `.agent-src.uncompressed/skills/architecture-review-lens/`
+ (boundary / dependency-direction issues are higher-leverage than tweaks).
+
+ ### 15. Signal vs. noise
+
+ Every metric is a sum of underlying signal and measurement noise.
+ A change of size N is meaningful only if N exceeds the noise floor
+ for that metric × that horizon. The discipline: estimate the noise
+ band first, then evaluate the change. Without the noise band, every
+ movement looks like a trend.
+ *Cited by:* `.agent-src.uncompressed/skills/funnel-analysis/`
+ (stage-to-stage drop is read against the typical week-on-week noise).
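
Editorial illustration (not part of the diff): model 15's "estimate the noise band first, then evaluate the change" can be sketched with the standard library. The function name, the use of the sample standard deviation as the noise estimate, and the multiplier `k=2.0` are all assumptions made for this example; the package does not prescribe a method.

```python
from statistics import mean, stdev


def exceeds_noise_floor(history, latest, k=2.0):
    """Return True if `latest` sits outside the noise band of `history`.

    The band is k sample standard deviations around the historical mean.
    Both the estimator and k are illustrative choices, not a standard.
    """
    band = k * stdev(history)  # step 1: estimate the noise band
    return abs(latest - mean(history)) > band  # step 2: evaluate the change


weekly_conversions = [100, 102, 98, 101, 99]
print(exceeds_noise_floor(weekly_conversions, 101))  # inside the band
print(exceeds_noise_floor(weekly_conversions, 110))  # outside the band
```

The history must cover the same metric and the same horizon as the value under test; mixing daily history with a weekly reading would make the band meaningless.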
+
+ ### 16. Leading vs. lagging indicators
+
+ Lagging indicators (revenue, churn, retention) are accurate and
+ late. Leading indicators (activation events, repeat-use, support
+ volume) are noisy and early. Operating on lagging alone means the
+ team learns about problems after they cost money. Pair them: lagging
+ is the score, leading is the steering wheel.
+ *Cited by:* `.agent-src.uncompressed/skills/dashboard-design/`
+ (RED / USE / Golden Signals split leading from lagging explicitly).
+
+ ### 17. Churn as health metric
+
+ Retention curves expose what acquisition cannot: whether the product
+ delivers ongoing value. A flat retention tail means the product
+ works for the survivors; a sliding tail means the underlying job is
+ not getting done. Churn is upstream of CAC payback — fix it first,
+ then scale.
+ *Cited by:* `.agent-src.uncompressed/skills/funnel-analysis/`
+ (retention bend distinguishes activation problems from product-fit).
+
+ ### 18. Pull vs. push systems
+
+ Pull systems start work when downstream capacity opens; push systems
+ start work when upstream capacity is free. Push optimizes individual
+ utilization, pull optimizes flow. Most software teams claim pull and
+ operate push (queues filling up, sprints overcommitted, WIP
+ unmanaged). The fix is WIP limits, not motivation.
+ *Cited by:* `.agent-src.uncompressed/skills/laravel-horizon/`
+ (queue-balance strategies are pull-vs-push policy in concrete form).
+
+ ### 19. Shift-left
+
+ Move quality / security / accessibility checks earlier in the
+ lifecycle — to the developer's machine, the PR, the design — where
+ the cost of fixing is order-of-magnitude lower. The trade-off:
+ shift-left adds friction at the front; the math holds when defect
+ escape rate drops faster than the friction cost.
+ *Cited by:* `.agent-src.uncompressed/skills/threat-modeling/`
+ (threats enumerated before implementation, not after pen-test).
+
+ ### 20. Latency vs. throughput
+
+ Optimizing one usually hurts the other; the trade-off is structural,
+ not implementation-detail. Batch processing trades latency for
+ throughput; real-time pipelines trade throughput for latency. The
+ mistake is optimizing without naming which one matters for the user
+ job at hand.
+ *Cited by:* `.agent-src.uncompressed/skills/database/`
+ (index strategy and query batching surface the trade-off explicitly).
+
+ ### 21. Trust boundaries
+
+ Every system has explicit lines across which inputs cannot be
+ trusted: client → server, tenant A → tenant B, free tier → paid
+ tier, public endpoint → internal service. Threats enter at boundary
+ crossings; defense lives at the crossing, not deeper. Drawing the
+ boundary correctly is half the security work.
+ *Cited by:* `.agent-src.uncompressed/skills/threat-modeling/`
+ (produces trust boundaries + abuse cases mapped to files).
+
+ ### 22. Defense in depth
+
+ No single control is sufficient; layer entry validation, business
+ rules, environment hardening, and instrumentation so a bypass at
+ one layer is caught at the next. The trap is mistaking redundancy
+ for security theatre — each layer must have an independent failure
+ mode, not a copy of the previous one.
+ *Cited by:* `.agent-src.uncompressed/skills/defense-in-depth/`
+ (turns local fix into structural one across four guard layers).
+
+ ### 23. Blast radius
+
+ Before changing shared code, enumerate every call site, event
+ consumer, queue worker, API client, migration, and test that the
+ change touches. The radius is the work; the diff is the artefact.
+ Underestimating radius is the failure mode that breaks production
+ when the change "looked small".
+ *Cited by:* `.agent-src.uncompressed/skills/blast-radius-analyzer/`
+ (file:line citation per dependency, BEFORE the edit).
+
+ ### 24. Iron triangle (scope / time / quality)
+
+ Scope, time, and quality are coupled — fix two, the third moves.
+ Pretending to fix all three is how teams ship at low quality and
+ call it on-time. The honest move is naming which two are fixed and
+ which one absorbs the variance, BEFORE the work starts.
+ *Cited by:* `.agent-src.uncompressed/skills/refine-ticket/`
+ (AC sharpening forces explicit scope decisions before estimation).
+
+ ### 25. Definition of Done
+
+ A shared, auditable description of what "done" means for a unit of
+ work — tests, docs, deployment, comms. Without one, every team
+ member ships their personal threshold and disputes downstream. The
+ discipline is making it visible, agreed, and consistently applied —
+ not the specific items on it.
+ *Cited by:* `.agent-src.uncompressed/skills/verify-completion-evidence/`
+ (fresh evidence is required before any "done" claim).
+
+ ### 26. Postmortem-driven learning
+
+ Incidents are signal-rich; the team that captures the signal beats
+ the team that hides the incident. Blameless postmortems separate
+ contribution from blame, surface systemic causes, and produce
+ mitigations that reduce future incident rate — not just the count
+ of one-off fixes.
+ *Cited by:* `.agent-src.uncompressed/skills/incident-commander/`
+ (severity framing + comms cadence + post-mortem skeleton).
+
+ ### 27. Tech debt as interest
+
+ Tech debt has a principal (the shortcut taken) and an interest
+ payment (the ongoing tax on velocity). Carrying the debt is the
+ right call when the principal is repayable and the interest is
+ small; the failure is treating ongoing high-interest debt as a
+ fixed cost. Track it like a balance sheet.
+ *Cited by:* `.agent-src.uncompressed/skills/tech-debt-tracker/`
+ (interest-vs-principal framing, prioritisation by carrying cost).
+
+ ### 28. Mise en place
+
+ Prepare every input — data, fixtures, dependencies, decisions —
+ before the cooking step starts. Switching between prep and cook is
+ where errors enter. In software, the analogue is staged commits,
+ prepared test fixtures, and decisions locked before implementation.
+ The discipline is the savings, not the metaphor.
+ *Cited by:* `.agent-src.uncompressed/skills/existing-ui-audit/`
+ (inventory before any non-trivial UI edit, hard gate).
+
+ ### 29. Premortem
+
+ Before kickoff, assume the project failed and ask why — names the
+ risks the team already knows but has not voiced. The trick is the
+ past-tense framing; "what could go wrong" surfaces less than "it
+ failed, what happened". Cheap; high signal-to-noise; the residual
+ output is a risk register the team actually defends.
+ *Cited by:* `.agent-src.uncompressed/skills/risk-officer/`
+ (blast-radius framing, mitigations, residual-risk verdict pre-commit).
+
+ ### 30. Inversion
+
+ Instead of asking "how do I succeed at X?", ask "how would I
+ guarantee failure at X?" — then avoid that. Inversion exposes the
+ non-obvious failure modes that direct planning misses, especially
+ in security, ops, and people problems where the failure surface is
+ larger than the success surface.
+ *Cited by:* `.agent-src.uncompressed/skills/adversarial-review/`
+ (devil's-advocate stress-test poking holes in a plan).
+
+ ## Adding or removing a model
+
+ Hard cap is 30. Adding a 31st requires removing one and naming the
+ swap in the PR description — the council verdict (R23) is that the
+ list earns its weight only if every entry is load-bearing. Removal
+ criteria: ≤ 1 citation across the catalog after one minor release,
+ or superseded by a more general entry.
@@ -153,7 +153,7 @@ Your agent now understands slash commands:
  | `/quality-fix` | Run and fix all quality checks |
  | `/chat-history` | Inspect the persistent chat-history log (read-only `show`) |
 
- → [Browse all 103 active commands](../.agent-src/commands/)
+ → [Browse all 104 active commands](../.agent-src/commands/)
 
  ---
 
@@ -43,6 +43,58 @@ rejects entries missing any required field.
  | `owner` | yes | team slug | who keeps this entry fresh |
  | `last_validated` | yes | ISO date | stale check per type |
  | `review_after_days` | yes | integer | triggers staleness warning |
+ | `priority` | no | `critical` \| `normal` \| `low` | tier-0 surfacing; defaults to `normal` |
+ | `ts_week` | no | ISO-week string `YYYY-Www` | promotion week stamp; convention, not enforced |
+
+ ### Priority semantics (`critical` / `normal` / `low`)
+
+ The `priority` field controls how aggressively `/memory:load` surfaces
+ an entry. The three-tier enum is intentional — see
+ `road-to-dream-skill-adoption.md` § B2 and the Phase 2 council brief
+ for why a fourth `high` tier was rejected.
+
+ | Value | Meaning | Reader behaviour |
+ |---|---|---|
+ | `critical` | Tier-0 — always surface regardless of query | `/memory:load` injects on every load, irrespective of key/query match |
+ | `normal` (default) | Standard query-matched retrieval | Surfaced when the lookup key/query matches the entry |
+ | `low` | Background — only surface on explicit full load | Skipped by query-matched retrieval; visible only via `/memory:load --type` full sweep |
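
Editorial illustration (not part of the diff): the reader-behaviour table can be read as a three-way filter. This is a minimal sketch under stated assumptions: entries are plain dicts, the `key` field and substring matching are hypothetical, and a full sweep is assumed to show `normal` and `low` entries alike. The real `/memory:load` may behave differently.

```python
def select_entries(entries, query=None, full_sweep=False):
    """Pick the entries to surface, following the three-tier priority table."""
    selected = []
    for entry in entries:
        priority = entry.get("priority", "normal")  # documented default
        if priority == "critical":
            selected.append(entry)  # tier-0: surfaced on every load
        elif full_sweep:
            selected.append(entry)  # full sweep: normal and low both visible (assumption)
        elif priority == "normal" and query is not None and query in entry["key"]:
            selected.append(entry)  # query-matched retrieval (substring match is an assumption)
        # low-priority entries fall through: skipped unless full_sweep is set
    return selected


entries = [
    {"key": "deploy-freeze", "priority": "critical"},
    {"key": "db-index"},                       # no priority field: treated as normal
    {"key": "db-legacy", "priority": "low"},
]
print([e["key"] for e in select_entries(entries, query="db")])
```

With the query `"db"`, the critical entry is surfaced although it does not match, and the `low` entry is skipped although it does.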
+
+ **Tier-0 governance.**
+ `scripts/check_memory.py` enforces two soft guards on `critical` entries:
+
+ - **Critical-stale warning** — a `priority: critical` entry whose
+ `last_validated` is older than 90 days emits a `critical-stale` warning
+ during validation (still exit 0; the curator decides whether to
+ re-validate or downgrade).
+ - **Tier-0 inflation warning** — when a memory type accumulates more
+ than 10 active `critical` entries, the validator warns. The intent is
+ to keep the always-surface slice small enough to remain signal, not to
+ block writes; raise the threshold deliberately if the project's domain
+ genuinely needs more.
+
+ Both are warnings, never errors. The curator stays in charge.
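
Editorial illustration (not part of the diff): the two soft guards could look roughly like the sketch below. It is not the code in `scripts/check_memory.py`. The 90-day and 10-entry thresholds and the `critical-stale` label come from the text above; the `id` and `type` fields, the `tier0-inflation` label, and the omission of any "active" status check are assumptions.

```python
from collections import Counter
from datetime import date

CRITICAL_STALE_DAYS = 90       # from the text: older than 90 days
CRITICAL_INFLATION_LIMIT = 10  # from the text: more than 10 critical entries per type


def tier0_warnings(entries, today):
    """Return warning strings for critical entries. Never raises, never blocks."""
    warnings = []
    critical = [e for e in entries if e.get("priority", "normal") == "critical"]

    # Guard 1: critical entries that have not been re-validated recently.
    for entry in critical:
        age_days = (today - date.fromisoformat(entry["last_validated"])).days
        if age_days > CRITICAL_STALE_DAYS:
            warnings.append(f"critical-stale: {entry['id']} ({age_days} days)")

    # Guard 2: too many critical entries within one memory type.
    per_type = Counter(e["type"] for e in critical)
    for memory_type, count in sorted(per_type.items()):
        if count > CRITICAL_INFLATION_LIMIT:
            warnings.append(f"tier0-inflation: {memory_type} has {count} critical entries")

    return warnings


old = {"id": "freeze-rule", "type": "incident", "priority": "critical",
       "last_validated": "2026-01-01"}
print(tier0_warnings([old], today=date(2026, 5, 5)))
```

Returning a list instead of raising mirrors the "warnings, never errors" rule: a caller can print the messages and still exit 0.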
+
+ ### Temporal jitter (`ts_week`)
+
+ `ts_week` stamps a curated entry with the **ISO week** it was promoted
+ (`YYYY-Www`, e.g. `2026-W17`). It is optional and **convention-only** —
+ the validator does not require it and does not reject entries without
+ it. Promotion tooling (`/memory:promote`) writes it; manual edits are
+ free to set or omit.
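
Editorial illustration (not part of the diff): a `YYYY-Www` stamp can be derived from a date with Python's `date.isocalendar()`. The helper name `ts_week` is chosen here for illustration; how `/memory:promote` actually computes the value is not shown in this diff.

```python
from datetime import date


def ts_week(day: date) -> str:
    """Format a date as an ISO-week stamp, e.g. 2026-W17."""
    # isocalendar() returns the ISO year, which can differ from day.year
    # for dates in the first or last days of a calendar year.
    iso_year, iso_week, _weekday = day.isocalendar()
    return f"{iso_year}-W{iso_week:02d}"


print(ts_week(date(2026, 4, 22)))  # 2026-W17
print(ts_week(date(2027, 1, 1)))   # 2026-W53
```

The second call shows why the ISO year must be used instead of `day.year`: 1 January 2027 still belongs to the last ISO week of 2026.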
+
+ **Why ISO-week, not date-time.** Curated YAML lives in the repo and is
+ reviewable by anyone with access. A precise timestamp on every entry
+ leaks session timing — "this rule appeared Tuesday 3pm" correlates with
+ "the incident hit Tuesday 3pm". ISO-week granularity preserves long-
+ term ordering (useful for audit) while removing intra-week inference.
+
+ **When to use it.** Stamp on every promotion. Do not retroactively
+ backfill — empty `ts_week` for older entries is fine and a deliberate
+ non-signal.
+
+ **Privacy carve-outs.** Highly sensitive entries (incident-learnings
+ tied to active investigations) may omit `ts_week` entirely; the field
+ is not a forensic record.
 
  ## Type-specific required fields
 
@@ -0,0 +1,127 @@
+ # Cross-Role Handoff
+
+ Wing-specific prose for senior-tier skills handing off across role
+ boundaries. The mechanical contract — initiator → delegated(input) →
+ output, lint rules, worktree boundary — lives in
+ [`docs/contracts/cross-wing-handoff.md`](../contracts/cross-wing-handoff.md).
+ This guideline covers **when a role hands off to another role**,
+ **how to phrase the routing**, and the **L4 / C8 boundary**.
+
+ ## Wings at a glance
+
+ The senior catalog spans four wings. Each wing owns a cognition
+ cluster and emits artifacts the other wings can consume.
+
+ | Wing | Cluster | Senior skills (anchor examples) |
+ |---|---|---|
+ | **1. Engineering** | Code, architecture, debugging, review | `architecture-review-lens`, `bug-analyzer`, `judge-bug-hunter`, `blast-radius-analyzer` |
+ | **2. Product + Foundation** | Discovery, refinement, decisions | `po-discovery`, `refine-ticket`, `decision-record`, `rice-prioritization` |
+ | **3. GTM + Growth** | Customers, comms, funnel, channels | `customer-research` (L1), `release-comms` (L2), `funnel-analysis` |
+ | **4. Money + Strategy + Ops** | Unit economics, OKRs, capacity, risk | `unit-economics-modeling`, `okr-tree-modeling`, `dcf-modeling`, `risk-officer` |
+
+ A handoff is **cross-role** when the initiator and the delegate live
+ in different wings (or in different cognition clusters within the
+ same wing). Same-cluster delegation is normal composition and does
+ not need this guideline.
+
+ ## When to hand off
+
+ A senior skill SHOULD hand off — not absorb — when any of the four
+ fires:
+
+ 1. **Different cognition cluster.** The downstream step needs a
+ different mode of thinking (numbers vs. narrative; user vs.
+ system; risk vs. design). Absorbing it dilutes the skill.
+ 2. **Different artifact owner.** The output naturally lives under a
+ different role's catalog (e.g. `forecast-band.json` belongs to
+ Wing-4, not Wing-2).
+ 3. **Tier-mismatch risk.** Inlining the step would require
+ downgrading to a non-senior delegate; the cross-wing-handoff
+ linter blocks tier mismatches.
+ 4. **Re-use evidence.** The step is already cited by ≥ 2 other
+ senior skills; absorbing it duplicates cognition.
+
+ If none fire, keep the step inline. Cross-role plumbing is not free.
+
+ ## How to phrase the handoff
+
+ Two surfaces in the senior-skill template carry routing:
+
+ - **`## Related Skills` § *WHEN NOT to use this*** — the routing
+ list. One bullet per peer that owns the cognition the user might
+ expect from this skill but is wrong to ask here. Format:
+ *"X is the actual question — route to [`<peer>`](../<peer>/SKILL.md)"*.
+ - **`## Procedure` Composes line** — when the skill **does** call
+ another skill mid-procedure, declare it on a line beginning with
+ `Composes [`<peer>`](...)` so the linter can match the call site
+ to the delegate's `## Input` block.
+
+ The delegated skill's `## Input` block names the fields the initiator
+ must pass. Drift between the two is the failure the contract catches.
+
+ ## Decision tree
+
+ ```
+ Need a downstream cognition step?
+ ├── Same wing + same cluster? → keep inline (normal composition).
+ ├── Different cluster, but no shipped senior peer?
+ │ └── Implement inline; flag for next plate's audit (the cluster
+ │ might need its own senior).
+ ├── Different cluster + shipped senior peer + ≥ 2 reuse citations?
+ │ └── HAND OFF: declare in WHEN NOT block + Composes line.
+ └── Cross-wing chain (≥ 3 senior steps, ≥ 30 min each)?
+ └── Use `subagent-orchestration` mode 6 (worktrees) per
+ cross-wing-handoff.md § 3.
+ ```
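
Editorial illustration (not part of the diff): the decision tree can be encoded as a small routing function. This is a sketch under stated assumptions. The function name, parameter names, and return labels are hypothetical. The tree does not say how overlapping branches are ordered, so the long-chain branch is checked before the other cross-cluster branches here, and a shipped peer with fewer than 2 reuse citations falls back to inline; both choices are interpretations, not rules from the package.

```python
def route_step(same_wing, same_cluster, has_senior_peer, reuse_citations,
               chain_steps=1, minutes_per_step=0):
    """Decide how to handle a downstream cognition step."""
    if same_wing and same_cluster:
        return "inline"            # normal composition
    if chain_steps >= 3 and minutes_per_step >= 30:
        return "worktrees"         # subagent-orchestration mode 6
    if not has_senior_peer:
        return "inline-and-flag"   # implement inline, flag for audit
    if reuse_citations >= 2:
        return "hand-off"          # WHEN NOT entry plus Composes line
    return "inline"                # assumption: no branch fired


# The worked CAC-payback case: different wing, shipped peer, reused elsewhere.
print(route_step(same_wing=False, same_cluster=False,
                 has_senior_peer=True, reuse_citations=2))
```

Making the ordering explicit in code exposes the gap in the tree: a shipped peer with fewer than 2 citations has no branch of its own.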
+
+ ## L4 / C8 composition boundary
+
+ Council Q3 (2026-05-05) locks the disambiguation between L4
+ `stakeholder-tradeoff` and the sibling C8 `code-review-multi-lens`:
+
+ - **L4 fires** when a request crosses two stakeholder lenses
+ (engineering ↔ PO, PO ↔ ops, ops ↔ infra) and the trade-off is
+ **not yet code**. Output is a trade-off matrix + recommendation +
+ dissent log; the artifact is consumable by a roadmap or PR
+ description, not by a diff.
+ - **C8 fires** when the request **is already code** — PR open, draft
+ branch under review, or a diff supplied as input. Output is a
+ multi-lens code review (security · architecture · tests · quality)
+ bound to file:line spans.
+ - **C8 → L4 escalation.** A C8 verdict that surfaces a stakeholder
+ conflict — e.g. test-coverage judge fails but PO insists on
+ shipping — becomes **input to L4**. The escalation is one-way:
+ L4 produces the dissent log that decides whether C8's verdict
+ is overridden, with the override recorded in
+ [`decision-record`](../../.agent-src.uncompressed/skills/decision-record/SKILL.md).
+
+ The boundary keeps the two skills sharp — neither absorbs the other —
+ and gives the agent a deterministic rule for which one to load when
+ both look applicable.
+
+ ## Worked example
+
+ A PO refining a ticket (Wing-2) hits a sentence like *"the cheapest
+ acquisition channel is paid search, but only if CAC payback < 6 months"*:
+
+ 1. Wing-2 is the initiator (`refine-ticket`).
+ 2. The CAC question is Wing-4 cognition — `unit-economics-modeling`.
+ 3. `refine-ticket` hands off via:
+ - WHEN NOT entry: *"CAC / payback questions — route to
+ [`unit-economics-modeling`](../unit-economics-modeling/SKILL.md)"*.
+ - Composes line in the procedure step that needs the answer.
+ 4. The delegate's `## Input` block lists the fields (channel,
+ cohort, time horizon); `refine-ticket` passes them.
+ 5. The output (`payback-band.md`) feeds the AC of the original ticket.
+
+ No cluster collision, no tier mismatch, no untyped drift.
+
+ ## See also
+
+ - [`docs/contracts/cross-wing-handoff.md`](../contracts/cross-wing-handoff.md)
+ — the mechanical contract this guideline cites.
+ - [`docs/contracts/context-spine.md`](../contracts/context-spine.md)
+ — orthogonal context-slot mechanism, often used together with a
+ handoff (e.g. delegate reads `team` slot the initiator opted in to).
+ - `.agent-src.uncompressed/skills/subagent-orchestration/SKILL.md` § mode 6
+ — when the chain runs in fresh worktrees.
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@event4u/agent-config",
- "version": "1.34.0",
+ "version": "1.36.0",
  "description": "Shared agent configuration \u2014 skills, rules, commands, guidelines, and templates for AI coding tools",
  "license": "MIT",
  "private": false,