@event4u/agent-config 2.8.0 → 2.10.0

Files changed (67)
  1. package/.agent-src/personas/engineering-manager.md +133 -0
  2. package/.agent-src/personas/finance-partner.md +129 -0
  3. package/.agent-src/personas/people-strategist.md +126 -0
  4. package/.agent-src/personas/strategist.md +129 -0
  5. package/.agent-src/rules/no-roadmap-references.md +19 -0
  6. package/.agent-src/skills/build-buy-partner/SKILL.md +145 -0
  7. package/.agent-src/skills/comp-banding/SKILL.md +160 -0
  8. package/.agent-src/skills/competitive-moat-analysis/SKILL.md +152 -0
  9. package/.agent-src/skills/contracts-cognition/SKILL.md +147 -0
  10. package/.agent-src/skills/data-handling-judgment/SKILL.md +155 -0
  11. package/.agent-src/skills/forecasting/SKILL.md +164 -0
  12. package/.agent-src/skills/hiring-loop-design/SKILL.md +167 -0
  13. package/.agent-src/skills/market-entry-analysis/SKILL.md +144 -0
  14. package/.agent-src/skills/onboarding-program/SKILL.md +157 -0
  15. package/.agent-src/skills/one-on-one-cadence/SKILL.md +161 -0
  16. package/.agent-src/skills/org-design/SKILL.md +158 -0
  17. package/.agent-src/skills/perf-feedback-craft/SKILL.md +157 -0
  18. package/.agent-src/skills/privacy-review/SKILL.md +160 -0
  19. package/.agent-src/skills/runway-cognition/SKILL.md +136 -0
  20. package/.agent-src/skills/scenario-modeling/SKILL.md +139 -0
  21. package/.agent-src/skills/throughput-vs-morale-tradeoff/SKILL.md +165 -0
  22. package/.agent-src/skills/unit-economics-modeling/SKILL.md +54 -7
  23. package/.agent-src/skills/vision-articulation/SKILL.md +146 -0
  24. package/.agent-src/templates/agents/agent-project-settings.example.yml +1 -1
  25. package/.agent-src/templates/scripts/telemetry/settings.py +65 -0
  26. package/.agent-src/templates/scripts/tier_usage_report.py +183 -0
  27. package/.agent-src/templates/scripts/work_engine/hooks/builtin/memory_visibility.py +32 -3
  28. package/.agent-src/templates/scripts/work_engine/scoring/memory_visibility.py +147 -1
  29. package/.claude-plugin/marketplace.json +18 -1
  30. package/AGENTS.md +1 -1
  31. package/CHANGELOG.md +134 -0
  32. package/README.md +34 -14
  33. package/config/agent-settings.template.yml +28 -0
  34. package/docs/architecture.md +37 -11
  35. package/docs/catalog.md +22 -4
  36. package/docs/contracts/adr-forecast-construction-shape.md +89 -0
  37. package/docs/contracts/adr-wing4-context-spine.md +125 -0
  38. package/docs/contracts/command-clusters.md +41 -0
  39. package/docs/contracts/command-surface-tiers.md +25 -9
  40. package/docs/contracts/context-spine.md +8 -0
  41. package/docs/contracts/decision-trace-v1.md +30 -0
  42. package/docs/contracts/hook-architecture-v1.md +46 -0
  43. package/docs/contracts/mcp-beta-criteria.md +129 -0
  44. package/docs/contracts/memory-visibility-v1.md +33 -0
  45. package/docs/contracts/settings-sync-yaml-subset.md +138 -0
  46. package/docs/guidelines/wing4-handoff.md +127 -0
  47. package/docs/mcp-server.md +1 -1
  48. package/docs/readme-split-plan.md +102 -0
  49. package/package.json +1 -1
  50. package/scripts/_cli/cmd_doctor.py +527 -14
  51. package/scripts/_cli/cmd_settings_check.py +171 -0
  52. package/scripts/_cli/cmd_validate.py +10 -0
  53. package/scripts/agent-config +59 -18
  54. package/scripts/chat_history.py +19 -0
  55. package/scripts/check_council_references.py +46 -5
  56. package/scripts/hooks/dispatch_hook.py +5 -1
  57. package/scripts/hooks/replay_hook.py +144 -0
  58. package/scripts/hooks/state_io.py +24 -1
  59. package/scripts/hooks_doctor.py +184 -0
  60. package/scripts/install.py +5 -0
  61. package/scripts/lint_context_spine_usage.py +1 -0
  62. package/scripts/lint_hook_concern_budget.py +203 -0
  63. package/scripts/mcp_server/__init__.py +1 -0
  64. package/scripts/mcp_server/server.py +4 -3
  65. package/scripts/roadmap_progress_hook.py +11 -0
  66. package/scripts/schemas/skill.schema.json +2 -2
  67. package/scripts/skill_linter.py +107 -3
@@ -113,6 +113,36 @@ the trace inherits the **maximum** risk class across all files the
113
113
  phase touched. If no files were touched (pure planning phase), risk
114
114
  is `low`.
115
115
 
116
+ ## Memory consequence keys
117
+
118
+ **Purpose.** Bound the surface area where a memory hit can be said
119
+ to have *changed* an outcome. Closed list, not open — without this
120
+ bound, every memory call risks the "memory affected everything"
121
+ failure mode (Risk register row 2 of
122
+ [`agents/roadmaps/road-to-proof-not-features.md`](../../agents/roadmaps/road-to-proof-not-features.md)).
123
+
124
+ **Closed list (v1).** Exactly four keys. Adding a fifth requires a
125
+ schema bump plus an entry under `### Breaking` in `CHANGELOG.md`.
126
+
127
+ | Key | Source | Diff semantics |
128
+ |---|---|---|
129
+ | `confidence_band` | Top-level envelope field. | String inequality (`high` ≠ `medium` ≠ `low`). |
130
+ | `risk_class` | Top-level envelope field. | String inequality. |
131
+ | `applied_rules` | Derived: sorted list of `rules[].rule_id` where `applied == true`. | Set inequality. |
132
+ | `test_plan` | Derived: sorted list of test paths captured in the Plan-phase `state.plan.tests` slice. May be `null` when the phase is not `plan` or no Plan-phase tests were captured. | Set inequality; `null` on either side suppresses the key from the diff. |
133
+
134
+ **Diff semantics.** The producer renders two traces for the same
135
+ phase: one **with** the memory entry consulted, one **without**
136
+ (re-running the heuristic against `memory.hits` decremented by the
137
+ entry's contribution). The `affected` field is the sorted list of
138
+ keys above whose values differ between the two traces. Empty list
139
+ means "consulted but no key diverged" — the call was informational,
140
+ not load-bearing.
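The diff rule above can be sketched in Python. This is an illustration only, not the package's producer: the helper name `compute_affected` and the flat-mapping shape of a trace are assumptions; the four key names and the `null` suppression rule come from the table.

```python
from typing import Any, Mapping

# Closed v1 key list from the table above.
STRING_KEYS = ("confidence_band", "risk_class")
SET_KEYS = ("applied_rules", "test_plan")


def compute_affected(with_memory: Mapping[str, Any],
                     without_memory: Mapping[str, Any]) -> list[str]:
    """Return the sorted keys whose values differ between two traces.

    Hypothetical helper: both traces are assumed to be flat mappings that
    already hold the derived `applied_rules` and `test_plan` lists.
    """
    affected = []
    for key in STRING_KEYS:
        # String inequality.
        if with_memory.get(key) != without_memory.get(key):
            affected.append(key)
    for key in SET_KEYS:
        left, right = with_memory.get(key), without_memory.get(key)
        if key == "test_plan" and (left is None or right is None):
            continue  # null on either side suppresses the key
        # Set inequality: ordering of the lists does not matter.
        if set(left or ()) != set(right or ()):
            affected.append(key)
    return sorted(affected)
```

An empty result corresponds to the "consulted but no key diverged" case.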
141
+
142
+ **Out of scope for v1.** Gradations beyond binary key-diverged /
143
+ not-diverged (overridden, combined, filtered). Tracked as a Phase-1-
144
+ gated revisit in the same Risk register.
145
+
116
146
  ## Privacy floor
117
147
 
118
148
  - `memory.ids` carries opaque ids only — no entry bodies, no secrets.
@@ -205,6 +205,50 @@ that:
205
205
  The dispatcher silently no-ops when called with `--platform copilot`;
206
206
  the fallback is consumed by reading the rule, not by hook invocation.
207
207
 
208
+ ## Fixture corpus — `tests/fixtures/hooks/`
209
+
210
+ Replay-safe, platform-native payloads. One JSON file per event in the
211
+ agent-config event vocabulary. Consumed by `./agent-config hooks:replay`
212
+ and by the dispatcher replay tests
213
+ (`tests/hooks/test_hooks_replay.py` — Phase 2.4c).
214
+
215
+ ```
216
+ tests/fixtures/hooks/
217
+ session_start.json · session_end.json · user_prompt_submit.json
218
+ pre_tool_use.json · post_tool_use.json · stop.json
219
+ pre_compact.json · agent_error.json
220
+ README.md — corpus contract + platform-shape table
221
+ ```
222
+
223
+ Each fixture is a **stdin payload** — the dispatcher wraps it via
224
+ `_build_envelope` before handing it to a concern. Required keys:
225
+
226
+ - Valid JSON object at the top level.
227
+ - `session_id` — string, non-empty (drives feedback dir naming).
228
+ - Event-specific fields realistic enough that the bound concerns
229
+ (`chat-history`, `roadmap-progress`, `context-hygiene`,
230
+ `verify-before-complete`, `minimal-safe-diff`) run without raising
231
+ — primarily `tool_name` (for `*_tool_use`), `prompt` (for
232
+ `user_prompt_submit`).
233
+ - No real user content. Committed alongside source; the redaction
234
+ workflow in [`hook-payload-capture`](../hook-payload-capture.md)
235
+ applies to **captured** payloads, not committed fixtures.
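A minimal sketch of how the required keys above could be checked for one fixture. The helper name `validate_fixture` is hypothetical, and treating `tool_name` and `prompt` as hard requirements is an assumption (the list only calls them the primary event-specific fields).

```python
import json

TOOL_EVENTS = {"pre_tool_use", "post_tool_use"}


def validate_fixture(event: str, raw: str) -> dict:
    """Check one fixture payload against the required keys listed above.

    Hypothetical helper; the authoritative contract is the corpus README.
    """
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError(f"{event}: top level must be a JSON object")
    session_id = payload.get("session_id")
    if not isinstance(session_id, str) or not session_id:
        raise ValueError(f"{event}: session_id must be a non-empty string")
    # Assumption: the "primary" event-specific fields are treated as mandatory.
    if event in TOOL_EVENTS and "tool_name" not in payload:
        raise ValueError(f"{event}: tool_name is required")
    if event == "user_prompt_submit" and "prompt" not in payload:
        raise ValueError(f"{event}: prompt is required")
    return payload
```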
236
+
237
+ The corpus is platform-shape-representative, not platform-exhaustive
238
+ — multi-platform shape coverage lives in
239
+ `tests/hooks/test_event_shape_contract.py`. The replay test asserts
240
+ 1:1 mapping between `EVENT_VOCABULARY` and this directory.
241
+
242
+ ## Replay mode — `AGENT_CONFIG_REPLAY=1`
243
+
244
+ Concerns that write under `agents/state/` MUST honor the
245
+ `AGENT_CONFIG_REPLAY` env var: when set to `1`, skip all state
246
+ mutations and run as read-only. The dispatcher passes the env var
247
+ through to subprocess concerns unchanged. Concerns that do not honor
248
+ the flag are listed by `./agent-config hooks:doctor` as not
249
+ replay-safe; replay tests assert no `agents/state/` mutation
250
+ post-invocation.
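One way a concern might honor the flag, as a sketch. Only the `AGENT_CONFIG_REPLAY` variable and its value `1` come from the contract; `replay_mode` and `write_state` are hypothetical names, not functions from the package.

```python
import os
from pathlib import Path


def replay_mode() -> bool:
    """True when the environment requests read-only replay."""
    return os.environ.get("AGENT_CONFIG_REPLAY") == "1"


def write_state(path: Path, content: str) -> bool:
    """Write a state file unless replay mode is active.

    Hypothetical helper; returns True when the file was written.
    """
    if replay_mode():
        return False  # read-only run: skip every state mutation
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True
```

Routing every write through a single guard like this keeps the "no mutation after replay" assertion easy to satisfy.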
251
+
208
252
  ## Stability
209
253
 
210
254
  Beta. Breaking changes between v1 and v2 are allowed in a minor
@@ -218,3 +262,5 @@ majors.
218
262
  operational how-to for capturing redacted live payloads to upgrade
219
263
  a platform's chat-history extractor from `docs-verified` to
220
264
  `payload-verified`.
265
+ - [`tests/fixtures/hooks/README.md`](../../tests/fixtures/hooks/README.md)
266
+ — fixture corpus contract.
@@ -0,0 +1,129 @@
1
+ ---
2
+ stability: experimental
3
+ mcp_scope: lite
4
+ ---
5
+
6
+ # MCP Beta Criteria — Promotion Gate (Hard Contract)
7
+
8
+ > **Status:** Active · governs the `experimental → beta` promotion for
9
+ > the MCP surface (`scripts/mcp_server/` local stdio kernel + the
10
+ > hosted `workers/mcp/` bridge). Owned by Phase 3 of
11
+ > [`road-to-surface-discipline.md`](../../agents/roadmaps/road-to-surface-discipline.md).
12
+ > Companion contract:
13
+ > [`mcp-phase-1-scope.md`](mcp-phase-1-scope.md) (local) ·
14
+ > [`mcp-cloud-scope.md`](mcp-cloud-scope.md) (hosted).
15
+
16
+ ## Purpose
17
+
18
+ The current MCP wording uses `experimental` across READMEs, module
19
+ docstrings, and the initialize-result server description. There is no
20
+ defined bar for retiring that label. This contract names six gates
21
+ that together flip `experimental → beta`. Every gate is **observable**
22
+ (test file, doc, or script), **falsifiable** (red is allowed; missing
23
+ is not), and **machine-reportable** through `agent-config doctor
24
+ --check mcp-beta-readiness` (lands in Phase 3 Step 5).
25
+
26
+ > **Iron Law:** all six gates must be green for the same release tag
27
+ > before any user-visible surface drops `experimental`. A green gate
28
+ > sheet on `main` does not authorize a back-dated wording change on a
29
+ > release branch that did not also pass the sheet.
30
+
31
+ ## The six gates
32
+
33
+ Each gate is owned by a single artefact. When the artefact is missing,
34
+ Phase 3 Step 3 creates a **failing test** (`pytest.skip("pending: …",
35
+ allow_module_level=True)` or `raise NotImplementedError("mcp-beta-gate-N
36
+ pending")`) so the AC stays falsifiable.
37
+
38
+ ### Gate 1 — External-client end-to-end run
39
+
40
+ At least one MCP client **outside this repo's own test harness** has
41
+ completed a full session against MCP Lite: `initialize` →
42
+ `prompts/list` → `prompts/get` → `resources/list` → `resources/read`
43
+ → shutdown. Evidence is a transcript or recorded session under
44
+ `tests/mcp/external-clients/` plus the client name and version
45
+ (Claude Desktop ≥ vX, Cursor ≥ vY, Zed ≥ vZ, Continue ≥ vW).
46
+
47
+ ### Gate 2 — Bearer-auth coverage
48
+
49
+ `tests/mcp/auth/` must cover four cases against the hosted Worker
50
+ surface — **happy path**, **401 on missing token**, **401 on expired
51
+ token**, **401 → 200 on rotated token**. Each case asserts the wire
52
+ envelope shape, not only the status code. Gate fails if any case is
53
+ skipped, xfailed, or absent.
54
+
55
+ ### Gate 3 — Lite/Full parity smoke suite
56
+
57
+ For every primitive the published surface exposes (`prompts/list`,
58
+ `prompts/get`, `resources/list`, `resources/read`), a parametrized
59
+ test asserts that the response bodies from the hosted Worker (Lite) and the
59
+ local stdio kernel (Full) are **byte-identical** (modulo the documented
61
+ deltas in `mcp-cloud-scope.md § Lite vs Full`). Failure must surface
62
+ the diff, not just a boolean.
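The "surface the diff" requirement can be sketched with the standard-library `difflib`. The helper names `parity_diff` and `assert_parity` are hypothetical, and this sketch does not apply the documented Lite vs Full deltas; a real suite would normalise those before comparing.

```python
import difflib


def parity_diff(lite_body: str, full_body: str) -> str:
    """Return a unified diff of two response bodies, or "" when identical."""
    if lite_body == full_body:
        return ""
    lines = difflib.unified_diff(
        full_body.splitlines(keepends=True),
        lite_body.splitlines(keepends=True),
        fromfile="full (stdio)",
        tofile="lite (worker)",
    )
    return "".join(lines)


def assert_parity(lite_body: str, full_body: str) -> None:
    """Fail with the diff in the message rather than a bare boolean."""
    diff = parity_diff(lite_body, full_body)
    assert diff == "", f"Lite/Full bodies differ:\n{diff}"
```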
63
+
64
+ ### Gate 4 — Health endpoint under load
65
+
66
+ The hosted Worker exposes `/healthz` (or equivalent) that returns a
67
+ structured JSON envelope `{status, uptime_s, build_sha,
68
+ last_content_refresh}`. A k6 / wrk smoke test in
69
+ `tests/mcp/load/healthz.k6.js` proves p95 < 200 ms across 60 s at 50
70
+ RPS. The local stdio kernel surfaces the same envelope through a
71
+ `server/health` JSON-RPC ping.
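The two checks in this gate reduce to simple arithmetic, sketched here in Python rather than in the k6 script the gate names. The envelope keys come from the text above; the helper names and the nearest-rank percentile definition are assumptions, since the contract does not say which percentile method the load tool uses.

```python
REQUIRED_KEYS = {"status", "uptime_s", "build_sha", "last_content_refresh"}


def missing_envelope_keys(envelope: dict) -> set[str]:
    """Return the required health-envelope keys that are absent."""
    return REQUIRED_KEYS - set(envelope)


def p95(latencies_ms: list[float]) -> float:
    """Nearest-rank 95th percentile (assumed definition)."""
    if not latencies_ms:
        raise ValueError("no samples")
    ordered = sorted(latencies_ms)
    # ceil(0.95 * n) in integer arithmetic to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def gate_passes(envelope: dict, latencies_ms: list[float],
                budget_ms: float = 200.0) -> bool:
    """True when the envelope is complete and p95 is under the budget."""
    return not missing_envelope_keys(envelope) and p95(latencies_ms) < budget_ms
```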
72
+
73
+ ### Gate 5 — Abuse / rate-limit plan
74
+
75
+ `docs/contracts/mcp-rate-limit.md` exists and pins three knobs —
76
+ per-token RPS, per-token daily quota, per-IP burst — with a fallback
77
+ behaviour on overrun (`429` + `Retry-After`). The Worker enforces the
78
+ knobs; a contract test in `tests/mcp/rate-limit/` asserts that
79
+ exceeding any knob returns `429` with a non-empty `Retry-After`.
80
+
81
+ ### Gate 6 — Lite ↔ Full no-drift
82
+
83
+ A nightly CI job runs the Phase 3 Step 3 parity suite (Gate 3) plus a
84
+ canary: ingest one prompt and one resource on both surfaces, hash the
85
+ body, and assert equality. Drift > 0 fails the job and posts a Slack
86
+ ping. Evidence: the workflow file (`.github/workflows/mcp-no-drift.yml`)
87
+ **and** at least one successful run within the last 7 days.
88
+
89
+ ## Promotion procedure
90
+
91
+ 1. Open a release-candidate branch named `release/mcp-beta-rcN`.
92
+ 2. Run `./agent-config doctor --check mcp-beta-readiness` — must
93
+ print all six gates green.
94
+ 3. Flip the wording in the **five** surfaces inventoried in
95
+ [`road-to-surface-discipline.md` Phase 3 Step 1](../../agents/roadmaps/road-to-surface-discipline.md):
96
+ `docs/mcp-server.md` (status banner + Remote-MCP sub-claim),
97
+ `README.md` (pointer line), `scripts/mcp_server/server.py`
98
+ (initialize-result `serverInfo.name`),
99
+ `scripts/mcp_server/__init__.py` (module docstring `Stability:`).
100
+ 4. Update the changelog with the gate sheet snapshot.
101
+ 5. Merge the RC branch through the normal review path. Tag is **not**
102
+ created until the gate sheet is reproducible on the merge commit.
103
+
104
+ ## Demotion procedure
105
+
106
+ Any single gate going red on `main` for more than 7 consecutive days
107
+ demotes the surface back to `experimental` at the next release. This
108
+ is a wording-only demotion; no code is reverted. The doctor check
109
+ reports the demotion automatically.
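The seven-day rule can be read as a streak check. A minimal sketch, assuming one green/red observation per day on `main`; `should_demote` is a hypothetical name, not the doctor check itself.

```python
def should_demote(daily_green: list[bool], limit_days: int = 7) -> bool:
    """True when a gate was red for more than `limit_days` consecutive days.

    `daily_green` holds one entry per day, oldest first (assumed input shape).
    """
    streak = 0
    for green in daily_green:
        streak = 0 if green else streak + 1
        if streak > limit_days:
            return True
    return False
```

Exactly seven red days does not demote; the eighth consecutive red day does.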
110
+
111
+ ## Surface delta
112
+
113
+ This contract adds **0 new commands**, **0 new skills**, **0 new
114
+ personas**. It defines a promotion gate; nothing more. Net surface
115
+ delta for Phase 3: ≤ 0.
116
+
117
+ ## Cross-references
118
+
119
+ - [`mcp-phase-1-scope.md`](mcp-phase-1-scope.md) — local stdio kernel
120
+ hard contract (A0).
121
+ - [`mcp-cloud-scope.md`](mcp-cloud-scope.md) — hosted Worker hard
122
+ contract (A0-cloud).
123
+ - [`mcp-tool-stub-envelope.md`](mcp-tool-stub-envelope.md) — Phase 1
124
+ discovery contract.
125
+ - [`STABILITY.md`](STABILITY.md) — stability tier definitions
126
+ (`experimental` / `beta` / `stable`) and what wording each tier may
127
+ use in user-visible surfaces.
128
+ - [`road-to-surface-discipline.md`](../../agents/roadmaps/road-to-surface-discipline.md)
129
+ — Phase 3 acceptance criteria and step-level evidence pointers.
@@ -24,6 +24,7 @@ and a single space:
24
24
 
25
25
  ```
26
26
  🧠 Memory: <hits>/<asks> · ids=[<comma-separated-ids>]
27
+ 🧠 Memory: <hits>/<asks> · ids=[<comma-separated-ids>] · affected: <keys>
27
28
  ```
28
29
 
29
30
  Examples:
@@ -32,6 +33,8 @@ Examples:
32
33
  🧠 Memory: 3/4 · ids=[mem_42, mem_57, mem_91]
33
34
  🧠 Memory: 0/2 · ids=[]
34
35
  🧠 Memory: 5/5 · ids=[mem_a01, mem_a02, mem_a03, …+2]
36
+ 🧠 Memory: 3/4 · ids=[mem_42, mem_57] · affected: confidence_band,applied_rules
37
+ 🧠 Memory: 2/4 · ids=[mem_42] · affected: none
35
38
  ```
36
39
 
37
40
  Cap at 5 ids inline; remainder rendered as `…+N`. The full id list
@@ -45,10 +48,15 @@ lives in the decision-trace JSON
45
48
  | `hits` | Count of `memory_retrieve_*` calls during this turn that returned ≥ 1 entry. |
46
49
  | `asks` | Count of `memory_retrieve_*` calls during this turn — both successful and empty. |
47
50
  | `ids` | Stable memory entry ids returned across all calls, deduped, ordered by retrieval timestamp. |
51
+ | `affected` | Optional trailing segment. Comma-separated list of decision-trace keys that diverged when this memory was consulted vs not consulted. Closed key list defined in [`decision-trace-v1.md § Memory consequence keys`](decision-trace-v1.md#memory-consequence-keys). Rendered as `none` when `hits ≥ 1` but no key diverged. Omitted entirely when `hits == 0` or when the producer cannot compute a counterfactual trace. |
48
52
 
49
53
  `hits ≤ asks` is invariant. If `asks == 0`, the engine MUST suppress
50
54
  the line entirely — no `0/0` noise.
51
55
 
56
+ The `affected` segment is a forward-compat trailing extension per
57
+ the Stability clause below — clients pinned to the segment-free
58
+ shape MUST still parse the line.
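A tolerant client-side parser might look like the sketch below, which accepts the line with or without the trailing `affected` segment. The regular expression, `VisibilityLine`, and `parse_visibility_line` are assumptions for illustration; only the line shape and the `none` rendering come from this contract.

```python
import re
from typing import NamedTuple, Optional

LINE_RE = re.compile(
    r"^🧠 Memory: (?P<hits>\d+)/(?P<asks>\d+) · ids=\[(?P<ids>[^\]]*)\]"
    r"(?: · affected: (?P<affected>.+))?$"
)


class VisibilityLine(NamedTuple):
    hits: int
    asks: int
    ids: list[str]
    affected: Optional[list[str]]  # None: segment absent; []: rendered "none"


def parse_visibility_line(line: str) -> VisibilityLine:
    """Parse one visibility line; the `affected` segment is optional."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a memory visibility line: {line!r}")
    # Skip the inline overflow marker ("…+N"); it is not an id.
    ids = [part.strip() for part in match["ids"].split(",")
           if part.strip() and not part.strip().startswith("…+")]
    raw = match["affected"]
    if raw is None:
        affected = None
    elif raw.strip() == "none":
        affected = []
    else:
        affected = [key.strip() for key in raw.split(",") if key.strip()]
    return VisibilityLine(int(match["hits"]), int(match["asks"]), ids, affected)
```

Keeping `None` distinct from an empty list preserves the difference between "segment omitted" and "consulted, nothing diverged".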
59
+
52
60
  ## Privacy floor
53
61
 
54
62
  The visibility line and the JSON it derives from MUST NOT contain:
@@ -88,6 +96,31 @@ counts and ids for downstream metrics.
88
96
  Cost-profile lookup respects `.agent-settings.yml`'s `cost_profile`
89
97
  key. Default is `standard`.
90
98
 
99
+ ## End-of-run "Memory changed decisions" block
100
+
101
+ When the visibility line carries a non-empty `affected` segment, the
102
+ engine MUST also append a structured block at the end of the run's
103
+ report surface so reviewers can audit attribution without parsing
104
+ the inline segment:
105
+
106
+ ```
107
+ Memory changed decisions:
108
+ - mem_42 → confidence_band
109
+ - mem_57 → confidence_band
110
+ ```
111
+
112
+ Rules:
113
+
114
+ - Suppressed entirely when `affected` is empty or absent (no key
115
+ diverged, or memory was not consulted).
116
+ - Each consulted id from the visibility line's `ids` is paired with
117
+ each affected key. v1 attribution is aggregate; per-id attribution
118
+ is a follow-up risk tracked in the roadmap Risk register.
119
+ - Block heading is the literal string `Memory changed decisions:`
120
+ followed by `-` bullet lines in `<id> → <key>` shape.
121
+ - Implementation: `format_changed_decisions_block` in
122
+ `work_engine/scoring/memory_visibility.py`.
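The rules above can be sketched as follows. This is not the package's `format_changed_decisions_block`; `render_changed_block` is a hypothetical stand-in, and the id-major ordering of the pairs is an assumption, since the rules do not fix an order.

```python
from typing import Optional, Sequence

HEADING = "Memory changed decisions:"


def render_changed_block(ids: Sequence[str],
                         affected: Optional[Sequence[str]]) -> str:
    """Render the end-of-run block, or "" when it must be suppressed."""
    if not affected:  # None (segment absent) or empty (no key diverged)
        return ""
    lines = [HEADING]
    # v1 attribution is aggregate: every consulted id pairs with every key.
    for memory_id in ids:
        for key in affected:
            lines.append(f"- {memory_id} → {key}")
    return "\n".join(lines)
```

Called with ids `mem_42`, `mem_57` and the single key `confidence_band`, this reproduces the two-bullet example shown above.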
123
+
91
124
  ## Audit-as-memory feed
92
125
 
93
126
  The visibility output produced by the engine is the input to the
@@ -0,0 +1,138 @@
1
+ ---
2
+ stability: beta
3
+ ---
4
+
5
+ # Settings-sync YAML subset
6
+
7
+ **Purpose.** Pin the YAML feature set that `.agent-settings.yml` and
8
+ `config/agent-settings.template.yml` may use, so contributors can cite a
9
+ contract instead of inferring it from
10
+ [`scripts/sync_yaml_rt.py`](../../scripts/sync_yaml_rt.py) source. The
11
+ sync engine ([ADR](adr-settings-sync-engine.md)) is a custom stdlib-only
12
+ round-trip parser/emitter; staying inside the subset below is what
13
+ keeps user-line preservation intact (every byte of every user line round-trips
14
+ unchanged unless the merger explicitly edits the key).
15
+
16
+ Authoritative source: this document. The module docstring of
17
+ `sync_yaml_rt.py` mirrors it; on drift, this file wins and the docstring
18
+ is corrected to match.
19
+
20
+ ## Supported
21
+
22
+ ### Document shape
23
+
24
+ - One YAML document per file. No `---` or `...` document separators.
25
+ - UTF-8. CRLF and LF line endings — both accepted, preserved per-line.
26
+
27
+ ### Mappings (sections)
28
+
29
+ - Block-style mappings only (`key: value` on its own line).
30
+ - Indent: 2- or 4-space, **no tabs** in indent.
31
+ - Nested mappings unlimited in depth (the template uses 3 levels —
32
+ e.g. `chat_history.archive.cleanup_after_days`).
33
+ - Duplicate keys at the same level: **last wins** (the later line
34
+ carries the value; the earlier entry is replaced).
35
+
36
+ ### Scalars (values)
37
+
38
+ - Bare scalars: `enabled`, `42`, `true`, `~`, `null`, `None`.
39
+ - Single-quoted strings: `'literal text'`.
40
+ - Double-quoted strings: `"literal text"`.
41
+ - Bools, ints, `~` / `null` / `None` are kept **verbatim** — the
42
+ parser does not normalise `True` → `true` or `null` → `~`.
43
+
44
+ ### Lists (sequences of scalars)
45
+
46
+ - Block-style lists:
47
+ ```yaml
48
+ allowlist:
49
+ - foo
50
+ - bar
51
+ ```
52
+ Indent inside the list must be consistent.
53
+ - Inline-flow lists, **flat only**: `[a, b, c]`.
54
+ - List items are scalars only. Nested mappings inside a list item are
55
+ **not** supported (see below).
56
+
57
+ ### Comments and blank lines
58
+
59
+ - `#`-comments — full-line and inline (`key: value # comment`). Both
60
+ preserved verbatim, including leading whitespace and the gap before
61
+ `#`.
62
+ - Blank lines preserved verbatim — the engine never collapses them.
63
+
64
+ ## Not supported (parser raises `ValueError` with a line number)
65
+
66
+ The following YAML features are out of contract. A user file that uses
67
+ any of them surfaces as `ValueError` from `scripts/sync_yaml_rt.py:sync`,
68
+ which `scripts/sync_agent_settings.py` catches and reports as **exit
69
+ code 2** with a line-numbered message.
70
+
71
+ - **Anchors and aliases** — `&name`, `*name`.
72
+ - **Multi-document streams** — `---` / `...` separators.
73
+ - **Nested flow mappings** — `key: {nested: value}` inline. Block-style
74
+ nested mappings are fine; flow-style nested mappings are not.
75
+ - **Nested mappings inside list items** — `- name: foo` followed by
76
+ indented children. Lists hold scalars only.
77
+ - **Complex keys** — `? [composite, key]: value`.
78
+ - **Tagged scalars** — `!!str 42`, `!Custom value`.
79
+ - **Multiline scalar styles** — `|` (literal) and `>` (folded) block
80
+ scalars.
81
+ - **Tabs in indent** — even one tab character in indent.
82
+ - **Mixed indent inside a block** — every child of a parent must share
83
+ the same indent.
84
+
85
+ Pinned by `tests/test_sync_round_trip.py` (34 tests) — every
86
+ not-supported feature has at least one fixture that asserts the
87
+ `ValueError` message.
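To make the failure surface concrete, here is a deliberately simplified line scanner that rejects a few of the features listed above with a line-numbered `ValueError`. It is not `sync_yaml_rt.py`; `check_subset` is a hypothetical name, and the sketch skips mixed-indent and nested-mapping-in-list detection and does not track quoting when it trims inline comments.

```python
BLOCK_SCALARS = {"|", ">", "|-", ">-", "|+", ">+"}


def check_subset(text: str) -> None:
    """Raise ValueError (with a line number) on some out-of-subset features."""
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip(" \t")
        indent = line[: len(line) - len(stripped)]
        if "\t" in indent:
            raise ValueError(f"line {number}: tab in indent")
        body = stripped.rstrip()
        if not body or body.startswith("#"):
            continue  # blank lines and full-line comments are in contract
        if body in ("---", "..."):
            raise ValueError(f"line {number}: document separator")
        if body.startswith("? "):
            raise ValueError(f"line {number}: complex key")
        _key, sep, value = body.partition(":")
        if not sep:  # list item or bare scalar, no key on this line
            value = body[2:] if body.startswith("- ") else body
        value = value.split(" #", 1)[0].strip()  # naive inline-comment trim
        if value[:1] in ("&", "*"):
            raise ValueError(f"line {number}: anchor or alias")
        if value.startswith("!"):
            raise ValueError(f"line {number}: tagged scalar")
        if value in BLOCK_SCALARS:
            raise ValueError(f"line {number}: block scalar")
        if value.startswith("{"):
            raise ValueError(f"line {number}: flow mapping")
```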
88
+
89
+ ## Test pinning
90
+
91
+ - Verbatim round-trip: `tests/test_sync_round_trip.py::test_user_block_round_trip_is_idempotent`, `::test_three_level_idempotent`.
92
+ - Out-of-subset rejection: same file, fixtures under
93
+ `tests/fixtures/sync_yaml_rt/` named `bad_*.yml`.
94
+ - CLI exit code on malformed input:
95
+ `tests/test_sync_agent_settings.py::test_malformed_user_yaml_exits_2_with_message`.
96
+
97
+ Any parser change is gated on those tests staying green. New fixtures
98
+ for new features land under `tests/fixtures/sync_yaml_rt/`.
99
+
100
+ ## Why this subset (and why it is fixed)
101
+
102
+ The driving requirement from
103
+ [`layered-settings`](../guidelines/agent-infra/layered-settings.md) is
104
+ **verbatim user-line preservation**. `ruamel.yaml` and PyYAML both
105
+ re-emit through their own emitters, which normalises whitespace,
106
+ quoting, and blank-line placement. A stdlib parser limited to this
107
+ subset gives byte-identity across two consecutive syncs — the property
108
+ the merger relies on for additive insertion.
109
+
110
+ Out-of-subset YAML therefore is not a parser bug; it is a contract
111
+ violation by the user file. The friendly `ValueError` and exit code 2
112
+ are the contract's failure surface.
113
+
114
+ ## Revisit triggers
115
+
116
+ This subset is **fixed** until one of the
117
+ [ADR revisit triggers](adr-settings-sync-engine.md#revisit-triggers)
118
+ fires — namely:
119
+
120
+ 1. `.agent-settings.yml` schema gains a YAML feature outside the subset
121
+ (anchors, multi-doc, complex keys, nested flow mappings) — the cost
122
+ of extending the parser exceeds the cost of adopting `ruamel.yaml`.
123
+ 2. The verbatim-preservation contract is relaxed — the driver for the
124
+ custom parser is gone.
125
+ 3. The 0-dep posture for Python tooling is dropped at the package level
126
+ — the marginal cost of one more dep collapses.
127
+ 4. A maintenance bug surfaces in the engine that ruamel's mature spec
128
+ coverage would have prevented.
129
+
130
+ A new ADR (with successor link) is required to change the subset; this
131
+ document is updated in the same commit.
132
+
133
+ ## See also
134
+
135
+ - [`docs/contracts/adr-settings-sync-engine.md`](adr-settings-sync-engine.md) — decision record for the stdlib-only engine.
136
+ - [`docs/guidelines/agent-infra/layered-settings.md`](../guidelines/agent-infra/layered-settings.md) § Sync rules — the additive-merge-with-user-line-preservation contract this subset implements.
137
+ - [`scripts/sync_yaml_rt.py`](../../scripts/sync_yaml_rt.py) — implementation; module docstring mirrors this file.
138
+ - [`scripts/sync_agent_settings.py`](../../scripts/sync_agent_settings.py) — CLI driver and exit-code contract.
@@ -0,0 +1,127 @@
1
+ # Wing-4 Handoff
2
+
3
+ Wing-4-specific prose for the four load-bearing senior-skill chains
4
+ in the Money / Strategy / Operations cluster. The mechanical contract
5
+ — initiator → delegated(input) → output-artifact, lint rules, worktree
6
+ boundary — lives in
7
+ [`docs/contracts/cross-wing-handoff.md`](../contracts/cross-wing-handoff.md).
8
+ The cross-wing routing prose (when to hand off at all, L4 / C8
9
+ boundary, decision tree) lives in
10
+ [`docs/guidelines/cross-role-handoff.md`](cross-role-handoff.md). The
11
+ Wing-3 sibling — chains inside GTM / Growth — lives in
12
+ [`docs/guidelines/gtm-handoff.md`](gtm-handoff.md). This guideline
13
+ covers **what crosses each Wing-4 boundary**, **what the typed
14
+ artifact looks like**, and **who owns the failure mode when the
15
+ chain breaks**.
16
+
17
+ Cycle / dangling / tier-mismatch enforcement is not duplicated here —
18
+ `task lint-handoffs` (per cross-wing-handoff § 4) is the mechanical
19
+ gate.
20
+
21
+ ## Chain 1 — money → strategy
22
+
23
+ Three-step chain that turns unit-economics cognition into a
24
+ build-buy-partner verdict. Finance cluster owns the first two steps;
25
+ the cluster line crosses on the handoff to Strategy.
26
+
27
+ ```
28
+ unit-economics (O1)
29
+ → scenario-modeling (O4)
30
+ → build-buy-partner (P1)
31
+ ```
32
+
33
+ | Step | Hands off when | Typed artifact crossing the boundary | Failure-mode owner |
34
+ |---|---|---|---|
35
+ | O1 → O4 | CAC / LTV / contribution-margin / payback-period cognition locked for the segment. | `unit-economics-frame.md` — CAC / LTV ratio, contribution margin, payback band, burn-multiple verdict, segment scope. | O1 owns drift: a margin frame O4 cannot stress-test = O1's unit definition was wrong scope. |
36
+ | O4 → P1 | Three-statement scenarios + sensitivity bands + optionality reasoning locked across at least two cases. | `scenario-set.md` — base / upside / downside cases, sensitivity table, decision-relevant variables, optionality cost per case. | O4 owns drift: scenarios without an optionality-cost row force P1 to re-derive build-vs-buy economics. |
37
+
38
+ P1 self-closes against `build-buy-partner.md` — insource-vs-outsource-
39
+ vs-acquire verdict, integration-cost band, dependency-risk score,
40
+ exit-cost analysis.
41
+
42
+ ## Chain 2 — strategy → people
43
+
44
+ Two-step chain that turns a build-buy-partner verdict into an
45
+ org-design shape. Strategy cluster ships the verdict; People-Strategy
46
+ cluster reads it as input and owns the structure decision.
47
+
48
+ ```
49
+ build-buy-partner (P1)
50
+ → org-design (Q1)
51
+ ```
52
+
53
+ | Step | Hands off when | Typed artifact crossing the boundary | Failure-mode owner |
54
+ |---|---|---|---|
55
+ | P1 → Q1 | Insource-vs-outsource verdict + dependency-risk profile + integration-cost band locked. | `build-buy-verdict.md` — verdict (build / buy / partner / acquire), capability scope, dependency-risk score, integration cost, exit cost, optionality preservation note. | P1 owns drift: a verdict without exit-cost reasoning leaves Q1 designing teams against an unowned constraint. |
56
+
57
+ Q1 self-closes against `org-design-shape.md` — team-shape (functional /
58
+ cross-functional / squad), span-of-control band, Conway's-law alignment
59
+ note, reorg-cost ledger.
60
+
61
+ ## Chain 3 — people → EM
62
+
63
+ Two-step chain that specializes a generalized hiring loop for
64
+ engineering. People-Strategy cluster owns the generalized cognition;
65
+ Engineering-Manager cluster owns the engineering specialization.
66
+
67
+ ```
68
+ hiring-loop-design (Q-generalized, composed inside `org-design`)
69
+ → hiring-loop-design × eng-context (S2)
70
+ ```
71
+
72
+ | Step | Hands off when | Typed artifact crossing the boundary | Failure-mode owner |
73
+ |---|---|---|---|
74
+ | Q → S2 | Generalized loop stages + calibration-design + signal-vs-noise audit locked at people-strategy level. | `hiring-loop-shape.md` — stage list, per-stage signal, calibration cadence, bar-raiser logic, signal-vs-noise findings. | Q owns drift: a generalized loop without a calibration cadence forces S2 to invent one for engineering and the cognition diverges from the rest of the org. |
75
+
76
+ S2 self-closes against `eng-hiring-loop.md` — eng-specific stage
77
+ specialization (screen → take-home / system-design / coding /
78
+ behavioral / leadership), per-stage rubric, bar-raiser assignments,
79
+ candidate-throughput target.
80
+
81
+ ## Chain 4 — finance → GTM
82
+
83
+ Cross-wing chain — the only Wing-4 chain whose endpoint sits in
84
+ Wing 3. Finance owns the **cognition**; RevOps owns the **call**.
85
+ Interface-first-stub per iter-2 OQ4: O2-interface ships before the
86
+ H10 sibling can start, parallel to O2 implementation.
87
+
88
+ ```
89
+ forecasting (O2)
90
+ → forecast-accuracy (H10, Wing 3)
91
+ ```
92
+
93
+ | Step | Hands off when | Typed artifact crossing the boundary | Failure-mode owner |
94
+ |---|---|---|---|
95
+ | O2 → H10 | `forecast-construction-shape` ADR locked: top-down vs bottom-up enum, confidence-band signature, retro-loop signature. | `forecast-band.json` — commit value, best-case value, pipeline value, confidence band, retro signature, construction-shape tag. | **Interface contract owned by O2** (per cross-wing-handoff § 5 / W4 chain): if the ADR drifts, O2 breaks the contract, not H10. Mirrors `gtm-handoff.md` Chain 2 H10 → O2 framing from the Wing-3 side. |
96
+
97
+ H10's parallel-development rule (starts after O2-interface ≥ 100 %,
98
+ runs in parallel with O2 implementation) is recorded in the
99
+ `road-to-money-strategy-ops.md` O2 entry, the
100
+ `road-to-gtm-and-growth.md` H10 entry, and the cross-wing-handoff
101
+ contract — not duplicated here.
102
+
103
+ ## Reading the failure-mode column
104
+
105
+ The column answers one question: **when a downstream skill cannot
106
+ do its job, which upstream skill rewrites its artifact?** The owner
107
+ is the **upstream** skill, not the consumer — drift is always a
108
+ producer-side fix. This mirrors the W3 sibling and the W4 / W3
109
+ forecasting chain in the contract (O2 owns the interface; H10 only
110
+ consumes it).
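The ownership rule can be written down as data. In this sketch the edge list is transcribed from the four chain tables above, while `failure_owner` and `has_cycle` are hypothetical helpers; the actual cycle, dangling, and tier checks belong to `task lint-handoffs`.

```python
# Handoff edges from the four chains above, as (upstream, downstream).
HANDOFFS = [
    ("O1", "O4"), ("O4", "P1"),  # Chain 1: money -> strategy
    ("P1", "Q1"),                # Chain 2: strategy -> people
    ("Q", "S2"),                 # Chain 3: people -> EM
    ("O2", "H10"),               # Chain 4: finance -> GTM
]


def failure_owner(upstream: str, downstream: str) -> str:
    """Drift is a producer-side fix: the upstream skill owns the failure."""
    if (upstream, downstream) not in HANDOFFS:
        raise KeyError(f"unknown handoff {upstream} -> {downstream}")
    return upstream


def has_cycle(edges: list[tuple[str, str]]) -> bool:
    """Depth-first search for a cycle in the handoff graph."""
    graph: dict[str, list[str]] = {}
    for up, down in edges:
        graph.setdefault(up, []).append(down)
    visiting: set[str] = set()
    done: set[str] = set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: the node is already on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, []))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))
```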
111
+
112
+ ## See also
113
+
114
+ - [`docs/contracts/cross-wing-handoff.md`](../contracts/cross-wing-handoff.md)
115
+ — typed-handoff mechanical contract; `task lint-handoffs` enforces
116
+ cycles, dangling references, and tier mismatches over the graph.
117
+ - [`docs/guidelines/cross-role-handoff.md`](cross-role-handoff.md)
118
+ — when to hand off at all, how to phrase the routing, L4 / C8
119
+ boundary.
120
+ - [`docs/guidelines/gtm-handoff.md`](gtm-handoff.md) — Wing-3 sibling
121
+ for the brand → channel, discovery → pipeline, and funnel →
122
+ retention chains.
123
+ - [`docs/contracts/context-spine.md`](../contracts/context-spine.md)
124
+ § Wing-4 slots — `fiscal-period`, `org-stage`, `regulatory-regime`;
125
+ every chain step opts into ≥ 1 slot or carries an ADR opt-out.
126
+ - [`docs/contracts/adr-wing4-context-spine.md`](../contracts/adr-wing4-context-spine.md)
127
+ — durable record for the Wing-4 slot extension.
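The failure-mode rule added in this file (the owner is always the upstream producer, never the consumer) can be illustrated with a minimal Python sketch. It is not code from the package: the `HandoffStep` type, the `failure_mode_owner` and `missing_fields` helpers, and the snake_case key names for the six fields of `forecast-band.json` are all hypothetical, and the numeric values are made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical key names for the six fields the table lists for
# forecast-band.json; the real schema is not shown in this diff.
FORECAST_BAND_FIELDS = (
    "commit_value",
    "best_case_value",
    "pipeline_value",
    "confidence_band",
    "retro_signature",
    "construction_shape",
)


@dataclass(frozen=True)
class HandoffStep:
    upstream: str    # producer skill, e.g. "O2"
    downstream: str  # consumer skill, e.g. "H10"
    artifact: str    # typed artifact crossing the boundary


def failure_mode_owner(step: HandoffStep) -> str:
    """Drift is a producer-side fix, so the owner is the upstream skill."""
    return step.upstream


def missing_fields(artifact: dict) -> list[str]:
    """Return the expected fields that the artifact does not carry."""
    return [name for name in FORECAST_BAND_FIELDS if name not in artifact]


step = HandoffStep(upstream="O2", downstream="H10", artifact="forecast-band.json")

# Made-up, incomplete artifact: three of the six fields are absent.
band = {"commit_value": 1.0, "best_case_value": 1.4, "pipeline_value": 2.1}

gaps = missing_fields(band)
if gaps:
    print(f"{step.artifact} is missing {gaps}; {failure_mode_owner(step)} rewrites it")
```

In this sketch the consumer (H10) only detects the gap; the rewrite is routed to O2, which matches the "O2 owns the interface; H10 only consumes it" wording above.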
@@ -1,6 +1,6 @@
1
1
  # MCP Server
2
2
 
3
- > Status: **experimental** — Phase 1 + 2 + 3 shipped. No `tools/*` primitive yet (Phase 4, deferred behind a design call).
3
+ > Status: **experimental** — Phase 1 + 2 + 3 shipped. No `tools/*` primitive yet (Phase 4, deferred behind a design call). Promotion to **beta** is gated on the six criteria in [`docs/contracts/mcp-beta-criteria.md`](contracts/mcp-beta-criteria.md); current gate status: `./agent-config doctor --check mcp-beta-readiness` (Phase 3 of `road-to-surface-discipline.md`).
4
4
 
5
5
  `agent-config` ships a built-in [Model Context Protocol](https://modelcontextprotocol.io)
6
6
  server that exposes the package's read-only governance surface to MCP-aware
@@ -0,0 +1,102 @@
1
+ # README three-audience split — plan
2
+
3
+ Annotated outline for `P2.2a` in
4
+ [`road-to-proof-not-features.md`](../agents/roadmaps/road-to-proof-not-features.md).
5
+ Decides the **information architecture**, not the prose. No content
6
+ rewrite happens in this step; `P2.2b` applies the mapping.
7
+
8
+ ## Target headings (top of README, in order)
9
+
10
+ 1. **Use it in your project** — anchor `#use-it`
11
+ 2. **Prove it** — anchor `#prove-it`
12
+ 3. **Contribute** — anchor `#contribute`
13
+
14
+ Each branch opens with one paragraph + one primary CTA. AI Council is
15
+ not mentioned in any branch (verified by `P3.4`).
16
+
17
+ ### Anchor-stability promise
18
+
19
+ `P2.2b` must keep these existing anchors intact so external inbound
20
+ links survive:
21
+
22
+ | Anchor today | Lives under (new) | Why |
23
+ |---|---|---|
24
+ | `#quickstart` | `#use-it` | npm/composer search results, social links |
25
+ | `#supported-tools` | `#use-it` | most-cited section on the web |
26
+ | `#what-your-agent-is-asked-to-do` | `#prove-it` | linked from blog posts |
27
+ | `#documentation` | `#use-it` | docs portal entry |
28
+ | `#development` | `#contribute` | contributor guides |
29
+
30
+ Other section anchors may be renamed; `lint-readme` checks the table
31
+ above and the three new audience anchors only.
32
+
33
+ ## Block-by-block mapping
34
+
35
+ Every existing top-of-README block, in source order, mapped to
36
+ exactly one branch. "Drop" = block is retired; "Move" = relocated as-
37
+ is; "Reframe" = block stays but its lead-in / CTA changes (still no
38
+ copy rewrite in this step — the reframe direction is decided here,
39
+ applied in `P2.2b`).
40
+
41
+ | # | Block (current heading) | Lines | Branch | Action | Notes |
42
+ |---|---|---|---|---|---|
43
+ | 1 | Title + tagline + stats badge | 1–13 | — | Keep above branches | Survives unchanged; counts updated by `update_readme_counts`. |
44
+ | 2 | `## Start here` (three-paths table) | 15–25 | — | **Drop** | Replaced by the three branch sections themselves; rows map cleanly: `/onboard` → Use, `task ci` → Contribute, `task generate-tools` → Use. |
45
+ | 3 | `## Quickstart` lead-in | 27–39 | Use it | Move | Becomes the opening paragraph under `#use-it`. |
46
+ | 4 | `### For teams (recommended)` | 40–79 | Use it | Move | Primary CTA for `#use-it`. |
47
+ | 5 | `### Pick specific AIs` | 81–101 | Use it | Move | Stays under Quickstart subtree. |
48
+ | 6 | `#### Global install` | 103–124 | Use it | Move | Subsection of Pick specific AIs. |
49
+ | 7 | `### For individual use (optional)` | 126–144 | Use it | Move | Alternate install path. |
50
+ | 8 | `### Self-hosted MCP on Cloudflare` | 146–226 | Use it | Move | Operator install path; deep but consumer-facing. |
51
+ | 9 | `#### Lock your Worker behind Bearer` | 196–213 | Use it | Move | Subsection of MCP block; stays nested. |
52
+ | 10 | `### Optional: persistent agent memory` | 228–247 | Use it | Move | Companion package install. |
53
+ | 11 | `## 2-minute demo: /implement-ticket` | 251–285 | Prove it | Move | Flagship evidence surface. Primary CTA for `#prove-it`. |
54
+ | 12 | `### Sibling entrypoint: /work` | 287–316 | Prove it | Move | Same engine, second envelope. |
55
+ | 13 | `### Product UI track` | 318–347 | Prove it | Move | Third evidence surface. |
56
+ | 14 | `## What your agent is asked to do` | 351–365 | Prove it | Move | Intent table — proof of behaviour, not features. |
57
+ | 15 | `## What this package is — and what it isn't` | 369–398 | Prove it | Move | Scope-honesty surface; load-bearing for the "proof" framing. |
58
+ | 16 | `## You don't need everything` (cost profiles) | 402–423 | Prove it | Reframe | Currently sits as "feature" prose; the new framing is "proof that the package shrinks to fit". |
59
+ | 17 | `## Who this is for` (stack coverage) | 427–439 | Prove it | Move | Honest depth claim — also evidence-side. |
60
+ | 18 | `## Featured Skills` | 443–462 | Use it | Move | Catalog teaser → consumer surface. |
61
+ | 19 | `## Featured Commands` | 466–481 | Use it | Move | Catalog teaser → consumer surface. |
62
+ | 20 | `## Supported Tools / Project-installed` | 487–527 | Use it | Move | Per-tool install matrix. |
63
+ | 21 | `## Supported Tools / Plugin-installed` | 529–541 | Use it | Move | Subsection. |
64
+ | 22 | `## Supported Tools / Cloud / Hosted-agent` | 543–558 | Use it | Move | Subsection. |
65
+ | 23 | `## Core Principles` | 562–570 | Prove it | Move | Behavioural floor — proof-side. |
66
+ | 24 | `## Documentation` (index table) | 574–589 | Use it | Move | Doc portal entry. |
67
+ | 25 | `### Maintainer telemetry (opt-in)` | 591–608 | Contribute | Move | Engagement measurement — maintainer / contributor surface. |
68
+ | 26 | `### Context-aware command suggestion` | 610–629 | Use it | Move | Consumer-facing feature toggle. |
69
+ | 27 | `## Development` | 633–642 | Contribute | Move | Primary CTA for `#contribute`. |
70
+ | 28 | `## Requirements` | 644–649 | Use it | Move | Install gate — Use-side, not Contribute. |
71
+ | 29 | `## License` | 651–653 | — | Keep at bottom | Footer; outside the three branches. |
72
+
73
+ ## Branch outlines (post-migration shape)
74
+
75
+ ### `## Use it in your project`
76
+
77
+ Opening paragraph: one-line "Two minutes from npx to a better-behaved
78
+ agent." Primary CTA: `npx @event4u/agent-config init`. Children:
79
+ Quickstart subtree (#3–#7), MCP operator path (#8–#9), optional memory
80
+ (#10), Featured Skills + Commands (#18–#19), Supported Tools (#20–#22),
81
+ Documentation (#24), Command suggestion (#26), Requirements (#28).
82
+
83
+ ### `## Prove it`
84
+
85
+ Opening paragraph: one-line "What the agent actually does, with
86
+ evidence." Primary CTA: `/implement-ticket` demo (#11). Children:
87
+ `/work` (#12), Product UI track (#13), Intent table (#14), Scope
88
+ statement (#15), Cost profiles reframed (#16), Stack coverage (#17),
89
+ Core Principles (#23).
90
+
91
+ ### `## Contribute`
92
+
93
+ Opening paragraph: one-line "Editing rules, skills, commands — the
94
+ contributor loop." Primary CTA: `task ci` (#27). Children: Maintainer
95
+ telemetry (#25). External links: `CONTRIBUTING.md`, `AGENTS.md`,
96
+ `docs/development.md`.
97
+
98
+ ## Verification (P2.2c preview)
99
+
100
+ Grep-based test asserts `## Use it in your project`, `## Prove it`,
101
+ `## Contribute` appear in that order. `lint-readme` keeps anchor
102
+ stability for the rows in the Anchor-stability promise table.
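The heading-order assertion described in the Verification section could look roughly like the Python sketch below. This is an assumption about its shape, not the package's actual `P2.2c` test: the `headings_in_order` helper is hypothetical, and it matches whole lines only, so a heading repeated inside a fenced code block in the README would be counted as well.

```python
EXPECTED_HEADINGS = [
    "## Use it in your project",
    "## Prove it",
    "## Contribute",
]


def headings_in_order(readme_text: str, expected=EXPECTED_HEADINGS) -> bool:
    """True if every expected heading is present and they appear in order."""
    lines = [line.rstrip() for line in readme_text.splitlines()]
    positions = []
    for heading in expected:
        if heading not in lines:
            return False  # a missing heading fails the check outright
        positions.append(lines.index(heading))  # first occurrence only
    return positions == sorted(positions)


sample = "\n".join(
    [
        "# agent-config",
        "",
        "## Use it in your project",
        "Install instructions...",
        "## Prove it",
        "Demo...",
        "## Contribute",
        "Contributor loop...",
    ]
)
print(headings_in_order(sample))
```

A real test would read `README.md` from disk and pass its text to the helper; the inline sample keeps the sketch self-contained.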
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@event4u/agent-config",
3
- "version": "2.8.0",
3
+ "version": "2.10.0",
4
4
  "description": "Shared agent configuration \u2014 skills, rules, commands, guidelines, and templates for AI coding tools",
5
5
  "license": "MIT",
6
6
  "private": false,