@event4u/agent-config 2.19.0 → 2.20.0

Files changed (92)
  1. package/.agent-src/commands/agent-status.md +29 -0
  2. package/.agent-src/commands/onboard.md +221 -81
  3. package/.agent-src/packs/README.md +49 -0
  4. package/.agent-src/packs/agency-delivery.yml +63 -0
  5. package/.agent-src/packs/content-engine.yml +53 -0
  6. package/.agent-src/packs/founder-mvp.yml +51 -0
  7. package/.agent-src/presets/README.md +26 -0
  8. package/.agent-src/presets/balanced.yml +34 -0
  9. package/.agent-src/presets/fast.yml +31 -0
  10. package/.agent-src/presets/strict.yml +38 -0
  11. package/.agent-src/profiles/README.md +29 -0
  12. package/.agent-src/profiles/agency.yml +27 -0
  13. package/.agent-src/profiles/content_creator.yml +25 -0
  14. package/.agent-src/profiles/developer.yml +26 -0
  15. package/.agent-src/profiles/finance.yml +24 -0
  16. package/.agent-src/profiles/founder.yml +25 -0
  17. package/.agent-src/profiles/ops.yml +25 -0
  18. package/.agent-src/rules/no-cheap-questions.md +25 -17
  19. package/.agent-src/skills/adr-create/SKILL.md +78 -68
  20. package/.agent-src/skills/subagent-orchestration/SKILL.md +33 -0
  21. package/.agent-src/templates/agents/agent-project-settings.example.yml +1 -1
  22. package/.agent-src/templates/skill-archive-note.md +101 -0
  23. package/.claude-plugin/marketplace.json +1 -1
  24. package/CHANGELOG.md +52 -30
  25. package/README.md +68 -72
  26. package/config/agent-settings.template.yml +22 -0
  27. package/docs/adrs/caveman/0001-default-off-until-bench.md +93 -0
  28. package/docs/adrs/caveman/README.md +9 -0
  29. package/docs/adrs/cost/0001-hard-stop-hook.md +114 -0
  30. package/docs/adrs/cost/README.md +9 -0
  31. package/docs/adrs/memory/0001-consumer-side-snapshot.md +111 -0
  32. package/docs/adrs/memory/README.md +9 -0
  33. package/docs/adrs/router/0001-three-tier-routing.md +119 -0
  34. package/docs/adrs/router/README.md +9 -0
  35. package/docs/adrs/schema/0001-json-schema-frontmatter.md +102 -0
  36. package/docs/adrs/schema/README.md +9 -0
  37. package/docs/adrs/smoke/0001-per-tier-smoke-scripts.md +99 -0
  38. package/docs/adrs/smoke/README.md +9 -0
  39. package/docs/architecture/current-onboard-baseline.md +126 -0
  40. package/docs/architecture/current-safety-behavior.md +137 -0
  41. package/docs/archive/CHANGELOG-pre-2.16.0.md +48 -0
  42. package/docs/contracts/adr-layout.md +108 -0
  43. package/docs/contracts/benchmark-corpus-spec.md +97 -0
  44. package/docs/contracts/benchmark-report-schema.md +111 -0
  45. package/docs/contracts/command-clusters.md +1 -0
  46. package/docs/contracts/command-taxonomy.md +137 -0
  47. package/docs/contracts/compression-default-kill-criterion.md +69 -0
  48. package/docs/contracts/config-presets.md +144 -0
  49. package/docs/contracts/cost-dashboard.md +143 -0
  50. package/docs/contracts/cost-enforcement.md +134 -0
  51. package/docs/contracts/file-ownership-matrix.json +0 -7
  52. package/docs/contracts/mcp-tool-inventory.md +53 -0
  53. package/docs/contracts/measurement-baseline.md +102 -0
  54. package/docs/contracts/namespace.md +125 -0
  55. package/docs/contracts/profile-system.md +142 -0
  56. package/docs/contracts/safety-model.md +129 -0
  57. package/docs/contracts/smoke-contracts.md +144 -0
  58. package/docs/contracts/workflow-packs.md +121 -0
  59. package/docs/decisions/ADR-010-profile-pack-preset-boundary.md +132 -0
  60. package/docs/decisions/INDEX.md +1 -0
  61. package/docs/featured-commands.md +27 -0
  62. package/docs/parity/bench-ruflo.json +58 -0
  63. package/docs/parity/bench.json +41 -0
  64. package/docs/parity/ruflo.md +46 -0
  65. package/docs/profiles.md +91 -0
  66. package/package.json +1 -1
  67. package/scripts/_cli/cmd_explain.py +250 -0
  68. package/scripts/_lib/bench_cost.py +138 -0
  69. package/scripts/_lib/bench_quality.py +118 -0
  70. package/scripts/_lib/bench_report.py +150 -0
  71. package/scripts/agent-config +13 -0
  72. package/scripts/audit_adr_coverage.py +175 -0
  73. package/scripts/audit_mcp_tools.py +146 -0
  74. package/scripts/bench_baseline_ready.py +108 -0
  75. package/scripts/bench_drift_check.py +151 -0
  76. package/scripts/bench_per_tool.py +216 -0
  77. package/scripts/bench_run.py +155 -0
  78. package/scripts/config/__init__.py +9 -0
  79. package/scripts/config/presets.py +206 -0
  80. package/scripts/config/profiles.py +173 -0
  81. package/scripts/cost/budget.mjs +73 -12
  82. package/scripts/cost/preflight.mjs +89 -0
  83. package/scripts/lint_archived_skills.py +143 -0
  84. package/scripts/lint_bench_corpus.py +161 -0
  85. package/scripts/lint_namespace.py +135 -0
  86. package/scripts/skill_overlap.py +204 -0
  87. package/scripts/skill_usage_collect.py +191 -0
  88. package/scripts/skill_usage_report.py +162 -0
  89. package/scripts/smoke/kernel.sh +101 -0
  90. package/scripts/smoke/router.sh +129 -0
  91. package/scripts/smoke/schema.sh +71 -0
  92. package/scripts/smoke/skills.sh +101 -0
@@ -0,0 +1,53 @@
1
+ ---
2
+ stability: beta
3
+ keep-beta-until: 2026-08-14
4
+ ---
5
+
6
+ # MCP tool inventory
7
+
8
+ > Generated by [`scripts/audit_mcp_tools.py`](../../scripts/audit_mcp_tools.py)
9
+ > from the source-of-truth catalog
10
+ > [`scripts/mcp_server/consumer_tool_catalog.json`](../../scripts/mcp_server/consumer_tool_catalog.json).
11
+ > Do **not** hand-edit; rerun `python3 scripts/audit_mcp_tools.py --write`.
12
+ >
13
+ > Step-11 Phase 5 Step 3 (`step-11-ruflo-parity.md`).
14
+
15
+ ## Summary
16
+
17
+ - **Total tools:** 20
18
+ - **By transport:** stdio=9
19
+ - **By side-effect:** fs-write=5, ro=12, shell=3
20
+ - **Discovery-only stubs (no implementation):** 11
21
+
22
+ ## Tools
23
+
24
+ | Tool | Side-effect | Transports | Catalog | Handler |
25
+ |---|---|---|---|---|
26
+ | `lint_skills` | `ro` | stdio | [`consumer_tool_catalog.json:7`](../../scripts/mcp_server/consumer_tool_catalog.json#L7) | [`tools.py:510`](../../scripts/mcp_server/tools.py#L510) |
27
+ | `chat_history_append` | `fs-write` | stdio | [`consumer_tool_catalog.json:24`](../../scripts/mcp_server/consumer_tool_catalog.json#L24) | [`tools.py:535`](../../scripts/mcp_server/tools.py#L535) |
28
+ | `chat_history_read` | `ro` | stdio | [`consumer_tool_catalog.json:43`](../../scripts/mcp_server/consumer_tool_catalog.json#L43) | [`tools.py:571`](../../scripts/mcp_server/tools.py#L571) |
29
+ | `memory_lookup` | `ro` | stdio | [`consumer_tool_catalog.json:59`](../../scripts/mcp_server/consumer_tool_catalog.json#L59) | [`tools.py:590`](../../scripts/mcp_server/tools.py#L590) |
30
+ | `memory_signal` | `fs-write` | _(stub)_ | [`consumer_tool_catalog.json:75`](../../scripts/mcp_server/consumer_tool_catalog.json#L75) | _stub-only_ |
31
+ | `memory_status` | `ro` | stdio | [`consumer_tool_catalog.json:91`](../../scripts/mcp_server/consumer_tool_catalog.json#L91) | [`tools.py:617`](../../scripts/mcp_server/tools.py#L617) |
32
+ | `skill_trigger_eval` | `ro` | _(stub)_ | [`consumer_tool_catalog.json:98`](../../scripts/mcp_server/consumer_tool_catalog.json#L98) | _stub-only_ |
33
+ | `suggest_command` | `ro` | _(stub)_ | [`consumer_tool_catalog.json:114`](../../scripts/mcp_server/consumer_tool_catalog.json#L114) | _stub-only_ |
34
+ | `suggest_skill_for_task` | `ro` | _(stub)_ | [`consumer_tool_catalog.json:129`](../../scripts/mcp_server/consumer_tool_catalog.json#L129) | _stub-only_ |
35
+ | `mine_session` | `ro` | _(stub)_ | [`consumer_tool_catalog.json:144`](../../scripts/mcp_server/consumer_tool_catalog.json#L144) | _stub-only_ |
36
+ | `update_form_request_messages` | `fs-write` | _(stub)_ | [`consumer_tool_catalog.json:158`](../../scripts/mcp_server/consumer_tool_catalog.json#L158) | _stub-only_ |
37
+ | `sync_gitignore` | `fs-write` | _(stub)_ | [`consumer_tool_catalog.json:173`](../../scripts/mcp_server/consumer_tool_catalog.json#L173) | _stub-only_ |
38
+ | `sync_agent_settings` | `fs-write` | _(stub)_ | [`consumer_tool_catalog.json:186`](../../scripts/mcp_server/consumer_tool_catalog.json#L186) | _stub-only_ |
39
+ | `run_tests` | `shell` | _(stub)_ | [`consumer_tool_catalog.json:200`](../../scripts/mcp_server/consumer_tool_catalog.json#L200) | _stub-only_ |
40
+ | `run_quality_checks` | `shell` | _(stub)_ | [`consumer_tool_catalog.json:214`](../../scripts/mcp_server/consumer_tool_catalog.json#L214) | _stub-only_ |
41
+ | `list_skills` | `ro` | stdio | [`consumer_tool_catalog.json:227`](../../scripts/mcp_server/consumer_tool_catalog.json#L227) | [`tools.py:631`](../../scripts/mcp_server/tools.py#L631) |
42
+ | `list_commands` | `ro` | stdio | [`consumer_tool_catalog.json:234`](../../scripts/mcp_server/consumer_tool_catalog.json#L234) | [`tools.py:644`](../../scripts/mcp_server/tools.py#L644) |
43
+ | `list_rules` | `ro` | stdio | [`consumer_tool_catalog.json:241`](../../scripts/mcp_server/consumer_tool_catalog.json#L241) | [`tools.py:657`](../../scripts/mcp_server/tools.py#L657) |
44
+ | `compile_router` | `shell` | _(stub)_ | [`consumer_tool_catalog.json:248`](../../scripts/mcp_server/consumer_tool_catalog.json#L248) | _stub-only_ |
45
+ | `read_resource_body` | `ro` | stdio | [`consumer_tool_catalog.json:261`](../../scripts/mcp_server/consumer_tool_catalog.json#L261) | [`tools.py:670`](../../scripts/mcp_server/tools.py#L670) |
46
+
47
+ ## Glossary
48
+
49
+ - **Side-effect** — `ro` (read-only) · `fs-write` (filesystem write) · `shell` (spawns processes).
50
+ - **Transports** — `stdio` (`scripts/mcp_server/`) · `worker` (`workers/mcp/`). A tool may live on both.
51
+ - **Stub** — catalog-listed for discovery; returns the `not_implemented` envelope from
52
+ [`mcp-tool-stub-envelope.md`](mcp-tool-stub-envelope.md) until promoted.
53
+
@@ -0,0 +1,102 @@
1
+ ---
2
+ stability: stable
3
+ ---
4
+
5
+ # Measurement baseline — contract
6
+
7
+ > **Status:** locked 2026-05-16 · **Owner:** `step-4-measurement-and-benchmark.md`
8
+ > · **Cited by:** every P2 enforcement roadmap (skill rationalization G0, north-star G1, compression default decision).
9
+
10
+ Single source of truth for what `task bench` measures, what counts as
11
+ drift, and what unblocks enforcement. Read this before pinning a number
12
+ to a roadmap or PR description.
13
+
14
+ ## What `task bench` measures
15
+
16
+ Four axes, all numeric, all reproducible from the same input:
17
+
18
+ | Axis | Source | Definition | Units |
19
+ |---|---|---|---:|
20
+ | **selection accuracy** | [`scripts/bench_runner.py`](../../scripts/bench_runner.py) | Keyword-overlap ranker hits the expected skill in top-K | % |
21
+ | **cost** | [`scripts/cost/track.mjs`](../../scripts/cost/track.mjs) session jsonl | Token+USD per model, captured live | USD |
22
+ | **quality** | regex / rubric assertions per prompt | `quality_assertion` matches in agent output | % |
23
+ | **projection fidelity** | [`scripts/bench_per_tool.py`](../../scripts/bench_per_tool.py) | `accuracy(tool) / accuracy(augment)` for skill-projecting tools | ratio |
24
+
25
+ Schemas: [`benchmark-report-schema.md`](benchmark-report-schema.md) ·
26
+ [`benchmark-corpus-spec.md`](benchmark-corpus-spec.md). Reports land at
27
+ `bench/reports/<utc-stamp>-<corpus>[-projection].{json,md}` —
28
+ timestamped, never overwritten, content-addressed by run.
29
+
30
+ ## Corpora — frozen for the soak window
31
+
32
+ | Corpus | Path | Prompts | Purpose |
33
+ |---|---|---:|---|
34
+ | `dev` | [`tests/eval/corpus-dev.yaml`](../../tests/eval/corpus-dev.yaml) | 10 | Developer task surface (Laravel/Symfony/React/CI/PR) |
35
+ | `non-dev` | [`tests/eval/corpus-non-dev.yaml`](../../tests/eval/corpus-non-dev.yaml) | 16 | Founder / agency / content creator surface (Wing-4) |
36
+
37
+ Total 26 prompts ≥ Acceptance Criteria floor of 25. Mid-window edits
38
+ to either YAML restart the 60-day clock per
39
+ [`compression-default-kill-criterion.md`](compression-default-kill-criterion.md) § 2.
40
+
41
+ ## What counts as drift
42
+
43
+ [`scripts/bench_drift_check.py`](../../scripts/bench_drift_check.py)
44
+ compares the latest report against a sliding window of the prior N runs
45
+ (default 5) for the same corpus.
46
+
47
+ | Axis | Threshold | Note |
48
+ |---|---|---|
49
+ | selection accuracy | latest − baseline_mean ≤ −5 pp | always evaluated |
50
+ | cost | (latest − baseline_mean) / baseline_mean ≥ +20 % | only when both sides have `source: captured` |
51
+ | quality | latest − baseline_mean ≤ −10 pp | skipped when latest is `not_collected` |
52
+ | projection fidelity | tool fidelity < 0.85 | exit 1 from `task bench:projection` |
53
+
54
+ Drift exits with code 2 from `task bench:drift`. **CI posture during
55
+ soak:** all bench-drift steps `continue-on-error: true` and post a
56
+ sticky PR comment — informational only, not a merge gate. Flip to
57
+ required check happens via a separate PR once
58
+ `task bench:baseline-ready` returns 0 (see below).
59
+
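The three windowed thresholds in the drift table can be sketched as a small Python function. This is only an illustration, not the contents of `scripts/bench_drift_check.py`: the report keys (`accuracy_pct`, `cost_usd`, `cost_source`, `quality_pct`) and both function names are hypothetical, the cost threshold is read as a relative increase of 20 % or more, and "both sides" is taken to mean that uncaptured baseline runs are ignored. Projection fidelity is left out because it is checked by a separate task with its own exit code.

```python
from statistics import mean

ACCURACY_DROP_PP = 5.0   # selection accuracy: drop of 5 pp or more
COST_INCREASE = 0.20     # cost: relative increase of 20 % or more
QUALITY_DROP_PP = 10.0   # quality: drop of 10 pp or more


def drifting_axes(latest, prior_runs, window=5):
    """Return the axes on which `latest` drifts from the mean of the prior runs."""
    baseline = prior_runs[-window:]
    if not baseline:
        return []
    drifted = []

    # Selection accuracy is always evaluated.
    acc_mean = mean(r["accuracy_pct"] for r in baseline)
    if latest["accuracy_pct"] - acc_mean <= -ACCURACY_DROP_PP:
        drifted.append("selection accuracy")

    # Cost is compared only when both sides were captured live.
    captured = [r for r in baseline if r.get("cost_source") == "captured"]
    if latest.get("cost_source") == "captured" and captured:
        cost_mean = mean(r["cost_usd"] for r in captured)
        if cost_mean > 0 and (latest["cost_usd"] - cost_mean) / cost_mean >= COST_INCREASE:
            drifted.append("cost")

    # Quality is skipped when the latest run did not collect it.
    if latest.get("quality_pct") is not None:
        scored = [r["quality_pct"] for r in baseline if r.get("quality_pct") is not None]
        if scored and latest["quality_pct"] - mean(scored) <= -QUALITY_DROP_PP:
            drifted.append("quality")

    return drifted


def exit_code(latest, prior_runs):
    """Mirror the documented convention: 2 on drift, 0 otherwise."""
    return 2 if drifting_axes(latest, prior_runs) else 0
```

Returning the list of drifting axes, rather than a bare boolean, would let a sticky PR comment name the axis that moved.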
60
+ ## What unblocks enforcement (the G1 gate)
61
+
62
+ ```
63
+ TASK bench:baseline-ready EXIT 0 IS THE ONLY AUTHORITY.
64
+ NO ANECDOTE, NO INDIVIDUAL REPORT, NO ROADMAP-SIDE OVERRIDE.
65
+ ```
66
+
67
+ [`scripts/bench_baseline_ready.py`](../../scripts/bench_baseline_ready.py)
68
+ returns exit 0 iff both:
69
+
70
+ 1. **Wall-clock soak:** `today − bench/baseline-start.txt ≥ --min-days` (default 60)
71
+ 2. **Report density:** `bench/reports/*-<corpus>.json` count ≥ `--min-reports` (default 30)
72
+
73
+ Soak start anchored at [`bench/baseline-start.txt`](../../bench/baseline-start.txt)
74
+ = **2026-05-16**. Earliest possible flip: **2026-07-15**, contingent
75
+ on the 30-report floor.
76
+
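A minimal Python sketch of the two-condition gate, assuming the start date and the report count have already been read from disk. The function name and signature are hypothetical and are not taken from `scripts/bench_baseline_ready.py`. The date arithmetic reproduces the earliest-flip date quoted above.

```python
from datetime import date, timedelta


def baseline_ready(start, today, report_count, min_days=60, min_reports=30):
    """True only when both the soak window and the report floor are met."""
    soaked = (today - start).days >= min_days
    dense = report_count >= min_reports
    return soaked and dense


SOAK_START = date(2026, 5, 16)
# 15 days left in May + 30 in June + 15 in July = 60 days
EARLIEST_FLIP = SOAK_START + timedelta(days=60)
```

With these defaults, `baseline_ready(SOAK_START, EARLIEST_FLIP, 30)` is the first combination that passes; one day earlier or one report fewer fails.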
77
+ Downstream consumers:
78
+
79
+ - `step-99-north-star-restructure.md` § Acceptance G1 — reads this exit code.
80
+ - [`compression-default-kill-criterion.md` § 3](compression-default-kill-criterion.md) — reads the decision table after baseline closes.
81
+ - `step-2-skill-inventory-rationalization.md` § G0 — usage-data soak floor.
82
+
83
+ ## What the closeout writes
84
+
85
+ On baseline closure, the step-4 closeout writes the numeric verdict to
86
+ [`docs/parity/bench.json`](../parity/bench.json) — frozen snapshot with
87
+ the 30+ reports averaged, drift verdict, and the compression-default
88
+ decision per the kill-criterion table. That file is the artefact every
89
+ P2 roadmap reads — not the live `bench/reports/` directory.
90
+
91
+ ## Carve-outs
92
+
93
+ - **Pricing freshness:** [`bench/pricing.yaml`](../../bench/pricing.yaml) rows must carry `sourced_on: YYYY-MM-DD`. Stale prices = stale numbers = no trust (ruflo "measured-vs-claimed" pattern).
94
+ - **Subjective grading excluded:** quality scoring is mechanical via `quality_assertion`. No vibes.
95
+ - **Cursor / Cline / Windsurf:** rules-only surfaces, no SKILL.md projection. `bench:projection` reports them as `not_applicable` — the gap is acknowledged, not silently dropped.
96
+
97
+ ## Cross-references
98
+
99
+ - [`benchmark-report-schema.md`](benchmark-report-schema.md) · per-report JSON schema
100
+ - [`benchmark-corpus-spec.md`](benchmark-corpus-spec.md) · corpus YAML schema
101
+ - [`compression-default-kill-criterion.md`](compression-default-kill-criterion.md) · decision table read by step-4 closeout
102
+ - `step-4-measurement-and-benchmark.md` · the owning roadmap
@@ -0,0 +1,125 @@
1
+ ---
2
+ stability: stable
3
+ ---
4
+
5
+ # Namespace contract — skills, rules, commands, personas
6
+
7
+ > Every artefact name is a **stable identifier**: routed to from
8
+ > `router.json`, cited from skills, surfaced in `/help`, embedded in
9
+ > command paths, and back-referenced in test fixtures. Drift breaks
10
+ > all five surfaces silently.
11
+ >
12
+ > **Source:** Step-11 Phase 5 Step 1
13
+ > (`step-11-ruflo-parity.md`).
14
+ > **Enforcer:** [`scripts/lint_namespace.py`](../../scripts/lint_namespace.py),
15
+ > wired into `task lint-skills`.
16
+
17
+ ## 1. Shape
18
+
19
+ ```
20
+ <stem>-<intent> kebab-case, ASCII, lowercase
21
+ ```
22
+
23
+ | Component | Rule |
24
+ |---|---|
25
+ | Charset | `[a-z0-9-]+` only |
26
+ | Separator | single `-` between tokens; never `_`, `.`, or camelCase |
27
+ | Length | skills: 3 ≤ name ≤ 64 · rules / commands / personas: 2 ≤ name ≤ 64 (two-letter slot reserved for intentional acronyms — `pr`, `ci`, `qa`, `me`) |
28
+ | First char | `[a-z]` (digits and `-` forbidden at start) |
29
+ | Last char | `[a-z0-9]` (trailing `-` forbidden) |
30
+ | Run | no consecutive `--` |
31
+
32
+ The `<stem>` carries the **subject** (`commit`, `eloquent`,
33
+ `livewire`); the `<intent>` (optional) carries the **verb / lens**
34
+ (`-writing`, `-architect`, `-routing`). Single-token names are
35
+ permitted when the stem already encodes both (`commit`, `eloquent`,
36
+ `docker`).
37
+
38
+ ## 2. Reserved names — forbidden as artefact names
39
+
40
+ | Name | Reason |
41
+ |---|---|
42
+ | `pattern` | Reserved for trigger-pattern fixtures (see `tests/fixtures/triggers/`). |
43
+ | `claude-memories` | Reserved for the `~/.claude/CLAUDE.md` shape — host-agent state, not a package artefact. |
44
+ | `default` | Ambiguous with profile / mode defaults; collides with `.agent-settings.yml` keys. |
45
+ | `index` | Reserved for auto-generated INDEX.md files. |
46
+ | `router` | Reserved for `router.json` and the router contract. |
47
+
48
+ Reserved names apply at the **top level** of each artefact type. A
49
+ sub-verb under a namespaced group (e.g. `council/default.md` →
50
+ `/council:default`) is **not** a top-level identifier — the group
51
+ prefix disambiguates it, and reserved-name enforcement is skipped
52
+ for sub-verbs by the linter. A future artefact `pattern-foo` at the
53
+ top level is fine; bare `pattern` is not.
54
+
55
+ `README.md` and `INDEX.md` are documentation, not artefacts, and are
56
+ skipped by the linter.
57
+
58
+ ## 3. Per-type conventions
59
+
60
+ | Type | Source path | Naming nuance |
61
+ |---|---|---|
62
+ | Skill | `.agent-src.uncompressed/skills/<name>/SKILL.md` | Directory name == frontmatter `name`. |
63
+ | Rule | `.agent-src.uncompressed/rules/<name>.md` | Filename stem == frontmatter `id` (when present). |
64
+ | Command | `.agent-src.uncompressed/commands/<name>.md` or `<group>/<verb>.md` | Slash-command invocation `<name>` or `<group>:<verb>`. |
65
+ | Persona | `.agent-src.uncompressed/personas/<name>.md` | Cited from skill frontmatter `personas:` list. |
66
+
67
+ Sub-namespacing (`commit/in-chunks.md` → `/commit:in-chunks`) uses
68
+ the same charset rules per segment; the joining colon is implicit.
69
+
70
+ ## 4. Linter — `scripts/lint_namespace.py`
71
+
72
+ Walks the four source roots above, asserts each artefact name:
73
+
74
+ 1. Matches the regex `^[a-z][a-z0-9]*(-[a-z0-9]+)*$`.
75
+ 2. Length within the § 1 bounds: 3 ≤ name ≤ 64 for skills, 2 ≤ name ≤ 64 for rules / commands / personas.
76
+ 3. Not in the reserved-names list.
77
+ 4. Skill: directory name matches frontmatter `name`.
78
+
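The first three checks can be sketched in a few lines of Python. This is an illustrative sketch, not the source of `scripts/lint_namespace.py`: `check_name` and the rule labels it returns are hypothetical, the per-type minimum lengths are taken from the Length row of the § 1 table, and the frontmatter comparison (check 4) is omitted because it needs file access.

```python
import re

NAME_RE = re.compile(r"^[a-z][a-z0-9]*(-[a-z0-9]+)*$")
RESERVED = {"pattern", "claude-memories", "default", "index", "router"}
MIN_LENGTH = {"skill": 3, "rule": 2, "command": 2, "persona": 2}
MAX_LENGTH = 64


def check_name(name, kind):
    """Return the list of violated rules for a top-level artefact name."""
    problems = []
    if not NAME_RE.fullmatch(name):
        problems.append("regex")      # charset, separators, first / last char
    if not MIN_LENGTH[kind] <= len(name) <= MAX_LENGTH:
        problems.append("length")
    if name in RESERVED:
        problems.append("reserved")   # applies to top-level names only
    return problems
```

For example, `check_name("pr", "command")` returns an empty list while `check_name("pr", "skill")` reports `length`, which matches the two-letter acronym carve-out in § 1.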
79
+ Exit codes:
80
+
81
+ | Exit | Meaning |
82
+ |---|---|
83
+ | `0` | All names valid. |
84
+ | `1` | At least one name fails a rule. |
85
+ | `2` | Linter crashed (filesystem error, malformed frontmatter). |
86
+
87
+ Diagnostic format: one issue per line — `<path>: <rule> — <detail>`.
88
+
89
+ ## 5. Adding a new artefact
90
+
91
+ Pick the name; verify locally:
92
+
93
+ ```bash
94
+ python3 scripts/lint_namespace.py --name <candidate>
95
+ # or full run:
96
+ python3 scripts/lint_namespace.py
97
+ ```
98
+
99
+ If the candidate fails, the linter prints the rule it violated.
100
+ **Renames after release are expensive** — touch router.json, every
101
+ skill citing the old name, the bench corpus, and consumer settings.
102
+ Pay the naming cost once, upfront.
103
+
104
+ ## 6. Relationship to the frontmatter contract
105
+
106
+ The **shape** lives here. The **frontmatter keys** that carry the
107
+ name (`name:` in skills, `id:` in rules) live in
108
+ [`frontmatter-contract.md`](../../agents/docs/frontmatter-contract.md).
109
+ Both contracts share the regex; this file is the source of truth for
110
+ the regex string.
111
+
112
+ ## 7. Why this exists
113
+
114
+ `router.json` resolves `<kind>:<id>` strings at session start. Any
115
+ artefact rename breaks every routing entry pointing at the old name
116
+ without compile-time error. The linter catches the rename at the PR
117
+ boundary, not at runtime in a consumer.
118
+
119
+ ## 8. Out of scope
120
+
121
+ - File-system case sensitivity (we rely on lowercase-only names).
122
+ - Cross-tool aliases (Augment / Claude / Cursor all consume the same
123
+ name — projection is by content, not by alias).
124
+ - Versioning suffixes (`-v2`, `-legacy`). Use `status: superseded`
125
+ in frontmatter instead; never rename in place.
@@ -0,0 +1,142 @@
1
+ ---
2
+ stability: beta
3
+ keep-beta-until: 2026-08-14
4
+ ---
5
+
6
+ # Profile System — Contract
7
+
8
+ > **Status:** beta · **Owner:** package maintainer · **Last reviewed:** 2026-05-16
9
+ >
10
+ > Schema and semantics for the **Profile** axis introduced in step-15
11
+ > Phase 1 item 1. Profile answers *who is the user?* — audience
12
+ > taxonomy that selects the default skill/command surface, README
13
+ > entry-paragraph, and persona pre-selection. Boundary against
14
+ > `preset.id`, `pack.id`, and `cost_profile`:
15
+ > [`ADR-010`](../decisions/ADR-010-profile-pack-preset-boundary.md).
16
+
17
+ ## Decision
18
+
19
+ A **profile** declares the user's audience identity. Six seed profiles
20
+ ship; users can declare their own under
21
+ `.agent-src.uncompressed/profiles/<id>.yml`.
22
+
23
+ | `profile.id` | Audience | README entry-paragraph | Default `preset.id` |
24
+ |---|---|---|---|
25
+ | `founder` | Solo / early-stage founder; wears every hat | "Ship the company, not the codebase" | `fast` |
26
+ | `developer` | IC engineer; primary day-to-day user today | "Pair with a senior reviewer that never sleeps" | `balanced` |
27
+ | `content_creator` | Writers, ghostwriters, marketers | "Your voice, my hands" | `balanced` |
28
+ | `agency` | Multi-client delivery shop | "Same playbook across every client repo" | `strict` |
29
+ | `finance` | CFO / fractional finance / FP&A | "Forecasts and memos with the receipts attached" | `strict` |
30
+ | `ops` | RevOps, support, SRE-adjacent | "Procedures that get followed, not skipped" | `strict` |
31
+
32
+ The seed set is **fixed for v2.x**. Adding a seventh profile requires
33
+ an ADR — the contract surface that ships in the wizard
34
+ (`/onboard` role-selection) treats this set as exhaustive.
35
+
36
+ ## Profile shape
37
+
38
+ ```yaml
39
+ profile:
40
+ id: developer
41
+ audience:
42
+ label: "IC engineer"
43
+ readme_anchor: "developer" # selects README first-screen block
44
+ defaults:
45
+ preset_id: balanced # may be overridden by .agent-settings.yml
46
+ personas: [reviewer, security] # pre-selected persona ids
47
+ skills_hint: [developer-like-execution, verify-before-complete, minimal-safe-diff]
48
+ surface:
49
+ commands_hint: [work, implement-ticket, review-changes, fix]
50
+ docs_first_pointer: "docs/getting-started-by-role.md#developer"
51
+ ```
52
+
53
+ Per [ADR-010](../decisions/ADR-010-profile-pack-preset-boundary.md), a
54
+ profile **MAY** set `defaults.preset_id` but **MAY NOT** set any
55
+ preset-owned knob directly. The lint task (`task lint-config-schema`)
56
+ enforces this.
57
+
58
+ ## Loader contract
59
+
60
+ The Phase 1 loader lives at `scripts/config/profiles.py`. Resolution
61
+ chain (last writer wins):
62
+
63
+ 1. `pack.profile_id` (if pack active) → `profile.id`.
64
+ 2. `.agent-settings.yml` top-level `profile:` block → `profile.id`
65
+ and any user overrides for `audience` / `defaults` / `surface`.
66
+ 3. Environment variable `AGENT_CONFIG_PROFILE_ID` → `profile.id`.
67
+ 4. Runtime CLI flag `--profile=<id>` → `profile.id`, single session.
68
+
69
+ If no profile resolves, the loader **does not pick a default
70
+ silently** — it falls back to `developer` only when
71
+ `.agent-settings.yml` is missing entirely (fresh install before
72
+ `/onboard`). With a settings file present but no `profile:` block,
73
+ the loader raises a structured warning pointing to `/onboard`.
74
+
75
+ ```
76
+ RATIONALE: a silent default would hide the "I never picked an audience"
77
+ state from the wizard, breaking the council v3 observation that audience
78
+ choice must be a deliberate act of the user, not an agent inference.
79
+ ```
80
+
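The resolution chain and the no-silent-default rule can be sketched as follows. This is a hedged illustration rather than the code in `scripts/config/profiles.py`: the function name, its parameters, and the `(id, source)` return shape are assumptions, and a plain `warnings.warn` stands in for the structured warning.

```python
import warnings


def resolve_profile_id(pack_profile_id=None, settings=None, env=None, cli_profile=None):
    """Return (profile_id, source); later writers override earlier ones.

    `settings` is the parsed .agent-settings.yml as a dict, or None when the
    file does not exist at all.
    """
    env = env or {}
    settings_profile = (settings or {}).get("profile") or {}
    candidates = [
        ("pack", pack_profile_id),
        ("user-settings", settings_profile.get("id")),
        ("env", env.get("AGENT_CONFIG_PROFILE_ID")),
        ("runtime", cli_profile),
    ]
    resolved = None
    for source, value in candidates:
        if value:
            resolved = (value, source)   # last writer wins
    if resolved:
        return resolved
    if settings is None:
        return ("developer", "default")  # fresh install before /onboard
    # Settings file exists but nobody picked a profile: warn, do not guess.
    warnings.warn("No profile selected; run /onboard to choose one.")
    return (None, None)
```

Carrying the source alongside the id keeps the "who set this?" answer available to an explain-style command without a second lookup.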
81
+ ## Resolution outcome
82
+
83
+ After the loader runs, the session has:
84
+
85
+ ```python
86
+ {
87
+ "id": "developer",
88
+ "audience": {"label": "IC engineer", "readme_anchor": "developer"},
89
+ "preset_id": "balanced",
90
+ "personas": ["reviewer", "security"],
91
+ "skills_hint": ["developer-like-execution", ...],
92
+ "commands_hint": ["work", "implement-ticket", ...],
93
+ "source": "user-settings | env | runtime | pack | default",
94
+ }
95
+ ```
96
+
97
+ The `source` field is mandatory and feeds the
98
+ `/agent-config explain`
99
+ command (Phase 1 item 3).
100
+
101
+ ## User-defined profiles
102
+
103
+ A consumer project MAY ship a custom profile under
104
+ `.agent-src.uncompressed/profiles/<id>.yml`. Constraints:
105
+
106
+ - `id` MUST be unique across seed + user-defined profiles.
107
+ - Shape MUST match the seed contract above (audience / defaults / surface).
108
+ - `defaults.preset_id` MUST reference an existing preset
109
+ ([`config-presets.md`](config-presets.md)).
110
+ - The lint task hard-fails on schema violations.
111
+
112
+ User-defined profiles do **not** require an ADR — they are project-local.
113
+ Only changes to the **seed set** require an ADR.
114
+
115
+ ## Drift detection
116
+
117
+ `task lint-config-schema` (added in Phase 1) hard-fails when:
118
+
119
+ - A profile YAML names a preset-owned knob (cost cap, autonomy,
120
+ confidence, risk).
121
+ - A profile YAML references a non-existent `preset_id`.
122
+ - The seed-profile count diverges from this contract's table.
123
+ - `defaults.personas` references a persona id that does not exist
124
+ under `.agent-src.uncompressed/personas/`.
125
+
126
+ ## Non-goals
127
+
128
+ - This contract does **not** define preset knobs. See
129
+ [`config-presets.md`](config-presets.md).
130
+ - It does **not** define packs. See `workflow-packs.md` (Phase 2 item 7).
131
+ - It does **not** override `cost_profile`. The rule-tier loader keeps
132
+ its independent axis per
133
+ [`cost-profile-defaults.md`](cost-profile-defaults.md).
134
+ - It does **not** ship a UI. Profile selection happens in `/onboard`
135
+ (step-15 Phase 1 item 2).
136
+
137
+ ## See also
138
+
139
+ - [`ADR-010`](../decisions/ADR-010-profile-pack-preset-boundary.md) — axis boundary.
140
+ - [`config-presets.md`](config-presets.md) — preset knobs.
141
+ - [`cost-profile-defaults.md`](cost-profile-defaults.md) — rule-tier axis (orthogonal).
142
+ - `step-15-product-refinement` — Phase 1 item 1.
@@ -0,0 +1,129 @@
1
+ ---
2
+ stability: beta
3
+ keep-beta-until: 2026-08-12
4
+ ---
5
+
6
+ # Universal safety model
7
+
8
+ > **Status:** beta — first draft 2026-05-16 (Phase 2 Item 9 of
9
+ > `step-15-product-refinement`).
10
+ >
11
+ > **Baseline:** [`docs/architecture/current-safety-behavior.md`](../architecture/current-safety-behavior.md)
12
+ > documents the pre-step-15 surface this contract replaces.
13
+
14
+ A **per-profile, per-domain safety policy** declared as a single
15
+ machine-readable table. Replaces the legacy "one autonomy switch for
16
+ everything" model documented in the baseline. Does **not** weaken the
17
+ four non-overridable floors — those keep their universal scope and
18
+ are referenced by id, not redeclared here.
19
+
20
+ ## The Iron Floor
21
+
22
+ ```
23
+ NO POLICY ENTRY MAY WIDEN AN EXISTING FLOOR.
24
+ ANY ENTRY THAT WOULD ALLOW A FLOOR-BLOCKED ACTION IS REJECTED AT LINT.
25
+ ```
26
+
27
+ The four floors are listed in
28
+ [`current-safety-behavior § The four non-overridable floors`](../architecture/current-safety-behavior.md#the-four-non-overridable-floors):
29
+ `non-destructive-by-default`, `scope-control § git-ops`,
30
+ `commit-policy`, `security-sensitive-stop`. Floor membership is
31
+ maintained in [`kernel-membership`](kernel-membership.md); a domain
32
+ listed there cannot be set to `allow` here.
33
+
34
+ ## Schema
35
+
36
+ ```yaml
37
+ # .agent-src.uncompressed/profiles/<id>.yml — new top-level key
38
+ profile:
39
+ id: <profile.id>
40
+ # ... existing fields ...
41
+ safety:
42
+ domains:
43
+ <domain-id>:
44
+ policy: <deny | ask | allow>
45
+ rationale: "<= 280 chars — why this policy for this profile>"
46
+ ```
47
+
48
+ ### Domain registry
49
+
50
+ Domains are declared in this contract, **not** invented per profile.
51
+ A profile may only reference an id from the table below.
52
+
53
+ | Domain id | What it gates | Floor reference |
54
+ |---|---|---|
55
+ | `prod_data` | Reads / writes against production data stores. | `non-destructive-by-default` |
56
+ | `prod_infra` | Terraform / k8s / cloud config touching prod. | `non-destructive-by-default` |
57
+ | `secrets` | Secret values in env, config, or output. | `security-sensitive-stop` |
58
+ | `auth_changes` | Auth, session, tenant-boundary, IAM edits. | `security-sensitive-stop` |
59
+ | `billing` | Pricing, invoicing, refund, payout logic. | `security-sensitive-stop` |
60
+ | `bulk_delete` | `rm -rf`, `DROP`, `TRUNCATE`, ≥ 5-file deletion. | `non-destructive-by-default` |
61
+ | `git_push` | `git push` to any remote. | `scope-control § git-ops` |
62
+ | `git_branch` | branch create / switch / delete. | `scope-control § git-ops` |
63
+ | `commit` | Any git commit. | `commit-policy` |
64
+ | `mcp_call_costly` | MCP / web / model call ≥ preset's `per_call_max_usd`. | — (advisory) |
65
+ | `pii_redact` | PII redaction in support / finance / recruiting / marketing outputs. | `domain-safety-pii-*` |
66
+ | `pii_log` | Logging of raw PII. | `domain-safety-logging-pii-floor` |
67
+ | `legal_advice` | Output shaped as legal advice. | `domain-safety-disclaimer-legal` |
68
+ | `medical_advice` | Output shaped as medical advice. | `domain-safety-disclaimer-medical` |
69
+ | `financial_advice` | Investment / tax / valuation positions. | `domain-safety-disclaimer-financial` |
70
+ | `pr_create` | Pull-request open / close / retarget. | `scope-control § git-ops` |
71
+ | `deploy` | Deploy / release / tag / pipeline trigger. | `non-destructive-by-default` |
72
+
73
+ ### Policy semantics
74
+
75
+ | Policy | Behaviour | Floor interaction |
76
+ |---|---|---|
77
+ | `deny` | The agent refuses. Numbered-option block surfaces the refusal and the rationale field; no override path. | `deny` is the default for every floor domain — it cannot be relaxed. |
78
+ | `ask` | The agent stops and asks a single numbered question per [`user-interaction`](../../.agent-src/rules/user-interaction.md). One question per turn. | `ask` is the default for every floor domain in a profile that has not opted out — the floor remains operative even when `policy=allow` is set elsewhere. |
79
+ | `allow` | The agent proceeds without asking. Trivial-question suppression applies. | `allow` is **forbidden** on any domain whose `Floor reference` column is non-empty. Linter rejects it. |
80
+
81
+ The legacy single switch (`personal.autonomy`) is preserved as a
82
+ **fallback** for any domain a profile does not declare — keeping
83
+ existing installs functional while profiles migrate.
84
+
85
+ ## Resolution
86
+
87
+ Order (last writer wins, subject to the Iron Floor):
88
+
89
+ 1. Domain default = `ask` for floor domains, `allow` otherwise.
90
+ 2. Profile `safety.domains.<id>.policy`.
91
+ 3. Active pack's profile (if `--pack <id>` is active).
92
+ 4. `.agent-settings.yml` user override under `profile.safety.domains`.
93
+
94
+ The explain command at [`explain config`](../../.agent-src/scripts/agent-config)
95
+ (Phase 1 Item 3 deliverable) surfaces the resolved policy per domain,
96
+ with the writer source per row.
97
+
98
+ ## Validation
99
+
100
+ `scripts/lint_safety_model.py` (Phase 2 deliverable — not yet
101
+ shipped) fails CI on:
102
+
103
+ - Unknown domain id.
104
+ - `allow` on a floor-referenced domain.
105
+ - Missing or over-long `rationale` (must be ≤ 280 chars, plain prose).
106
+ - Profile declaring `safety` without at least one entry.
107
+
108
+ Until the linter lands, profiles are reviewed by hand at PR time.
109
+
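Since the linter has not shipped, the following Python sketch only illustrates what the four checks could look like; it is not `scripts/lint_safety_model.py`. `lint_safety` and its messages are hypothetical, `DOMAIN_FLOORS` holds a small subset of the domain registry with the advisory row mapped to `None`, the policy-value check is inferred from the schema, and a profile without a `safety` key is assumed to be valid (it falls back to the legacy switch).

```python
# Subset of the domain registry: id -> floor reference (None = advisory, no floor).
DOMAIN_FLOORS = {
    "prod_data": "non-destructive-by-default",
    "secrets": "security-sensitive-stop",
    "git_push": "scope-control § git-ops",
    "commit": "commit-policy",
    "mcp_call_costly": None,
}
POLICIES = {"deny", "ask", "allow"}
MAX_RATIONALE = 280


def lint_safety(profile):
    """Return a list of human-readable problems for one parsed profile YAML."""
    safety = profile.get("profile", {}).get("safety")
    if safety is None:
        return []  # assumption: the safety key is optional
    domains = safety.get("domains") or {}
    if not domains:
        return ["safety declared without any domain entry"]
    problems = []
    for domain_id, entry in domains.items():
        if domain_id not in DOMAIN_FLOORS:
            problems.append(f"{domain_id}: unknown domain id")
            continue
        policy = entry.get("policy")
        if policy not in POLICIES:
            problems.append(f"{domain_id}: policy must be deny, ask, or allow")
        elif policy == "allow" and DOMAIN_FLOORS[domain_id] is not None:
            problems.append(f"{domain_id}: allow is forbidden on a floor domain")
        rationale = (entry.get("rationale") or "").strip()
        if not rationale or len(rationale) > MAX_RATIONALE:
            problems.append(f"{domain_id}: rationale missing or over {MAX_RATIONALE} chars")
    return problems
```

A hand review at PR time can follow the same order: known domain first, then the floor check, then the rationale.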
110
+ ## What this contract does **not** do
111
+
112
+ - **Does not** introduce new safety rules. Every domain row maps to
113
+ an existing rule or to advisory cost guidance.
114
+ - **Does not** ship the loader. `scripts/config/safety.py` is a
115
+ Phase 2 deliverable deferred to its own step.
116
+ - **Does not** override domain-safety output floors. PII redaction
117
+ and disclaimer rules apply regardless of `safety.domains.*` —
118
+ `policy=allow` on `pii_redact` means "do not ask before redacting",
119
+ not "skip redaction".
120
+ - **Does not** authorize per-tool MCP overrides. Cost caps live in
121
+ [`config-presets`](config-presets.md).
122
+
123
+ ## See also
124
+
125
+ - [`current-safety-behavior`](../architecture/current-safety-behavior.md) — pre-step-15 baseline (what this replaces)
126
+ - [`config-presets`](config-presets.md) — cost caps and enforcement
127
+ - [`profile-system`](profile-system.md) — profile axis
128
+ - [`workflow-packs`](workflow-packs.md) — pack-level overrides
129
+ - `step-15-product-refinement` § Phase 2 Item 9