npm - @event4u/agent-config - Versions diffs - 2.18.0 → 2.20.0 - Mend

@event4u/agent-config 2.18.0 → 2.20.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (108)

package/.agent-src/commands/agent-status.md +29 -0
package/.agent-src/commands/onboard.md +221 -81
package/.agent-src/commands/refine-ticket.md +3 -0
package/.agent-src/packs/README.md +49 -0
package/.agent-src/packs/agency-delivery.yml +63 -0
package/.agent-src/packs/content-engine.yml +53 -0
package/.agent-src/packs/founder-mvp.yml +51 -0
package/.agent-src/personas/README.md +8 -0
package/.agent-src/presets/README.md +26 -0
package/.agent-src/presets/balanced.yml +34 -0
package/.agent-src/presets/fast.yml +31 -0
package/.agent-src/presets/strict.yml +38 -0
package/.agent-src/profiles/README.md +29 -0
package/.agent-src/profiles/agency.yml +27 -0
package/.agent-src/profiles/content_creator.yml +25 -0
package/.agent-src/profiles/developer.yml +26 -0
package/.agent-src/profiles/finance.yml +24 -0
package/.agent-src/profiles/founder.yml +25 -0
package/.agent-src/profiles/ops.yml +25 -0
package/.agent-src/rules/no-cheap-questions.md +25 -17
package/.agent-src/skills/adr-create/SKILL.md +78 -68
package/.agent-src/skills/refine-ticket/SKILL.md +3 -0
package/.agent-src/skills/subagent-orchestration/SKILL.md +33 -0
package/.agent-src/templates/agents/agent-project-settings.example.yml +1 -1
package/.agent-src/templates/skill-archive-note.md +101 -0
package/.agent-src/user-types/README.md +124 -0
package/.agent-src/user-types/_template/user-type.md +95 -0
package/.agent-src/user-types/galabau-field-crew.md +100 -0
package/.agent-src/user-types/metalworking-shop.md +105 -0
package/.agent-src/user-types/truck-driver.md +113 -0
package/.claude-plugin/marketplace.json +1 -1
package/CHANGELOG.md +91 -30
package/README.md +68 -72
package/config/agent-settings.template.yml +22 -0
package/docs/adrs/caveman/0001-default-off-until-bench.md +93 -0
package/docs/adrs/caveman/README.md +9 -0
package/docs/adrs/cost/0001-hard-stop-hook.md +114 -0
package/docs/adrs/cost/README.md +9 -0
package/docs/adrs/memory/0001-consumer-side-snapshot.md +111 -0
package/docs/adrs/memory/README.md +9 -0
package/docs/adrs/router/0001-three-tier-routing.md +119 -0
package/docs/adrs/router/README.md +9 -0
package/docs/adrs/schema/0001-json-schema-frontmatter.md +102 -0
package/docs/adrs/schema/README.md +9 -0
package/docs/adrs/smoke/0001-per-tier-smoke-scripts.md +99 -0
package/docs/adrs/smoke/README.md +9 -0
package/docs/architecture/current-onboard-baseline.md +126 -0
package/docs/architecture/current-safety-behavior.md +137 -0
package/docs/archive/CHANGELOG-pre-2.16.0.md +48 -0
package/docs/contracts/adr-layout.md +108 -0
package/docs/contracts/adr-mcp-runtime.md +128 -0
package/docs/contracts/adr-user-types-axis.md +127 -0
package/docs/contracts/benchmark-corpus-spec.md +97 -0
package/docs/contracts/benchmark-report-schema.md +111 -0
package/docs/contracts/command-clusters.md +1 -0
package/docs/contracts/command-taxonomy.md +137 -0
package/docs/contracts/compression-default-kill-criterion.md +69 -0
package/docs/contracts/config-presets.md +144 -0
package/docs/contracts/cost-dashboard.md +143 -0
package/docs/contracts/cost-enforcement.md +134 -0
package/docs/contracts/file-ownership-matrix.json +0 -7
package/docs/contracts/mcp-tool-inventory.md +53 -0
package/docs/contracts/measurement-baseline.md +102 -0
package/docs/contracts/namespace.md +125 -0
package/docs/contracts/profile-system.md +142 -0
package/docs/contracts/safety-model.md +129 -0
package/docs/contracts/smoke-contracts.md +144 -0
package/docs/contracts/user-type-schema.md +146 -0
package/docs/contracts/workflow-packs.md +121 -0
package/docs/decisions/ADR-010-profile-pack-preset-boundary.md +132 -0
package/docs/decisions/INDEX.md +1 -0
package/docs/featured-commands.md +27 -0
package/docs/parity/bench-ruflo.json +58 -0
package/docs/parity/bench.json +41 -0
package/docs/parity/ruflo.md +46 -0
package/docs/profiles.md +91 -0
package/docs/recruits/_template.md +81 -0
package/package.json +1 -1
package/scripts/_cli/cmd_explain.py +250 -0
package/scripts/_lib/bench_cost.py +138 -0
package/scripts/_lib/bench_quality.py +118 -0
package/scripts/_lib/bench_report.py +150 -0
package/scripts/agent-config +13 -0
package/scripts/audit_adr_coverage.py +175 -0
package/scripts/audit_mcp_tools.py +146 -0
package/scripts/bench_baseline_ready.py +108 -0
package/scripts/bench_drift_check.py +151 -0
package/scripts/bench_per_tool.py +216 -0
package/scripts/bench_run.py +155 -0
package/scripts/compress.py +48 -2
package/scripts/config/__init__.py +9 -0
package/scripts/config/presets.py +206 -0
package/scripts/config/profiles.py +173 -0
package/scripts/cost/budget.mjs +73 -12
package/scripts/cost/preflight.mjs +89 -0
package/scripts/lint_archived_skills.py +143 -0
package/scripts/lint_bench_corpus.py +161 -0
package/scripts/lint_namespace.py +135 -0
package/scripts/schemas/user-type.schema.json +35 -0
package/scripts/skill_linter.py +139 -4
package/scripts/skill_overlap.py +204 -0
package/scripts/skill_tools/audit_user_type_coverage.py +148 -0
package/scripts/skill_usage_collect.py +191 -0
package/scripts/skill_usage_report.py +162 -0
package/scripts/smoke/kernel.sh +101 -0
package/scripts/smoke/router.sh +129 -0
package/scripts/smoke/schema.sh +71 -0
package/scripts/smoke/skills.sh +101 -0

package/docs/architecture/current-onboard-baseline.md ADDED

@@ -0,0 +1,126 @@
+# Current `/onboard` Baseline (pre-step-15)
+> **Status:** descriptive baseline · **Owner:** package maintainer ·
+> **Last reviewed:** 2026-05-16
+>
+> Documents the **current** `/onboard` flow so the Phase 1 Guided
+> Setup Wizard (step-15 item 2) has a baseline to extend. Addresses a
+> Council v3 unique finding: an undocumented surface cannot be
+> "extended". This file describes what ships today; it is **not** a proposal.
+## Surface
+`/onboard` lives at [`.agent-src.uncompressed/commands/onboard.md`](../../.agent-src.uncompressed/commands/onboard.md)
+(canonical source) and is triggered by the
+[`onboarding-gate`](../../.agent-src/rules/onboarding-gate.md) rule on
+the first turn when `onboarding.onboarded == false` in
+`.agent-settings.yml`. Cloud surfaces (Claude.ai Web, Skills API): fully
+inert — no settings file, no flow.
+## The 12 steps today
+| # | Step | Captures | Asked if |
+|---|---|---|---|
+| 1 | Greet + set expectations | — | always |
+| 2 | Offer user-global cross-project defaults | intent flag for step 9 | first-time-setup heuristic only |
+| 3 | `personal.user_name` | first name | unset |
+| 4 | `personal.ide` (+ auto-detect via `ps aux`) and `personal.open_edited_files` | IDE id, auto-open flag | unset |
+| 5 | `personal.pr_comment_bot_icon` | bool | always (no detection possible) |
+| 6 | `personal.rtk_installed` (via `which rtk`) | bool + install action | rtk not found |
+| 7 | `cost_profile` and `pipelines.skill_improvement` | profile id, learning bool | always (one summary screen) |
+| 8 | Mark `onboarding.onboarded: true` | — | always |
+| 9 | Write user-global `~/.event4u/agent-config/agent-settings.yml` | six whitelisted keys | step 2 captured "yes" |
+| 10 | Summary block | — | always |
+| 11 | Quickstart pointer (`/work` and `/implement-ticket`) | — | local only |
+| 12 | Maintainer telemetry hint (opt-in) | — | local only |
+## What `/onboard` does **not** capture today
+Step-15 Phase 1 item 2 introduces a new role-selection step ("8 options
+covering Software / Content / Founder / Consulting / Marketing / Finance
+/ Handwerk / Self-configure") that produces a `user_type`. Today, no
+`user_type` is captured. Specifically:
+- **No audience/role question.** `/onboard` knows the developer's name,
+  IDE, and rtk install status — never the audience taxonomy.
+- **No `profile.id`.** `profile.id` does not exist as a key in
+  `.agent-settings.yml`. Per
+  [ADR-010](../decisions/ADR-010-profile-pack-preset-boundary.md), it
+  is owned by the Phase 1 item 1 profile loader.
+- **No `preset.id`.** Same status — `preset.id` arrives with Phase 1
+  item 4.
+- **No `pack.id`.** Arrives with Phase 2 item 7.
+- **No risk-appetite question.** The current flow defers risk posture
+  to `personal.autonomy`, which is itself not part of the onboard
+  questions (it inherits the template default).
+- **No stack question.** Stack is inferred at runtime by detectors
+  (`scripts/detect/*`), not asked here.
+## Settings keys written today
+```yaml
+personal:
+  user_name: "<first-name>"        # step 3
+  ide: "code|phpstorm|cursor"       # step 4
+  open_edited_files: true|false     # step 4
+  pr_comment_bot_icon: true|false   # step 5
+  rtk_installed: true|false         # step 6
+cost_profile: "balanced"             # step 7 (default unchanged)
+pipelines:
+  skill_improvement: true            # step 7 (default unchanged)
+onboarding:
+  onboarded: true                    # step 8
+```
+User-global file (step 9, opt-in): the six whitelisted keys in
+[`scripts/_lib/agent_settings.py`](../../scripts/_lib/agent_settings.py)
+— `name`, `ide`, `cost_profile`, `personal.bot_icon`,
+`personal.autonomy`, `caveman.speak_scope`.
+## Iron Laws today
+- **One question per turn** ([`ask-when-uncertain`](../../.agent-src/rules/ask-when-uncertain.md)).
+- **Re-runnable** — invoking `/onboard` when `onboarded: true` walks the
+  flow again, never silently rewrites a value (asks before overwriting
+  `user_name` / `ide`).
+- **Never commits** — `.agent-settings.yml` is git-ignored.
+- **User-global write is opt-in + one-shot + never silent** — step 2
+  captures intent, step 9 re-confirms.
+## Gaps the wizard (Phase 1 item 2) must close
+1. **Add role-selection step** producing a `user_type` (later mapped to
+   `profile.id`). Eight options covering Software / Content / Founder /
+   Consulting / Marketing / Finance / Handwerk / Self-configure.
+   Inserted **before** step 8 (mark onboarded) so the profile loader
+   has a value to read on the next session start.
+2. **Add stack-detection confirmation step.** Run the existing
+   `scripts/detect/*` detectors, present the result, allow the user
+   to override. Without confirmation, profile-aware presets cannot
+   resolve.
+3. **Add risk-appetite question.** Maps to `preset.id` from
+   [`config-presets.md`](../contracts/config-presets.md). Three
+   options: `fast` / `balanced` / `strict`.
+4. **Write the new keys.** `profile.id`, `preset.id`, optionally
+   `pack.id`, plus the user-selected `user_type` as a stable audit field.
+## Wizard contract (Phase 1 item 2 acceptance)
+The wizard MUST:
+- Preserve every existing step semantically (no silent removal).
+- Insert role + stack + risk-appetite questions **before** step 8.
+- Honor the one-question-per-turn Iron Law.
+- Write `profile.id`, `preset.id`, and `user_type` to
+  `.agent-settings.yml` using the section-aware merge rules.
+- Be re-runnable (idempotent for unchanged answers).
+- Work offline (no network call required for any question).
+- Skip itself on cloud surfaces (inherit current cloud-noop behavior).
+## See also
+- [`/onboard` command](../../.agent-src.uncompressed/commands/onboard.md) — canonical source.
+- [`onboarding-gate`](../../.agent-src/rules/onboarding-gate.md) — trigger rule.
+- [`ADR-010`](../decisions/ADR-010-profile-pack-preset-boundary.md) — boundary the wizard must respect.
+- [`config-presets.md`](../contracts/config-presets.md) — preset axis the wizard writes.
+- [`agents/roadmaps/step-15-product-refinement.md`](../../agents/roadmaps/step-15-product-refinement.md) — Phase 1 item 2.
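
The re-runnability clause in the wizard contract above (idempotent for unchanged answers, no silent removal of existing values) can be illustrated with a small merge sketch. The baseline does not spell out the "section-aware merge rules", so the recursive dict merge below is only an assumption standing in for them. `merge_sections` and the sample values (`"developer"`, `"balanced"`, `"software"`) are hypothetical; the key names `profile.id`, `preset.id`, and `user_type` come from the file.

```python
from copy import deepcopy


def merge_sections(settings: dict, answers: dict) -> dict:
    """Return a new settings dict with `answers` merged in, section by section.

    Assumption: "section-aware" means nested mappings are merged key by key
    rather than replaced wholesale. The real merge rules may differ.
    """
    merged = deepcopy(settings)
    for key, value in answers.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Existing section: merge into it, keeping untouched keys.
            merged[key] = merge_sections(merged[key], value)
        else:
            # New section or scalar value: write it as given.
            merged[key] = deepcopy(value)
    return merged


# Settings as they might look after today's /onboard flow (sample values).
existing = {
    "personal": {"user_name": "Ada", "ide": "code"},
    "onboarding": {"onboarded": True},
}

# Hypothetical answers from the new role / risk-appetite questions.
wizard_answers = {
    "profile": {"id": "developer"},
    "preset": {"id": "balanced"},
    "user_type": "software",
}

once = merge_sections(existing, wizard_answers)
twice = merge_sections(once, wizard_answers)  # re-running with the same answers
```

Running the merge a second time with unchanged answers yields the same dict, and keys the wizard does not own (such as `personal.user_name`) are left alone, which is the behaviour the contract asks for.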

package/docs/architecture/current-safety-behavior.md ADDED

@@ -0,0 +1,137 @@
+# Current Safety Behavior — Baseline (pre-step-15)
+> **Status:** descriptive baseline · **Owner:** package maintainer ·
+> **Last reviewed:** 2026-05-16
+>
+> Documents the **current** safety / autonomy surface so the Phase 2
+> Universal Safety Model ADR (step-15 item 9) has a baseline to diff
+> against. Council v3 action #4 prerequisite. This file describes what
+> ships today; it is **not** a proposal for what should ship next.
+## Scope
+The current package has **one autonomy switch** plus **four
+non-overridable floors**. The Phase 2 ADR will replace the single switch
+with per-profile, per-domain `deny / ask / allow` declarations. Before
+that ADR can specify "replace X", X has to be written down.
+## The one switch — `personal.autonomy`
+**Where defined:** `.agent-settings.yml` under `personal.autonomy`.
+Template: `config/agent-settings.template.yml`.
+**Values:** `on` · `off` · `auto`.
+**Read site:** [`.agent-src/rules/autonomous-execution.md`](../../.agent-src/rules/autonomous-execution.md)
+(Iron-Law rule, kernel-loaded in every profile). Cached on the first
+turn; missing key treated as `on`.
+**What it gates:** trivial workflow questions (suppression). Examples:
+"Should I run the tests now?", "Should I create the branch?", "Continue
+with the next phase?". These are suppressed when `autonomy` resolves to
+`on`.
+**What it does NOT gate:** any of the four floors below, any
+[`scope-control`](../../.agent-src/rules/scope-control.md) git operation,
+or any [`commit-policy`](../../.agent-src/rules/commit-policy.md) commit
+default. The switch only narrows the **trivial-question** surface.
+### State table
+| State | Behavior on trivial workflow questions | Blocking / Hard-Floor / Commit gates |
+|---|---|---|
+| `on` | **Suppress** — agent acts, surfaces what it did | Unchanged — still apply |
+| `off` | **Ask** — numbered options, single question | Unchanged — still apply |
+| `auto` | Same as `off` until the user opts in via a standing autonomy directive ("just work", "arbeite eigenständig"). Then sticky-flip to `on` for the rest of the conversation. Mirror opt-out flips back. | Unchanged — still apply |
+### Opt-in detection
+Intent-matched, not literal-string-matched. Speech-act-checked: the
+phrase must be a meta-instruction, not content / quote / code. Detail:
+[`autonomy-detection`](../../.agent-src/contexts/execution/autonomy-detection.md),
+[`autonomy-mechanics`](../../.agent-src/contexts/execution/autonomy-mechanics.md).
+### Task scope vs conversation scope
+Two distinct autonomy shapes:
+| Shape | Trigger | Scope |
+|---|---|---|
+| **Conversation-wide trivial-question suppression** | "stop asking on trivial steps" — no deliverable named | Sticky for the rest of the conversation. Suppresses trivial workflow questions only. |
+| **Task-scoped autonomous execution** | "work autonomously on X", "arbeite die Roadmap Y komplett ab" — deliverable named | Bound to that task. Ends when the task ends. Does NOT authorize a different later deliverable. |
+Per [`autonomous-execution § task-scope`](../../.agent-src/rules/autonomous-execution.md#task-scope--autonomy-is-bound-to-the-named-task).
+## The four non-overridable floors
+No value of `personal.autonomy` lifts any of these. Standing
+autonomy directives, roadmap authorizations, or "just keep going"
+phrases never reach them.
+### 1. Hard Floor — `non-destructive-by-default`
+[`.agent-src/rules/non-destructive-by-default.md`](../../.agent-src/rules/non-destructive-by-default.md).
+Stops on: production-branch merges; deploy / release; push to remote;
+production data / infra writes; whimsical bulk deletions; commits
+containing bulk deletions or infra changes. **Always confirm this turn.**
+### 2. Git-ops Permission Gate — `scope-control`
+[`.agent-src/rules/scope-control.md § Git operations`](../../.agent-src/rules/scope-control.md#git-operations--permission-gated).
+Stops on: commit · push · merge · rebase · force-push · branch create /
+switch / delete · PR create / close / retarget · tag / release / pin.
+Permission must be **this turn or a standing instruction not yet
+revoked**.
+### 3. Commit Default — `commit-policy`
+[`.agent-src/rules/commit-policy.md`](../../.agent-src/rules/commit-policy.md).
+**Never commit, never ask about committing.** Four exceptions: user
+says so this turn · standing instruction · `/commit` invoked · roadmap
+authorization. Anything else → no commit.
+### 4. Security-sensitive STOP — `security-sensitive-stop`
+[`.agent-src/rules/security-sensitive-stop.md`](../../.agent-src/rules/security-sensitive-stop.md).
+Stops on: auth, billing, tenant boundaries, secrets, uploads,
+integrations, webhooks, public endpoints. Threat-model **before**
+editing.
+## Coverage map
+| Surface | What governs it |
+|---|---|
+| Trivial workflow question | `personal.autonomy` (the switch) |
+| Blocking architectural / scope question | [`ask-when-uncertain`](../../.agent-src/rules/ask-when-uncertain.md) (always) |
+| Tool / MCP call cost | None today — Phase 1 item 4 introduces preset-loader Hard Enforcement |
+| Skill / command allowlist per audience | None today — Phase 2 item 7 introduces packs |
+| Per-domain `deny / ask / allow` | None today — Phase 2 item 9 introduces this |
+| Hard Floor (prod, deploy, push, bulk-destructive) | Universal — not switchable |
+| Git ops | Universal permission gate — not switchable |
+| Commit | Universal default-deny — not switchable |
+## Gaps the Phase 2 ADR will address
+1. **One switch, one granularity.** Today, `autonomy: on` suppresses
+   *every* trivial question identically. A founder running the
+   `content-engine` pack may want autonomy for content, ask-mode for
+   spend; the current model cannot express that.
+2. **No per-domain policy.** Domain-safety rules
+   (`.agent-src/rules/domain-safety-*.md`) act as output floors but do
+   not declare `deny / ask / allow` per profile. The Phase 2 model
+   centralizes this.
+3. **No machine-readable safety schema.** The current behavior is
+   distributed across four rules. A consuming tool (the wizard, the
+   explain command) cannot ask "what is this install's safety posture?"
+   without reading rule prose.
+The Phase 2 ADR (`docs/contracts/safety-model.md`) inherits this
+baseline and adds: per-profile policy table, machine-readable schema,
+explain-trace integration. It MUST NOT silently relax any of the four
+floors above.
+## See also
+- [`autonomous-execution`](../../.agent-src/rules/autonomous-execution.md) · [`non-destructive-by-default`](../../.agent-src/rules/non-destructive-by-default.md) · [`scope-control`](../../.agent-src/rules/scope-control.md) · [`commit-policy`](../../.agent-src/rules/commit-policy.md) · [`security-sensitive-stop`](../../.agent-src/rules/security-sensitive-stop.md).
+- [`docs/safety.md`](../safety.md) — domain-safety output floors.
+- [`agents/roadmaps/step-15-product-refinement.md`](../../agents/roadmaps/step-15-product-refinement.md) — Phase 1 item 2a (this doc) and Phase 2 item 9 (Universal Safety Model ADR).
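
The state table and the four floors described above can be read as a small state machine. The following is a minimal sketch under stated assumptions, not the package's implementation: the class `AutonomyState`, the surface labels in `FLOOR_SURFACES`, and the return strings are all hypothetical names chosen for illustration. Only the values `on` / `off` / `auto`, the missing-key default, and the sticky opt-in behaviour are taken from the file.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the four non-overridable floors.
FLOOR_SURFACES = frozenset(
    {"hard_floor", "git_ops", "commit", "security_sensitive"}
)


@dataclass
class AutonomyState:
    setting: Optional[str] = None  # personal.autonomy; None models a missing key
    opted_in: bool = False         # sticky flag, only meaningful for "auto"

    def resolved(self) -> str:
        """Resolve the switch to an effective 'on' or 'off'."""
        setting = self.setting or "on"  # missing key is treated as "on"
        if setting == "auto":
            return "on" if self.opted_in else "off"
        return setting

    def opt_in(self) -> None:
        """A standing autonomy directive was detected."""
        self.opted_in = True

    def opt_out(self) -> None:
        """The mirror opt-out was detected."""
        self.opted_in = False

    def gate(self, surface: str) -> str:
        """Return how a surface is handled under the current state."""
        if surface in FLOOR_SURFACES:
            return "floor"  # no autonomy value lifts a floor
        return "suppress" if self.resolved() == "on" else "ask"


state = AutonomyState("auto")
before = state.gate("trivial")    # behaves like "off" until opt-in
state.opt_in()
after = state.gate("trivial")     # sticky flip to "on"
floor = state.gate("git_ops")     # unchanged by the flip
```

The point of the sketch is the early return in `gate`: floor surfaces are checked before the switch is consulted at all, so no combination of setting and opt-in can reach them.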

package/docs/archive/CHANGELOG-pre-2.16.0.md ADDED

@@ -0,0 +1,48 @@
+# Changelog Archive — pre-2.16.0
+> Frozen snapshot of `event4u/agent-config` changelog entries from
+> `2.15.0`, split out of the main
+> [`CHANGELOG.md`](../../CHANGELOG.md) on 2026-05-16 once the active
+> era's body crossed the 200-line drift cap enforced by
+> `tests/test_changelog_eras.py`.
+>
+> **Read-only.** New entries land in `CHANGELOG.md` § "Era: 2.16.x".
+> Entries here are not amended — git tag `2.15.0` remains the
+> canonical source for what shipped.
+>
+> Entry shape follows the conventions documented in
+> [`docs/contracts/CHANGELOG-conventions.md`](../contracts/CHANGELOG-conventions.md).
+> Earlier eras live in
+> [`CHANGELOG-pre-2.15.0.md`](CHANGELOG-pre-2.15.0.md),
+> [`CHANGELOG-pre-2.11.0.md`](CHANGELOG-pre-2.11.0.md),
+> [`CHANGELOG-pre-2.7.0.md`](CHANGELOG-pre-2.7.0.md), and
+> [`CHANGELOG-pre-2.2.0.md`](CHANGELOG-pre-2.2.0.md).
+## [2.15.0](https://github.com/event4u-app/agent-config/compare/2.14.0...2.15.0) (2026-05-15)
+### Features
+* **agent-user:** add /agents user command cluster (init, show, review, accept, update) ([15d53d8](https://github.com/event4u-app/agent-config/commit/15d53d8d9a2365b044831cd42127e247a70d7e20))
+* **agent-user:** add v1 schema contract for .agent-user.md persona file ([64f4eab](https://github.com/event4u-app/agent-config/commit/64f4eab62ccf6a2606fbca0c56d398372c05a7a0))
+### Bug Fixes
+* **agent-user:** inline council-reference summary per no-roadmap-references ([ee4d3ce](https://github.com/event4u-app/agent-config/commit/ee4d3cedf9f4429450d21ca5badc2ae5c2ecaaed))
+* **agent-user:** drop roadmap references per no-roadmap-references rule ([c8ade8d](https://github.com/event4u-app/agent-config/commit/c8ade8d7c5b495e0e4295aa0cb801e59076ee0b0))
+* **agent-user:** adjust keep-beta-until to fit 90-day window ([801b365](https://github.com/event4u-app/agent-config/commit/801b365117a2d1efb4505e504bdd730e4cbbc217))
+### Documentation
+* **persona:** README section + agent-settings legacy-fallback note ([4da7629](https://github.com/event4u-app/agent-config/commit/4da7629f1f0b5a35a64d0a861040ad8639a66ebe))
+* **roadmap:** mark step-3-agent-user-persona phases as in-progress ([f29d3bc](https://github.com/event4u-app/agent-config/commit/f29d3bce2380c0ea9c67e6094540b88d920ed9ff))
+### Chores
+* **roadmap:** close out + archive step-3-agent-user-persona ([09c0229](https://github.com/event4u-app/agent-config/commit/09c0229efd67af9cad7b2ca8202f4caa351d028d))
+* **ownership:** regenerate file-ownership-matrix for /agents user ([128890d](https://github.com/event4u-app/agent-config/commit/128890d880584704b4842a398555dd979ae54462))
+* **docs:** bump command count from 109 to 115 ([f8c61b1](https://github.com/event4u-app/agent-config/commit/f8c61b1d0ec48034e0d66e8d32534056ca4aa1f0))
+* **template:** bump agent_config_version pin to 2.14.0 ([fcb885f](https://github.com/event4u-app/agent-config/commit/fcb885fd19bdbca46ef91ec4d5e723cc6c186c6d))
+* **index:** regenerate agents/index.md + docs/catalog.md for /agents user ([56b281d](https://github.com/event4u-app/agent-config/commit/56b281d69960d3e57adbd24b9ec6fd24fc1a5aff))
+* **agent-user:** regenerate compressed sources + claude tool stubs ([f79b6d1](https://github.com/event4u-app/agent-config/commit/f79b6d1cfcf1caccde4a723ad779c65d9ed87198))
+Tests: 4352 (+12 since 2.14.0)

package/docs/contracts/adr-layout.md ADDED

@@ -0,0 +1,108 @@
+---
+stability: stable
+---
+# ADR Layout — Per-area Directories
+> Status: accepted · 2026-05-16 · Roadmap: `step-11-ruflo-parity` Phase 4
+## Scope
+Two ADR surfaces coexist in this repo. **Both are canonical** — neither supersedes the other.
+| Surface | Path | Use for |
+|---|---|---|
+| **Flat (legacy)** | `docs/decisions/ADR-NNN-<slug>.md` | Cross-cutting governance decisions: kernel composition, rule taxonomy, package-wide architecture. Numbering is global, sequential, gap-free. |
+| **Per-area** | `docs/adrs/<area>/NNNN-<slug>.md` | Sub-area decisions whose blast radius is one plugin / one subsystem. Numbering is per-area, starts at `0001`, padded to 4 digits. |
+Choice rule — does the decision constrain code **inside one area folder** (one runtime module, one contract group, one CLI surface)? → per-area. Does it constrain **the package's contract with consumers**? → flat. In doubt → per-area (cheaper to surface, easier to relocate).
+## Per-area layout
+```
+docs/adrs/
+  <area>/
+    README.md          # one-paragraph area scope + table of all ADRs in this area
+    0001-<slug>.md     # first ADR, retrospective or prospective
+    0002-<slug>.md
+    ...
+```
+`<area>` is a kebab-case stem matching one of:
+- An entry in the canonical area inventory (see [`scripts/audit_adr_coverage.py`](../../scripts/audit_adr_coverage.py) `AREAS`).
+- A new area added to that inventory in the same PR.
+Reserved areas (bootstrap pass — step-11 Phase 4 Step 3):
+| Area | Scope | Owner contract |
+|---|---|---|
+| `cost` | Budget ladder, hard-stop hook, cost reporting | [`cost-enforcement.md`](cost-enforcement.md) |
+| `caveman` | Caveman-speak compression, decompression, reversibility | [`compression-default-kill-criterion.md`](compression-default-kill-criterion.md) |
+| `schema` | Frontmatter schemas, v2 rigor, lint behaviour | [`schema-versioning.md`](schema-versioning.md) (when published) |
+| `router` | `router.json` shape, tier semantics, dispatch precedence | [`rule-router.md`](rule-router.md) |
+| `smoke` | Per-tier smoke contracts, baseline locks | [`smoke-contracts.md`](smoke-contracts.md) |
+| `memory` | Memory MCP, propose / promote / poison flow | [`agent-memory-contract.md`](agent-memory-contract.md) |
+## Frontmatter
+Identical across both surfaces:
+```yaml
+---
+adr: NNN              # zero-padded; per-area uses 4-digit (0001), flat uses 3-digit (010)
+area: <area> | flat   # 'flat' for docs/decisions/, otherwise the area slug
+status: proposed | accepted | superseded | deprecated
+date: YYYY-MM-DD
+decision: <slug>
+supersedes: — | ADR-<area>-NNNN | ADR-MMM
+superseded_by: — | ADR-<area>-NNNN | ADR-MMM
+phase: <roadmap-stem> · <phase-id>     # optional but recommended
+type: retrospective | prospective
+---
+```
+Supersession links cross surfaces: a per-area ADR may supersede a flat ADR and vice versa. The shape of the identifier in `supersedes:` makes the target unambiguous (`ADR-007` = flat, `ADR-cost-0001` = per-area).
+## Per-area README contract
+Every `<area>/` directory carries a `README.md` with:
+1. One-paragraph area scope (≤ 4 sentences).
+2. Single contract pointer — the `docs/contracts/<X>.md` this area implements (or "no published contract" if pre-Phase 5).
+3. Numbered table of ADRs in the area: `| # | Title | Status | Date | Supersedes |`. Generated by `scripts/audit_adr_coverage.py --regen-area-readme <area>`.
+## Coverage gate
+`scripts/audit_adr_coverage.py --check` (wired to `task lint-adr-coverage`):
+- Warns when a `docs/contracts/<X>.md` exists without a matching `docs/adrs/<X>/0001-*.md`.
+- Hard-fails on number gaps within an area (e.g. `0001`, `0003` without `0002`).
+- Hard-fails on missing `README.md` in any non-empty area directory.
+- Warns on dangling `supersedes:` or `superseded_by:` references.
+Default mode is **warn** at the consumer surface; **fail** under `task ci`. Rationale: a new contract dropped without an ADR is a documentation gap, not a bug. CI enforces it for this package; consumer projects opt in by adding the task to their own pipeline.
+## Numbering & gaps
+- Per-area: 4-digit, gap-free, starts at `0001`. Re-use of numbers is a hard failure in the index regenerator.
+- Flat: 3-digit, gap-free, starts at `001`. Existing ADRs in `docs/decisions/` set the precedent.
+- An ADR is **never** deleted from history; supersede it instead. The lint surfaces broken supersession chains.
+## Relationship to `adr-create` skill
+[`adr-create`](../../.agent-src.uncompressed/skills/adr-create/SKILL.md) accepts an optional `<area>` argument (added in step-11 Phase 4 Step 4):
+- No `<area>` → flat surface, `docs/decisions/`.
+- `<area>` matches inventory → per-area surface, `docs/adrs/<area>/`.
+- `<area>` does **not** match inventory → skill refuses with a hint to update the inventory first.
+The skill's template, numbering logic, and validation hooks are identical for both surfaces; only the target directory and number padding differ.
+## References
+- [`docs/adrs/cost/0001-hard-stop-hook.md`](../adrs/cost/0001-hard-stop-hook.md) — first per-area ADR (bootstrap).
+- [`docs/decisions/INDEX.md`](../decisions/INDEX.md) — flat surface index.
+- [`scripts/audit_adr_coverage.py`](../../scripts/audit_adr_coverage.py) — coverage gate.
+- [`scripts/adr/regenerate_index.py`](../../scripts/adr/regenerate_index.py) — index regenerator (works on both surfaces; pass `--dir`).
+- `step-11-ruflo-parity` Phase 4 — origin.
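
Two of the rules in the layout contract above lend themselves to a short sketch: the gap check within an area (`0001`, `0003` without `0002`) and telling flat from per-area references in `supersedes:`. This is not the code in `scripts/audit_adr_coverage.py`, which this diff view does not show here; the function names and regular expressions are assumptions derived only from the filename and identifier shapes the contract describes.

```python
import re

# Assumed shapes, taken from the contract text:
#   per-area file:      NNNN-<slug>.md
#   flat reference:     ADR-NNN
#   per-area reference: ADR-<area>-NNNN
ADR_FILE = re.compile(r"^(\d{4})-[a-z0-9-]+\.md$")
FLAT_REF = re.compile(r"^ADR-\d{3}$")
AREA_REF = re.compile(r"^ADR-([a-z][a-z0-9-]*)-\d{4}$")


def find_gaps(filenames):
    """Return the missing ADR numbers in one area directory listing.

    Numbering is expected to start at 1 and be gap-free; files that do
    not look like ADRs (for example README.md) are ignored.
    """
    numbers = {
        int(m.group(1)) for name in filenames if (m := ADR_FILE.match(name))
    }
    if not numbers:
        return []
    return [n for n in range(1, max(numbers) + 1) if n not in numbers]


def classify_ref(ref):
    """Return 'flat', the area slug, or None for an unrecognised reference."""
    if FLAT_REF.match(ref):
        return "flat"
    m = AREA_REF.match(ref)
    return m.group(1) if m else None


gaps = find_gaps(["README.md", "0001-hard-stop-hook.md", "0003-example.md"])
kinds = [classify_ref(r) for r in ("ADR-007", "ADR-cost-0001", "ADR-7")]
```

Note that this sketch only finds gaps; re-used numbers, which the contract also treats as a hard failure, would need a separate duplicate check.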

package/docs/contracts/adr-mcp-runtime.md ADDED

@@ -0,0 +1,128 @@
+---
+stability: stable
+---
+# ADR — MCP server runtime: Anthropic `mcp` Python SDK
+> **Status:** Decided · 2026-05-10 (recorded 2026-05-15).
+> **Context:**
+> [`mcp-phase-1-scope.md`](mcp-phase-1-scope.md),
+> [`mcp-cloud-scope.md`](mcp-cloud-scope.md).
+## Decision
+The MCP server at `scripts/mcp_server/` runs on **Python 3.11+** using the
+official Anthropic **`mcp` Python SDK** (PyPI; pinned to `mcp==1.27.1`
+per [`scripts/mcp_server/requirements.txt`](../../scripts/mcp_server/requirements.txt)).
+**FastMCP** (the higher-level decorator wrapper) and the **MCP TypeScript SDK**
+are explicitly rejected for this surface.
+The hosted Cloudflare Worker bridge (`workers/mcp/`) is the only place a
+non-Python runtime is allowed, and it stays bound to the same wire contract
+(see [`mcp-cloud-scope.md`](mcp-cloud-scope.md)).
+## Why this was a real question
+The package already ships Python under `scripts/` (work engine, AI Council,
+skill linter, install driver helpers) and ships zero Node-runtime code paths
+outside the npx dispatcher. Three candidate runtimes for the MCP server
+could all have shipped:
+1. **MCP Python SDK** (low-level `Server` + `stdio_server` handlers).
+2. **FastMCP** (higher-level Pythonic decorators built on the same SDK).
+3. **MCP TypeScript SDK** (Node runtime, separate package).
+Without an ADR, this choice would have stayed implicit in the code and been
+re-litigated every time a contributor read `scripts/mcp_server/server.py`.
+## Why MCP Python SDK (low-level) wins
+| Criterion | MCP Python SDK | FastMCP | MCP TypeScript SDK |
+|---|---|---|---|
+| Runtime already in repo | ✅ Python is the `scripts/` runtime | ✅ Same | ❌ Adds Node-runtime path for one server |
+| A0 safety boundary fit (read-only `prompts/list`, `prompts/get`, narrow `tools/*` allowlist) | ✅ Direct handler control matches the [Phase 1 scope contract](mcp-phase-1-scope.md) | ⚠️ Decorator sugar can obscure the unimplemented-tool guard | ✅ Possible but duplicates Python helpers |
+| Import-surface guard (`tests/test_mcp_server.py` asserts no `subprocess`, `os.system`, `os.popen`, no HTTP client in `scripts.mcp_server.prompts/tools`) | ✅ Trivial to enforce — one module set to audit | ⚠️ FastMCP pulls in extra deps that widen the audit surface | ❌ Would need a TS-side equivalent |
+| Reuse of existing project helpers (`scripts/skill_linter.py`, `scripts/chat_history.py`) | ✅ Direct in-process call | ✅ Same | ❌ Cross-runtime IPC or duplicated logic |
+| Pin / supply-chain footprint | One pin (`mcp==1.27.1`) + `PyYAML` | Adds FastMCP version coupling on top | Node toolchain (`npm`, `tsc`, `dist/`) |
+| Smoke-test path | `task mcp:setup && task mcp:run` (already shipped) | Would re-wrap the same SDK | Separate test runner |
+Evidence the decision is already realised in code:
+- [`scripts/mcp_server/server.py`](../../scripts/mcp_server/server.py) — uses
+  `mcp.server.Server`, `mcp.server.stdio.stdio_server`, `InitializationOptions`
+  directly (no FastMCP decorators).
+- [`scripts/mcp_server/__init__.py`](../../scripts/mcp_server/__init__.py) —
+  pins `__version__` and declares stability/contract pointer.
+- [`scripts/mcp_server/requirements.txt`](../../scripts/mcp_server/requirements.txt)
+  — `mcp==1.27.1`, no FastMCP, no Node tooling.
+- [`scripts/mcp_setup.sh`](../../scripts/mcp_setup.sh) — onboarding writes
+  the Claude Desktop config snippet against `python -m scripts.mcp_server`.
+## Tool surface (Phase 1 scoping)
+Locked separately by [`mcp-phase-1-scope.md`](mcp-phase-1-scope.md) Phase 4
+amendment. The current ALLOWLIST is exactly two tools, registered as a
+hardcoded module-level tuple in
+[`scripts/mcp_server/tools.py`](../../scripts/mcp_server/tools.py):
+| Tool | Mode | Source |
+|---|---|---|
+| `lint_skills` | read-only | wraps `scripts.skill_linter.lint_file` |
+| `chat_history_append` | path-scoped write | wraps `scripts.chat_history.append`; writes restricted to `agents/.agent-chat-history` or `.agent-chat-history` under the consumer root |
+No `push`, `merge`, `commit`, or prod-write surface is exposed. The
+unimplemented-tool envelope from
+[`mcp-tool-stub-envelope.md`](mcp-tool-stub-envelope.md) governs the rest of
+the [`consumer_tool_catalog.json`](../../scripts/mcp_server/consumer_tool_catalog.json)
+entries.
+`agent-config init`, `agent-config skills list`, and
+`agent-config council estimate` (the speculative tool surface in the
+step-14 stub) are **not** exposed today. They stay terminal-gated because
+their natural shape is a stateful CLI. The AI Council (claude-sonnet-4-5 +
+gpt-4o, 2026-05-10, 2 rounds, $0.06) converged on a hardcoded module-level
+ALLOWLIST with mandatory path-scoping for any write tool, and locked the
+rule that engine-state-bearing operations stay off the MCP wire until a
+real consumer ask justifies amending the A0 boundary.
+## Install surface
+The step-14 stub speculated about an `agent-config install --mcp` flag.
+That shape was rejected in favour of two existing entrypoints, both
+already shipped:
+- **One-liner onboarding:** `task mcp:setup` runs
+  [`scripts/mcp_setup.sh`](../../scripts/mcp_setup.sh) — creates
+  `.venv-mcp/`, installs `mcp`, and prints the Claude Desktop JSON snippet
+  the operator pastes into
+  `~/Library/Application Support/Claude/claude_desktop_config.json`
+  (with the per-OS variants documented in
+  [`docs/mcp-server.md`](../mcp-server.md)).
+- **Config writer:** `./agent-config mcp:render --claude-desktop` writes
+  the user-scope Claude Desktop config directly.
+Writing the global Claude Desktop config from the npx dispatcher without
+an operator pasting JSON is **not** part of this contract — Claude Desktop
+restarts and config-merge semantics make silent rewrites a footgun. The
+copy-paste path stays the canonical install shape until non-dev recruit
+evidence under `agents/eval-findings/` demonstrates the manual step is the
+actual adoption blocker.
+## Consequences
+- Adding a third tool to the MCP server is a code-review event against
+  `ALLOWLIST` in `scripts/mcp_server/tools.py`. No settings flag, no env
+  var, no dynamic registration — see
+  [`mcp-phase-1-scope.md`](mcp-phase-1-scope.md) Phase 4 amendment.
+- Picking up a future protocol break (MCP SDK 2.x) is one pin bump in
+  `scripts/mcp_server/requirements.txt`, gated on the 12 import-surface +
+  behaviour tests in `tests/test_mcp_server.py` staying green.
+- Re-opening FastMCP or the TypeScript SDK requires a new ADR that
+  supersedes this one with evidence (Python SDK shipping a deprecation
+  or FastMCP closing the safety-audit gap on `tools/*`).
+## See also
+- [`mcp-phase-1-scope.md`](mcp-phase-1-scope.md) — Phase 1–6 hard contract.
+- [`mcp-cloud-scope.md`](mcp-cloud-scope.md) — hosted Worker bridge scope.
+- [`mcp-tool-stub-envelope.md`](mcp-tool-stub-envelope.md) — unimplemented-tool wire shape.
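
The ADR above describes a hardcoded module-level `ALLOWLIST` plus mandatory path-scoping for the one write tool. A minimal sketch of that shape follows. It is not the content of `scripts/mcp_server/tools.py`, which is not part of this diff; the helper names `is_allowed_tool` and `is_allowed_write`, the `WRITE_TARGETS` tuple, and the choice to treat the two chat-history locations as exact file paths are all assumptions made for illustration.

```python
from pathlib import Path

# The two tool names are taken from the ADR's tool table.
ALLOWLIST = ("lint_skills", "chat_history_append")

# Assumption: the two permitted write locations are exact paths
# relative to the consumer root.
WRITE_TARGETS = ("agents/.agent-chat-history", ".agent-chat-history")


def is_allowed_tool(name: str) -> bool:
    """Only names in the hardcoded tuple are callable; nothing is dynamic."""
    return name in ALLOWLIST


def is_allowed_write(consumer_root: Path, target: Path) -> bool:
    """Check that `target` resolves to one of the permitted paths.

    Resolving first collapses '..' segments, so a path that escapes the
    consumer root cannot match an allowed target.
    """
    root = consumer_root.resolve()
    resolved = (root / target).resolve()
    return any(resolved == root / rel for rel in WRITE_TARGETS)


root = Path.cwd()
ok = is_allowed_write(root, Path("agents/.agent-chat-history"))
escaped = is_allowed_write(root, Path("agents/../../.agent-chat-history"))
```

Keeping the allowlist as a tuple in module scope means that adding a tool shows up as a code diff, which matches the ADR's statement that a third tool is a code-review event rather than a settings change.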