npm - @event4u/agent-config - Versions diffs - 2.19.0 → 2.20.0 - Mend

@event4u/agent-config 2.19.0 → 2.20.0

Files changed (92)

package/.agent-src/commands/agent-status.md +29 -0
package/.agent-src/commands/onboard.md +221 -81
package/.agent-src/packs/README.md +49 -0
package/.agent-src/packs/agency-delivery.yml +63 -0
package/.agent-src/packs/content-engine.yml +53 -0
package/.agent-src/packs/founder-mvp.yml +51 -0
package/.agent-src/presets/README.md +26 -0
package/.agent-src/presets/balanced.yml +34 -0
package/.agent-src/presets/fast.yml +31 -0
package/.agent-src/presets/strict.yml +38 -0
package/.agent-src/profiles/README.md +29 -0
package/.agent-src/profiles/agency.yml +27 -0
package/.agent-src/profiles/content_creator.yml +25 -0
package/.agent-src/profiles/developer.yml +26 -0
package/.agent-src/profiles/finance.yml +24 -0
package/.agent-src/profiles/founder.yml +25 -0
package/.agent-src/profiles/ops.yml +25 -0
package/.agent-src/rules/no-cheap-questions.md +25 -17
package/.agent-src/skills/adr-create/SKILL.md +78 -68
package/.agent-src/skills/subagent-orchestration/SKILL.md +33 -0
package/.agent-src/templates/agents/agent-project-settings.example.yml +1 -1
package/.agent-src/templates/skill-archive-note.md +101 -0
package/.claude-plugin/marketplace.json +1 -1
package/CHANGELOG.md +52 -30
package/README.md +68 -72
package/config/agent-settings.template.yml +22 -0
package/docs/adrs/caveman/0001-default-off-until-bench.md +93 -0
package/docs/adrs/caveman/README.md +9 -0
package/docs/adrs/cost/0001-hard-stop-hook.md +114 -0
package/docs/adrs/cost/README.md +9 -0
package/docs/adrs/memory/0001-consumer-side-snapshot.md +111 -0
package/docs/adrs/memory/README.md +9 -0
package/docs/adrs/router/0001-three-tier-routing.md +119 -0
package/docs/adrs/router/README.md +9 -0
package/docs/adrs/schema/0001-json-schema-frontmatter.md +102 -0
package/docs/adrs/schema/README.md +9 -0
package/docs/adrs/smoke/0001-per-tier-smoke-scripts.md +99 -0
package/docs/adrs/smoke/README.md +9 -0
package/docs/architecture/current-onboard-baseline.md +126 -0
package/docs/architecture/current-safety-behavior.md +137 -0
package/docs/archive/CHANGELOG-pre-2.16.0.md +48 -0
package/docs/contracts/adr-layout.md +108 -0
package/docs/contracts/benchmark-corpus-spec.md +97 -0
package/docs/contracts/benchmark-report-schema.md +111 -0
package/docs/contracts/command-clusters.md +1 -0
package/docs/contracts/command-taxonomy.md +137 -0
package/docs/contracts/compression-default-kill-criterion.md +69 -0
package/docs/contracts/config-presets.md +144 -0
package/docs/contracts/cost-dashboard.md +143 -0
package/docs/contracts/cost-enforcement.md +134 -0
package/docs/contracts/file-ownership-matrix.json +0 -7
package/docs/contracts/mcp-tool-inventory.md +53 -0
package/docs/contracts/measurement-baseline.md +102 -0
package/docs/contracts/namespace.md +125 -0
package/docs/contracts/profile-system.md +142 -0
package/docs/contracts/safety-model.md +129 -0
package/docs/contracts/smoke-contracts.md +144 -0
package/docs/contracts/workflow-packs.md +121 -0
package/docs/decisions/ADR-010-profile-pack-preset-boundary.md +132 -0
package/docs/decisions/INDEX.md +1 -0
package/docs/featured-commands.md +27 -0
package/docs/parity/bench-ruflo.json +58 -0
package/docs/parity/bench.json +41 -0
package/docs/parity/ruflo.md +46 -0
package/docs/profiles.md +91 -0
package/package.json +1 -1
package/scripts/_cli/cmd_explain.py +250 -0
package/scripts/_lib/bench_cost.py +138 -0
package/scripts/_lib/bench_quality.py +118 -0
package/scripts/_lib/bench_report.py +150 -0
package/scripts/agent-config +13 -0
package/scripts/audit_adr_coverage.py +175 -0
package/scripts/audit_mcp_tools.py +146 -0
package/scripts/bench_baseline_ready.py +108 -0
package/scripts/bench_drift_check.py +151 -0
package/scripts/bench_per_tool.py +216 -0
package/scripts/bench_run.py +155 -0
package/scripts/config/__init__.py +9 -0
package/scripts/config/presets.py +206 -0
package/scripts/config/profiles.py +173 -0
package/scripts/cost/budget.mjs +73 -12
package/scripts/cost/preflight.mjs +89 -0
package/scripts/lint_archived_skills.py +143 -0
package/scripts/lint_bench_corpus.py +161 -0
package/scripts/lint_namespace.py +135 -0
package/scripts/skill_overlap.py +204 -0
package/scripts/skill_usage_collect.py +191 -0
package/scripts/skill_usage_report.py +162 -0
package/scripts/smoke/kernel.sh +101 -0
package/scripts/smoke/router.sh +129 -0
package/scripts/smoke/schema.sh +71 -0
package/scripts/smoke/skills.sh +101 -0

package/docs/contracts/command-taxonomy.md ADDED

@@ -0,0 +1,137 @@
+---
+stability: beta
+keep-beta-until: 2026-08-12
+---
+# Command taxonomy
+> **Status:** beta — first draft 2026-05-16 (Phase 2 Item 6 of
+> `step-15-product-refinement`).
+The taxonomy answers **"how is the command surface organized so each
+profile finds their three first commands in under 30 seconds?"** It is
+a **catalog-organization contract**, not an invocation-rename. Existing
+slash invocations (`/work`, `/fix ci`, `/research deep`) are preserved
+by the locked verb-cluster contract at
+[`command-clusters`](command-clusters.md). This file adds a **profile
+axis** on top of the verb axis without breaking either.
+## The two axes
+| Axis | Owner | Surface |
+|---|---|---|
+| **Verb-cluster** (existing) | [`command-clusters`](command-clusters.md) | Defines the invocation tree (`/fix ci` dispatches to the `ci` sub-command of the `fix` cluster). Linter-enforced. **Source of truth for invocation.** |
+| **Profile** (this contract) | [`profile-system`](profile-system.md) | Defines which verb-clusters and sub-commands are surfaced first for each profile (developer · content_creator · founder · agency · finance · ops). **Source of truth for discoverability.** |
+A command can be discoverable under multiple profiles. `/work` is
+universal — it appears in `commands_hint` for every profile. `/dcf-modeling`
+is finance-only. Discoverability is many-to-many; invocation stays
+single-source.
+## Membership rules
+### Profile membership
+A command appears in a profile's `commands_hint` (in
+`.agent-src.uncompressed/profiles/<id>.yml`) iff **all** hold:
+1. **First-week reach.** A user of that profile will reach for this
+   command within their first five sessions without being told.
+2. **Profile-coherent.** The command's domain matches the profile's
+   primary work surface (engineering for `developer`, content for
+   `content_creator`, etc.).
+3. **Verb-cluster owned.** The command exists in `command-clusters` —
+   no profile may declare a command that has not gone through the
+   verb-cluster linter.
+4. **Cap of five.** A profile's `commands_hint` is capped at five
+   entries. The cap is what makes "three first commands" possible.
+### Top-10 most-used (for alias / deprecation policy)
+The top-10 list is the **union of all six profiles' `commands_hint`
+lists, ranked by per-profile membership count**. As of 2026-05-16
+that union is, in rank order:
+1. `work` (6/6 profiles)
+2. `implement-ticket` (2/6 — developer, agency)
+3. `feature` (2/6 — founder, agency)
+4. `council` (2/6 — founder, finance)
+5. `challenge-me` (2/6 — founder, finance)
+6. `review-changes` (2/6 — developer, ops)
+7. `fix` (2/6 — developer, ops)
+8. `refine-ticket` (1/6 — agency)
+9. `commit` (1/6 — developer)
+10. `roadmap` (1/6 — agency)
+The top-10 is regenerated automatically from the profile YAMLs by
+`scripts/regen_top10.py` (Phase 2 deliverable — not yet shipped). Until
+the regen script lands, the list above is the locked snapshot.
+## Backward-compat policy
+The top-10 commands carry a **two-release backward-compat guarantee**:
+- A rename of any top-10 command (whether by verb-cluster restructure
+  or profile-axis reorganization) ships with an alias for **at least
+  two minor releases**.
+- The alias is recorded in the verb-cluster's `Replaces` column in
+  [`command-clusters`](command-clusters.md) and re-emits a one-line
+  deprecation notice to stderr on every invocation.
+- Removing the alias requires the `bundled-always-rules-acknowledged`
+  PR label and an entry in the CHANGELOG `Removed` section naming the
+  end-of-deprecation release.
+Commands outside the top-10 follow the existing verb-cluster
+deprecation rules (one release as a shim, then disappear).
+## Discoverability surfaces
+Three surfaces consume this contract:
+| Surface | Path | What it shows |
+|---|---|---|
+| **README** | `README.md` § "Six entry paths" | Per-profile `commands_hint` (max 5) rendered as the first-commands list per profile block |
+| **Catalog** | `docs/catalog.md` | All commands grouped by verb-cluster (primary axis), with a per-command `profiles:` line listing which profiles surface it |
+| **Wizard** | `.agent-src.uncompressed/commands/onboard.md` | After role selection, prints the five-command starter list from the selected profile's `commands_hint` |
+The README and wizard surfaces are already wired. The catalog `profiles:`
+line is a Phase 2 deliverable.
+## What this contract does **not** do
+- **Does not** rename any command. Invocation stays flat (`/work`, not
+  `/dev/work`). The `/dev/...` / `/ops/...` strawman in the Item 6
+  roadmap entry is **rejected** — adding a profile prefix to invocation
+  would dual-namespace the surface, conflict with verb-cluster cluster
+  heads, and require a 124-command migration with no measurable
+  discoverability gain over the README + wizard surfaces above.
+- **Does not** modify the verb-cluster contract. `command-clusters`
+  remains the locked source of truth for invocation. This contract is
+  additive.
+- **Does not** ship telemetry. The top-10 is derived from declared
+  profile membership, not observed usage. A usage-based top-10
+  recomputation is deferred to Item 10 (Cost Governance Dashboard),
+  which already collects per-command call counts.
+## Open questions (post-beta)
+1. **Profile evolution.** When a seventh profile lands (e.g.
+   `researcher`), what is the membership review process for the
+   top-10? Proposal: any new profile triggers a `regen_top10.py` run
+   and a CHANGELOG entry; no manual review unless the top-10 order
+   changes.
+2. **Profile-prefix invocation.** If the no-rename verdict is
+   revisited (e.g. user research shows discoverability still fails
+   even with the README + wizard surfaces), a separate ADR records
+   the decision; this contract does not pre-authorize it.
+3. **Catalog generator.** `docs/catalog.md` is currently
+   handwritten. The `profiles:` line proposed in the discoverability
+   table requires `scripts/regen_catalog.py` to consume profile YAMLs
+   — deferred to its own roadmap step.
+## See also
+- [`command-clusters`](command-clusters.md) — verb-axis (invocation)
+- [`profile-system`](profile-system.md) — profile-axis (discoverability)
+- [`command-surface-tiers`](command-surface-tiers.md) — tier-axis (`./agent-config --help` visibility)
+- `step-15-product-refinement` § Phase 2 Item 6

package/docs/contracts/compression-default-kill-criterion.md ADDED

@@ -0,0 +1,69 @@
+---
+stability: beta
+keep-beta-until: 2026-08-14
+---
+# Compression default — kill-criterion
+> **Status:** parked, criterion-deferred · **Owner:** `step-4-measurement-and-benchmark.md`
+> closeout phase · **Source:** [`council-synthesis.md` § 7](../../agents/audit-2026-05-14-north-star/council-synthesis.md)
+## Rule
+```
+DEFAULT STAYS OFF UNTIL `task bench` PRODUCES A NUMBER.
+DECISION OWNED BY step-4 CLOSEOUT, NOT BY THIS DOC OR BY step-99.
+```
+1. **Current state.** `caveman.speak_scope` defaults `off`. Carve-outs
+   (security · destructive · multi-step · code blocks · paths · numbered
+   options · Iron-Law markers) are documented in
+   [`caveman-speak`](../../.agent-src.uncompressed/rules/caveman-speak.md)
+   but the feature is non-promoted: no skill recommends turning it on,
+   no preset enables it, no profile depends on it.
+2. **Baseline window.** 60 days from the first green run of
+   `task bench` against the locked 25-prompt corpus
+   (`step-4-measurement-and-benchmark.md`
+   Phase 2). The corpus, the model, and the cost-tracker are frozen
+   for the window; mid-window changes restart the clock.
+3. **Decision points.** After the window closes, `step-4` closeout
+   reads `docs/parity/bench.json` and applies exactly one of:
+   | Measured tokens saved | Quality regression on corpus | Verdict |
+   |---|---|---|
+   | < 30 % | any | **Deprecate** — remove `caveman-speak` rule, archive `caveman-compress` script, retire `caveman.*` settings keys with a one-release deprecation window |
+   | ≥ 30 % | < 5 % | **Flip default on** — `caveman.speak_scope` defaults to a non-`off` value, carve-outs stay, statusline surfaces lifetime tokens saved |
+   | ≥ 30 % | ≥ 5 % | **Hold** — repeat the window once with tuned intensity ladder; second hold → deprecate |
+   "Quality regression" = host-side rubric on the corpus per
+   `step-4-measurement-and-benchmark.md` Phase 3. Numbers checked into
+   `docs/parity/bench.json` as the decision artefact.
+4. **No interim flip.** The default does not move on anecdote,
+   gut feeling, or a single benchmark snapshot. The 60-day window and
+   the table above are the only path to a default change.
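
The decision table in point 3 reduces to a small pure function. The sketch below is illustrative only: the function name, the verdict strings, and the `prior_holds` parameter are hypothetical, and it assumes both inputs are percentages already read from `docs/parity/bench.json`.

```python
def compression_verdict(tokens_saved_pct, quality_regression_pct, prior_holds=0):
    """Apply the kill-criterion table to one closed baseline window.

    prior_holds is the number of earlier windows that ended in "hold".
    """
    if tokens_saved_pct < 30:
        # Row 1: savings too small, quality regression is irrelevant.
        return "deprecate"
    if quality_regression_pct < 5:
        # Row 2: enough savings, acceptable quality.
        return "flip-default-on"
    # Row 3: enough savings but too much regression. The window is
    # repeated once; a second hold becomes a deprecation.
    return "hold" if prior_holds == 0 else "deprecate"
```

For example, 35 % saved with 6 % regression yields `hold` on the first window and `deprecate` if the repeated window lands in the same row.
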
+## Why this is parked, not decided
+The council split (Opus = remove now, o1 = measure-then-decide) is
+real. Either branch is wrong-shaped without numbers. The kill-criterion
+gives the audit a deterministic resolution path and stops every
+downstream roadmap from re-litigating compression on every PR.
+## Cross-references
+- `step-99-north-star-restructure.md` § Phase 4
+  — parks this criterion, does not decide.
+- `step-4-measurement-and-benchmark.md`
+  — owns `task bench`, the corpus, and the closeout that applies the
+  table above.
+- `step-10-caveman-parity.md`
+  — implements the carve-outs and the statusline integration the
+  "flip default on" branch depends on; blocks the default flip until
+  acceptance is green.
+- [`caveman-speak`](../../.agent-src.uncompressed/rules/caveman-speak.md)
+  — runtime rule; reads `caveman.speak_scope` from settings.
+## Done
+This doc exists to keep the decision visible. It is **not** an action
+item. `step-4` closeout closes the loop.

package/docs/contracts/config-presets.md ADDED

@@ -0,0 +1,144 @@
+---
+stability: beta
+keep-beta-until: 2026-08-14
+---
+# Config Presets — Contract
+> **Status:** beta · **Owner:** package maintainer · **Last reviewed:** 2026-05-16
+>
+> Schema and semantics for the **Config Preset** axis introduced in
+> step-15 Phase 1 item 4. Records the **Cost Enforcement** model
+> (Council v3 action #3 prerequisite) so the preset loader can ship.
+> Boundary against `profile.id`, `pack.id`, and `cost_profile`:
+> [`ADR-010`](../decisions/ADR-010-profile-pack-preset-boundary.md).
+## Decision
+A **preset** owns governance knobs that the user wants to tune as a
+bundle, not individually. Three seed presets ship; users can declare
+their own under `.agent-src.uncompressed/presets/<id>.yml`.
+| `preset.id` | Stance | Typical user |
+|---|---|---|
+| `fast` | Lowest friction; widest autonomy; loosest cost caps | Solo founder, throw-away prototype, exploration |
+| **`balanced`** *(default)* | Moderate friction; per-task autonomy; sensible cost caps | Day-to-day work; default for any new install |
+| `strict` | Highest friction; ask-by-default; tight cost caps; block-on-risk | Production paths, regulated work, shared trunks |
+Profile-aware overlay: `developer + strict` ≠ `founder + strict` — the
+profile selects which knob in the preset is read first (e.g. `developer`
+reads `block_on_risk.code_paths`, `founder` reads `block_on_risk.financial_paths`).
+## Preset shape
+```yaml
+preset:
+  id: balanced
+  autonomy:
+    default: auto              # on | off | auto (see autonomous-execution rule)
+    trivial_suppress: true
+  confidence:
+    min_band: medium           # low | medium | high — block plan if below
+    require_evidence: false
+  risk:
+    block_on: [security, prod_data]
+    ask_on: [bulk_delete, schema_change]
+  council:
+    auto_consult: false
+    cap_per_consult_usd: 0.50
+  mcp:
+    per_call_max_usd: 0.10
+    per_session_max_usd: 2.00
+  cost:
+    daily_max_usd: 10.00
+    weekly_max_usd: 50.00
+    monthly_max_usd: 150.00
+    enforce: hybrid            # see Cost Enforcement section
+  notifications:
+    threshold_pct: [50, 75, 90, 100]
+```
+## Cost Enforcement
+*Hybrid model* — recorded as the Phase 1 prerequisite per Council v3
+action #3. Two enforcement surfaces, one decision per call.
+### Hard enforcement (preset loader, blocking)
+The preset loader **refuses to dispatch** any council or MCP call whose
+*estimated* cost exceeds the active preset's per-call ceiling. The
+estimate is read from the model adapter (`council_cli.py estimate` for
+council; the MCP tool manifest for MCP). The block is raised **before**
+the network call. There is no override flag — the user must change the
+preset, override `cost.per_call_max_usd` in `.agent-settings.yml`, or
+pass `--preset=fast` on the CLI.
+```
+PRE-CALL CEILING IS HARD.
+NO RUNTIME OVERRIDE. NO "JUST THIS ONCE" FLAG.
+EXCEED → REFUSE → SURFACE THE CEILING + THE OVERRIDE PATH.
+```
+Applies to:
+- AI Council consults (`scripts/council_cli.py run`).
+- MCP tool calls dispatched through the universal dispatcher
+  ([`hook-architecture-v1`](hook-architecture-v1.md)).
+- Any future skill that reads `preset.cost.per_call_max_usd`.
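
The pre-call check described in this section could look roughly like the sketch below. The exception class, the function name, and the message wording are hypothetical; the only behaviour taken from the contract is that the comparison happens before dispatch, that there is no runtime override argument, and that the refusal surfaces the ceiling and the override path.

```python
class CeilingExceeded(RuntimeError):
    """Raised before dispatch when the estimate is above the ceiling."""


def check_pre_call_ceiling(estimated_usd, ceiling_usd, knob):
    """Refuse a paid call whose estimated cost exceeds the preset ceiling.

    knob is the dotted name of the preset key that supplied ceiling_usd,
    for example "mcp.per_call_max_usd". There is deliberately no
    "force" or "just this once" parameter.
    """
    if estimated_usd > ceiling_usd:
        raise CeilingExceeded(
            f"estimated ${estimated_usd:.2f} exceeds {knob}=${ceiling_usd:.2f}; "
            "change the preset or override the knob in .agent-settings.yml"
        )
    return estimated_usd
```

An estimate equal to the ceiling passes in this sketch, on the reading that "exceeds" means strictly greater.
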
+### Advisory dashboard (retroactive, non-blocking)
+`agent-config cost` (Phase 2 item 10) surfaces daily / weekly / monthly
+spend against the active preset's caps. The dashboard **does not**
+block — it warns at the thresholds in `preset.notifications.threshold_pct`
+(default `50 / 75 / 90 / 100`). At 100 %, the dashboard prints a hard
+warning; the next session start re-checks the cap against the running
+total before dispatching the next paid call.
+The advisory layer's role is **awareness**, not enforcement. Enforcement
+is exclusively the per-call ceiling above; retroactive blocking would
+turn a session unrecoverably hostile mid-task.
+### What the loader does **not** do
+- It does **not** estimate cost for unpaid local model calls
+  (`ollama`, local llama.cpp). These bypass both surfaces.
+- It does **not** estimate cost for non-LLM tool calls (file reads,
+  shell commands, MCP-static-resource fetches). The per-call ceiling
+  targets paid token spend.
+- It does **not** override the Hard Floor in
+  [`non-destructive-by-default`](../../.agent-src/rules/non-destructive-by-default.md)
+  — a preset cannot lift the universal safety floor.
+## Resolution chain
+Reads happen in this order; last writer wins for any single knob:
+1. `pack.preset_id` (if pack active) → set `preset.id`.
+2. `profile.preset_id` → set `preset.id` (if not already set by pack).
+3. `preset.<id>.yml` → fill all knobs.
+4. `.agent-settings.yml` user keys under `preset:` → override per-knob.
+5. Environment variables (`AGENT_CONFIG_PRESET_COST_DAILY_MAX_USD=…`)
+   → override per-knob.
+6. Runtime CLI flags (`--preset-cost-per-call-max-usd=…`) → override
+   per-knob, single session.
+Per [`ADR-010`](../decisions/ADR-010-profile-pack-preset-boundary.md),
+no other axis may write preset-owned knobs.
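
The six-step chain above can be sketched as a layered merge. This is not the shipped `scripts/config/presets.py`; the function names are hypothetical, and the sketch assumes that environment variables and CLI flags have already been parsed into nested dicts shaped like the preset. The `default_id="balanced"` fallback is likewise an assumption.

```python
import copy


def deep_merge(base, override):
    """Merge override into base in place; nested dicts merge key by key."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_merge(base[key], value)
        else:
            base[key] = value
    return base


def resolve_preset(seed_presets, pack=None, profile=None,
                   settings=None, env=None, cli=None, default_id="balanced"):
    """Return (preset_id, resolved_knobs) following the resolution chain."""
    # Steps 1-2: the pack's preset_id wins over the profile's.
    preset_id = ((pack or {}).get("preset_id")
                 or (profile or {}).get("preset_id")
                 or default_id)
    # Step 3: the preset file fills every knob.
    resolved = copy.deepcopy(seed_presets[preset_id])
    # Steps 4-6: settings, then env, then CLI; the last writer wins per knob.
    for layer in (settings, env, cli):
        deep_merge(resolved, layer or {})
    return preset_id, resolved
```

Copying the seed preset before merging keeps the loaded presets unchanged, so a single-session CLI override cannot leak into later resolutions.
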
+## Drift detection
+`task lint-config-schema` (added in Phase 1) hard-fails when:
+- A pack YAML or profile YAML names a preset-owned knob.
+- A preset YAML names a knob outside this contract.
+- The three seed presets diverge from the documented stances above.
+## Non-goals
+- This contract does **not** define profiles, packs, or `cost_profile`.
+  See the corresponding contracts.
+- It does **not** ship a UI. CLI-first (`agent-config cost`,
+  `agent-config preset set <id>`).
+- It does **not** auto-migrate existing installs. Without a preset,
+  the loader falls back to current per-knob defaults (`balanced`-equivalent).

package/docs/contracts/cost-dashboard.md ADDED

@@ -0,0 +1,143 @@
+---
+stability: beta
+keep-beta-until: 2026-08-12
+---
+# Cost governance dashboard
+> **Status:** beta — first draft 2026-05-16 (Phase 2 Item 10 of
+> `step-15-product-refinement`).
+>
+> **Related:** [`config-presets`](config-presets.md) (caps schema) ·
+> [`cost-profile-defaults`](cost-profile-defaults.md) (default
+> selection) · `scripts/cost/budget.mjs` (existing local-store
+> primitive) · `scripts/cost/track.mjs` (session ingest).
+The `agent-config cost` subcommand surfaces accumulated spend against
+the active preset's caps. Read-only, CLI-first, no UI. Wraps the
+existing `scripts/cost/*.mjs` primitives behind a single discoverable
+verb so a user can ask "where am I against my budget?" without knowing
+the storage layout.
+## Surface
+```
+agent-config cost                       # default: status (this period's spend)
+agent-config cost status [--json]       # spend vs caps for daily/weekly/monthly
+agent-config cost ingest                # pull latest session.jsonl → local store
+agent-config cost history [--period=today|week|month] [--limit=N]
+agent-config cost reset --confirm       # truncate sessions.jsonl + budget.json
+```
+All subcommands are **read-only by default**. `ingest` writes only to
+`agents/cost-tracking/sessions.jsonl`. `reset` is destructive and
+gated by `--confirm` (Hard-Floor per
+[`non-destructive-by-default`](../../.agent-src/rules/non-destructive-by-default.md)).
+## `cost status` — output contract
+Human format:
+```
+Cost (preset: balanced · profile: developer)
+Period       Spent      Cap        Remaining   %   Status
+today        $2.43      $10.00     $7.57       24%  ✅
+week         $14.20     $40.00     $25.80      36%  ✅
+month        $52.10     $150.00    $97.90      35%  ✅
+MCP calls:     12 today · 47 this week · 188 this month
+Council calls:  1 today ·  3 this week ·  11 this month
+Next threshold notification at 75% (week: $30.00).
+```
+`--json` output schema:
+```json
+{
+  "preset": "balanced",
+  "profile": "developer",
+  "periods": {
+    "today":   {"spent_usd": 2.43,  "cap_usd": 10.00,  "remaining_usd": 7.57,  "pct": 0.243, "status": "ok"},
+    "week":    {"spent_usd": 14.20, "cap_usd": 40.00,  "remaining_usd": 25.80, "pct": 0.355, "status": "ok"},
+    "month":   {"spent_usd": 52.10, "cap_usd": 150.00, "remaining_usd": 97.90, "pct": 0.347, "status": "ok"}
+  },
+  "calls": {
+    "mcp":     {"today": 12, "week": 47, "month": 188},
+    "council": {"today": 1,  "week": 3,  "month": 11}
+  },
+  "next_threshold": {"period": "week", "pct": 0.75, "trigger_usd": 30.00}
+}
+```
+### Status field
+| Value | Trigger | Exit code |
+|---|---|---|
+| `ok` | `pct < 0.75` | 0 |
+| `warn` | `0.75 ≤ pct < 1.0` | 0 |
+| `over` | `pct ≥ 1.0` | 1 |
+Overall exit = worst-of across the three periods. `--json` always
+emits the full object regardless of exit.
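
The status table and the worst-of exit rule can be sketched as below. The function names are hypothetical, and `periods` is assumed to have the shape of the `periods` object in the `--json` schema above.

```python
EXIT_CODES = {"ok": 0, "warn": 0, "over": 1}


def status_for(pct):
    """Map a spent/cap fraction to the status value from the table."""
    if pct >= 1.0:
        return "over"
    if pct >= 0.75:
        return "warn"
    return "ok"


def overall_exit(periods):
    """Worst-of exit code across all reported periods."""
    return max(EXIT_CODES[status_for(p["pct"])] for p in periods.values())


if __name__ == "__main__":
    sample = {
        "today": {"pct": 0.243},
        "week": {"pct": 0.355},
        "month": {"pct": 0.347},
    }
    print(overall_exit(sample))
```

Since `warn` and `ok` both map to exit code 0, only a period at or above its cap changes the process exit status.
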
+## Data sources
+| Field | Source |
+|---|---|
+| `preset` | Active preset id from [`config-presets`](config-presets.md) resolution chain. |
+| `cap_usd` | `preset.cost.{daily,weekly,monthly}_max_usd`. |
+| `spent_usd` | Sum of `cost_usd` field over `agents/cost-tracking/sessions.jsonl` records inside the period window. |
+| `calls.mcp.*` | Sum of `mcp_calls` field in the same records. |
+| `calls.council.*` | Count of records whose `kind` is `council`. |
+| `next_threshold` | Smallest `(period, pct ∈ preset.notifications.threshold_pct)` tuple where `spent_usd < pct × cap_usd`. |
+When the active preset declares no `cost.*` cap (legacy installs),
+`cap_usd` is reported as `null` and `status` is `ok`. The tool does
+**not** invent a default cap.
+## Enforcement vs surfacing
+`agent-config cost` is **read-only**. Enforcement (refuse a council
+or MCP call that would push spend over a cap) lives at the call site
+per the active preset's `cost.enforce` setting (`off`, `advisory`,
+`hybrid`, `hard`). This contract does not change enforcement; it only
+makes the existing local-store data discoverable.
+## Refresh model
+`sessions.jsonl` is appended to by the Claude Code session hooks
+(see `scripts/cost/track.mjs`). `cost status` reads what's there;
+`cost ingest` triggers a one-shot pull from `~/.claude/projects/`.
+Users running a non-Claude-Code agent surface call `cost ingest`
+manually after a session; users on Claude Code with hooks installed
+never need to.
+## Validation
+`scripts/lint_cost_dashboard.py` (Phase 2 deliverable — not yet
+shipped) fails CI on:
+- Schema drift in `sessions.jsonl` (missing required fields).
+- Preset declaring `cost.*` caps that disagree with this contract's
+  expected period grid.
+- `cost status --json` output diverging from the schema above.
+## What this contract does **not** do
+- **Does not** ship a UI. CLI-first, by design.
+- **Does not** introduce per-skill or per-command cost attribution
+  beyond `kind` (`council` vs other). Per-skill attribution is a
+  Phase 3 candidate.
+- **Does not** override per-call hard caps from the preset.
+- **Does not** roll up across multiple projects. Each project's
+  `agents/cost-tracking/` is its own scope.
+## See also
+- [`config-presets`](config-presets.md) — preset caps + `enforce` semantics
+- [`cost-profile-defaults`](cost-profile-defaults.md) — default preset selection
+- [`safety-model`](safety-model.md) — `mcp_call_costly` domain
+- `scripts/cost/budget.mjs`, `scripts/cost/track.mjs` — wrapped primitives
+- `step-15-product-refinement` § Phase 2 Item 10

package/docs/contracts/cost-enforcement.md ADDED

@@ -0,0 +1,134 @@
+---
+stability: stable
+---
+# Cost Enforcement Contract
+> Status: stable · Owner: `step-11-measurement-governance-parity` · Last reviewed: 2026-05-16
+How USD budgets read from `.agent-settings.yml` interact with the
+session-cost ledger (`agents/cost-tracking/sessions.jsonl`) and the
+budget evaluator (`scripts/cost/budget.mjs`).
+## Surface
+Two data files and two scripts. The settings file declares the budget; the
+ledger file accumulates spend. The evaluator joins them and emits a tier; the
+preflight hook turns that tier into an exit code.
+| File | Role |
+|---|---|
+| `.agent-settings.yml § cost` | Declarative: budgets per period + enforcement mode. |
+| `agents/cost-tracking/sessions.jsonl` | Append-only: per-session cost records (model, tokens, USD). |
+| `scripts/cost/budget.mjs` | Evaluator: joins both, emits `{ level, utilization_pct, enforcement, source }`. |
+| `scripts/cost/preflight.mjs` | Hard-stop hook: wraps `budget.mjs check` and exits non-zero at HARD_STOP when `enforcement: hard-stop`. |
+## Settings schema
+```yaml
+cost:
+  budgets:
+    daily: 0     # USD ceiling for rolling 24h. 0 = unbudgeted.
+    weekly: 0    # USD ceiling for rolling 7d.  0 = unbudgeted.
+    monthly: 0   # USD ceiling for rolling 30d. 0 = unbudgeted.
+  enforcement: advisory   # advisory | hard-stop
+```
+- `0` (or absent) on any period = that period is not enforced. The
+  evaluator falls back to a longer-period budget when checking shorter
+  periods, never the other way around.
+- `enforcement: advisory` is the default. Dashboards surface the
+  breach; the agent keeps working.
+- `enforcement: hard-stop` is opt-in. `scripts/cost/preflight.mjs`
+  exits non-zero at the HARD_STOP tier; wrapping shells / CI / `task`
+  bindings must check this before composing a turn.
+## Tier ladder (5-stage)
+| Utilization | Level | Emoji | Threshold-pct |
+|---:|---|:---:|---:|
+| `< 50 %` | `OK` | 🟢 | 0 |
+| `50–74 %` | `INFO` | 🟡 | 50 |
+| `75–89 %` | `WARNING` | 🟠 | 75 |
+| `90–99 %` | `CRITICAL` | 🔴 | 90 |
+| `≥ 100 %` | `HARD_STOP` | 🛑 | 100 |
+The legacy 4-stage draft (`under / 50 / 75 / 90 / 100`) folded `OK`
+into `under`. Parity-doc Phase 6 maps both forms verbatim.
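
The five-stage ladder maps a utilization percentage to a level and the threshold that was crossed. The sketch below is not the `budget.mjs` implementation, which is JavaScript; the names are hypothetical, and it assumes fractional values such as 74.5 % belong to the lower tier until the next threshold is reached.

```python
# (threshold_pct, level), checked from the highest threshold down.
TIERS = [
    (100, "HARD_STOP"),
    (90, "CRITICAL"),
    (75, "WARNING"),
    (50, "INFO"),
    (0, "OK"),
]


def tier_for(utilization_pct):
    """Return (level, threshold_pct) for a utilization percentage."""
    for threshold, level in TIERS:
        if utilization_pct >= threshold:
            return level, threshold
    # Negative utilization should not occur; treat it as OK.
    return "OK", 0
```
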
+## Hook surface
+`scripts/cost/preflight.mjs` is the **single** turn-start surface.
+It wraps `budget.mjs check` and:
+1. Reads `cost.enforcement` from `.agent-settings.yml`.
+2. If `advisory` → always exits `0`, prints the tier as advisory text.
+3. If `hard-stop` and level is `HARD_STOP` → prints a refusal block
+   citing this contract and exits `1`.
+4. If no budget is configured at all → exits `0` (fail-open). Never
+   blocks unbudgeted work.
+The hook does **not** rewrite or block individual tool calls. It is a
+process-entry gate, intended to be invoked by:
+- `task ci`, `task work:*`, `task roadmap:*` wrappers.
+- The `/onboard` boot path (`scripts/install.py`-side guidance only).
+- Manual `node scripts/cost/preflight.mjs` for shell wrappers.
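
The four numbered rules of the hook reduce to one exit-code decision. The sketch below only illustrates that logic; the real hook is `scripts/cost/preflight.mjs`, and the function name and parameters here are hypothetical.

```python
def preflight_exit_code(enforcement, level, has_budget):
    """Decide the process exit code at turn start.

    enforcement: "advisory" or "hard-stop" (from cost.enforcement).
    level: tier name emitted by the budget evaluator, e.g. "HARD_STOP".
    has_budget: False when no period has a non-zero budget.
    """
    if not has_budget:
        return 0  # fail-open: unbudgeted work is never blocked
    if enforcement == "hard-stop" and level == "HARD_STOP":
        return 1  # refusal: caller prints the refusal block
    return 0  # advisory mode, or a tier below HARD_STOP
```

A wrapper would check this code before composing a turn and stop only on a non-zero result.
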
+## Bypass
+User-facing bypass mechanisms (documented for the refusal block):
+- Raise the budget: edit `.agent-settings.yml § cost.budgets.<period>`.
+- Reset the ledger (drops historical spend from the calculation):
+  `node scripts/cost/track.mjs reset --confirm`.
+- Disable enforcement: set `cost.enforcement: advisory`.
+No environment-variable override. Bypass must be an explicit edit so
+the change is durable and auditable.
+## Default behaviour without a budget
+When `cost.budgets.{daily,weekly,monthly}` are all `0`:
+- `budget.mjs check` reports cumulative spend, no tier (returns the
+  no-budget JSON shape).
+- `preflight.mjs` exits `0`. Never blocks.
+- `agent-status` panel shows **only** the measured-spend USD figure;
+  the tier table is suppressed.
+## Source precedence
+`budget.mjs` reads budget config in this order:
+1. `.agent-settings.yml § cost` (when any value > 0).
+2. `agents/cost-tracking/budget.json` (legacy single-period JSON).
+3. None → no-budget output shape.
+The evaluator output carries `source: 'agent-settings.yml' | 'budget.json'`
+so dashboards can show where the figure came from.
+## Period mapping
+`BUDGET_PERIOD={today|week|month|all}` selects which budget value
+applies:
+| `BUDGET_PERIOD` | Settings key |
+|---|---|
+| `today` | `cost.budgets.daily` |
+| `week` | `cost.budgets.weekly` |
+| `month` | `cost.budgets.monthly` |
+| `all` (default) | First non-zero of `monthly → weekly → daily`. |
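
The period mapping, including the `all` fallback order, can be sketched as a lookup. The function name is hypothetical, and `budgets` is assumed to be the parsed `cost.budgets` mapping from `.agent-settings.yml`, with `0` or a missing key meaning unbudgeted.

```python
PERIOD_KEYS = {"today": "daily", "week": "weekly", "month": "monthly"}


def budget_for(period, budgets):
    """Return the USD budget that applies to BUDGET_PERIOD, or 0 if none."""
    if period == "all":
        # First non-zero of monthly, then weekly, then daily.
        for key in ("monthly", "weekly", "daily"):
            if budgets.get(key, 0) > 0:
                return budgets[key]
        return 0
    return budgets.get(PERIOD_KEYS[period], 0)
```

A return value of `0` corresponds to the no-budget case, in which the evaluator reports spend without a tier.
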
+## Acceptance fixtures
+`tests/fixtures/cost/budget/` carries five reference fixtures:
+`under-50`, `mid-75`, `high-90`, `at-100`, `over-100`. Each fixture
+ships a `sessions.jsonl` slice + an expected JSON output. The fixture
+suite is wired to `task test-cost-budget` per `step-11` Phase 2 Step 5.
+## See also
+- `step-11-ruflo-parity` — Measurement & Governance Parity roadmap.
+- `docs/contracts/cost-dashboard.md` — companion dashboard contract.
+- `scripts/cost/budget.mjs` — evaluator implementation.
+- `bench/pricing.yaml` — per-model USD pricing table.

package/docs/contracts/file-ownership-matrix.json CHANGED

@@ -6928,13 +6928,6 @@
       "via": "body_link",
       "depth": 1
     },
-    {
-      "source": ".agent-src.uncompressed/rules/no-cheap-questions.md",
-      "target": ".agent-src.uncompressed/contexts/contracts/frugality-charter.md",
-      "type": "READ_ONLY",
-      "via": "body_link",
-      "depth": 1
-    },
     {
       "source": ".agent-src.uncompressed/rules/no-cheap-questions.md",
       "target": ".agent-src.uncompressed/rules/ask-when-uncertain.md",