npm - @event4u/agent-config - Versions diffs - 1.17.0 → 1.18.0 - Mend

@event4u/agent-config 1.17.0 → 1.18.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (50) hide show

package/.agent-src/rules/context-hygiene.md CHANGED Viewed

@@ -7,6 +7,12 @@ source: package
 # Context Hygiene
+> **Enforced by:** [`scripts/context_hygiene_hook.py`](../../scripts/context_hygiene_hook.py)
+> on Augment + Claude Code (`PostToolUse`). The hook maintains
+> `agents/state/context-hygiene.json` (turn count, loop signal,
+> freshness milestones at 20/40/60); the prose below is the spec the
+> hook implements and the agent-side fallback.
 ## Conversation Freshness
 Monitor for **context decay** — long conversations degrade quality and waste tokens.

package/.agent-src/rules/direct-answers.md CHANGED Viewed

@@ -22,8 +22,7 @@ ANSWER THE SUBSTANCE. SHIP THE TRUTH.
 - No subjective judgments on user's code/decisions ("nice approach", "boring") unless evaluation was asked.
 - "Good catch" / "you're right" only when literally true and load-bearing.
 - Acknowledge mistakes without performative apologies — one sentence, switch behavior.
-Failure mode: praise as a hedge before bad news. Drop the cushion, deliver the news.
+- Failure mode — praise hedging bad news; drop the cushion, deliver the news.
 ## Iron Law 2 — No Invented Facts (severity-tiered)
@@ -39,11 +38,10 @@ WHEN VERIFICATION IS NOT WORTH THE COST → ASK.
 | **Medium — project-shape** ("uses X for Y", conventions, file location) | Verify if one tool call reaches it; otherwise hedge: *"I'd guess X — not checked"*. |
 | **Low — well-known idioms** (generic language/framework idioms, public APIs) | Inference acceptable. Mark as inference if not 100% sure. |
-Concrete examples and hedge-language patterns:
+Examples + hedge-language patterns:
 [`asking-and-brevity-examples`](../../docs/guidelines/agent-infra/asking-and-brevity-examples.md#direct-answers--severity-tiered-claim-examples).
-User override: "just guess", "rough estimate is fine", "skip the
-verify" → drop to Low for that turn.
+Override: "just guess", "rough estimate is fine", "skip the verify"
+→ drop to Low for that turn.
 ## Iron Law 3 — Brevity by Default
@@ -60,33 +58,26 @@ LONG ANSWERS ARE A FAILURE MODE, NOT A SIGN OF EFFORT.
 - One-true-answer question → one sentence + the answer.
 `token-efficiency` covers the loop side; this rule covers per-reply
-length. **Never overrides** `user-interaction` (numbered options
-stay) or command-mandated steps.
+length. **Never overrides** `user-interaction` (numbered options stay)
+or command-mandated steps.
 ## Emoji Scope — functional markers only
-**Whitelist (allowed):**
-- Mandated markers: `📒` (chat-history heartbeat, verbatim per
-  `chat-history-visibility`), mode markers from `role-mode-adherence`.
-- CLI status: `❌`, `✅`, `⚠️` (two-space rule from `language-and-tone`).
-- Roadmap checkboxes: `[x]`, `[~]`, `[-]`.
+**Whitelist:** `📒` (chat-history heartbeat, verbatim per
+`chat-history-visibility`); mode markers from `role-mode-adherence`;
+CLI status `❌` / `✅` / `⚠️` (two-space rule from `language-and-tone`);
+roadmap checkboxes `[x]` / `[~]` / `[-]`.
-**Blacklist (never in prose):**
+**Blacklist (never in prose):** opening flair (✨, 🚀, 🎉, 💡, 🔥, 👍);
+empathy (❤️, 🤗, 😊); section dividers, headline ornaments, reaction
+emojis. Unsure → assume blacklist.
-- Opening flair: ✨, 🚀, 🎉, 💡, 🔥, 👍.
-- Empathy: ❤️, 🤗, 😊.
-- Section dividers, headline ornaments, reaction emojis.
+## Failure modes
-Unsure → assume blacklist.
-## Failure modes the user will call out
-Trigger phrases per Iron Law and the in-language correction pattern:
+Trigger phrases + in-language correction pattern:
 [`asking-and-brevity-examples`](../../docs/guidelines/agent-infra/asking-and-brevity-examples.md#direct-answers--failure-modes-the-user-will-call-out).
-When called out: acknowledge once in the user's language, switch,
-no excuses (mirrors `language-and-tone` § slip handling).
+On call-out: acknowledge once in the user's language, switch, no
+excuses (mirrors `language-and-tone` § slip handling).
 ## Interactions

package/.agent-src/rules/no-cheap-questions.md CHANGED Viewed

@@ -9,13 +9,10 @@ source: package
 A question is **cheap** when the answer follows from stated context,
 an option breaches an Iron Law, choices differ only in sequencing /
-format, or one option is obviously dominant. Cheap questions are
-noise, regardless of `personal.autonomy`.
-Mode-independent. [`autonomous-execution`](autonomous-execution.md)'s
-"trivial" failure modes scope to `personal.autonomy: on` (or
-`auto`-after-opt-in); this rule lifts the **no-trade-off** subset to
-`off` and pre-opt-in `auto` too.
+format, or one option is obviously dominant. Mode-independent — holds
+in `off`, `auto`, and `on`; autonomy never lifts the no-trade-off
+floor (cf. [`autonomous-execution`](autonomous-execution.md), whose
+"trivial" failure modes only scope to `on` / opted-in `auto`).
 ## The Iron Laws
@@ -35,16 +32,12 @@ Hold in `off`, `auto`, and `on`. Autonomy never lifts them.
 - **CI / test asks** — [`verify-before-complete`](verify-before-complete.md) decides, not the user.
 - **Fenced-step re-asks** — "Start Phase 1?" after *"plan only"*; see
   [`scope-control § fenced step`](scope-control.md#fenced-step--user-set-review-gates).
-- **Iron-Law option** — option breaches `commit-policy`,
-  `scope-control § git-ops`, or `non-destructive-by-default`.
-- **Context-derived** — answer follows from prior turn / standing
-  instruction / roadmap; act, state the assumption inline.
-- **Dominant option** — one choice obviously correct; alternatives
-  carry no upside.
-- **Re-ask after decline** — forbidden per
-  [`scope-control § decline = silence`](scope-control.md#decline--silence--no-re-asking-on-the-same-task).
-Concrete examples per class:
+- **Iron-Law option** — breaches `commit-policy`, `scope-control` § git-ops, or `non-destructive-by-default`.
+- **Context-derived** — answer follows from prior turn / standing instruction / roadmap; act, state the assumption inline.
+- **Dominant option** — one choice obviously correct; alternatives carry no upside.
+- **Re-ask after decline** — forbidden per [`scope-control § decline = silence`](scope-control.md#decline--silence--no-re-asking-on-the-same-task).
+Examples per class:
 [`asking-and-brevity-examples`](../../docs/guidelines/agent-infra/asking-and-brevity-examples.md#cheap-question-class-catalog--extended-examples).
 ## Pre-Send Self-Check — MANDATORY before every question
@@ -52,11 +45,11 @@ Concrete examples per class:
 Run silently before any numbered-options block:
 1. Answer already in stated context?
-2. Any option violates [`commit-policy`](commit-policy.md), [`scope-control § git-ops`](scope-control.md), or [`non-destructive-by-default`](non-destructive-by-default.md)?
+2. Any option violates `commit-policy`, `scope-control` § git-ops, or `non-destructive-by-default`?
 3. Options pure sequencing / format, no trade-off?
 4. One option obviously dominant?
-5. User fenced next step (*"plan only"*, *"review first"*) → deliver + handback per [`scope-control § fenced step`](scope-control.md#fenced-step--user-set-review-gates).
-6. User already declined? Re-ask forbidden per [`scope-control § decline = silence`](scope-control.md#decline--silence--no-re-asking-on-the-same-task).
+5. User fenced next step (*"plan only"*, *"review first"*) → deliver + handback per `scope-control` § fenced step.
+6. User already declined? Re-ask forbidden per `scope-control` § decline = silence.
 Any "yes" → **do not ask**. Pick the dominant path, state assumption
 inline (*"assuming X — adjust if wrong"*), hand back. One-question-per-turn
@@ -68,7 +61,7 @@ the question is genuine.
 - Real architectural / scope decision with non-obvious trade-offs.
 - Vague-request trigger per [`ask-when-uncertain`](ask-when-uncertain.md#vague-request-triggers--must-ask).
 - Security-sensitive path per [`security-sensitive-stop`](security-sensitive-stop.md).
-- Hard Floor in [`non-destructive-by-default`](non-destructive-by-default.md) — confirmation mandatory.
+- Hard Floor per `non-destructive-by-default` — confirmation mandatory.
 - Two genuinely-equivalent paths; user preference is the tiebreaker.
 In doubt → ask. This rule narrows asking, never widens silence.

package/.agent-src/rules/onboarding-gate.md CHANGED Viewed

@@ -7,6 +7,12 @@ source: package
 # Onboarding Gate
+> **Enforced by:** [`scripts/onboarding_gate_hook.py`](../../scripts/onboarding_gate_hook.py)
+> on Augment + Claude Code (`SessionStart`). The hook refreshes
+> `agents/state/onboarding-gate.json` from `.agent-settings.yml`; the
+> prose below is the spec the hook implements and the fallback for
+> platforms without a hook surface.
 Forces a one-time `/onboard` run for each developer on each project. This
 replaces the previously scattered "ask once" patterns across `user_name`,
 `personal.ide`, `personal.rtk_installed`, and cost profile confirmation.
@@ -92,3 +98,4 @@ gate. This protects projects that were set up before this rule shipped.
 - [`layered-settings`](../../docs/guidelines/agent-infra/layered-settings.md) — merge rules for mid-life edits
 - [`agent-settings` template](../templates/agent-settings.md) — `onboarding.onboarded` reference
 - [`rule-type-governance`](rule-type-governance.md) — why this is `always`
+- [`hardening-pattern`](../../agents/contexts/hardening-pattern.md) — Tier 1 mechanical-rule contract

package/.agent-src/rules/roadmap-progress-sync.md CHANGED Viewed

@@ -12,6 +12,11 @@ load_context:
 # Roadmap Progress Sync
+> **Enforced by:** [`scripts/roadmap_progress_hook.py`](../../scripts/roadmap_progress_hook.py)
+> on Augment + Claude Code (`PostToolUse`). Hook is primary; the prose
+> below is the specification the hook implements and the fallback when
+> the platform has no hook surface.
 ## Iron Law — dashboard sync
 ```
@@ -90,6 +95,28 @@ roadmap left under `agents/roadmaps/` is a rule violation, not an
 optional cleanup. See `roadmap-management` skill for the archive vs
 skipped decision table.
+## Agent-authored roadmaps — placement is mandatory
+```
+A FILE THE AGENT DROPS INTO agents/roadmaps/ MUST EITHER
+(a) PASS check_roadmap_trackable.py AND LAND IN THE DASHBOARD, OR
+(b) NOT BE IN agents/roadmaps/ AT ALL.
+NO "DECISION MATRIX" / "DESIGN NOTE" SHORTCUT.
+```
+When the agent autonomously creates a roadmap, it owns the placement
+in the **same response**:
+- **Phase plan** (checkboxes, multi-turn execution) → `agents/roadmaps/<name>.md`, `status: ready` (default), regen dashboard.
+- **Decision matrix / ADR / pattern / lookup** (no `Phase N`, durable rationale) → `agents/contexts/<name>.md`.
+- **Completed work snapshot** → `agents/roadmaps/archive/<name>.md`.
+A non-trackable file in `agents/roadmaps/` is a rule violation — the
+trackable CI fails it, the dashboard hides it. The agent that
+created it moves it the same response. If the autonomous run also
+**finishes** the roadmap within the session (every box `[x]`/`[~]`/`[-]`),
+the completion-archival rule above fires too — same response.
 ## Autonomous execution — checkbox cadence
 When executing a roadmap autonomously (multi-turn, no per-step user

package/.agent-src/rules/rule-type-governance.md CHANGED Viewed

@@ -44,3 +44,31 @@ The `description` field IS the trigger. It must describe **when** the rule appli
 - Default to `auto`. Justify `always`.
 - If >50% of conversations don't need a rule → it must be `auto`.
 - `optimize-agents` command checks this and suggests changes.
+## Hardening tier — required on new or edited rules
+Every new rule, and every edited rule whose body changes the trigger
+or the obligation, MUST classify itself against the hardening tiers
+documented in [`rule-trigger-matrix.md`](../../agents/contexts/rule-trigger-matrix.md):
+| Tier | Meaning |
+|---|---|
+| `1` | Mechanically enforceable — hook acts, rule body stays minimal. |
+| `2a` | Marker nudge — hook injects signal, agent acts on it. |
+| `2b` | Structured injection / tool-call gate — hook reads/writes state, may deny. |
+| `3` | Soft, judgment-bound — no platform surface; self-check rule. |
+| `safety-floor` | Iron-Law subset, never modified. |
+| `mechanical-already` | Precedent — script enforces, rule body documents. |
+Classification surface: the optional `tier:` frontmatter field
+(declared in `scripts/schemas/rule.schema.json`). Recommended for new
+rules; bulk-retrofit of existing rules is tracked separately.
+Tier 3 dispositions are recorded centrally in
+[`agents/contexts/tier-3-dispositions.md`](../../agents/contexts/tier-3-dispositions.md)
+with a 6-month re-audit clock. New Tier 3 rules append to that list
+on landing.
+The `optimize-agents` command checks the tier alongside `type`/`source`
+and suggests escalations when a rule's trigger matches a hardening
+opportunity that has shipped since the rule was authored.

package/.agent-src/templates/roadmaps.md CHANGED Viewed

@@ -64,6 +64,10 @@ Check `AGENTS.md` or `Makefile` / `Taskfile.yml` for the exact commands.
 Copy the structure below into a new file:
 ```markdown
+---
+complexity: lightweight
+---
 # Roadmap: {Short descriptive title}
 > {One sentence: What is the expected outcome?}

package/.claude-plugin/marketplace.json CHANGED Viewed

@@ -6,7 +6,7 @@
   },
   "metadata": {
     "description": "Shared agent configuration \u2014 skills for AI coding tools (Claude Code, Augment, Cursor, Cline, Windsurf, Gemini CLI).",
-    "version": "1.17.0"
+    "version": "1.18.0"
   },
   "plugins": [
     {

package/CHANGELOG.md CHANGED Viewed

@@ -318,6 +318,41 @@ our recommendation order, not its support status.
   users" tension without removing any path that an existing user
   might rely on.
+## [1.18.0](https://github.com/event4u-app/agent-config/compare/1.17.0...1.18.0) (2026-05-04)
+### Features
+* **rules:** mandate hardening tier classification on new and edited rules ([42ff7c1](https://github.com/event4u-app/agent-config/commit/42ff7c1765e931a3e5e487ef83d01ca597a65800))
+* **hooks:** wire Tier 1 hooks on Claude Code for hardening parity ([55ede24](https://github.com/event4u-app/agent-config/commit/55ede24e65b5ab7e827fa3a40b04fe9dab091392))
+* **rules:** enforce placement for agent-authored roadmaps ([1624ede](https://github.com/event4u-app/agent-config/commit/1624ede571bcdd28adb7d8dd4868a92b9dbde646))
+* roadmap complexity standard with shape and tier linters ([bd1bac6](https://github.com/event4u-app/agent-config/commit/bd1bac650013e449e7318c60586b7648dbcc144e))
+* **hardening:** tier-1 hooks for onboarding-gate and context-hygiene ([5d107cd](https://github.com/event4u-app/agent-config/commit/5d107cd6fa1352d609491df12604ae0ddb7d7113))
+* **always-budget:** hard-compress direct-answers and no-cheap-questions ([2cb9b0b](https://github.com/event4u-app/agent-config/commit/2cb9b0be7b3e6486c3689fec75e7672507ca97cb))
+* outcome baselines and pattern-memory demos for foundational rules ([f43ede7](https://github.com/event4u-app/agent-config/commit/f43ede70a59cb7460557e6d76b041562056e78ee))
+### Bug Fixes
+* **check-refs:** treat .augment/state/*.json as runtime-only paths ([3d4c766](https://github.com/event4u-app/agent-config/commit/3d4c76695af38e764facc8630a97e553f5aac67f))
+* **rules:** sync compressed rules with hardening callouts ([89bd072](https://github.com/event4u-app/agent-config/commit/89bd07267d04d6346a1f36df92759c9208777a9a))
+### Documentation
+* **contexts,contracts:** unlink stable artifacts from archived roadmaps ([af4e5c2](https://github.com/event4u-app/agent-config/commit/af4e5c2de6dfb0d2143f48213e04d41c9354deca))
+* **contexts:** lock Tier 2 nudge surface, Tier 3 dispositions, platform parity ([10685a7](https://github.com/event4u-app/agent-config/commit/10685a7f3fceec1b93365cef3531a5e81f55396f))
+* **roadmaps:** relocate budget-v2-matrix to contexts as durable rationale ([33b903a](https://github.com/event4u-app/agent-config/commit/33b903a36feeacffddedde11ff9b1bc3bd5173e3))
+### Refactoring
+* **state:** move hook runtime state from .augment/state/ to agents/state/ ([ef5265e](https://github.com/event4u-app/agent-config/commit/ef5265e1ead67b089c4438efcbd4e94f120c6d3e))
+### Chores
+* **plugin:** restructure marketplace.json to registry shape ([f3e6f24](https://github.com/event4u-app/agent-config/commit/f3e6f2425fefef2cc5fc338d27e8628ab45a4d41))
+* **council:** archive Budget-v2 audit one-off ([00bf46e](https://github.com/event4u-app/agent-config/commit/00bf46e680e3a66bdbddac35e1ec3a08a8aa11f4))
+* **roadmaps:** archive hardening and context-layer-maturity tracks ([21dec26](https://github.com/event4u-app/agent-config/commit/21dec26b7a6455ab95df6704773426e3dd35574f))
+* regenerate dashboards, hashes, and roadmap progress ([06c50a9](https://github.com/event4u-app/agent-config/commit/06c50a9653bf4250a557ba2fe5a39b8609b3df30))
+* archive ai_council one-off scripts and add location guard ([63e6dbd](https://github.com/event4u-app/agent-config/commit/63e6dbdc49749e97621bcaeaa49f81beb5ffc98c))
 ## [1.17.0](https://github.com/event4u-app/agent-config/compare/1.16.0...1.17.0) (2026-05-04)
 ### Features

package/README.md CHANGED Viewed

@@ -7,7 +7,7 @@ Give your AI agents an audit-disciplined orchestration contract — testing, Git
 > Your agent picks up the project's stack, runs tests, prepares PRs, fixes CI — and follows your team's coding standards while doing it. Stack-aware skill sets ship for PHP (Laravel · Symfony · Zend/Laminas), JavaScript (Next.js · React · Node), and cross-stack concerns (API · testing · security · observability).
 <p align="center">
-  <strong>129 Skills</strong> · <strong>58 Rules</strong> · <strong>95 Commands</strong> · <strong>48 Guidelines</strong> · <strong>8 AI Tools</strong>
+  <strong>129 Skills</strong> · <strong>58 Rules</strong> · <strong>95 Commands</strong> · <strong>51 Guidelines</strong> · <strong>8 AI Tools</strong>
 </p>
 ---

package/docs/architecture.md CHANGED Viewed

@@ -65,7 +65,7 @@ fails on any source-side violation, without producing artifacts.
 | **Skills** | 129 | On-demand expertise — stack analysis (Laravel · Symfony · Zend / Laminas · Next.js · React · Node), testing, Docker, API design, security, observability, … |
 | **Rules** | 58 | Always-active constraints — coding standards, scope control, verification, language-and-tone, agent-authority |
 | **Commands** | 95 | Slash-command workflows — `/commit`, `/create-pr`, `/fix ci`, `/optimize skills`, `/feature plan`, `/work`, `/implement-ticket`, `/compress`, … |
-| **Guidelines** | 48 | Reference material cited by skills — PHP patterns, Eloquent, Playwright, agent-infra, … |
+| **Guidelines** | 51 | Reference material cited by skills — PHP patterns, Eloquent, Playwright, agent-infra, … |
 | **Templates** | 7 | Scaffolds for features, roadmaps, contexts, skills, overrides |
 | **Contexts** | 5 | Shared knowledge about the system itself |

package/docs/contracts/load-context-budget-model.md CHANGED Viewed

@@ -65,6 +65,86 @@ extraction pattern (Q1) and the **one-rule + three-contexts** consolidation
 (Q2); model (b) literal is the accounting that makes both patterns
 honest about their cost.
+## Load order (Q1) — file order in frontmatter
+Locked by Phase 1.2 of `road-to-context-layer-maturity` (council
+session `2026-05-03T17-56-21Z`, v2 lock).
+When a rule declares multiple `load_context:` and / or
+`load_context_eager:` entries, the agent processes them in the order
+they appear in the YAML list, top to bottom. `load_context_eager:`
+entries are concatenated into the active context in declaration
+order; `load_context:` entries are available for lazy retrieval in
+the same order, surfaced first-listed-first when the rule body cites
+them in prose.
+**Rejected alternatives:**
+- *Citation order in prose* — non-machine-readable; would force every
+  rule to embed a sort hint and the linter to parse rule body text.
+- *Priority field per entry* — adds frontmatter surface area without
+  observable benefit. The current rule with the most contexts
+  (`autonomous-execution`, 3 contexts) reads each in declaration
+  order without ambiguity.
+This contract treats list order as the canonical signal. Authors
+must order their `load_context:` entries from "load this first" to
+"load this last" so prose citations and frontmatter agree.
+## Per-rule context-count cap (Q2) — ≤ 3 contexts per rule
+Locked by Phase 1.3 of `road-to-context-layer-maturity` (council
+session `2026-05-03T17-56-21Z`, v2 lock). Enforced by
+`scripts/check_always_budget.py` as `MAX_CONTEXTS_PER_RULE = 3`.
+A rule's combined count of `load_context:` + `load_context_eager:`
+top-level entries must not exceed **3**. The cap is on *declared*
+entries (depth 1 from the rule), not transitive closure — depth-2
+context citations are governed by the depth-2 nesting cap below.
+**Rationale:**
+- Empirical max in the current rule set is 3
+  (`autonomous-execution`); the cap locks the ceiling without forcing
+  any rewrite.
+- A 4th declared context is the structural signal that the rule
+  should split (one rule, one obligation surface), not load more.
+- Cross-platform mechanical: a count check is O(N), no semantic
+  analysis, identical observable on Augment and Claude Code.
+The cap applies to **all rule types** (`always` and `auto`) — Q2 is
+a structural constraint on rule shape, not a budget concern.
+## Cross-rule sharing (Q3) — Model (b) literal locked
+Locked by Phase 1.4 of `road-to-context-layer-maturity` (decision
+3a). The accounting model in § The locked model above stands without
+shared-context discount.
+**Decision:** when context X is loaded by N rules, each rule pays
+the full `RawSize(X)` under Model (b). No `chars(X) / N` discount.
+**Rationale (3a over 3b):**
+- The linter is correct as-is — `RECOVERY_BAND_ENABLED` plus the
+  per-rule allowlist is already a working ratchet to drive total
+  utilization below 100 %.
+- 3b would require a linter rewrite plus a new failure-mode test
+  family (cross-rule attribution, divisor-stability invariants).
+  Phase 1.4a's 4-hour / 200-LOC feasibility cap was structured to
+  reject 3b on cost; with no current shared-context patterns
+  exceeding 1 loader per context (verified Phase 1.1 inventory), 3b
+  has no measurable upside.
+- Phase 4c (shared-context discount) becomes a no-op under 3a.
+  Phase 4 leverage shifts to 4a (demote), 4b (merge), and 4d
+  (hard-compress) — see `road-to-context-layer-maturity` Phase 4
+  inputs gate.
+**Reopener:** if a future inventory shows ≥ 3 shared-context loaders
+and total utilization re-enters the 95–100 % zone after Phase 4
+completes, run the Phase 1.4a feasibility spike and propose 3b via
+contract version bump.
 ## Nesting cap — depth 2
 A rule's `load_context:` may cite a context (depth 1). A context may

package/docs/contracts/load-context-schema.md CHANGED Viewed

@@ -85,6 +85,26 @@ Single-rule contexts that don't fit one of the canonical topics live
 directly under `contexts/` (see `contexts/model-recommendations.md`,
 `contexts/skills-and-commands.md`).
+## Per-rule context-count cap (Q2)
+A rule's combined count of `load_context:` + `load_context_eager:`
+top-level entries MUST be ≤ **3**. Enforced by
+`scripts/check_always_budget.py` (`MAX_CONTEXTS_PER_RULE`); see
+[`load-context-budget-model.md § Per-rule context-count cap (Q2)`](load-context-budget-model.md#per-rule-context-count-cap-q2--3-contexts-per-rule)
+for the locking contract.
+A 4th context is the structural signal that the rule should split,
+not load more. Authors hitting the cap should extract a sibling rule
+with its own obligation surface, not stretch the existing one.
+## Load order
+Entries load in **frontmatter list order**, top-to-bottom. Author
+your list in "first read" order so prose citations and frontmatter
+agree. See
+[`load-context-budget-model.md § Load order (Q1)`](load-context-budget-model.md#load-order-q1--file-order-in-frontmatter)
+for the locking rationale.
 ## Combined char-budget guard
 `load_context_eager:` triggers a budget check:

package/docs/contracts/roadmap-complexity-standard.md ADDED Viewed

@@ -0,0 +1,137 @@
+---
+stability: beta
+---
+# Roadmap Complexity Standard
+> **Audience:** roadmap authors and reviewers in `agents/roadmaps/`.
+> **Linter:** `scripts/lint_roadmap_complexity.py` (run via
+> `task lint-roadmap-complexity`).
+> **Source:** Phase 5 of the `road-to-context-layer-maturity` work
+> (now archived); reviewer-flagged drift after the
+> structural-optimization roadmap proved that "heavy" is correct for
+> structural work but wrong for normal feature work.
+Roadmaps drift toward heavyweight whenever the previous one was
+heavyweight. This contract pins **two tiers**, names exemplars, and
+hard-fails the lint on tier mismatch.
+## Tiers
+### Lightweight (default)
+Almost every roadmap is lightweight. The shape:
+- **≤ 6 phases** total
+- **≤ 1 page per phase** (≈ 60 source lines including header + steps
+  + exit gate; the linter doesn't enforce per-phase line budgets, but
+  reviewers do)
+- **No nested council debates** inside the roadmap (no
+  `## Council Round 1`, `## Council Round 2`, `### Verdict` sections)
+- **No 178-step backlogs** — phases are delivery-shaped, not
+  task-shaped
+- **≤ 600 lines total** (frontmatter + body, blank lines counted)
+- Frontmatter declares `complexity: lightweight`
+**Exemplars:**
+- `road-to-context-layer-maturity.md` (archived) — six phases,
+  ~376 lines, no nested council; the seed roadmap that triggered this
+  standard. Self-tagged as `lightweight`.
+- `road-to-rule-hardening.md` (archived) — five phases, ~263 lines;
+  mechanized the rule layer; sibling of the seed, also lightweight.
+**Typical use:** feature work, follow-ups, bounded refactors,
+mechanization passes, telemetry plumbing.
+### Structural (rare)
+Triggered only when the work changes a contract layer or a budget
+invariant. The shape:
+- Multi-round council, locked decisions, file-ownership matrix,
+  gating contracts on phase boundaries
+- **> 600 lines** total (typical: 800 – 1500)
+- May include `## Council Round N` / `### Verdict` blocks, ADR
+  cross-links, decision matrices
+- Frontmatter declares `complexity: structural`
+**Exemplars:**
+- Archived `agents/roadmaps/_archived/road-to-structural-optimization.md`
+  (closed 2026-05-03 after 6 phases of council-driven budget work).
+  ~1.5k lines, multi-round council, file-ownership matrix.
+**Triggered by:** changes to a public contract surface in
+`docs/contracts/`, a budget invariant in
+[`load-context-budget-model.md`](load-context-budget-model.md), or a
+priority hierarchy in
+[`rule-priority-hierarchy.md`](rule-priority-hierarchy.md).
+**Requires:** explicit user opt-in on creation. The agent must not
+upgrade a lightweight roadmap to structural mid-flight without that
+opt-in.
+## Anti-game clause
+The trigger is **contract-layer change**, not line count alone. A
+heavy roadmap split into two lightweights to dodge the gate is a
+linter-defeat — reviewers flag it on PR review. Conversely, a
+roadmap that legitimately needs the structural shape but tries to
+hide as `lightweight` to skip council overhead is the same defeat in
+the other direction.
+## Frontmatter
+Every roadmap in `agents/roadmaps/` (including draft and archived
+ones) declares its tier:
+```yaml
+---
+complexity: lightweight
+---
+```
+or
+```yaml
+---
+complexity: structural
+status: draft
+---
+```
+Other frontmatter keys (`status:`, `owner:`, `target_release:`) are
+permitted alongside but not required by this contract.
+## Linter contract
+`scripts/lint_roadmap_complexity.py` (≤ 150 LOC, stdlib only)
+enforces the **measurable** subset of this standard:
+| Check | Lightweight | Structural |
+|---|---|---|
+| `complexity:` frontmatter declared | required | required |
+| Total line count | ≤ 600 | > 0 (no upper cap) |
+| `## Phase N` heading count | ≤ 6 | no cap |
+| `## Council Round N` / `### Verdict` blocks | forbidden | allowed |
+| Council session cross-links (`agents/sessions/.../council-…`) | warn | allowed |
+The linter runs on every roadmap file under `agents/roadmaps/` and
+exits non-zero on any violation. Hooked into `task ci` via
+`task lint-roadmap-complexity`.
+**Out of scope for the linter** (reviewer judgment only): step count,
+per-phase length, the contract-layer-change trigger for the
+structural tier.
+## Migration
+Phase 5.3 of `road-to-context-layer-maturity` applies the standard
+retroactively to all open roadmaps in `agents/roadmaps/`: each gets
+a `complexity:` tag based on the rules above. No content rewrites
+land in that step — only the tag.
+Roadmaps that exceed the lightweight cap but plausibly should be
+lightweight (e.g. they accumulated drift) are tagged `structural`
+for now and may be split or trimmed in a follow-up. The migration
+records existing reality, not aspirational reality.