npm - @event4u/agent-config - Versions diffs - 1.13.0 → 1.15.0 - Mend

@event4u/agent-config 1.13.0 → 1.15.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (291) hide show

package/docs/contracts/agent-memory-contract.md ADDED Viewed

@@ -0,0 +1,149 @@
+---
+stability: beta
+---
+# Agent-Memory Contract (as expected by `agent-config`)
+**Purpose.** Freeze the interface `agent-config` currently expects from
+the sibling package `@event4u/agent-memory`, so when that package
+actually ships we can diff its real surface against this document in
+one place — instead of chasing drift across skills, commands, and
+helpers.
+**Ownership.** `agent-memory` is ours; we decide release timing. This
+doc is internal, not a spec handed to an external team. The
+authoritative spec-side documents live under
+[`agents/roadmaps/agent-memory/`](../../agents/roadmaps/agent-memory/); this context
+is the **consumer-side snapshot** — what our wired code assumes today.
+Last refreshed: 2026-04-22.
+## What this doc is *not*
+- Not a replacement for
+  [`road-to-retrieval-contract.md`](../../agents/roadmaps/agent-memory/road-to-retrieval-contract.md)
+  — that is the spec we hand to the agent-memory implementer.
+- Not a commitment that consumer code looks exactly like this forever
+  — it is a point-in-time pin.
+- Not an agent-facing skill. Humans read this when the package lands.
+## Expected backend states
+Defined in [`memory-access.md`](../../.agent-src.uncompressed/guidelines/agent-infra/memory-access.md)
+and `scripts/memory_status.py`:
+| Status | Meaning | Agent-config behaviour |
+|---|---|---|
+| `absent` | Package not installed or CLI not on PATH | File fallback only |
+| `misconfigured` | Installed but `health()` fails within 2s | Warn once / session, fall back to file |
+| `present` | Installed and healthy within 2s | Route retrieval through package |
+Detection must be **bounded** (≤ 2s cold probe), **cached** per
+process, **non-raising** on probe failure.
+## Expected CLI surface
+Probed in `scripts/memory_status.py` via `_CLI_CANDIDATES`:
+- Executable on `PATH` as **`memory`** (canonical, ships in
+  `@event4u/agent-memory` v1.1+ as `package.json#bin.memory`),
+  **`agent-memory`** (planned alias), or **`agentmem`** (legacy).
+- Supports `health` subcommand emitting a v1 health envelope on stdout
+  (`{contract_version, status, backend_version, features, latency_ms}`)
+  and exiting non-zero on unhealthy.
+- Supports `retrieve <query> [--type T …] [--limit N] [--layer 1|2|3]
+  [--budget N] [--low-trust] [--repository ID]` emitting a v1
+  retrieval envelope on stdout (always JSON).
+The retrieval invocation is **semantic, not key-based** — see the
+"⚠️ Known contract drift" section below for the consumer-side
+implication.
+If the released package diverges from these names, we update
+`_CLI_CANDIDATES` in `memory_status.py` — not the other way round.
+## Expected retrieval shape (present path)
+Source of truth:
+[`road-to-retrieval-contract.md`](../../agents/roadmaps/agent-memory/road-to-retrieval-contract.md).
+Consumer skills call the shared abstraction, not the package directly.
+**Request** (Python):
+```python
+retrieve(types=[…], keys=[…] or {…}, limit: int = 20, timeout_ms: int = 2000)
+```
+**Response** (v1 envelope) — mandatory fields per entry: `id`, `type`,
+`source ∈ {repo, operational}`, `confidence`, `body`. Optional:
+`trust`, `last_validated`, `shadowed_by`.
+Envelope: `contract_version`, `status ∈ {ok, partial, error}`,
+`entries`, `slices`, `errors`.
+## ⚠️ Known contract drift (consumer vs. spec)
+**Status: resolved at the consumer boundary.** The CLI / JSON output of
+`scripts/memory_lookup.py` already emits the v1 envelope with
+`source ∈ {repo, operational}`, `confidence`, `slices`, `status`, and
+`contract_version` — see `memory_lookup.py:320-345` (envelope
+assembly).
+What remains is **internal-only**: the private `Hit` dataclass inside
+`memory_lookup.py` still uses `source ∈ {curated, intake}` and `score`.
+No skill, command, or external consumer imports `Hit` directly — they
+all go through the public JSON surface, which is already spec-aligned.
+| Internal `Hit` field | Public envelope field | Visible to consumers? |
+|---|---|---|
+| `id`, `type` | `id`, `type` | yes — match |
+| `source ∈ {curated, intake}` | `source ∈ {repo, operational}` | no — translated at boundary |
+| `score ∈ [0,1]` | `confidence` | no — translated at boundary |
+| `path` | — | no — internal scoring signal |
+**Trigger to revisit:** if a second module starts importing the `Hit`
+dataclass directly (e.g. `memory_lookup.py` is split into multiple
+files and `Hit` becomes a public type), rename `Hit.source`/`Hit.score`
+to match the envelope so the boundary translation can be deleted.
+Until then, the internal naming is an implementation detail.
+There is also a **calling-convention drift** between the contract's
+key-based `retrieve(types, keys, limit)` and the package's
+semantic-only `retrieve(query, …)`. This is tracked separately and
+is the subject of an ongoing design decision (hybrid contract — keys
+synthesise into a query for the package path; file-fallback stays
+key-match).
+## Expected `propose()` / signal emission
+Shape used by `scripts/memory_signal.py` and the `/propose-memory`
+command: JSONL append-only drop-ins under
+`agents/memory/intake/*.jsonl`, one signal per line. When the package
+is present, the same payload is accepted by a `propose()` CLI or MCP
+call. File-drop is the always-works path.
+Required fields (keep in sync with `/propose-memory` command):
+`ts`, `type`, `key` (path or logical id), `observation`, `source`
+(`agent` or `human`), `session_id`.
+## Revisit triggers
+Q29 is **parked open**, not a blocker. Revisit when **one** of these
+holds:
+- `@event4u/agent-memory` ships a tagged release (any v0.x)
+- A consumer project explicitly asks for the `present` path
+- The agent-memory repo opens an integration PR against `agent-config`
+- We change the file fallback's public shape (then rewrite this doc
+  *before* the change lands)
+## See also
+- [`road-to-memory-self-consumption.md`](../../agents/roadmaps/road-to-memory-self-consumption.md)
+- [`road-to-agent-memory-integration.md`](../../agents/roadmaps/road-to-agent-memory-integration.md)
+- [`agent-memory/road-to-retrieval-contract.md`](../../agents/roadmaps/agent-memory/road-to-retrieval-contract.md)
+- [`agent-memory/road-to-promotion-flow.md`](../../agents/roadmaps/agent-memory/road-to-promotion-flow.md)
+- [`memory-access guideline`](../../.agent-src.uncompressed/guidelines/agent-infra/memory-access.md)
+- [`scripts/memory_status.py`](../../.agent-src.uncompressed/templates/scripts/memory_status.py)
+- [`scripts/memory_lookup.py`](../../.agent-src.uncompressed/templates/scripts/memory_lookup.py)
+- [`open-questions-2.md`](../../agents/roadmaps/archive/open-questions-2.md) — Q29

package/docs/contracts/artifact-engagement-flow.md ADDED Viewed

@@ -0,0 +1,262 @@
+---
+stability: beta
+---
+# Artifact Engagement — Flow & Recording Contract
+> Cross-cutting reference for the artifact-engagement telemetry system
+> shipped under
+> [`road-to-artifact-engagement-telemetry.md`](../../agents/roadmaps/road-to-artifact-engagement-telemetry.md).
+> Phase 1 + 2 ship the schema, CLI, and engine. Phase 3 ships the
+> agent-side hooks (this document).
+>
+> - **Created:** 2026-04-30
+> - **Status:** Phase 5 — schema, engine, recording, aggregator, renderer,
+>   and redaction validator are in place. Phase 6 (dogfooding) next.
+This document is the stable reference for **what gets recorded, when, and
+under which constraints**. The roadmap tracks phased delivery; the rule
+([`artifact-engagement-recording`](../../.agent-src.uncompressed/rules/artifact-engagement-recording.md))
+tells the agent when to fire; this doc explains the contract every
+recording must honour.
+## What this is
+A measurement layer that tells maintainers which **skills, rules,
+commands, guidelines and personas** the agent actually consults and
+applies during a `/implement-ticket` or `/work` run. The output (a local
+JSONL log) becomes the input to the Phase 4 aggregator + report renderer.
+## What this is *not*
+- Token-spend tracking — handled separately by the cost-profile system.
+- Tool-call telemetry — out of scope.
+- A retirement decision-maker — Phase 4's report surfaces signal; humans
+  decide what to retire.
+- A consumer-facing analytics product — local-only, opt-in, never
+  uploaded.
+## Threat model in one paragraph
+Engagement data is **observation**, not delivery. A missed event is
+cheap. A leaked prompt, a leaked path, or a leaked secret is not. Every
+field in the schema is **id-only**: no free-text, no payloads, no paths.
+The CLI rejects oversized strings on the input boundary; the agent must
+not even attempt to write content.
+## When the agent records
+Two granularities, one per task or one per phase-step:
+```
+refine → memory → analyze → plan → implement → test → verify → report
+                          ↑
+            granularity: phase-step → 1 record per arrow crossed
+            granularity: task        → 1 record on terminal state only
+```
+The setting lives at `telemetry.artifact_engagement.granularity` in
+`.agent-settings.yml`. Default `task`. Both flows
+([`/implement-ticket`](../../.agent-src.uncompressed/commands/implement-ticket.md)
+and [`/work`](../../.agent-src.uncompressed/commands/work.md)) use the
+same eight-step contract from
+[`implement-ticket-flow.md`](implement-ticket-flow.md).
+### Boundary semantics
+A boundary is **closed** when:
+- The dispatcher returns `SUCCESS`, `BLOCKED`, or `PARTIAL` for the
+  current step (phase-step granularity), or
+- The full eight-step flow reaches a terminal state (task granularity).
+A boundary is **never** closed by:
+- An error inside the agent's tool call (the user's task is paramount;
+  see "failure modes" below).
+- A model-side exception or timeout.
+- Settings reload mid-flow — settings are read once per task and cached.
+Within a single boundary, repeated `consulted` / `applied` mentions of
+the same `<kind>:<id>` pair **dedupe**. Reading `php-coder` three times
+records once.
+## What counts as consulted vs applied
+| Term | Meaning | Examples |
+|---|---|---|
+| **`consulted`** | The agent **read** the artefact this boundary | Opened `SKILL.md`, scanned a rule body, looked at a guideline, checked persona contract, cited the artefact id in reasoning |
+| **`applied`** | The artefact **influenced the output** this boundary | A skill's procedure produced the code; a rule's gate halted a flow; a guideline's pattern shaped the diff |
+`applied` is a strict subset of `consulted`. When in doubt → record as
+`consulted` only. Over-recording `applied` inflates the engagement
+signal and defeats the whole point of the system.
+## Forbidden — what NEVER goes into a record
+The privacy contract is enforced at **four** layers — schema (write
+gate), aggregator (read gate via `parse_event`), renderer (export gate),
+and CLI (surface gate). Each layer rejects the same set of shapes;
+defense in depth means a leak has to bypass all four to escape:
+- File paths — any string containing `/`, `\`, or a control character.
+- File extensions — trailing `.md`, `.py`, `.json`, `.yaml`, `.yml`,
+  `.php`, `.ts`, `.js`, etc. Detected by
+  `re.compile(r"\.[a-z0-9]{1,10}$", re.IGNORECASE)` in
+  `telemetry.engagement.check_id_redaction`.
+- Source code, prompts, ticket bodies, AC text, comments.
+- Branch names, commit shas, PR numbers, URLs.
+- Secrets, env vars, credentials, customer data.
+- Free-text strings longer than 200 chars.
+- Leading or trailing whitespace, tabs, newlines.
+- Empty strings.
+The id namespaces are stable and bounded:
+| Kind | Source | Example |
+|---|---|---|
+| `skills` | `.agent-src/skills/<id>/SKILL.md` | `php-coder`, `eloquent` |
+| `rules` | `.agent-src/rules/<id>.md` | `scope-control`, `language-and-tone` |
+| `commands` | `.agent-src/commands/<id>.md` | `commit`, `create-pr` |
+| `guidelines` | `.agent-src/guidelines/<path>/<id>.md` | `agent-interaction-and-decision-quality` |
+| `personas` | `.agent-src/personas/<id>.md` | `qa`, `senior-engineer` |
+`task_id` is the ticket key (`PROJ-123`) for `/implement-ticket` or a
+short opaque slug derived from the prompt for `/work`. Branch names,
+file paths, and free-text titles are forbidden in `task_id` — see the
+schema's `EngagementSchemaError` cases.
+### The four enforcement layers
+1. **Schema (write gate)** — `EngagementEvent.validate()` in
+   `telemetry/engagement.py` runs `check_id_redaction` over `task_id`
+   and every `consulted` / `applied` artefact id. The CLI exits `1`
+   when this fires; nothing reaches the JSONL.
+2. **Aggregator (read gate)** — `aggregator._iter_events` calls
+   `parse_event`, which re-runs the same validator. A pre-validator
+   line (e.g. an archived snapshot from before the validator landed)
+   is **skipped** and counted in `result.skipped_lines`. The renderer
+   never sees it.
+3. **Renderer (export gate)** — `_stat_to_dict` (JSON) and the
+   markdown row builder both call `check_id_redaction` again before
+   emitting any id. If a caller bypasses `parse_event` and hand-builds
+   an `AggregateResult`, the renderer raises `EngagementSchemaError`
+   instead of producing the row.
+4. **CLI (surface gate)** — `telemetry_report.main` catches
+   `EngagementSchemaError` from the renderer and exits `2` with a
+   `redaction validator refused report` message. A bad row is never
+   written to stdout, never piped to a teammate.
+A leak would have to defeat all four layers — write the JSONL outside
+the CLI, parse it outside `parse_event`, render it outside the
+project's renderer, and ship it to a teammate without the CLI in the
+loop. The contract treats that as out of scope.
+## How the agent records
+```bash
+./agent-config telemetry:record \
+  --task-id "$TASK_ID" \
+  --boundary task \
+  --consulted skills:php-coder \
+  --consulted rules:scope-control \
+  --applied skills:php-coder
+```
+Exit codes (the agent must read these):
+| Exit | Meaning | Action |
+|---|---|---|
+| `0` | Recorded, or telemetry disabled (silent no-op) | Continue |
+| `1` | Schema validation failed | Log internally, **continue** the user's task — do not halt |
+| `2` | IO failure (lock contention, disk full) | Same — observation, not delivery |
+Schema-rejection messages go to stderr; the agent reads them once and
+moves on. The user's task always takes priority.
+## Failure modes — DO NOT block the user's task
+- Telemetry is **observation**, not a delivery requirement.
+- A missing event is cheap; a halted task because telemetry can't write
+  is a critical failure mode that defeats the whole opt-in property.
+- The only error the agent surfaces to the user: when the user asked
+  for telemetry (`telemetry:status` confirms enabled) but no event
+  reached the log over a full task — that is a real bug.
+## Cost floor — what "default-off" means
+When `telemetry.artifact_engagement.enabled: false` (or section absent):
+- The rule is `auto` and the description does not match a typical
+  conversation; the rule never loads. Cost floor: 0 tokens.
+- The CLI exits `0` with no output and no file IO.
+- The engine (`work_engine.dispatcher`, `work_engine.cli`) does not
+  import any `telemetry.*` module — locked by
+  [`tests/telemetry/test_cost_floor.py`](../../tests/telemetry/test_cost_floor.py).
+- On cloud surfaces (Claude.ai Web, Skills API), the rule's
+  `cloud_safe: noop` marker keeps the rule inert regardless of settings.
+## Where the JSONL lives
+Default path: `.agent-engagement.jsonl` in the consumer repo root.
+Configurable via `telemetry.artifact_engagement.output.path`. The path
+is **always** added to the consumer's `.gitignore` block by the
+installer (Phase 1 wiring). Verify locally:
+```
+$ grep agent-engagement .gitignore
+.agent-engagement.jsonl
+```
+## How to audit a JSONL by hand
+The fastest path is the project's own validator — it enforces every
+rule listed above:
+```bash
+# Validate every line against the schema + redaction floor
+python3 -c '
+import pathlib, sys
+sys.path.insert(0, ".agent-src.uncompressed/templates/scripts")
+from telemetry.engagement import EngagementSchemaError, parse_event
+log = pathlib.Path(".agent-engagement.jsonl")
+ok = bad = 0
+for i, line in enumerate(log.read_text().splitlines(), 1):
+    if not line.strip():
+        continue
+    try:
+        parse_event(line + "\n")
+        ok += 1
+    except EngagementSchemaError as e:
+        bad += 1
+        print(f"line {i}: {e}", file=sys.stderr)
+print(f"{ok} valid, {bad} rejected")
+'
+# A bad-line spot-check that does not depend on the validator
+python3 -c '
+import json, pathlib, re
+forbidden = re.compile(r"[/\\\\]|\.[a-z0-9]{1,10}$", re.IGNORECASE)
+for line in pathlib.Path(".agent-engagement.jsonl").read_text().splitlines():
+    if not line.strip():
+        continue
+    obj = json.loads(line)
+    for kind in ("consulted", "applied"):
+        for ids in obj.get(kind, {}).values():
+            for v in ids:
+                if forbidden.search(v) or len(v) > 200:
+                    print("LEAK:", v)
+'
+```
+Either recipe is safe to run on a co-worker's archived JSONL — neither
+writes anything, both surface the same shapes the four enforcement
+layers reject.
+## See also
+- [`road-to-artifact-engagement-telemetry`](../../agents/roadmaps/road-to-artifact-engagement-telemetry.md) — phased delivery
+- [`artifact-engagement-recording`](../../.agent-src.uncompressed/rules/artifact-engagement-recording.md) — agent-side trigger
+- [`implement-ticket-flow`](implement-ticket-flow.md) — the eight-step contract this rule observes
+- [`scripts/telemetry/`](../../.agent-src.uncompressed/templates/scripts/telemetry/) — schema, boundary session, settings reader
+- [`tests/telemetry/`](../../tests/telemetry/) — contract enforcement (104 cases through Phase 5: schema, settings, aggregator, renderer, CLI, cost-floor, redaction)

package/docs/contracts/command-clusters.md ADDED Viewed

@@ -0,0 +1,126 @@
+---
+stability: beta
+---
+# Command-cluster contract
+> **Status:** beta — locked for `1.15.0` Phase 1 (top-3 clusters).
+> Phase 2 (remaining 12 clusters) waits one deprecation cycle.
+> Source roadmap: [`agents/roadmaps/road-to-governance-cleanup.md`](../../agents/roadmaps/road-to-governance-cleanup.md)
+> § F2.
+The agent-config command surface collapses related atomic commands
+into **verb clusters**. A cluster is a single top-level command
+(e.g. `/fix`) that dispatches to sub-commands (e.g. `/fix ci`,
+`/fix pr`). Old atomic commands stay one release as deprecation
+shims, then disappear.
+This file is the **locked source of truth** for which clusters
+exist and which sub-commands belong to each. The atomic-command
+linter (`scripts/lint_no_new_atomic_commands.py`) reads this file;
+new atomic commands without a `cluster:` field pointing to an
+entry below fail CI.
+## Phase 1 clusters (locked, ship in 1.15.0)
+| Cluster | Sub-commands | Replaces |
+|---|---|---|
+| `fix` | `ci` · `pr` · `pr-bots` · `pr-developers` · `portability` · `refs` · `seeder` | `fix-ci` · `fix-pr-comments` · `fix-pr-bot-comments` · `fix-pr-developer-comments` · `fix-portability` · `fix-references` · `fix-seeder` |
+| `optimize` | `agents` · `augmentignore` · `rtk` · `skills` | `optimize-agents` · `optimize-augmentignore` · `optimize-rtk-filters` · `optimize-skills` |
+| `feature` | `explore` · `plan` · `refactor` · `roadmap` | `feature-explore` · `feature-plan` · `feature-refactor` · `feature-roadmap` |
+**Net Phase 1:** 15 atomic commands → 3 cluster commands.
+## Phase 2 clusters (deferred to next minor release)
+Listed in `road-to-governance-cleanup.md` § Finding 2. Not yet locked
+into the linter; new atomic commands matching these prefixes still
+fail CI without a `cluster:` field once a Phase 2 entry is added
+here.
+- `chat-history` (`show` · `resume` · `clear` · `checkpoint`)
+- `agents` (`audit` · `cleanup` · `prepare`)
+- `memory` (`add` · `load` · `promote` · `propose`)
+- `roadmap` (`create` · `execute`)
+- `module` (`create` · `explore`)
+- `tests` (`create` · `execute`)
+- `context` (`create` · `refactor`)
+- `override` (`create` · `manage`)
+- `copilot-agents` (`init` · `optimize`)
+- `commit` (flag: `--in-chunks`)
+- `judge` (`solo` · `steps` · `on-diff`) + standalone `/review`
+- `create-pr` (flag: `--description-only`)
+## Frontmatter contract
+A new command file under `.agent-src.uncompressed/commands/` MUST
+declare `cluster:` in its frontmatter, pointing to one of the locked
+clusters above:
+```yaml
+---
+name: fix-ci          # legacy slug retained for the shim
+cluster: fix          # required: locked cluster name
+sub: ci               # required: sub-command identifier (kebab-case)
+description: Fetch CI errors from GitHub Actions and fix them
+---
+```
+The linter only flags **newly-added** files under `commands/`
+(git status `A`). Pre-existing commands without `cluster:` are
+grandfathered indefinitely; modifying them does NOT require adding
+the field. The goal is to stop the atomic surface from growing,
+not to retro-fit every legacy command into a Phase 1 cluster.
+## Deprecation shim contract
+A shim is a one-file stub that:
+1. Keeps the old command slug in `.agent-src.uncompressed/commands/`.
+2. Declares `superseded_by:` in frontmatter pointing to the new
+   cluster command (e.g. `superseded_by: fix ci`).
+3. Declares `deprecated_in:` with the release version (e.g.
+   `deprecated_in: 1.15.0`).
+4. Body contains exactly one warning line in the format:
+   ```
+   ⚠️  /<old-name> is deprecated; use /<cluster> <sub> instead.
+   ```
+5. Otherwise forwards verbatim to the cluster command (no logic).
+`scripts/skill_linter.py` enforces the warning-line shape on any
+file with `superseded_by:` set.
+## Removal cycle
+| State | Release |
+|---|---|
+| Cluster command shipped, shim active | `1.15.0` |
+| Shim emits warning, both work | `1.15.x` (one minor cycle) |
+| Shim removed, only cluster works | `1.16.0` |
+No permanent aliases. Consumers who pin a 1.15 minor get a full
+release window of warnings; the 1.16 release notes call out the
+removal explicitly.
+## Linter behavior
+`scripts/lint_no_new_atomic_commands.py`:
+- Reads the locked cluster names from this file (parsed from the
+  Phase 1 table above).
+- Finds every command file **added** since `--baseline`
+  (default: `main`) — modifications to existing files are ignored.
+- For each new file, requires `cluster:` to be set to one of the
+  locked names — OR `superseded_by:` (the file is a shim).
+- Exits non-zero on the first violation; lists every violator.
+`--all` mode (manual audit only, not in CI) lints every command
+file and surfaces grandfathered ones — useful when planning a
+Phase 2 cluster expansion.
+## See also
+- [`agents/roadmaps/road-to-governance-cleanup.md`](../../agents/roadmaps/road-to-governance-cleanup.md) — F2 phased-collapse decision.
+- [`agents/roadmaps/road-to-post-pr29-optimize.md`](../../agents/roadmaps/road-to-post-pr29-optimize.md) — P0.8 anchors this contract in 1.15.0.
+- [`docs/migrations/commands-1.15.0.md`](../migrations/commands-1.15.0.md) — user-facing migration notes.
+- [`docs/contracts/STABILITY.md`](STABILITY.md) — `beta` level rules apply.

package/docs/contracts/command-suggestion-flow.md ADDED Viewed

@@ -0,0 +1,148 @@
+---
+stability: beta
+---
+# Command Suggestion — Flow & Scoring Contract
+> Cross-cutting reference for the suggestion layer shipped under
+> [`road-to-context-aware-command-suggestion.md`](../../agents/roadmaps/road-to-context-aware-command-suggestion.md).
+> The runtime rule is [`command-suggestion`](../../.agent-src.uncompressed/rules/command-suggestion.md);
+> the architectural decision lives in [`adr-command-suggestion.md`](adr-command-suggestion.md).
+>
+> - **Created:** 2026-04-30
+> - **Status:** Phases 1–7 shipped — engine, rule, settings, sanitiser,
+>   GT-CS goldens (9/9 passing) all live.
+This document is the stable reference for **how a user prompt becomes a
+numbered-options block** (or stays silent). The roadmap tracks phased
+delivery; this doc explains the contract every suggestion must honour.
+## Flow at a glance
+```
+user turn
+   │
+   ├── starts with "/cmd" ──────────────► slash-commands handles it. STOP.
+   │
+   ├── senior-rule active                ►  STOP.  (clarification owed,
+   │   (ask-when-uncertain, scope gate,      role-mode contract, engine
+   │    role-mode, engine halt)              halt — suggestion stays silent)
+   │
+   ▼
+sanitize_message + sanitize_context
+   (strip fenced + inline code, strip prior suggestion-block echoes)
+   │
+   ▼
+match()    →  raw scored matches per eligible command
+   │
+   ▼
+rank()     →  apply floor, drop blocklisted, anti-noise heuristics,
+              cap at max_options, stable tie-break
+   │
+   ▼
+apply_cooldown()  →  drop (command, evidence) pairs shown in window
+   │
+   ▼
+render()   →  numbered-options block + Recommendation line
+   │
+   ▼
+agent emits the block as the first and ONLY thing this turn
+```
+## Scoring breakdown
+Per match, `match.py` computes:
+| Component | Weight | Source |
+|---|---|---|
+| `trigger_description` substring hit | base 0.6 | command frontmatter |
+| `trigger_context` substring hit | +0.1 (when both hit, `matched_trigger="both"`) | command frontmatter |
+| Token overlap (Jaccard, message ∪ ctx) | up to +0.05 | runtime |
+| Structural bonus (ticket key, file path) | +0.05, sets `has_structural_bonus=True` | runtime |
+Phrase length matters — longer matched evidence wins ties. The score is
+clamped to `[0.0, 1.0]`. `rank.py` then enforces the floor and the
+heuristics in
+[`adr-command-suggestion.md`](adr-command-suggestion.md#anti-noise-heuristics).
+## What gets recorded as evidence
+`Match.evidence` is the **literal substring** that triggered the match
+— `"ABC-123"` for a structural bonus, `"commit my changes"` for a
+description hit. The cooldown key is `(command_name, evidence)`, so two
+distinct triggers for the same command (a ticket key vs. a phrase)
+track separately.
+## Opt-outs — three paths, picked by scope
+| Scope | Key | Effect |
+|---|---|---|
+| Whole project | `commands.suggestion.enabled: false` | Layer fully silent; explicit slashes still work |
+| Specific command | `commands.suggestion.blocklist: [cmd1, cmd2]` | Those commands never appear; still callable |
+| Specific conversation | User types `/command-suggestion-off` | Disabled until user types `/command-suggestion-on` or chat ends |
+Per-command frontmatter overrides the global floor and cooldown:
+```yaml
+suggestion:
+  eligible: true
+  trigger_description: "..."
+  trigger_context: "..."
+  confidence_floor: 0.7   # stricter than global default
+  cooldown: 30m            # longer than global default
+```
+## Subordination — when to stay silent
+The rule lists four senior gates that outrank suggestion. The agent
+self-checks before emitting:
+1. **`scope-control`** — never resurface a git-op the user just declined.
+2. **`ask-when-uncertain`** — clarification owed → suggestion suppressed.
+3. **`verify-before-complete`** — evidence-gate verification active → silent.
+4. **Active role-mode contract or engine halt** (`/implement-ticket`
+   step waiting on user, `/work` mid-flow, etc.) — active flow has the floor.
+On any conflict → zero output. The user's prompt processes as it would
+without the rule active.
+## Anti-noise heuristics — when not to fire
+`rank.py` returns `[]` when:
+- Top score is below `confidence_floor` (default `0.6`).
+- Single match within `floor + 0.1` and no structural bonus.
+- Short prompt (< 6 words) matches > 2 commands without a structural bonus.
+- Prompt is a pure continuation phrase (`ok`, `weiter`, `continue`, …).
+Structural bonuses (ticket key, file path) override every suppressor.
+## Hardening tests — what we lock down
+`tests/test_command_suggester.py` covers 84 cases (matcher, rank,
+cooldown, sanitiser, render, settings, directive). The 9 GT-CS goldens
+(`tests/test_command_suggester_goldens.py`) lock end-to-end shape.
+Together they enforce:
+- **No execution without user pick** — engine has no execute path.
+- **No echo-trigger** — `sanitize.py` strips the previous block's shape.
+- **No code-block triggers** — `/commit` inside ``` ``` ``` ``` stays inert.
+- **No conversation hijack** — `enabled: false` and senior gates → silent.
+- **No multi-question stack** — only a numbered-options block, no extra ask.
+- **As-is option always last** — render contract test plus rule self-check.
+## Cloud bundles
+On Claude.ai Web / Skills API, the suggester package is **not** part
+of the standard bundle (T2 ZIPs ship rules + skills, not Python
+helpers). Treat `commands.suggestion.enabled` as `false` — degrade
+silently, never crash the turn. Local agents (Augment, Claude Code,
+Cursor, Cline, Windsurf) get the engine via `scripts/`.
+## See also
+- [`command-suggestion`](../../.agent-src.uncompressed/rules/command-suggestion.md) — runtime rule
+- [`adr-command-suggestion.md`](adr-command-suggestion.md) — architectural decision
+- [`command-suggestion-eligibility.md`](command-suggestion-eligibility.md) — locked eligibility table
+- [`road-to-context-aware-command-suggestion.md`](../../agents/roadmaps/road-to-context-aware-command-suggestion.md) — phased delivery
+- [`agent-settings`](../../.agent-src.uncompressed/templates/agent-settings.md) — `commands.suggestion.*` reference