npm - @event4u/agent-config - Versions diffs - 2.19.0 → 2.20.0 - Mend

@event4u/agent-config 2.19.0 → 2.20.0


Files changed (92)

package/.agent-src/commands/agent-status.md +29 -0
package/.agent-src/commands/onboard.md +221 -81
package/.agent-src/packs/README.md +49 -0
package/.agent-src/packs/agency-delivery.yml +63 -0
package/.agent-src/packs/content-engine.yml +53 -0
package/.agent-src/packs/founder-mvp.yml +51 -0
package/.agent-src/presets/README.md +26 -0
package/.agent-src/presets/balanced.yml +34 -0
package/.agent-src/presets/fast.yml +31 -0
package/.agent-src/presets/strict.yml +38 -0
package/.agent-src/profiles/README.md +29 -0
package/.agent-src/profiles/agency.yml +27 -0
package/.agent-src/profiles/content_creator.yml +25 -0
package/.agent-src/profiles/developer.yml +26 -0
package/.agent-src/profiles/finance.yml +24 -0
package/.agent-src/profiles/founder.yml +25 -0
package/.agent-src/profiles/ops.yml +25 -0
package/.agent-src/rules/no-cheap-questions.md +25 -17
package/.agent-src/skills/adr-create/SKILL.md +78 -68
package/.agent-src/skills/subagent-orchestration/SKILL.md +33 -0
package/.agent-src/templates/agents/agent-project-settings.example.yml +1 -1
package/.agent-src/templates/skill-archive-note.md +101 -0
package/.claude-plugin/marketplace.json +1 -1
package/CHANGELOG.md +52 -30
package/README.md +68 -72
package/config/agent-settings.template.yml +22 -0
package/docs/adrs/caveman/0001-default-off-until-bench.md +93 -0
package/docs/adrs/caveman/README.md +9 -0
package/docs/adrs/cost/0001-hard-stop-hook.md +114 -0
package/docs/adrs/cost/README.md +9 -0
package/docs/adrs/memory/0001-consumer-side-snapshot.md +111 -0
package/docs/adrs/memory/README.md +9 -0
package/docs/adrs/router/0001-three-tier-routing.md +119 -0
package/docs/adrs/router/README.md +9 -0
package/docs/adrs/schema/0001-json-schema-frontmatter.md +102 -0
package/docs/adrs/schema/README.md +9 -0
package/docs/adrs/smoke/0001-per-tier-smoke-scripts.md +99 -0
package/docs/adrs/smoke/README.md +9 -0
package/docs/architecture/current-onboard-baseline.md +126 -0
package/docs/architecture/current-safety-behavior.md +137 -0
package/docs/archive/CHANGELOG-pre-2.16.0.md +48 -0
package/docs/contracts/adr-layout.md +108 -0
package/docs/contracts/benchmark-corpus-spec.md +97 -0
package/docs/contracts/benchmark-report-schema.md +111 -0
package/docs/contracts/command-clusters.md +1 -0
package/docs/contracts/command-taxonomy.md +137 -0
package/docs/contracts/compression-default-kill-criterion.md +69 -0
package/docs/contracts/config-presets.md +144 -0
package/docs/contracts/cost-dashboard.md +143 -0
package/docs/contracts/cost-enforcement.md +134 -0
package/docs/contracts/file-ownership-matrix.json +0 -7
package/docs/contracts/mcp-tool-inventory.md +53 -0
package/docs/contracts/measurement-baseline.md +102 -0
package/docs/contracts/namespace.md +125 -0
package/docs/contracts/profile-system.md +142 -0
package/docs/contracts/safety-model.md +129 -0
package/docs/contracts/smoke-contracts.md +144 -0
package/docs/contracts/workflow-packs.md +121 -0
package/docs/decisions/ADR-010-profile-pack-preset-boundary.md +132 -0
package/docs/decisions/INDEX.md +1 -0
package/docs/featured-commands.md +27 -0
package/docs/parity/bench-ruflo.json +58 -0
package/docs/parity/bench.json +41 -0
package/docs/parity/ruflo.md +46 -0
package/docs/profiles.md +91 -0
package/package.json +1 -1
package/scripts/_cli/cmd_explain.py +250 -0
package/scripts/_lib/bench_cost.py +138 -0
package/scripts/_lib/bench_quality.py +118 -0
package/scripts/_lib/bench_report.py +150 -0
package/scripts/agent-config +13 -0
package/scripts/audit_adr_coverage.py +175 -0
package/scripts/audit_mcp_tools.py +146 -0
package/scripts/bench_baseline_ready.py +108 -0
package/scripts/bench_drift_check.py +151 -0
package/scripts/bench_per_tool.py +216 -0
package/scripts/bench_run.py +155 -0
package/scripts/config/__init__.py +9 -0
package/scripts/config/presets.py +206 -0
package/scripts/config/profiles.py +173 -0
package/scripts/cost/budget.mjs +73 -12
package/scripts/cost/preflight.mjs +89 -0
package/scripts/lint_archived_skills.py +143 -0
package/scripts/lint_bench_corpus.py +161 -0
package/scripts/lint_namespace.py +135 -0
package/scripts/skill_overlap.py +204 -0
package/scripts/skill_usage_collect.py +191 -0
package/scripts/skill_usage_report.py +162 -0
package/scripts/smoke/kernel.sh +101 -0
package/scripts/smoke/router.sh +129 -0
package/scripts/smoke/schema.sh +71 -0
package/scripts/smoke/skills.sh +101 -0

package/docs/contracts/smoke-contracts.md ADDED

@@ -0,0 +1,144 @@
+---
+stability: beta
+keep-beta-until: 2026-08-14
+---
+# Smoke Contracts — Phase 3 of step-11-ruflo-parity
+> **Status:** active · **Owner:** step-11 Phase 3 · **Sibling:**
+> [`measurement-baseline.md`](measurement-baseline.md) (snapshot semantics)
+> · [`cost-enforcement.md`](cost-enforcement.md) (cost ladder)
+Per-tier smoke scripts validate the system's structural baselines on
+every PR that touches the tier. Each script is **fast** (≤ 30 s wall),
+**deterministic** (same input → same exit), and **measured** (baseline
+numbers come from `task smoke:*` on `main` at lock-in, not from claims).
+## § 1 — Runtime budget
+Every `scripts/smoke/<tier>.sh` honours:
+| Limit | Value | Rationale |
+|---|---:|---|
+| Wall time | ≤ 30 s | CI matrix slot; local dev iteration |
+| External I/O | none beyond filesystem | no network, no MCP |
+| Output | last line is the **baseline declaration** | parseable by CI summary |
+A smoke that approaches 30 s should be split into sub-smokes, not
+optimised in place.
+## § 2 — Path-trigger globs
+CI's `.github/workflows/smoke.yml` dispatches the right scripts based on
+the paths touched in the PR:
+| Tier | Globs that trigger | Script |
+|---|---|---|
+| kernel | `.agent-src.uncompressed/rules/**`, `.agent-src/rules/**`, `router.json`, `scripts/measure_rule_budget.py` | `scripts/smoke/kernel.sh` |
+| router | `router.json`, `.agent-src.uncompressed/rules/**`, `.agent-src.uncompressed/skills/**`, `docs/contracts/**`, `docs/guidelines/**` | `scripts/smoke/router.sh` |
+| schema | `.agent-src.uncompressed/skills/**`, `.agent-src.uncompressed/rules/**`, `scripts/schemas/**`, `scripts/skill_linter.py`, `scripts/validate_frontmatter.py` | `scripts/smoke/schema.sh` |
+| skills | `.agent-src.uncompressed/skills/**` | `scripts/smoke/skills.sh` |
+`task smoke` runs all four locally regardless of paths.
+## § 3 — Baseline declarations (locked 2026-05-16)
+Smoke baselines are **measured today**, not aspirational. They lock
+**regression**: a smoke goes red only if the count drifts the wrong way.
+Drift toward the ideal (fewer breaches, more fences) updates the
+constant in the script body and the row below.
+### § 3.1 — Kernel (`scripts/smoke/kernel.sh`)
+```
+9 kernel rules · 8 carry Iron-Law fences · 1 dispatch index · ≤ 2 budget breaches
+```
+- **9 kernel rules** — fixed by [`kernel-membership.md`](kernel-membership.md).
+- **8 carry Iron-Law fences** — measured 2026-05-16. `agent-authority`
+  is the **dispatch index** (priority table pointing at the other four
+  authority rules); it is structurally exempt from the Iron-Law-fence
+  requirement and listed in the script's `EXEMPT_FROM_FENCE` set.
+- **≤ 2 budget breaches** — `python3 scripts/measure_rule_budget.py
+  --kernel-budget-check` currently reports 2 breaches
+  (`kernel-bucket > 26000`, `no-cheap-questions > 4000`). The smoke
+  asserts the count does not grow; reductions update `EXPECTED_BREACHES`
+  in `scripts/smoke/kernel.sh`. See
+  `road-to-kernel-and-router.md`
+  for the path back to zero.
+### § 3.2 — Router (`scripts/smoke/router.sh`)
+```
+75 router ids · 0 broken rule pointers · 35 routes_to refs · 2 missing contracts
+```
+- **75 ids** — 9 kernel + 24 tier_1 + 42 tier_2; every id resolves to
+  `.agent-src/rules/<id>.md`.
+- **0 broken rule pointers** — hard assertion; smoke fails on any miss.
+- **35 routes_to refs** across tier_1 + tier_2; resolver honours the
+  four prefixes (`skill:`, `command:`, `guideline:`, `contract:`).
+- **2 missing contracts** — measured 2026-05-16:
+  `contract:artifact-engagement-flow`,
+  `contract:command-suggestion-flow`. Tracked separately under
+  `step-11` Phase 4 (ADR layout);
+  smoke asserts the count is `≤ EXPECTED_MISSING_CONTRACTS=2`.
+### § 3.3 — Schema (`scripts/smoke/schema.sh`)
+```
+438 lintable artefacts · 0 schema FAILs · ≤ 92 warns
+```
+- **0 FAILs** — hard assertion. `scripts/skill_linter.py --all` returns
+  exit 0/1 (warns) but never 2 (fail).
+- **≤ 92 warns** — measured 2026-05-16; locks regression. Warns
+  trending down updates the constant.
+- **v2 schema (step-5) deferred** — when
+  `step-5-schema-rigor.md`
+  Phase 1 closes, this smoke gains a `model_tier` presence assertion;
+  Phase 3 adds `schema_version: "2"`. Until then, v1 schema in
+  `scripts/schemas/skill.schema.json` is the contract.
+### § 3.4 — Skills (`scripts/smoke/skills.sh`)
+```
+5/5 random skills resolve · frontmatter parses · name matches directory
+```
+- **5 random skills** picked deterministically (seed = epoch day) from
+  `.agent-src.uncompressed/skills/*/SKILL.md` and re-validated via
+  `scripts/validate_frontmatter.py`. `agent-config explain skill` is
+  **not** invoked — `explain` only supports `{config,rule,route}` today
+  ([`scripts/_cli/cmd_explain.py`](../../scripts/_cli/cmd_explain.py));
+  filesystem-resolution is the contract.
+## § 4 — Local invocation
+```bash
+task smoke            # all four
+task smoke:kernel     # individual tiers
+task smoke:router
+task smoke:schema
+task smoke:skills
+```
+Every script honours `SMOKE_QUIET=1` (suppresses table output, keeps
+the final baseline line) for CI summary parsing.
+## § 5 — Failure modes
+| Symptom | Likely cause | Fix |
+|---|---|---|
+| `kernel.sh` reports < 8 Iron-Law fences | Kernel rule lost its Iron Law block during edit | Restore the fence; update `EXEMPT_FROM_FENCE` only for new dispatch indexes |
+| `router.sh` reports > 0 broken pointers | `router.json` references an id without a rule file | Add the rule or remove the route — never edit the smoke baseline up |
+| `schema.sh` reports FAILs | A skill / rule lost a required field | Restore via [`scripts/schemas/skill.schema.json`](../../scripts/schemas/skill.schema.json) |
+| `skills.sh` 5/5 random sample fails | Hand-edit broke frontmatter or renamed directory without updating `name:` | Restore filename ↔ slug coupling |
+## § 6 — See also
+- [`measurement-baseline.md`](measurement-baseline.md) — measurement substrate.
+- [`cost-enforcement.md`](cost-enforcement.md) — cost ladder, sibling smoke surface.
+- [`kernel-membership.md`](kernel-membership.md) — the 9-rule kernel set.
+- [`rule-router.md`](rule-router.md) — router contract.
+- `road-to-kernel-and-router.md` — kernel budget reduction path.
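The regression-lock rule in § 3 of the contract above (a smoke goes red only when a count drifts the wrong way) can be sketched in a few lines. This is an illustrative Python sketch, not the logic of the shipped shell scripts: the `Baseline` class, the metric names, and `smoke_exit_code` are hypothetical, and only the numeric baselines (8 fences, 2 budget breaches, 2 missing contracts, 92 warns) come from the contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    """One locked count plus the direction that counts as progress (hypothetical type)."""
    name: str
    expected: int
    higher_is_better: bool


def check(baseline: Baseline, measured: int) -> str:
    """Classify one measured count as 'ok', 'improved' or 'regressed'."""
    if measured == baseline.expected:
        return "ok"
    improved = (measured > baseline.expected) == baseline.higher_is_better
    return "improved" if improved else "regressed"


# Counts copied from the § 3 baseline declarations; metric names are made up.
BASELINES = [
    Baseline("iron_law_fences", 8, higher_is_better=True),
    Baseline("budget_breaches", 2, higher_is_better=False),
    Baseline("missing_contracts", 2, higher_is_better=False),
    Baseline("schema_warns", 92, higher_is_better=False),
]


def smoke_exit_code(measured: dict[str, int]) -> int:
    """Return 1 if any count regressed, else 0. Improvements stay green."""
    results = [check(b, measured[b.name]) for b in BASELINES]
    return 1 if "regressed" in results else 0
```

An "improved" result keeps the smoke green; per the contract, it is the cue to update the constant in the script body and the matching row in the document.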

package/docs/contracts/workflow-packs.md ADDED

@@ -0,0 +1,121 @@
+---
+stability: beta
+keep-beta-until: 2026-08-12
+---
+# Workflow packs
+> **Status:** beta — first draft 2026-05-16 (Phase 2 Item 7 of
+> `step-15-product-refinement`).
+A **workflow pack** bundles a `(profile + preset + command-set +
+skill-allowlist)` combination into a single YAML so a user can adopt
+the full opinionated stance for their role without picking five
+independent settings.
+Packs do **not** introduce new commands, skills, or rules. They are
+a **composition contract** — every reference must resolve to an
+existing artefact that has already passed its own contract / linter
+gates.
+## Schema
+```yaml
+# .agent-src.uncompressed/packs/<pack-id>.yml
+pack:
+  id: <pack-id>                      # kebab-case, file name without .yml
+  audience:
+    label: "<human-readable>"
+    one_liner: "<= 120 chars, what the pack does for the user>"
+  composition:
+    profile_id: <profile.id>         # MUST exist in profiles/
+    preset_id: <preset.id>           # MUST exist in presets/
+  surface:
+    commands_allowed:                # ≤ 12 — slash-command names without leading slash
+      - <command>
+    skills_allowed:                  # ≤ 15 — skill IDs from skills-catalog
+      - <skill>
+    personas:                        # ≤ 4 — persona IDs from personas/
+      - <persona>
+  rationale:                         # why this combination, not free-form notes
+    why_this_profile: "<one paragraph>"
+    why_this_preset: "<one paragraph>"
+    why_these_commands: "<one paragraph>"
+```
+### Field semantics
+| Field | Type | Required | Notes |
+|---|---|:-:|---|
+| `pack.id` | string | yes | Matches file stem. No collision with `profile.id` or `preset.id`. |
+| `composition.profile_id` | string | yes | Override applied to the chain documented in [`profile-system`](profile-system.md). Pack-supplied id wins over `.agent-settings.yml` only when the user explicitly opts in via `/onboard --pack <id>`. |
+| `composition.preset_id` | string | yes | Override applied to the chain documented in [`config-presets`](config-presets.md). Same opt-in semantics. |
+| `surface.commands_allowed` | list[string] | yes | Cap = **12**. Items must appear in [`command-clusters`](command-clusters.md). The pack does **not** disable other commands — the cap is for the wizard's first-screen rendering, not enforcement. |
+| `surface.skills_allowed` | list[string] | yes | Cap = **15**. Items must appear in `docs/skills-catalog.md`. Same render-only semantics. |
+| `surface.personas` | list[string] | yes | Cap = **4**. Items must appear in `.agent-src.uncompressed/personas/`. |
+| `rationale.*` | string | yes | Forces every pack to justify its composition in plain prose; reviewed at PR time, not at runtime. |
+## Resolution chain
+Packs are an **opt-in layer above the profile + preset chain**. The
+loader at `scripts/config/packs.py` (Phase 2 deliverable — not yet
+shipped) reads the pack iff:
+1. `--pack <id>` flag passed to `/onboard` or `agent-config init`, **or**
+2. `pack.id` set in `.agent-settings.yml` (written by `/onboard --pack`).
+When a pack is active:
+- `composition.profile_id` is passed to `profiles.load()` as
+  `pack_profile_id` (already wired — see `scripts/config/profiles.py`).
+- `composition.preset_id` is passed to `presets.load()` analogously.
+- `surface.*` lists override the rendered command / skill lists in
+  `/onboard` and in the README "Six entry paths" surface **for the
+  duration of the active pack only**.
+Removing a pack (`/onboard --pack none`) reverts to the underlying
+profile + preset defaults; **no data is lost**.
+## Validation
+`scripts/lint_packs.py` (Phase 2 deliverable — not yet shipped) fails
+CI on:
+- Missing required field.
+- `profile_id` / `preset_id` / `commands_allowed` / `skills_allowed`
+  / `personas` referencing an artefact that does not exist.
+- Cap violation (commands > 12, skills > 15, personas > 4).
+- `pack.id` collision with another pack, profile, or preset id.
+Until the linter lands, packs are reviewed by hand at PR time against
+this schema.
+## What packs do **not** do
+- **Do not** declare new commands. Use [`command-clusters`](command-clusters.md).
+- **Do not** modify rules. Use the kernel-rule edit process.
+- **Do not** override safety floors. Domain-safety rules
+  (`.agent-src.uncompressed/rules/domain-safety-*.md`) apply
+  unconditionally — packs cannot widen the deny-list.
+- **Do not** ship telemetry or usage hints. Packs are pure composition.
+## Seed packs
+Three packs ship at Phase 2 Item 7 close:
+| Pack id | Profile | Preset | One-liner |
+|---|---|---|---|
+| `founder-mvp` | `founder` | `fast` | Ship the MVP and the pitch deck in the same week. |
+| `content-engine` | `content_creator` | `balanced` | Editorial calendar, brand voice, and ghostwriter on one loop. |
+| `agency-delivery` | `agency` | `strict` | Multi-client refine → estimate → deliver with audit-grade trace. |
+Each pack lives at `.agent-src.uncompressed/packs/<id>.yml` and is
+covered by the validation rules above.
+## See also
+- [`profile-system`](profile-system.md) — profile axis (audience defaults)
+- [`config-presets`](config-presets.md) — preset axis (risk appetite)
+- [`command-clusters`](command-clusters.md) — verb axis (invocation)
+- [`command-taxonomy`](command-taxonomy.md) — discoverability axis
+- `step-15-product-refinement` § Phase 2 Item 7
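The validation rules listed in the contract above (required fields, reference resolution, caps, id collisions) could be checked roughly as follows. The contract states that `scripts/lint_packs.py` has not shipped, so this is only a sketch of the rules as written, not that linter. The function name, the `known` lookup structure, and the example ids in `KNOWN` and `EXAMPLE_PACK` are assumptions; the pack is taken as an already-parsed mapping so that no YAML library is needed.

```python
# Caps as stated in the contract's field-semantics table.
CAPS = {"commands_allowed": 12, "skills_allowed": 15, "personas": 4}


def validate_pack(pack: dict, known: dict, taken_ids: set) -> list:
    """Return a list of error strings; an empty list means the pack passes.

    pack      -- the mapping under the top-level `pack:` key
    known     -- hypothetical lookup: kind -> set of existing ids
    taken_ids -- ids already used by other packs, profiles, or presets
    """
    errors = []
    pack_id = pack.get("id")
    if not pack_id:
        errors.append("missing pack.id")
    elif pack_id in taken_ids:
        errors.append(f"pack.id collides with existing id {pack_id!r}")

    composition = pack.get("composition", {})
    for field, kind in (("profile_id", "profiles"), ("preset_id", "presets")):
        value = composition.get(field)
        if value is None:
            errors.append(f"missing composition.{field}")
        elif value not in known[kind]:
            errors.append(f"composition.{field}: unknown id {value!r}")

    surface = pack.get("surface", {})
    for field, kind in (("commands_allowed", "commands"),
                        ("skills_allowed", "skills"),
                        ("personas", "personas")):
        items = surface.get(field)
        if items is None:
            errors.append(f"missing surface.{field}")
            continue
        if len(items) > CAPS[field]:
            errors.append(f"surface.{field}: {len(items)} items exceeds cap {CAPS[field]}")
        errors.extend(f"surface.{field}: unknown id {item!r}"
                      for item in items if item not in known[kind])

    rationale = pack.get("rationale", {})
    for key in ("why_this_profile", "why_this_preset", "why_these_commands"):
        if not rationale.get(key):
            errors.append(f"missing rationale.{key}")
    return errors


# Illustrative data only; the persona id is invented for the example.
KNOWN = {
    "profiles": {"founder", "agency", "content_creator"},
    "presets": {"fast", "balanced", "strict"},
    "commands": {"work", "research"},
    "skills": {"po-discovery"},
    "personas": {"example-persona"},
}
EXAMPLE_PACK = {
    "id": "founder-mvp",
    "composition": {"profile_id": "founder", "preset_id": "fast"},
    "surface": {"commands_allowed": ["work", "research"],
                "skills_allowed": ["po-discovery"],
                "personas": ["example-persona"]},
    "rationale": {"why_this_profile": "x", "why_this_preset": "y",
                  "why_these_commands": "z"},
}
```

Calling `validate_pack(EXAMPLE_PACK, KNOWN, taken_ids=set())` returns an empty list. Collecting every error instead of stopping at the first one suits hand review at PR time, which is how the contract says packs are checked until the linter lands.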

package/docs/decisions/ADR-010-profile-pack-preset-boundary.md ADDED

@@ -0,0 +1,132 @@
+---
+adr: 010
+status: proposed
+date: 2026-05-16
+decision: profile-pack-preset-boundary
+supersedes: —
+superseded_by: —
+phase: v2.x · step-15 Phase 1 prerequisite
+---
+# ADR-010 — Profile / Pack / Preset Boundary
+## Status
+**Proposed** · 2026-05-16 · pending Phase 1 of
+[`agents/roadmaps/step-15-product-refinement.md`](../../agents/roadmaps/step-15-product-refinement.md).
+Council v3 action #2 (`agents/council-responses/2026-05-16-step-15-product-refinement-v3.json`): <!-- council-ref-allowed: ADR decision-trace to originating council response -->
+**"Profile / Pack / Preset boundary is undefined; Phase 2 will duplicate
+Phase 1 abstractions"**. Promoted from Phase 2 to Phase 1 prerequisite —
+the profile loader (Phase 1 item 1) cannot ship without the boundary.
+## Context
+Step-15 introduces three new configuration concepts:
+- **Profile** — Phase 1 item 1: `profile.id` ∈ {`founder`, `developer`,
+  `content_creator`, `agency`, `finance`, `ops`}.
+- **Preset** — Phase 1 item 4: `preset.id` ∈ {`fast`, `balanced`,
+  `strict`} bundling 12+ governance knobs (cost caps, confidence band,
+  block-on-risk, …).
+- **Pack** — Phase 2 item 7: workflow bundles (`founder-mvp`,
+  `content-engine`, `agency-delivery`) of `(profile + preset +
+  command-set + skill-allowlist)`.
+A pre-existing fourth concept is in play:
+- **`cost_profile`** — current setting in `.agent-settings.yml`, values
+  `minimal` / `balanced` / `full` / `custom`. Owns **rule-tier loading**
+  (kernel · kernel + tier-1 · kernel + tier-1 + tier-2). Contract:
+  [`docs/contracts/cost-profile-defaults.md`](../contracts/cost-profile-defaults.md).
+Without a written boundary, three failure modes are predictable:
+1. The preset loader re-implements rule-tier gating (overlap with
+   `cost_profile`).
+2. Packs ship duplicate `profile` + `preset` defaults that drift from
+   the canonical source.
+3. Three teams add knobs to three places, and a user picking
+   `developer + strict + founder-mvp` discovers contradicting values
+   at runtime.
+## Decision
+Four orthogonal axes, four owners, one resolution chain.
+| Axis | Answers | Owns | Identity key |
+|---|---|---|---|
+| **Profile** | *Who is the user?* (audience taxonomy) | Default skill/command surface; README entry-paragraph; persona pre-selection | `profile.id` |
+| **Preset** | *How cautious is this run?* (risk + cost + autonomy budget) | The 12+ governance knobs (per-call $ ceiling, confidence band, block-on-risk, autonomy default, council escalation, …) | `preset.id` |
+| **Pack** | *What bundle of skills + commands?* (workflow recipe) | A frozen `(profile, preset, allow_skills, allow_commands)` 4-tuple; nothing more | `pack.id` |
+| **Cost Profile** | *How many rules load?* (token budget) | Rule-tier loading at session start (kernel · +tier-1 · +tier-2) | `cost_profile` |
+### Resolution chain (read order, last writer wins)
+```
+pack  →  profile  →  preset  →  cost_profile  →  user/env/runtime overrides
+```
+- A **pack** declares defaults for `profile`, `preset`, and the
+  skill / command allowlists. It cannot set `cost_profile` (that
+  axis belongs to the rule-tier loader and is governed separately).
+- A **profile** declares defaults for `preset`, audience-specific
+  README pointer, persona pre-selection. It cannot set any preset
+  knob directly — only `preset.id`.
+- A **preset** owns the 12+ knobs. No other axis writes them.
+- A **cost_profile** owns rule-tier loading. No other axis writes it.
+- The user's `.agent-settings.yml`, environment variables, and
+  runtime CLI flags override every axis above them.
+### Non-overlap rules (Iron Law)
+```
+A KNOB BELONGS TO EXACTLY ONE AXIS.
+DUPLICATION ACROSS AXES IS A CONTRACT VIOLATION.
+```
+- A pack **may not** override a preset knob; it overrides `preset.id`.
+- A profile **may not** override a preset knob; it overrides `preset.id`.
+- A preset **may not** override `cost_profile`; the user does that.
+- The CI `task lint-config-schema` (added in Phase 1) hard-fails on a
+  pack/profile YAML that names any preset-owned knob.
+## Consequences
+### Positive
+- Phase 1 ships the profile loader against a fixed surface (`profile.id`
+  → audience + `preset.id` + persona). No 12-knob inheritance ambiguity.
+- Phase 1 item 4 (Config Presets) owns the knobs alone. The "Cost
+  Enforcement" section in [`config-presets.md`](../contracts/config-presets.md)
+  has a single home.
+- Phase 2 item 7 (Workflow Packs) is a 4-tuple, not a re-implementation
+  of profile + preset. Pack YAML stays under 30 lines.
+- `cost_profile` keeps its single-axis charter; this ADR explicitly
+  refuses to fold it into the preset layer.
+### Negative
+- One more concept on the install screen (`profile` + `preset` + `pack`
+  + `cost_profile` = four axes). Mitigated by: the wizard (Phase 1 item
+  2) only asks for **profile** + **stack** + **risk appetite** and
+  derives the rest. Packs are opt-in; `cost_profile` keeps its
+  `balanced` default.
+- A skill-allowlist conflict between a pack and a runtime CLI flag is
+  resolved by "runtime wins". Users on a pack who shadow-disable a
+  skill will not see it again until the override is removed.
+### Neutral
+- This ADR records the boundary; it does **not** specify the seed
+  values for any axis. Profile IDs live in
+  [`docs/contracts/profile-system.md`](../contracts/profile-system.md)
+  (Phase 1 item 1). Preset knobs live in
+  [`docs/contracts/config-presets.md`](../contracts/config-presets.md)
+  (Phase 1 item 4). Pack shape lives in `docs/contracts/workflow-packs.md`
+  (Phase 2 item 7).
+## See also
+- [`docs/contracts/cost-profile-defaults.md`](../contracts/cost-profile-defaults.md) — the existing `cost_profile` contract this ADR explicitly does **not** touch.
+- [`agents/roadmaps/step-15-product-refinement.md`](../../agents/roadmaps/step-15-product-refinement.md) — Phase 1 items 1, 4 and Phase 2 item 7.
+- [`agents/council-responses/2026-05-16-step-15-product-refinement-v3.json`](../../agents/council-responses/2026-05-16-step-15-product-refinement-v3.json) — Council v3 action #2 (origin). <!-- council-ref-allowed: ADR decision-trace to originating council response -->
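Two mechanics from the ADR above lend themselves to a short illustration: the "last writer wins" read order and the non-overlap rule that a pack or profile may name `preset.id` but never a preset-owned knob. The following Python sketch assumes flat key/value layers and invents the knob names in `PRESET_OWNED_KNOBS`; it is neither the loader in `scripts/config/` nor the `task lint-config-schema` check, and it does not model how `preset.id` selects which preset file is loaded.

```python
# Hypothetical knob names; the ADR only says the preset owns "12+" knobs.
PRESET_OWNED_KNOBS = {"per_call_ceiling_usd", "confidence_band", "block_on_risk"}


def lint_layer(axis: str, layer: dict) -> list:
    """Return the preset-owned knobs that a pack or profile layer wrongly names."""
    if axis in ("pack", "profile"):
        return sorted(PRESET_OWNED_KNOBS.intersection(layer))
    return []


def resolve(layers: list) -> dict:
    """Merge (axis, mapping) pairs in read order; a later layer overrides an earlier one."""
    resolved = {}
    for _axis, layer in layers:
        resolved.update(layer)
    return resolved


# Read order from the ADR: pack, profile, preset, cost_profile, user overrides.
EXAMPLE_LAYERS = [
    ("pack", {"profile_id": "founder", "preset_id": "fast"}),
    ("profile", {"persona": "example-persona"}),
    ("preset", {"confidence_band": "wide", "block_on_risk": False}),
    ("cost_profile", {"cost_profile": "balanced"}),
    ("user", {"block_on_risk": True}),
]
```

With `EXAMPLE_LAYERS`, `resolve` keeps the pack's `preset_id` and the preset's `confidence_band`, while the user's `block_on_risk` replaces the preset's value. `lint_layer("pack", {"preset_id": "fast", "block_on_risk": True})` flags `block_on_risk`, which is the kind of duplication the Iron Law forbids.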

package/docs/decisions/INDEX.md CHANGED

@@ -13,6 +13,7 @@ _Auto-generated by `scripts/adr/regenerate_index.py`. Do not edit._
 | [ADR-007](ADR-007-agent-discovery-scopes.md) | Global Default Install With Export Subcommand | accepted | 2026-05-12 | — |
 | [ADR-008](ADR-008-installed-tools-manifest.md) | Committed Installed Tools Manifest Separate From Settings | proposed | 2026-05-12 | — |
 | [ADR-009](ADR-009-event4u-namespace.md) | Event4U Namespace And Claude Desktop Zip Bundles | accepted | 2026-05-13 | — |
+| [ADR-010](ADR-010-profile-pack-preset-boundary.md) | Profile Pack Preset Boundary | proposed | 2026-05-16 | — |
 ## Unnumbered (legacy)

package/docs/featured-commands.md ADDED

@@ -0,0 +1,27 @@
+# Featured Commands
+A curated subset of the 124 active commands. Full set lives in
+[`.agent-src/commands/`](../.agent-src/commands/) — see also
+[`docs/catalog.md`](catalog.md) for the complete index.
+## For developers
+| Command | What it does |
+|---|---|
+| [`/implement-ticket`](../.agent-src/commands/implement-ticket.md) | Drive a Jira / Linear ticket end-to-end through refine → plan → implement → test → verify |
+| [`/work`](../.agent-src/commands/work.md) | Same end-to-end loop for a free-form prompt — confidence-band gated |
+| [`/fix ci`](../.agent-src/commands/fix.md) | Fetch CI failures from GitHub Actions and fix them |
+| [`/review-changes`](../.agent-src/commands/review-changes.md) | Self-review local changes before creating a PR (five judges) |
+| [`/create-pr`](../.agent-src/commands/create-pr.md) | Create a GitHub PR with Jira-linked description |
+## For everyone
+| Command | What it does |
+|---|---|
+| [`/research`](../.agent-src/commands/research.md) | Survey / benchmark / competitive scan scaffolder — picks objects, defines fields |
+| [`po-discovery`](../.agent-src/skills/po-discovery/SKILL.md) | Shape a fuzzy product ask into a refined backlog item — problem framing, AC tightening |
+| [`/ghostwriter:write`](../.agent-src/commands/ghostwriter/write.md) | Draft in a public-figure voice profile (mandatory disclosure footer) |
+| [`/challenge-me`](../.agent-src/commands/challenge-me.md) | Interactive grill-style interview that sharpens a fuzzy plan into a copyable pitch |
+| [`/fundraising-narrative`](../.agent-src/skills/fundraising-narrative/SKILL.md) | Shape a capital-raise pitch — why-now / why-us / why-this framing |
+→ [Browse all 124 active commands](../.agent-src/commands/)

package/docs/parity/bench-ruflo.json ADDED

@@ -0,0 +1,58 @@
+{
+  "schema": "parity-bench-ruflo-v1",
+  "status": "infrastructure_ready_awaiting_corpus_run",
+  "owner_roadmap": "agents/roadmaps/step-11-ruflo-parity.md",
+  "parity_doc": "docs/parity/ruflo.md",
+  "parent_bench": "docs/parity/bench.json",
+  "claim_under_test": {
+    "source": "agents/audit-2026-05-14-north-star/external-findings.md § 2",
+    "headline": "Average dollar cost per 25-prompt corpus run, separated by model tier (Haiku / Sonnet / Opus) and by token class (input / output / cache-read / cache-write).",
+    "comparison_target": "ruflo cost-tracker README (claimed upstream, not yet pulled into this repo)",
+    "type": "claimed_upstream_not_verified_in_repo"
+  },
+  "measurement_protocol": {
+    "corpus": "bench/corpus/* (25-prompt corpus owned by step-4-measurement-and-benchmark.md)",
+    "tracker": "scripts/cost/track.mjs",
+    "pricing": "bench/pricing.yaml",
+    "session_source": "~/.claude/projects/*/sessions/*.jsonl (Claude Code-native, no manual tracking)",
+    "tokens_to_dollars": "track.mjs multiplies input/output/cache-read/cache-write tokens by per-1M pricing from bench/pricing.yaml, separated by model id",
+    "headline_output": "average dollar cost per 25-prompt run, with min / max / p50 / p90 across N reports"
+  },
+  "current_window": {
+    "report_count": 0,
+    "verdict": "awaiting_first_corpus_run",
+    "notes": "Phase 1-5 of step-11 delivered the cost-tracking and bench infrastructure. Phase 6 Step 2 awaits the first end-to-end 25-prompt corpus run against the live tracker. Until then this file exists as a methodology contract, not a verdict surface."
+  },
+  "soak_inheritance": {
+    "follows": "docs/parity/bench.json",
+    "min_days": 60,
+    "min_reports": 30,
+    "earliest_flip": "2026-07-15",
+    "arbiter_command": "task bench:baseline-ready",
+    "notes": "bench-ruflo.json flips status to 'baseline_ready' only after the parent bench.json flips. No independent soak window — same corpus, same arbiter."
+  },
+  "redundancy_verdict": {
+    "status": "pending",
+    "criterion": "Once bench.json soak completes, this verdict is set by comparing the dollar cost in current_window vs ruflo's published table.",
+    "outcome_branches": {
+      "redundant": "Our cost-per-25-prompt-run sits within Ruflo's published range (or beats it). G5 redundancy gate row for cost surface flips green.",
+      "behind": "Our cost-per-run > Ruflo's. Follow-up issue filed; G5 stays open."
+    }
+  },
+  "fields_pending_first_run": [
+    "current_window.avg_cost_per_run_usd",
+    "current_window.cost_by_model.haiku_usd",
+    "current_window.cost_by_model.sonnet_usd",
+    "current_window.cost_by_model.opus_usd",
+    "current_window.cost_by_class.input_usd",
+    "current_window.cost_by_class.output_usd",
+    "current_window.cost_by_class.cache_read_usd",
+    "current_window.cost_by_class.cache_write_usd"
+  ],
+  "decisions_pending": {},
+  "_meta": {
+    "created": "2026-05-16",
+    "created_by": "step-11-ruflo-parity.md Phase 6 Step 2",
+    "spec": "scripts/cost/track.mjs --bench-ruflo (planned wiring); for now the file is a methodology contract"
+  }
+}
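The `tokens_to_dollars` step in the measurement protocol above (token counts multiplied by per-1M prices, split by token class and model) reduces to simple arithmetic. The sketch below is in Python rather than the `.mjs` of the actual tracker and does not reproduce `scripts/cost/track.mjs`. The model id and every price in `PRICING_PER_1M` are placeholders, not values from `bench/pricing.yaml`, and the function names are hypothetical.

```python
from statistics import mean, median

# Placeholder USD prices per 1M tokens; NOT real pricing data.
PRICING_PER_1M = {
    "example-model": {
        "input": 1.00,
        "output": 5.00,
        "cache_read": 0.10,
        "cache_write": 1.25,
    },
}


def cost_usd(model: str, tokens: dict) -> float:
    """Dollar cost of one session: tokens per class times the per-1M rate."""
    rates = PRICING_PER_1M[model]
    return sum(tokens.get(cls, 0) * rate / 1_000_000 for cls, rate in rates.items())


def summarise(run_costs: list) -> dict:
    """Headline figures over several corpus runs (mean, min, max, p50)."""
    return {
        "avg": mean(run_costs),
        "min": min(run_costs),
        "max": max(run_costs),
        "p50": median(run_costs),
    }


# 200k input + 50k output + 1M cache-read tokens at the placeholder rates:
# 0.20 + 0.25 + 0.10 USD.
example = cost_usd("example-model",
                   {"input": 200_000, "output": 50_000, "cache_read": 1_000_000})
```

The p90 figure named in `headline_output` is left out here because its interpolation method is not specified in the file.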

package/docs/parity/bench.json ADDED

@@ -0,0 +1,41 @@
+{
+  "schema": "parity-bench-v1",
+  "status": "soak_in_progress",
+  "owner_roadmap": "agents/roadmaps/step-4-measurement-and-benchmark.md",
+  "contract": "docs/contracts/measurement-baseline.md",
+  "soak": {
+    "start_date": "2026-05-16",
+    "min_days": 60,
+    "min_reports": 30,
+    "earliest_flip": "2026-07-15",
+    "arbiter_command": "task bench:baseline-ready"
+  },
+  "current_window": {
+    "report_count": 1,
+    "days_elapsed": 0,
+    "verdict": "warmup",
+    "notes": "First infrastructure-shakedown run only. Numbers below are not the baseline."
+  },
+  "shakedown_run": {
+    "report": "bench/reports/2026-05-16T06-13-07Z-dev-projection.md",
+    "corpus": "dev",
+    "selection_accuracy_augment": 0.5,
+    "selection_accuracy_claude": 0.5,
+    "projection_fidelity_claude": 1.0,
+    "projection_fidelity_cursor": "not_applicable",
+    "projection_fidelity_cline": "not_applicable",
+    "projection_fidelity_windsurf": "not_applicable"
+  },
+  "decisions_pending": {
+    "compression_default": {
+      "kill_criterion": "docs/contracts/compression-default-kill-criterion.md",
+      "verdict": "deferred_until_baseline_closes",
+      "decision_owner": "step-4 closeout phase"
+    }
+  },
+  "downstream_consumers": [
+    "agents/roadmaps/step-99-north-star-restructure.md#G1",
+    "agents/roadmaps/step-2-skill-inventory-rationalization.md#G0",
+    "docs/contracts/compression-default-kill-criterion.md"
+  ]
+}
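The `soak` block above fixes `earliest_flip` by date arithmetic: 2026-05-16 plus 60 days is 2026-07-15. A minimal Python sketch of that calculation and of a readiness test follows. It assumes both the day floor and the report floor must be met, as the "min 60 days, ≥ 30 reports" wording in the parity doc suggests. The function names are hypothetical and the real logic behind `task bench:baseline-ready` is not shown in this diff.

```python
from datetime import date, timedelta


def earliest_flip(start: date, min_days: int) -> date:
    """First date on which the soak window can close."""
    return start + timedelta(days=min_days)


def baseline_ready(today: date, start: date, min_days: int,
                   min_reports: int, report_count: int) -> bool:
    """True only when both the day floor and the report floor are met (assumed rule)."""
    days_elapsed = (today - start).days
    return days_elapsed >= min_days and report_count >= min_reports


SOAK_START = date(2026, 5, 16)
FLIP = earliest_flip(SOAK_START, 60)  # date(2026, 7, 15), matching "earliest_flip"
```

On the recorded state (`report_count: 1`, `days_elapsed: 0`) the check is false on both counts, which is consistent with the `warmup` verdict.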

package/docs/parity/ruflo.md ADDED

@@ -0,0 +1,46 @@
+# Parity verdict — Ruflo
+> Per-row verdict against the nine Ruflo measurement-governance patterns
+> catalogued in
+> [`external-findings.md § 2`](../../agents/audit-2026-05-14-north-star/external-findings.md).
+> Owner roadmap: [`step-11-ruflo-parity.md`](../../agents/roadmaps/step-11-ruflo-parity.md)
+> (Phase 6 Step 1). Cross-index lives at
+> [`step-99-north-star-restructure.md`](../../agents/roadmaps/step-99-north-star-restructure.md)
+> Phase 5 Step 2.
+>
+> **Verdict legend:** `[x] covered by <file:line>` · `[~] superseded by <approach>` · `[!] gap`.
+> **Acceptance:** zero `[!]` rows. Closure flips the corresponding cell in the
+> [composite scorecard](../../agents/audit-2026-05-14-north-star/external-findings.md#5-composite-scorecard--agent-config-vs-the-field)
+> `vs Ruflo` column from `–` to `=` or `+`.
+**Measured-vs-claimed disclaimer:** Each row cites the **mechanism** that
+covers Ruflo's pattern. Numbers attached to those mechanisms (cost figures,
+smoke baselines, ADR count) are claimed until the 25-prompt bench corpus
+soak in [`bench.json`](bench.json) flips from `warmup` to `baseline_ready`
+(min 60 days, ≥ 30 reports — earliest 2026-07-15).
+## Verdict table
+| # | Ruflo pattern | Verdict | Evidence |
+|---|---|---|---|
+| 1 | **Cost-tracker plugin** — real model pricing, per-1M, separated input/output/cache | `[x] covered by` | [`scripts/cost/track.mjs`](../../scripts/cost/track.mjs) + [`bench/pricing.yaml`](../../bench/pricing.yaml) (Haiku/Sonnet/Opus per-1M, input/output/cache-read/cache-write split). Step-11 Phase 1. |
+| 2 | **Auto-capture from session jsonl** — reads Claude Code log, no manual tracking | `[x] covered by` | [`scripts/cost/track.mjs`](../../scripts/cost/track.mjs) reads `~/.claude/projects/*/sessions/*.jsonl` automatically. Step-11 Phase 1 Step 1. |
+| 3 | **50/75/90/100 % budget ladder with hard stop** | `[x] covered by` | [`scripts/cost/budget.mjs`](../../scripts/cost/budget.mjs) — exit codes 0/1/2/3 per tier; opt-in fail-closed via `cost.enforcement` setting. Fixtures: `tests/fixtures/cost/budget/{under-50,at-100,over-100}/`. Step-11 Phase 2. |
+| 4 | **Measured-vs-claimed disclaimer** — every percentage tagged "claimed upstream" | `[x] covered by` | One-line `**Measured-vs-claimed disclaimer:**` header block on all 9 active roadmaps in `agents/roadmaps/`. Verified 2026-05-16. Step-11 Phase 5 Step 4. |
+| 5 | **Smoke test as contract** — `bash scripts/smoke.sh` with declared baseline | `[x] covered by` | Four per-tier smoke scripts: [`scripts/smoke/kernel.sh`](../../scripts/smoke/kernel.sh), [`router.sh`](../../scripts/smoke/router.sh), [`schema.sh`](../../scripts/smoke/schema.sh), [`skills.sh`](../../scripts/smoke/skills.sh). Declared baselines in [`docs/contracts/smoke-contracts.md`](../contracts/smoke-contracts.md). CI gate: [`.github/workflows/smoke.yml`](../../.github/workflows/smoke.yml). Step-11 Phase 3. |
+| 6 | **Per-plugin ADR directory** — `docs/adrs/0001-*.md` co-located with subsystem | `[x] covered by` | Six bootstrap ADRs under [`docs/adrs/{cost,memory,router,schema,smoke,caveman}/`](../adrs/). Coverage gate: [`scripts/audit_adr_coverage.py`](../../scripts/audit_adr_coverage.py) (`task lint-adr-coverage`). Contract: [`docs/contracts/adr-layout.md`](../contracts/adr-layout.md). Step-11 Phase 4. |
+| 7 | **Namespace contract** — `<stem>-<intent>` kebab-case, reserved-names list | `[x] covered by` | [`scripts/lint_namespace.py`](../../scripts/lint_namespace.py) enforces shape + length floors + reserved-names + skill-dir-matches-name across 430 names · 0 issues. Contract: [`docs/contracts/namespace.md`](../contracts/namespace.md). CI gate: `task lint-namespace`. Step-11 Phase 5 Step 1. |
+| 8 | **Topology choices in swarm** — `hierarchical / mesh / star / adaptive` with anti-drift defaults | `[x] covered by` | [`.agent-src.uncompressed/skills/subagent-orchestration/SKILL.md`](../../.agent-src.uncompressed/skills/subagent-orchestration/SKILL.md) `Topology hints` subsection — 7-row table mapping each mode to topology + Ruflo anti-drift default (`hierarchical, 6–8 agents, raft consensus`). Step-11 Phase 5 Step 2. |
+| 9 | **MCP-tool count + source-line refs** — every tool with `<file>:<line>` citation | `[x] covered by` | [`docs/contracts/mcp-tool-inventory.md`](../contracts/mcp-tool-inventory.md) — 20 tools (9 stdio-implemented · 11 discovery stubs) each with catalog `<file>:<line>` + handler `<file>:<line>`. Generator: [`scripts/audit_mcp_tools.py`](../../scripts/audit_mcp_tools.py). CI drift gate: `task lint-mcp-inventory`. Step-11 Phase 5 Step 3. |
+## Open `[!]` rows
+**Zero.** Every Ruflo pattern is mechanism-covered. Numbers behind those
+mechanisms remain claimed until [`bench.json`](bench.json) soak completes
+(see disclaimer above).
+## Cross-references
+- Composite scorecard refresh: owned by [`step-99-north-star-restructure.md`](../../agents/roadmaps/step-99-north-star-restructure.md) Phase 5 Step 4 (replaces [`external-findings.md § 5`](../../agents/audit-2026-05-14-north-star/external-findings.md)).
+- Bench-ruflo redundancy verdict: [`bench-ruflo.json`](bench-ruflo.json) (step-11 Phase 6 Step 2).
+- G5 redundancy gate cite: step-99 Acceptance Criteria row "G5 — external redundancy (Domination Mandate)".