@event4u/agent-config 6.0.0 → 6.1.0

Files changed (378)

package/docs/decisions/ADR-093-ai-council-config-user-global.md ADDED

@@ -0,0 +1,111 @@
+---
+adr: 093
+status: accepted
+date: 2026-06-13
+decision: ai-council-config-user-global
+supersedes: —
+superseded_by: —
+phase: ai-council-config-relocation
+type: structural
+---
+# ADR-093 — Relocate the AI-council config to the user-global namespace
+## Status
+**Accepted** · 2026-06-13. Maintainer directive (standing, restated): the
+council is a per-developer facility and must be configured once, globally,
+not re-declared per project.
+## Context
+The council config (`.ai-council.yml`) was a **project-tracked** file at
+`agents/settings/.ai-council.yml`, checked into this repository (it held the
+maintainer's enabled members + `file:`/`env:` key references). Resolution was
+anchored on the project root (`council_cli.py:AI_COUNCIL_FILE = REPO_ROOT /
+"agents" / "settings" / ".ai-council.yml"`; `cmd_doctor.py` recomputed the
+same path).
+Three defects followed from the project-tracked layout:
+1. **Per-project, not per-developer.** A single developer who wants the
+   council everywhere had to drop the file into every project — exactly the
+   opposite of "configure once".
+2. **Commit / leak risk.** The config sat in the tracked tree of a public
+   package; only the `file:`/`env:` indirection (raw keys are refused by
+   `_validate_api_key_ref`) kept secrets out.
+3. **Silent unavailability.** On any surface without a project copy — cloud /
+   headless / a fresh checkout / a different worktree — resolution found
+   nothing and the council refused with "ai_council.enabled is false", even
+   when the developer had set it up. The raw API keys already lived in the
+   user-global namespace (`~/.event4u/agent-config/<provider>.key`, resolved
+   by `resolve_api_key`); only the config that points at them did not.
+The user-global namespace helper (`scripts/_lib/user_global_paths.py`,
+`event4u_root()` → `~/.event4u/agent-config/`, with the legacy
+`~/.config/agent-config/` read-fallback) already underpinned key resolution,
+so the config had a natural home there.
+## Decision
+**The council config is user-global by default.** A single
+`resolve_config_path(project_root)` in `scripts/ai_council/config.py` is the
+one place that decides which file is read, with this precedence
+(first match wins):
+1. `$AI_COUNCIL_CONFIG` — explicit absolute path (tests / power users);
+   honoured even when absent, so typos surface as "create it here".
+2. Project-local `<project_root>/agents/settings/.ai-council.yml` — a
+   consumer project that deliberately checks in its own config; overrides
+   the user-global file for that project only.
+3. User-global `~/.event4u/agent-config/settings/.ai-council.yml` (legacy
+   `~/.config/agent-config/` read-fallback) — the canonical default.
+When none exists, the resolver returns the user-global write target so
+callers' `.exists()` gate and "create it at …" messaging both point at the
+global location.
+`council_cli.py` and `cmd_doctor.py` both route through this resolver. The
+tracked `agents/settings/.ai-council.yml` is removed from the repository; the
+documented shape ships as `agents/templates/.ai-council.yml.example` to copy
+from. The maintainer's live config now lives at
+`~/.event4u/agent-config/settings/.ai-council.yml`.
+Project-local override is **kept** (not removed) so a consumer team can still
+pin a shared council config in their own repo — but it is no longer the
+default, and this package no longer ships one in its tracked tree.
+## Consequences
+- The council now resolves from the user-global file in every project,
+  worktree, and CWD — verified: a worktree with no project copy resolves
+  `~/.event4u/agent-config/settings/.ai-council.yml` and reports `members=2`.
+- No council config can be silently committed to this (or any consumer)
+  public repo by default.
+- `doctor council-cli` now names the user-global path in its
+  "no council config" / "config invalid" messaging.
+- The test suite is made hermetic via `EVENT4U_CONFIG_HOME` sandboxing so
+  it never reads the developer's real global config (it otherwise would,
+  now that "no project file" falls through to global).
+- **Not done here (follow-up):** auto-scaffolding the global file from the
+  example template during `agent-config` install/setup. Resolution +
+  manual/templated placement satisfy "the package always uses it"; an
+  installer step that writes the example to the global path when absent is
+  a nice-to-have tracked separately to keep `install.py` out of this diff.
+## Alternatives
+- **Global always wins, ignore project files.** Rejected — it would break a
+  consumer team's legitimately checked-in shared council config. Removing
+  this package's tracked copy already gives the maintainer global-everywhere
+  behaviour without taking the override away.
+- **Keep the file project-tracked, just gitignore it.** Rejected — still
+  per-project, still absent on fresh/cloud surfaces; does not deliver
+  "configure once per user".
+## References
+- Contract: [`docs/contracts/ai-council-config.md`](../contracts/ai-council-config.md) § File location.
+- `scripts/ai_council/config.py:resolve_config_path` — the resolver.
+- `scripts/_lib/user_global_paths.py` — user-global namespace + legacy fallback.
+- `agents/templates/.ai-council.yml.example` — the documented shape to copy.
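The precedence described in ADR-093 can be sketched as below. This is a minimal illustration, not the package's `resolve_config_path`: the injectable `env` and `home` parameters are assumptions added to keep the sketch testable, and the `settings/` subdirectory under the legacy root is assumed rather than confirmed by the diff. The real test suite sandboxes through `EVENT4U_CONFIG_HOME`, whose exact semantics are not shown here.

```python
import os
from pathlib import Path

CONFIG_NAME = ".ai-council.yml"


def resolve_config_path(project_root, env=None, home=None):
    """Return the council config path using the ADR-093 precedence.

    `env` and `home` are hypothetical injection points for this sketch.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. Explicit override: honoured even when the file is absent,
    #    so a typo surfaces in the caller's "create it here" message.
    explicit = env.get("AI_COUNCIL_CONFIG")
    if explicit:
        return Path(explicit)

    # 2. Project-local file that a consumer team deliberately checked in.
    project_local = Path(project_root) / "agents" / "settings" / CONFIG_NAME
    if project_local.exists():
        return project_local

    # 3. User-global file, the canonical default.
    user_global = home / ".event4u" / "agent-config" / "settings" / CONFIG_NAME
    if user_global.exists():
        return user_global

    # Legacy read-fallback (subdirectory layout is an assumption).
    legacy = home / ".config" / "agent-config" / "settings" / CONFIG_NAME
    if legacy.exists():
        return legacy

    # Nothing exists: return the user-global write target so the caller's
    # .exists() gate and its messaging both point at the global location.
    return user_global
```

Returning the write target instead of `None` keeps every caller on a single code path: the existence check fails, and the error message can name the path where the file should be created.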

package/docs/decisions/ADR-094-agent-memory-layer-removal.md ADDED

@@ -0,0 +1,94 @@
+---
+adr: 094
+status: accepted
+date: 2026-06-14
+decision: agent-memory-layer-removal
+supersedes: —
+superseded_by: —
+phase: memory-layer-cleanup
+type: structural
+---
+# ADR-094 — Remove the heavyweight agent-memory layer; keep file-first memory
+## Status
+**Accepted** · 2026-06-14. Resolved by three AI-council rounds
+(claude-sonnet-4-5 + gpt-4o) on 2026-06-14; maintainer-directed package
+optimization.
+## Context
+Engineering memory in this suite had **two layers**:
+- **Layer 1 — file-first (in this repo).** Curated YAML under
+  `agents/memory/<type>/` + agent-written `agents/memory/intake/*.jsonl`,
+  read through `scripts/memory_lookup.retrieve()`. Git-tracked, lintable,
+  redactable, vendor-neutral. No external infrastructure.
+- **Layer 2 — the `@event4u/agent-memory` companion package (separate repo).**
+  PostgreSQL + pgvector, an MCP server, Ebbinghaus decay, trust scoring, and
+  cross-project learning. Consumed optionally via a versioned cross-repo
+  contract; `scripts/memory_lookup.py` exposed an `operational_provider` seam,
+  `scripts/memory_status.py` probed for the package CLI, and the MCP tools
+  routed through it when "present".
+The package was unused, its integration roadmaps were already archived, and
+its external repository is being deleted. Native agent memory (Claude, Cursor)
+is improving on the exact axis — semantic cross-session recall — where the
+package competed, and its PostgreSQL + MCP runtime contradicted the suite's
+"no app runtime" positioning.
+## Decision
+1. **Remove Layer 2 entirely** from `agent-config` — the package binding and
+   the now-dead generic operational machinery it alone served:
+   `package_operational_provider` / `_cli_operational_provider`, the
+   `operational_provider` parameter, the `OperationalProvider` type, the
+   repo-vs-operational conflict rule, `Shadow` / `with_shadows` /
+   `shadowed_by`, the package-detection in `memory_status.py`, the MCP
+   `with_package` routing, and the `agent-memory-contract.md` contract doc.
+2. **Keep Layer 1** — file-first `retrieve()`, the `check_memory.py` redaction
+   gate, and the typed curated/intake store. This is *governance* (auditable,
+   portable, reviewable), a different job from native memory.
+3. **Simplify contract artefacts in-place, do not loudly break.** The internal
+   `retrieval-v1` schema + conformance suite were reduced to repo-only
+   (`source: "repo"`; no `operational` / `trust` / `shadowed_by`) rather than
+   version-bumped, because the only consumer (the package) is deleted.
+4. **Adopt MemSkill's write-time curation discipline** (Apache-2.0,
+   github.com/ViktorAxelsen/MemSkill) into the `memory-consolidation` skill —
+   dedupe before insert, split distinct facts, merge-and-preserve on update,
+   delete only on explicit contradiction, prefer no-op under uncertainty, skip
+   trivial/fleeting/speculative. The ML training/eval pipeline is **not**
+   adopted; MemSkill's own thesis (quality comes from write-time skills, not a
+   heavy storage substrate) reinforces the removal.
+## Consequences
+- One memory layer, file-backed, smaller surface, no infra prerequisite.
+- `memory_status.status()` is now a constant file-backend report; `health()`
+  still emits the v1 envelope shape so the MCP `memory_status` tool is stable.
+- No decay engine → committed memory is bounded manually (intake gitignored,
+  type narrowing, entry caps, archived-entry deletion — see
+  `road-to-memory-pipeline-consolidation.md`).
+- The memory ADR area (`docs/adrs/memory/`) is retired from
+  `audit_adr_coverage.py`; ADR 0001 there is marked superseded by this ADR.
+## Alternatives considered
+- **Freeze Layer 2 (leave inert).** Rejected — a dormant integration surface
+  advertising a capability the suite no longer has is misleading residue.
+- **Markdown-only (drop the YML layer too).** Rejected — the working
+  `retrieve()` machinery + the `check_memory.py` redaction gate are real
+  governance a schema-less markdown parser cannot cheaply replace; an 8-consumer
+  migration is untested risk for no gain (council, Option B).
+- **Revive Layer 2 later.** Gated: requires ≥2 funded consumer projects with a
+  named maintainer each + explicit PostgreSQL adoption.
+## References
+- `agents/roadmaps/archive/road-to-agent-memory-removal.md` — the executing roadmap (archived).
+- `agents/roadmaps/archive/road-to-memory-pipeline-consolidation.md` — follow-on
+  mining consolidation + size bounding.
+- `docs/adrs/memory/0001-consumer-side-snapshot.md` — superseded by this ADR.
+- `docs/guidelines/agent-infra/memory-access.md` — the surviving file-backed
+  retrieval contract.
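Decision 3 of ADR-094 reduces the retrieval schema to repo-only entries. A conformance check for that reduced shape might look roughly like the sketch below. The actual `retrieval-v1` schema and conformance suite are not part of this diff, so the function name `conformance_errors` and the sample entry fields (`id`, `body`) are hypothetical. Only `source: "repo"` and the three removed fields come from the ADR text.

```python
# Fields the ADR says no longer appear in a retrieval entry.
REMOVED_FIELDS = {"operational", "trust", "shadowed_by"}


def conformance_errors(entry):
    """Return a list of problems for one retrieval entry (empty when valid)."""
    errors = []
    # After the removal, the repo is the only remaining source.
    if entry.get("source") != "repo":
        errors.append("source must be 'repo'")
    # Any leftover Layer 2 field signals a stale producer.
    for name in sorted(REMOVED_FIELDS & set(entry)):
        errors.append(f"field '{name}' was removed with the operational layer")
    return errors


# Hypothetical entries, for illustration only.
clean = {"id": "inv-001", "source": "repo", "body": "Invoices are immutable."}
stale = {"id": "inv-002", "source": "operational", "trust": 0.7}
```

Checking for the removed fields explicitly, rather than only validating `source`, catches a producer that still emits Layer 2 metadata on an otherwise repo-sourced entry.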

package/docs/decisions/ADR-095-workspace-boundary-contract.md ADDED

@@ -0,0 +1,108 @@
+---
+adr: 095
+status: accepted
+date: 2026-06-14
+decision: workspace-boundary-contract
+supersedes: —
+superseded_by: —
+phase: v6.0.0 · final-readiness · Phase 4
+type: structural
+---
+# ADR-095 — Workspace boundary contract + import-edge drift check
+## Status
+**Accepted** · 2026-06-14. Lands Phase 4 of
+[`road-to-6.0.0-final-readiness`](../../agents/roadmaps/archive/road-to-6.0.0-final-readiness.md).
+Routed through the AI council (anthropic/claude-sonnet-4-5 + openai/gpt-4o,
+design mode, 2026-06-14; converged 2-round on "import-boundary linter as MVP
+with explicit semantic-drift disclaimer + escape-hatch").
+## Context
+The PR #489 / 6.0.0 review flagged that **"workspace" risks becoming the new
+"meta"** — a catch-all that absorbs every nearby concern until it owns
+everything and means nothing. The workspace is real and load-bearing (13
+`src/cli/python/workspace_*.py` modules behind `/work` and the host-drive
+loop), so the risk is concrete: a future change quietly teaches a workspace
+module to design skills, decide profile semantics, or pick video providers,
+and the boundary erodes one import at a time.
+The prior council finding (#3 on the parent roadmap) accepted a boundary
+contract **with modification**: this repo has no dependency-cruiser /
+TS-import-boundary tooling, so any drift mechanism must *fit the real surface*
+or *explicitly justify doc-governance-only*.
+## Decision
+1. **Author a boundary contract** at
+   [`docs/contracts/workspace-boundary.md`](../contracts/workspace-boundary.md)
+   enumerating what the workspace **owns** and **does not own**.
+2. **Drift mechanism = an import-edge linter** (`scripts/lint_workspace_boundary.py`),
+   AST-static, over `src/cli/python/workspace_*.py`. It fails if a workspace
+   module imports an owner-module of a *not-owned* domain. It is wired into CI.
+3. **Explicit scope disclaimer.** The linter enforces **import edges only**.
+   Semantic drift — a workspace module encoding profile semantics or analytics
+   *product strategy* without importing anything forbidden — is **not**
+   catchable by an import check and stays **doc-governance**, enforced in
+   review against the contract. The contract states this limit plainly so the
+   green check is never mistaken for "the boundary is fully enforced" (the
+   council's named false-confidence risk).
+4. **Escape hatch.** A `# boundary-exception: <reason>` pragma on the import
+   line lets a justified, reviewed exception through; the contract records that
+   such pragmas are reviewed like any boundary change.
+### Owns / does-not-own (locked by the parent roadmap's council)
+| Workspace **owns** | Workspace **does NOT own** |
+|---|---|
+| task orchestration | skill design |
+| host-session lifecycle | profile semantics |
+| continuation (multi-turn) | video-provider logic |
+| drive health | MCP-registry policy |
+| | analytics **product strategy** |
+## Consequences
+- The boundary becomes a CI gate, not a wish-list. Day-one state: **zero
+  violations** (survey below), so the check locks the current-correct boundary
+  rather than papering over drift.
+- The import check is cheap and precise here because workspace modules live in
+  `src/cli/python/` and the not-owned owner-modules live in `src/scripts/` —
+  separate namespaces, no false-positive surface (workspace modules import only
+  stdlib, third-party `keyring`/`cryptography`/`yaml`, and intra-workspace
+  `workspace_*`).
+- Semantic drift remains a review responsibility; the contract names it so
+  reviewers know the linter is a supplement, not a substitute.
+## Survey — existing violations
+AST import survey of all 13 `workspace_*.py` modules (2026-06-14):
+- The only cross-module import is `workspace_inbox → workspace_skills`
+  (intra-workspace; allowed).
+- **Zero** imports of any skill-design / profile / pack / video / MCP /
+  condense / router / persona owner-module.
+- `workspace_skills.py` *resolves* skill bodies for host hand-off (consumes,
+  does not design) → within bounds.
+- `workspace_analytics.py` records task completion/abandonment telemetry
+  (drive-health domain, not analytics product strategy) → within bounds.
+**Result: zero boundary violations recorded.**
+## Alternatives
+- **Doc-governance-only (no check).** Rejected as the sole mechanism: the
+  surface is concrete Python with clean namespaces, so the cheapest lock (an
+  import check) earns its keep against the most common drift vector. Kept as
+  the mechanism for *semantic* drift, which an import check cannot see.
+- **dependency-cruiser / TS import-boundary tooling.** N/A — the workspace
+  surface is Python, not TS; importing JS tooling for it is wrong-stack.
+## References
+- [`docs/contracts/workspace-boundary.md`](../contracts/workspace-boundary.md) — the contract.
+- [`scripts/lint_workspace_boundary.py`](../../src/scripts/lint_workspace_boundary.py) — the drift check.
+- [`ADR-050`](ADR-050-workspace-vs-package-root-boundary.md) — the workspace-vs-package-root trust boundary this refines at the module level.
+- [`docs/contracts/daily-workspace.md`](../contracts/daily-workspace.md) — cross-links this contract.
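The import-edge check and its pragma escape hatch from ADR-095 can be illustrated with a short AST walk. This is a sketch under stated assumptions, not the contents of `scripts/lint_workspace_boundary.py`: the module names in `FORBIDDEN` are hypothetical stand-ins for the not-owned owner-modules, and the pragma is matched by prefix only, without validating the reason text.

```python
import ast

# Hypothetical owner-modules of not-owned domains; the real list
# would come from the boundary contract.
FORBIDDEN = {"skill_design", "profile_semantics", "video_providers"}
PRAGMA = "# boundary-exception:"


def boundary_violations(source, forbidden=FORBIDDEN):
    """Return sorted (line, module) pairs for forbidden import edges."""
    lines = source.splitlines()
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # module is None for "from . import x"
        else:
            continue
        # The escape hatch must sit on the import line itself.
        if PRAGMA in lines[node.lineno - 1]:
            continue
        for name in names:
            # Compare the top-level package so submodules are covered too.
            if name.split(".")[0] in forbidden:
                found.append((node.lineno, name))
    return sorted(found)


sample = (
    "import os\n"
    "import skill_design\n"
    "from profile_semantics import rules  # boundary-exception: reviewed\n"
    "from workspace_skills import resolve\n"
)
```

As the ADR stresses, a check like this only sees import edges. A module that re-implements forbidden logic without importing anything passes cleanly, which is why semantic drift stays a review concern.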

package/docs/decisions/INDEX.md CHANGED

@@ -93,6 +93,12 @@ _Auto-generated by `scripts/adr/regenerate_index.py`. Do not edit._
 | [ADR-087](ADR-087-installer-e2e-test-strategy.md) | Installer E2E Test Strategy | accepted | 2026-06-11 | — |
 | [ADR-088](ADR-088-no-external-runtime-federation.md) | No External Runtime Federation | accepted | 2026-06-11 | — |
 | [ADR-089](ADR-089-lean-local-plugin-install.md) | Lean Local Plugin Install | accepted | 2026-06-12 | — |
+| [ADR-090](ADR-090-visibility-command-frontmatter-field.md) | Visibility Command Frontmatter Field | accepted | 2026-06-13 | — |
+| [ADR-091](ADR-091-split-meta-capability-packs.md) | Split Meta Capability Packs | accepted | 2026-06-13 | — |
+| [ADR-092](ADR-092-defer-command-tier-alias-removal.md) | Defer Command Tier Alias Removal | accepted | 2026-06-13 | — |
+| [ADR-093](ADR-093-ai-council-config-user-global.md) | Ai Council Config User Global | accepted | 2026-06-13 | — |
+| [ADR-094](ADR-094-agent-memory-layer-removal.md) | Agent Memory Layer Removal | accepted | 2026-06-14 | — |
+| [ADR-095](ADR-095-workspace-boundary-contract.md) | Workspace Boundary Contract | accepted | 2026-06-14 | — |
 ## Unnumbered (legacy)

package/docs/development.md CHANGED

@@ -8,7 +8,7 @@
 ## Editing content
-1. **Always edit in `.agent-src.uncondensed/`** — never in `dist/agent-src/` or `.augment/` directly
+1. **Always edit in `src/`** — never in `dist/agent-src/` or `.augment/` directly
 2. Run `task sync` to copy non-`.md` files
 3. Use the `/condense` command to condense changed `.md` files
 4. Run `task ci` to verify everything passes before pushing
@@ -33,7 +33,7 @@ task consistency-fix           # Regenerate all derived outputs from source
 ### Sync & Condensation
 ```bash
-task sync                      # .agent-src.uncondensed/ → dist/agent-src/, then project → .augment/
+task sync                      # src/ → dist/agent-src/, then project → .augment/
 task sync-changed              # List .md files changed since last condensation
 task sync-check                # Check if dist/agent-src/ is in sync (for CI)
 task sync-check-hashes         # Verify condensed .md hashes match source
@@ -202,13 +202,11 @@ tests/
 └── consistency.yml            ← Sync + hash + tool verification
 src/templates/consumer-settings/   ← Settings templates for consumer projects
-.agent-src.uncondensed/         ← Source of truth (human-readable, verbose)
+src/                            ← Source of truth (human-readable, verbose)
 ├── rules/                     ← Behavior rules
 ├── skills/                    ← Skill definitions (SKILL.md per skill)
-├── commands/                  ← Slash command definitions
-├── guidelines/                ← Coding guidelines by language
-├── templates/                 ← Document scaffolds
-└── contexts/                  ← System knowledge documents
+├── domains/                   ← Slash command definitions (per pack)
+└── agent-src/                 ← Contexts, templates, profiles, personas, …
 dist/agent-src/                    ← Condensed output (token-efficient, shipped)
 ├── (same structure)           ← Condensed .md + copied non-.md files

package/docs/getting-started.md CHANGED

@@ -129,7 +129,7 @@ Your agent is now:
 - **Respecting your codebase** — no conflicting patterns
 - **Following standards** — consistent code quality
-This is enforced automatically by 79 rules. No configuration needed.
+This is enforced automatically by 83 rules. No configuration needed.
 ---
@@ -167,9 +167,9 @@ Your agent now understands slash commands:
 | `/optimize skills` | Audit skills, find duplicates, run linter |
 | `/feature plan` | Interactively plan a feature |
 | `/quality-fix` | Run and fix all quality checks |
-| `/chat-history` | Inspect the persistent chat-history log (read-only `show`) |
+| `/chat-history import` | Pull a prior session into the current chat (resume) |
-→ [Browse all 150 active commands](../dist/agent-src/commands/)
+→ [Browse all 147 active commands](../dist/agent-src/commands/)
 ---
@@ -188,7 +188,7 @@ Logging is **hook-only**: a structural Augment hook fires on
 transparently if the fingerprint does not match (fresh chat) and
 otherwise appended to.
-Run `/chat-history` (a.k.a. `/chat-history show`) any time to inspect
+Use your host's native transcript / session view any time to inspect
 the log size, last entries, and current fingerprint. For the rare case
 where auto-adopt misfires (corrupted file, hook misconfiguration), run
 `./agent-config chat-history:adopt` as the manual recovery lever.

package/docs/guidelines/agent-infra/5w2h-analysis.md CHANGED

@@ -257,4 +257,4 @@ Inversion → For each W / H, ask "what if it's missing?"
 ## ADOPT citation
-Adopted from [`ginobefun/deep-reading-analyst-skill`](https://github.com/ginobefun/deep-reading-analyst-skill) @ commit `26cd7dc9` · `src/deep-reading-analyst/references/5w2h_analysis.md` · MIT License.
+Adapted from an external reference.

package/docs/guidelines/agent-infra/comparison-matrix.md CHANGED

@@ -176,4 +176,4 @@ unresolved and needs more evidence.
 ## ADOPT citation
-Adopted from [`ginobefun/deep-reading-analyst-skill`](https://github.com/ginobefun/deep-reading-analyst-skill) @ commit `26cd7dc9` · `src/deep-reading-analyst/references/comparison_matrix.md` · MIT License.
+Adapted from an external reference.

package/docs/guidelines/agent-infra/corpus-grounding-authoring.md CHANGED

@@ -84,7 +84,7 @@ selection*, architecture-pattern selection. Watch note:
 | Domain | Skill / manifest | Tier | Source pin |
 |---|---|---|---|
-| Frontend design | `design-intelligence/data/manifest.json` | conditional-grounding | ui-ux-pro-max @ b7e3af80 |
+| Frontend design | `design-intelligence/data/manifest.json` | conditional-grounding | external reference @ b7e3af80 |
 | Security / threat-modeling | `threat-modeling/data/manifest.json` | conditional-grounding | MITRE ATT&CK v16 + OWASP ASVS 4.0 (derived) |
 | API design | `api-design/data/manifest.json` | lookup-only | RFC 9110/9457 + field-standard practice (derived) |
 | DB-query tuning | `database/data/manifest.json` | lookup-only | PostgreSQL 16 / MySQL 8 docs (derived) |

package/docs/guidelines/agent-infra/critical-thinking.md CHANGED

@@ -152,5 +152,5 @@ This prevents attacking strawmen and ensures fair evaluation.
 ## ADOPT citation
-Adopted from [`ginobefun/deep-reading-analyst-skill`](https://github.com/ginobefun/deep-reading-analyst-skill) @ commit `26cd7dc9` · `src/deep-reading-analyst/references/critical_thinking.md` · MIT License.
+Adapted from an external reference.

package/docs/guidelines/agent-infra/engineering-memory-data-format.md CHANGED

@@ -22,7 +22,6 @@ that prefer one file per type.
 | Type | Single-file path | Sharded path |
 |---|---|---|
 | Domain invariants | `agents/memory/domain-invariants.yml` | `agents/memory/domain-invariants/<hash>.yml` |
-| Architecture decisions | `agents/memory/architecture-decisions.yml` | `agents/memory/architecture-decisions/<hash>.yml` |
 | Incident learnings | `agents/memory/incident-learnings.yml` | `agents/memory/incident-learnings/<hash>.yml` |
 | Product rules | `agents/memory/product-rules.yml` | `agents/memory/product-rules/<hash>.yml` |
@@ -50,7 +49,7 @@ rejects entries missing any required field.
 The `priority` field controls how aggressively `/memory:load` surfaces
 an entry. The three-tier enum is intentional — see
-`road-to-dream-skill-adoption.md` § B2 and the Phase 2 council brief
+an internal roadmap (local-only) § B2 and the Phase 2 council brief
 for why a fourth `high` tier was rejected.
 | Value | Meaning | Reader behaviour |
@@ -103,9 +102,6 @@ templates for the full shape:
 - [`domain-invariants.example.yml`](../../templates/agents/memory/domain-invariants.example.yml)
   adds `rule`, `boundary`, `scope.paths`, `violation_contract`.
-- [`architecture-decisions.example.yml`](../../templates/agents/memory/architecture-decisions.example.yml)
-  adds `title`, `context`, `decision`, `alternatives_rejected`,
-  `trade_offs`, `paths`, `superseded_by`.
 - [`incident-learnings.example.yml`](../../templates/agents/memory/incident-learnings.example.yml)
   adds `pattern`, `trigger_conditions`, `consequence`, `guardrail`,
   `enforcement`, `severity`.

package/docs/guidelines/agent-infra/first-principles.md CHANGED

@@ -189,4 +189,4 @@ This exposes weak arguments and strengthens valid ones.
 ## ADOPT citation
-Adopted from [`ginobefun/deep-reading-analyst-skill`](https://github.com/ginobefun/deep-reading-analyst-skill) @ commit `26cd7dc9` · `src/deep-reading-analyst/references/first_principles.md` · MIT License.
+Adapted from an external reference.

package/docs/guidelines/agent-infra/frontier-reasoning-operating-profile.md ADDED

@@ -0,0 +1,164 @@
+# Frontier-Grade Reasoning — Operating Profile
+> Source dossier for the **Reasoning Discipline Protocol (RDP)** — the durable,
+> sourced rationale behind the RDP gate context, rule, and skills. It documents what a
+> frontier reasoning model (Anthropic's Fable 5 / Mythos 5, June 2026) does that
+> weaker models skip — the **transferable operating discipline** — and the
+> boundary of what is *not* transferable.
+>
+> Citation discipline: every catalog row names its source **and its dignity**
+> (Anthropic primary doc · third-party review · customer testimonial · our own
+> derivation). Two independent external model analyses (Claude + GPT) corroborate
+> but are never the primary evidence for a transferable-behavior line. A
+> line-by-line audit (2026-06-13) corrected five rows; the load-bearing claims
+> (over-prescription degrades, reasoning-in-response → refusal) are verbatim in
+> Anthropic's prompting doc and stand.
+## The one boundary that frames everything
+```
+CAPABILITY DOES NOT TRANSFER. DISCIPLINE DOES.
+```
+A frontier model's edge lives in its **weights** — gains spread across the whole
+training stack, with no single copyable prompt (the explicit framing is Nathan
+Lambert / Interconnects [lam]; Anthropic's docs support only the weaker "needs
+less scaffolding" [pf]). No skill, rule, or workflow makes Sonnet/Opus/GPT
+*equal* Fable 5, and anyone promising that is selling the hallucinated analysis
+this suite refuses to produce.
+What *does* transfer is the **operating discipline**: the steps a frontier model
+takes on its own that weaker models skip unless forced. On under-specified and
+long-horizon tasks, that discipline gap is the largest part of the *visible*
+quality difference. RDP transplants the discipline; it never claims the
+capability.
+> **Background (dates, precise):** Anthropic's own banner dates the **access
+> suspension of Fable 5 / Mythos 5 to 2026-06-12** [an]; the related US
+> export-control directive / Reuters report is **2026-06-13** [re]. Either way
+> the model is currently inaccessible, so we cannot A/B against it — our own
+> falsifiable eval is the only ground truth, which makes the discipline-transplant
+> approach *more* relevant, not less.
+## Sources (with dignity)
+- **[an]** Anthropic — *Claude Fable 5 and Mythos 5* announcement. Primary, but
+  several quotes on the page are **customer testimonials** (labeled per-row), not
+  Anthropic capability statements.
+- **[pf]** Anthropic docs — *Prompting Claude Fable 5*. Primary; the transferable
+  prompting playbook. Load-bearing claims live here.
+- **[in]** Anthropic docs — *Introducing Fable 5 / Mythos 5*. Primary; adaptive
+  thinking + effort parameter + memory tool + refusals.
+- **[ve]** Vellum — benchmark breakdown. **Third-party**; benchmark numbers are
+  not verifiable in Anthropic prose and must be labeled third-party or pulled to
+  the system card.
+- **[cr]** CodeRabbit model review. **Third-party**; source of the
+  "explore-environment-first" observation (not Anthropic-documented).
+- **[lam]** Nathan Lambert / Interconnects. **Third-party analysis**; "whole
+  training stack, no copyable prompt".
+- **[re]** Reuters — export-control report (2026-06-13).
+- Two external model analyses (Claude + GPT), provenance untracked in
+  `agents/.harvest-local/` — **corroboration only**, never primary.
+## Transferable-behavior catalog
+Each row: behavior → **source + dignity** → transplant mechanism → carrier.
+| Behavior | Source (dignity) | Transplant | Carrier |
+|---|---|---|---|
+| Audit progress against real tool results | **[pf] ✓ primary** | every claim cites a tool result | shipped: `verify-before-complete` |
+| Act when you have enough; no overplanning; outcome-first | **[pf] ✓ primary** | shipped | `direct-answers`, `autonomous-execution` |
+| Pause only when genuinely needed | **[pf] ✓ primary** | shipped | `no-cheap-questions` |
+| No over-refactor / minimal diff (at higher effort) | **[pf] ✓ primary** | shipped | `minimal-safe-diff` |
+| Parallel async subagents, dispatched readily | **[pf] ✓ primary** | default async dispatch | extend `subagent-orchestration` |
+| Fresh-context verifier beats self-critique (for **long-running** tasks) | **[pf] ✓ primary** | verifier subagent on a **structural-complexity** gate (not blanket) | extend `adversarial-review` |
+| Persistent **cross-run** notes (file memory = 3× vs Opus 4.8 on Slay the Spire) | **[an] ✓ primary (direct)** | lessons across runs, one per file | extend memory / `memory-consolidation` |
+| Infer the underlying goal (standard host only) | **[an] testimonial (Lovable); direction note** | infer goal, give **one** recommendation — NO "2–3 framings"; standard host only (a strong-reasoning host self-infers) | extend `improve-before-implement` |
+| Multi-hypothesis / "killing incorrect beliefs" | **[an] testimonial (Sean Ward); "multi-hypothesis" is our framing** | hypotheses + killed-beliefs in the notes file | `notes-first-reasoning` |
+| Adaptive effort (depth scales with hardness) | **[in],[pf] = API knob, not a scaffold** | strong-reasoning host: set `effort: high`; standard host w/o the knob: scaffold the effort/stop discipline | extend `autonomous-execution` (standard-host-only) |
+| Explore the environment first, then build | **[cr] third-party (NOT Anthropic-documented)** | enumerate constraints/tools/info-gaps, close by query/test before designing | extend `think-before-action` |
+| Risk-first decomposition (hardest/load-bearing unknown first) | **OUR DERIVATION — general engineering discipline, NOT Fable-documented** ([pf]'s "top of difficulty range" means **task selection**, not intra-task order) | resolve load-bearing uncertainties before dependent work | new skill `complexity-first-planning` |
+Adopted from GPT review (frontier-implicit behaviors; our adoption, corroboration
+only — all cost-gateable notes components):
+| Behavior | Source | Transplant | Carrier |
+|---|---|---|---|
+| Prediction tracking (calibration) | GPT review | prediction + confidence + result + lesson in notes | `notes-first` component `prediction_tracking` |
+| Uncertainty budget | GPT review | per-dimension uncertainty score → feeds adaptive effort | `notes-first` component `uncertainty_budget` |
+| Decision ledger | GPT review | decision + alternatives + reason + revisit-if in notes; **escalates to `decision-record`/ADR** when cross-task or structural | `notes-first` component `decision_ledger` |
+The ordered chain the orchestrator enforces:
+`ground → intent → notes → gather → audit → verify`.
+## Deferred to the Phase-8 audit (likely already covered)
+Two transferable [pf] behaviors are **not** given new artifacts up front — the
+Phase-8 de-prescriptivize audit checks existing coverage first and adds only
+verified gaps (HIGH priority):
+- **Re-ground the final summary** — write the closing summary for a reader who
+  saw none of the working thread; outcome first; drop working shorthand. Likely
+  covered by `language-and-tone` + `direct-answers`.
+- **Report findings and stop** — when the user is thinking out loud, don't apply
+  a fix. Likely covered by `scope-control`.
+## What does NOT transfer (and must not be faked)
+- **Raw capability** — vision SOTA, document/chart reasoning, codebase-scale
+  migrations. Benchmark deltas (e.g. SWE-Bench Pro 80.3 vs 69.2) are **[ve]
+  third-party**, not verifiable in Anthropic prose — label third-party or pull to
+  the system card; note "SWE-Bench Pro" is not named in [an] (which cites
+  FrontierCode/Cognition + Hebbia-Finance). These are weights, not prompts.
+- **"Show your work" in the response.** [pf] warns that reasoning-in-response can
+  trigger a `reasoning_extraction` refusal, and that over-prescriptive skills
+  *degrade* strong models. **Both verbatim in [pf] — the gating foundation.** RDP
+  keeps reasoning in notes + verifier subagents, never in the response, and
+  tier-gates prescription.
+- **Pokémon FireRed vision-only harness** — **[an] only** ([pf] does not mention
+  it).
+## Cost / benefit by host reasoning strength (the gating lens)
+RDP is **not** free — scaffolding costs tokens and, on strong-reasoning hosts,
+*degrades* output. So it engages only where it pays, via **table-free** signals.
+Per **ADR-035** the suite maintains no runtime model→band table, and `model_tier`
+is the *skill's* needed band (lite/medium/high), **not** the host's. Host strength
+is therefore **agent self-assessed**, never looked up (roadmap L10/L17):
+| Host / task | Self-coordination | RDP engagement | Why |
+|---|---|---|---|
+| standard host | low | **full** scaffolding | biggest discipline gap → biggest lift |
+| strong-reasoning host | high | **light / off** (use native `effort: high`) | over-scaffolding degrades + wastes tokens |
+| trivial / short task (any host) | n/a | **off** | no discipline gap to close; pure overhead |
+| verifier subagent (any host) | — | **only** on the structural-complexity gate (≥2 of: branching, ≥3 constraints, stateful, irreversibility) + token floor ≥1k | a full extra inference pass — the most expensive primitive |
+**One** constraint-light scaffold ships — no heavy/light content variants, since
+two variants would be a hidden model→band table. A standard host **expands it on
+request**. Two gates, both default-on: automatic (task signal + agent
+self-assessment) + the user `reasoning:` settings toggle (global + per-component
++ hard off).
+## Notes template grounding
+The session-notes template is grounded in Anthropic's **documented cross-run
+lessons memory** [pf] (one lesson per file; corrections + confirmed approaches;
+why they mattered). The in-task sections (`## In-Task Hypothesis Log`,
+`## Predictions`, `## Decisions`, `## Uncertainty`) are a clearly-marked **local
+derivation** for within-task scope — useful, but not claimed as Anthropic-
+documented, and kept on the notes side of the `reasoning_extraction` line.
+## Naming
+Neutral, no brand/capability claim. Umbrella: **Reasoning Discipline Protocol
+(RDP)**. Artifacts: `reasoning-orchestrator`, `notes-first-reasoning`,
+`complexity-first-planning`, `environment-grounding`. "Fable" / "Mythos" never
+appear in an artifact identifier.
+[an]: https://www.anthropic.com/news/claude-fable-5-mythos-5
+[pf]: https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/prompting-claude-fable-5
+[in]: https://platform.claude.com/docs/en/about-claude/models/introducing-claude-fable-5-and-claude-mythos-5
+[ve]: https://www.vellum.ai/blog/claude-fable-5-and-mythos-5-benchmarks-explained
+[re]: https://www.reuters.com/technology/us-blocks-foreign-access-anthropics-most-advanced-ai-models-axios-reports-2026-06-13/
+[cr]: https://www.coderabbit.ai/blog/fable-5-model-review
+[lam]: https://www.interconnects.ai/p/claude-fable-5-and-new-ai-safety
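The verifier-subagent row of the gating table (at least 2 of: branching, 3 or more constraints, stateful, irreversibility, plus a token floor of 1k) reads naturally as a small predicate. The sketch below is an illustration only: `TaskSignals`, its field names, and `should_dispatch_verifier` are hypothetical, and treating the token floor as an estimate of task size is an assumption, since the profile does not say what is being counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSignals:
    """Hypothetical per-task signals an agent would self-assess."""
    branching: bool
    constraint_count: int
    stateful: bool
    irreversible: bool
    estimated_tokens: int  # assumed meaning of the "token floor"


def should_dispatch_verifier(task, min_signals=2, token_floor=1000):
    """Gate the expensive verifier pass on structural complexity."""
    signals = [
        task.branching,
        task.constraint_count >= 3,
        task.stateful,
        task.irreversible,
    ]
    # Both conditions must hold: enough complexity signals AND enough
    # work to justify a full extra inference pass.
    return sum(signals) >= min_signals and task.estimated_tokens >= token_floor


migration = TaskSignals(True, 4, True, True, 6000)
typo_fix = TaskSignals(False, 1, False, False, 200)
```

Keeping the thresholds as parameters with the documented defaults makes the gate easy to tune per settings toggle without introducing a model lookup table, which the profile rules out.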

package/docs/guidelines/agent-infra/inversion-thinking.md CHANGED

@@ -385,4 +385,4 @@ Before any plan, do 15-minute pre-mortem:
 ## ADOPT citation
-Adopted from [`ginobefun/deep-reading-analyst-skill`](https://github.com/ginobefun/deep-reading-analyst-skill) @ commit `26cd7dc9` · `src/deep-reading-analyst/references/inversion_thinking.md` · MIT License.
+Adapted from an external reference.