npm - @maestrofrontier/frontier - Versions diffs - 1.4.0 - Mend

@maestrofrontier/frontier 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (43) hide show

package/AGENTS.md +214 -0
package/CLAUDE.md +29 -0
package/LICENSE +21 -0
package/README.md +521 -0
package/bin/maestro.cjs +75 -0
package/commands/compress.md +36 -0
package/commands/context-bar.md +30 -0
package/commands/frontier.md +124 -0
package/commands/settings.md +101 -0
package/commands/terse.md +23 -0
package/commands/update.md +59 -0
package/docs/orchestration.md +168 -0
package/frontier/cli.cjs +248 -0
package/frontier/config.cjs +441 -0
package/frontier/dispatch.cjs +255 -0
package/frontier/judge.cjs +92 -0
package/frontier/run.cjs +148 -0
package/frontier/schema.cjs +112 -0
package/frontier/semaphore.cjs +49 -0
package/frontier/synthesize.cjs +79 -0
package/hooks/frontier-autorun.cjs +124 -0
package/hooks/hooks.json +103 -0
package/hooks/maestro-doctrine-guard.cjs +81 -0
package/hooks/maestro-gate-reminder.cjs +58 -0
package/hooks/maestro-gate-telemetry.cjs +77 -0
package/hooks/maestro-loop-guard.cjs +76 -0
package/hooks/maestro-phase-scope.cjs +118 -0
package/hooks/maestro-statusline-sync.cjs +152 -0
package/hooks/maestro-subagent-guard.cjs +148 -0
package/hooks/maestro-terse-mode.cjs +189 -0
package/hooks/maestro-toolbudget-advisory.cjs +127 -0
package/integrations/README.md +87 -0
package/integrations/cline/skills/frontier/SKILL.md +75 -0
package/integrations/codex/prompts/frontier.md +66 -0
package/integrations/codex/prompts/update.md +36 -0
package/integrations/cursor/commands/frontier.md +63 -0
package/integrations/cursor/commands/update.md +34 -0
package/integrations/gemini/commands/frontier.toml +76 -0
package/integrations/windsurf/workflows/frontier.md +70 -0
package/package.json +52 -0
package/scripts/install.cjs +490 -0
package/settings/cli.cjs +140 -0
package/settings/config.cjs +309 -0

package/commands/context-bar.md ADDED Viewed

@@ -0,0 +1,30 @@
+---
+description: Toggle the Maestro context progress bar in the Claude Code status line
+argument-hint: [on|off]
+allowed-tools: Bash, Read
+---
+Toggle the Maestro context bar shown in the Claude Code status line.
+The bar is rendered by `context-bar.ps1` (Windows) or `context-bar.sh`
+(macOS / Linux) in the Claude Code statusline directory, normally
+`~/.claude/statusline/`. It self-gates on a flag file named
+`.context-bar-disabled` in that same directory:
+- flag file ABSENT  -> bar enabled (default)
+- flag file PRESENT -> bar disabled (status line shows the folder name only)
+Requested state: `$ARGUMENTS`
+Steps:
+1. Find the statusline directory: it is the directory containing the script
+   path in the `statusLine.command` field of `~/.claude/settings.json`.
+   Default to `~/.claude/statusline/` if it cannot be determined.
+2. Resolve the action against the flag file `.context-bar-disabled` in that
+   directory:
+   - `on` / `enable`   -> delete the flag file (no-op if already absent).
+   - `off` / `disable` -> create an empty flag file.
+   - no argument       -> toggle: delete the flag if present, else create it.
+3. Confirm the resulting state in one line. The change takes effect on the
+   next status line refresh.

package/commands/frontier.md ADDED Viewed

@@ -0,0 +1,124 @@
+---
+description: Maestro Frontier local multi-CLI engine: arm a mode (off/single/fusion) so it auto-runs every prompt, pick a model/preset, or run a one-off prompt
+argument-hint: "<off | single <model> | fusion <preset> | status | run <prompt> | adopt>"
+allowed-tools: Bash, Read
+---
+Drive the Maestro Frontier engine: a zero-dependency local multi-CLI
+fusion engine (parallel panel of local CLIs -> a judge model's
+analysis -> grounded synthesis). Default mode is `off`: the engine is opt-in and
+never runs until you switch it on. Arming it (`single` or `fusion`)
+makes it **auto-run on every prompt** — a `UserPromptSubmit` hook
+(`hooks/frontier-autorun.cjs`) routes each prompt through the engine and
+the live session relays the synthesized answer. `off` disables auto-run.
+Autorun blocks the turn until the engine returns; the hook carries a
+300s timeout (`hooks/hooks.json`), and a run that exceeds it is skipped
+so the turn proceeds normally. Any engine error degrades the same way.
+Requested action: `$ARGUMENTS`
+Map the argument to one engine CLI call and run it. The engine is
+self-contained; do not edit its state file yourself.
+1. Mode switch (persists across sessions; default `off`):
+   ```bash
+   node "${CLAUDE_PLUGIN_ROOT}/bin/maestro.cjs" frontier mode off
+   node "${CLAUDE_PLUGIN_ROOT}/bin/maestro.cjs" frontier mode single --model <model>
+   node "${CLAUDE_PLUGIN_ROOT}/bin/maestro.cjs" frontier mode fusion --preset <preset>
+   node "${CLAUDE_PLUGIN_ROOT}/bin/maestro.cjs" frontier mode fusion --preset custom --models <a,b,c>
+   node "${CLAUDE_PLUGIN_ROOT}/bin/maestro.cjs" frontier mode fusion --preset <preset> --judge <model> --synth <model>
+   ```
+   Models: `opus` (Claude Opus 4.8), `gpt-5.5` (Codex), `gemini`
+   (Gemini 3.1 Pro). Presets: `opus-duo`, `opus-gpt`, `gpt-duo`,
+   `frontier-trio`, `custom`. The judge + synthesizer default to Opus,
+   but `gpt-duo` runs them on GPT-5.5 (a Codex-only fusion that needs no
+   `claude`), and `--judge`/`--synth` override the model for any preset,
+   so you can mix freely (e.g. `--judge opus --synth gpt-5.5`).
+2. Show current mode/preset:
+   ```bash
+   node "${CLAUDE_PLUGIN_ROOT}/bin/maestro.cjs" frontier status
+   ```
+3. Run a prompt through the current mode (prompt as the argument, or
+   piped on stdin). This is a manual one-off; when the engine is armed
+   (`single`/`fusion`) the autorun hook already runs every prompt for
+   you, so `run` is mainly for scripting or an explicit re-run:
+   ```bash
+   node "${CLAUDE_PLUGIN_ROOT}/bin/maestro.cjs" frontier run "<prompt>"
+   ```
+   - `off`: prints a notice and exits without spawning anything;
+     normal Maestro behavior is unchanged.
+   - `single`: dispatches the one selected CLI and prints its answer.
+   - `fusion`: runs the panel in parallel, then the judge and
+     synthesizer models; prints the final answer (a one-line run meta of
+     preset, models, analysis present, and failed models goes to stderr).
+4. Adopt a previously-armed **global** mode into this workspace
+   (per-workspace isolation means a workspace never inherits the old
+   global state automatically — this copies it in once, on demand):
+   ```bash
+   node "${CLAUDE_PLUGIN_ROOT}/bin/maestro.cjs" frontier adopt
+   node "${CLAUDE_PLUGIN_ROOT}/bin/maestro.cjs" frontier adopt --force
+   ```
+   Reads the legacy global `frontier-state.json` read-only and writes it
+   into this workspace's `frontier-state.cc-<hash>.json`. It refuses to
+   overwrite an existing workspace state unless `--force`, never touches
+   the legacy file, and only targets a Claude Code `cc-*` scope. If
+   there is no legacy state (`missing-legacy`) it does nothing — arm the
+   workspace with `mode` instead.
+5. Report the result, matched to the action:
+   - `run`: report the engine's stdout verbatim.
+   - `adopt`: confirm the adopted mode/preset, or relay the
+     `ERROR [<reason>]` (e.g. `exists` — suggest `--force`; `missing-legacy`
+     — nothing to adopt). Arming now auto-runs on every prompt.
+   - arming a mode (`single`/`fusion`): confirm the mode, then tell the
+     user plainly that the engine now **auto-runs on every prompt** —
+     they just chat normally and the synthesized answer is relayed.
+     Do NOT tell the user to call `run` manually; arming already routes
+     every prompt through the engine (`run` is only a scripted one-off).
+   - `off`: confirm auto-run is disabled and normal Maestro resumes.
+   - `status`: report the current mode/preset.
+   On an error the engine prints `ERROR [<failure_reason>]: <detail>`
+   to stderr and exits non-zero; relay the failure_reason.
+Notes:
+- **Arming in Claude Code is per-workspace.** State is stored in a
+  workspace-scoped file (`frontier-state.cc-<hash>.json`, keyed to the
+  git project root). Arming in one workspace does not affect any other.
+  After upgrading Maestro, each Claude Code workspace starts `off` and
+  must be re-armed — no automatic migration into workspace scopes.
+  `adopt` (step 4) is the explicit opt-in to copy a prior global mode in.
+- Real `single`/`fusion` runs spawn local CLIs and cost tokens; each
+  cold `claude -p` panel/judge/synth call is non-trivial. Use small
+  prompts. `off` is free.
+- Headless web access varies per CLI (Codex confirmed; Claude and
+  Gemini gated off in this build). The engine sets a per-adapter
+  `webTools` flag accordingly; see the risk burndown.
+- `gemini` works well as a panel member, but on Windows it is a poor
+  `--judge`/`--synth` choice: it takes its prompt as an argument and the
+  judge/synth prompts contain newlines, which the win32 arg-safety guard
+  refuses, so the stage then degrades (judge omitted, synth falls back).
+  Use `opus` or `gpt-5.5` for judge/synth on Windows. (No such limit on
+  macOS/Linux, where args are passed directly.)
+## Binary overrides
+The engine is zero-dependency CommonJS under `frontier/`. Each CLI is
+resolved from your `PATH`: `claude` (Opus 4.8), `codex` (GPT-5.5), and
+`gemini` (Gemini 3.1 Pro). When a binary is not on `PATH`, or you want a
+specific build, point at it with an environment variable:
+- `MAESTRO_CLAUDE_BIN` sets the `claude` binary path.
+- `MAESTRO_CODEX_BIN` sets the `codex` binary path.
+- `MAESTRO_GEMINI_BIN` sets the `gemini` binary path.

package/commands/settings.md ADDED Viewed

@@ -0,0 +1,101 @@
+---
+description: View and change Maestro's toggles (terse, frontier, context-bar) — direct args or a keyboard picker
+argument-hint: "[status | list | help | set <key> <value> | terse|frontier|context-bar <value>]"
+allowed-tools: Bash, AskUserQuestion
+---
+See and change Maestro's toggles: terse output, the frontier fusion engine,
+and the context-bar status line. `compress` is an action, not a toggle, so it
+is not shown here.
+Two ways to use it. With **no arguments** it opens a keyboard picker. With
+**arguments** it runs the change directly and finishes — no questionnaire.
+One writer owns all state: `settings/cli.cjs`, which reads and writes the
+three existing stores. This command never edits those files directly; it
+always goes through the CLI so the existing readers stay in sync. Throughout,
+`SCLI` = `node "${CLAUDE_PLUGIN_ROOT}/settings/cli.cjs"`.
+Requested action: `$ARGUMENTS`
+## Route on `$ARGUMENTS`
+Trim it and split into tokens. The first token selects the path; **anything
+non-empty skips the picker entirely.**
+- **empty** → run the **Interactive picker** section below.
+- **`status`** (optionally `--json`) → run `SCLI status [--json]`, print it, stop.
+- **`list`** (optionally `--json`) → run `SCLI list [--json]`, print it, stop.
+- **`help`**, `-h`, `--help`, or anything unrecognized → run `SCLI help`,
+  print it (usage grammar + the full value matrix), stop. For an
+  unrecognized token, say so first, then show help.
+- **`set <key> <value> [flags]`** → pass straight through:
+  `SCLI set <key> <value> [--judge M] [--synth M] [--models a,b,c]`.
+- **shorthand** — first token is a key (`terse`, `frontier`, `context-bar`,
+  or `bar`): treat the rest as the value and run `SCLI set <key> <value>`.
+### Normalizing the value (both `set` and shorthand)
+The CLI takes one frontier value with a colon (`single:opus`,
+`fusion:opus-gpt`). Accept the friendlier space form and normalize:
+- `frontier off` → `SCLI set frontier off`
+- `frontier single opus` → `SCLI set frontier single:opus`
+- `frontier fusion opus-gpt` → `SCLI set frontier fusion:opus-gpt`
+- `frontier fusion custom --models opus,gpt-5.5,gemini`
+  → `SCLI set frontier fusion:custom --models opus,gpt-5.5,gemini`
+- `frontier fusion opus-gpt --judge opus --synth gpt-5.5` → pass the flags through
+- already-coloned (`frontier fusion:opus-gpt`) → pass as-is
+- `terse ultra`, `context-bar off` → `SCLI set terse ultra`, `SCLI set context-bar off`
+After any write, report any `WARNING:` line the CLI prints (for example an
+active `MAESTRO_TERSE_LEVEL` override or an unconfirmed status-line script),
+then run `SCLI status` so the result is visible. Confirm in one line.
+If a value is invalid the CLI exits non-zero with a message — relay it and
+show `SCLI help` so the user sees the valid values.
+## Interactive picker (no arguments)
+1. Read state and the catalog: `SCLI status --json` and `SCLI list --json`.
+   Build every option below from `list` — never hardcode a model or preset.
+2. First `AskUserQuestion` call, three questions, each pre-set to the current
+   value (only write a toggle whose answer differs from `status`):
+   - terse: `list.terse.values` (`off`, `lite`, `full`, `ultra`).
+   - context-bar: `list.contextBar.values` (`on`, `off`).
+   - frontier mode: `off`, `single`, `fusion` (`list.frontier.modes`).
+3. Frontier follow-ups (only if mode is not `off`):
+   - **single** → one question, `model` = `list.frontier.models` (3, fits);
+     result `single:<id>`.
+   - **fusion** → pick from `list.frontier.presets` (named presets + `custom`);
+     there are more than 4, so **page them 3 at a time** with a fourth
+     `More presets…` option that re-asks with the next three until all,
+     including `custom`, have been offered. Never drop a preset or require
+     typing its name.
+     - **custom** → one `multiSelect` question, `models` = `list.frontier.models`,
+       2–8 selected; result `fusion:custom --models <ids,joined>`.
+     - After any fusion preset, offer **override judge/synth?** (`No` uses
+       `list.frontier.presetStages[preset]` if present else
+       `list.frontier.defaults`; `Yes` → two questions, `judge` and `synth`,
+       each `list.frontier.stageModels`). On Windows prefer `opus`/`gpt-5.5`;
+       `gemini` degrades as judge/synth (see `commands/frontier.md`).
+4. Write each change with `SCLI set ...` (as in the routing section), report
+   warnings, then show `SCLI status`.
+## Codex and any other CLI
+No keyboard picker (Codex cannot host one). Use the CLI directly — it is the
+writer this command calls:
+```bash
+node settings/cli.cjs help               # usage grammar + every value
+node settings/cli.cjs status             # current values
+node settings/cli.cjs set frontier fusion:frontier-trio
+node settings/cli.cjs set frontier fusion:custom --models opus,gpt-5.5,gemini
+node settings/cli.cjs set frontier fusion:opus-gpt --judge opus --synth gpt-5.5
+```
+See [`docs/settings.md`](../docs/settings.md) for the full reference.

package/commands/terse.md ADDED Viewed

@@ -0,0 +1,23 @@
+---
+description: Switch Maestro terse output mode (lite|full|ultra|off)
+argument-hint: "[lite|full|ultra|off]"
+---
+Switch the Maestro terse output level. Requested level: `$ARGUMENTS`
+The maestro-terse-mode hook already updated the flag file when this
+command was submitted (bare invocation activates the configured
+default, or `full`). Do not edit the flag yourself.
+Steps:
+1. Read `${CLAUDE_PLUGIN_ROOT}/skills/terse/SKILL.md` and apply the
+   ruleset for the requested level from this response onward
+   (`off`: drop terse style entirely).
+2. Confirm in one line: new level, how to switch off
+   (`/maestro:terse off`, "stop terse", "normal mode"), and that the
+   statusline badge shows `[TERSE:<LEVEL>]` next session refresh.
+Boundaries (always): code/commits/PRs written normal; Auto-Clarity
+escape for security warnings, irreversible-action confirmations, and
+multi-step sequences.

package/commands/update.md ADDED Viewed

@@ -0,0 +1,59 @@
+---
+description: Update the Maestro plugin to the latest marketplace code and instruct the user to reload or restart
+argument-hint: ""
+allowed-tools: Bash, Read
+---
+Pull the latest Maestro plugin code from the marketplace and guide the user
+through applying it to the running session.
+Maestro pins no plugin version — the marketplace clone always tracks the newest
+committed code, so a marketplace update followed by a session reload resolves
+the latest without any manual version bump.
+## Steps
+1. **Pull the latest plugin code** via Bash:
+   ```bash
+   claude plugin marketplace update maestro
+   ```
+   This is the non-interactive CLI form of the marketplace update. Capture
+   stdout/stderr. If the command exits non-zero, report the error verbatim and
+   stop. If `claude` is not on PATH, tell the user to run this command in their
+   terminal manually and skip to step 3.
+2. **Report what changed** — after a successful update, show the latest commits:
+   ```bash
+   git -C "${CLAUDE_PLUGIN_ROOT}" log --oneline -5 2>/dev/null || true
+   ```
+   If the plugin root is not a git checkout (e.g. a marketplace zip install),
+   note the version is whatever the marketplace published and skip this step.
+3. **Reload the running session** — the plugin binary is pinned at session
+   start, so the new code is NOT live yet. Tell the user:
+   > Run `/reload-plugins` in this session to apply the update without
+   > restarting. If that command warns or is unavailable, restart Claude Code —
+   > the updated plugin loads automatically on next launch.
+   Do not attempt to run `/reload-plugins` via Bash; it is an in-session
+   slash command that must be entered by the user in the Claude Code UI.
+4. **Post-restart re-sync** — on restart the `SessionStart` hook fires and
+   re-syncs any wired copies (e.g. the statusline context-bar script).
+   This happens automatically; no manual step is needed.
+5. **Confirm** — report: update pulled, session reload required via
+   `/reload-plugins` or restart.
+Notes:
+- `/reload-plugins` applies the new code in-session; a full restart is
+  always a safe fallback.
+- Do NOT tell the user to run `/plugin update maestro` — that command does not
+  exist in Claude Code.
+- Do not edit any files.

package/docs/orchestration.md ADDED Viewed

@@ -0,0 +1,168 @@
+# Maestro Multi-Agent Orchestration: Full Protocol (S2-S6)
+Loaded on demand: read this file when the Decision Gate
+([AGENTS.md](../AGENTS.md) S1) returns a multi-agent verdict. The
+kernel's compact protocol is a subset of this document and suffices
+when this file is unavailable. Relocated verbatim from the always-on
+doctrine in v1.2, content here extends the kernel, never overrides
+it.
+---
+## Gate constraints (S1 detail)
+- Max 4 specialists per group.
+- >60% shared files or <=3 files in one chain: single-agent.
+- Overlapping ownership erases parallelism; high-centrality: bias
+  single.
+- Specialists must differ in role or context, not split identical
+  work, homogeneous splits underperform one agent with the same
+  budget. Split-design rule for the Planner, not a gate downgrade.
+- Parallelizability first: specialization pays only when subtasks are
+  structurally independent. Coupled subtasks: single-agent wins at
+  equal token budget, gains that ignore total compute don't count.
+- Adversarial review is the best-evidenced multi-agent win. Review
+  and debate panels: 3 specialists (odd, no ties); 4 stays the cap
+  for parallel workstreams.
+- How to split (and whether a split is too homogeneous) is the
+  Planner's call (S2), made after the spawn, never the gate's.
+---
+## 2. Planner [MULTI-AGENT]
+First sub-agent, created by calling the Task/Agent tool, never
+simulated inline by the orchestrator. No specialist work before
+Planner returns.
+Produces: subtasks with boundaries, dependency map, parallel groups
+(max 4), per-task file scope + objective + acceptance criteria, flags
+for single-agent subtasks and high-risk items, cross-talk pairs,
+token-cost assessment (flag >60% overlap), task-class match.
+Fewer broader > many narrow. Flag ambiguity, don't assume.
+Reading: recommends single-agent -> switch. Ambiguities -> surface.
+Task classes: Feature (spec/implement/test/integrate),
+Bug (reproduce/root-cause/fix/regress),
+Refactor (scope/refactor/test/verify),
+Audit (discover/analyze/consolidate), Docs+code (change/update/check).
+---
+## 3. Specialists [MULTI-AGENT]
+Manifest fields: ROLE, TASK, FILES (read/modify), UPSTREAM,
+ORIENTATION, ASSUMPTIONS, OUTPUT, ACCEPT, TOOLS (scoped), RULES (S7
+injected). ROLE = procedural workflow (step sequence + acceptance
+criteria), never a bare job title, identity labels alone don't
+change behavior.
+No conversation history, other tasks, full plan, or unrelated
+context. Isolation is the advantage. Out of scope: report and stop.
+---
+## 4. Cross-Talk [MULTI-AGENT]
+After each group: check if A modified B's files, changed B's
+interfaces, invalidated B's assumptions, or produced B's inputs.
+Route minimum context from A to B. If B completed, spawn correction
+agent. Orchestrator: spawn, sequence, detect, route, deliver. Never
+plan, code, review, or do specialist work.
+---
+## 5. Staff Engineer [MULTI-AGENT]
+Final sub-agent. Reviews integrated output.
+Packet: changed files + diffs, objective, decisions, risks,
+questions. Expand for: core architecture, security, central
+abstractions.
+Check: requirements met, specialist contradictions, cross-breakage
+(interfaces/imports/types/state), architectural drift, verification
+(S7.3), dead code/orphaned imports/incomplete renames,
+surgical-scope violations (S7.4).
+Returns PASS or FAIL (issues + owner + fix). Max 2 cycles, then
+deliver with issues listed.
+High-risk or contested verdicts: adversarial panel of 3 (odd, no
+ties), each prompted to refute, not confirm.
+---
+## 6. Orchestrator Discipline [MULTI-AGENT]
+- Route minimum viable info (signature, not 200-line diff)
+- Checkpoint before spawns/handoffs/resumes: objective, files,
+  requirements, decisions, risks, next action
+- Structured artifacts > transcript carryover
+- Stable scaffolds for cache reuse; no per-specialist rephrasing
+- Track agent status; report blocks immediately
+- Resume from latest artifact, not full history
+- Specialist fails: report, ask user. No silent retry >1
+- Deliver what asked. No gold-plating. Hooks > prompt reminders
+---
+## 9. Model Routing: full table
+Pick the cheapest model that handles the task. Orchestrator decides
+at spawn time; Planner (S2) assigns per subtask.
+| Tier | When | Examples |
+|------|------|----------|
+| Haiku | No edits, single source, low reasoning | Status lookup, chat, format, classify, extract |
+| **Sonnet** | 1-3 file edits, known scope. **Default** | Bug fix, refactor, test, review, docs |
+| Opus | 4+ files, novel design, high reversal cost | Architecture, security review, complex debug |
+| Frontier (Fable-class) | Orchestrator tier: long-horizon autonomous work, 1M-context audits, frontier reasoning | Orchestration, system design, deep multi-file debug, adversarial synthesis |
+When unsure: Sonnet.
+### Output caps
+Agent prompts MUST specify max response length. Oversized results
+bloat parent context and trigger compaction.
+| Agent tier | Cap | Exception |
+|------------|-----|-----------|
+| Haiku | 100 words | - |
+| Sonnet | 500 words | Code output (uncapped) |
+| Opus | Uncapped | - |
+| Frontier | Uncapped | - |
+| Explore | 200 words | Always, regardless of model |
+Explore agents: "report in under 200 words" in every prompt.
+### Tool-call budgets
+Action tokens are the third cost lever, beside output caps (above)
+and S8 input compression. Every subagent prompt carries a tool-call
+budget (manifest field `toolBudget`); idea adapted from
+claude-token-efficient (MIT).
+| Task type | Budget |
+|-----------|--------|
+| Routine subtask, known scope | ~20 calls |
+| Read-only research / Explore | ~10 calls |
+| Multi-file implementation | scale with file count; state it explicitly |
+Discipline inside the budget: read-first-write-once (read each
+needed file once, then edit, no re-read loops); one diagnostic read
+per failure, then the S7.3 two-attempt rule applies (stop, re-read
+from scratch, change approach). Budget exhausted: report progress
+and the named gap, never burn calls polling.
+Research agents returning raw dumps waste more tokens than they save.
+---
+## Self-evaluation (relocated S7.6)
+- Two perspectives: perfectionist critique + pragmatist accept
+- Bug autopsy: root cause vs symptom, prevention
+- After 2 failures: stop, re-read from scratch, different approach