npm - moflo - Versions diffs - 4.9.11 → 4.9.13 - Mend

moflo 4.9.11 → 4.9.13

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

package/.claude/commands/simplify.md +78 -30
package/.claude/skills/eldar/SKILL.md +305 -0
package/.claude/skills/simplify/SKILL.md +90 -21
package/README.md +25 -0
package/bin/hooks.mjs +2 -2
package/bin/index-guidance.mjs +14 -24
package/bin/index-patterns.mjs +13 -10
package/bin/session-start-launcher.mjs +205 -11
package/dist/src/cli/commands/doctor-checks-deep.js +76 -0
package/dist/src/cli/commands/doctor.js +53 -1
package/dist/src/cli/config/moflo-config.js +14 -3
package/dist/src/cli/init/moflo-init.js +20 -266
package/dist/src/cli/init/moflo-yaml-template.js +370 -0
package/dist/src/cli/mcp-tools/hooks-tools.js +3 -1
package/dist/src/cli/movector/model-router.js +66 -20
package/dist/src/cli/services/hook-block-hash.js +341 -0
package/dist/src/cli/services/index.js +2 -0
package/dist/src/cli/version.js +1 -1
package/package.json +2 -2

package/.claude/commands/simplify.md CHANGED

@@ -1,55 +1,103 @@
 ---
-description: Review changed code for reuse, quality, and efficiency, then fix any issues found.
+description: Review changed code for reuse, quality, and efficiency, then fix any issues found. Sizes review effort to the diff and routes the cheapest model that fits.
 ---
-# /simplify — Gate-Compliant Code Review
+# /simplify — Adaptive Gate-Compliant Code Review
-Review all changed files for reuse opportunities, code quality, and efficiency improvements.
+Review changed code for reuse, quality, and efficiency. **Effort scales with diff size; model is routed for cost.** A 5-line comment trim does not get 3 Opus agents.
-**This command overrides any built-in simplify skill.** Follow these steps exactly.
+## Phase 0: Gate prerequisites
-## Prerequisites (MANDATORY — do these FIRST)
+These satisfy the memory-first and task-create-first gates. Always do them before any Agent spawn.
-1. **Memory search**: Search for relevant patterns before reviewing
-```
-mcp__moflo__memory_search — query: "code quality patterns", namespace: "patterns"
-```
+1. **Memory search** — `mcp__moflo__memory_search — query: "code quality patterns", namespace: "patterns"`
+2. **Task create** — `TaskCreate — subject: "🔍 [Reviewer] Simplify changed code"`
-2. **Create task**: Track the simplification work
-```
-TaskCreate — subject: "🔍 [Reviewer] Simplify changed code", description: "Review changed files for reuse, quality, and efficiency"
+## Phase 1: Identify the diff
+```bash
+git diff HEAD          # working tree
+git diff main...HEAD   # committed since branch base
 ```
-## Execution
+Treat the union as the diff. Note whether `/simplify` already ran on this branch in this session — if so, you are in a **validation pass** (Phase 4 below).
-After prerequisites are satisfied, get the list of changed files:
+## Phase 2: Classify the diff
-```bash
-git diff --name-only HEAD~1
-```
+Pick the smallest tier the diff genuinely fits.
-Then launch 3 reviewer agents **in parallel** (single message, multiple Agent tool calls).
+| Tier | Trigger | Action |
+|------|---------|--------|
+| **TRIVIAL** | ≤10 net LOC, single file, comments/formatting/local renames only | Self-review, zero agents |
+| **SMALL** | ≤200 net LOC, ≤2 files, no API/dependency change | **One agent** (default for most diffs, including critical surface) |
+| **NORMAL** | ≥3 files, OR >200 LOC, OR public API change, OR new/removed dependency, OR cross-cutting refactor | Three parallel agents |
-**CRITICAL**: Each agent prompt below includes a mandatory memory search step. This is required because subagents must satisfy the memory-first gate independently before using Glob, Grep, or Read tools. Do NOT remove the memory search from agent prompts.
+Critical-surface files (launcher, hooks, MCP wiring) raise the *care* of the agent prompt — sharper checklist, blast-radius framing — they do **not** automatically escalate to NORMAL. Risk-weighted ≠ headcount-weighted.
+## Phase 3: Route the model (skip for TRIVIAL)
+Before spawning any Agent, ask the moflo router which model to use:
-### Agent 1: Reuse Reviewer
 ```
-Agent — name: "reuse-reviewer", run_in_background: true, subagent_type: "reviewer", prompt: "FIRST ACTION: Run mcp__moflo__memory_search with query 'code reuse patterns' and namespace 'patterns'. You MUST do this before any Glob, Grep, or Read calls. THEN review these changed files for code reuse opportunities. Look for: duplicated logic that could use existing utilities, patterns already solved elsewhere in the codebase, opportunities to extract shared helpers. Files: $CHANGED_FILES"
+mcp__moflo__hooks_model-route — {
+  task: "Review N-line change in <files> for reuse, quality, efficiency",
+  preferCost: true
+}
 ```
-### Agent 2: Quality Reviewer
+**Wording rules:** the router's complexity score is keyword-sensitive. Avoid `refactor`, `architect`, `audit`, `system`, `redesign`, `migrate` — those force opus even when scoring suggests sonnet. State LOC count, file count, and "review for reuse, quality, efficiency". Nothing more.
+**Hard rule for `/simplify`: opus is never correct.** Code review never needs Opus reasoning, even on critical surface. If the router returns `opus`, downgrade to `sonnet`. On router failure, default to `sonnet`. Comment trims and pure formatting → `haiku`.
+## Phase 4: Validation pass (re-run after fixes from a prior simplify)
+If `/simplify` already ran on this branch in this session AND the only edits since are fixes the prior pass surfaced, **default to TRIVIAL self-review** regardless of LOC count. The fan-out happened; the fix is small relative to the already-reviewed diff.
+Escalate one tier (self-review → SMALL agent) only if the fix introduced a new file, a new exported symbol, a new dependency, or a control-flow change not covered by the original findings. Never escalate a validation pass to NORMAL.
+## Phase 5: Run the appropriate review
+### TRIVIAL / Validation
+Run the three category checks (reuse / quality / efficiency) yourself in one pass against the diff. Most TRIVIAL diffs are clean — confirm and exit. Budget: ~30 seconds, no Agent.
+### SMALL — one agent
 ```
-Agent — name: "quality-reviewer", run_in_background: true, subagent_type: "reviewer", prompt: "FIRST ACTION: Run mcp__moflo__memory_search with query 'code quality patterns' and namespace 'patterns'. You MUST do this before any Glob, Grep, or Read calls. THEN review these changed files for code quality issues. Look for: unclear naming, overly complex logic, missing error handling at system boundaries, potential bugs, consistency with existing patterns. Files: $CHANGED_FILES"
+Agent — {
+  subagent_type: "reviewer",
+  model: "<sonnet (or haiku for trivial-formatting tier from router)>",
+  prompt: "FIRST ACTION: Run mcp__moflo__memory_search with query 'code review patterns' and namespace 'patterns' to satisfy the memory-first gate. THEN review this diff for reuse, quality, and efficiency. <diff inline>. Flag specific issues as file:line + 1-line description. Max 5 file reads. Under 200 words. Skip cosmetic style. Don't suggest cross-cutting refactors of code outside this diff."
+}
 ```
-### Agent 3: Efficiency Reviewer
+For critical-surface files, prepend a 1-line risk note (e.g., "This is `bin/session-start-launcher.mjs` — runs in every consumer's session-start hot path; cross-platform + blast-radius matter."). One careful agent, not three.
+### NORMAL — three parallel agents
+Launch three agents in a single message, each at the routed model (typically sonnet). Each agent gets the SMALL-tier tool-budget cap.
+- **Agent 1 (Reuse):** existing helpers/utilities that should be used; duplicated patterns; functions re-implementing something already in the codebase.
+- **Agent 2 (Quality):** redundant state, parameter sprawl, copy-paste with variation, leaky abstractions, stringly-typed code, nested conditionals 3+ levels, unnecessary comments.
+- **Agent 3 (Efficiency):** unnecessary work, missed concurrency, hot-path bloat, recurring no-op updates, TOCTOU existence checks, unbounded structures, over-broad reads.
+Each agent prompt must start with `FIRST ACTION: mcp__moflo__memory_search ... namespace: "patterns"` — subagents must satisfy the memory-first gate independently before Glob/Grep/Read.
+## Phase 6: Fix or skip
+Aggregate findings. Fix each issue directly that's worth fixing. False positives or out-of-scope: note and skip without arguing.
+If fixes were made, re-run tests to confirm nothing broke. If tests fail after a fix, revert it.
+After fixes: the next `/simplify` invocation is a **validation pass** (Phase 4). Bundle related fixes into one batch so a single validation pass covers them — don't re-fan-out for cosmetic micro-corrections.
+## Phase 7: Optional — record routing outcome
+If you spawned an agent, feed back the outcome so the router learns:
 ```
-Agent — name: "efficiency-reviewer", run_in_background: true, subagent_type: "reviewer", prompt: "FIRST ACTION: Run mcp__moflo__memory_search with query 'performance optimization patterns' and namespace 'patterns'. You MUST do this before any Glob, Grep, or Read calls. THEN review these changed files for efficiency improvements. Look for: unnecessary allocations, O(n^2) where O(n) is possible, redundant operations, opportunities to batch or cache. Files: $CHANGED_FILES"
+mcp__moflo__hooks_model-outcome — { task: "...", model: "<chosen>", outcome: "success" | "failure" | "escalated" }
 ```
-## Post-Review
+`escalated` only when a real miss happened that a higher tier would have caught — never used to retroactively justify opus.
+## Briefly summarize
-1. Collect findings from all 3 reviewers
-2. Apply fixes that preserve ALL existing functionality — no behavior changes
-3. If fixes were made, re-run tests to confirm nothing broke
-4. If tests fail after fixes, revert the simplification changes
+End with one or two sentences: which tier ran, which model, what was fixed (or "clean — no changes"). No headers, no bullets.
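
Reviewer note: the tier table added in Phase 2 of the diff above can be read as a small decision function. The following is a minimal sketch under stated assumptions, not code from the moflo package: `classifyDiff` and its input fields (`netLoc`, `fileCount`, `cosmeticOnly`, `apiChange`, `dependencyChange`, `crossCutting`) are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper illustrating the TRIVIAL / SMALL / NORMAL table.
// The input shape is an assumption; moflo does not ship this function.
function classifyDiff({
  netLoc,
  fileCount,
  cosmeticOnly = false,      // comments, formatting, or local renames only
  apiChange = false,         // public API added, removed, or renamed
  dependencyChange = false,  // dependency added or removed
  crossCutting = false,      // same pattern touched in multiple modules
}) {
  // Any single NORMAL trigger is enough to escalate.
  if (fileCount >= 3 || netLoc > 200 || apiChange || dependencyChange || crossCutting) {
    return "NORMAL";
  }
  // TRIVIAL requires every condition to hold at once.
  if (netLoc <= 10 && fileCount === 1 && cosmeticOnly) {
    return "TRIVIAL";
  }
  // Everything else falls into the default single-agent tier.
  return "SMALL";
}
```

Critical-surface files are deliberately absent from the inputs: per the new text they sharpen the agent prompt but do not change the tier.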

package/.claude/skills/eldar/SKILL.md ADDED

@@ -0,0 +1,305 @@
+---
+name: eldar
+description: Consult the Eldar — audit a project's moflo + Claude Code setup for portable, high-leverage gaps and guide remediation. Default mode is read-only audit with severity-ranked findings; --fix presents an interactive triage menu and walks the user through each chosen fix (healer, missing CLAUDE.md, sparse guidance, hook/MCP wiring, empty memory namespaces, stack→guidance gaps). Use when starting in a new project, when Claude feels lost or inefficient, when guidance/CLAUDE.md is sparse, or as a periodic health check.
+arguments: "[--fix]"
+---
+# /eldar — Consult the Eldar
+The Eldar audit a project's moflo + Claude Code setup for portable, high-leverage gaps. **Audit is read-only by default; `--fix` walks through remediation.** The Eldar consult the **Healer** (`flo healer`, the thematic alias for `flo doctor`), they do not replace them.
+**Arguments:** $ARGUMENTS
+## Modes
+| Mode | Trigger | What it does |
+|------|---------|--------------|
+| Audit | no flag (default) | Read-only scan; produces categorized findings + top-3 recommendation |
+| Fix | `--fix` | Audit, then interactive triage menu; user picks findings to address one at a time |
+## Step 0 — Memory First
+Before any file reads, run:
+```
+mcp__moflo__memory_search { query: "guidance rules project conventions stack", namespace: "guidance" }
+```
+The memory-first gate blocks reads otherwise. The search also surfaces any project-specific conventions the Eldar should weigh in their findings.
+## Step 1 — Run the Audit
+Walk the checklist below in order. Each check is a single category in the final report. Be explicit about what you find — both presence and absence. Severities: `error` (blocks productive work), `warn` (degrades quality), `info` (suggestion).
+### 1a. Setup Health — call the Healer
+```bash
+npx moflo healer --json
+```
+Parse the JSON output. Surface every `failed` check as `error`, every `warn` as `warn`. Do **not** invoke `flo doctor` directly — use the `healer` alias for thematic consistency.
+### 1b. Index Freshness
+Check for `.moflo/moflo.db` (existence + mtime). Query memory namespaces to confirm guidance + code-map are populated:
+```
+mcp__moflo__memory_stats — { namespace: "guidance" }
+mcp__moflo__memory_stats — { namespace: "code-map" }
+```
+Flag if `entries === 0` (warn) or db missing (error).
+### 1c. Version Skew
+```bash
+npm view moflo version    # latest published
+node -e "console.log(require('./package.json').devDependencies?.moflo || require('./package.json').dependencies?.moflo || 'not-installed')"
+```
+Compute minor-version delta. Warn if behind by ≥3 minors; info if behind by 1–2.
+### 1d. Model & Token Routing
+```
+mcp__moflo__hooks_model-stats — {}
+```
+If recent sonnet→opus escalation rate exceeds ~30%, flag as `info`: "router escalating frequently — see `.claude/guidance/shipped/moflo-claude-swarm-cohesion.md` for tuning". If stats unavailable (no history), skip silently.
+### 1e. CLAUDE.md
+Check `CLAUDE.md` (and `.claude/CLAUDE.md`) for:
+| Check | Threshold | Severity |
+|-------|-----------|----------|
+| Exists | required | error if missing |
+| Line count | 20–500 | warn if outside range |
+| Referenced files exist | every relative path it cites | warn per missing path |
+Use `Grep` over the file content for `\.claude/[a-z-]+/[a-z-]+\.md` patterns and verify each path resolves.
+### 1f. Guidance Content
+Count `.md` files under `.claude/guidance/` (recursive). Severity table:
+| File count | Severity |
+|------------|----------|
+| 0 | warn — "no guidance docs; Claude has nothing project-specific to follow" |
+| 1–2 | warn — "very sparse guidance" |
+| 3–10 | info |
+| 11+ | info |
+### 1g. Guidance Structure (only if 1f found ≥1 file)
+Apply the universal rules from `.claude/guidance/shipped/moflo-guidance-rules.md`. For each `.md` file, check:
+- Has `**Purpose:**` line right after H1
+- Has `## See Also` at end
+- Under 500 lines
+- H2 headings are specific (not "Overview", "Configuration", "Examples")
+- No hedged language in rule contexts (`should`, `might`, `consider`)
+Do **not** duplicate `/guidance -a`'s logic verbatim — just produce a one-line summary per file (`<file>: <N issues>`). The Eldar surface gaps; `/guidance -a` does the deep audit.
+### 1h. Memory Health
+For each of the canonical namespaces, check entry count:
+```
+mcp__moflo__memory_stats — { namespace: "guidance" }
+mcp__moflo__memory_stats — { namespace: "patterns" }
+mcp__moflo__memory_stats — { namespace: "learnings" }
+```
+Flag empty `learnings` as `info` (project hasn't accumulated decisions yet — fine for new projects). Flag empty `guidance` as `warn` (no indexed guidance means semantic search is degraded).
+### 1i. Hooks & MCP Wiring
+Read `.claude/settings.json`. Check:
+| Check | Severity |
+|-------|----------|
+| Session-start hook references the moflo launcher | error if missing |
+| `mcpServers.moflo` is configured | error if missing |
+| `hooks` section exists with at least pre-task/post-task entries | warn if absent |
+If settings.json is malformed JSON, surface as `error`.
+### 1j. Settings Sanity
+Spot-check `.claude/settings.json` for:
+- `permissions` block exists (info if absent — every prompt becomes a confirmation)
+- `env` block has at least the moflo entries the launcher writes
+- `statusLine` is configured (info — quality-of-life, not blocking)
+### 1k. Spell Inventory
+```bash
+npx moflo spell list
+```
+Flag `info` if count is 0 (no spells registered — user may not know they exist).
+### 1l. Subagent Fleet
+```
+Glob — { pattern: ".claude/agents/**/*.md" }
+```
+Count the result. `info` if 0 (no project-specific subagents — user is relying entirely on built-ins).
+### 1m. Stack → Guidance Cross-Reference (highest leverage)
+Detect the project's stack from manifests:
+| Manifest | Detected stack |
+|----------|----------------|
+| `package.json` deps | Node — inspect for React, Next, Drizzle, Prisma, Express, NestJS, Vite, etc. |
+| `pyproject.toml` / `requirements.txt` | Python — Django, FastAPI, SQLAlchemy, etc. |
+| `Cargo.toml` | Rust — axum, tokio, sqlx, etc. |
+| `go.mod` | Go — gin, sqlc, gorm, etc. |
+| `Gemfile` | Ruby — Rails, Sidekiq, etc. |
+For each detected technology, check whether `.claude/guidance/` mentions it (Grep for the technology name across the directory). Each `(detected stack item, no guidance match)` pair becomes one `info` finding: "uses Drizzle ORM but no DB-conventions guidance — high-leverage gap".
+This is the **highest-impact finding** for new adopters. Lead with it in the recommendation.
+### 1n. Anti-Pattern from History (best-effort, optional)
+If recent transcripts/commits are accessible, scan them for repeated manual work that an existing spell or agent already covers (e.g., 5+ separate `git status`/`git diff`/run-tests sequences in a session that `/simplify` would have handled). Surface as `info`: "consider /simplify for review loops". If unavailable, skip silently — never block the audit on this.
+## Step 2 — Render the Report
+Output a single table grouped by category, sorted by severity (`error` → `warn` → `info`):
+```
+ELDAR AUDIT — <project name>
+─────────────────────────────
+Category               Finding                                    Severity
+─────────────────────────────────────────────────────────────────────────
+Setup health           Healer reports 0 errors, 1 warning         warn
+Index freshness        Guidance index empty                       warn
+CLAUDE.md              File missing                               error
+Guidance content       0 docs in .claude/guidance/                warn
+Memory health          guidance namespace empty                   warn
+Stack → guidance       Drizzle ORM in deps; no DB guidance        info
+Stack → guidance       React Native; no mobile guidance           info
+Hooks & MCP wiring     all wired                                  ok
+... (etc) ...
+```
+Then list the **top 3 ranked recommendations** in plain English, with rationale and citation:
+```
+TOP 3 RECOMMENDATIONS
+─────────────────────
+1. Add CLAUDE.md (error)
+   Without it, Claude has no project entry point. Use the Eldar's
+   stack-aware scaffold via `/eldar --fix`.
+2. Add Drizzle conventions guidance (info — high leverage)
+   You use Drizzle ORM but have no DB-conventions doc. This is the
+   single highest-leverage gap for getting Claude to write idiomatic
+   queries and migrations in your codebase.
+   See: .claude/guidance/shipped/moflo-guidance-rules.md
+3. Run `flo healer --fix` (warn)
+   One auto-fixable warning. Run via `/eldar --fix` and select Healer.
+```
+End the audit with a one-line prompt: "Run `/eldar --fix` to address these interactively."
+## Step 3 — Fix Mode (`--fix` flag only)
+After the report, present a numbered triage menu:
+```
+TRIAGE MENU
+───────────
+[1] Add CLAUDE.md
+[2] Add Drizzle conventions guidance
+[3] Run flo healer --fix (1 warning)
+[4] Add empty .claude/guidance/ docs to memory namespaces
+Choose: all, none, or comma-separated numbers (e.g., 1,3): _
+```
+Drive each chosen finding through its sub-flow. Confirm before any write.
+### 3a. CLAUDE.md scaffold
+Ask the user 2–4 targeted questions based on detected stack:
+1. "What does this project do? (1-2 sentences for Claude's context)"
+2. "Primary tech stack confirmed: <detected list>. Anything missing?"
+3. "Any conventions Claude should follow (testing approach, branch model, etc.)?"
+4. "Any high-blast-radius areas Claude should be careful with?"
+Compose a CLAUDE.md draft incorporating their answers + standard moflo memory-first rule. **Show the draft to the user before writing.** Never auto-fill opinionated content.
+### 3b. Stack → guidance authoring
+For each chosen stack-gap finding:
+- Hand off to `/guidance` skill for the heavy lifting — it already enforces the universal rules.
+- Brief the user on what gap will be filled: "drafting Drizzle conventions doc covering query patterns, migrations, schema files".
+- Ask 2–4 targeted questions about *their* conventions (not generic Drizzle tips — Claude should follow how *they* use it).
+- The `/guidance` skill produces the draft and walks the user through the rules check.
+### 3c. Healer fixes
+```bash
+npx moflo healer --fix
+```
+Pass through the output verbatim. If the Healer reports manual-only fixes, surface them as next steps.
+### 3d. Hook/MCP wiring repair
+Suggest:
+```bash
+npx moflo init --upgrade
+```
+This is the standard wiring repair path. If the user is wary of running init, surface the specific missing keys from `.claude/settings.json` and offer to write them directly.
+### 3e. Empty namespaces
+Suggest concrete first entries based on detected stack. Example: "Your project uses Drizzle. Want me to seed `learnings` with the most common Drizzle gotchas as a starting set? You'd review each before storage."
+If the user declines, that's fine — empty `learnings` is a valid state for a young project.
+### 3f. After each fix
+After each chosen fix completes, ask: "Continue to next finding? (y/n)". Don't run them all in a batch — every change is high-leverage and deserves the user's attention.
+## Step 4 — Wrap-Up
+After audit (or audit + chosen fixes), end with:
+- **Audit-only**: One sentence — what was found, what to do next.
+- **Fix mode**: One sentence per applied fix, plus a closing line on what remains.
+Never leave the user without a clear next step.
+## Important
+- **Memory-first is mandatory.** Step 0 runs the search; the gate blocks reads otherwise.
+- **Call the Healer, not the Doctor.** `npx moflo healer` (alias) — never `flo doctor` — for thematic consistency.
+- **No auto-write of opinionated content.** Every guidance doc, every CLAUDE.md draft, every namespace seed gets shown to the user first.
+- **Portable only.** This skill ships to consumers via `.claude/skills/**/*.md` in the package files array. Never assume moflo source paths or moflo-internal state.
+- **No kitchen sink.** The audit checklist is locked at the categories above. New checks require a specific portable benefit and an issue to discuss them.
+- **Read-only by default.** `/eldar` (no flag) never writes. Only `--fix` writes, and only with per-finding confirmation.
+- **Hand off to specialists.** `/guidance` for guidance authoring, `flo healer --fix` for setup repair, `flo init --upgrade` for wiring. The Eldar route, they don't reimplement.
+## See Also
+- `.claude/guidance/shipped/moflo-guidance-rules.md` — Universal guidance writing rules used by `/guidance` and surfaced in 1g
+- `.claude/skills/guidance/SKILL.md` — The skill `/eldar --fix` hands off to for guidance authoring
+- `.claude/guidance/shipped/moflo-core-guidance.md` — moflo CLI / hooks / memory reference; useful when explaining wiring findings
+- `.claude/guidance/shipped/moflo-claude-swarm-cohesion.md` — Subagent + task coordination reference cited in routing findings
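
Reviewer note: check 1c in the skill above states the version-skew thresholds in prose only (warn at 3 or more minors behind, info at 1 to 2). A minimal sketch of that rule follows. The function names are hypothetical, and the handling of unparseable versions and differing major versions ("unknown") is an assumption, since the skill text does not cover those cases.

```javascript
// Hypothetical helpers; not part of moflo.
// Extracts major/minor from strings such as "4.9.13" or "^4.9.11".
function parseMajorMinor(version) {
  const match = /(\d+)\.(\d+)\.(\d+)/.exec(String(version));
  return match ? { major: Number(match[1]), minor: Number(match[2]) } : null;
}

function versionSkewSeverity(installed, latest) {
  const a = parseMajorMinor(installed);
  const b = parseMajorMinor(latest);
  // Assumption: the skill is silent on these cases, so report "unknown".
  if (!a || !b || a.major !== b.major) return "unknown";
  const minorsBehind = b.minor - a.minor;
  if (minorsBehind >= 3) return "warn";
  if (minorsBehind >= 1) return "info";
  return "ok";
}
```

For the two versions in this diff, `versionSkewSeverity("^4.9.11", "4.9.13")` yields `"ok"`, because only the patch number differs.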

package/.claude/skills/simplify/SKILL.md CHANGED

@@ -5,7 +5,7 @@ description: Review changed code for reuse, quality, and efficiency, then fix an
 # /simplify — Adaptive Code Review
-Review changed code for reuse opportunities, quality issues, and efficiency improvements. **Effort scales with diff size** — a 5-line comment trim doesn't get the same treatment as a 500-line refactor.
+Review changed code for reuse opportunities, quality issues, and efficiency improvements. **Effort scales with diff size and reuses prior context** — a 5-line comment trim doesn't get the same treatment as a 500-line refactor, and a re-run after fixing pass-1 findings doesn't re-pay for a fresh fan-out.
 ## Phase 1: Identify changes
@@ -13,9 +13,11 @@ Run `git diff HEAD` (working tree) and `git diff main...HEAD` (committed) to get
 Treat the union of staged + unstaged + committed-since-base as the diff to review.
+Also note: was `/simplify` already run on this branch in this session? If yes, you're in a **validation pass** (Phase 2.5 below) — most of the heavy lifting is done.
 ## Phase 2: Classify the diff
-Pick the **smallest tier** the diff genuinely fits. When in doubt, escalate.
+Pick the **smallest tier** the diff genuinely fits. When in doubt, escalate one step (not two).
 ### TRIVIAL — self-review, no agent spawn
 ALL of these must hold:
@@ -28,38 +30,103 @@ ALL of these must hold:
 Examples that qualify: trimming a comment, fixing a typo in a log message, renaming a private helper, reformatting a single block.
 Examples that DON'T qualify: changing an `if` condition, reordering function args, deleting a try/catch.
-### SMALL — single agent, all three categories
+### SMALL — single agent, all three categories (DEFAULT for most diffs)
 ALL of these must hold:
-- ≤50 net LOC changed
+- ≤200 net LOC changed
 - ≤2 files
 - No structural changes (no new modules, no API additions/removals, no contract changes)
-Examples that qualify: extracting a constant, inlining a one-liner, swapping a `for` for a `forEach`, adding one early-return.
+This is the default tier for **most real diffs**, including changes to critical surface (launcher, hooks, MCP wiring). Critical surface raises the *care* of the agent prompt (sharper checklist, blast-radius framing), not the *number* of agents.
+Examples that qualify: extracting a constant, inlining a one-liner, swapping a `for` for a `forEach`, adding one early-return, refactoring a single function within a file, adding a cache fast-path inside an existing block.
-### NORMAL — three parallel agents (the original flow)
-Anything that doesn't fit TRIVIAL or SMALL. Includes any diff that:
-- Spans 3+ files
+### NORMAL — three parallel agents
+Reserved for **genuinely cross-cutting** changes. ANY of these triggers NORMAL:
+- 3+ files changed
+- >200 net LOC changed
 - Adds/removes/renames a public API
-- Changes control flow in a non-trivial way
 - Introduces or removes a dependency
-- Touches `bin/`, hooks, MCP tool handlers, or anything called out in `CLAUDE.md` as critical surface
+- Cross-cutting refactor (touches the same pattern in multiple modules)
+Three agents exist to cover orthogonal axes (Reuse / Quality / Efficiency) when the change is broad enough that one agent's tool-call budget can't survey it all. For single-file edits, one focused agent always covers all three axes — three is duplication, not coverage.
+## Phase 2.5: Validation pass (re-run after fixes)
+If `/simplify` already ran on this branch in this session AND the only edits since are fixes driven by the prior pass's findings, default to **self-review tier** regardless of LOC count. The fan-out already happened; the fix is small relative to the diff that was already reviewed.
+Escalate one tier (self-review → SMALL agent) only if the fix introduced any of:
+- A new file
+- A new exported symbol
+- A new dependency or import from a previously-untouched module
+- A change to control flow not covered in the original findings
+Do **not** escalate to NORMAL on a validation pass. If the fix is so structural that NORMAL is warranted, treat it as a fresh diff and start over from Phase 1.
+## Phase 2.7: Route the model (before any Agent spawn)
+For every tier that spawns an Agent (SMALL / NORMAL — TRIVIAL self-review skips this), call the moflo router to pick the cheapest model that fits the task **before** invoking Agent:
+```
+mcp__moflo__hooks_model-route — {
+  task: "<diff summary — see wording rules below>",
+  preferCost: true
+}
+```
+### Wording the task description
+The router's complexity score is keyword-sensitive. Words like `refactor`, `architect`, `audit`, `system`, `redesign`, `migrate` flip a high-complexity flag and force opus *even when scoring suggests sonnet*. For `/simplify` you are **always doing code review**, never genuine architecture, so frame the task accordingly:
+- ✅ Good: `"Review 110-line single-file change in bin/session-start-launcher.mjs for reuse, quality, efficiency."`
+- ❌ Bad: `"Review refactor that adds mtime-cache fast-path and architects new caching layer."`
-When CLAUDE.md flags a file as critical surface (SessionStart, launcher, hooks, MCP coordinator wiring, swarm/hive-mind), **always escalate to NORMAL** regardless of LOC count. Risk-weighted, not size-weighted.
+Drop the trigger words. State LOC count, file count, and "review for reuse, quality, efficiency". That's enough signal.
+### Applying the result
+The router returns `{ model: 'haiku' | 'sonnet' | 'opus', complexity, reasoning, alternatives, ... }`.
+**Hard rule for `/simplify`: opus is never correct.** Code review does not require Opus-tier reasoning even on critical surface. If the router returns `opus`:
+1. Look at `alternatives` — if `sonnet` scores higher than the selected model's confidence, downgrade to sonnet.
+2. Otherwise, downgrade to sonnet anyway (treat opus as "router was uncertain — pick the safer middle").
+Pass the final model verbatim to the Agent's `model` parameter (Agent accepts `'haiku' | 'sonnet' | 'opus'`). On router failure (MCP call errors), default to `'sonnet'`.
+In practice: comment trims and pure formatting → haiku; everything else for `/simplify` → sonnet.
+### Feed back the outcome
+After the agent completes, record the outcome so the router learns:
+```
+mcp__moflo__hooks_model-outcome — { task: "<same wording as route call>", model: "<chosen>", outcome: "success" | "failure" | "escalated" }
+```
+`escalated` = the agent missed something a higher-tier pass would have caught. That signal teaches the router to bias similar tasks upward next time. Don't fake `escalated` to retroactively justify opus — only record it when a *real* miss happened.
 ## Phase 3: Run the appropriate review
-### TRIVIAL: self-review
-Run the same three category checks (reuse / quality / efficiency) yourself, in one pass, against the diff. Most TRIVIAL diffs will be clean — the goal is to confirm, not to fan out. If you find an issue, fix it; otherwise stamp clean. Total budget: ~30 seconds, no Agent calls.
+### TRIVIAL / Validation: self-review
+Run the same three category checks (reuse / quality / efficiency) yourself, in one pass, against the diff. Most TRIVIAL and validation diffs will be clean — the goal is to confirm, not to fan out. If you find an issue, fix it; otherwise stamp clean. Total budget: ~30 seconds, no Agent calls. No router call needed.
-### SMALL: one agent
-Launch a SINGLE Agent with subagent_type `reviewer` covering all three categories in one prompt. Pass the diff inline. Budget: ~1 minute.
+### SMALL: one agent (model from router)
+Launch a SINGLE Agent with subagent_type `reviewer`, passing the model returned by Phase 2.7's router call. Cap the agent's tool budget by being explicit:
 ```
-Agent — subagent_type: "reviewer", prompt: "Review this diff for reuse, quality, and efficiency. <diff inline>. Flag specific issues with file:line; skip generic advice. Under 200 words."
+Agent — {
+  subagent_type: "reviewer",
+  model: "<from router, typically 'sonnet'>",
+  prompt: "Review this diff for reuse, quality, and efficiency. <diff inline>. Flag specific issues as file:line + 1-line description. Max 5 file reads. Under 200 words. Skip cosmetic style. Don't suggest cross-cutting refactors of code outside this diff."
+}
 ```
-### NORMAL: three parallel agents (original flow)
-Launch three agents in a single message — Reuse, Quality, Efficiency — passing the full diff to each. Use the original flow's category checklists.
+For critical-surface files, prepend a 1-line risk note to the prompt (e.g., "This is `bin/session-start-launcher.mjs` — runs in every consumer's session-start hot path; cross-platform + blast-radius matter."). One careful agent, not three.
+Budget: ~1 minute.
+### NORMAL: three parallel agents (model from router, applied to all)
+Launch three agents in a single message — Reuse, Quality, Efficiency — passing the full diff and the same routed `model` to each. Each agent gets the same tool-budget cap as SMALL.
 **Reuse**: existing helpers/utilities that should be used instead; duplicated patterns; new functions that re-implement something already in the codebase.
@@ -69,14 +136,16 @@ Launch three agents in a single message — Reuse, Quality, Efficiency — passi
 ## Phase 4: Fix or skip
-Aggregate findings. Fix each one directly. False positives or not-worth-fixing — note and skip without arguing. If TRIVIAL self-review found nothing, just confirm clean and exit.
+Aggregate findings. Fix each one directly. False positives or not-worth-fixing — note and skip without arguing. If self-review found nothing, just confirm clean and exit.
 If fixes were made, re-run tests to confirm nothing broke. If tests fail after a fix, revert it.
+After fixes: the next `/simplify` invocation is a **validation pass** (Phase 2.5). Do not re-fan-out unless the fix added genuinely new concerns — bundle related fixes into one batch so a single validation pass covers them.
 ## Phase 5: Stamp the gate
-Whatever tier ran, the gate (`check-before-pr`) registers /simplify as having executed. The skill is satisfied.
+Whatever tier ran, the gate (`check-before-pr`) registers /simplify as having executed. The skill is satisfied. Self-review counts.
 ## Briefly summarize
-End with one or two sentences: which tier, what was fixed (or "clean — no changes"). No headers, no bullets unless needed.
+End with one or two sentences: which tier ran, what was fixed (or "clean — no changes"). No headers, no bullets unless needed.

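As a reading aid for the tier sections changed above, the fan-out rules can be sketched as a small dispatch function. This is a minimal sketch, not moflo's implementation: `planReview`, the `VALIDATION` tier label, and the shape of the returned plan are hypothetical names introduced for illustration, and it covers only the tiers visible in this hunk. The `reviewer` subagent type and the budget wording are taken from the SMALL prompt in the diff.

```javascript
// Hypothetical sketch: function name, tier labels, and plan shape are
// illustrative assumptions, not part of the moflo package.
const CATEGORIES = ["Reuse", "Quality", "Efficiency"];

// Tool-budget cap quoted from the SMALL prompt; NORMAL agents get the same cap.
const BUDGET = "Max 5 file reads. Under 200 words.";

function planReview(tier, routedModel) {
  // TRIVIAL and validation passes: self-review, no Agent calls, no router call.
  if (tier === "TRIVIAL" || tier === "VALIDATION") {
    return { selfReview: true, agents: [] };
  }
  // Every other tier needs the model returned by the router.
  if (!routedModel) {
    throw new Error(`tier ${tier} needs a routed model`);
  }
  if (tier === "SMALL") {
    // One reviewer agent covering all three categories in a single prompt.
    return {
      selfReview: false,
      agents: [{
        subagent_type: "reviewer",
        model: routedModel,
        focus: CATEGORIES.join(" + "),
        budget: BUDGET,
      }],
    };
  }
  if (tier === "NORMAL") {
    // Three parallel agents, one per category, all on the same routed model.
    return {
      selfReview: false,
      agents: CATEGORIES.map((focus) => ({ model: routedModel, focus, budget: BUDGET })),
    };
  }
  throw new Error(`unknown tier: ${tier}`);
}
```

Calling `planReview("NORMAL", "sonnet")` would yield three agent entries sharing one model, while `planReview("TRIVIAL")` yields none, which mirrors the "one careful agent, not three" intent of the change.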
package/README.md CHANGED

@@ -431,6 +431,18 @@ Inside your AI client, use the `/spell-builder` skill to create, edit, and valid
 /spell-builder                           # Start the spell builder
 ```
+### Other AI-client skills shipped with MoFlo
+Beyond `/flo`, `/spell-builder`, and `/eldar`, MoFlo ships a handful of focused slash-command skills that work in any consumer project once you `flo init`:
+| Skill | Purpose |
+|-------|---------|
+| `/guidance` | Author and audit guidance docs in `.claude/guidance/`. Default mode walks you through one doc; `/guidance -a` audits every doc against the universal guidance rules (Purpose lines, See Also, line counts, hedged language). |
+| `/simplify` | Adaptive code review on the current diff. Tier-based fan-out — trivial edits get a self-review, small diffs get one routed agent, cross-cutting refactors get three parallel agents. Routes through the moflo model router for cost-aware execution. |
+| `/spell-schedule` | Schedule a spell on the local moflo daemon (cron, interval, or one-time) without leaving the chat. For remote Anthropic-cloud agents on a schedule, use Claude Code's built-in `/schedule` instead. |
+Run any of them with no arguments to see full usage, or browse the source in `.claude/skills/` (each skill is a single `SKILL.md` file).
 ### Epics
 Epics are a specialized process for handling GitHub issues that contain multiple child stories. When you pass a GitHub issue to `/flo` and it's detected as an epic, MoFlo processes each child story sequentially through the full `/flo` process (research → implement → test → PR).
@@ -553,6 +565,19 @@ flo healer -c embeddings         # Check only embeddings health
 flo healer --verbose             # Verbose output
 ```
+#### `/eldar` — Consult the Eldar (project setup audit + wizard)
+Where the Healer checks your moflo install, `/eldar` audits how Claude is set up to *use* the project — guidance, CLAUDE.md, memory namespaces, hook/MCP wiring, model routing, and stack-aware guidance gaps — then walks you through fixing whichever findings you pick. Use it when starting in a new project, when Claude feels lost or inefficient, or as a periodic health check.
+```
+/eldar                           # Read-only audit; categorized report + top-3 ranked recommendation
+/eldar --fix                     # Audit, then interactive triage menu — pick which findings to address
+```
+The Eldar **consult** the Healer (they call `flo healer --json` as one of the audit checks) — they don't replace it. Categories audited include setup health, index freshness, version skew, model/token routing, CLAUDE.md size + reference integrity, guidance content + structure, memory health, hook/MCP wiring, settings sanity, spell + subagent inventory, **stack → guidance cross-reference** (detects tech from package.json/pyproject.toml/Cargo.toml/go.mod and flags every detected technology with no matching guidance doc — the highest-leverage finding for new adopters), and best-effort anti-pattern detection from history.
+In `--fix` mode, each chosen finding drives the appropriate sub-flow: Healer for setup repair, the `/guidance` skill for guidance authoring (wizard, never autogen), a stack-aware scaffold for missing CLAUDE.md, `flo init --upgrade` for hook/MCP wiring. Every write is confirmed before it lands.
 #### `flo diagnose` — Integration Tests
 While `healer` checks your environment, `diagnose` exercises every subsystem end-to-end: memory CRUD, embedding generation, semantic search, swarm lifecycle, hive-mind consensus, task management, hooks, config, neural patterns, and init idempotency. All test data is cleaned up after each test — nothing is left behind.