opencode-model-router 1.0.7 → 1.1.1 (npm)

Files changed (4)

package/README.md CHANGED

@@ -6,11 +6,11 @@ An [OpenCode](https://opencode.ai) plugin that automatically routes tasks to tie
 The plugin injects a **delegation protocol** into the system prompt that teaches the primary agent to route work:
-| Tier | Default (Anthropic) | Purpose |
-|------|---------------------|---------|
-| `@fast` | Claude Haiku 4.5 | Exploration, search, file reads, grep |
-| `@medium` | Claude Sonnet 4.5 | Implementation, refactoring, tests, bug fixes |
-| `@heavy` | Claude Opus 4.6 | Architecture, complex debugging, security review |
+| Tier | Default (Anthropic) | Cost | Purpose |
+|------|---------------------|------|---------|
+| `@fast` | Claude Haiku 4.5 | 1x | Exploration, search, file reads, grep |
+| `@medium` | Claude Sonnet 4.5 | 5x | Implementation, refactoring, tests, bug fixes |
+| `@heavy` | Claude Opus 4.6 | 20x | Architecture, complex debugging, security review |
 The agent automatically delegates via the Task tool when it recognizes the task complexity, or when plan steps are annotated with `[tier:fast]`, `[tier:medium]`, or `[tier:heavy]` tags.
@@ -98,32 +98,32 @@ All configuration lives in `tiers.json` at the plugin root. Edit it to match you
 The plugin ships with four presets:
 **anthropic** (default):
-| Tier | Model | Notes |
-|------|-------|-------|
-| fast | `anthropic/claude-haiku-4-5` | Cheapest, fastest |
-| medium | `anthropic/claude-sonnet-4-5` | Extended thinking (variant: max) |
-| heavy | `anthropic/claude-opus-4-6` | Extended thinking (variant: max) |
+| Tier | Model | Cost | Notes |
+|------|-------|------|-------|
+| fast | `anthropic/claude-haiku-4-5` | 1x | Cheapest, fastest |
+| medium | `anthropic/claude-sonnet-4-5` | 5x | Extended thinking (variant: max) |
+| heavy | `anthropic/claude-opus-4-6` | 20x | Extended thinking (variant: max) |
 **openai**:
-| Tier | Model | Notes |
-|------|-------|-------|
-| fast | `openai/gpt-5.3-codex-spark` | Cheapest, fastest |
-| medium | `openai/gpt-5.3-codex` | Default settings (no variant/reasoning override) |
-| heavy | `openai/gpt-5.3-codex` | Variant: `xhigh` |
+| Tier | Model | Cost | Notes |
+|------|-------|------|-------|
+| fast | `openai/gpt-5.3-codex-spark` | 1x | Cheapest, fastest |
+| medium | `openai/gpt-5.3-codex` | 5x | Default settings (no variant/reasoning override) |
+| heavy | `openai/gpt-5.3-codex` | 20x | Variant: `xhigh` |
 **github-copilot**:
-| Tier | Model | Notes |
-|------|-------|-------|
-| fast | `github-copilot/claude-haiku-4-5` | Cheapest, fastest |
-| medium | `github-copilot/claude-sonnet-4-5` | Balanced coding model |
-| heavy | `github-copilot/claude-opus-4-6` | Variant: `thinking` |
+| Tier | Model | Cost | Notes |
+|------|-------|------|-------|
+| fast | `github-copilot/claude-haiku-4-5` | 1x | Cheapest, fastest |
+| medium | `github-copilot/claude-sonnet-4-5` | 5x | Balanced coding model |
+| heavy | `github-copilot/claude-opus-4-6` | 20x | Variant: `thinking` |
 **google**:
-| Tier | Model | Notes |
-|------|-------|-------|
-| fast | `google/gemini-2.5-flash` | Cheapest, fastest |
-| medium | `google/gemini-2.5-pro` | Balanced coding model |
-| heavy | `google/gemini-3-pro-preview` | Strongest reasoning in default set |
+| Tier | Model | Cost | Notes |
+|------|-------|------|-------|
+| fast | `google/gemini-2.5-flash` | 1x | Cheapest, fastest |
+| medium | `google/gemini-2.5-pro` | 5x | Balanced coding model |
+| heavy | `google/gemini-3-pro-preview` | 20x | Strongest reasoning in default set |
 Switch presets with the `/preset` command:
@@ -141,13 +141,14 @@ Add a new preset to the `presets` object in `tiers.json`:
     "my-preset": {
       "fast": {
         "model": "provider/model-name",
+        "costRatio": 1,
         "description": "What this tier does",
         "steps": 30,
         "prompt": "System prompt for the subagent",
         "whenToUse": ["Use case 1", "Use case 2"]
       },
-      "medium": { ... },
-      "heavy": { ... }
+      "medium": { "costRatio": 5, "..." : "..." },
+      "heavy": { "costRatio": 20, "..." : "..." }
     }
   }
 }
@@ -159,6 +160,7 @@ Each tier supports these fields:
 |-------|------|-------------|
 | `model` | string | Full model ID (`provider/model-name`) |
 | `variant` | string | Optional variant (e.g., `"max"` for extended thinking) |
+| `costRatio` | number | Relative cost multiplier (e.g., 1 for cheapest, 20 for most expensive). Injected into the system prompt so the agent considers cost when delegating. |
 | `thinking` | object | Anthropic thinking config: `{ "budgetTokens": 10000 }` |
 | `reasoning` | object | OpenAI reasoning config: `{ "effort": "high", "summary": "detailed" }` |
 | `description` | string | Human-readable description shown in `/tiers` |
@@ -167,6 +169,91 @@ Each tier supports these fields:
 | `color` | string | Optional display color |
 | `whenToUse` | string[] | List of use cases (shown in delegation protocol) |
+### Routing modes
+The plugin supports three routing modes that control how aggressively the agent delegates to cheaper tiers. Switch modes with the `/budget` command:
+| Mode | Default Tier | Behavior |
+|------|-------------|----------|
+| `normal` | `@medium` | Balanced quality and cost — delegates based on task complexity |
+| `budget` | `@fast` | Aggressive cost savings — defaults to cheapest tier, escalates only when needed |
+| `quality` | `@medium` | Quality-first — uses stronger models more liberally for better results |
+When a mode has `overrideRules`, those replace the global `rules` array in the system prompt. This lets each mode have fundamentally different delegation behavior.
+Configure modes in `tiers.json`:
+```json
+{
+  "modes": {
+    "normal": {
+      "defaultTier": "medium",
+      "description": "Balanced quality and cost"
+    },
+    "budget": {
+      "defaultTier": "fast",
+      "description": "Aggressive cost savings",
+      "overrideRules": [
+        "Default ALL tasks to @fast unless they clearly require code edits",
+        "Use @medium ONLY for: multi-file edits, complex refactors, test suites",
+        "Use @heavy ONLY when explicitly requested or after 2+ failed @medium attempts"
+      ]
+    },
+    "quality": {
+      "defaultTier": "medium",
+      "description": "Quality-first",
+      "overrideRules": [
+        "Default to @medium for all tasks including exploration",
+        "Use @heavy for architecture, debugging, security, or multi-file coordination",
+        "Use @fast only for trivial single-tool operations"
+      ]
+    }
+  }
+}
+```
+The active mode is persisted in `~/.config/opencode/opencode-model-router.state.json` and survives restarts.
+### Task taxonomy
+The `taskPatterns` object maps common coding task descriptions to tiers. This is injected into the system prompt as a routing guide so the agent can quickly look up which tier to use:
+```json
+{
+  "taskPatterns": {
+    "fast": [
+      "Find, search, locate, or grep files and code patterns",
+      "Read or display specific files or sections",
+      "Check git status, log, diff, or blame"
+    ],
+    "medium": [
+      "Implement a new feature, function, or component",
+      "Refactor or restructure existing code",
+      "Write or update tests",
+      "Fix a bug (first or second attempt)"
+    ],
+    "heavy": [
+      "Design system or module architecture from scratch",
+      "Debug a problem after 2+ failed attempts",
+      "Security audit or vulnerability review"
+    ]
+  }
+}
+```
+Customize these patterns to match your workflow. The agent uses them as heuristics, not hard rules.
+### Cost ratios
+Each tier's `costRatio` is injected into the system prompt so the agent is aware of relative costs:
+```
+Cost ratios: @fast=1x, @medium=5x, @heavy=20x.
+Always use the cheapest tier that can reliably handle the task.
+```
+Adjust `costRatio` values in each tier to reflect your actual provider pricing. The ratios don't need to be exact — they're directional signals for the agent.
 ### Rules
 The `rules` array in `tiers.json` controls when delegation happens. These are injected into the system prompt verbatim:
@@ -179,11 +266,33 @@ The `rules` array in `tiers.json` controls when delegation happens. These are in
     "Use @fast for any read-only exploration or research task",
     "Keep orchestration (planning, decisions, verification) for yourself -- delegate execution",
     "For trivial tasks (single grep, single file read), execute directly without delegation",
-    "Never delegate to @heavy if you are already running on an opus-class model -- do it yourself"
+    "Never delegate to @heavy if you are already running on an opus-class model -- do it yourself",
+    "If a task takes 1-2 tool calls, execute directly -- delegation overhead is not worth the cost",
+    "Consult the task routing guide below to match task type to the correct tier",
+    "Consider cost ratios when choosing tiers -- always use the cheapest tier that can reliably handle the task"
   ]
 }
 ```
+When a routing mode has `overrideRules`, those replace this array entirely for that mode.
+### Fallback
+The `fallback` section defines which presets to try when a provider fails:
+```json
+{
+  "fallback": {
+    "global": {
+      "anthropic": ["openai", "google", "github-copilot"],
+      "openai": ["anthropic", "google", "github-copilot"]
+    }
+  }
+}
+```
+When a delegated task fails with a provider/model/rate-limit error, the agent is instructed to retry with the next preset in the fallback chain.
 ## Commands
 | Command | Description |
@@ -191,6 +300,8 @@ The `rules` array in `tiers.json` controls when delegation happens. These are in
 | `/tiers` | Show active tier configuration and delegation rules |
 | `/preset` | List available presets |
 | `/preset <name>` | Switch to a different preset |
+| `/budget` | Show available routing modes and which is active |
+| `/budget <mode>` | Switch routing mode (`normal`, `budget`, or `quality`) |
 | `/annotate-plan [path]` | Annotate a plan file with `[tier:X]` tags for each step |
 ## Plan annotation
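
Note on the README changes above: the new "Routing modes" and "Rules" sections both say that a mode's `overrideRules` replace the global `rules` array. A minimal sketch of that selection follows, mirroring the expression that appears in the `src/index.ts` diff. The `MiniConfig` type, the `effectiveRules` name, and the sample rule strings are hypothetical and exist only for illustration; they are not part of the package.

```typescript
// Illustrative sketch only: trimmed-down types and made-up sample data.
interface ModeConfig {
  defaultTier: string;
  description: string;
  overrideRules?: string[];
}

// Hypothetical subset of the plugin's RouterConfig.
interface MiniConfig {
  activeMode?: string;
  rules: string[];
  modes?: Record<string, ModeConfig>;
}

/** Pick the rules that would be injected for the active mode. */
function effectiveRules(cfg: MiniConfig): string[] {
  const mode = cfg.activeMode ? cfg.modes?.[cfg.activeMode] : undefined;
  // A missing or empty overrideRules array falls back to the global rules.
  return mode?.overrideRules?.length ? mode.overrideRules : cfg.rules;
}

const sample: MiniConfig = {
  activeMode: "budget",
  rules: ["global rule A", "global rule B"],
  modes: {
    normal: { defaultTier: "medium", description: "Balanced quality and cost" },
    budget: {
      defaultTier: "fast",
      description: "Aggressive cost savings",
      overrideRules: ["budget rule 1"],
    },
  },
};

const budgetRules = effectiveRules(sample);
const normalRules = effectiveRules({ ...sample, activeMode: "normal" });
console.log(budgetRules); // the budget overrides
console.log(normalRules); // the global rules
```

Because the check is on `length`, a mode that declares `"overrideRules": []` behaves like a mode with no overrides at all and keeps the global rules.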

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "opencode-model-router",
-  "version": "1.0.7",
+  "version": "1.1.1",
   "description": "OpenCode plugin that routes tasks to tiered subagents (fast/medium/heavy) based on complexity",
   "type": "module",
   "main": "./src/index.ts",
@@ -35,5 +35,9 @@
   ],
   "peerDependencies": {
     "@opencode-ai/plugin": ">=1.0.0"
+  },
+  "devDependencies": {
+    "@types/node": "^25.2.3",
+    "typescript": "^5.9.3"
   }
 }

package/src/index.ts CHANGED

@@ -22,6 +22,7 @@ interface TierConfig {
   variant?: string;
   thinking?: ThinkingConfig;
   reasoning?: ReasoningConfig;
+  costRatio?: number;
   color?: string;
   description: string;
   steps?: number;
@@ -36,16 +37,26 @@ interface FallbackConfig {
   presets?: Record<string, Record<string, string[]>>;
 }
+interface ModeConfig {
+  defaultTier: string;
+  description: string;
+  overrideRules?: string[];
+}
 interface RouterConfig {
   activePreset: string;
+  activeMode?: string;
   presets: Record<string, Preset>;
   rules: string[];
   defaultTier: string;
   fallback?: FallbackConfig;
+  taskPatterns?: Record<string, string[]>;
+  modes?: Record<string, ModeConfig>;
 }
 interface RouterState {
   activePreset?: string;
+  activeMode?: string;
 }
 // ---------------------------------------------------------------------------
@@ -130,6 +141,39 @@ function validateConfig(raw: unknown): RouterConfig {
     throw new Error("tiers.json: 'defaultTier' must be a string");
   }
+  // Validate modes if present
+  if (obj.modes !== undefined) {
+    if (typeof obj.modes !== "object" || obj.modes === null || Array.isArray(obj.modes)) {
+      throw new Error("tiers.json: 'modes' must be an object");
+    }
+    const modes = obj.modes as Record<string, unknown>;
+    for (const [modeName, mode] of Object.entries(modes)) {
+      if (typeof mode !== "object" || mode === null) {
+        throw new Error(`tiers.json: mode '${modeName}' must be an object`);
+      }
+      const m = mode as Record<string, unknown>;
+      if (typeof m.defaultTier !== "string") {
+        throw new Error(`tiers.json: mode '${modeName}.defaultTier' must be a string`);
+      }
+      if (typeof m.description !== "string") {
+        throw new Error(`tiers.json: mode '${modeName}.description' must be a string`);
+      }
+    }
+  }
+  // Validate taskPatterns if present
+  if (obj.taskPatterns !== undefined) {
+    if (typeof obj.taskPatterns !== "object" || obj.taskPatterns === null || Array.isArray(obj.taskPatterns)) {
+      throw new Error("tiers.json: 'taskPatterns' must be an object");
+    }
+    const tp = obj.taskPatterns as Record<string, unknown>;
+    for (const [tierName, patterns] of Object.entries(tp)) {
+      if (!Array.isArray(patterns)) {
+        throw new Error(`tiers.json: taskPatterns.'${tierName}' must be an array of strings`);
+      }
+    }
+  }
   return raw as RouterConfig;
 }
@@ -150,9 +194,12 @@ function loadConfig(): RouterConfig {
           cfg.activePreset = resolved;
         }
       }
+      if (state.activeMode && cfg.modes?.[state.activeMode]) {
+        cfg.activeMode = state.activeMode;
+      }
     }
   } catch {
-    // Ignore state read errors and keep tiers.json active preset
+    // Ignore state read errors and keep tiers.json defaults
   }
   _cachedConfig = cfg;
@@ -160,6 +207,30 @@ function loadConfig(): RouterConfig {
   return cfg;
 }
+// ---------------------------------------------------------------------------
+// State persistence helpers
+// ---------------------------------------------------------------------------
+/** Read current persisted state (or empty object on failure). */
+function readState(): RouterState {
+  try {
+    if (existsSync(statePath())) {
+      return JSON.parse(readFileSync(statePath(), "utf-8")) as RouterState;
+    }
+  } catch {
+    // ignore
+  }
+  return {};
+}
+/** Write state to disk (merges with existing keys). */
+function writeState(patch: Partial<RouterState>): void {
+  const state = { ...readState(), ...patch };
+  const p = statePath();
+  mkdirSync(dirname(p), { recursive: true });
+  writeFileSync(p, JSON.stringify(state, null, 2) + "\n", "utf-8");
+}
 function saveActivePreset(presetName: string): void {
   const cfg = loadConfig();
   const resolved = resolvePresetName(cfg, presetName);
@@ -170,15 +241,23 @@ function saveActivePreset(presetName: string): void {
   cfg.activePreset = resolved;
   // Persist user-selected preset to state file only — never mutate tiers.json
-  const presetState: RouterState = { activePreset: resolved };
-  const p = statePath();
-  mkdirSync(dirname(p), { recursive: true });
-  writeFileSync(p, JSON.stringify(presetState, null, 2) + "\n", "utf-8");
+  writeState({ activePreset: resolved });
   // Invalidate cache so next read picks up the new active preset
   invalidateConfigCache();
 }
+function saveActiveMode(modeName: string): void {
+  const cfg = loadConfig();
+  if (!cfg.modes?.[modeName]) {
+    return;
+  }
+  cfg.activeMode = modeName;
+  writeState({ activeMode: modeName });
+  invalidateConfigCache();
+}
 function getActiveTiers(cfg: RouterConfig): Preset {
   return cfg.presets[cfg.activePreset] ?? Object.values(cfg.presets)[0]!;
 }
@@ -210,6 +289,15 @@ function buildAgentOptions(tier: TierConfig): Record<string, unknown> {
   return Object.keys(opts).length > 0 ? opts : {};
 }
+// ---------------------------------------------------------------------------
+// Mode helpers
+// ---------------------------------------------------------------------------
+function getActiveMode(cfg: RouterConfig): ModeConfig | undefined {
+  if (!cfg.modes || !cfg.activeMode) return undefined;
+  return cfg.modes[cfg.activeMode];
+}
 // ---------------------------------------------------------------------------
 // Fallback instructions builder
 // ---------------------------------------------------------------------------
@@ -222,23 +310,29 @@ function buildFallbackInstructions(cfg: RouterConfig): string {
   const map = presetMap && Object.keys(presetMap).length > 0 ? presetMap : fb.global;
   if (!map) return "";
-  const providerLines = Object.entries(map).flatMap(([provider, presetOrder]) => {
+  const chains = Object.entries(map).flatMap(([provider, presetOrder]) => {
     if (!Array.isArray(presetOrder)) return [];
-    const validOrder = presetOrder.filter(
-      (preset) => preset !== cfg.activePreset && Boolean(cfg.presets[preset]),
+    const valid = presetOrder.filter(
+      (p) => p !== cfg.activePreset && Boolean(cfg.presets[p]),
     );
-    return validOrder.length > 0 ? [`- ${provider}: ${validOrder.join(" -> ")}`] : [];
+    return valid.length > 0 ? [`${provider}→${valid.join("→")}`] : [];
   });
-  if (providerLines.length === 0) return "";
+  if (chains.length === 0) return "";
+  return `Err→retry-alt-tier→fail→direct. Chain: ${chains.join(" | ")}`;
+}
-  return [
-    "Fallback on delegated task errors:",
-    "1. If Task(...) returns provider/model/rate-limit/timeout/auth errors, retry once with a different tier suited to the same task.",
-    "2. If retry also fails, stop delegating that task and complete it directly in the primary agent.",
-    "3. Use the failing model prefix and this preset fallback order for next-run recovery (`/preset <name>` + restart):",
-    ...providerLines,
-  ].join("\n");
+// ---------------------------------------------------------------------------
+// Cost & taxonomy builders
+// ---------------------------------------------------------------------------
+function buildTaskTaxonomy(cfg: RouterConfig): string {
+  if (!cfg.taskPatterns || Object.keys(cfg.taskPatterns).length === 0) return "";
+  return Object.entries(cfg.taskPatterns)
+    .filter(([_, p]) => Array.isArray(p) && p.length > 0)
+    .map(([tier, patterns]) => `@${tier}→${(patterns as string[]).join("/")}`)
+    .join("\n");
 }
 // ---------------------------------------------------------------------------
@@ -248,40 +342,35 @@ function buildFallbackInstructions(cfg: RouterConfig): string {
 function buildDelegationProtocol(cfg: RouterConfig): string {
   const tiers = getActiveTiers(cfg);
-  const tierSummary = Object.entries(tiers)
+  // Compact tier summary: @name=model/variant(costRatio)
+  const tierLine = Object.entries(tiers)
     .map(([name, t]) => {
-      const shortModel = t.model.split("/").pop() ?? t.model;
-      const variant = t.variant ? ` (${t.variant})` : "";
-      return `@${name}=${shortModel}${variant}`;
+      const short = t.model.split("/").pop() ?? t.model;
+      const v = t.variant ? `/${t.variant}` : "";
+      const c = t.costRatio != null ? `(${t.costRatio}x)` : "";
+      return `@${name}=${short}${v}${c}`;
     })
-    .join(" | ");
+    .join(" ");
-  // Build per-tier whenToUse descriptions so the agent knows when to pick each tier
-  const tierDescriptions = Object.entries(tiers)
-    .map(([name, t]) => {
-      const uses = t.whenToUse.length > 0 ? t.whenToUse.join(", ") : t.description;
-      return `- @${name}: ${uses}`;
-    })
-    .join("\n");
+  // Compact mode
+  const mode = getActiveMode(cfg);
+  const modeSuffix = cfg.activeMode ? ` mode:${cfg.activeMode}` : "";
-  // Use configurable rules from tiers.json instead of hardcoded ones
-  const numberedRules = cfg.rules
-    .map((rule, i) => `${i + 1}. ${rule}`)
-    .join("\n");
+  // Compact task routing guide
+  const taxonomy = buildTaskTaxonomy(cfg);
-  const fallbackInstructions = buildFallbackInstructions(cfg);
+  // Compact rules
+  const effectiveRules = mode?.overrideRules?.length ? mode.overrideRules : cfg.rules;
+  const rulesLine = effectiveRules.map((r, i) => `${i + 1}.${r}`).join(" ");
+  const fallback = buildFallbackInstructions(cfg);
   return [
-    "## Model Delegation Protocol",
-    `Preset: ${cfg.activePreset}. Tiers: ${tierSummary}.`,
-    "",
-    "Tier capabilities:",
-    tierDescriptions,
-    "",
-    "Apply to every user message (plan and ad-hoc):",
-    numberedRules,
-    ...(fallbackInstructions ? ["", fallbackInstructions] : []),
-    "",
+    `## Model Delegation Protocol`,
+    `Preset: ${cfg.activePreset}. Tiers: ${tierLine}.${modeSuffix}`,
+    ...(taxonomy ? [taxonomy] : []),
+    rulesLine,
+    ...(fallback ? [fallback] : []),
     `Delegate with Task(subagent_type="fast|medium|heavy", prompt="...").`,
     "Keep orchestration and final synthesis in the primary agent.",
   ].join("\n");
@@ -320,6 +409,50 @@ function buildTiersOutput(cfg: RouterConfig): string {
   return lines.join("\n");
 }
+// ---------------------------------------------------------------------------
+// /budget command output
+// ---------------------------------------------------------------------------
+function buildBudgetOutput(cfg: RouterConfig, args: string): string {
+  const modes = cfg.modes;
+  if (!modes || Object.keys(modes).length === 0) {
+    return 'No modes configured in tiers.json. Add a "modes" section to enable budget mode.';
+  }
+  const requested = args.trim().toLowerCase();
+  const currentMode = cfg.activeMode || "normal";
+  // No args: show current mode and available modes
+  if (!requested) {
+    const lines = ["# Routing Modes\n"];
+    for (const [name, mode] of Object.entries(modes)) {
+      const active = name === currentMode ? " <- active" : "";
+      lines.push(`- **${name}**${active}: ${mode.description} (default tier: @${mode.defaultTier})`);
+    }
+    lines.push(`\nSwitch with: \`/budget <mode>\``);
+    return lines.join("\n");
+  }
+  // Switch mode
+  if (modes[requested]) {
+    saveActiveMode(requested);
+    const mode = modes[requested];
+    return [
+      `Routing mode switched to **${requested}**.`,
+      "",
+      mode.description,
+      `Default tier: @${mode.defaultTier}`,
+      ...(mode.overrideRules?.length
+        ? ["", "Active rules:", ...mode.overrideRules.map((r) => `- ${r}`)]
+        : []),
+      "",
+      "Mode change takes effect immediately on the next message.",
+    ].join("\n");
+  }
+  return `Unknown mode: "${requested}". Available: ${Object.keys(modes).join(", ")}`;
+}
 // ---------------------------------------------------------------------------
 // /preset command output
 // ---------------------------------------------------------------------------
@@ -413,6 +546,10 @@ const ModelRouterPlugin: Plugin = async (_ctx: PluginInput) => {
         template: "$ARGUMENTS",
         description: "Show or switch model presets (e.g., /preset openai)",
       };
+      opencodeConfig.command["budget"] = {
+        template: "$ARGUMENTS",
+        description: "Show or switch routing mode (e.g., /budget, /budget budget, /budget quality)",
+      };
       opencodeConfig.command["annotate-plan"] = {
         template: [
           "Annotate the plan with tier directives for model delegation.",
@@ -443,7 +580,7 @@ const ModelRouterPlugin: Plugin = async (_ctx: PluginInput) => {
     },
     // -----------------------------------------------------------------------
-    // Inject delegation protocol — uses cached config (invalidated on /preset)
+    // Inject delegation protocol — uses cached config (invalidated on /preset or /budget)
     // -----------------------------------------------------------------------
     "experimental.chat.system.transform": async (_input: any, output: any) => {
       try {
@@ -455,7 +592,7 @@ const ModelRouterPlugin: Plugin = async (_ctx: PluginInput) => {
     },
     // -----------------------------------------------------------------------
-    // Handle /tiers and /preset commands
+    // Handle /tiers, /preset, and /budget commands
     // -----------------------------------------------------------------------
     "command.execute.before": async (input: any, output: any) => {
       if (input.command === "tiers") {
@@ -474,6 +611,16 @@ const ModelRouterPlugin: Plugin = async (_ctx: PluginInput) => {
           text: buildPresetOutput(cfg, input.arguments ?? ""),
         });
       }
+      if (input.command === "budget") {
+        try {
+          cfg = loadConfig();
+        } catch {}
+        output.parts.push({
+          type: "text" as const,
+          text: buildBudgetOutput(cfg, input.arguments ?? ""),
+        });
+      }
     },
   };
 };
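
Note on the `buildDelegationProtocol` change above: the tier summary moves from a `|`-separated list to a compact `@name=model/variant(costRatio)` form joined by spaces. The sketch below reproduces only that formatting step so the resulting prompt header is easier to picture. `MiniTier` and `renderTierLine` are hypothetical names introduced here; the tier values are copied from the `anthropic` preset in `tiers.json`.

```typescript
// Illustrative sketch: the tierLine formatting with a trimmed-down tier type.
interface MiniTier {
  model: string;
  variant?: string;
  costRatio?: number;
}

/** Render "@name=model/variant(Nx)" for each tier, separated by spaces. */
function renderTierLine(tiers: Record<string, MiniTier>): string {
  return Object.entries(tiers)
    .map(([name, t]) => {
      // Drop the provider prefix: "anthropic/claude-haiku-4-5" -> "claude-haiku-4-5".
      const short = t.model.split("/").pop() ?? t.model;
      const v = t.variant ? `/${t.variant}` : "";
      const c = t.costRatio != null ? `(${t.costRatio}x)` : "";
      return `@${name}=${short}${v}${c}`;
    })
    .join(" ");
}

// Values copied from the anthropic preset in tiers.json.
const anthropicTiers: Record<string, MiniTier> = {
  fast: { model: "anthropic/claude-haiku-4-5", costRatio: 1 },
  medium: { model: "anthropic/claude-sonnet-4-5", variant: "max", costRatio: 5 },
  heavy: { model: "anthropic/claude-opus-4-6", variant: "max", costRatio: 20 },
};

const tierLine = renderTierLine(anthropicTiers);
console.log(`Preset: anthropic. Tiers: ${tierLine}. mode:normal`);
```

With these inputs the tier line reads `@fast=claude-haiku-4-5(1x) @medium=claude-sonnet-4-5/max(5x) @heavy=claude-opus-4-6/max(20x)`. A tier without `costRatio` simply omits the parenthesised suffix, since the field is optional in `TierConfig`.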

package/tiers.json CHANGED

@@ -1,9 +1,11 @@
 {
   "activePreset": "anthropic",
+  "activeMode": "normal",
   "presets": {
     "anthropic": {
       "fast": {
         "model": "anthropic/claude-haiku-4-5",
+        "costRatio": 1,
         "description": "Haiku 4.5 for exploration, search, and simple reads",
         "steps": 30,
         "prompt": "You are a fast exploration agent. Focus on speed and efficiency. Read files, search code, and return findings concisely. Do NOT make edits unless explicitly asked.",
@@ -17,6 +19,7 @@
       "medium": {
         "model": "anthropic/claude-sonnet-4-5",
         "variant": "max",
+        "costRatio": 5,
         "description": "Sonnet 4.5 max for implementation, refactoring, and tests",
         "steps": 50,
         "prompt": "You are an implementation agent. Write clean, production-quality code matching existing project patterns. Run linters/tests after changes when possible.",
@@ -31,6 +34,7 @@
       "heavy": {
         "model": "anthropic/claude-opus-4-6",
         "variant": "max",
+        "costRatio": 20,
         "description": "Opus 4.6 max for architecture, complex debugging, and security",
         "steps": 30,
         "prompt": "You are a senior architecture consultant. Analyze deeply, consider tradeoffs, and provide thorough reasoning. Be exhaustive in your analysis.",
@@ -45,6 +49,7 @@
     "openai": {
       "fast": {
         "model": "openai/gpt-5.3-codex-spark",
+        "costRatio": 1,
         "description": "GPT-5.3 Codex Spark for fast exploration and simple tasks",
         "steps": 30,
         "prompt": "You are a fast exploration agent. Focus on speed and efficiency. Read files, search code, and return findings concisely. Do NOT make edits unless explicitly asked.",
@@ -56,6 +61,7 @@
       },
       "medium": {
         "model": "openai/gpt-5.3-codex",
+        "costRatio": 5,
         "description": "GPT-5.3 Codex default settings for implementation and standard coding",
         "steps": 50,
         "prompt": "You are an implementation agent. Write clean, production-quality code matching existing project patterns. Run linters/tests after changes when possible.",
@@ -69,6 +75,7 @@
       "heavy": {
         "model": "openai/gpt-5.3-codex",
         "variant": "xhigh",
+        "costRatio": 20,
         "description": "GPT-5.3 Codex xhigh for architecture and complex tasks",
         "steps": 30,
         "prompt": "You are a senior architecture consultant. Analyze deeply, consider tradeoffs, and provide thorough reasoning.",
@@ -83,6 +90,7 @@
     "github-copilot": {
       "fast": {
         "model": "github-copilot/claude-haiku-4-5",
+        "costRatio": 1,
         "description": "Claude Haiku 4.5 via GitHub Copilot for fast exploration and simple tasks",
         "steps": 30,
         "prompt": "You are a fast exploration agent. Focus on speed and efficiency. Read files, search code, and return findings concisely. Do NOT make edits unless explicitly asked.",
@@ -95,6 +103,7 @@
       },
       "medium": {
         "model": "github-copilot/claude-sonnet-4-5",
+        "costRatio": 5,
         "description": "Claude Sonnet 4.5 via GitHub Copilot for implementation, refactoring, and tests",
         "steps": 50,
         "prompt": "You are an implementation agent. Write clean, production-quality code matching existing project patterns. Run linters/tests after changes when possible.",
@@ -109,6 +118,7 @@
       "heavy": {
         "model": "github-copilot/claude-opus-4-6",
         "variant": "thinking",
+        "costRatio": 20,
         "description": "Claude Opus 4.6 via GitHub Copilot for architecture, complex debugging, and security",
         "steps": 30,
         "prompt": "You are a senior architecture consultant. Analyze deeply, consider tradeoffs, and provide thorough reasoning. Be exhaustive in your analysis.",
@@ -123,6 +133,7 @@
     "google": {
       "fast": {
         "model": "google/gemini-2.5-flash",
+        "costRatio": 1,
         "description": "Gemini 2.5 Flash for fast exploration and simple tasks",
         "steps": 30,
         "prompt": "You are a fast exploration agent. Focus on speed and efficiency. Read files, search code, and return findings concisely. Do NOT make edits unless explicitly asked.",
@@ -135,6 +146,7 @@
       },
       "medium": {
         "model": "google/gemini-2.5-pro",
+        "costRatio": 5,
         "description": "Gemini 2.5 Pro for implementation, refactoring, and tests",
         "steps": 50,
         "prompt": "You are an implementation agent. Write clean, production-quality code matching existing project patterns. Run linters/tests after changes when possible.",
@@ -148,6 +160,7 @@
       },
       "heavy": {
         "model": "google/gemini-3-pro-preview",
+        "costRatio": 20,
         "description": "Gemini 3 Pro Preview for architecture, complex debugging, and security",
         "steps": 30,
         "prompt": "You are a senior architecture consultant. Analyze deeply, consider tradeoffs, and provide thorough reasoning. Be exhaustive in your analysis.",
@@ -160,6 +173,39 @@
       }
     }
   },
+  "taskPatterns": {
+    "fast": ["search", "grep", "read", "git-info", "ls", "lookup-docs/types", "count", "exists-check", "rename"],
+    "medium": ["impl-feature", "refactor", "write-tests", "bugfix(≤2)", "edit-logic", "code-review", "build-fix", "create-file", "db-migrate", "api-endpoint", "config-update"],
+    "heavy": ["arch-design", "debug(≥3fail)", "sec-audit", "perf-opt", "migrate-strategy", "multi-system-integration", "tradeoff-analysis", "rca"]
+  },
+  "modes": {
+    "normal": {
+      "defaultTier": "medium",
+      "description": "Balanced quality and cost — delegates based on task complexity"
+    },
+    "budget": {
+      "defaultTier": "fast",
+      "description": "Aggressive cost savings — defaults to cheapest tier, escalates only when needed",
+      "overrideRules": [
+        "default→@fast unless edits/complex-reasoning needed",
+        "@medium ONLY: multi-file-edit/refactor/test-suite/build-fix",
+        "@heavy ONLY: user-requested OR ≥2 @medium failures",
+        "trivial(grep/read/glob)→direct,no-delegate",
+        "batch related searches→single @fast",
+        "uncertain @fast vs @medium→@fast,escalate on fail"
+      ]
+    },
+    "quality": {
+      "defaultTier": "medium",
+      "description": "Quality-first — uses stronger models more liberally for better results",
+      "overrideRules": [
+        "default→@medium incl exploration when deep-context matters",
+        "@heavy: arch/debug/security/multi-file-coord",
+        "@fast ONLY: trivial single-tool ops (1 grep/1 read)",
+        "prefer thoroughness over speed"
+      ]
+    }
+  },
   "fallback": {
     "global": {
       "anthropic": ["openai", "google", "github-copilot"],
@@ -169,15 +215,14 @@
     }
   },
   "rules": [
-    "When a plan step contains [tier:fast], [tier:medium], or [tier:heavy], delegate to that agent",
-    "When a plan says 'use a fast/cheap model' -> delegate to @fast",
-    "When a plan says 'use a medium/balanced model' -> delegate to @medium",
-    "When a plan says 'use a heavy/powerful model' -> delegate to @heavy",
-    "Default to @medium for implementation tasks you could delegate",
-    "Use @fast for any read-only exploration or research task",
-    "Keep orchestration (planning, decisions, verification) for yourself - delegate execution",
-    "For trivial tasks (single grep, single file read), execute directly without delegation",
-    "Never delegate to @heavy if you are already running on an opus-class model - do it yourself"
+    "[tier:X]→delegate X",
+    "plan:fast/cheap→@fast | plan:medium→@medium | plan:heavy→@heavy",
+    "default:impl→@medium | readonly→@fast",
+    "orchestrate=self,delegate=exec",
+    "trivial(≤2tools)→direct,skip-delegate",
+    "self∈opus→never→@heavy,do-it-yourself",
+    "consult route-guide↑",
+    "min(cost,adequate-tier)"
   ],
   "defaultTier": "medium"
 }
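
Note on the `taskPatterns` block above: the shipped patterns are short tokens rather than the full sentences shown in the README, and `buildTaskTaxonomy` in `src/index.ts` joins each tier's tokens with `/`. The sketch below applies the same join to a shortened, hand-picked subset of the shipped tokens. `renderTaxonomy` and the `unused` tier are hypothetical and only illustrate how empty tiers are skipped.

```typescript
// Illustrative sketch: the taxonomy rendering on a shortened pattern set.
function renderTaxonomy(taskPatterns: Record<string, string[]>): string {
  return Object.entries(taskPatterns)
    // Tiers with no patterns are dropped from the routing guide.
    .filter(([, patterns]) => Array.isArray(patterns) && patterns.length > 0)
    .map(([tier, patterns]) => `@${tier}→${patterns.join("/")}`)
    .join("\n");
}

// Subset of the tokens shipped in tiers.json, plus a hypothetical empty tier.
const shortened: Record<string, string[]> = {
  fast: ["search", "grep", "read"],
  medium: ["impl-feature", "refactor", "write-tests"],
  heavy: ["arch-design", "sec-audit"],
  unused: [],
};

const guide = renderTaxonomy(shortened);
console.log(guide);
```

One consequence of the `/` separator: a token that itself contains a slash, such as the shipped `lookup-docs/types`, is indistinguishable from two separate tokens in the rendered line. Since the README describes the patterns as heuristics, this is likely harmless, but custom tokens may read more clearly without slashes.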