opencode-model-router 1.0.7 → 1.1.1

Files changed (4)
  1. package/README.md +139 -28
  2. package/package.json +5 -1
  3. package/src/index.ts +193 -46
  4. package/tiers.json +54 -9
package/README.md CHANGED
@@ -6,11 +6,11 @@ An [OpenCode](https://opencode.ai) plugin that automatically routes tasks to tie
 
  The plugin injects a **delegation protocol** into the system prompt that teaches the primary agent to route work:
 
- | Tier | Default (Anthropic) | Purpose |
- |------|---------------------|---------|
- | `@fast` | Claude Haiku 4.5 | Exploration, search, file reads, grep |
- | `@medium` | Claude Sonnet 4.5 | Implementation, refactoring, tests, bug fixes |
- | `@heavy` | Claude Opus 4.6 | Architecture, complex debugging, security review |
+ | Tier | Default (Anthropic) | Cost | Purpose |
+ |------|---------------------|------|---------|
+ | `@fast` | Claude Haiku 4.5 | 1x | Exploration, search, file reads, grep |
+ | `@medium` | Claude Sonnet 4.5 | 5x | Implementation, refactoring, tests, bug fixes |
+ | `@heavy` | Claude Opus 4.6 | 20x | Architecture, complex debugging, security review |
 
  The agent automatically delegates via the Task tool when it recognizes the task complexity, or when plan steps are annotated with `[tier:fast]`, `[tier:medium]`, or `[tier:heavy]` tags.
 
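The `[tier:X]` annotation mentioned in this hunk is simple to recognise mechanically. The following is a minimal sketch only: `extractTierTag` is a hypothetical helper and is not part of the plugin, which leaves tag handling to the agent via the system prompt.

```typescript
type Tier = "fast" | "medium" | "heavy";

// Hypothetical helper (not in the package): returns the tier named by the
// first [tier:X] tag in a plan step, or undefined when no known tag is found.
function extractTierTag(step: string): Tier | undefined {
  const match = /\[tier:(fast|medium|heavy)\]/.exec(step);
  return match ? (match[1] as Tier) : undefined;
}

const tagged = extractTierTag("3. Refactor the auth module [tier:medium]");
const untagged = extractTierTag("4. Review the results");
```

Here `tagged` is `"medium"` and `untagged` is `undefined`, so an untagged step would fall through to whatever default the active routing mode prescribes.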
@@ -98,32 +98,32 @@ All configuration lives in `tiers.json` at the plugin root. Edit it to match you
  The plugin ships with four presets:
 
  **anthropic** (default):
- | Tier | Model | Notes |
- |------|-------|-------|
- | fast | `anthropic/claude-haiku-4-5` | Cheapest, fastest |
- | medium | `anthropic/claude-sonnet-4-5` | Extended thinking (variant: max) |
- | heavy | `anthropic/claude-opus-4-6` | Extended thinking (variant: max) |
+ | Tier | Model | Cost | Notes |
+ |------|-------|------|-------|
+ | fast | `anthropic/claude-haiku-4-5` | 1x | Cheapest, fastest |
+ | medium | `anthropic/claude-sonnet-4-5` | 5x | Extended thinking (variant: max) |
+ | heavy | `anthropic/claude-opus-4-6` | 20x | Extended thinking (variant: max) |
 
  **openai**:
- | Tier | Model | Notes |
- |------|-------|-------|
- | fast | `openai/gpt-5.3-codex-spark` | Cheapest, fastest |
- | medium | `openai/gpt-5.3-codex` | Default settings (no variant/reasoning override) |
- | heavy | `openai/gpt-5.3-codex` | Variant: `xhigh` |
+ | Tier | Model | Cost | Notes |
+ |------|-------|------|-------|
+ | fast | `openai/gpt-5.3-codex-spark` | 1x | Cheapest, fastest |
+ | medium | `openai/gpt-5.3-codex` | 5x | Default settings (no variant/reasoning override) |
+ | heavy | `openai/gpt-5.3-codex` | 20x | Variant: `xhigh` |
 
  **github-copilot**:
- | Tier | Model | Notes |
- |------|-------|-------|
- | fast | `github-copilot/claude-haiku-4-5` | Cheapest, fastest |
- | medium | `github-copilot/claude-sonnet-4-5` | Balanced coding model |
- | heavy | `github-copilot/claude-opus-4-6` | Variant: `thinking` |
+ | Tier | Model | Cost | Notes |
+ |------|-------|------|-------|
+ | fast | `github-copilot/claude-haiku-4-5` | 1x | Cheapest, fastest |
+ | medium | `github-copilot/claude-sonnet-4-5` | 5x | Balanced coding model |
+ | heavy | `github-copilot/claude-opus-4-6` | 20x | Variant: `thinking` |
 
  **google**:
- | Tier | Model | Notes |
- |------|-------|-------|
- | fast | `google/gemini-2.5-flash` | Cheapest, fastest |
- | medium | `google/gemini-2.5-pro` | Balanced coding model |
- | heavy | `google/gemini-3-pro-preview` | Strongest reasoning in default set |
+ | Tier | Model | Cost | Notes |
+ |------|-------|------|-------|
+ | fast | `google/gemini-2.5-flash` | 1x | Cheapest, fastest |
+ | medium | `google/gemini-2.5-pro` | 5x | Balanced coding model |
+ | heavy | `google/gemini-3-pro-preview` | 20x | Strongest reasoning in default set |
 
  Switch presets with the `/preset` command:
 
@@ -141,13 +141,14 @@ Add a new preset to the `presets` object in `tiers.json`:
      "my-preset": {
        "fast": {
          "model": "provider/model-name",
+         "costRatio": 1,
          "description": "What this tier does",
          "steps": 30,
          "prompt": "System prompt for the subagent",
          "whenToUse": ["Use case 1", "Use case 2"]
        },
-       "medium": { ... },
-       "heavy": { ... }
+       "medium": { "costRatio": 5, "..." : "..." },
+       "heavy": { "costRatio": 20, "..." : "..." }
      }
    }
  }
@@ -159,6 +160,7 @@ Each tier supports these fields:
  |-------|------|-------------|
  | `model` | string | Full model ID (`provider/model-name`) |
  | `variant` | string | Optional variant (e.g., `"max"` for extended thinking) |
+ | `costRatio` | number | Relative cost multiplier (e.g., 1 for cheapest, 20 for most expensive). Injected into the system prompt so the agent considers cost when delegating. |
  | `thinking` | object | Anthropic thinking config: `{ "budgetTokens": 10000 }` |
  | `reasoning` | object | OpenAI reasoning config: `{ "effort": "high", "summary": "detailed" }` |
  | `description` | string | Human-readable description shown in `/tiers` |
@@ -167,6 +169,91 @@ Each tier supports these fields:
  | `color` | string | Optional display color |
  | `whenToUse` | string[] | List of use cases (shown in delegation protocol) |
 
+ ### Routing modes
+
+ The plugin supports three routing modes that control how aggressively the agent delegates to cheaper tiers. Switch modes with the `/budget` command:
+
+ | Mode | Default Tier | Behavior |
+ |------|-------------|----------|
+ | `normal` | `@medium` | Balanced quality and cost — delegates based on task complexity |
+ | `budget` | `@fast` | Aggressive cost savings — defaults to cheapest tier, escalates only when needed |
+ | `quality` | `@medium` | Quality-first — uses stronger models more liberally for better results |
+
+ When a mode has `overrideRules`, those replace the global `rules` array in the system prompt. This lets each mode have fundamentally different delegation behavior.
+
+ Configure modes in `tiers.json`:
+
+ ```json
+ {
+   "modes": {
+     "normal": {
+       "defaultTier": "medium",
+       "description": "Balanced quality and cost"
+     },
+     "budget": {
+       "defaultTier": "fast",
+       "description": "Aggressive cost savings",
+       "overrideRules": [
+         "Default ALL tasks to @fast unless they clearly require code edits",
+         "Use @medium ONLY for: multi-file edits, complex refactors, test suites",
+         "Use @heavy ONLY when explicitly requested or after 2+ failed @medium attempts"
+       ]
+     },
+     "quality": {
+       "defaultTier": "medium",
+       "description": "Quality-first",
+       "overrideRules": [
+         "Default to @medium for all tasks including exploration",
+         "Use @heavy for architecture, debugging, security, or multi-file coordination",
+         "Use @fast only for trivial single-tool operations"
+       ]
+     }
+   }
+ }
+ ```
+
+ The active mode is persisted in `~/.config/opencode/opencode-model-router.state.json` and survives restarts.
+
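The `loadConfig` change in `package/src/index.ts` (later in this diff) applies a persisted mode only when it still names a configured mode. A minimal sketch of that check as a pure function follows; `resolveActiveMode` is a hypothetical name used for illustration and does not appear in the package.

```typescript
interface ModeConfig {
  defaultTier: string;
  description: string;
  overrideRules?: string[];
}

// Hypothetical pure version of the load-time check: a persisted mode wins
// only when it names a mode that is still configured; otherwise the value
// from tiers.json is kept.
function resolveActiveMode(
  configured: string | undefined,
  persisted: string | undefined,
  modes: Record<string, ModeConfig> | undefined,
): string | undefined {
  if (persisted && modes?.[persisted]) return persisted;
  return configured;
}

const modes: Record<string, ModeConfig> = {
  normal: { defaultTier: "medium", description: "Balanced quality and cost" },
  budget: { defaultTier: "fast", description: "Aggressive cost savings" },
};

const restored = resolveActiveMode("normal", "budget", modes);
const ignored = resolveActiveMode("normal", "turbo", modes);
```

With these inputs `restored` is `"budget"` and `ignored` is `"normal"`: a stale state file that names a removed mode is silently ignored rather than raising an error.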
+ ### Task taxonomy
+
+ The `taskPatterns` object maps common coding task descriptions to tiers. This is injected into the system prompt as a routing guide so the agent can quickly look up which tier to use:
+
+ ```json
+ {
+   "taskPatterns": {
+     "fast": [
+       "Find, search, locate, or grep files and code patterns",
+       "Read or display specific files or sections",
+       "Check git status, log, diff, or blame"
+     ],
+     "medium": [
+       "Implement a new feature, function, or component",
+       "Refactor or restructure existing code",
+       "Write or update tests",
+       "Fix a bug (first or second attempt)"
+     ],
+     "heavy": [
+       "Design system or module architecture from scratch",
+       "Debug a problem after 2+ failed attempts",
+       "Security audit or vulnerability review"
+     ]
+   }
+ }
+ ```
+
+ Customize these patterns to match your workflow. The agent uses them as heuristics, not hard rules.
+
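To make the injected routing guide concrete, here is a small sketch that mirrors the shape of `buildTaskTaxonomy` from `package/src/index.ts` in this diff: one line per tier, patterns joined with `/`. The function name `renderTaskTaxonomy` and the sample input are illustrative assumptions.

```typescript
// Sketch mirroring buildTaskTaxonomy from this diff: tiers with no patterns
// are dropped, and each remaining tier becomes "@tier→a/b/c".
function renderTaskTaxonomy(taskPatterns: Record<string, string[]>): string {
  return Object.entries(taskPatterns)
    .filter(([, patterns]) => patterns.length > 0)
    .map(([tier, patterns]) => `@${tier}→${patterns.join("/")}`)
    .join("\n");
}

const guide = renderTaskTaxonomy({
  fast: ["search", "grep", "read"],
  medium: ["impl-feature", "refactor"],
  heavy: [],
});
```

For this input `guide` contains two lines, `@fast→search/grep/read` and `@medium→impl-feature/refactor`. Short pattern names keep the guide compact, which is presumably why the shipped `tiers.json` uses abbreviations rather than the full sentences shown in the README.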
+ ### Cost ratios
+
+ Each tier's `costRatio` is injected into the system prompt so the agent is aware of relative costs:
+
+ ```
+ Cost ratios: @fast=1x, @medium=5x, @heavy=20x.
+ Always use the cheapest tier that can reliably handle the task.
+ ```
+
+ Adjust `costRatio` values in each tier to reflect your actual provider pricing. The ratios don't need to be exact — they're directional signals for the agent.
+
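The README's example line can be produced from the tier objects with a few lines of code. This is a sketch under assumptions: `formatCostRatios` is a hypothetical name, and it follows the README wording. The 1.1.1 source in this diff instead appends the ratio to each entry of the tier summary (for example `(5x)`).

```typescript
interface TierCost {
  costRatio?: number;
}

// Hypothetical formatter for the README's example line. Tiers without a
// costRatio are skipped rather than assigned a guessed value.
function formatCostRatios(tiers: Record<string, TierCost>): string {
  const parts = Object.entries(tiers)
    .filter(([, tier]) => tier.costRatio != null)
    .map(([name, tier]) => `@${name}=${tier.costRatio}x`);
  return parts.length > 0 ? `Cost ratios: ${parts.join(", ")}.` : "";
}

const costLine = formatCostRatios({
  fast: { costRatio: 1 },
  medium: { costRatio: 5 },
  heavy: { costRatio: 20 },
});
```

With the default ratios, `costLine` is `Cost ratios: @fast=1x, @medium=5x, @heavy=20x.`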
  ### Rules
 
  The `rules` array in `tiers.json` controls when delegation happens. These are injected into the system prompt verbatim:
@@ -179,11 +266,33 @@ The `rules` array in `tiers.json` controls when delegation happens. These are in
      "Use @fast for any read-only exploration or research task",
      "Keep orchestration (planning, decisions, verification) for yourself -- delegate execution",
      "For trivial tasks (single grep, single file read), execute directly without delegation",
-     "Never delegate to @heavy if you are already running on an opus-class model -- do it yourself"
+     "Never delegate to @heavy if you are already running on an opus-class model -- do it yourself",
+     "If a task takes 1-2 tool calls, execute directly -- delegation overhead is not worth the cost",
+     "Consult the task routing guide below to match task type to the correct tier",
+     "Consider cost ratios when choosing tiers -- always use the cheapest tier that can reliably handle the task"
    ]
  }
  ```
 
+ When a routing mode has `overrideRules`, those replace this array entirely for that mode.
+
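The replacement behaviour follows directly from the expression used in `buildDelegationProtocol` later in this diff (`mode?.overrideRules?.length ? mode.overrideRules : cfg.rules`). A self-contained sketch, with `effectiveRules` as an illustrative function name:

```typescript
interface ModeConfig {
  defaultTier: string;
  description: string;
  overrideRules?: string[];
}

// A mode's overrideRules replace the global rules only when the array is
// present and non-empty; a missing or empty array falls back to the globals.
function effectiveRules(globalRules: string[], mode?: ModeConfig): string[] {
  return mode?.overrideRules?.length ? mode.overrideRules : globalRules;
}

const globalRules = ["Default to @medium for implementation tasks you could delegate"];
const budget: ModeConfig = {
  defaultTier: "fast",
  description: "Aggressive cost savings",
  overrideRules: ["Default ALL tasks to @fast unless they clearly require code edits"],
};
const normal: ModeConfig = { defaultTier: "medium", description: "Balanced quality and cost" };

const budgetRules = effectiveRules(globalRules, budget);
const normalRules = effectiveRules(globalRules, normal);
```

One consequence worth noting: an empty `overrideRules: []` does not clear the rules. It behaves the same as omitting the field.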
+ ### Fallback
+
+ The `fallback` section defines which presets to try when a provider fails:
+
+ ```json
+ {
+   "fallback": {
+     "global": {
+       "anthropic": ["openai", "google", "github-copilot"],
+       "openai": ["anthropic", "google", "github-copilot"]
+     }
+   }
+ }
+ ```
+
+ When a delegated task fails with a provider/model/rate-limit error, the agent is instructed to retry with the next preset in the fallback chain.
+
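Not every entry in a fallback chain is usable at runtime. Following the filter in `buildFallbackInstructions` (shown in the `src/index.ts` diff), the active preset and any preset name that is not configured are dropped. The sketch below isolates that step; `usableFallbacks` and its parameters are illustrative names.

```typescript
// Sketch of the filtering applied to one fallback chain: drop the active
// preset and any preset name that is not actually configured.
function usableFallbacks(
  chain: string[],
  activePreset: string,
  configuredPresets: string[],
): string[] {
  return chain.filter(
    (preset) => preset !== activePreset && configuredPresets.includes(preset),
  );
}

// Assumed setup: "github-copilot" has been removed from the presets object.
const configured = ["anthropic", "openai", "google"];
const chain = usableFallbacks(
  ["anthropic", "google", "github-copilot"],
  "anthropic",
  configured,
);
```

Here `chain` ends up as `["google"]`: `anthropic` is excluded because it is the active preset, and `github-copilot` because it is no longer configured.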
  ## Commands
 
  | Command | Description |
@@ -191,6 +300,8 @@ The `rules` array in `tiers.json` controls when delegation happens. These are in
  | `/tiers` | Show active tier configuration and delegation rules |
  | `/preset` | List available presets |
  | `/preset <name>` | Switch to a different preset |
+ | `/budget` | Show available routing modes and which is active |
+ | `/budget <mode>` | Switch routing mode (`normal`, `budget`, or `quality`) |
  | `/annotate-plan [path]` | Annotate a plan file with `[tier:X]` tags for each step |
 
  ## Plan annotation
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "opencode-model-router",
-   "version": "1.0.7",
+   "version": "1.1.1",
    "description": "OpenCode plugin that routes tasks to tiered subagents (fast/medium/heavy) based on complexity",
    "type": "module",
    "main": "./src/index.ts",
@@ -35,5 +35,9 @@
    ],
    "peerDependencies": {
      "@opencode-ai/plugin": ">=1.0.0"
+   },
+   "devDependencies": {
+     "@types/node": "^25.2.3",
+     "typescript": "^5.9.3"
    }
  }
package/src/index.ts CHANGED
@@ -22,6 +22,7 @@ interface TierConfig {
    variant?: string;
    thinking?: ThinkingConfig;
    reasoning?: ReasoningConfig;
+   costRatio?: number;
    color?: string;
    description: string;
    steps?: number;
@@ -36,16 +37,26 @@ interface FallbackConfig {
    presets?: Record<string, Record<string, string[]>>;
  }
 
+ interface ModeConfig {
+   defaultTier: string;
+   description: string;
+   overrideRules?: string[];
+ }
+
  interface RouterConfig {
    activePreset: string;
+   activeMode?: string;
    presets: Record<string, Preset>;
    rules: string[];
    defaultTier: string;
    fallback?: FallbackConfig;
+   taskPatterns?: Record<string, string[]>;
+   modes?: Record<string, ModeConfig>;
  }
 
  interface RouterState {
    activePreset?: string;
+   activeMode?: string;
  }
 
  // ---------------------------------------------------------------------------
@@ -130,6 +141,39 @@ function validateConfig(raw: unknown): RouterConfig {
      throw new Error("tiers.json: 'defaultTier' must be a string");
    }
 
+   // Validate modes if present
+   if (obj.modes !== undefined) {
+     if (typeof obj.modes !== "object" || obj.modes === null || Array.isArray(obj.modes)) {
+       throw new Error("tiers.json: 'modes' must be an object");
+     }
+     const modes = obj.modes as Record<string, unknown>;
+     for (const [modeName, mode] of Object.entries(modes)) {
+       if (typeof mode !== "object" || mode === null) {
+         throw new Error(`tiers.json: mode '${modeName}' must be an object`);
+       }
+       const m = mode as Record<string, unknown>;
+       if (typeof m.defaultTier !== "string") {
+         throw new Error(`tiers.json: mode '${modeName}.defaultTier' must be a string`);
+       }
+       if (typeof m.description !== "string") {
+         throw new Error(`tiers.json: mode '${modeName}.description' must be a string`);
+       }
+     }
+   }
+
+   // Validate taskPatterns if present
+   if (obj.taskPatterns !== undefined) {
+     if (typeof obj.taskPatterns !== "object" || obj.taskPatterns === null || Array.isArray(obj.taskPatterns)) {
+       throw new Error("tiers.json: 'taskPatterns' must be an object");
+     }
+     const tp = obj.taskPatterns as Record<string, unknown>;
+     for (const [tierName, patterns] of Object.entries(tp)) {
+       if (!Array.isArray(patterns)) {
+         throw new Error(`tiers.json: taskPatterns.'${tierName}' must be an array of strings`);
+       }
+     }
+   }
+
    return raw as RouterConfig;
  }
 
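The `taskPatterns` validation added above checks only that each value is an array, although its error message promises "an array of strings". A stricter variant could also inspect the elements. The sketch below is a hypothetical alternative, not code from the package, and `validateTaskPatterns` is an assumed name.

```typescript
// Hypothetical stricter check (not in the released code): also reject
// non-string entries inside each taskPatterns array.
function validateTaskPatterns(value: unknown): Record<string, string[]> {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    throw new Error("tiers.json: 'taskPatterns' must be an object");
  }
  for (const [tierName, patterns] of Object.entries(value as Record<string, unknown>)) {
    if (!Array.isArray(patterns) || !patterns.every((p) => typeof p === "string")) {
      throw new Error(`tiers.json: taskPatterns.'${tierName}' must be an array of strings`);
    }
  }
  return value as Record<string, string[]>;
}

let rejected = false;
try {
  validateTaskPatterns({ fast: ["search", 42] });
} catch {
  rejected = true;
}
const accepted = validateTaskPatterns({ fast: ["search", "grep"] });
```

In the released check, the `["search", 42]` input would pass validation, and the number would later be stringified by `join("/")` when the routing guide is built.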
@@ -150,9 +194,12 @@ function loadConfig(): RouterConfig {
            cfg.activePreset = resolved;
          }
        }
+       if (state.activeMode && cfg.modes?.[state.activeMode]) {
+         cfg.activeMode = state.activeMode;
+       }
      }
    } catch {
-     // Ignore state read errors and keep tiers.json active preset
+     // Ignore state read errors and keep tiers.json defaults
    }
 
    _cachedConfig = cfg;
@@ -160,6 +207,30 @@ function loadConfig(): RouterConfig {
    return cfg;
  }
 
+ // ---------------------------------------------------------------------------
+ // State persistence helpers
+ // ---------------------------------------------------------------------------
+
+ /** Read current persisted state (or empty object on failure). */
+ function readState(): RouterState {
+   try {
+     if (existsSync(statePath())) {
+       return JSON.parse(readFileSync(statePath(), "utf-8")) as RouterState;
+     }
+   } catch {
+     // ignore
+   }
+   return {};
+ }
+
+ /** Write state to disk (merges with existing keys). */
+ function writeState(patch: Partial<RouterState>): void {
+   const state = { ...readState(), ...patch };
+   const p = statePath();
+   mkdirSync(dirname(p), { recursive: true });
+   writeFileSync(p, JSON.stringify(state, null, 2) + "\n", "utf-8");
+ }
+
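The merge in `writeState` matters because `/preset` and `/budget` now share one state file. In 1.0.7, `saveActivePreset` wrote an object containing only `activePreset`, which would have discarded any other key. A pure sketch of the merge, with the file I/O left out and `mergeState` as an illustrative name:

```typescript
interface RouterState {
  activePreset?: string;
  activeMode?: string;
}

// Pure version of the merge inside writeState: keys in the patch overwrite
// existing keys, and everything else is preserved.
function mergeState(existing: RouterState, patch: Partial<RouterState>): RouterState {
  return { ...existing, ...patch };
}

const before: RouterState = { activePreset: "openai" };
const after = mergeState(before, { activeMode: "budget" });
```

After the call, `after` still has `activePreset: "openai"` and additionally carries `activeMode: "budget"`, while `before` is left unchanged.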
  function saveActivePreset(presetName: string): void {
    const cfg = loadConfig();
    const resolved = resolvePresetName(cfg, presetName);
@@ -170,15 +241,23 @@ function saveActivePreset(presetName: string): void {
    cfg.activePreset = resolved;
 
    // Persist user-selected preset to state file only — never mutate tiers.json
-   const presetState: RouterState = { activePreset: resolved };
-   const p = statePath();
-   mkdirSync(dirname(p), { recursive: true });
-   writeFileSync(p, JSON.stringify(presetState, null, 2) + "\n", "utf-8");
+   writeState({ activePreset: resolved });
 
    // Invalidate cache so next read picks up the new active preset
    invalidateConfigCache();
  }
 
+ function saveActiveMode(modeName: string): void {
+   const cfg = loadConfig();
+   if (!cfg.modes?.[modeName]) {
+     return;
+   }
+
+   cfg.activeMode = modeName;
+   writeState({ activeMode: modeName });
+   invalidateConfigCache();
+ }
+
  function getActiveTiers(cfg: RouterConfig): Preset {
    return cfg.presets[cfg.activePreset] ?? Object.values(cfg.presets)[0]!;
  }
@@ -210,6 +289,15 @@ function buildAgentOptions(tier: TierConfig): Record<string, unknown> {
    return Object.keys(opts).length > 0 ? opts : {};
  }
 
+ // ---------------------------------------------------------------------------
+ // Mode helpers
+ // ---------------------------------------------------------------------------
+
+ function getActiveMode(cfg: RouterConfig): ModeConfig | undefined {
+   if (!cfg.modes || !cfg.activeMode) return undefined;
+   return cfg.modes[cfg.activeMode];
+ }
+
  // ---------------------------------------------------------------------------
  // Fallback instructions builder
  // ---------------------------------------------------------------------------
@@ -222,23 +310,29 @@ function buildFallbackInstructions(cfg: RouterConfig): string {
    const map = presetMap && Object.keys(presetMap).length > 0 ? presetMap : fb.global;
    if (!map) return "";
 
-   const providerLines = Object.entries(map).flatMap(([provider, presetOrder]) => {
+   const chains = Object.entries(map).flatMap(([provider, presetOrder]) => {
      if (!Array.isArray(presetOrder)) return [];
-     const validOrder = presetOrder.filter(
-       (preset) => preset !== cfg.activePreset && Boolean(cfg.presets[preset]),
+     const valid = presetOrder.filter(
+       (p) => p !== cfg.activePreset && Boolean(cfg.presets[p]),
      );
-     return validOrder.length > 0 ? [`- ${provider}: ${validOrder.join(" -> ")}`] : [];
+     return valid.length > 0 ? [`${provider}→${valid.join("")}`] : [];
    });
 
-   if (providerLines.length === 0) return "";
+   if (chains.length === 0) return "";
+   return `Err→retry-alt-tier→fail→direct. Chain: ${chains.join(" | ")}`;
+ }
 
-   return [
-     "Fallback on delegated task errors:",
-     "1. If Task(...) returns provider/model/rate-limit/timeout/auth errors, retry once with a different tier suited to the same task.",
-     "2. If retry also fails, stop delegating that task and complete it directly in the primary agent.",
-     "3. Use the failing model prefix and this preset fallback order for next-run recovery (`/preset <name>` + restart):",
-     ...providerLines,
-   ].join("\n");
+ // ---------------------------------------------------------------------------
+ // Cost & taxonomy builders
+ // ---------------------------------------------------------------------------
+
+ function buildTaskTaxonomy(cfg: RouterConfig): string {
+   if (!cfg.taskPatterns || Object.keys(cfg.taskPatterns).length === 0) return "";
+
+   return Object.entries(cfg.taskPatterns)
+     .filter(([_, p]) => Array.isArray(p) && p.length > 0)
+     .map(([tier, patterns]) => `@${tier}→${(patterns as string[]).join("/")}`)
+     .join("\n");
  }
 
  // ---------------------------------------------------------------------------
@@ -248,40 +342,35 @@ function buildFallbackInstructions(cfg: RouterConfig): string {
  function buildDelegationProtocol(cfg: RouterConfig): string {
    const tiers = getActiveTiers(cfg);
 
-   const tierSummary = Object.entries(tiers)
+   // Compact tier summary: @name=model/variant(costRatio)
+   const tierLine = Object.entries(tiers)
      .map(([name, t]) => {
-       const shortModel = t.model.split("/").pop() ?? t.model;
-       const variant = t.variant ? ` (${t.variant})` : "";
-       return `@${name}=${shortModel}${variant}`;
+       const short = t.model.split("/").pop() ?? t.model;
+       const v = t.variant ? `/${t.variant}` : "";
+       const c = t.costRatio != null ? `(${t.costRatio}x)` : "";
+       return `@${name}=${short}${v}${c}`;
      })
-     .join(" | ");
+     .join(" ");
 
-   // Build per-tier whenToUse descriptions so the agent knows when to pick each tier
-   const tierDescriptions = Object.entries(tiers)
-     .map(([name, t]) => {
-       const uses = t.whenToUse.length > 0 ? t.whenToUse.join(", ") : t.description;
-       return `- @${name}: ${uses}`;
-     })
-     .join("\n");
+   // Compact mode
+   const mode = getActiveMode(cfg);
+   const modeSuffix = cfg.activeMode ? ` mode:${cfg.activeMode}` : "";
 
-   // Use configurable rules from tiers.json instead of hardcoded ones
-   const numberedRules = cfg.rules
-     .map((rule, i) => `${i + 1}. ${rule}`)
-     .join("\n");
+   // Compact task routing guide
+   const taxonomy = buildTaskTaxonomy(cfg);
 
-   const fallbackInstructions = buildFallbackInstructions(cfg);
+   // Compact rules
+   const effectiveRules = mode?.overrideRules?.length ? mode.overrideRules : cfg.rules;
+   const rulesLine = effectiveRules.map((r, i) => `${i + 1}.${r}`).join(" ");
+
+   const fallback = buildFallbackInstructions(cfg);
 
    return [
-     "## Model Delegation Protocol",
-     `Preset: ${cfg.activePreset}. Tiers: ${tierSummary}.`,
-     "",
-     "Tier capabilities:",
-     tierDescriptions,
-     "",
-     "Apply to every user message (plan and ad-hoc):",
-     numberedRules,
-     ...(fallbackInstructions ? ["", fallbackInstructions] : []),
-     "",
+     `## Model Delegation Protocol`,
+     `Preset: ${cfg.activePreset}. Tiers: ${tierLine}.${modeSuffix}`,
+     ...(taxonomy ? [taxonomy] : []),
+     rulesLine,
+     ...(fallback ? [fallback] : []),
      `Delegate with Task(subagent_type="fast|medium|heavy", prompt="...").`,
      "Keep orchestration and final synthesis in the primary agent.",
    ].join("\n");
@@ -320,6 +409,50 @@ function buildTiersOutput(cfg: RouterConfig): string {
    return lines.join("\n");
  }
 
+ // ---------------------------------------------------------------------------
+ // /budget command output
+ // ---------------------------------------------------------------------------
+
+ function buildBudgetOutput(cfg: RouterConfig, args: string): string {
+   const modes = cfg.modes;
+   if (!modes || Object.keys(modes).length === 0) {
+     return 'No modes configured in tiers.json. Add a "modes" section to enable budget mode.';
+   }
+
+   const requested = args.trim().toLowerCase();
+   const currentMode = cfg.activeMode || "normal";
+
+   // No args: show current mode and available modes
+   if (!requested) {
+     const lines = ["# Routing Modes\n"];
+     for (const [name, mode] of Object.entries(modes)) {
+       const active = name === currentMode ? " <- active" : "";
+       lines.push(`- **${name}**${active}: ${mode.description} (default tier: @${mode.defaultTier})`);
+     }
+     lines.push(`\nSwitch with: \`/budget <mode>\``);
+     return lines.join("\n");
+   }
+
+   // Switch mode
+   if (modes[requested]) {
+     saveActiveMode(requested);
+     const mode = modes[requested];
+     return [
+       `Routing mode switched to **${requested}**.`,
+       "",
+       mode.description,
+       `Default tier: @${mode.defaultTier}`,
+       ...(mode.overrideRules?.length
+         ? ["", "Active rules:", ...mode.overrideRules.map((r) => `- ${r}`)]
+         : []),
+       "",
+       "Mode change takes effect immediately on the next message.",
+     ].join("\n");
+   }
+
+   return `Unknown mode: "${requested}". Available: ${Object.keys(modes).join(", ")}`;
+ }
+
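Two details of the lookup in `buildBudgetOutput` are worth a closer look. The argument is lower-cased before the lookup, so a mode key containing upper-case letters in `tiers.json` could never be selected. And a plain `modes[requested]` lookup on an ordinary object also sees inherited properties such as `constructor`. The sketch below is a hypothetical variant, not the released code, that restricts the lookup to own keys; `findMode` is an assumed name.

```typescript
// Hypothetical variant of the mode lookup (not in the released code): only
// own keys of the modes object count as mode names, so inherited names such
// as "constructor" are reported as unknown.
function findMode<T>(modes: Record<string, T>, args: string): T | undefined {
  const requested = args.trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(modes, requested)
    ? modes[requested]
    : undefined;
}

const modes: Record<string, { defaultTier: string }> = {
  normal: { defaultTier: "medium" },
  budget: { defaultTier: "fast" },
};

const found = findMode(modes, "  Budget ");
const inherited = findMode(modes, "constructor");
```

Here `found` is the `budget` entry (surrounding whitespace and case are normalised) and `inherited` is `undefined`.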
  // ---------------------------------------------------------------------------
  // /preset command output
  // ---------------------------------------------------------------------------
@@ -413,6 +546,10 @@ const ModelRouterPlugin: Plugin = async (_ctx: PluginInput) => {
          template: "$ARGUMENTS",
          description: "Show or switch model presets (e.g., /preset openai)",
        };
+       opencodeConfig.command["budget"] = {
+         template: "$ARGUMENTS",
+         description: "Show or switch routing mode (e.g., /budget, /budget budget, /budget quality)",
+       };
        opencodeConfig.command["annotate-plan"] = {
          template: [
            "Annotate the plan with tier directives for model delegation.",
@@ -443,7 +580,7 @@ const ModelRouterPlugin: Plugin = async (_ctx: PluginInput) => {
      },
 
      // -----------------------------------------------------------------------
-     // Inject delegation protocol — uses cached config (invalidated on /preset)
+     // Inject delegation protocol — uses cached config (invalidated on /preset or /budget)
      // -----------------------------------------------------------------------
      "experimental.chat.system.transform": async (_input: any, output: any) => {
        try {
@@ -455,7 +592,7 @@ const ModelRouterPlugin: Plugin = async (_ctx: PluginInput) => {
      },
 
      // -----------------------------------------------------------------------
-     // Handle /tiers and /preset commands
+     // Handle /tiers, /preset, and /budget commands
      // -----------------------------------------------------------------------
      "command.execute.before": async (input: any, output: any) => {
        if (input.command === "tiers") {
@@ -474,6 +611,16 @@ const ModelRouterPlugin: Plugin = async (_ctx: PluginInput) => {
            text: buildPresetOutput(cfg, input.arguments ?? ""),
          });
        }
+
+       if (input.command === "budget") {
+         try {
+           cfg = loadConfig();
+         } catch {}
+         output.parts.push({
+           type: "text" as const,
+           text: buildBudgetOutput(cfg, input.arguments ?? ""),
+         });
+       }
      },
    };
  };
package/tiers.json CHANGED
@@ -1,9 +1,11 @@
  {
    "activePreset": "anthropic",
+   "activeMode": "normal",
    "presets": {
      "anthropic": {
        "fast": {
          "model": "anthropic/claude-haiku-4-5",
+         "costRatio": 1,
          "description": "Haiku 4.5 for exploration, search, and simple reads",
          "steps": 30,
          "prompt": "You are a fast exploration agent. Focus on speed and efficiency. Read files, search code, and return findings concisely. Do NOT make edits unless explicitly asked.",
@@ -17,6 +19,7 @@
        "medium": {
          "model": "anthropic/claude-sonnet-4-5",
          "variant": "max",
+         "costRatio": 5,
          "description": "Sonnet 4.5 max for implementation, refactoring, and tests",
          "steps": 50,
          "prompt": "You are an implementation agent. Write clean, production-quality code matching existing project patterns. Run linters/tests after changes when possible.",
@@ -31,6 +34,7 @@
        "heavy": {
          "model": "anthropic/claude-opus-4-6",
          "variant": "max",
+         "costRatio": 20,
          "description": "Opus 4.6 max for architecture, complex debugging, and security",
          "steps": 30,
          "prompt": "You are a senior architecture consultant. Analyze deeply, consider tradeoffs, and provide thorough reasoning. Be exhaustive in your analysis.",
@@ -45,6 +49,7 @@
      "openai": {
        "fast": {
          "model": "openai/gpt-5.3-codex-spark",
+         "costRatio": 1,
          "description": "GPT-5.3 Codex Spark for fast exploration and simple tasks",
          "steps": 30,
          "prompt": "You are a fast exploration agent. Focus on speed and efficiency. Read files, search code, and return findings concisely. Do NOT make edits unless explicitly asked.",
@@ -56,6 +61,7 @@
        },
        "medium": {
          "model": "openai/gpt-5.3-codex",
+         "costRatio": 5,
          "description": "GPT-5.3 Codex default settings for implementation and standard coding",
          "steps": 50,
          "prompt": "You are an implementation agent. Write clean, production-quality code matching existing project patterns. Run linters/tests after changes when possible.",
@@ -69,6 +75,7 @@
        "heavy": {
          "model": "openai/gpt-5.3-codex",
          "variant": "xhigh",
+         "costRatio": 20,
          "description": "GPT-5.3 Codex xhigh for architecture and complex tasks",
          "steps": 30,
          "prompt": "You are a senior architecture consultant. Analyze deeply, consider tradeoffs, and provide thorough reasoning.",
@@ -83,6 +90,7 @@
      "github-copilot": {
        "fast": {
          "model": "github-copilot/claude-haiku-4-5",
+         "costRatio": 1,
          "description": "Claude Haiku 4.5 via GitHub Copilot for fast exploration and simple tasks",
          "steps": 30,
          "prompt": "You are a fast exploration agent. Focus on speed and efficiency. Read files, search code, and return findings concisely. Do NOT make edits unless explicitly asked.",
@@ -95,6 +103,7 @@
        },
        "medium": {
          "model": "github-copilot/claude-sonnet-4-5",
+         "costRatio": 5,
          "description": "Claude Sonnet 4.5 via GitHub Copilot for implementation, refactoring, and tests",
          "steps": 50,
          "prompt": "You are an implementation agent. Write clean, production-quality code matching existing project patterns. Run linters/tests after changes when possible.",
@@ -109,6 +118,7 @@
        "heavy": {
          "model": "github-copilot/claude-opus-4-6",
          "variant": "thinking",
+         "costRatio": 20,
          "description": "Claude Opus 4.6 via GitHub Copilot for architecture, complex debugging, and security",
          "steps": 30,
          "prompt": "You are a senior architecture consultant. Analyze deeply, consider tradeoffs, and provide thorough reasoning. Be exhaustive in your analysis.",
@@ -123,6 +133,7 @@
      "google": {
        "fast": {
          "model": "google/gemini-2.5-flash",
+         "costRatio": 1,
          "description": "Gemini 2.5 Flash for fast exploration and simple tasks",
          "steps": 30,
          "prompt": "You are a fast exploration agent. Focus on speed and efficiency. Read files, search code, and return findings concisely. Do NOT make edits unless explicitly asked.",
@@ -135,6 +146,7 @@
        },
        "medium": {
          "model": "google/gemini-2.5-pro",
+         "costRatio": 5,
          "description": "Gemini 2.5 Pro for implementation, refactoring, and tests",
          "steps": 50,
          "prompt": "You are an implementation agent. Write clean, production-quality code matching existing project patterns. Run linters/tests after changes when possible.",
@@ -148,6 +160,7 @@
        },
        "heavy": {
          "model": "google/gemini-3-pro-preview",
+         "costRatio": 20,
          "description": "Gemini 3 Pro Preview for architecture, complex debugging, and security",
          "steps": 30,
          "prompt": "You are a senior architecture consultant. Analyze deeply, consider tradeoffs, and provide thorough reasoning. Be exhaustive in your analysis.",
@@ -160,6 +173,39 @@
        }
      }
    },
+   "taskPatterns": {
+     "fast": ["search", "grep", "read", "git-info", "ls", "lookup-docs/types", "count", "exists-check", "rename"],
+     "medium": ["impl-feature", "refactor", "write-tests", "bugfix(≤2)", "edit-logic", "code-review", "build-fix", "create-file", "db-migrate", "api-endpoint", "config-update"],
+     "heavy": ["arch-design", "debug(≥3fail)", "sec-audit", "perf-opt", "migrate-strategy", "multi-system-integration", "tradeoff-analysis", "rca"]
+   },
+   "modes": {
+     "normal": {
+       "defaultTier": "medium",
+       "description": "Balanced quality and cost — delegates based on task complexity"
+     },
+     "budget": {
+       "defaultTier": "fast",
+       "description": "Aggressive cost savings — defaults to cheapest tier, escalates only when needed",
+       "overrideRules": [
+         "default→@fast unless edits/complex-reasoning needed",
+         "@medium ONLY: multi-file-edit/refactor/test-suite/build-fix",
+         "@heavy ONLY: user-requested OR ≥2 @medium failures",
+         "trivial(grep/read/glob)→direct,no-delegate",
+         "batch related searches→single @fast",
+         "uncertain @fast vs @medium→@fast,escalate on fail"
+       ]
+     },
+     "quality": {
+       "defaultTier": "medium",
+       "description": "Quality-first — uses stronger models more liberally for better results",
+       "overrideRules": [
+         "default→@medium incl exploration when deep-context matters",
+         "@heavy: arch/debug/security/multi-file-coord",
+         "@fast ONLY: trivial single-tool ops (1 grep/1 read)",
+         "prefer thoroughness over speed"
+       ]
+     }
+   },
    "fallback": {
      "global": {
        "anthropic": ["openai", "google", "github-copilot"],
@@ -169,15 +215,14 @@
      }
    },
    "rules": [
-     "When a plan step contains [tier:fast], [tier:medium], or [tier:heavy], delegate to that agent",
-     "When a plan says 'use a fast/cheap model' -> delegate to @fast",
-     "When a plan says 'use a medium/balanced model' -> delegate to @medium",
-     "When a plan says 'use a heavy/powerful model' -> delegate to @heavy",
-     "Default to @medium for implementation tasks you could delegate",
-     "Use @fast for any read-only exploration or research task",
-     "Keep orchestration (planning, decisions, verification) for yourself - delegate execution",
-     "For trivial tasks (single grep, single file read), execute directly without delegation",
-     "Never delegate to @heavy if you are already running on an opus-class model - do it yourself"
+     "[tier:X]delegate X",
+     "plan:fast/cheap→@fast | plan:medium→@medium | plan:heavy→@heavy",
+     "default:impl→@medium | readonly→@fast",
+     "orchestrate=self,delegate=exec",
+     "trivial(≤2tools)→direct,skip-delegate",
+     "self∈opus→never→@heavy,do-it-yourself",
+     "consult route-guide↑",
+     "min(cost,adequate-tier)"
    ],
    "defaultTier": "medium"
  }