npm - @renjfk/opencode-model-fallback - Versions diffs - 0.1.0 - Mend

@renjfk/opencode-model-fallback 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Soner Koksal
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,240 @@
+[![CI](https://github.com/renjfk/opencode-model-fallback/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/renjfk/opencode-model-fallback/actions/workflows/ci.yml)
+[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
+[![npm](https://img.shields.io/npm/v/@renjfk/opencode-model-fallback)](https://www.npmjs.com/package/@renjfk/opencode-model-fallback)
+[![Downloads](https://img.shields.io/npm/dm/@renjfk/opencode-model-fallback)](https://www.npmjs.com/package/@renjfk/opencode-model-fallback)
+# opencode-model-fallback
+Mapped model fallback router for [OpenCode](https://opencode.ai/).
+There are situations where you may want to use the quota that comes with a
+subscription first, then fall back to an API pay-as-you-go model only when that
+subscription-backed model is rate-limited or usage-limited. You can solve that
+with a local proxy, but maintaining a proxy server is often not worth it if all
+you need is a simple one-to-one fallback inside OpenCode. This plugin handles
+that routing directly in OpenCode.
+When a configured model hits a retryable provider failure, this plugin aborts
+the in-flight request, replays the latest user message on the mapped fallback
+model, persists a global cooldown for the failed model, and routes back to the
+original model after the cooldown expires.
+## Install
+Add to your OpenCode config at `~/.config/opencode/config.json`:
+```json
+{
+  "plugin": ["@renjfk/opencode-model-fallback"]
+}
+```
+## Configuration
+If you want to set plugin options, use the tuple form:
+```json
+{
+  "plugin": [
+    [
+      "@renjfk/opencode-model-fallback",
+      {
+        "mappings": {
+          "openai/gpt-5.4": "azure-ai-foundry/gpt-5.4",
+          "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5"
+        }
+      }
+    ]
+  ]
+}
+```
+## Options
+- `mappings`: map original model IDs to fallback model IDs.
+- `retry_on_errors`: array of retryable HTTP status codes. Defaults to `[429]`.
+- `retryable_error_patterns`: retryable error message patterns. Defaults to `["rate.?limit"]`.
+- `cooldown_seconds`: how long a failed original model remains on fallback. Defaults to `3600`.
+- `timeout_seconds`: abort and retry if a response is inactive for this long. Defaults to `30`.
+- `notify_on_fallback`: show fallback/recovery toasts. Defaults to `true`.
+## How it works
+The plugin watches OpenCode chat and session events. When a request uses a model
+listed in `mappings`, that model is preferred unless it has an active global
+cooldown. If OpenCode reports a retryable provider failure, the plugin switches
+to the mapped fallback model and stores a global cooldown for the failed model.
+Global model cooldowns are persisted at:
+```
+~/.local/share/opencode/mapped-fallback-router.json
+```
+Persisted cooldowns let all sessions avoid immediately retrying a model that has
+just failed. When the cooldown expires, mapped requests are routed back to the
+original model.
+The plugin does not load balance, race models, or retry through a chain. Each
+mapping is one original model to one fallback model.
+## Scenarios
+### Normal request
+If you send a message with `openai/gpt-5.5` and the model has a mapping, the
+request goes to `openai/gpt-5.5` normally unless it has an active cooldown.
+If you select the mapped fallback model directly, the plugin still routes back
+to the original model unless the original has an active cooldown.
+### Retryable failure while streaming
+If OpenCode reports a retryable provider failure such as a rate limit or a
+configured retryable status code, the plugin aborts the current request and
+replays the latest user message on the mapped fallback model.
+For example:
+```json
+{
+  "mappings": {
+    "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5"
+  }
+}
+```
+If `openai/gpt-5.5` fails with a retryable error, the session continues on
+`azure-ai-foundry/gpt-5.5`.
+### Active cooldown
+After fallback is triggered, the original model is considered cooling down for
+`cooldown_seconds`. During that cooldown, mapped requests use the fallback model
+instead of switching back and immediately hitting the same provider
+failure again.
+All sessions are routed straight to the fallback while the original model is
+cooling down.
+### Recovery
+When the cooldown expires, mapped requests switch back to the original model.
+### Exhausted fallback
+Mappings are one-to-one. If the fallback model also hits a retryable failure,
+there is no next fallback to try. The plugin shows a fallback exhausted toast
+when notifications are enabled.
+## Troubleshooting Retry Matching
+Use OpenCode's provider logs to find the exact status code, headers, and error
+body returned by a provider. This is the most reliable way to tune
+`retry_on_errors` and `retryable_error_patterns`.
+For a short headless reproduction, capture logs and stop the run after a few
+seconds to avoid long retry loops:
+```bash
+log="/tmp/opencode-provider.log"
+: > "$log"
+opencode run --print-logs --log-level DEBUG --model openai/gpt-5.3-codex --format json "Reply with OK only." 2> "$log" &
+pid=$!
+sleep 3
+kill "$pid" 2>/dev/null || true
+wait "$pid" 2>/dev/null || true
+```
+Then inspect the captured provider errors:
+```bash
+rg 'service=llm|AI_APICallError|statusCode|responseBody|x-codex|reset|usage_limit|rate.?limit' /tmp/opencode-provider.log
+```
+Look for an OpenCode log line like:
+```text
+ERROR ... service=llm providerID=openai modelID=gpt-5.3-codex ... error={...}
+```
+Inside `error`, check fields such as `statusCode`, `responseHeaders`,
+`responseBody`, `isRetryable`, and `data.error.message`. For example, OpenAI
+usage limits can appear as `statusCode: 429` with a response body containing
+`usage_limit_reached` and `The usage limit has been reached`. OpenAI Codex
+responses can also include reset headers such as `x-codex-primary-reset-at`,
+`x-codex-primary-reset-after-seconds`, `x-codex-secondary-reset-at`, and
+`x-codex-secondary-reset-after-seconds`.
+For TUI sessions, start OpenCode the same way and reproduce manually:
+```bash
+opencode --print-logs --log-level DEBUG 2> /tmp/opencode-provider.log
+```
+Use the provider `statusCode` and response body text to tune the retry rules:
+```json
+{
+  "plugin": [
+    [
+      "@renjfk/opencode-model-fallback",
+      {
+        "retry_on_errors": [429, 403],
+        "retryable_error_patterns": ["rate.?limit", "usage.?limit"],
+        "mappings": {
+          "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5"
+        }
+      }
+    ]
+  ]
+}
+```
+If the status code is not in `retry_on_errors`, add it. If the response body has
+stable text or an error type, add a small regex matching it to
+`retryable_error_patterns`. If there is no `service=llm` error line, OpenCode did
+not reach the provider or the run was stopped before the provider returned.
+## Contributing
+opencode-model-fallback is open to contributions and ideas!
+### Issue conventions
+**Format:** `type: brief description`
+- `feat:` new features or functionality
+- `fix:` bug fixes
+- `enhance:` improvements to existing features
+- `chore:` maintenance tasks, dependencies, cleanup
+- `docs:` documentation updates
+- `build:` build system, CI/CD changes
+### Development
+```bash
+npm run test         # vitest run
+npm run check        # test + lint + fmt
+npm run lint         # oxlint
+npm run fmt          # oxfmt --check
+npm run fmt:fix      # oxfmt --write
+```
+### Test local plugin in OpenCode
+To test unpublished changes in the OpenCode TUI, point `~/.config/opencode/config.json`
+at the local repo path, not the npm package name:
+```json
+{
+  "plugin": ["/Users/your-user/opencode-model-fallback"]
+}
+```
+### Release process
+Manual releases via opencode; see [RELEASE_PROCESS.md](RELEASE_PROCESS.md).
+## License
+This project is licensed under the [MIT License](LICENSE).
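
The routing rules in the README's "Scenarios" section can be sketched as a small pure function. This is a minimal illustration, not the package's code: `selectModel` and the `coolingDown` set are hypothetical names, with the set standing in for the persisted cooldown state.

```javascript
// Mapping taken from the README example: original model -> fallback model.
const mappings = { "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5" };

// Reverse lookup so a directly selected fallback can be traced to its original.
const fallbackToOriginal = Object.fromEntries(
  Object.entries(mappings).map(([original, fallback]) => [fallback, original]),
);

// Hypothetical helper: `coolingDown` is a Set of original model IDs that are
// currently in cooldown (an assumption standing in for the persisted state).
function selectModel(requested, coolingDown) {
  const original = mappings[requested] ? requested : fallbackToOriginal[requested];
  if (!original) return requested; // unmapped models pass through untouched
  return coolingDown.has(original) ? mappings[original] : original;
}

const original = "openai/gpt-5.5";
const fallback = "azure-ai-foundry/gpt-5.5";

console.log(selectModel(original, new Set())); // openai/gpt-5.5 (normal request)
console.log(selectModel(original, new Set([original]))); // azure-ai-foundry/gpt-5.5 (active cooldown)
console.log(selectModel(fallback, new Set())); // openai/gpt-5.5 (routed back to the original)
```

Because the reverse lookup is built with `Object.fromEntries`, two originals that share one fallback would leave only the last pair in the reverse table, so distinct fallbacks per original keep the routing unambiguous.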

package/index.js ADDED

@@ -0,0 +1,7 @@
+import { createMappedFallbackRouter } from "./lib/router.js";
+export async function MappedFallbackRouterPlugin(ctx, rawOptions) {
+  return createMappedFallbackRouter(ctx, rawOptions);
+}
+export default MappedFallbackRouterPlugin;

package/lib/errors.js ADDED

@@ -0,0 +1,36 @@
+export function isRetryable(error, options) {
+  const status = extractStatus(error);
+  if (status && options.retry_on_errors.includes(status)) return true;
+  const text = errorText(error).toLowerCase();
+  return options.retryable_error_patterns.some((pattern) => {
+    try {
+      return new RegExp(pattern, "i").test(text);
+    } catch {
+      return text.includes(pattern.toLowerCase());
+    }
+  });
+}
+function extractStatus(error) {
+  if (!error || typeof error !== "object") return undefined;
+  const value = error.statusCode ?? error.status ?? error.code;
+  if (typeof value === "number") return value;
+  if (typeof value === "string" && /^\d+$/.test(value)) return Number(value);
+  return undefined;
+}
+export function extractErrorName(error) {
+  if (!error || typeof error !== "object") return undefined;
+  return typeof error.name === "string" ? error.name : undefined;
+}
+function errorText(error) {
+  if (!error) return "";
+  if (typeof error === "string") return error;
+  if (error instanceof Error) return `${error.name} ${error.message}`;
+  try {
+    return JSON.stringify(error);
+  } catch {
+    return String(error);
+  }
+}
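
To see how the matching in `lib/errors.js` behaves, the following sketch inlines a copy of the three functions above and runs them against a few error shapes. The error objects and the `tuned` options are hypothetical samples chosen for illustration; only the function bodies come from the diff.

```javascript
// Copied from lib/errors.js so the example runs on its own.
function extractStatus(error) {
  if (!error || typeof error !== "object") return undefined;
  const value = error.statusCode ?? error.status ?? error.code;
  if (typeof value === "number") return value;
  if (typeof value === "string" && /^\d+$/.test(value)) return Number(value);
  return undefined;
}

function errorText(error) {
  if (!error) return "";
  if (typeof error === "string") return error;
  if (error instanceof Error) return `${error.name} ${error.message}`;
  try {
    return JSON.stringify(error);
  } catch {
    return String(error);
  }
}

function isRetryable(error, options) {
  const status = extractStatus(error);
  if (status && options.retry_on_errors.includes(status)) return true;
  const text = errorText(error).toLowerCase();
  return options.retryable_error_patterns.some((pattern) => {
    try {
      return new RegExp(pattern, "i").test(text);
    } catch {
      // Invalid regex source: fall back to a plain substring match.
      return text.includes(pattern.toLowerCase());
    }
  });
}

const defaults = { retry_on_errors: [429], retryable_error_patterns: ["rate.?limit"] };
const tuned = { retry_on_errors: [429], retryable_error_patterns: ["rate.?limit", "usage.?limit"] };

// Hypothetical error shapes, for illustration only.
const usageLimit403 = { statusCode: 403, responseBody: "usage_limit_reached" };

console.log(isRetryable({ status: "429" }, defaults)); // true (numeric string status)
console.log(isRetryable(new Error("Rate limit exceeded"), defaults)); // true (pattern match)
console.log(isRetryable(usageLimit403, defaults)); // false (403 and no "rate limit" text)
console.log(isRetryable(usageLimit403, tuned)); // true ("usage.?limit" matches the body)
```

Plain objects are matched against their full JSON serialization, so a pattern can also hit key names or header values, not only the error message.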

package/lib/models.js ADDED

@@ -0,0 +1,10 @@
+export function modelObject(model) {
+  const [providerID, ...modelParts] = model.split("/");
+  if (!providerID || modelParts.length === 0) return undefined;
+  return { providerID, modelID: modelParts.join("/") };
+}
+export function modelString(model) {
+  if (!model?.providerID || !model?.modelID) return undefined;
+  return `${model.providerID}/${model.modelID}`;
+}
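
A short round-trip example may help show how the two helpers in `lib/models.js` treat model IDs. The functions are copied from the diff; the ID `example-gateway/vendor/model-x` is a made-up value used only to show a model ID that itself contains a slash.

```javascript
// Copied from lib/models.js so the example runs on its own.
function modelObject(model) {
  const [providerID, ...modelParts] = model.split("/");
  if (!providerID || modelParts.length === 0) return undefined;
  return { providerID, modelID: modelParts.join("/") };
}

function modelString(model) {
  if (!model?.providerID || !model?.modelID) return undefined;
  return `${model.providerID}/${model.modelID}`;
}

// Only the first "/" separates provider from model; the rest stays in modelID.
const nested = modelObject("example-gateway/vendor/model-x"); // hypothetical ID
console.log(nested.providerID); // example-gateway
console.log(nested.modelID); // vendor/model-x
console.log(modelString(nested)); // example-gateway/vendor/model-x

// Strings without a provider prefix are rejected.
console.log(modelObject("gpt-5.5")); // undefined
```

One edge case follows from the code as written: `modelObject("openai/")` returns an object with an empty `modelID`, which `modelString` then rejects by returning `undefined`.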

package/lib/options.js ADDED

@@ -0,0 +1,25 @@
+const DEFAULT_OPTIONS = {
+  mappings: {},
+  retry_on_errors: [429],
+  retryable_error_patterns: ["rate.?limit"],
+  cooldown_seconds: 3600,
+  timeout_seconds: 30,
+  notify_on_fallback: true,
+};
+export function normalizeOptions(rawOptions) {
+  return {
+    ...DEFAULT_OPTIONS,
+    ...rawOptions,
+    mappings: normalizeMappings(rawOptions?.mappings ?? {}),
+  };
+}
+function normalizeMappings(mappings) {
+  const normalized = {};
+  for (const [from, to] of Object.entries(mappings)) {
+    if (!from || typeof to !== "string" || !to.includes("/")) continue;
+    normalized[from] = to;
+  }
+  return normalized;
+}
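
The merge behaviour of `normalizeOptions` is easiest to see with a concrete input. The sketch below inlines the code from `lib/options.js`; the option values passed in are hypothetical and chosen to exercise the validation in `normalizeMappings`.

```javascript
// Copied from lib/options.js so the example runs on its own.
const DEFAULT_OPTIONS = {
  mappings: {},
  retry_on_errors: [429],
  retryable_error_patterns: ["rate.?limit"],
  cooldown_seconds: 3600,
  timeout_seconds: 30,
  notify_on_fallback: true,
};

function normalizeMappings(mappings) {
  const normalized = {};
  for (const [from, to] of Object.entries(mappings)) {
    if (!from || typeof to !== "string" || !to.includes("/")) continue;
    normalized[from] = to;
  }
  return normalized;
}

function normalizeOptions(rawOptions) {
  return {
    ...DEFAULT_OPTIONS,
    ...rawOptions,
    mappings: normalizeMappings(rawOptions?.mappings ?? {}),
  };
}

// Hypothetical user options, for illustration only.
const options = normalizeOptions({
  cooldown_seconds: 600,
  mappings: {
    "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5", // kept
    "openai/gpt-5.4": "gpt-5.4", // dropped: fallback has no "provider/" prefix
    "openai/gpt-5.3": 42, // dropped: fallback is not a string
  },
});

console.log(Object.keys(options.mappings)); // [ 'openai/gpt-5.5' ]
console.log(options.cooldown_seconds); // 600 (user value wins)
console.log(options.timeout_seconds); // 30 (default kept)
```

Two properties follow from the object spread: the merge is shallow, so a user-supplied `retry_on_errors` array replaces the default array rather than extending it, and a key passed explicitly as `undefined` overrides its default with `undefined`.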

package/lib/router.js ADDED

@@ -0,0 +1,254 @@
+import { isRetryable, extractErrorName } from "./errors.js";
+import { modelObject, modelString } from "./models.js";
+import { normalizeOptions } from "./options.js";
+import { abortSession, getReplayParts } from "./session.js";
+import { createStateStore } from "./store.js";
+const POST_ABORT_DELAY_MS = 150;
+export function createMappedFallbackRouter(ctx, rawOptions) {
+  const options = normalizeOptions(rawOptions);
+  const fallbackToOriginal = Object.fromEntries(
+    Object.entries(options.mappings).map(([original, fallback]) => [fallback, original]),
+  );
+  const store = createStateStore();
+  const retrying = new Set();
+  const timers = new Map();
+  const selfAbortAt = new Map();
+  const activeOriginals = new Map();
+  let agentConfigs;
+  function hasMapping(model) {
+    return !!model && !!options.mappings[model];
+  }
+  function mappedOriginal(model) {
+    if (hasMapping(model)) return model;
+    return fallbackToOriginal[model];
+  }
+  function modelFromAgent(agent) {
+    const agentConfig = agent && agentConfigs?.[agent];
+    return typeof agentConfig === "object" && agentConfig ? agentConfig.model : undefined;
+  }
+  function selectedModel(requested) {
+    const original = mappedOriginal(requested);
+    if (!original) return requested;
+    const fallback = options.mappings[original];
+    if (!fallback) return requested;
+    const cooldown = store.getModelCooldown(original);
+    return cooldown ? fallback : original;
+  }
+  function timedOriginal(requested) {
+    if (!hasMapping(requested)) return undefined;
+    const cooldown = store.getModelCooldown(requested);
+    if (cooldown) return undefined;
+    return requested;
+  }
+  function shouldRoute(requested) {
+    return hasMapping(requested) || !!fallbackToOriginal[requested];
+  }
+  function resolveErrorModels(model, agent) {
+    const failed = model ?? modelFromAgent(agent);
+    const original = mappedOriginal(failed);
+    return { failed, original };
+  }
+  function shouldFallbackFromError(failed, original) {
+    if (!original) return false;
+    if (failed !== original) return false;
+    const cooldown = store.getModelCooldown(original);
+    return !cooldown;
+  }
+  function clearTimer(sessionID) {
+    const timer = timers.get(sessionID);
+    if (timer) clearTimeout(timer);
+    timers.delete(sessionID);
+  }
+  function scheduleTimeout(sessionID, original, agent) {
+    clearTimer(sessionID);
+    if (options.timeout_seconds <= 0 || !hasMapping(original)) return;
+    timers.set(
+      sessionID,
+      setTimeout(async () => {
+        timers.delete(sessionID);
+        if (retrying.has(sessionID) || !timedOriginal(original)) return;
+        retrying.add(sessionID);
+        try {
+          await abortCurrentSession(sessionID);
+          await retryWithFallback(sessionID, original, agent, "timeout");
+        } finally {
+          retrying.delete(sessionID);
+        }
+      }, options.timeout_seconds * 1000),
+    );
+  }
+  async function abortCurrentSession(sessionID) {
+    const aborted = await abortSession(ctx.client, sessionID);
+    if (aborted) selfAbortAt.set(sessionID, Date.now());
+  }
+  async function retryWithFallback(sessionID, original, agent, reason) {
+    const fallback = options.mappings[original];
+    const target = fallback ? modelObject(fallback) : undefined;
+    if (!target) return;
+    const failedAt = Date.now();
+    const cooldownUntil = failedAt + options.cooldown_seconds * 1000;
+    store.setModelCooldown(original, reason, failedAt, cooldownUntil);
+    const parts = await getReplayParts(ctx.client, ctx.directory, sessionID);
+    if (parts.length === 0) return;
+    clearTimer(sessionID);
+    try {
+      await new Promise((resolve) => setTimeout(resolve, POST_ABORT_DELAY_MS));
+      await ctx.client.session.promptAsync({
+        path: { id: sessionID },
+        body: { ...(agent ? { agent } : {}), model: target, parts },
+        query: { directory: ctx.directory },
+      });
+      await toast("Model Fallback", `${original} -> ${fallback} (${reason})`, "warning");
+    } catch {}
+  }
+  async function toast(title, message, variant) {
+    if (!options.notify_on_fallback) return;
+    await ctx.client.tui
+      .showToast({ body: { title, message, variant, duration: 5000 } })
+      .catch(() => {});
+  }
+  async function handleError(sessionID, error, model, agent, source) {
+    if (!sessionID || retrying.has(sessionID)) return;
+    const name = extractErrorName(error);
+    const selfAbort = selfAbortAt.get(sessionID);
+    if (name === "MessageAbortedError" && selfAbort && Date.now() - selfAbort < 2000) return;
+    const { failed, original } = resolveErrorModels(model, agent);
+    if (!original) return;
+    const retryable = isRetryable(error, options);
+    if (!retryable) return;
+    if (failed !== original) {
+      await toast("Model Fallback Exhausted", `No mapped fallback left for ${original}`, "error");
+      return;
+    }
+    if (!shouldFallbackFromError(failed, original)) return;
+    retrying.add(sessionID);
+    clearTimer(sessionID);
+    try {
+      await retryWithFallback(sessionID, original, agent, source);
+    } finally {
+      retrying.delete(sessionID);
+    }
+  }
+  async function handleProviderRetryStatus(props) {
+    const sessionID = props?.sessionID;
+    const status = props?.status;
+    if (!sessionID || !status || retrying.has(sessionID)) return;
+    const type = String(status.type ?? "").toLowerCase();
+    if (type !== "retry") return;
+    const agent = props?.agent;
+    const model =
+      props?.model ??
+      (typeof props?.providerID === "string" && typeof props?.modelID === "string"
+        ? `${props.providerID}/${props.modelID}`
+        : (activeOriginals.get(sessionID) ?? modelFromAgent(agent)));
+    const { failed, original } = resolveErrorModels(model, agent);
+    if (!original) return;
+    if (failed !== original) {
+      await toast("Model Fallback Exhausted", `No mapped fallback left for ${original}`, "error");
+      return;
+    }
+    if (!shouldFallbackFromError(failed, original)) return;
+    retrying.add(sessionID);
+    clearTimer(sessionID);
+    try {
+      await abortCurrentSession(sessionID);
+      await retryWithFallback(sessionID, original, agent, "session.status");
+    } finally {
+      retrying.delete(sessionID);
+    }
+  }
+  return {
+    name: "mapped-fallback-router",
+    config: (config) => {
+      const agentValue = config.agent;
+      agentConfigs =
+        agentValue && typeof agentValue === "object" && !Array.isArray(agentValue)
+          ? agentValue
+          : undefined;
+    },
+    "chat.message": async (input, output) => {
+      const sessionID = input.sessionID;
+      const requested = modelString(input.model) ?? modelFromAgent(input.agent);
+      if (!sessionID || !requested) return;
+      if (!shouldRoute(requested)) return;
+      const target = selectedModel(requested);
+      if (!target) return;
+      const original = mappedOriginal(requested);
+      if (original) activeOriginals.set(sessionID, original);
+      const model = modelObject(target);
+      if (model && output.message) output.message.model = model;
+      if (requested !== target) {
+        const isFallback = hasMapping(requested);
+        await toast(
+          isFallback ? "Model Fallback" : "Model Recovered",
+          `Using ${target} instead of ${requested}`,
+          isFallback ? "warning" : "info",
+        );
+      }
+      if (hasMapping(target)) scheduleTimeout(sessionID, target, input.agent);
+      else clearTimer(sessionID);
+    },
+    event: async ({ event }) => {
+      const props = event.properties;
+      if (event.type === "session.deleted") {
+        const id = props?.info?.id;
+        if (id) {
+          retrying.delete(id);
+          activeOriginals.delete(id);
+          clearTimer(id);
+        }
+        return;
+      }
+      if (event.type === "session.status") {
+        await handleProviderRetryStatus(props);
+        return;
+      }
+      if (event.type === "session.error") {
+        await handleError(
+          props?.sessionID,
+          props?.error,
+          props?.model,
+          props?.agent,
+          "session.error",
+        );
+        return;
+      }
+      if (event.type === "message.updated") {
+        const info = props?.info;
+        if (info?.role !== "assistant") return;
+        const sessionID = info?.sessionID;
+        if (!info?.error) {
+          if (sessionID) clearTimer(sessionID);
+          return;
+        }
+        const model =
+          info?.model ??
+          (typeof info?.providerID === "string" && typeof info?.modelID === "string"
+            ? `${info.providerID}/${info.modelID}`
+            : undefined);
+        await handleError(sessionID, info.error, model, info?.agent, "message.updated");
+      }
+    },
+  };
+}
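
The order of checks in `handleError` above can be condensed into a decision function. This is a simplified sketch under stated assumptions: `classifyFailure` is a hypothetical name, `retryable` replaces the call to `isRetryable`, and `coolingDown` replaces `store.getModelCooldown`. The self-abort and in-progress-retry guards are left out.

```javascript
const mappings = { "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5" };
const fallbackToOriginal = Object.fromEntries(
  Object.entries(mappings).map(([original, fallback]) => [fallback, original]),
);

// Hypothetical helper mirroring the check order in handleError:
// mapping lookup, retryability, exhaustion, then active cooldown.
function classifyFailure(failedModel, retryable, coolingDown) {
  const original = mappings[failedModel] ? failedModel : fallbackToOriginal[failedModel];
  if (!original) return "ignore"; // model is not part of any mapping
  if (!retryable) return "ignore"; // error does not match the retry rules
  if (failedModel !== original) return "exhausted"; // the fallback itself failed
  if (coolingDown.has(original)) return "ignore"; // fallback already in effect
  return "fallback"; // set cooldown and replay on the mapped fallback
}

const none = new Set();
console.log(classifyFailure("openai/gpt-5.5", true, none)); // fallback
console.log(classifyFailure("azure-ai-foundry/gpt-5.5", true, none)); // exhausted
console.log(classifyFailure("openai/gpt-5.5", false, none)); // ignore
```

The "exhausted" branch corresponds to the "Model Fallback Exhausted" toast: mappings are one-to-one, so a retryable failure on the fallback has nowhere further to go.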

package/lib/session.js ADDED

@@ -0,0 +1,25 @@
+export async function abortSession(client, sessionID) {
+  try {
+    await client.session.abort({ path: { id: sessionID } });
+    return true;
+  } catch {
+    // Best effort. The session may already be idle after an error.
+    return false;
+  }
+}
+export async function getReplayParts(client, directory, sessionID) {
+  const response = await client.session.messages({
+    path: { id: sessionID },
+    query: { directory },
+  });
+  const messages = response.data ?? [];
+  for (let index = messages.length - 1; index >= 0; index--) {
+    const message = messages[index];
+    const role = String(message.info?.role ?? "").toLowerCase();
+    const parts = message.parts ?? message.info?.parts ?? [];
+    if (role !== "user" || parts.length === 0) continue;
+    return parts.filter((part) => typeof part.type === "string" && part.type !== "compaction");
+  }
+  return [];
+}
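
The selection loop inside `getReplayParts` can be tried without an OpenCode client by extracting it into a synchronous function. In this sketch `latestUserParts` is a hypothetical name and the message history is made-up sample data shaped like the fields the loop reads (`info.role`, `parts`, `part.type`).

```javascript
// Same loop as getReplayParts, minus the client call that fetches messages.
function latestUserParts(messages) {
  for (let index = messages.length - 1; index >= 0; index--) {
    const message = messages[index];
    const role = String(message.info?.role ?? "").toLowerCase();
    const parts = message.parts ?? message.info?.parts ?? [];
    if (role !== "user" || parts.length === 0) continue;
    return parts.filter((part) => typeof part.type === "string" && part.type !== "compaction");
  }
  return [];
}

// Hypothetical session history, oldest first.
const history = [
  { info: { role: "user" }, parts: [{ type: "text", text: "first question" }] },
  { info: { role: "assistant" }, parts: [{ type: "text", text: "partial answer" }] },
  {
    info: { role: "user" },
    parts: [{ type: "compaction" }, { type: "text", text: "second question" }],
  },
  { info: { role: "assistant" }, parts: [] },
];

const replay = latestUserParts(history);
console.log(replay.length); // 1
console.log(replay[0].text); // second question
```

The loop stops at the newest user message that has any parts. If those parts are all of type `compaction`, the result is an empty array and no earlier user message is considered, which in `retryWithFallback` means nothing is replayed.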

package/lib/store.js ADDED

@@ -0,0 +1,51 @@
+import { existsSync, mkdirSync, readFileSync, renameSync, writeFileSync } from "node:fs";
+import { dirname, join } from "node:path";
+export const STORE_PATH = join(
+  process.env.XDG_DATA_HOME ?? join(process.env.HOME ?? "", ".local", "share"),
+  "opencode",
+  "mapped-fallback-router.json",
+);
+export function createStateStore(storePath = STORE_PATH) {
+  function read() {
+    try {
+      if (!existsSync(storePath)) return {};
+      const parsed = JSON.parse(readFileSync(storePath, "utf8"));
+      if (!parsed || typeof parsed !== "object") return {};
+      return parsed;
+    } catch {
+      return {};
+    }
+  }
+  function write(store) {
+    try {
+      mkdirSync(dirname(storePath), { recursive: true });
+      const tempPath = `${storePath}.${process.pid}.tmp`;
+      writeFileSync(tempPath, `${JSON.stringify(store, null, 2)}\n`);
+      renameSync(tempPath, storePath);
+    } catch {
+      // Persisted cooldown state is best effort. In-memory fallback still works.
+    }
+  }
+  return {
+    getModelCooldown(model) {
+      if (!model) return undefined;
+      const store = read();
+      const record = store[model];
+      if (!record) return undefined;
+      if (Date.now() < record.cooldownUntil) return record;
+      delete store[model];
+      write(store);
+      return undefined;
+    },
+    setModelCooldown(model, reason, failedAt, cooldownUntil) {
+      const store = read();
+      store[model] = { failedAt, cooldownUntil, reason };
+      write(store);
+    },
+  };
+}
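
The expiry semantics of the store can be illustrated without touching the filesystem. The sketch below is an in-memory stand-in, not the package's store: `createMemoryCooldowns` and its injected `now` clock are assumptions made so the timing is deterministic, while the record shape and the `Date.now() < cooldownUntil` comparison follow the code above.

```javascript
// Hypothetical in-memory variant of the cooldown store with an injectable clock.
function createMemoryCooldowns(now = () => Date.now()) {
  const records = new Map();
  return {
    set(model, reason, cooldownSeconds) {
      const failedAt = now();
      const cooldownUntil = failedAt + cooldownSeconds * 1000;
      records.set(model, { failedAt, cooldownUntil, reason });
    },
    get(model) {
      const record = records.get(model);
      if (!record) return undefined;
      if (now() < record.cooldownUntil) return record;
      records.delete(model); // expired entries are removed lazily, on read
      return undefined;
    },
    size: () => records.size,
  };
}

let clock = 1_000_000; // fake clock in milliseconds
const cooldowns = createMemoryCooldowns(() => clock);

cooldowns.set("openai/gpt-5.5", "session.error", 3600);
console.log(cooldowns.get("openai/gpt-5.5").cooldownUntil); // 4600000

clock += 3_599_999; // one millisecond before expiry
console.log(cooldowns.get("openai/gpt-5.5") !== undefined); // true

clock += 1; // exactly at cooldownUntil: the cooldown is over
console.log(cooldowns.get("openai/gpt-5.5")); // undefined
console.log(cooldowns.size()); // 0
```

The file-backed store differs in that every read and write goes through the JSON file, which is what lets separate sessions see the same cooldown. Its writes go to a temporary file that is then renamed over the target, so a reader should not observe a half-written file.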

package/package.json ADDED

@@ -0,0 +1,44 @@
+{
+  "name": "@renjfk/opencode-model-fallback",
+  "version": "0.1.0",
+  "description": "Mapped model fallback router for OpenCode. Routes retryable model failures to configured fallback models and recovers after cooldown.",
+  "keywords": [
+    "fallback",
+    "model",
+    "opencode",
+    "plugin",
+    "router"
+  ],
+  "license": "MIT",
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/renjfk/opencode-model-fallback.git"
+  },
+  "files": [
+    "index.js",
+    "lib"
+  ],
+  "type": "module",
+  "main": "index.js",
+  "exports": {
+    ".": {
+      "import": "./index.js"
+    },
+    "./tui": {
+      "import": "./index.js"
+    }
+  },
+  "scripts": {
+    "test": "vitest run",
+    "lint": "npx oxlint .",
+    "fmt": "npx oxfmt --check .",
+    "fmt:fix": "npx oxfmt --write .",
+    "check": "npm run test && npm run lint && npm run fmt",
+    "prepack": "npm run check"
+  },
+  "devDependencies": {
+    "oxfmt": "0.50.0",
+    "oxlint": "1.65.0",
+    "vitest": "latest"
+  }
+}