@khanglvm/llm-router 2.2.2 → 2.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,21 +1,9 @@
  # LLM Router

- LLM Router is a local and Cloudflare-deployable gateway for routing one client endpoint across multiple LLM providers, models, aliases, fallbacks, and rate limits.
+ A unified LLM gateway that routes requests across multiple providers through a single endpoint. Supports both OpenAI and Anthropic-compatible formats. Manage everything via Web UI or CLI optimized for AI agents.

  ![LLM Router Web Console](./assets/screenshots/web-ui-dashboard.png)

- **Current version**: `2.2.0`
-
- NPM package:
- ```bash
- @khanglvm/llm-router
- ```
-
- Primary CLI command:
- ```bash
- llr
- ```
-
  ## Install

  ```bash
@@ -24,282 +12,77 @@ npm i -g @khanglvm/llm-router@latest

  ## Quick Start

- 1. Open the Web UI:
-
  ```bash
- llr
+ llr # open Web UI
+ llr start # start the local gateway
+ llr ai-help # agent-oriented setup brief
  ```

- 2. Add at least one provider and model.
- 3. Optionally create aliases and fallback routes.
- 4. Start the local gateway:
-
- ```bash
- llr start
- ```
-
- 5. Point your client or coding tool at the local endpoint.
-
- ## Supported Operator Flows
-
- - CLI: direct operations like `llr config --operation=...`, `llr start`, `llr deploy`, provider diagnostics, and coding-tool routing control
- - Web UI: browser-based config editing, provider probing, and local router control
-
- The legacy TUI flow is no longer part of the supported workflow.
-
- ## Core Commands
-
- Open the Web UI:
-
- ```bash
- llr
- llr config
- llr web
- ```
-
- Run direct config operations:
-
- ```bash
- llr config --operation=validate
- llr config --operation=snapshot
- llr config --operation=tool-status
- llr config --operation=list
- llr config --operation=discover-provider-models --endpoints=https://openrouter.ai/api/v1 --api-key=sk-...
- llr config --operation=test-provider --endpoints=https://openrouter.ai/api/v1 --api-key=sk-... --models=gpt-4o-mini,gpt-4o
- llr config --operation=upsert-provider --provider-id=openrouter --name=OpenRouter --base-url=https://openrouter.ai/api/v1 --api-key=sk-... --models=gpt-4o-mini,gpt-4o
- llr config --operation=upsert-model-alias --alias-id=chat.default --strategy=auto --targets=openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2
- llr config --operation=set-provider-rate-limits --provider-id=openrouter --bucket-name="Monthly cap" --bucket-models=all --bucket-requests=20000 --bucket-window=month:1
- llr config --operation=set-master-key --generate-master-key=true
- llr config --operation=set-codex-cli-routing --enabled=true --default-model=chat.default
- llr config --operation=set-claude-code-routing --enabled=true --primary-model=chat.default
- llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
- ```
-
- Operate the local gateway:
-
- ```bash
- llr start
- llr stop
- llr reclaim
- llr reload
- llr update
- ```
+ 1. Open the Web UI and add a provider (API key or OAuth login)
+ 2. Create model aliases with routing strategy
+ 3. Start the gateway and point your tools at the local endpoint

- Get the agent-oriented setup brief:
+ ## What You Can Do

- ```bash
- llr ai-help
- ```
+ - **Add & manage providers** — connect any OpenAI/Anthropic-compatible API endpoint, test connectivity, auto-discover models
+ - **Unified endpoint** — one local gateway that accepts both OpenAI and Anthropic request formats
+ - **Model aliases with routing** — group models into stable alias names with weighted round-robin, quota-aware balancing, and automatic fallback
+ - **Rate limiting** — set request caps per model or across all models over configurable time windows
+ - **Coding tool routing** — one-click routing config for Codex CLI, Claude Code, and AMP
+ - **Web search** — built-in web search for AMP and other router-managed tools
+ - **Deployable** — run locally or deploy to Cloudflare Workers
+ - **AI-agent friendly** — full CLI parity with `llr config --operation=...` so agents can configure everything programmatically

  ## Web UI

- The Web UI is the default operator surface.
-
- ```bash
- llr
- llr web --port=9090
- llr web --open=false
- ```
-
- What it covers:
-
- - raw JSON config editing with validation
- - provider discovery and probe flows
- - alias, fallback, rate-limit, and AMP management
- - local router start, stop, and restart
- - coding-tool patch helpers for Codex CLI, Claude Code, and AMP
-
- The Web UI is localhost-only by default because it can expose secrets and live configuration.
-
- ### Screenshots
+ ### Alias & Fallback

- **Alias & Fallback**
+ Create stable route names across multiple providers with balancing and failover.

  ![Alias & Fallback](./assets/screenshots/web-ui-aliases.png)

- **AMP Configuration**
+ ### AMP (Beta)

- ![AMP Configuration](./assets/screenshots/web-ui-amp.png)
+ Route AMP-compatible requests through LLM Router with custom model mapping.

- **Claude Code Routing**
+ ![AMP Configuration](./assets/screenshots/web-ui-amp.png)

- ![Claude Code Routing](./assets/screenshots/web-ui-claude-code.png)
+ ### Codex CLI

- **Codex CLI Routing**
+ Route Codex CLI requests through the gateway with model override and thinking level.

  ![Codex CLI Routing](./assets/screenshots/web-ui-codex-cli.png)

- **Web Search**
-
- ![Web Search](./assets/screenshots/web-ui-web-search.png)
-
- ## CLI Parity
-
- The browser UI still gives the best interactive overview, but the CLI now exposes the main management flows an agent needs without relying on private web endpoints.
-
- ```bash
- llr config --operation=validate
- llr config --operation=snapshot
- llr config --operation=tool-status
- llr reclaim
- llr config --operation=set-codex-cli-routing --enabled=true --default-model=chat.default
- llr config --operation=set-claude-code-routing --enabled=true --primary-model=chat.default --default-haiku-model=chat.fast
- llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
- llr config --operation=set-codex-cli-routing --enabled=false
- llr config --operation=set-claude-code-routing --enabled=false
- llr config --operation=set-amp-client-routing --enabled=false --amp-client-settings-scope=workspace
- ```
-
- Notes:
-
- - `validate` checks raw config JSON + schema without opening the Web UI.
- - `snapshot` combines config, runtime, startup, and coding-tool routing state.
- - `tool-status` focuses only on Codex CLI, Claude Code, and AMP client wiring.
- - `reclaim` force-frees the fixed local router port when another listener is blocking `llr start`.
- - `set-codex-cli-routing` accepts `--default-model=<route>` or `--default-model=__codex_cli_inherit__` to keep Codex's own model selection.
- - `set-claude-code-routing` accepts `--primary-model`, `--default-opus-model`, `--default-sonnet-model`, `--default-haiku-model`, `--subagent-model`, and `--thinking-level`.
- - `set-amp-client-routing` patches or restores AMP client settings/secrets separately from router-side AMP config.
-
- ## Providers, Models, and Aliases
-
- - Provider: one upstream service such as OpenRouter or Anthropic
- - Model: one upstream model id exposed by that provider
- - Alias: one stable route name that can fan out to multiple provider/model targets
- - Rate-limit bucket: request cap scoped to one or more models over a time window
-
- Recommended pattern:
+ ### Claude Code

- 1. Add providers with direct model lists.
- 2. Create aliases for stable client-facing route names.
- 3. Put balancing/fallback behavior behind the alias, not in the client.
+ Route Claude Code through the gateway with per-tier model bindings.

- ## Subscription Providers
-
- OAuth-backed subscription providers are supported.
-
- ```bash
- llr config --operation=upsert-provider --provider-id=chatgpt --name="GPT Sub" --type=subscription --subscription-type=chatgpt-codex --subscription-profile=default
- llr config --operation=upsert-provider --provider-id=claude-sub --name="Claude Sub" --type=subscription --subscription-type=claude-code --subscription-profile=default
- llr subscription login --subscription-type=chatgpt-codex --profile=default
- llr subscription login --subscription-type=claude-code --profile=default
- llr subscription status
- ```
+ ![Claude Code Routing](./assets/screenshots/web-ui-claude-code.png)

- Supported `subscription-type` values:
+ ### Web Search

- - `chatgpt-codex`
- - `claude-code`
+ Configure search providers for AMP and other router-managed tools.

- Compliance note: using provider resources through LLM Router may violate a provider's terms. You are responsible for that usage.
+ ![Web Search](./assets/screenshots/web-ui-web-search.png)

- ## AMP
+ ## AMP (Beta)

- LLM Router can front AMP-compatible routes locally and optionally proxy unresolved AMP traffic upstream.
+ > AMP support is in beta. Features and API surface may change.

- Open the Web UI for AMP setup, or use direct CLI operations:
+ LLM Router can front AMP-compatible routes locally and proxy unresolved traffic upstream. Configure via the Web UI or CLI:

  ```bash
- llr config --operation=set-amp-config --patch-amp-client-config=true --amp-client-settings-scope=workspace --amp-client-url=http://127.0.0.1:4000
- llr config --operation=set-amp-config --amp-default-route=chat.default --amp-routes="smart => chat.smart, rush => chat.fast"
- llr config --operation=set-amp-config --amp-upstream-url=https://ampcode.com --amp-upstream-api-key=amp_...
  llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
  ```

- ## Local Real-Provider Suite
-
- The repo includes a local-only real-provider suite for the supported operator surfaces:
-
- - CLI config + local gateway start
- - Web UI discovery / probe / save / router control
-
- Setup:
-
- ```bash
- cp .env.test-suite.example .env.test-suite
- ```
-
- Then fill in your own provider keys, endpoints, and models.
-
- Run:
-
- ```bash
- npm run test:provider-live
- ```
-
- Legacy alias:
-
- ```bash
- npm run test:provider-smoke
- ```
-
- The live suite uses isolated temp HOME/config/runtime-state folders and does not overwrite your normal `~/.llm-router.json` or `~/.llm-router.runtime.json`.
-
- ## Deploy to Cloudflare
-
- Deploy the current config to a Worker:
-
- ```bash
- llr deploy
- llr deploy --dry-run=true
- llr deploy --workers-dev=true
- llr deploy --route-pattern=router.example.com/* --zone-name=example.com
- llr deploy --generate-master-key=true
- ```
-
- Fast worker key rotation:
-
- ```bash
- llr worker-key --generate-master-key=true
- llr worker-key --env=production --master-key=rotated-key
- ```
-
- ## Config File
-
- Local config path:
-
- ```text
- ~/.llm-router.json
- ```
-
- LLM Router also keeps related runtime and token state under the same namespace for backward compatibility with the published package.
-
- Useful runtime env knobs:
-
- - `LLM_ROUTER_MAX_REQUEST_BODY_BYTES`: caps inbound JSON body size for the local router and worker runtime. Default is `8 MiB` for `/responses` requests and `1 MiB` for other JSON endpoints.
- - `LLM_ROUTER_UPSTREAM_TIMEOUT_MS`: overrides the provider request timeout.
-
- ## Development
-
- Web UI dev loop:
-
- ```bash
- npm run dev
- ```
-
- Build the browser bundle:
-
- ```bash
- npm run build:web-console
- ```
-
- Run the JavaScript test suite:
-
- ```bash
- node --test $(rg --files -g "*.test.js" src)
- ```
-
- ## Documentation
+ ## Subscription Providers

- Comprehensive documentation is available in the `docs/` directory:
+ OAuth-backed subscription login is supported for ChatGPT.

- - **[Project Overview & PDR](./docs/project-overview-pdr.md)** Feature matrix, target users, success metrics, constraints
- - **[Codebase Summary](./docs/codebase-summary.md)** — Directory structure, module relationships, entry points, test infrastructure
- - **[Code Standards](./docs/code-standards.md)** — Patterns, naming conventions, testing, error handling
- - **[System Architecture](./docs/system-architecture.md)** — Request lifecycle, subsystem boundaries, data flow, deployment models
- - **[Project Roadmap](./docs/project-roadmap.md)** — Current status, planned phases, timeline, success metrics
+ > **Note:** ChatGPT subscriptions are separate from the OpenAI API and intended for use within OpenAI's own apps. Using them here may violate OpenAI's terms of service.

- ## Security and Releases
+ ## Links

- - Security: [`SECURITY.md`](https://github.com/khanglvm/llm-router/blob/master/SECURITY.md)
- - Release notes: [`CHANGELOG.md`](https://github.com/khanglvm/llm-router/blob/master/CHANGELOG.md)
- - AMP routing: [`docs/amp-routing.md`](./docs/amp-routing.md)
+ - [Changelog](https://github.com/khanglvm/llm-router/blob/master/CHANGELOG.md)
+ - [Security](https://github.com/khanglvm/llm-router/blob/master/SECURITY.md)
+ - [AMP Routing Docs](https://github.com/khanglvm/llm-router/blob/master/assets/amp-routing.md)
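The README text removed above configures alias targets as `provider/model@weight` specs (e.g. `--targets=openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2`). As a rough illustration of how weighted round-robin over such targets can work — a hypothetical sketch, not the package's actual routing code:

```javascript
// Hypothetical sketch of weighted round-robin over "provider/model@weight"
// alias targets. Not LLM Router's real implementation.
function parseTargets(spec) {
  return spec.split(",").map((entry) => {
    const [route, weight] = entry.trim().split("@");
    return { route, weight: Number(weight ?? 1) }; // weight defaults to 1
  });
}

function makeWeightedPicker(targets) {
  // Expand each target `weight` times into a rotation ring,
  // then cycle through it deterministically.
  const ring = targets.flatMap((t) => Array(t.weight).fill(t.route));
  let i = 0;
  return () => ring[i++ % ring.length];
}

const pick = makeWeightedPicker(
  parseTargets("openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2")
);
// Three openrouter picks, then two anthropic picks, repeating.
console.log([pick(), pick(), pick(), pick(), pick()]);
```

Expanding weights into a fixed ring keeps selection deterministic; a production router would also need to skip targets that are rate-limited or failing.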
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@khanglvm/llm-router",
-   "version": "2.2.2",
+   "version": "2.2.4",
    "description": "LLM Router: single gateway endpoint for multi-provider LLMs with unified OpenAI+Anthropic format and seamless fallback",
    "keywords": [
      "llm-router",
@@ -32,6 +32,7 @@
    "test:provider-live": "node --test --test-concurrency=1 ./test/live-provider-suite.test.js",
    "test:provider-smoke": "npm run test:provider-live",
    "test:amp-smoke": "node ./scripts/amp-smoke-suite.mjs",
+   "test:worker": "node ./scripts/test-worker.mjs",
    "prepublishOnly": "npm run test:provider-live"
  },
  "dependencies": {
package/src/cli-entry.js CHANGED
@@ -9,6 +9,7 @@ import { FIXED_LOCAL_ROUTER_HOST, FIXED_LOCAL_ROUTER_PORT } from "./node/local-s
  import { resolveListenPort } from "./node/listen-port.js";
  import { runStartCommand } from "./node/start-command.js";
  import { runWebCommand } from "./node/web-command.js";
+ import { runUpgradeCommand } from "./node/upgrade-command.js";

  function parseSimpleArgs(argv) {
    const positional = [];
@@ -198,6 +199,14 @@ export async function runCli(argv = process.argv.slice(2), isTTY = undefined, ov
    return runWebFastPath(configArgs.args, runWebCommandImpl);
  }

+ if ((first === "upgrade" || first === "update") && !parsed.wantsHelp) {
+   const result = await runUpgradeCommand({
+     onLine: (msg) => console.log(msg),
+     onError: (msg) => console.error(msg)
+   });
+   return result.exitCode ?? (result.ok ? 0 : 1);
+ }
+
  if (firstIsSetup && !parsed.wantsHelp) {
    const setupArgs = argv.slice(1);
    const parsedSetup = parseSimpleArgs(setupArgs);
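The new CLI branch maps the command result to a process exit code with `result.exitCode ?? (result.ok ? 0 : 1)`. The nullish coalescing operator matters here: it preserves an explicit `exitCode` of `0`, which `||` would discard. A standalone illustration of that fallback:

```javascript
// Illustration of the exit-code fallback used in the new `upgrade` branch:
// prefer an explicit exitCode when present, otherwise derive one from `ok`.
function toExitCode(result) {
  return result.exitCode ?? (result.ok ? 0 : 1);
}

console.log(toExitCode({ ok: true, exitCode: 0 }));  // 0
console.log(toExitCode({ ok: false }));              // 1 (derived from ok)
console.log(toExitCode({ ok: false, exitCode: 0 })); // 0 — ?? keeps an explicit 0, unlike ||
```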
package/src/node/upgrade-command.js ADDED
@@ -0,0 +1,128 @@
+ import { execSync } from "node:child_process";
+ import { readFileSync } from "node:fs";
+ import path from "node:path";
+ import { fileURLToPath } from "node:url";
+ import {
+   getActiveRuntimeState,
+   stopProcessByPid,
+   clearRuntimeState,
+   spawnDetachedStart
+ } from "./instance-state.js";
+
+ const PKG_NAME = "@khanglvm/llm-router";
+
+ function readInstalledVersion() {
+   try {
+     const dir = path.resolve(path.dirname(fileURLToPath(import.meta.url)), "../..");
+     const pkg = JSON.parse(readFileSync(path.join(dir, "package.json"), "utf8"));
+     return pkg.version || "unknown";
+   } catch {
+     return "unknown";
+   }
+ }
+
+ function fetchLatestVersion() {
+   try {
+     return execSync(`npm view ${PKG_NAME} version`, { encoding: "utf8" }).trim();
+   } catch {
+     return null;
+   }
+ }
+
+ function detectPackageManager() {
+   try {
+     const npmGlobalRoot = execSync("npm root -g", { encoding: "utf8" }).trim();
+     const entryReal = path.resolve(path.dirname(fileURLToPath(import.meta.url)), "../..");
+     if (entryReal.startsWith(npmGlobalRoot)) return "npm";
+   } catch { /* ignore */ }
+
+   try {
+     const out = execSync("pnpm list -g --json 2>/dev/null", { encoding: "utf8" });
+     if (out.includes(PKG_NAME)) return "pnpm";
+   } catch { /* ignore */ }
+
+   return "npm";
+ }
+
+ export async function runUpgradeCommand({ onLine, onError } = {}) {
+   const line = typeof onLine === "function" ? onLine : (msg) => console.log(msg);
+   const error = typeof onError === "function" ? onError : (msg) => console.error(msg);
+
+   const currentVersion = readInstalledVersion();
+   line(`Current version: ${currentVersion}`);
+
+   // Check latest
+   line("Checking for updates...");
+   const latestVersion = fetchLatestVersion();
+   if (!latestVersion) {
+     error("Could not fetch latest version from npm registry.");
+     return { ok: false, exitCode: 1 };
+   }
+
+   if (latestVersion === currentVersion) {
+     line(`Already on the latest version (${currentVersion}).`);
+     return { ok: true, exitCode: 0 };
+   }
+
+   line(`New version available: ${currentVersion} → ${latestVersion}`);
+
+   // Stop running instance
+   let wasRunning = false;
+   let savedState = null;
+   try {
+     const runtime = await getActiveRuntimeState();
+     if (runtime) {
+       wasRunning = true;
+       savedState = { ...runtime };
+       line(`Stopping running server (pid ${runtime.pid})...`);
+       const stopResult = await stopProcessByPid(runtime.pid);
+       if (stopResult.ok) {
+         await clearRuntimeState({ pid: runtime.pid });
+         line("Server stopped.");
+       } else {
+         error(`Warning: could not stop server cleanly — ${stopResult.reason || "unknown"}`);
+       }
+     }
+   } catch {
+     // instance-state not available, skip
+   }
+
+   // Install latest
+   const pm = detectPackageManager();
+   const installCmd = pm === "pnpm"
+     ? `pnpm add -g ${PKG_NAME}@latest`
+     : `npm install -g ${PKG_NAME}@latest`;
+
+   line(`Upgrading via: ${installCmd}`);
+   try {
+     execSync(installCmd, { stdio: "inherit" });
+   } catch {
+     error("Upgrade failed. You may need to run with sudo or fix npm permissions.");
+     return { ok: false, exitCode: 1 };
+   }
+
+   const newVersion = fetchLatestVersion() || latestVersion;
+   line(`Upgraded to ${newVersion}.`);
+
+   // Restart server if it was running
+   if (wasRunning && savedState) {
+     line("Restarting server...");
+     try {
+       spawnDetachedStart({
+         cliPath: savedState.cliPath || "",
+         configPath: savedState.configPath || "",
+         host: savedState.host || "127.0.0.1",
+         port: savedState.port || 18080,
+         watchConfig: savedState.watchConfig ?? true,
+         watchBinary: savedState.watchBinary ?? true,
+         requireAuth: savedState.requireAuth ?? false,
+       });
+       line("Server restarted.");
+     } catch (err) {
+       error(`Could not restart server: ${err instanceof Error ? err.message : String(err)}`);
+       line("Start manually with: llr start");
+     }
+   }
+
+   return { ok: true, exitCode: 0 };
+ }
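Note that `runUpgradeCommand` compares version strings with strict equality, so any mismatch with the registry (including a local version that is ahead) triggers a reinstall. A semver-aware comparison — a hypothetical refinement, not part of the package — would only upgrade when the registry version is strictly newer:

```javascript
// Hypothetical helper: numeric semver comparison, returning 1/0/-1 like a
// comparator. Pre-release tags are ignored for brevity.
function compareSemver(a, b) {
  const pa = a.split(".").map((n) => parseInt(n, 10));
  const pb = b.split(".").map((n) => parseInt(n, 10));
  for (let i = 0; i < 3; i++) {
    // Missing components count as 0 ("1.0" === "1.0.0").
    if ((pa[i] || 0) !== (pb[i] || 0)) return (pa[i] || 0) > (pb[i] || 0) ? 1 : -1;
  }
  return 0;
}

console.log(compareSemver("2.2.4", "2.2.2"));  // 1 (registry is newer)
console.log(compareSemver("2.2.2", "2.2.2"));  // 0 (up to date)
console.log(compareSemver("2.2.2", "2.10.0")); // -1 — numeric, not lexicographic
```

With this, the upgrade gate would become `compareSemver(latestVersion, currentVersion) > 0` instead of `latestVersion !== currentVersion`.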