@khanglvm/llm-router 2.2.2 → 2.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,21 +1,9 @@
  # LLM Router

- LLM Router is a local and Cloudflare-deployable gateway for routing one client endpoint across multiple LLM providers, models, aliases, fallbacks, and rate limits.
+ A unified LLM gateway that routes requests across multiple providers through a single endpoint. Supports both OpenAI and Anthropic-compatible formats. Manage everything via Web UI or CLI optimized for AI agents.

  ![LLM Router Web Console](./assets/screenshots/web-ui-dashboard.png)

- **Current version**: `2.2.0`
-
- NPM package:
- ```bash
- @khanglvm/llm-router
- ```
-
- Primary CLI command:
- ```bash
- llr
- ```
-
  ## Install

  ```bash
@@ -24,282 +12,77 @@ npm i -g @khanglvm/llm-router@latest

  ## Quick Start

- 1. Open the Web UI:
-
  ```bash
- llr
+ llr # open Web UI
+ llr start # start the local gateway
+ llr ai-help # agent-oriented setup brief
  ```

- 2. Add at least one provider and model.
- 3. Optionally create aliases and fallback routes.
- 4. Start the local gateway:
-
- ```bash
- llr start
- ```
-
- 5. Point your client or coding tool at the local endpoint.
-
- ## Supported Operator Flows
-
- - CLI: direct operations like `llr config --operation=...`, `llr start`, `llr deploy`, provider diagnostics, and coding-tool routing control
- - Web UI: browser-based config editing, provider probing, and local router control
-
- The legacy TUI flow is no longer part of the supported workflow.
-
- ## Core Commands
-
- Open the Web UI:
-
- ```bash
- llr
- llr config
- llr web
- ```
-
- Run direct config operations:
-
- ```bash
- llr config --operation=validate
- llr config --operation=snapshot
- llr config --operation=tool-status
- llr config --operation=list
- llr config --operation=discover-provider-models --endpoints=https://openrouter.ai/api/v1 --api-key=sk-...
- llr config --operation=test-provider --endpoints=https://openrouter.ai/api/v1 --api-key=sk-... --models=gpt-4o-mini,gpt-4o
- llr config --operation=upsert-provider --provider-id=openrouter --name=OpenRouter --base-url=https://openrouter.ai/api/v1 --api-key=sk-... --models=gpt-4o-mini,gpt-4o
- llr config --operation=upsert-model-alias --alias-id=chat.default --strategy=auto --targets=openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2
- llr config --operation=set-provider-rate-limits --provider-id=openrouter --bucket-name="Monthly cap" --bucket-models=all --bucket-requests=20000 --bucket-window=month:1
- llr config --operation=set-master-key --generate-master-key=true
- llr config --operation=set-codex-cli-routing --enabled=true --default-model=chat.default
- llr config --operation=set-claude-code-routing --enabled=true --primary-model=chat.default
- llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
- ```
-
- Operate the local gateway:
-
- ```bash
- llr start
- llr stop
- llr reclaim
- llr reload
- llr update
- ```
+ 1. Open the Web UI and add a provider (API key or OAuth login)
+ 2. Create model aliases with routing strategy
+ 3. Start the gateway and point your tools at the local endpoint

- Get the agent-oriented setup brief:
+ ## What You Can Do

- ```bash
- llr ai-help
- ```
+ - **Add & manage providers** — connect any OpenAI/Anthropic-compatible API endpoint, test connectivity, auto-discover models
+ - **Unified endpoint** — one local gateway that accepts both OpenAI and Anthropic request formats
+ - **Model aliases with routing** — group models into stable alias names with weighted round-robin, quota-aware balancing, and automatic fallback
+ - **Rate limiting** — set request caps per model or across all models over configurable time windows
+ - **Coding tool routing** — one-click routing config for Codex CLI, Claude Code, and AMP
+ - **Web search** — built-in web search for AMP and other router-managed tools
+ - **Deployable** — run locally or deploy to Cloudflare Workers
+ - **AI-agent friendly** — full CLI parity with `llr config --operation=...` so agents can configure everything programmatically

  ## Web UI

- The Web UI is the default operator surface.
-
- ```bash
- llr
- llr web --port=9090
- llr web --open=false
- ```
-
- What it covers:
-
- - raw JSON config editing with validation
- - provider discovery and probe flows
- - alias, fallback, rate-limit, and AMP management
- - local router start, stop, and restart
- - coding-tool patch helpers for Codex CLI, Claude Code, and AMP
-
- The Web UI is localhost-only by default because it can expose secrets and live configuration.
-
- ### Screenshots
+ ### Alias & Fallback

- **Alias & Fallback**
+ Create stable route names across multiple providers with balancing and failover.

  ![Alias & Fallback](./assets/screenshots/web-ui-aliases.png)

- **AMP Configuration**
+ ### AMP (Beta)

- ![AMP Configuration](./assets/screenshots/web-ui-amp.png)
+ Route AMP-compatible requests through LLM Router with custom model mapping.

- **Claude Code Routing**
+ ![AMP Configuration](./assets/screenshots/web-ui-amp.png)

- ![Claude Code Routing](./assets/screenshots/web-ui-claude-code.png)
+ ### Codex CLI

- **Codex CLI Routing**
+ Route Codex CLI requests through the gateway with model override and thinking level.

  ![Codex CLI Routing](./assets/screenshots/web-ui-codex-cli.png)

- **Web Search**
-
- ![Web Search](./assets/screenshots/web-ui-web-search.png)
-
- ## CLI Parity
-
- The browser UI still gives the best interactive overview, but the CLI now exposes the main management flows an agent needs without relying on private web endpoints.
-
- ```bash
- llr config --operation=validate
- llr config --operation=snapshot
- llr config --operation=tool-status
- llr reclaim
- llr config --operation=set-codex-cli-routing --enabled=true --default-model=chat.default
- llr config --operation=set-claude-code-routing --enabled=true --primary-model=chat.default --default-haiku-model=chat.fast
- llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
- llr config --operation=set-codex-cli-routing --enabled=false
- llr config --operation=set-claude-code-routing --enabled=false
- llr config --operation=set-amp-client-routing --enabled=false --amp-client-settings-scope=workspace
- ```
-
- Notes:
-
- - `validate` checks raw config JSON + schema without opening the Web UI.
- - `snapshot` combines config, runtime, startup, and coding-tool routing state.
- - `tool-status` focuses only on Codex CLI, Claude Code, and AMP client wiring.
- - `reclaim` force-frees the fixed local router port when another listener is blocking `llr start`.
- - `set-codex-cli-routing` accepts `--default-model=<route>` or `--default-model=__codex_cli_inherit__` to keep Codex's own model selection.
- - `set-claude-code-routing` accepts `--primary-model`, `--default-opus-model`, `--default-sonnet-model`, `--default-haiku-model`, `--subagent-model`, and `--thinking-level`.
- - `set-amp-client-routing` patches or restores AMP client settings/secrets separately from router-side AMP config.
-
- ## Providers, Models, and Aliases
-
- - Provider: one upstream service such as OpenRouter or Anthropic
- - Model: one upstream model id exposed by that provider
- - Alias: one stable route name that can fan out to multiple provider/model targets
- - Rate-limit bucket: request cap scoped to one or more models over a time window
-
- Recommended pattern:
+ ### Claude Code

- 1. Add providers with direct model lists.
- 2. Create aliases for stable client-facing route names.
- 3. Put balancing/fallback behavior behind the alias, not in the client.
+ Route Claude Code through the gateway with per-tier model bindings.

- ## Subscription Providers
-
- OAuth-backed subscription providers are supported.
-
- ```bash
- llr config --operation=upsert-provider --provider-id=chatgpt --name="GPT Sub" --type=subscription --subscription-type=chatgpt-codex --subscription-profile=default
- llr config --operation=upsert-provider --provider-id=claude-sub --name="Claude Sub" --type=subscription --subscription-type=claude-code --subscription-profile=default
- llr subscription login --subscription-type=chatgpt-codex --profile=default
- llr subscription login --subscription-type=claude-code --profile=default
- llr subscription status
- ```
+ ![Claude Code Routing](./assets/screenshots/web-ui-claude-code.png)

- Supported `subscription-type` values:
+ ### Web Search

- - `chatgpt-codex`
- - `claude-code`
+ Configure search providers for AMP and other router-managed tools.

- Compliance note: using provider resources through LLM Router may violate a provider's terms. You are responsible for that usage.
+ ![Web Search](./assets/screenshots/web-ui-web-search.png)

- ## AMP
+ ## AMP (Beta)

- LLM Router can front AMP-compatible routes locally and optionally proxy unresolved AMP traffic upstream.
+ > AMP support is in beta. Features and API surface may change.

- Open the Web UI for AMP setup, or use direct CLI operations:
+ LLM Router can front AMP-compatible routes locally and proxy unresolved traffic upstream. Configure via the Web UI or CLI:

  ```bash
- llr config --operation=set-amp-config --patch-amp-client-config=true --amp-client-settings-scope=workspace --amp-client-url=http://127.0.0.1:4000
- llr config --operation=set-amp-config --amp-default-route=chat.default --amp-routes="smart => chat.smart, rush => chat.fast"
- llr config --operation=set-amp-config --amp-upstream-url=https://ampcode.com --amp-upstream-api-key=amp_...
  llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
  ```

- ## Local Real-Provider Suite
-
- The repo includes a local-only real-provider suite for the supported operator surfaces:
-
- - CLI config + local gateway start
- - Web UI discovery / probe / save / router control
-
- Setup:
-
- ```bash
- cp .env.test-suite.example .env.test-suite
- ```
-
- Then fill in your own provider keys, endpoints, and models.
-
- Run:
-
- ```bash
- npm run test:provider-live
- ```
-
- Legacy alias:
-
- ```bash
- npm run test:provider-smoke
- ```
-
- The live suite uses isolated temp HOME/config/runtime-state folders and does not overwrite your normal `~/.llm-router.json` or `~/.llm-router.runtime.json`.
-
- ## Deploy to Cloudflare
-
- Deploy the current config to a Worker:
-
- ```bash
- llr deploy
- llr deploy --dry-run=true
- llr deploy --workers-dev=true
- llr deploy --route-pattern=router.example.com/* --zone-name=example.com
- llr deploy --generate-master-key=true
- ```
-
- Fast worker key rotation:
-
- ```bash
- llr worker-key --generate-master-key=true
- llr worker-key --env=production --master-key=rotated-key
- ```
-
- ## Config File
-
- Local config path:
-
- ```text
- ~/.llm-router.json
- ```
-
- LLM Router also keeps related runtime and token state under the same namespace for backward compatibility with the published package.
-
- Useful runtime env knobs:
-
- - `LLM_ROUTER_MAX_REQUEST_BODY_BYTES`: caps inbound JSON body size for the local router and worker runtime. Default is `8 MiB` for `/responses` requests and `1 MiB` for other JSON endpoints.
- - `LLM_ROUTER_UPSTREAM_TIMEOUT_MS`: overrides the provider request timeout.
-
- ## Development
-
- Web UI dev loop:
-
- ```bash
- npm run dev
- ```
-
- Build the browser bundle:
-
- ```bash
- npm run build:web-console
- ```
-
- Run the JavaScript test suite:
-
- ```bash
- node --test $(rg --files -g "*.test.js" src)
- ```
-
- ## Documentation
+ ## Subscription Providers

- Comprehensive documentation is available in the `docs/` directory:
+ OAuth-backed subscription login is supported for ChatGPT.

- - **[Project Overview & PDR](./docs/project-overview-pdr.md)** Feature matrix, target users, success metrics, constraints
- - **[Codebase Summary](./docs/codebase-summary.md)** — Directory structure, module relationships, entry points, test infrastructure
- - **[Code Standards](./docs/code-standards.md)** — Patterns, naming conventions, testing, error handling
- - **[System Architecture](./docs/system-architecture.md)** — Request lifecycle, subsystem boundaries, data flow, deployment models
- - **[Project Roadmap](./docs/project-roadmap.md)** — Current status, planned phases, timeline, success metrics
+ > **Note:** ChatGPT subscriptions are separate from the OpenAI API and intended for use within OpenAI's own apps. Using them here may violate OpenAI's terms of service.

- ## Security and Releases
+ ## Links

- - Security: [`SECURITY.md`](https://github.com/khanglvm/llm-router/blob/master/SECURITY.md)
- - Release notes: [`CHANGELOG.md`](https://github.com/khanglvm/llm-router/blob/master/CHANGELOG.md)
- - AMP routing: [`docs/amp-routing.md`](./docs/amp-routing.md)
+ - [Changelog](https://github.com/khanglvm/llm-router/blob/master/CHANGELOG.md)
+ - [Security](https://github.com/khanglvm/llm-router/blob/master/SECURITY.md)
+ - [AMP Routing Docs](https://github.com/khanglvm/llm-router/blob/master/assets/amp-routing.md)
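The README text removed above configures alias targets as `provider/model@weight` specs (e.g. `--targets=openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2`). As a rough illustration of how weighted round-robin over such targets can work — a hypothetical sketch, not the package's actual routing code:

```javascript
// Hypothetical sketch of weighted round-robin over "provider/model@weight"
// alias targets. Not LLM Router's real implementation.
function parseTargets(spec) {
  return spec.split(",").map((entry) => {
    const [route, weight] = entry.trim().split("@");
    return { route, weight: Number(weight ?? 1) }; // weight defaults to 1
  });
}

function makeWeightedPicker(targets) {
  // Expand each target `weight` times into a rotation ring,
  // then cycle through it deterministically.
  const ring = targets.flatMap((t) => Array(t.weight).fill(t.route));
  let i = 0;
  return () => ring[i++ % ring.length];
}

const pick = makeWeightedPicker(
  parseTargets("openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2")
);
// Three openrouter picks, then two anthropic picks, repeating.
console.log([pick(), pick(), pick(), pick(), pick()]);
```

Expanding weights into a fixed ring keeps selection deterministic; a production router would also need to skip targets that are rate-limited or failing.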
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@khanglvm/llm-router",
-   "version": "2.2.2",
+   "version": "2.2.4",
    "description": "LLM Router: single gateway endpoint for multi-provider LLMs with unified OpenAI+Anthropic format and seamless fallback",
    "keywords": [
      "llm-router",
@@ -32,6 +32,7 @@
    "test:provider-live": "node --test --test-concurrency=1 ./test/live-provider-suite.test.js",
    "test:provider-smoke": "npm run test:provider-live",
    "test:amp-smoke": "node ./scripts/amp-smoke-suite.mjs",
+   "test:worker": "node ./scripts/test-worker.mjs",
    "prepublishOnly": "npm run test:provider-live"
  },
  "dependencies": {
package/src/cli-entry.js CHANGED
@@ -9,6 +9,7 @@ import { FIXED_LOCAL_ROUTER_HOST, FIXED_LOCAL_ROUTER_PORT } from "./node/local-s
  import { resolveListenPort } from "./node/listen-port.js";
  import { runStartCommand } from "./node/start-command.js";
  import { runWebCommand } from "./node/web-command.js";
+ import { runUpgradeCommand } from "./node/upgrade-command.js";

  function parseSimpleArgs(argv) {
    const positional = [];
@@ -198,6 +199,14 @@ export async function runCli(argv = process.argv.slice(2), isTTY = undefined, ov
    return runWebFastPath(configArgs.args, runWebCommandImpl);
  }

+ if ((first === "upgrade" || first === "update") && !parsed.wantsHelp) {
+   const result = await runUpgradeCommand({
+     onLine: (msg) => console.log(msg),
+     onError: (msg) => console.error(msg)
+   });
+   return result.exitCode ?? (result.ok ? 0 : 1);
+ }
+
  if (firstIsSetup && !parsed.wantsHelp) {
    const setupArgs = argv.slice(1);
    const parsedSetup = parseSimpleArgs(setupArgs);
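The new CLI branch maps the command result to a process exit code with `result.exitCode ?? (result.ok ? 0 : 1)`. The nullish coalescing operator matters here: it preserves an explicit `exitCode` of `0`, which `||` would discard. A standalone illustration of that fallback:

```javascript
// Illustration of the exit-code fallback used in the new `upgrade` branch:
// prefer an explicit exitCode when present, otherwise derive one from `ok`.
function toExitCode(result) {
  return result.exitCode ?? (result.ok ? 0 : 1);
}

console.log(toExitCode({ ok: true, exitCode: 0 }));  // 0
console.log(toExitCode({ ok: false }));              // 1 (derived from ok)
console.log(toExitCode({ ok: false, exitCode: 0 })); // 0 — ?? keeps an explicit 0, unlike ||
```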
package/src/node/upgrade-command.js ADDED
@@ -0,0 +1,128 @@
+ import { execSync } from "node:child_process";
+ import { readFileSync } from "node:fs";
+ import path from "node:path";
+ import { fileURLToPath } from "node:url";
+ import {
+   getActiveRuntimeState,
+   stopProcessByPid,
+   clearRuntimeState,
+   spawnDetachedStart
+ } from "./instance-state.js";
+
+ const PKG_NAME = "@khanglvm/llm-router";
+
+ function readInstalledVersion() {
+   try {
+     const dir = path.resolve(path.dirname(fileURLToPath(import.meta.url)), "../..");
+     const pkg = JSON.parse(readFileSync(path.join(dir, "package.json"), "utf8"));
+     return pkg.version || "unknown";
+   } catch {
+     return "unknown";
+   }
+ }
+
+ function fetchLatestVersion() {
+   try {
+     return execSync(`npm view ${PKG_NAME} version`, { encoding: "utf8" }).trim();
+   } catch {
+     return null;
+   }
+ }
+
+ function detectPackageManager() {
+   try {
+     const npmGlobalRoot = execSync("npm root -g", { encoding: "utf8" }).trim();
+     const entryReal = path.resolve(path.dirname(fileURLToPath(import.meta.url)), "../..");
+     if (entryReal.startsWith(npmGlobalRoot)) return "npm";
+   } catch { /* ignore */ }
+
+   try {
+     const out = execSync("pnpm list -g --json 2>/dev/null", { encoding: "utf8" });
+     if (out.includes(PKG_NAME)) return "pnpm";
+   } catch { /* ignore */ }
+
+   return "npm";
+ }
+
+ export async function runUpgradeCommand({ onLine, onError } = {}) {
+   const line = typeof onLine === "function" ? onLine : (msg) => console.log(msg);
+   const error = typeof onError === "function" ? onError : (msg) => console.error(msg);
+
+   const currentVersion = readInstalledVersion();
+   line(`Current version: ${currentVersion}`);
+
+   // Check latest
+   line("Checking for updates...");
+   const latestVersion = fetchLatestVersion();
+   if (!latestVersion) {
+     error("Could not fetch latest version from npm registry.");
+     return { ok: false, exitCode: 1 };
+   }
+
+   if (latestVersion === currentVersion) {
+     line(`Already on the latest version (${currentVersion}).`);
+     return { ok: true, exitCode: 0 };
+   }
+
+   line(`New version available: ${currentVersion} → ${latestVersion}`);
+
+   // Stop running instance
+   let wasRunning = false;
+   let savedState = null;
+   try {
+     const runtime = await getActiveRuntimeState();
+     if (runtime) {
+       wasRunning = true;
+       savedState = { ...runtime };
+       line(`Stopping running server (pid ${runtime.pid})...`);
+       const stopResult = await stopProcessByPid(runtime.pid);
+       if (stopResult.ok) {
+         await clearRuntimeState({ pid: runtime.pid });
+         line("Server stopped.");
+       } else {
+         error(`Warning: could not stop server cleanly — ${stopResult.reason || "unknown"}`);
+       }
+     }
+   } catch {
+     // instance-state not available, skip
+   }
+
+   // Install latest
+   const pm = detectPackageManager();
+   const installCmd = pm === "pnpm"
+     ? `pnpm add -g ${PKG_NAME}@latest`
+     : `npm install -g ${PKG_NAME}@latest`;
+
+   line(`Upgrading via: ${installCmd}`);
+   try {
+     execSync(installCmd, { stdio: "inherit" });
+   } catch {
+     error("Upgrade failed. You may need to run with sudo or fix npm permissions.");
+     return { ok: false, exitCode: 1 };
+   }
+
+   const newVersion = fetchLatestVersion() || latestVersion;
+   line(`Upgraded to ${newVersion}.`);
+
+   // Restart server if it was running
+   if (wasRunning && savedState) {
+     line("Restarting server...");
+     try {
+       spawnDetachedStart({
+         cliPath: savedState.cliPath || "",
+         configPath: savedState.configPath || "",
+         host: savedState.host || "127.0.0.1",
+         port: savedState.port || 18080,
+         watchConfig: savedState.watchConfig ?? true,
+         watchBinary: savedState.watchBinary ?? true,
+         requireAuth: savedState.requireAuth ?? false,
+       });
+       line("Server restarted.");
+     } catch (err) {
+       error(`Could not restart server: ${err instanceof Error ? err.message : String(err)}`);
+       line("Start manually with: llr start");
+     }
+   }
+
+   return { ok: true, exitCode: 0 };
+ }
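Note that `runUpgradeCommand` compares version strings with strict equality, so any mismatch with the registry (including a local version that is ahead) triggers a reinstall. A semver-aware comparison — a hypothetical refinement, not part of the package — would only upgrade when the registry version is strictly newer:

```javascript
// Hypothetical helper: numeric semver comparison, returning 1/0/-1 like a
// comparator. Pre-release tags are ignored for brevity.
function compareSemver(a, b) {
  const pa = a.split(".").map((n) => parseInt(n, 10));
  const pb = b.split(".").map((n) => parseInt(n, 10));
  for (let i = 0; i < 3; i++) {
    // Missing components count as 0 ("1.0" === "1.0.0").
    if ((pa[i] || 0) !== (pb[i] || 0)) return (pa[i] || 0) > (pb[i] || 0) ? 1 : -1;
  }
  return 0;
}

console.log(compareSemver("2.2.4", "2.2.2"));  // 1 (registry is newer)
console.log(compareSemver("2.2.2", "2.2.2"));  // 0 (up to date)
console.log(compareSemver("2.2.2", "2.10.0")); // -1 — numeric, not lexicographic
```

With this, the upgrade gate would become `compareSemver(latestVersion, currentVersion) > 0` instead of `latestVersion !== currentVersion`.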