@khanglvm/llm-router 2.0.0-beta.1 → 2.0.0
- package/CHANGELOG.md +27 -0
- package/README.md +163 -426
- package/package.json +3 -3
- package/src/cli/router-module.js +2773 -2587
- package/src/cli-entry.js +32 -103
- package/src/node/activity-log.js +119 -0
- package/src/node/coding-tool-config.js +85 -11
- package/src/node/config-workflows.js +51 -12
- package/src/node/instance-state.js +1 -1
- package/src/node/litellm-context-catalog.js +184 -0
- package/src/node/local-server.js +23 -3
- package/src/node/port-reclaim.js +2 -2
- package/src/node/start-command.js +22 -22
- package/src/node/startup-manager.js +3 -3
- package/src/node/web-command.js +1 -1
- package/src/node/web-console-assets.js +1 -1
- package/src/node/web-console-client.js +34 -29
- package/src/node/web-console-server.js +420 -38
- package/src/node/web-console-styles.generated.js +1 -1
- package/src/node/web-console-ui/buffered-text-input.js +133 -0
- package/src/node/web-console-ui/config-editor-utils.js +57 -4
- package/src/node/web-console-ui/dropdown-placement.js +153 -0
- package/src/node/web-console-ui/select-search-utils.js +6 -0
- package/src/node/web-console-ui/transient-integer-input-utils.js +12 -0
- package/src/runtime/balancer.js +78 -1
- package/src/runtime/codex-request-transformer.js +16 -7
- package/src/runtime/config.js +448 -12
- package/src/runtime/handler/amp-response.js +5 -3
- package/src/runtime/handler/amp-web-search.js +2232 -0
- package/src/runtime/handler/fallback.js +30 -2
- package/src/runtime/handler/provider-call.js +353 -36
- package/src/runtime/handler/provider-translation.js +14 -0
- package/src/runtime/handler/request.js +128 -2
- package/src/runtime/handler/route-debug.js +36 -0
- package/src/runtime/handler.js +210 -20
- package/src/runtime/subscription-provider.js +1 -1
- package/src/shared/coding-tool-bindings.js +49 -0
- package/src/shared/local-router-defaults.js +62 -0
- package/src/translator/request/claude-to-openai.js +43 -0
package/README.md
CHANGED
|
@@ -1,535 +1,272 @@
|
|
|
1
|
-
#
|
|
1
|
+
# LLM Router
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
LLM Router is a local and Cloudflare-deployable gateway for routing one client endpoint across multiple LLM providers, models, aliases, fallbacks, and rate limits.
|
|
4
4
|
|
|
5
|
-
|
|
6
|
-
2. Support grouping models with rate-limit and load balancing strategy
|
|
7
|
-
3. Configuration auto reload in real time, no interruption
|
|
8
|
-
|
|
9
|
-
## Beta Notice
|
|
10
|
-
|
|
11
|
-
`2.0.0-beta.1` is the current public prerelease. It includes major AMP routing, web console, and local operator workflow changes, so treat it as beta and expect rough edges while validating it before a stable `2.0.0` release.
|
|
12
|
-
|
|
13
|
-
Short highlights in this beta:
|
|
14
|
-
- New localhost web console for config editing, provider testing, and router lifecycle control
|
|
15
|
-
- Quick patching for AMP Code, Codex CLI and Claude Code
|
|
16
|
-
- Expanded operator workflows across CLI, TUI, OAuth subscription setup, and live provider validation
|
|
17
|
-
- Fixed various format-transformation issues
|
|
18
|
-
|
|
19
|
-
## Install
|
|
20
|
-
|
|
21
|
-
Stable channel:
|
|
5
|
+
The npm package name stays the same:
|
|
22
6
|
|
|
23
7
|
```bash
|
|
24
|
-
|
|
8
|
+
@khanglvm/llm-router
|
|
25
9
|
```
|
|
26
10
|
|
|
27
|
-
|
|
11
|
+
The primary CLI command is now:
|
|
28
12
|
|
|
29
13
|
```bash
|
|
30
|
-
|
|
14
|
+
llr
|
|
31
15
|
```
|
|
32
16
|
|
|
33
|
-
|
|
17
|
+
`2.0.0` is the current public release. It includes the Web UI, AMP routing, and coding-tool integrations introduced in the 2.x line.
|
|
34
18
|
|
|
35
|
-
|
|
19
|
+
## Install
|
|
36
20
|
|
|
37
|
-
```
|
|
38
|
-
|
|
21
|
+
```bash
|
|
22
|
+
npm i -g @khanglvm/llm-router@latest
|
|
39
23
|
```
|
|
40
24
|
|
|
41
|
-
##
|
|
42
|
-
|
|
43
|
-
The repo now includes a local-only live provider suite that covers all three operator surfaces:
|
|
25
|
+
## Quick Start
|
|
44
26
|
|
|
45
|
-
|
|
46
|
-
- TUI config menus
|
|
47
|
-
- Web console provider discovery/test + browser bundle render
|
|
48
|
-
|
|
49
|
-
Setup:
|
|
27
|
+
1. Open the Web UI:
|
|
50
28
|
|
|
51
29
|
```bash
|
|
52
|
-
|
|
53
|
-
# fill your own provider keys/endpoints/models in .env.test-suite
|
|
30
|
+
llr
|
|
54
31
|
```
|
|
55
32
|
|
|
56
|
-
|
|
33
|
+
2. Add at least one provider and model.
|
|
34
|
+
3. Optionally create aliases and fallback routes.
|
|
35
|
+
4. Start the local gateway:
|
|
57
36
|
|
|
58
37
|
```bash
|
|
59
|
-
|
|
60
|
-
# legacy alias:
|
|
61
|
-
npm run test:provider-smoke
|
|
38
|
+
llr start
|
|
62
39
|
```
|
|
63
40
|
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
- `.env.test-suite` is gitignored and is intended only for local runs.
|
|
67
|
-
- The live suite uses isolated temp HOME/config/runtime-state folders so it does not overwrite your normal `~/.llm-router.json` or `~/.llm-router.runtime.json`.
|
|
68
|
-
- Public contributors should keep using `.env.test-suite.example` as the template and fill their own providers locally.
|
|
69
|
-
|
|
70
|
-
## Main Workflow
|
|
71
|
-
|
|
72
|
-
1. Add providers + models into llm-router (standard API-key providers or OAuth subscription providers)
|
|
73
|
-
2. Optionally, group models as alias with load balancing and auto fallback support
|
|
74
|
-
3. Start llm-router server, point your coding tool API and model to llm-router
|
|
75
|
-
|
|
76
|
-
## What Each Term Means
|
|
77
|
-
|
|
78
|
-
### Provider
|
|
79
|
-
The service endpoint you call (OpenRouter, Anthropic, etc.).
|
|
80
|
-
|
|
81
|
-
### Model
|
|
82
|
-
The actual model ID from that provider.
|
|
41
|
+
5. Point your client or coding tool at the local endpoint.
|
|
83
42
|
|
|
84
|
-
|
|
85
|
-
A request cap for a time window.
|
|
86
|
-
Examples:
|
|
87
|
-
- `40 requests / minute`
|
|
88
|
-
- `20,000 requests / month`
|
|
43
|
+
## Supported Operator Flows
|
|
89
44
|
|
|
90
|
-
|
|
91
|
-
|
|
45
|
+
- CLI: direct operations like `llr config --operation=...`, `llr start`, `llr deploy`, provider diagnostics, and coding-tool routing control
|
|
46
|
+
- Web UI: browser-based config editing, provider probing, and local router control
|
|
92
47
|
|
|
93
|
-
|
|
94
|
-
- `auto` (recommended)
|
|
95
|
-
- `ordered`
|
|
96
|
-
- `round-robin`
|
|
97
|
-
- `weighted-rr`
|
|
98
|
-
- `quota-aware-weighted-rr`
|
|
48
|
+
The legacy TUI flow is no longer part of the supported workflow.
|
|
99
49
|
|
|
100
|
-
|
|
101
|
-
A single model name that auto route/rotate across multiple models.
|
|
50
|
+
## Core Commands
|
|
102
51
|
|
|
103
|
-
|
|
104
|
-
- alias: `opus`
|
|
105
|
-
- targets:
|
|
106
|
-
- `openrouter/claude-opus-4.6`
|
|
107
|
-
- `anthropic/claude-opus-4.6`
|
|
108
|
-
|
|
109
|
-
Your app can use `opus` model and `llm-router` chooses target models based on your routing settings.
|
|
110
|
-
|
|
111
|
-
## Setup using Terminal User Interface (TUI)
|
|
112
|
-
|
|
113
|
-
Open the TUI:
|
|
52
|
+
Open the Web UI:
|
|
114
53
|
|
|
115
54
|
```bash
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
|
|
55
|
+
llr
|
|
56
|
+
llr config
|
|
57
|
+
llr web
|
|
119
58
|
```
|
|
120
59
|
|
|
121
|
-
|
|
122
|
-
|
|
123
|
-
### 1) Add Provider
|
|
124
|
-
Flow:
|
|
125
|
-
1. `Config manager`
|
|
126
|
-
2. `Providers`
|
|
127
|
-
3. `Add or edit`
|
|
128
|
-
4. Choose auth method:
|
|
129
|
-
- `API key` -> endpoint + API key + model list
|
|
130
|
-
- `OAuth` -> browser OAuth + editable model list
|
|
131
|
-
5. For `OAuth`:
|
|
132
|
-
- Choose subscription provider (`ChatGPT` or `Claude Code`)
|
|
133
|
-
- Enter provider name and provider ID
|
|
134
|
-
- Complete browser OAuth login inside this same flow
|
|
135
|
-
- Edit model list (pre-filled defaults; you can add/remove)
|
|
136
|
-
- llm-router live-tests every selected model before save
|
|
137
|
-
6. Save
|
|
138
|
-
|
|
139
|
-
### 1b) Add Subscription Provider (OAuth)
|
|
140
|
-
Commandline examples:
|
|
60
|
+
Run direct config operations:
|
|
141
61
|
|
|
142
62
|
```bash
|
|
143
|
-
|
|
144
|
-
|
|
145
|
-
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
|
|
156
|
-
--subscription-type=claude-code
|
|
63
|
+
llr config --operation=validate
|
|
64
|
+
llr config --operation=snapshot
|
|
65
|
+
llr config --operation=tool-status
|
|
66
|
+
llr config --operation=list
|
|
67
|
+
llr config --operation=discover-provider-models --endpoints=https://openrouter.ai/api/v1 --api-key=sk-...
|
|
68
|
+
llr config --operation=test-provider --endpoints=https://openrouter.ai/api/v1 --api-key=sk-... --models=gpt-4o-mini,gpt-4o
|
|
69
|
+
llr config --operation=upsert-provider --provider-id=openrouter --name=OpenRouter --base-url=https://openrouter.ai/api/v1 --api-key=sk-... --models=gpt-4o-mini,gpt-4o
|
|
70
|
+
llr config --operation=upsert-model-alias --alias-id=chat.default --strategy=auto --targets=openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2
|
|
71
|
+
llr config --operation=set-provider-rate-limits --provider-id=openrouter --bucket-name="Monthly cap" --bucket-models=all --bucket-requests=20000 --bucket-window=month:1
|
|
72
|
+
llr config --operation=set-master-key --generate-master-key=true
|
|
73
|
+
llr config --operation=set-codex-cli-routing --enabled=true --default-model=chat.default
|
|
74
|
+
llr config --operation=set-claude-code-routing --enabled=true --primary-model=chat.default
|
|
75
|
+
llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
|
|
157
76
|
```
|
|
158
77
|
|
|
159
|
-
|
|
160
|
-
- OAuth login is run during provider upsert (browser flow by default).
|
|
161
|
-
- Supported `subscription-type`: `chatgpt-codex` and `claude-code` (defaults to `chatgpt-codex`).
|
|
162
|
-
- Default model lists are prefilled by subscription type, then editable.
|
|
163
|
-
- Device-code login is available for `chatgpt-codex` only.
|
|
164
|
-
- No provider API key or endpoint probe input is required for subscription mode.
|
|
165
|
-
- Compliance notice: provider account/resource usage via `llm-router` may violate a provider's terms. You are solely responsible for compliance; `llm-router` maintainers take no responsibility for misuse.
|
|
166
|
-
|
|
167
|
-
### 2) Configure Model Fallback (Optional)
|
|
168
|
-
Flow:
|
|
169
|
-
1. `Config manager`
|
|
170
|
-
2. `Routing`
|
|
171
|
-
3. `Fallbacks`
|
|
172
|
-
4. Pick main model
|
|
173
|
-
5. Pick fallback models
|
|
174
|
-
6. Save
|
|
175
|
-
|
|
176
|
-
### 3) Configure Rate Limits (Optional)
|
|
177
|
-
Flow:
|
|
178
|
-
1. `Config manager`
|
|
179
|
-
2. `Routing`
|
|
180
|
-
3. `Rate limits`
|
|
181
|
-
4. `Create`
|
|
182
|
-
5. Set name, model scope, request cap, time window
|
|
183
|
-
6. Save
|
|
184
|
-
|
|
185
|
-
### 4) Group Models With Alias (Recommended)
|
|
186
|
-
Flow:
|
|
187
|
-
1. `Config manager`
|
|
188
|
-
2. `Routing`
|
|
189
|
-
3. `Aliases`
|
|
190
|
-
4. Set alias ID (example: `chat.default`)
|
|
191
|
-
5. Select target models
|
|
192
|
-
6. Save
|
|
193
|
-
|
|
194
|
-
### 5) Configure Model Load Balancer
|
|
195
|
-
Flow:
|
|
196
|
-
1. `Config manager`
|
|
197
|
-
2. `Routing`
|
|
198
|
-
3. `Aliases`
|
|
199
|
-
4. Open the alias you want to balance
|
|
200
|
-
5. Choose strategy (`auto` recommended)
|
|
201
|
-
6. Review alias targets
|
|
202
|
-
7. Save
|
|
203
|
-
|
|
204
|
-
### 6) Set Gateway Key
|
|
205
|
-
Flow:
|
|
206
|
-
1. `Config manager`
|
|
207
|
-
2. `Security`
|
|
208
|
-
3. `Master key`
|
|
209
|
-
4. Set or generate key
|
|
210
|
-
5. Save
|
|
211
|
-
|
|
212
|
-
## Setup using Web Console
|
|
213
|
-
|
|
214
|
-
Open the browser-based console:
|
|
78
|
+
Operate the local gateway:
|
|
215
79
|
|
|
216
80
|
```bash
|
|
217
|
-
|
|
218
|
-
|
|
219
|
-
|
|
220
|
-
|
|
221
|
-
|
|
81
|
+
llr start
|
|
82
|
+
llr stop
|
|
83
|
+
llr reclaim
|
|
84
|
+
llr reload
|
|
85
|
+
llr update
|
|
222
86
|
```
|
|
223
87
|
|
|
224
|
-
|
|
88
|
+
Get the agent-oriented setup brief:
|
|
225
89
|
|
|
226
90
|
```bash
|
|
227
|
-
|
|
91
|
+
llr ai-help
|
|
228
92
|
```
|
|
229
93
|
|
|
230
|
-
|
|
231
|
-
- Compact Claude-light localhost UI built with React, shadcn-style primitives, and Tailwind
|
|
232
|
-
- JSON-first config editor with live validation, external file sync, and a first-run quick-start wizard when no providers are configured
|
|
233
|
-
- Quick status cards for config health, managed router state, startup status, and recent activity
|
|
234
|
-
- Sections for:
|
|
235
|
-
- raw config editing with validate / prettify / save / open-in-editor actions
|
|
236
|
-
- provider inventory with per-provider probe actions
|
|
237
|
-
- OS startup enable / disable
|
|
238
|
-
- Start / restart / stop controls for the local router
|
|
239
|
-
- `Open Config File` buttons for detected editors like VS Code, Sublime, Cursor, TextEdit/default app, and other common local editors
|
|
94
|
+
## Web UI
|
|
240
95
|
|
|
241
|
-
|
|
96
|
+
The Web UI is the default operator surface.
|
|
242
97
|
|
|
243
98
|
```bash
|
|
244
|
-
|
|
245
|
-
|
|
99
|
+
llr
|
|
100
|
+
llr web --port=9090
|
|
101
|
+
llr web --open=false
|
|
246
102
|
```
|
|
247
103
|
|
|
248
|
-
|
|
249
|
-
- The web console is localhost-only by default because it exposes live config editing, including secrets.
|
|
250
|
-
- The web console runs as a separate service from the local router. Closing the UI does not stop the router service.
|
|
251
|
-
- `yarn dev` hot-reloads the browser UI and restarts the local router service when router source files change.
|
|
252
|
-
- If the config file contains invalid JSON, validation surfaces the parse error and save/probe/start actions stay guarded until the JSON is repaired.
|
|
253
|
-
- When the web console patches Codex CLI, it writes a generated `model_catalog_json` for both alias bindings and direct managed route refs like `provider/model`, which avoids Codex fallback metadata warnings for managed routes.
|
|
104
|
+
What it covers:
|
|
254
105
|
|
|
255
|
-
|
|
106
|
+
- raw JSON config editing with validation
|
|
107
|
+
- provider discovery and probe flows
|
|
108
|
+
- alias, fallback, rate-limit, and AMP management
|
|
109
|
+
- local router start, stop, and restart
|
|
110
|
+
- coding-tool patch helpers for Codex CLI, Claude Code, and AMP
|
|
256
111
|
|
|
257
|
-
|
|
258
|
-
llm-router start
|
|
259
|
-
```
|
|
112
|
+
The Web UI is localhost-only by default because it can expose secrets and live configuration.
|
|
260
113
|
|
|
261
|
-
|
|
262
|
-
|
|
263
|
-
Local endpoints:
|
|
264
|
-
- Unified: `http://127.0.0.1:8376/route`
|
|
265
|
-
- Anthropic-style: `http://127.0.0.1:8376/anthropic`
|
|
266
|
-
- OpenAI-style: `http://127.0.0.1:8376/openai`
|
|
267
|
-
- OpenAI legacy completions: `http://127.0.0.1:8376/openai/v1/completions`
|
|
268
|
-
- OpenAI Responses-style: `http://127.0.0.1:8376/openai/v1/responses` (Codex CLI-compatible)
|
|
269
|
-
- AMP OpenAI-style: `http://127.0.0.1:8376/api/provider/openai/v1/chat/completions`
|
|
270
|
-
- AMP Anthropic-style: `http://127.0.0.1:8376/api/provider/anthropic/v1/messages`
|
|
271
|
-
- AMP OpenAI Responses-style: `http://127.0.0.1:8376/api/provider/openai/v1/responses`
|
|
272
|
-
|
|
273
|
-
## Connect your coding tool
|
|
274
|
-
|
|
275
|
-
After setting master key, point your app/agent to local endpoint and use that key as auth token.
|
|
276
|
-
|
|
277
|
-
Claude Code example (`~/.claude/settings.local.json`):
|
|
278
|
-
|
|
279
|
-
```json
|
|
280
|
-
{
|
|
281
|
-
"env": {
|
|
282
|
-
"ANTHROPIC_BASE_URL": "http://127.0.0.1:8376",
|
|
283
|
-
"ANTHROPIC_AUTH_TOKEN": "gw_your_gateway_key",
|
|
284
|
-
"ANTHROPIC_DEFAULT_OPUS_MODEL": "provider_name/model_name_1",
|
|
285
|
-
"ANTHROPIC_DEFAULT_SONNET_MODEL": "provider_name/model_name_2",
|
|
286
|
-
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "provider_name/model_name_3"
|
|
287
|
-
}
|
|
288
|
-
}
|
|
289
|
-
```
|
|
114
|
+
## CLI Parity
|
|
290
115
|
|
|
291
|
-
|
|
292
|
-
|
|
293
|
-
|
|
294
|
-
|
|
295
|
-
|
|
296
|
-
|
|
297
|
-
|
|
298
|
-
|
|
299
|
-
|
|
300
|
-
|
|
301
|
-
|
|
302
|
-
|
|
303
|
-
|
|
304
|
-
- `All projects` for your global AMP config
|
|
305
|
-
5. Confirm the local `llm-router` URL and API key
|
|
306
|
-
6. Pick one default route such as `chat.default` or `provider/model`
|
|
307
|
-
7. `Save and exit`
|
|
308
|
-
|
|
309
|
-
That is enough to make AMP send requests to `llm-router`.
|
|
310
|
-
|
|
311
|
-
After that, if you want AMP modes like `smart`, `rush`, `deep`, or `oracle` to use different llm-router aliases/models:
|
|
312
|
-
|
|
313
|
-
1. Open `AMP`
|
|
314
|
-
2. Choose `Common AMP routes`
|
|
315
|
-
3. Pick the AMP route you want to customize
|
|
316
|
-
4. Pick the llm-router alias/model to use
|
|
317
|
-
5. Save
|
|
318
|
-
|
|
319
|
-
The `Advanced` menu is where the older, more detailed AMP controls now live:
|
|
320
|
-
|
|
321
|
-
- upstream / proxy settings
|
|
322
|
-
- legacy model-pattern mappings
|
|
323
|
-
- legacy subagent definitions and mappings
|
|
324
|
-
|
|
325
|
-
Recommended config snippet in `~/.llm-router.json`:
|
|
326
|
-
|
|
327
|
-
```json
|
|
328
|
-
{
|
|
329
|
-
"masterKey": "gw_your_gateway_key",
|
|
330
|
-
"defaultModel": "chat.default",
|
|
331
|
-
"amp": {
|
|
332
|
-
"upstreamUrl": "https://ampcode.com",
|
|
333
|
-
"upstreamApiKey": "amp_upstream_api_key",
|
|
334
|
-
"restrictManagementToLocalhost": true,
|
|
335
|
-
"preset": "builtin",
|
|
336
|
-
"defaultRoute": "chat.default",
|
|
337
|
-
"routes": {
|
|
338
|
-
"smart": "chat.smart",
|
|
339
|
-
"rush": "chat.fast",
|
|
340
|
-
"deep": "chat.deep",
|
|
341
|
-
"oracle": "chat.oracle",
|
|
342
|
-
"librarian": "chat.research",
|
|
343
|
-
"review": "chat.review",
|
|
344
|
-
"@google-gemini-flash-shared": "chat.tools",
|
|
345
|
-
"painter": "image.default"
|
|
346
|
-
},
|
|
347
|
-
"rawModelRoutes": [
|
|
348
|
-
{ "from": "gpt-*-codex*", "to": "chat.deep" }
|
|
349
|
-
],
|
|
350
|
-
"overrides": {
|
|
351
|
-
"entities": [
|
|
352
|
-
{
|
|
353
|
-
"id": "reviewer",
|
|
354
|
-
"type": "feature",
|
|
355
|
-
"match": ["gemini-4-pro*"],
|
|
356
|
-
"route": "chat.review"
|
|
357
|
-
}
|
|
358
|
-
]
|
|
359
|
-
},
|
|
360
|
-
"fallback": {
|
|
361
|
-
"onUnknown": "default-route",
|
|
362
|
-
"onAmbiguous": "default-route",
|
|
363
|
-
"proxyUpstream": true
|
|
364
|
-
}
|
|
365
|
-
}
|
|
366
|
-
}
|
|
116
|
+
The browser UI still gives the best interactive overview, but the CLI now exposes the main management flows an agent needs without relying on private web endpoints.
|
|
117
|
+
|
|
118
|
+
```bash
|
|
119
|
+
llr config --operation=validate
|
|
120
|
+
llr config --operation=snapshot
|
|
121
|
+
llr config --operation=tool-status
|
|
122
|
+
llr reclaim
|
|
123
|
+
llr config --operation=set-codex-cli-routing --enabled=true --default-model=chat.default
|
|
124
|
+
llr config --operation=set-claude-code-routing --enabled=true --primary-model=chat.default --default-haiku-model=chat.fast
|
|
125
|
+
llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
|
|
126
|
+
llr config --operation=set-codex-cli-routing --enabled=false
|
|
127
|
+
llr config --operation=set-claude-code-routing --enabled=false
|
|
128
|
+
llr config --operation=set-amp-client-routing --enabled=false --amp-client-settings-scope=workspace
|
|
367
129
|
```
|
|
368
130
|
|
|
369
131
|
Notes:
|
|
370
|
-
- `amp` is the normalized config key. Input aliases `ampcode` and `amp-code` are also accepted.
|
|
371
|
-
- `amp.routes` is the new main user-facing mapping surface. Keys can be friendly AMP entities like `smart`, `rush`, `oracle`, `review`, `title`, or shared signatures like `@google-gemini-flash-shared`.
|
|
372
|
-
- `amp.defaultRoute` is AMP-specific fallback and is checked before the global `defaultModel`.
|
|
373
|
-
- `amp.rawModelRoutes` is the new-schema escape hatch for raw model-name matching when entity/signature routing is not enough.
|
|
374
|
-
- `amp.overrides` lets users add or update entity/signature detection without editing the built-in preset in code.
|
|
375
|
-
- `amp.preset=builtin` enables the shipped AMP catalog. Set `amp.preset=none` to disable built-in entity/signature detection entirely.
|
|
376
|
-
- Shared signatures exist because some AMP helpers currently share the same observed model family, such as `rush` + `title` on Haiku and `search` + `look-at` + `handoff` on Gemini Flash.
|
|
377
|
-
- AMP model matching now canonicalizes display-style names like `Claude Opus 4.6`, `GPT-5.3 Codex`, and `Gemini 3 Flash` before matching.
|
|
378
|
-
- Legacy AMP fields are still supported for backward compatibility: `amp.modelMappings`, `amp.subagentMappings`, `amp.subagentDefinitions`, and `amp.forceModelMappings`.
|
|
379
|
-
- When any new AMP schema fields are present (`preset`, `defaultRoute`, `routes`, `rawModelRoutes`, `overrides`, `fallback`), the new AMP resolver path is used. Otherwise legacy AMP routing behavior is preserved.
|
|
380
|
-
- Bare AMP model names like `gpt-4o-mini` are matched against configured local `model.id` and `model.aliases` automatically.
|
|
381
|
-
- If no local match is found and `amp.upstreamUrl` is set, `llm-router` proxies the request upstream to AMP.
|
|
382
|
-
- AMP management/auth routes (`/api/auth`, `/threads`, `/docs`, `/settings`, etc.) proxy through the configured AMP upstream and reuse your `masterKey` as the local gateway auth token.
|
|
383
|
-
- AMP Google `/api/provider/google/v1beta/...` requests are translated locally into OpenAI-compatible chat requests, including Gemini model listing, `generateContent`, and `streamGenerateContent`.
|
|
384
|
-
- `llm-router config --operation=set-amp-config` supports both the new AMP schema flags and the legacy AMP flags. The interactive wizard now leads with `Quick setup`, `Default AMP route`, and `Common AMP routes`, while the older mapping controls live under `Advanced`.
|
|
385
|
-
- If the AMP upstream API key is not found in local AMP config/secrets, the wizard tells you to open `https://ampcode.com/settings` and paste the key into `llm-router`.
|
|
386
|
-
- Developer notes and architecture details live in `docs/amp-routing.md`.
|
|
387
|
-
|
|
388
|
-
You can also configure the AMP block non-interactively:
|
|
389
132
|
|
|
390
|
-
|
|
391
|
-
|
|
392
|
-
|
|
393
|
-
|
|
394
|
-
|
|
395
|
-
|
|
396
|
-
|
|
397
|
-
```
|
|
133
|
+
- `validate` checks raw config JSON + schema without opening the Web UI.
|
|
134
|
+
- `snapshot` combines config, runtime, startup, and coding-tool routing state.
|
|
135
|
+
- `tool-status` focuses only on Codex CLI, Claude Code, and AMP client wiring.
|
|
136
|
+
- `reclaim` force-frees the fixed local router port when another listener is blocking `llr start`.
|
|
137
|
+
- `set-codex-cli-routing` accepts `--default-model=<route>` or `--default-model=__codex_cli_inherit__` to keep Codex's own model selection.
|
|
138
|
+
- `set-claude-code-routing` accepts `--primary-model`, `--default-opus-model`, `--default-sonnet-model`, `--default-haiku-model`, `--subagent-model`, and `--thinking-level`.
|
|
139
|
+
- `set-amp-client-routing` patches or restores AMP client settings/secrets separately from router-side AMP config.
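Because every operation listed in these notes follows the same `llr config --operation=<name> --flag=value` shape, an agent script can assemble the argument list programmatically. The sketch below is only an illustration: `buildConfigArgs` is a hypothetical helper, not part of the package, and only the flag spellings are taken from the README.

```javascript
// Hypothetical helper (not shipped with the package): turns an operation name
// plus an options object into the argument list for `llr config`.
function buildConfigArgs(operation, options = {}) {
  const args = ["config", `--operation=${operation}`];
  for (const [flag, value] of Object.entries(options)) {
    // Skip unset options so callers can pass partially filled objects.
    if (value === undefined || value === null) continue;
    args.push(`--${flag}=${String(value)}`);
  }
  return args;
}

// Rebuilds the Claude Code routing command shown in the README.
const claudeArgs = buildConfigArgs("set-claude-code-routing", {
  enabled: true,
  "primary-model": "chat.default",
  "default-haiku-model": "chat.fast",
});
console.log(["llr", ...claudeArgs].join(" "));
```

The resulting array can be handed to Node's `child_process.spawn("llr", claudeArgs)`, which avoids shell quoting issues for values that contain spaces.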
|
|
398
140
|
|
|
399
|
-
|
|
141
|
+
## Providers, Models, and Aliases
|
|
400
142
|
|
|
401
|
-
|
|
402
|
-
|
|
403
|
-
|
|
404
|
-
|
|
405
|
-
--amp-model-mappings="* => rc/gpt-5.3-codex" \
|
|
406
|
-
--amp-subagent-mappings="oracle => rc/gpt-5.3-codex, planner => rc/gpt-5.3-codex"
|
|
407
|
-
```
|
|
143
|
+
- Provider: one upstream service such as OpenRouter or Anthropic
|
|
144
|
+
- Model: one upstream model id exposed by that provider
|
|
145
|
+
- Alias: one stable route name that can fan out to multiple provider/model targets
|
|
146
|
+
- Rate-limit bucket: request cap scoped to one or more models over a time window
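To make the rate-limit bucket definition concrete, here is a minimal fixed-window counter. It is a sketch under assumptions: `createBucket` and `tryConsume` are hypothetical names, the window is given in milliseconds rather than the `month:1` syntax used by the CLI, and the package's own accounting may work differently.

```javascript
// Hypothetical fixed-window request counter (illustration only).
// `requests` is the cap and `windowMs` the window length in milliseconds.
function createBucket({ requests, windowMs }) {
  let windowStart = null; // timestamp at which the current window opened
  let used = 0; // requests counted in the current window
  return {
    // Returns true and counts the request if the cap has not been reached.
    tryConsume(now = Date.now()) {
      if (windowStart === null || now - windowStart >= windowMs) {
        // The previous window expired (or none exists yet): start a new one.
        windowStart = now;
        used = 0;
      }
      if (used >= requests) return false;
      used += 1;
      return true;
    },
  };
}

// Example cap: 40 requests per minute.
const perMinute = createBucket({ requests: 40, windowMs: 60_000 });
console.log(perMinute.tryConsume()); // first request in a fresh window is allowed
```

A router would typically skip a target whose bucket refuses the request and try the next alias target or fallback instead.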
|
|
408
147
|
|
|
409
|
-
|
|
148
|
+
Recommended pattern:
|
|
410
149
|
|
|
411
|
-
|
|
412
|
-
|
|
413
|
-
|
|
150
|
+
1. Add providers with direct model lists.
|
|
151
|
+
2. Create aliases for stable client-facing route names.
|
|
152
|
+
3. Put balancing/fallback behavior behind the alias, not in the client.
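The pattern above can be illustrated with the `--targets` string from the `upsert-model-alias` example. This sketch assumes that the `@N` suffix is a relative weight and uses a simple weighted rotation; `parseTargets` and `createWeightedPicker` are hypothetical names, and the package's `auto` strategy is not claimed to work this way.

```javascript
// Parses "provider/model@weight" entries (assumed meaning of the @N suffix).
function parseTargets(spec) {
  return spec.split(",").map((entry) => {
    const trimmed = entry.trim();
    const at = trimmed.lastIndexOf("@");
    const ref = at === -1 ? trimmed : trimmed.slice(0, at);
    const weight = at === -1 ? 1 : Number(trimmed.slice(at + 1));
    // Split on the first slash so model ids that contain slashes stay intact.
    const slash = ref.indexOf("/");
    return { provider: ref.slice(0, slash), model: ref.slice(slash + 1), weight };
  });
}

// Simple weighted rotation: each target receives `weight` consecutive slots
// per cycle. Assumes positive integer weights.
function createWeightedPicker(targets) {
  const total = targets.reduce((sum, target) => sum + target.weight, 0);
  let cursor = 0;
  return () => {
    let slot = cursor % total;
    cursor += 1;
    for (const target of targets) {
      if (slot < target.weight) return target;
      slot -= target.weight;
    }
  };
}

const aliasTargets = parseTargets("openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2");
const pickTarget = createWeightedPicker(aliasTargets);
for (let i = 0; i < 5; i += 1) {
  const target = pickTarget();
  console.log(`${target.provider}/${target.model}`);
}
```

The client keeps sending `chat.default`; only the alias definition changes when targets or weights are adjusted.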
|
|
414
153
|
|
|
415
|
-
|
|
154
|
+
## Subscription Providers
|
|
155
|
+
|
|
156
|
+
OAuth-backed subscription providers are supported.
|
|
416
157
|
|
|
417
158
|
```bash
|
|
418
|
-
|
|
419
|
-
|
|
420
|
-
|
|
421
|
-
|
|
159
|
+
llr config --operation=upsert-provider --provider-id=chatgpt --name="GPT Sub" --type=subscription --subscription-type=chatgpt-codex --subscription-profile=default
|
|
160
|
+
llr config --operation=upsert-provider --provider-id=claude-sub --name="Claude Sub" --type=subscription --subscription-type=claude-code --subscription-profile=default
|
|
161
|
+
llr subscription login --subscription-type=chatgpt-codex --profile=default
|
|
162
|
+
llr subscription login --subscription-type=claude-code --profile=default
|
|
163
|
+
llr subscription status
|
|
422
164
|
```
|
|
423
165
|
|
|
424
|
-
|
|
166
|
+
Supported `subscription-type` values:
|
|
167
|
+
|
|
168
|
+
- `chatgpt-codex`
|
|
169
|
+
- `claude-code`
|
|
425
170
|
|
|
426
|
-
|
|
427
|
-
- sets `amp.preset=builtin`
|
|
428
|
-
- sets `amp.defaultRoute` to your current `defaultModel` (or the first configured provider/model)
|
|
429
|
-
- enables `amp.restrictManagementToLocalhost=true`
|
|
430
|
-
- auto-discovers `amp.upstreamApiKey` for `https://ampcode.com` from AMP secrets when available
|
|
171
|
+
Compliance note: using provider resources through LLM Router may violate a provider's terms. You are responsible for that usage.
|
|
431
172
|
|
|
432
|
-
|
|
173
|
+
## AMP
|
|
433
174
|
|
|
434
|
-
|
|
175
|
+
LLM Router can front AMP-compatible routes locally and optionally proxy unresolved AMP traffic upstream.
|
|
176
|
+
|
|
177
|
+
Open the Web UI for AMP setup, or use direct CLI operations:
|
|
435
178
|
|
|
436
179
|
```bash
|
|
437
|
-
|
|
438
|
-
|
|
439
|
-
|
|
180
|
+
llr config --operation=set-amp-config --patch-amp-client-config=true --amp-client-settings-scope=workspace --amp-client-url=http://127.0.0.1:4000
|
|
181
|
+
llr config --operation=set-amp-config --amp-default-route=chat.default --amp-routes="smart => chat.smart, rush => chat.fast"
|
|
182
|
+
llr config --operation=set-amp-config --amp-upstream-url=https://ampcode.com --amp-upstream-api-key=amp_...
|
|
183
|
+
llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
|
|
440
184
|
```
|
|
441
185
|
|
|
442
|
-
|
|
443
|
-
- global settings: `~/.config/amp/settings.json`
|
|
444
|
-
- workspace settings: `.amp/settings.json`
|
|
445
|
-
- secrets: `~/.local/share/amp/secrets.json`
|
|
186
|
+
## Local Real-Provider Suite
|
|
446
187
|
|
|
447
|
-
|
|
448
|
-
- `amp.url` in `settings.json`
|
|
449
|
-
- `apiKey@<endpoint-url>` in `secrets.json`
|
|
188
|
+
The repo includes a local-only real-provider suite for the supported operator surfaces:
|
|
450
189
|
|
|
451
|
-
|
|
190
|
+
- CLI config + local gateway start
|
|
191
|
+
- Web UI discovery / probe / save / router control
|
|
452
192
|
|
|
453
|
-
|
|
193
|
+
Setup:
|
|
454
194
|
|
|
455
195
|
```bash
|
|
456
|
-
|
|
196
|
+
cp .env.test-suite.example .env.test-suite
|
|
457
197
|
```
|
|
458
198
|
|
|
459
|
-
|
|
460
|
-
|
|
461
|
-
Key artifacts in the output directory:
|
|
199
|
+
Then fill in your own provider keys, endpoints, and models.
|
|
462
200
|
|
|
463
|
-
|
|
464
|
-
- `observed-models.json`: unique live AMP model labels grouped by case with resolver checks
|
|
465
|
-
- `summary.json`: top-level smoke results plus observed-model summary
|
|
201
|
+
Run:
|
|
466
202
|
|
|
467
|
-
|
|
203
|
+
```bash
|
|
204
|
+
npm run test:provider-live
|
|
205
|
+
```
|
|
468
206
|
|
|
469
|
-
|
|
207
|
+
Legacy alias:
|
|
470
208
|
|
|
471
|
-
```
|
|
472
|
-
|
|
473
|
-
"amp.url": "http://127.0.0.1:8376"
|
|
474
|
-
}
|
|
209
|
+
```bash
|
|
210
|
+
npm run test:provider-smoke
|
|
475
211
|
```
|
|
476
212
|
|
|
477
|
-
|
|
213
|
+
The live suite uses isolated temp HOME/config/runtime-state folders and does not overwrite your normal `~/.llm-router.json` or `~/.llm-router.runtime.json`.
|
|
478
214
|
|
|
479
|
-
|
|
480
|
-
{
|
|
481
|
-
"apiKey@http://127.0.0.1:8376": "gw_your_gateway_key"
|
|
482
|
-
}
|
|
483
|
-
```
|
|
215
|
+
## Deploy to Cloudflare
|
|
484
216
|
|
|
485
|
-
|
|
217
|
+
Deploy the current config to a Worker:
|
|
486
218
|
|
|
487
219
|
```bash
|
|
488
|
-
|
|
489
|
-
|
|
220
|
+
llr deploy
|
|
221
|
+
llr deploy --dry-run=true
|
|
222
|
+
llr deploy --workers-dev=true
|
|
223
|
+
llr deploy --route-pattern=router.example.com/* --zone-name=example.com
|
|
224
|
+
llr deploy --generate-master-key=true
|
|
490
225
|
```
|
|
491
226
|
|
|
492
|
-
|
|
227
|
+
Fast worker key rotation:
|
|
493
228
|
|
|
494
|
-
|
|
495
|
-
-
|
|
496
|
-
-
|
|
497
|
-
|
|
498
|
-
- the running proxy updates instantly
|
|
229
|
+
```bash
|
|
230
|
+
llr worker-key --generate-master-key=true
|
|
231
|
+
llr worker-key --env=production --master-key=rotated-key
|
|
232
|
+
```
|
|
499
233
|
|
|
500
|
-
|
|
234
|
+
## Config File
|
|
501
235
|
|
|
502
|
-
|
|
236
|
+
Local config path:
|
|
503
237
|
|
|
504
|
-
|
|
238
|
+
```text
|
|
239
|
+
~/.llm-router.json
|
|
240
|
+
```
|
|
505
241
|
|
|
506
|
-
|
|
242
|
+
LLM Router also keeps related runtime and token state under the same namespace for backward compatibility with the published package.
|
|
507
243
|
|
|
508
|
-
|
|
244
|
+
Useful runtime env knobs:
|
|
509
245
|
|
|
510
|
-
|
|
511
|
-
|
|
512
|
-
```
|
|
246
|
+
- `LLM_ROUTER_MAX_REQUEST_BODY_BYTES`: caps inbound JSON body size for the local router and worker runtime. Default is `8 MiB` for `/responses` requests and `1 MiB` for other JSON endpoints.
|
|
247
|
+
- `LLM_ROUTER_UPSTREAM_TIMEOUT_MS`: overrides the provider request timeout.
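As a rough illustration of how the two knobs above could be resolved, the sketch below applies the documented defaults (`8 MiB` for `/responses`, `1 MiB` otherwise). The function names, the path-suffix check, and the timeout fallback parameter are assumptions; the README does not state the default timeout or how request paths are matched.

```javascript
const MIB = 1024 * 1024;

// Hypothetical resolver for the inbound JSON body cap.
function resolveMaxBodyBytes(env, pathname) {
  const override = Number(env.LLM_ROUTER_MAX_REQUEST_BODY_BYTES);
  // A positive numeric override wins; unset or invalid values fall through.
  if (Number.isFinite(override) && override > 0) return override;
  // Assumed matching rule: any path ending in /responses gets the larger cap.
  return pathname.endsWith("/responses") ? 8 * MIB : 1 * MIB;
}

// Hypothetical resolver for the provider request timeout. The default is not
// documented in the README, so the caller supplies `fallbackMs`.
function resolveUpstreamTimeoutMs(env, fallbackMs) {
  const override = Number(env.LLM_ROUTER_UPSTREAM_TIMEOUT_MS);
  return Number.isFinite(override) && override > 0 ? override : fallbackMs;
}

console.log(resolveMaxBodyBytes(process.env, "/openai/v1/responses"));
console.log(resolveUpstreamTimeoutMs(process.env, 60_000));
```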
|
|
513
248
|
|
|
514
|
-
|
|
249
|
+
## Development
|
|
515
250
|
|
|
516
|
-
|
|
517
|
-
- `LLM_ROUTER_STATE_BACKEND=file` is ignored on Worker (auto-fallback to in-memory state).
|
|
518
|
-
- Stateful timing-dependent routing features (cursor balancing, local quota counters, cooldown persistence) are auto-disabled by default to keep route flow safe across Worker isolates.
|
|
519
|
-
- To opt in to best-effort stateful behavior on Worker, set `LLM_ROUTER_WORKER_ALLOW_BEST_EFFORT_STATEFUL_ROUTING=true`.
|
|
251
|
+
Web UI dev loop:
|
|
520
252
|
|
|
521
|
-
|
|
253
|
+
```bash
|
|
254
|
+
npm run dev
|
|
255
|
+
```
|
|
522
256
|
|
|
523
|
-
|
|
257
|
+
Build the browser bundle:
|
|
524
258
|
|
|
525
|
-
|
|
259
|
+
```bash
|
|
260
|
+
npm run build:web-console
|
|
261
|
+
```
|
|
526
262
|
|
|
527
|
-
|
|
263
|
+
Run the JavaScript test suite:
|
|
528
264
|
|
|
529
|
-
|
|
265
|
+
```bash
|
|
266
|
+
node --test $(rg --files -g "*.test.js" src)
|
|
267
|
+
```
|
|
530
268
|
|
|
531
|
-
##
|
|
269
|
+
## Security and Releases
|
|
532
270
|
|
|
533
|
-
-
|
|
271
|
+
- Security: [`SECURITY.md`](https://github.com/khanglvm/llm-router/blob/master/SECURITY.md)
|
|
534
272
|
- Release notes: [`CHANGELOG.md`](https://github.com/khanglvm/llm-router/blob/master/CHANGELOG.md)
|
|
535
|
-
- Prereleases are published with explicit beta versions such as `2.0.0-beta.1`; pin them intentionally instead of treating them as stable upgrades.
|