@khanglvm/llm-router 2.0.0-beta.1 → 2.0.0

Files changed (39)
  1. package/CHANGELOG.md +27 -0
  2. package/README.md +163 -426
  3. package/package.json +3 -3
  4. package/src/cli/router-module.js +2773 -2587
  5. package/src/cli-entry.js +32 -103
  6. package/src/node/activity-log.js +119 -0
  7. package/src/node/coding-tool-config.js +85 -11
  8. package/src/node/config-workflows.js +51 -12
  9. package/src/node/instance-state.js +1 -1
  10. package/src/node/litellm-context-catalog.js +184 -0
  11. package/src/node/local-server.js +23 -3
  12. package/src/node/port-reclaim.js +2 -2
  13. package/src/node/start-command.js +22 -22
  14. package/src/node/startup-manager.js +3 -3
  15. package/src/node/web-command.js +1 -1
  16. package/src/node/web-console-assets.js +1 -1
  17. package/src/node/web-console-client.js +34 -29
  18. package/src/node/web-console-server.js +420 -38
  19. package/src/node/web-console-styles.generated.js +1 -1
  20. package/src/node/web-console-ui/buffered-text-input.js +133 -0
  21. package/src/node/web-console-ui/config-editor-utils.js +57 -4
  22. package/src/node/web-console-ui/dropdown-placement.js +153 -0
  23. package/src/node/web-console-ui/select-search-utils.js +6 -0
  24. package/src/node/web-console-ui/transient-integer-input-utils.js +12 -0
  25. package/src/runtime/balancer.js +78 -1
  26. package/src/runtime/codex-request-transformer.js +16 -7
  27. package/src/runtime/config.js +448 -12
  28. package/src/runtime/handler/amp-response.js +5 -3
  29. package/src/runtime/handler/amp-web-search.js +2232 -0
  30. package/src/runtime/handler/fallback.js +30 -2
  31. package/src/runtime/handler/provider-call.js +353 -36
  32. package/src/runtime/handler/provider-translation.js +14 -0
  33. package/src/runtime/handler/request.js +128 -2
  34. package/src/runtime/handler/route-debug.js +36 -0
  35. package/src/runtime/handler.js +210 -20
  36. package/src/runtime/subscription-provider.js +1 -1
  37. package/src/shared/coding-tool-bindings.js +49 -0
  38. package/src/shared/local-router-defaults.js +62 -0
  39. package/src/translator/request/claude-to-openai.js +43 -0
package/README.md CHANGED
@@ -1,535 +1,272 @@
1
- # llm-router
1
+ # LLM Router
2
2
 
3
- ## Main Features
3
+ LLM Router is a local and Cloudflare-deployable gateway for routing one client endpoint across multiple LLM providers, models, aliases, fallbacks, and rate limits.
4
4
 
5
- 1. Single endpoint, unified providers & models
6
- 2. Support grouping models with rate-limit and load balancing strategy
7
- 3. Configuration auto reload in real time, no interruption
8
-
9
- ## Beta Notice
10
-
11
- `2.0.0-beta.1` is the current public prerelease. It includes major AMP routing, web console, and local operator workflow changes, so treat it as beta and expect rough edges while validating it before a stable `2.0.0` release.
12
-
13
- Short highlights in this beta:
14
- - New localhost web console for config editing, provider testing, and router lifecycle control
15
- - Quick patching for AMP Code, Codex CLI and Claude Code
16
- - Expanded operator workflows across CLI, TUI, OAuth subscription setup, and live provider validation
17
- - Fixed various format-transformation issues
18
-
19
- ## Install
20
-
21
- Stable channel:
5
+ The npm package name stays the same:
22
6
 
23
7
  ```bash
24
- npm i -g @khanglvm/llm-router@latest
8
+ @khanglvm/llm-router
25
9
  ```
26
10
 
27
- Beta preview:
11
+ The primary CLI command is now:
28
12
 
29
13
  ```bash
30
- npm i -g @khanglvm/llm-router@2.0.0-beta.1
14
+ llr
31
15
  ```
32
16
 
33
- ## Usage
17
+ `2.0.0` is the current public release. It includes the Web UI, AMP routing, and coding-tool integrations introduced in the 2.x line.
34
18
 
35
- Copy/paste this short instruction to your AI agent:
19
+ ## Install
36
20
 
37
- ```text
38
- Run `llm-router ai-help` first, then set up and operate llm-router for me using CLI commands.
21
+ ```bash
22
+ npm i -g @khanglvm/llm-router@latest
39
23
  ```
40
24
 
41
- ## Local Real-Provider Test Suite
42
-
43
- The repo now includes a local-only live provider suite that covers all three operator surfaces:
25
+ ## Quick Start
44
26
 
45
- - CLI config + `start`
46
- - TUI config menus
47
- - Web console provider discovery/test + browser bundle render
48
-
49
- Setup:
27
+ 1. Open the Web UI:
50
28
 
51
29
  ```bash
52
- cp .env.test-suite.example .env.test-suite
53
- # fill your own provider keys/endpoints/models in .env.test-suite
30
+ llr
54
31
  ```
55
32
 
56
- Run it:
33
+ 2. Add at least one provider and model.
34
+ 3. Optionally create aliases and fallback routes.
35
+ 4. Start the local gateway:
57
36
 
58
37
  ```bash
59
- npm run test:provider-live
60
- # legacy alias:
61
- npm run test:provider-smoke
38
+ llr start
62
39
  ```
63
40
 
64
- Notes:
65
-
66
- - `.env.test-suite` is gitignored and is intended only for local runs.
67
- - The live suite uses isolated temp HOME/config/runtime-state folders so it does not overwrite your normal `~/.llm-router.json` or `~/.llm-router.runtime.json`.
68
- - Public contributors should keep using `.env.test-suite.example` as the template and fill their own providers locally.
69
-
70
- ## Main Workflow
71
-
72
- 1. Add providers + models into llm-router (standard API-key providers or OAuth subscription providers)
73
- 2. Optionally, group models as alias with load balancing and auto fallback support
74
- 3. Start llm-router server, point your coding tool API and model to llm-router
75
-
76
- ## What Each Term Means
77
-
78
- ### Provider
79
- The service endpoint you call (OpenRouter, Anthropic, etc.).
80
-
81
- ### Model
82
- The actual model ID from that provider.
41
+ 5. Point your client or coding tool at the local endpoint.
83
42
 
84
- ### Rate-Limit Bucket
85
- A request cap for a time window.
86
- Examples:
87
- - `40 requests / minute`
88
- - `20,000 requests / month`
43
+ ## Supported Operator Flows
89
44
 
90
- ### Model Load Balancer
91
- Decides how traffic is distributed across models in an alias group.
45
+ - CLI: direct operations like `llr config --operation=...`, `llr start`, `llr deploy`, provider diagnostics, and coding-tool routing control
46
+ - Web UI: browser-based config editing, provider probing, and local router control
92
47
 
93
- Available strategies:
94
- - `auto` (recommended)
95
- - `ordered`
96
- - `round-robin`
97
- - `weighted-rr`
98
- - `quota-aware-weighted-rr`
48
+ The legacy TUI flow is no longer part of the supported workflow.
99
49
 
100
- ### Model Alias (Group models)
101
- A single model name that auto route/rotate across multiple models.
50
+ ## Core Commands
102
51
 
103
- Example:
104
- - alias: `opus`
105
- - targets:
106
- - `openrouter/claude-opus-4.6`
107
- - `anthropic/claude-opus-4.6`
108
-
109
- Your app can use `opus` model and `llm-router` chooses target models based on your routing settings.
110
-
111
- ## Setup using Terminal User Interface (TUI)
112
-
113
- Open the TUI:
52
+ Open the Web UI:
114
53
 
115
54
  ```bash
116
- llm-router --tui
117
- # or
118
- llm-router config --tui
55
+ llr
56
+ llr config
57
+ llr web
119
58
  ```
120
59
 
121
- Then follow this order.
122
-
123
- ### 1) Add Provider
124
- Flow:
125
- 1. `Config manager`
126
- 2. `Providers`
127
- 3. `Add or edit`
128
- 4. Choose auth method:
129
- - `API key` -> endpoint + API key + model list
130
- - `OAuth` -> browser OAuth + editable model list
131
- 5. For `OAuth`:
132
- - Choose subscription provider (`ChatGPT` or `Claude Code`)
133
- - Enter provider name and provider ID
134
- - Complete browser OAuth login inside this same flow
135
- - Edit model list (pre-filled defaults; you can add/remove)
136
- - llm-router live-tests every selected model before save
137
- 6. Save
138
-
139
- ### 1b) Add Subscription Provider (OAuth)
140
- Commandline examples:
60
+ Run direct config operations:
141
61
 
142
62
  ```bash
143
- # ChatGPT Codex subscription
144
- llm-router config \
145
- --operation=upsert-provider \
146
- --provider-id=chatgpt \
147
- --name="GPT Sub" \
148
- --type=subscription
149
-
150
- # Claude Code subscription
151
- llm-router config \
152
- --operation=upsert-provider \
153
- --provider-id=claude-sub \
154
- --name="Claude Sub" \
155
- --type=subscription \
156
- --subscription-type=claude-code
63
+ llr config --operation=validate
64
+ llr config --operation=snapshot
65
+ llr config --operation=tool-status
66
+ llr config --operation=list
67
+ llr config --operation=discover-provider-models --endpoints=https://openrouter.ai/api/v1 --api-key=sk-...
68
+ llr config --operation=test-provider --endpoints=https://openrouter.ai/api/v1 --api-key=sk-... --models=gpt-4o-mini,gpt-4o
69
+ llr config --operation=upsert-provider --provider-id=openrouter --name=OpenRouter --base-url=https://openrouter.ai/api/v1 --api-key=sk-... --models=gpt-4o-mini,gpt-4o
70
+ llr config --operation=upsert-model-alias --alias-id=chat.default --strategy=auto --targets=openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2
71
+ llr config --operation=set-provider-rate-limits --provider-id=openrouter --bucket-name="Monthly cap" --bucket-models=all --bucket-requests=20000 --bucket-window=month:1
72
+ llr config --operation=set-master-key --generate-master-key=true
73
+ llr config --operation=set-codex-cli-routing --enabled=true --default-model=chat.default
74
+ llr config --operation=set-claude-code-routing --enabled=true --primary-model=chat.default
75
+ llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
157
76
  ```
158
77
 
159
- Notes:
160
- - OAuth login is run during provider upsert (browser flow by default).
161
- - Supported `subscription-type`: `chatgpt-codex` and `claude-code` (defaults to `chatgpt-codex`).
162
- - Default model lists are prefilled by subscription type, then editable.
163
- - Device-code login is available for `chatgpt-codex` only.
164
- - No provider API key or endpoint probe input is required for subscription mode.
165
- - Compliance notice: provider account/resource usage via `llm-router` may violate a provider's terms. You are solely responsible for compliance; `llm-router` maintainers take no responsibility for misuse.
166
-
167
- ### 2) Configure Model Fallback (Optional)
168
- Flow:
169
- 1. `Config manager`
170
- 2. `Routing`
171
- 3. `Fallbacks`
172
- 4. Pick main model
173
- 5. Pick fallback models
174
- 6. Save
175
-
176
- ### 3) Configure Rate Limits (Optional)
177
- Flow:
178
- 1. `Config manager`
179
- 2. `Routing`
180
- 3. `Rate limits`
181
- 4. `Create`
182
- 5. Set name, model scope, request cap, time window
183
- 6. Save
184
-
185
- ### 4) Group Models With Alias (Recommended)
186
- Flow:
187
- 1. `Config manager`
188
- 2. `Routing`
189
- 3. `Aliases`
190
- 4. Set alias ID (example: `chat.default`)
191
- 5. Select target models
192
- 6. Save
193
-
194
- ### 5) Configure Model Load Balancer
195
- Flow:
196
- 1. `Config manager`
197
- 2. `Routing`
198
- 3. `Aliases`
199
- 4. Open the alias you want to balance
200
- 5. Choose strategy (`auto` recommended)
201
- 6. Review alias targets
202
- 7. Save
203
-
204
- ### 6) Set Gateway Key
205
- Flow:
206
- 1. `Config manager`
207
- 2. `Security`
208
- 3. `Master key`
209
- 4. Set or generate key
210
- 5. Save
211
-
212
- ## Setup using Web Console
213
-
214
- Open the browser-based console:
78
+ Operate the local gateway:
215
79
 
216
80
  ```bash
217
- llm-router
218
- # or
219
- llm-router config
220
- # explicit alias
221
- llm-router web
81
+ llr start
82
+ llr stop
83
+ llr reclaim
84
+ llr reload
85
+ llr update
222
86
  ```
223
87
 
224
- Local contributor development workflow:
88
+ Get the agent-oriented setup brief:
225
89
 
226
90
  ```bash
227
- yarn dev
91
+ llr ai-help
228
92
  ```
229
93
 
230
- What you get:
231
- - Compact Claude-light localhost UI built with React, shadcn-style primitives, and Tailwind
232
- - JSON-first config editor with live validation, external file sync, and a first-run quick-start wizard when no providers are configured
233
- - Quick status cards for config health, managed router state, startup status, and recent activity
234
- - Sections for:
235
- - raw config editing with validate / prettify / save / open-in-editor actions
236
- - provider inventory with per-provider probe actions
237
- - OS startup enable / disable
238
- - Start / restart / stop controls for the local router
239
- - `Open Config File` buttons for detected editors like VS Code, Sublime, Cursor, TextEdit/default app, and other common local editors
94
+ ## Web UI
240
95
 
241
- Useful flags:
96
+ The Web UI is the default operator surface.
242
97
 
243
98
  ```bash
244
- llm-router web --port=9090
245
- llm-router web --open=false
99
+ llr
100
+ llr web --port=9090
101
+ llr web --open=false
246
102
  ```
247
103
 
248
- Notes:
249
- - The web console is localhost-only by default because it exposes live config editing, including secrets.
250
- - The web console runs as a separate service from the local router. Closing the UI does not stop the router service.
251
- - `yarn dev` hot-reloads the browser UI and restarts the local router service when router source files change.
252
- - If the config file contains invalid JSON, validation surfaces the parse error and save/probe/start actions stay guarded until the JSON is repaired.
253
- - When the web console patches Codex CLI, it writes a generated `model_catalog_json` for both alias bindings and direct managed route refs like `provider/model`, which avoids Codex fallback metadata warnings for managed routes.
104
+ What it covers:
254
105
 
255
- ## Start Local Server
106
+ - raw JSON config editing with validation
107
+ - provider discovery and probe flows
108
+ - alias, fallback, rate-limit, and AMP management
109
+ - local router start, stop, and restart
110
+ - coding-tool patch helpers for Codex CLI, Claude Code, and AMP
256
111
 
257
- ```bash
258
- llm-router start
259
- ```
112
+ The Web UI is localhost-only by default because it can expose secrets and live configuration.
260
113
 
261
- The local router endpoint is fixed to `http://127.0.0.1:8376`.
262
-
263
- Local endpoints:
264
- - Unified: `http://127.0.0.1:8376/route`
265
- - Anthropic-style: `http://127.0.0.1:8376/anthropic`
266
- - OpenAI-style: `http://127.0.0.1:8376/openai`
267
- - OpenAI legacy completions: `http://127.0.0.1:8376/openai/v1/completions`
268
- - OpenAI Responses-style: `http://127.0.0.1:8376/openai/v1/responses` (Codex CLI-compatible)
269
- - AMP OpenAI-style: `http://127.0.0.1:8376/api/provider/openai/v1/chat/completions`
270
- - AMP Anthropic-style: `http://127.0.0.1:8376/api/provider/anthropic/v1/messages`
271
- - AMP OpenAI Responses-style: `http://127.0.0.1:8376/api/provider/openai/v1/responses`
272
-
273
- ## Connect your coding tool
274
-
275
- After setting master key, point your app/agent to local endpoint and use that key as auth token.
276
-
277
- Claude Code example (`~/.claude/settings.local.json`):
278
-
279
- ```json
280
- {
281
- "env": {
282
- "ANTHROPIC_BASE_URL": "http://127.0.0.1:8376",
283
- "ANTHROPIC_AUTH_TOKEN": "gw_your_gateway_key",
284
- "ANTHROPIC_DEFAULT_OPUS_MODEL": "provider_name/model_name_1",
285
- "ANTHROPIC_DEFAULT_SONNET_MODEL": "provider_name/model_name_2",
286
- "ANTHROPIC_DEFAULT_HAIKU_MODEL": "provider_name/model_name_3"
287
- }
288
- }
289
- ```
114
+ ## CLI Parity
290
115
 
291
- ## AMP CLI / AMP Code
292
-
293
- `llm-router` can now accept AMP provider-path requests and route them into your configured local models.
294
-
295
- ### Quick AMP setup in the TUI
296
-
297
- Recommended flow for non-expert users:
298
-
299
- 1. Run `llm-router`
300
- 2. Open `AMP`
301
- 3. Choose `Quick setup`
302
- 4. Pick where AMP should be patched:
303
- - `This workspace` for only the current repo
304
- - `All projects` for your global AMP config
305
- 5. Confirm the local `llm-router` URL and API key
306
- 6. Pick one default route such as `chat.default` or `provider/model`
307
- 7. `Save and exit`
308
-
309
- That is enough to make AMP send requests to `llm-router`.
310
-
311
- After that, if you want AMP modes like `smart`, `rush`, `deep`, or `oracle` to use different llm-router aliases/models:
312
-
313
- 1. Open `AMP`
314
- 2. Choose `Common AMP routes`
315
- 3. Pick the AMP route you want to customize
316
- 4. Pick the llm-router alias/model to use
317
- 5. Save
318
-
319
- The `Advanced` menu is where the older, more detailed AMP controls now live:
320
-
321
- - upstream / proxy settings
322
- - legacy model-pattern mappings
323
- - legacy subagent definitions and mappings
324
-
325
- Recommended config snippet in `~/.llm-router.json`:
326
-
327
- ```json
328
- {
329
- "masterKey": "gw_your_gateway_key",
330
- "defaultModel": "chat.default",
331
- "amp": {
332
- "upstreamUrl": "https://ampcode.com",
333
- "upstreamApiKey": "amp_upstream_api_key",
334
- "restrictManagementToLocalhost": true,
335
- "preset": "builtin",
336
- "defaultRoute": "chat.default",
337
- "routes": {
338
- "smart": "chat.smart",
339
- "rush": "chat.fast",
340
- "deep": "chat.deep",
341
- "oracle": "chat.oracle",
342
- "librarian": "chat.research",
343
- "review": "chat.review",
344
- "@google-gemini-flash-shared": "chat.tools",
345
- "painter": "image.default"
346
- },
347
- "rawModelRoutes": [
348
- { "from": "gpt-*-codex*", "to": "chat.deep" }
349
- ],
350
- "overrides": {
351
- "entities": [
352
- {
353
- "id": "reviewer",
354
- "type": "feature",
355
- "match": ["gemini-4-pro*"],
356
- "route": "chat.review"
357
- }
358
- ]
359
- },
360
- "fallback": {
361
- "onUnknown": "default-route",
362
- "onAmbiguous": "default-route",
363
- "proxyUpstream": true
364
- }
365
- }
366
- }
116
+ The browser UI still gives the best interactive overview, but the CLI now exposes the main management flows an agent needs without relying on private web endpoints.
117
+
118
+ ```bash
119
+ llr config --operation=validate
120
+ llr config --operation=snapshot
121
+ llr config --operation=tool-status
122
+ llr reclaim
123
+ llr config --operation=set-codex-cli-routing --enabled=true --default-model=chat.default
124
+ llr config --operation=set-claude-code-routing --enabled=true --primary-model=chat.default --default-haiku-model=chat.fast
125
+ llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
126
+ llr config --operation=set-codex-cli-routing --enabled=false
127
+ llr config --operation=set-claude-code-routing --enabled=false
128
+ llr config --operation=set-amp-client-routing --enabled=false --amp-client-settings-scope=workspace
367
129
  ```
368
130
 
369
131
  Notes:
370
- - `amp` is the normalized config key. Input aliases `ampcode` and `amp-code` are also accepted.
371
- - `amp.routes` is the new main user-facing mapping surface. Keys can be friendly AMP entities like `smart`, `rush`, `oracle`, `review`, `title`, or shared signatures like `@google-gemini-flash-shared`.
372
- - `amp.defaultRoute` is AMP-specific fallback and is checked before the global `defaultModel`.
373
- - `amp.rawModelRoutes` is the new-schema escape hatch for raw model-name matching when entity/signature routing is not enough.
374
- - `amp.overrides` lets users add or update entity/signature detection without editing the built-in preset in code.
375
- - `amp.preset=builtin` enables the shipped AMP catalog. Set `amp.preset=none` to disable built-in entity/signature detection entirely.
376
- - Shared signatures exist because some AMP helpers currently share the same observed model family, such as `rush` + `title` on Haiku and `search` + `look-at` + `handoff` on Gemini Flash.
377
- - AMP model matching now canonicalizes display-style names like `Claude Opus 4.6`, `GPT-5.3 Codex`, and `Gemini 3 Flash` before matching.
378
- - Legacy AMP fields are still supported for backward compatibility: `amp.modelMappings`, `amp.subagentMappings`, `amp.subagentDefinitions`, and `amp.forceModelMappings`.
379
- - When any new AMP schema fields are present (`preset`, `defaultRoute`, `routes`, `rawModelRoutes`, `overrides`, `fallback`), the new AMP resolver path is used. Otherwise legacy AMP routing behavior is preserved.
380
- - Bare AMP model names like `gpt-4o-mini` are matched against configured local `model.id` and `model.aliases` automatically.
381
- - If no local match is found and `amp.upstreamUrl` is set, `llm-router` proxies the request upstream to AMP.
382
- - AMP management/auth routes (`/api/auth`, `/threads`, `/docs`, `/settings`, etc.) proxy through the configured AMP upstream and reuse your `masterKey` as the local gateway auth token.
383
- - AMP Google `/api/provider/google/v1beta/...` requests are translated locally into OpenAI-compatible chat requests, including Gemini model listing, `generateContent`, and `streamGenerateContent`.
384
- - `llm-router config --operation=set-amp-config` supports both the new AMP schema flags and the legacy AMP flags. The interactive wizard now leads with `Quick setup`, `Default AMP route`, and `Common AMP routes`, while the older mapping controls live under `Advanced`.
385
- - If the AMP upstream API key is not found in local AMP config/secrets, the wizard tells you to open `https://ampcode.com/settings` and paste the key into `llm-router`.
386
- - Developer notes and architecture details live in `docs/amp-routing.md`.
387
-
388
- You can also configure the AMP block non-interactively:
389
132
 
390
- ```bash
391
- llm-router config --operation=set-amp-config \
392
- --amp-upstream-url=https://ampcode.com \
393
- --amp-upstream-api-key=amp_... \
394
- --amp-default-route=chat.default \
395
- --amp-routes="smart => chat.smart, rush => chat.fast, @google-gemini-flash-shared => chat.tools" \
396
- --amp-raw-model-routes="gpt-*-codex* => chat.deep"
397
- ```
133
+ - `validate` checks raw config JSON + schema without opening the Web UI.
134
+ - `snapshot` combines config, runtime, startup, and coding-tool routing state.
135
+ - `tool-status` focuses only on Codex CLI, Claude Code, and AMP client wiring.
136
+ - `reclaim` force-frees the fixed local router port when another listener is blocking `llr start`.
137
+ - `set-codex-cli-routing` accepts `--default-model=<route>` or `--default-model=__codex_cli_inherit__` to keep Codex's own model selection.
138
+ - `set-claude-code-routing` accepts `--primary-model`, `--default-opus-model`, `--default-sonnet-model`, `--default-haiku-model`, `--subagent-model`, and `--thinking-level`.
139
+ - `set-amp-client-routing` patches or restores AMP client settings/secrets separately from router-side AMP config.
398
140
 
399
- Legacy-compatible CLI example:
141
+ ## Providers, Models, and Aliases
400
142
 
401
- ```bash
402
- llm-router config --operation=set-amp-config \
403
- --amp-force-model-mappings=true \
404
- --amp-subagent-definitions="oracle => /^gpt-\d+(?:\.\d+)?$/, planner => gpt-6*" \
405
- --amp-model-mappings="* => rc/gpt-5.3-codex" \
406
- --amp-subagent-mappings="oracle => rc/gpt-5.3-codex, planner => rc/gpt-5.3-codex"
407
- ```
143
+ - Provider: one upstream service such as OpenRouter or Anthropic
144
+ - Model: one upstream model id exposed by that provider
145
+ - Alias: one stable route name that can fan out to multiple provider/model targets
146
+ - Rate-limit bucket: request cap scoped to one or more models over a time window
408
147
 
409
- To reset custom AMP subagent names/patterns back to the built-in defaults:
148
+ Recommended pattern:
410
149
 
411
- ```bash
412
- llm-router config --operation=set-amp-config --reset-amp-subagent-definitions=true
413
- ```
150
+ 1. Add providers with direct model lists.
151
+ 2. Create aliases for stable client-facing route names.
152
+ 3. Put balancing/fallback behavior behind the alias, not in the client.
414
153
 
415
- To patch AMP so it points at your local `llm-router` without editing AMP files manually:
154
+ ## Subscription Providers
155
+
156
+ OAuth-backed subscription providers are supported.
416
157
 
417
158
  ```bash
418
- llm-router config --operation=set-amp-config \
419
- --patch-amp-client-config=true \
420
- --amp-client-settings-scope=workspace \
421
- --amp-client-url=http://127.0.0.1:8376
159
+ llr config --operation=upsert-provider --provider-id=chatgpt --name="GPT Sub" --type=subscription --subscription-type=chatgpt-codex --subscription-profile=default
160
+ llr config --operation=upsert-provider --provider-id=claude-sub --name="Claude Sub" --type=subscription --subscription-type=claude-code --subscription-profile=default
161
+ llr subscription login --subscription-type=chatgpt-codex --profile=default
162
+ llr subscription login --subscription-type=claude-code --profile=default
163
+ llr subscription status
422
164
  ```
423
165
 
424
- When you run the patch flow on a config that does not already have AMP routing configured, `llm-router` now bootstraps a safe default AMP setup automatically:
166
+ Supported `subscription-type` values:
167
+
168
+ - `chatgpt-codex`
169
+ - `claude-code`
425
170
 
426
- - patches AMP client `amp.url` + the local gateway API key entry
427
- - sets `amp.preset=builtin`
428
- - sets `amp.defaultRoute` to your current `defaultModel` (or the first configured provider/model)
429
- - enables `amp.restrictManagementToLocalhost=true`
430
- - auto-discovers `amp.upstreamApiKey` for `https://ampcode.com` from AMP secrets when available
171
+ Compliance note: using provider resources through LLM Router may violate a provider's terms. You are responsible for that usage.
431
172
 
432
- That means a normal existing config with `defaultModel`, providers, and `masterKey` can usually patch AMP and start using a single default local model immediately.
173
+ ## AMP
433
174
 
434
- Then customize AMP behavior later without re-patching the AMP client:
175
+ LLM Router can front AMP-compatible routes locally and optionally proxy unresolved AMP traffic upstream.
176
+
177
+ Open the Web UI for AMP setup, or use direct CLI operations:
435
178
 
436
179
  ```bash
437
- llm-router config --operation=set-amp-config \
438
- --amp-default-route=chat.default \
439
- --amp-routes="smart => chat.smart, rush => chat.fast, deep => chat.deep, oracle => chat.oracle"
180
+ llr config --operation=set-amp-config --patch-amp-client-config=true --amp-client-settings-scope=workspace --amp-client-url=http://127.0.0.1:4000
181
+ llr config --operation=set-amp-config --amp-default-route=chat.default --amp-routes="smart => chat.smart, rush => chat.fast"
182
+ llr config --operation=set-amp-config --amp-upstream-url=https://ampcode.com --amp-upstream-api-key=amp_...
183
+ llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
440
184
  ```
441
185
 
442
- AMP client file locations used by the wizard/patch flow:
443
- - global settings: `~/.config/amp/settings.json`
444
- - workspace settings: `.amp/settings.json`
445
- - secrets: `~/.local/share/amp/secrets.json`
186
+ ## Local Real-Provider Suite
446
187
 
447
- When patching AMP client files, `llm-router` only updates or adds:
448
- - `amp.url` in `settings.json`
449
- - `apiKey@<endpoint-url>` in `secrets.json`
188
+ The repo includes a local-only real-provider suite for the supported operator surfaces:
450
189
 
451
- All other existing AMP settings/secrets fields are preserved. Missing files/directories are created automatically.
190
+ - CLI config + local gateway start
191
+ - Web UI discovery / probe / save / router control
452
192
 
453
- Reusable local smoke test:
193
+ Setup:
454
194
 
455
195
  ```bash
456
- npm run test:amp-smoke
196
+ cp .env.test-suite.example .env.test-suite
457
197
  ```
458
198
 
459
- The smoke suite clones your current `~/.llm-router.json`, auto-discovers your AMP upstream key from local AMP secrets, forces all AMP traffic to `rc/gpt-5.3-codex`, runs headless AMP execute-mode checks (`smart`, `rush`, `deep`, plus an Oracle-style prompt), captures the raw inbound AMP `model` labels seen by `llm-router`, verifies each observed label still resolves through the current AMP matcher, and writes reusable logs/artifacts to a temp directory.
460
-
461
- Key artifacts in the output directory:
199
+ Then fill in your own provider keys, endpoints, and models.
462
200
 
463
- - `router-log.jsonl`: full inbound + upstream request log
464
- - `observed-models.json`: unique live AMP model labels grouped by case with resolver checks
465
- - `summary.json`: top-level smoke results plus observed-model summary
201
+ Run:
466
202
 
467
- Suggested AMP client setup:
203
+ ```bash
204
+ npm run test:provider-live
205
+ ```
468
206
 
469
- `~/.config/amp/settings.json`
207
+ Legacy alias:
470
208
 
471
- ```json
472
- {
473
- "amp.url": "http://127.0.0.1:8376"
474
- }
209
+ ```bash
210
+ npm run test:provider-smoke
475
211
  ```
476
212
 
477
- `~/.local/share/amp/secrets.json`
213
+ The live suite uses isolated temp HOME/config/runtime-state folders and does not overwrite your normal `~/.llm-router.json` or `~/.llm-router.runtime.json`.
478
214
 
479
- ```json
480
- {
481
- "apiKey@http://127.0.0.1:8376": "gw_your_gateway_key"
482
- }
483
- ```
215
+ ## Deploy to Cloudflare
484
216
 
485
- Or use environment variables:
217
+ Deploy the current config to a Worker:
486
218
 
487
219
  ```bash
488
- export AMP_URL=http://127.0.0.1:8376
489
- export AMP_API_KEY=gw_your_gateway_key
220
+ llr deploy
221
+ llr deploy --dry-run=true
222
+ llr deploy --workers-dev=true
223
+ llr deploy --route-pattern=router.example.com/* --zone-name=example.com
224
+ llr deploy --generate-master-key=true
490
225
  ```
491
226
 
492
- ## Real-Time Update Experience
227
+ Fast worker key rotation:
493
228
 
494
- When local server is running:
495
- - open `llm-router`
496
- - change provider/model/load-balancer/rate-limit/alias in TUI
497
- - save
498
- - the running proxy updates instantly
229
+ ```bash
230
+ llr worker-key --generate-master-key=true
231
+ llr worker-key --env=production --master-key=rotated-key
232
+ ```
499
233
 
500
- No stop/start cycle needed.
234
+ ## Config File
501
235
 
502
- Config/status outputs are shown in structured table layouts for easier operator review.
236
+ Local config path:
503
237
 
504
- ## Cloudflare Worker (Hosted)
238
+ ```text
239
+ ~/.llm-router.json
240
+ ```
505
241
 
506
- Use when you want a hosted endpoint instead of local server.
242
+ LLM Router also keeps related runtime and token state under the same namespace for backward compatibility with the published package.
507
243
 
508
- Guided deploy:
244
+ Useful runtime env knobs:
509
245
 
510
- ```bash
511
- llm-router deploy
512
- ```
246
+ - `LLM_ROUTER_MAX_REQUEST_BODY_BYTES`: caps inbound JSON body size for the local router and worker runtime. Default is `8 MiB` for `/responses` requests and `1 MiB` for other JSON endpoints.
247
+ - `LLM_ROUTER_UPSTREAM_TIMEOUT_MS`: overrides the provider request timeout.
513
248
 
514
- You will be guided in TUI to select account and deploy target.
249
+ ## Development
515
250
 
516
- Worker safety defaults:
517
- - `LLM_ROUTER_STATE_BACKEND=file` is ignored on Worker (auto-fallback to in-memory state).
518
- - Stateful timing-dependent routing features (cursor balancing, local quota counters, cooldown persistence) are auto-disabled by default to keep route flow safe across Worker isolates.
519
- - To opt in to best-effort stateful behavior on Worker, set `LLM_ROUTER_WORKER_ALLOW_BEST_EFFORT_STATEFUL_ROUTING=true`.
251
+ Web UI dev loop:
520
252
 
521
- ## Config File Location
253
+ ```bash
254
+ npm run dev
255
+ ```
522
256
 
523
- Local config file:
257
+ Build the browser bundle:
524
258
 
525
- `~/.llm-router.json`
259
+ ```bash
260
+ npm run build:web-console
261
+ ```
526
262
 
527
- ## Security
263
+ Run the JavaScript test suite:
528
264
 
529
- See [`SECURITY.md`](https://github.com/khanglvm/llm-router/blob/master/SECURITY.md).
265
+ ```bash
266
+ node --test $(rg --files -g "*.test.js" src)
267
+ ```
530
268
 
531
- ## Versioning
269
+ ## Security and Releases
532
270
 
533
- - Semver: [Semantic Versioning](https://semver.org/)
271
+ - Security: [`SECURITY.md`](https://github.com/khanglvm/llm-router/blob/master/SECURITY.md)
534
272
  - Release notes: [`CHANGELOG.md`](https://github.com/khanglvm/llm-router/blob/master/CHANGELOG.md)
535
- - Prereleases are published with explicit beta versions such as `2.0.0-beta.1`; pin them intentionally instead of treating them as stable upgrades.
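
Reviewer note on the new `LLM_ROUTER_MAX_REQUEST_BODY_BYTES` entry in the README's "Config File" section: the added text gives two defaults (`8 MiB` for `/responses` requests, `1 MiB` for other JSON endpoints) but does not say how an override is parsed or how a `/responses` request is recognized. The sketch below shows one way such a cap could be resolved. It is not taken from the package source: the function names (`isResponsesPath`, `resolveMaxBodyBytes`, `exceedsBodyLimit`), the suffix-based path check, and the rule that an override is a plain byte count applied to every endpoint are all assumptions made for illustration.

```javascript
// Hypothetical sketch, NOT the package's actual implementation.
const MIB = 1024 * 1024;

// Defaults as stated in the 2.0.0 README.
const DEFAULT_RESPONSES_LIMIT = 8 * MIB; // "/responses" requests
const DEFAULT_JSON_LIMIT = 1 * MIB;      // other JSON endpoints

// Assumption: a request counts as "/responses" when its path ends with that
// segment (trailing slashes ignored).
function isResponsesPath(pathname) {
  return pathname.replace(/\/+$/, "").endsWith("/responses");
}

// Assumption: the env value is a positive integer byte count and, when valid,
// overrides the default for every endpoint. Invalid values fall back to defaults.
function resolveMaxBodyBytes(pathname, env = {}) {
  const parsed = Number.parseInt(env.LLM_ROUTER_MAX_REQUEST_BODY_BYTES ?? "", 10);
  if (Number.isFinite(parsed) && parsed > 0) {
    return parsed;
  }
  return isResponsesPath(pathname) ? DEFAULT_RESPONSES_LIMIT : DEFAULT_JSON_LIMIT;
}

// True when a declared Content-Length is larger than the resolved cap.
function exceedsBodyLimit(pathname, contentLength, env = {}) {
  return contentLength > resolveMaxBodyBytes(pathname, env);
}
```

Under these assumptions, a caller would pass `process.env` as the third argument and reject the request when `exceedsBodyLimit(...)` returns `true`. Whether the real runtime checks `Content-Length` up front or counts bytes while streaming is not stated in the diff.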