npm - @khanglvm/llm-router - Versions diffs - 2.0.0-beta.1 → 2.0.0 - Mend

@khanglvm/llm-router 2.0.0-beta.1 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

package/CHANGELOG.md +27 -0
package/README.md +163 -426
package/package.json +3 -3
package/src/cli/router-module.js +2773 -2587
package/src/cli-entry.js +32 -103
package/src/node/activity-log.js +119 -0
package/src/node/coding-tool-config.js +85 -11
package/src/node/config-workflows.js +51 -12
package/src/node/instance-state.js +1 -1
package/src/node/litellm-context-catalog.js +184 -0
package/src/node/local-server.js +23 -3
package/src/node/port-reclaim.js +2 -2
package/src/node/start-command.js +22 -22
package/src/node/startup-manager.js +3 -3
package/src/node/web-command.js +1 -1
package/src/node/web-console-assets.js +1 -1
package/src/node/web-console-client.js +34 -29
package/src/node/web-console-server.js +420 -38
package/src/node/web-console-styles.generated.js +1 -1
package/src/node/web-console-ui/buffered-text-input.js +133 -0
package/src/node/web-console-ui/config-editor-utils.js +57 -4
package/src/node/web-console-ui/dropdown-placement.js +153 -0
package/src/node/web-console-ui/select-search-utils.js +6 -0
package/src/node/web-console-ui/transient-integer-input-utils.js +12 -0
package/src/runtime/balancer.js +78 -1
package/src/runtime/codex-request-transformer.js +16 -7
package/src/runtime/config.js +448 -12
package/src/runtime/handler/amp-response.js +5 -3
package/src/runtime/handler/amp-web-search.js +2232 -0
package/src/runtime/handler/fallback.js +30 -2
package/src/runtime/handler/provider-call.js +353 -36
package/src/runtime/handler/provider-translation.js +14 -0
package/src/runtime/handler/request.js +128 -2
package/src/runtime/handler/route-debug.js +36 -0
package/src/runtime/handler.js +210 -20
package/src/runtime/subscription-provider.js +1 -1
package/src/shared/coding-tool-bindings.js +49 -0
package/src/shared/local-router-defaults.js +62 -0
package/src/translator/request/claude-to-openai.js +43 -0

package/README.md CHANGED

@@ -1,535 +1,272 @@
-# llm-router
+# LLM Router
-## Main Features
+LLM Router is a local and Cloudflare-deployable gateway for routing one client endpoint across multiple LLM providers, models, aliases, fallbacks, and rate limits.
-1. Single endpoint, unified providers & models
-2. Support grouping models with rate-limit and load balancing strategy
-3. Configuration auto reload in real time, no interruption
-## Beta Notice
-`2.0.0-beta.1` is the current public prerelease. It includes major AMP routing, web console, and local operator workflow changes, so treat it as beta and expect rough edges while validating it before a stable `2.0.0` release.
-Short highlights in this beta:
-- New localhost web console for config editing, provider testing, and router lifecycle control
-- Quick patching for AMP Code, Codex CLI and Claude Code
-- Expanded operator workflows across CLI, TUI, OAuth subscription setup, and live provider validation
-- Fixed various format-transformation issues
-## Install
-Stable channel:
+The npm package name stays the same:
 ```bash
-npm i -g @khanglvm/llm-router@latest
+@khanglvm/llm-router
 ```
-Beta preview:
+The primary CLI command is now:
 ```bash
-npm i -g @khanglvm/llm-router@2.0.0-beta.1
+llr
 ```
-## Usage
+`2.0.0` is the current public release. It includes the Web UI, AMP routing, and coding-tool integrations introduced in the 2.x line.
-Copy/paste this short instruction to your AI agent:
+## Install
-```text
-Run `llm-router ai-help` first, then set up and operate llm-router for me using CLI commands.
+```bash
+npm i -g @khanglvm/llm-router@latest
 ```
-## Local Real-Provider Test Suite
-The repo now includes a local-only live provider suite that covers all three operator surfaces:
+## Quick Start
-- CLI config + `start`
-- TUI config menus
-- Web console provider discovery/test + browser bundle render
-Setup:
+1. Open the Web UI:
 ```bash
-cp .env.test-suite.example .env.test-suite
-# fill your own provider keys/endpoints/models in .env.test-suite
+llr
 ```
-Run it:
+2. Add at least one provider and model.
+3. Optionally create aliases and fallback routes.
+4. Start the local gateway:
 ```bash
-npm run test:provider-live
-# legacy alias:
-npm run test:provider-smoke
+llr start
 ```
-Notes:
-- `.env.test-suite` is gitignored and is intended only for local runs.
-- The live suite uses isolated temp HOME/config/runtime-state folders so it does not overwrite your normal `~/.llm-router.json` or `~/.llm-router.runtime.json`.
-- Public contributors should keep using `.env.test-suite.example` as the template and fill their own providers locally.
-## Main Workflow
-1. Add providers + models into llm-router (standard API-key providers or OAuth subscription providers)
-2. Optionally, group models as alias with load balancing and auto fallback support
-3. Start llm-router server, point your coding tool API and model to llm-router
-## What Each Term Means
-### Provider
-The service endpoint you call (OpenRouter, Anthropic, etc.).
-### Model
-The actual model ID from that provider.
+5. Point your client or coding tool at the local endpoint.
-### Rate-Limit Bucket
-A request cap for a time window.
-Examples:
-- `40 requests / minute`
-- `20,000 requests / month`
+## Supported Operator Flows
-### Model Load Balancer
-Decides how traffic is distributed across models in an alias group.
+- CLI: direct operations like `llr config --operation=...`, `llr start`, `llr deploy`, provider diagnostics, and coding-tool routing control
+- Web UI: browser-based config editing, provider probing, and local router control
-Available strategies:
-- `auto` (recommended)
-- `ordered`
-- `round-robin`
-- `weighted-rr`
-- `quota-aware-weighted-rr`
+The legacy TUI flow is no longer part of the supported workflow.
-### Model Alias (Group models)
-A single model name that auto route/rotate across multiple models.
+## Core Commands
-Example:
-- alias: `opus`
-- targets:
-  - `openrouter/claude-opus-4.6`
-  - `anthropic/claude-opus-4.6`
-Your app can use `opus` model and `llm-router` chooses target models based on your routing settings.
-## Setup using Terminal User Interface (TUI)
-Open the TUI:
+Open the Web UI:
 ```bash
-llm-router --tui
-# or
-llm-router config --tui
+llr
+llr config
+llr web
 ```
-Then follow this order.
-### 1) Add Provider
-Flow:
-1. `Config manager`
-2. `Providers`
-3. `Add or edit`
-4. Choose auth method:
-   - `API key` -> endpoint + API key + model list
-   - `OAuth` -> browser OAuth + editable model list
-5. For `OAuth`:
-   - Choose subscription provider (`ChatGPT` or `Claude Code`)
-   - Enter provider name and provider ID
-   - Complete browser OAuth login inside this same flow
-   - Edit model list (pre-filled defaults; you can add/remove)
-   - llm-router live-tests every selected model before save
-6. Save
-### 1b) Add Subscription Provider (OAuth)
-Commandline examples:
+Run direct config operations:
 ```bash
-# ChatGPT Codex subscription
-llm-router config \
-  --operation=upsert-provider \
-  --provider-id=chatgpt \
-  --name="GPT Sub" \
-  --type=subscription
-# Claude Code subscription
-llm-router config \
-  --operation=upsert-provider \
-  --provider-id=claude-sub \
-  --name="Claude Sub" \
-  --type=subscription \
-  --subscription-type=claude-code
+llr config --operation=validate
+llr config --operation=snapshot
+llr config --operation=tool-status
+llr config --operation=list
+llr config --operation=discover-provider-models --endpoints=https://openrouter.ai/api/v1 --api-key=sk-...
+llr config --operation=test-provider --endpoints=https://openrouter.ai/api/v1 --api-key=sk-... --models=gpt-4o-mini,gpt-4o
+llr config --operation=upsert-provider --provider-id=openrouter --name=OpenRouter --base-url=https://openrouter.ai/api/v1 --api-key=sk-... --models=gpt-4o-mini,gpt-4o
+llr config --operation=upsert-model-alias --alias-id=chat.default --strategy=auto --targets=openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2
+llr config --operation=set-provider-rate-limits --provider-id=openrouter --bucket-name="Monthly cap" --bucket-models=all --bucket-requests=20000 --bucket-window=month:1
+llr config --operation=set-master-key --generate-master-key=true
+llr config --operation=set-codex-cli-routing --enabled=true --default-model=chat.default
+llr config --operation=set-claude-code-routing --enabled=true --primary-model=chat.default
+llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
 ```
-Notes:
-- OAuth login is run during provider upsert (browser flow by default).
-- Supported `subscription-type`: `chatgpt-codex` and `claude-code` (defaults to `chatgpt-codex`).
-- Default model lists are prefilled by subscription type, then editable.
-- Device-code login is available for `chatgpt-codex` only.
-- No provider API key or endpoint probe input is required for subscription mode.
-- Compliance notice: provider account/resource usage via `llm-router` may violate a provider's terms. You are solely responsible for compliance; `llm-router` maintainers take no responsibility for misuse.
-### 2) Configure Model Fallback (Optional)
-Flow:
-1. `Config manager`
-2. `Routing`
-3. `Fallbacks`
-4. Pick main model
-5. Pick fallback models
-6. Save
-### 3) Configure Rate Limits (Optional)
-Flow:
-1. `Config manager`
-2. `Routing`
-3. `Rate limits`
-4. `Create`
-5. Set name, model scope, request cap, time window
-6. Save
-### 4) Group Models With Alias (Recommended)
-Flow:
-1. `Config manager`
-2. `Routing`
-3. `Aliases`
-4. Set alias ID (example: `chat.default`)
-5. Select target models
-6. Save
-### 5) Configure Model Load Balancer
-Flow:
-1. `Config manager`
-2. `Routing`
-3. `Aliases`
-4. Open the alias you want to balance
-5. Choose strategy (`auto` recommended)
-6. Review alias targets
-7. Save
-### 6) Set Gateway Key
-Flow:
-1. `Config manager`
-2. `Security`
-3. `Master key`
-4. Set or generate key
-5. Save
-## Setup using Web Console
-Open the browser-based console:
+Operate the local gateway:
 ```bash
-llm-router
-# or
-llm-router config
-# explicit alias
-llm-router web
+llr start
+llr stop
+llr reclaim
+llr reload
+llr update
 ```
-Local contributor development workflow:
+Get the agent-oriented setup brief:
 ```bash
-yarn dev
+llr ai-help
 ```
-What you get:
-- Compact Claude-light localhost UI built with React, shadcn-style primitives, and Tailwind
-- JSON-first config editor with live validation, external file sync, and a first-run quick-start wizard when no providers are configured
-- Quick status cards for config health, managed router state, startup status, and recent activity
-- Sections for:
-  - raw config editing with validate / prettify / save / open-in-editor actions
-  - provider inventory with per-provider probe actions
-  - OS startup enable / disable
-- Start / restart / stop controls for the local router
-- `Open Config File` buttons for detected editors like VS Code, Sublime, Cursor, TextEdit/default app, and other common local editors
+## Web UI
-Useful flags:
+The Web UI is the default operator surface.
 ```bash
-llm-router web --port=9090
-llm-router web --open=false
+llr
+llr web --port=9090
+llr web --open=false
 ```
-Notes:
-- The web console is localhost-only by default because it exposes live config editing, including secrets.
-- The web console runs as a separate service from the local router. Closing the UI does not stop the router service.
-- `yarn dev` hot-reloads the browser UI and restarts the local router service when router source files change.
-- If the config file contains invalid JSON, validation surfaces the parse error and save/probe/start actions stay guarded until the JSON is repaired.
-- When the web console patches Codex CLI, it writes a generated `model_catalog_json` for both alias bindings and direct managed route refs like `provider/model`, which avoids Codex fallback metadata warnings for managed routes.
+What it covers:
-## Start Local Server
+- raw JSON config editing with validation
+- provider discovery and probe flows
+- alias, fallback, rate-limit, and AMP management
+- local router start, stop, and restart
+- coding-tool patch helpers for Codex CLI, Claude Code, and AMP
-```bash
-llm-router start
-```
+The Web UI is localhost-only by default because it can expose secrets and live configuration.
-The local router endpoint is fixed to `http://127.0.0.1:8376`.
-Local endpoints:
-- Unified: `http://127.0.0.1:8376/route`
-- Anthropic-style: `http://127.0.0.1:8376/anthropic`
-- OpenAI-style: `http://127.0.0.1:8376/openai`
-- OpenAI legacy completions: `http://127.0.0.1:8376/openai/v1/completions`
-- OpenAI Responses-style: `http://127.0.0.1:8376/openai/v1/responses` (Codex CLI-compatible)
-- AMP OpenAI-style: `http://127.0.0.1:8376/api/provider/openai/v1/chat/completions`
-- AMP Anthropic-style: `http://127.0.0.1:8376/api/provider/anthropic/v1/messages`
-- AMP OpenAI Responses-style: `http://127.0.0.1:8376/api/provider/openai/v1/responses`
-## Connect your coding tool
-After setting master key, point your app/agent to local endpoint and use that key as auth token.
-Claude Code example (`~/.claude/settings.local.json`):
-```json
-{
-  "env": {
-    "ANTHROPIC_BASE_URL": "http://127.0.0.1:8376",
-    "ANTHROPIC_AUTH_TOKEN": "gw_your_gateway_key",
-    "ANTHROPIC_DEFAULT_OPUS_MODEL": "provider_name/model_name_1",
-    "ANTHROPIC_DEFAULT_SONNET_MODEL": "provider_name/model_name_2",
-    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "provider_name/model_name_3"
-  }
-}
-```
+## CLI Parity
-## AMP CLI / AMP Code
-`llm-router` can now accept AMP provider-path requests and route them into your configured local models.
-### Quick AMP setup in the TUI
-Recommended flow for non-expert users:
-1. Run `llm-router`
-2. Open `AMP`
-3. Choose `Quick setup`
-4. Pick where AMP should be patched:
-   - `This workspace` for only the current repo
-   - `All projects` for your global AMP config
-5. Confirm the local `llm-router` URL and API key
-6. Pick one default route such as `chat.default` or `provider/model`
-7. `Save and exit`
-That is enough to make AMP send requests to `llm-router`.
-After that, if you want AMP modes like `smart`, `rush`, `deep`, or `oracle` to use different llm-router aliases/models:
-1. Open `AMP`
-2. Choose `Common AMP routes`
-3. Pick the AMP route you want to customize
-4. Pick the llm-router alias/model to use
-5. Save
-The `Advanced` menu is where the older, more detailed AMP controls now live:
-- upstream / proxy settings
-- legacy model-pattern mappings
-- legacy subagent definitions and mappings
-Recommended config snippet in `~/.llm-router.json`:
-```json
-{
-  "masterKey": "gw_your_gateway_key",
-  "defaultModel": "chat.default",
-  "amp": {
-    "upstreamUrl": "https://ampcode.com",
-    "upstreamApiKey": "amp_upstream_api_key",
-    "restrictManagementToLocalhost": true,
-    "preset": "builtin",
-    "defaultRoute": "chat.default",
-    "routes": {
-      "smart": "chat.smart",
-      "rush": "chat.fast",
-      "deep": "chat.deep",
-      "oracle": "chat.oracle",
-      "librarian": "chat.research",
-      "review": "chat.review",
-      "@google-gemini-flash-shared": "chat.tools",
-      "painter": "image.default"
-    },
-    "rawModelRoutes": [
-      { "from": "gpt-*-codex*", "to": "chat.deep" }
-    ],
-    "overrides": {
-      "entities": [
-        {
-          "id": "reviewer",
-          "type": "feature",
-          "match": ["gemini-4-pro*"],
-          "route": "chat.review"
-        }
-      ]
-    },
-    "fallback": {
-      "onUnknown": "default-route",
-      "onAmbiguous": "default-route",
-      "proxyUpstream": true
-    }
-  }
-}
+The browser UI still gives the best interactive overview, but the CLI now exposes the main management flows an agent needs without relying on private web endpoints.
+```bash
+llr config --operation=validate
+llr config --operation=snapshot
+llr config --operation=tool-status
+llr reclaim
+llr config --operation=set-codex-cli-routing --enabled=true --default-model=chat.default
+llr config --operation=set-claude-code-routing --enabled=true --primary-model=chat.default --default-haiku-model=chat.fast
+llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
+llr config --operation=set-codex-cli-routing --enabled=false
+llr config --operation=set-claude-code-routing --enabled=false
+llr config --operation=set-amp-client-routing --enabled=false --amp-client-settings-scope=workspace
 ```
 Notes:
-- `amp` is the normalized config key. Input aliases `ampcode` and `amp-code` are also accepted.
-- `amp.routes` is the new main user-facing mapping surface. Keys can be friendly AMP entities like `smart`, `rush`, `oracle`, `review`, `title`, or shared signatures like `@google-gemini-flash-shared`.
-- `amp.defaultRoute` is AMP-specific fallback and is checked before the global `defaultModel`.
-- `amp.rawModelRoutes` is the new-schema escape hatch for raw model-name matching when entity/signature routing is not enough.
-- `amp.overrides` lets users add or update entity/signature detection without editing the built-in preset in code.
-- `amp.preset=builtin` enables the shipped AMP catalog. Set `amp.preset=none` to disable built-in entity/signature detection entirely.
-- Shared signatures exist because some AMP helpers currently share the same observed model family, such as `rush` + `title` on Haiku and `search` + `look-at` + `handoff` on Gemini Flash.
-- AMP model matching now canonicalizes display-style names like `Claude Opus 4.6`, `GPT-5.3 Codex`, and `Gemini 3 Flash` before matching.
-- Legacy AMP fields are still supported for backward compatibility: `amp.modelMappings`, `amp.subagentMappings`, `amp.subagentDefinitions`, and `amp.forceModelMappings`.
-- When any new AMP schema fields are present (`preset`, `defaultRoute`, `routes`, `rawModelRoutes`, `overrides`, `fallback`), the new AMP resolver path is used. Otherwise legacy AMP routing behavior is preserved.
-- Bare AMP model names like `gpt-4o-mini` are matched against configured local `model.id` and `model.aliases` automatically.
-- If no local match is found and `amp.upstreamUrl` is set, `llm-router` proxies the request upstream to AMP.
-- AMP management/auth routes (`/api/auth`, `/threads`, `/docs`, `/settings`, etc.) proxy through the configured AMP upstream and reuse your `masterKey` as the local gateway auth token.
-- AMP Google `/api/provider/google/v1beta/...` requests are translated locally into OpenAI-compatible chat requests, including Gemini model listing, `generateContent`, and `streamGenerateContent`.
-- `llm-router config --operation=set-amp-config` supports both the new AMP schema flags and the legacy AMP flags. The interactive wizard now leads with `Quick setup`, `Default AMP route`, and `Common AMP routes`, while the older mapping controls live under `Advanced`.
-- If the AMP upstream API key is not found in local AMP config/secrets, the wizard tells you to open `https://ampcode.com/settings` and paste the key into `llm-router`.
-- Developer notes and architecture details live in `docs/amp-routing.md`.
-You can also configure the AMP block non-interactively:
-```bash
-llm-router config --operation=set-amp-config \
-  --amp-upstream-url=https://ampcode.com \
-  --amp-upstream-api-key=amp_... \
-  --amp-default-route=chat.default \
-  --amp-routes="smart => chat.smart, rush => chat.fast, @google-gemini-flash-shared => chat.tools" \
-  --amp-raw-model-routes="gpt-*-codex* => chat.deep"
-```
+- `validate` checks raw config JSON + schema without opening the Web UI.
+- `snapshot` combines config, runtime, startup, and coding-tool routing state.
+- `tool-status` focuses only on Codex CLI, Claude Code, and AMP client wiring.
+- `reclaim` force-frees the fixed local router port when another listener is blocking `llr start`.
+- `set-codex-cli-routing` accepts `--default-model=<route>` or `--default-model=__codex_cli_inherit__` to keep Codex's own model selection.
+- `set-claude-code-routing` accepts `--primary-model`, `--default-opus-model`, `--default-sonnet-model`, `--default-haiku-model`, `--subagent-model`, and `--thinking-level`.
+- `set-amp-client-routing` patches or restores AMP client settings/secrets separately from router-side AMP config.
-Legacy-compatible CLI example:
+## Providers, Models, and Aliases
-```bash
-llm-router config --operation=set-amp-config \
-  --amp-force-model-mappings=true \
-  --amp-subagent-definitions="oracle => /^gpt-\d+(?:\.\d+)?$/, planner => gpt-6*" \
-  --amp-model-mappings="* => rc/gpt-5.3-codex" \
-  --amp-subagent-mappings="oracle => rc/gpt-5.3-codex, planner => rc/gpt-5.3-codex"
-```
+- Provider: one upstream service such as OpenRouter or Anthropic
+- Model: one upstream model id exposed by that provider
+- Alias: one stable route name that can fan out to multiple provider/model targets
+- Rate-limit bucket: request cap scoped to one or more models over a time window
-To reset custom AMP subagent names/patterns back to the built-in defaults:
+Recommended pattern:
-```bash
-llm-router config --operation=set-amp-config --reset-amp-subagent-definitions=true
-```
+1. Add providers with direct model lists.
+2. Create aliases for stable client-facing route names.
+3. Put balancing/fallback behavior behind the alias, not in the client.
-To patch AMP so it points at your local `llm-router` without editing AMP files manually:
+## Subscription Providers
+OAuth-backed subscription providers are supported.
 ```bash
-llm-router config --operation=set-amp-config \
-  --patch-amp-client-config=true \
-  --amp-client-settings-scope=workspace \
-  --amp-client-url=http://127.0.0.1:8376
+llr config --operation=upsert-provider --provider-id=chatgpt --name="GPT Sub" --type=subscription --subscription-type=chatgpt-codex --subscription-profile=default
+llr config --operation=upsert-provider --provider-id=claude-sub --name="Claude Sub" --type=subscription --subscription-type=claude-code --subscription-profile=default
+llr subscription login --subscription-type=chatgpt-codex --profile=default
+llr subscription login --subscription-type=claude-code --profile=default
+llr subscription status
 ```
-When you run the patch flow on a config that does not already have AMP routing configured, `llm-router` now bootstraps a safe default AMP setup automatically:
+Supported `subscription-type` values:
+- `chatgpt-codex`
+- `claude-code`
-- patches AMP client `amp.url` + the local gateway API key entry
-- sets `amp.preset=builtin`
-- sets `amp.defaultRoute` to your current `defaultModel` (or the first configured provider/model)
-- enables `amp.restrictManagementToLocalhost=true`
-- auto-discovers `amp.upstreamApiKey` for `https://ampcode.com` from AMP secrets when available
+Compliance note: using provider resources through LLM Router may violate a provider's terms. You are responsible for that usage.
-That means a normal existing config with `defaultModel`, providers, and `masterKey` can usually patch AMP and start using a single default local model immediately.
+## AMP
-Then customize AMP behavior later without re-patching the AMP client:
+LLM Router can front AMP-compatible routes locally and optionally proxy unresolved AMP traffic upstream.
+Open the Web UI for AMP setup, or use direct CLI operations:
 ```bash
-llm-router config --operation=set-amp-config \
-  --amp-default-route=chat.default \
-  --amp-routes="smart => chat.smart, rush => chat.fast, deep => chat.deep, oracle => chat.oracle"
+llr config --operation=set-amp-config --patch-amp-client-config=true --amp-client-settings-scope=workspace --amp-client-url=http://127.0.0.1:4000
+llr config --operation=set-amp-config --amp-default-route=chat.default --amp-routes="smart => chat.smart, rush => chat.fast"
+llr config --operation=set-amp-config --amp-upstream-url=https://ampcode.com --amp-upstream-api-key=amp_...
+llr config --operation=set-amp-client-routing --enabled=true --amp-client-settings-scope=workspace
 ```
-AMP client file locations used by the wizard/patch flow:
-- global settings: `~/.config/amp/settings.json`
-- workspace settings: `.amp/settings.json`
-- secrets: `~/.local/share/amp/secrets.json`
+## Local Real-Provider Suite
-When patching AMP client files, `llm-router` only updates or adds:
-- `amp.url` in `settings.json`
-- `apiKey@<endpoint-url>` in `secrets.json`
+The repo includes a local-only real-provider suite for the supported operator surfaces:
-All other existing AMP settings/secrets fields are preserved. Missing files/directories are created automatically.
+- CLI config + local gateway start
+- Web UI discovery / probe / save / router control
-Reusable local smoke test:
+Setup:
 ```bash
-npm run test:amp-smoke
+cp .env.test-suite.example .env.test-suite
 ```
-The smoke suite clones your current `~/.llm-router.json`, auto-discovers your AMP upstream key from local AMP secrets, forces all AMP traffic to `rc/gpt-5.3-codex`, runs headless AMP execute-mode checks (`smart`, `rush`, `deep`, plus an Oracle-style prompt), captures the raw inbound AMP `model` labels seen by `llm-router`, verifies each observed label still resolves through the current AMP matcher, and writes reusable logs/artifacts to a temp directory.
-Key artifacts in the output directory:
+Then fill in your own provider keys, endpoints, and models.
-- `router-log.jsonl`: full inbound + upstream request log
-- `observed-models.json`: unique live AMP model labels grouped by case with resolver checks
-- `summary.json`: top-level smoke results plus observed-model summary
+Run:
-Suggested AMP client setup:
+```bash
+npm run test:provider-live
+```
-`~/.config/amp/settings.json`
+Legacy alias:
-```json
-{
-  "amp.url": "http://127.0.0.1:8376"
-}
+```bash
+npm run test:provider-smoke
 ```
-`~/.local/share/amp/secrets.json`
+The live suite uses isolated temp HOME/config/runtime-state folders and does not overwrite your normal `~/.llm-router.json` or `~/.llm-router.runtime.json`.
-```json
-{
-  "apiKey@http://127.0.0.1:8376": "gw_your_gateway_key"
-}
-```
+## Deploy to Cloudflare
-Or use environment variables:
+Deploy the current config to a Worker:
 ```bash
-export AMP_URL=http://127.0.0.1:8376
-export AMP_API_KEY=gw_your_gateway_key
+llr deploy
+llr deploy --dry-run=true
+llr deploy --workers-dev=true
+llr deploy --route-pattern=router.example.com/* --zone-name=example.com
+llr deploy --generate-master-key=true
 ```
-## Real-Time Update Experience
+Fast worker key rotation:
-When local server is running:
-- open `llm-router`
-- change provider/model/load-balancer/rate-limit/alias in TUI
-- save
-- the running proxy updates instantly
+```bash
+llr worker-key --generate-master-key=true
+llr worker-key --env=production --master-key=rotated-key
+```
-No stop/start cycle needed.
+## Config File
-Config/status outputs are shown in structured table layouts for easier operator review.
+Local config path:
-## Cloudflare Worker (Hosted)
+```text
+~/.llm-router.json
+```
-Use when you want a hosted endpoint instead of local server.
+LLM Router also keeps related runtime and token state under the same namespace for backward compatibility with the published package.
-Guided deploy:
+Useful runtime env knobs:
-```bash
-llm-router deploy
-```
+- `LLM_ROUTER_MAX_REQUEST_BODY_BYTES`: caps inbound JSON body size for the local router and worker runtime. Default is `8 MiB` for `/responses` requests and `1 MiB` for other JSON endpoints.
+- `LLM_ROUTER_UPSTREAM_TIMEOUT_MS`: overrides the provider request timeout.
-You will be guided in TUI to select account and deploy target.
+## Development
-Worker safety defaults:
-- `LLM_ROUTER_STATE_BACKEND=file` is ignored on Worker (auto-fallback to in-memory state).
-- Stateful timing-dependent routing features (cursor balancing, local quota counters, cooldown persistence) are auto-disabled by default to keep route flow safe across Worker isolates.
-- To opt in to best-effort stateful behavior on Worker, set `LLM_ROUTER_WORKER_ALLOW_BEST_EFFORT_STATEFUL_ROUTING=true`.
+Web UI dev loop:
-## Config File Location
+```bash
+npm run dev
+```
-Local config file:
+Build the browser bundle:
-`~/.llm-router.json`
+```bash
+npm run build:web-console
+```
-## Security
+Run the JavaScript test suite:
-See [`SECURITY.md`](https://github.com/khanglvm/llm-router/blob/master/SECURITY.md).
+```bash
+node --test $(rg --files -g "*.test.js" src)
+```
-## Versioning
+## Security and Releases
-- Semver: [Semantic Versioning](https://semver.org/)
+- Security: [`SECURITY.md`](https://github.com/khanglvm/llm-router/blob/master/SECURITY.md)
 - Release notes: [`CHANGELOG.md`](https://github.com/khanglvm/llm-router/blob/master/CHANGELOG.md)
-- Prereleases are published with explicit beta versions such as `2.0.0-beta.1`; pin them intentionally instead of treating them as stable upgrades.
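
The README hunk above renames the CLI from `llm-router` to `llr`. As a rough sketch of exercising the renamed commands, the script below prints a few of the `llr` invocations that appear in the added lines and only executes them when dry-run mode is turned off. `DRY_RUN`, `run_step`, and `quick_check` are hypothetical helper names for illustration; the diff does not state whether `llr start` stays in the foreground, so the sketch defaults to printing only.

```shell
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 also executes it.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
  # Echo the command line, then run it only when dry-run is disabled.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

quick_check() {
  run_step llr config --operation=validate      # check config JSON + schema
  run_step llr start                            # start the local gateway
  run_step llr config --operation=tool-status   # inspect coding-tool wiring
}

quick_check
```

Running it with `DRY_RUN=0` assumes `@khanglvm/llm-router@latest` is installed globally and at least one provider is already configured.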