@khanglvm/llm-router 1.0.5 → 1.0.8
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +60 -0
- package/README.md +134 -176
- package/SECURITY.md +142 -0
- package/package.json +27 -3
- package/src/cli/router-module.js +2448 -301
- package/src/index.js +2 -2
- package/src/node/config-store.js +74 -6
- package/src/node/local-server.js +9 -3
- package/src/node/provider-probe.js +354 -97
- package/src/runtime/balancer.js +310 -0
- package/src/runtime/config.js +895 -45
- package/src/runtime/handler/cache-mapping.js +306 -0
- package/src/runtime/handler/config-loading.js +4 -1
- package/src/runtime/handler/fallback.js +10 -0
- package/src/runtime/handler/provider-call.js +40 -2
- package/src/runtime/handler/reasoning-effort.js +313 -0
- package/src/runtime/handler.js +414 -44
- package/src/runtime/rate-limits.js +317 -0
- package/src/runtime/state-store.file.js +335 -0
- package/src/runtime/state-store.js +74 -0
- package/src/runtime/state-store.memory.js +180 -0
- package/src/translator/request/claude-to-openai.js +86 -25
- package/src/translator/request/openai-to-claude.js +87 -13
- package/.env.test-suite.example +0 -19
package/CHANGELOG.md
ADDED

```markdown
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.0.8] - 2026-02-28

### Changed
- Added focused npm `keywords` metadata in `package.json` to improve package discoverability.

## [1.0.7] - 2026-02-28

### Added
- Added `llm-router ai-help` to generate an agent-oriented operating guide with live gateway checks and coding-tool patch instructions.
- Added tests covering `ai-help` discovery output and first-run setup guidance.

### Changed
- Rewrote `README.md` into a shorter setup and operations guide focused on providers, aliases, rate limits, and local/hosted usage.

## [1.0.6] - 2026-02-28

### Added
- Added a formal changelog for tracked, versioned releases.
- Added npm package publish metadata to keep public publish defaults explicit.

### Changed
- Added an explicit package `files` whitelist so npm publishes are predictable.
- Updated release workflow docs in `README.md` to require changelog + version updates before publish.

## [1.0.5] - 2026-02-27

### Fixed
- Hardened release surface and added `.npmignore` coverage for safer package publishes.

## [1.0.4] - 2026-02-26

### Changed
- Refined README guidance for routing and deployment usage.

## [1.0.3] - 2026-02-26

### Changed
- Simplified project positioning and gateway copy in docs.

## [1.0.2] - 2026-02-26

### Changed
- Documented smart fallback behavior and operational expectations.

## [1.0.1] - 2026-02-25

### Changed
- Improved fallback strategy behavior and released patch update.

## [1.0.0] - 2026-02-25

### Added
- Initial `llm-router` route release with local + Cloudflare Worker gateway flows.
```
package/README.md
CHANGED

````diff
@@ -1,230 +1,188 @@
 # llm-router

-`llm-router`
+`llm-router` exposes unified API endpoint for multiple AI providers and models.

-
-
-
-
-
+## Main feature
+
+1. Single endpoint, unified providers & models
+2. Support grouping models with rate-limit and load balancing strategy
+3. Configuration auto reload in real time, no interruption

 ## Install

 ```bash
-npm i -g @khanglvm/llm-router
+npm i -g @khanglvm/llm-router@latest
 ```

-##
+## Usage

-
-# 1) Open config TUI (default behavior) to manage providers, models, fallbacks, and auth
-llm-router
+Copy/paste this short instruction to your AI agent:

-
-llm-router
+```text
+Run `llm-router ai-help` first, then set up and operate llm-router for me using CLI commands.
 ```

-
-- Unified (Auto transform): `http://127.0.0.1:8787/route` (or `/` and `/v1`)
-- Anthropic: `http://127.0.0.1:8787/anthropic`
-- OpenAI: `http://127.0.0.1:8787/openai`
+## Main Workflow

-
+1. Add Providers + models into llm-router
+2. Optionally, group models as alias with load balancing and auto fallback support
+3. Start llm-router server, point your coding tool API and model to llm-router

-
-# Your AI Agent can help! Ask them to manage api router via this tool for you.
-
-# 1) Add provider + models + provider API key. You can ask your AI agent to do it for you, or manually via TUI or command line:
-llm-router config \
-  --operation=upsert-provider \
-  --provider-id=openrouter \
-  --name="OpenRouter" \
-  --base-url=https://openrouter.ai/api/v1 \
-  --api-key=sk-or-v1-... \
-  --models=claude-3-7-sonnet,gpt-4o \
-  --format=openai \
-  --skip-probe=true
-
-# 2) (Optional) Configure model fallback order
-llm-router config \
-  --operation=set-model-fallbacks \
-  --provider-id=openrouter \
-  --model=claude-3-7-sonnet \
-  --fallback-models=openrouter/gpt-4o
-
-# 3) Set master key (this is your gateway key for client apps)
-llm-router config --operation=set-master-key --master-key=gw_your_gateway_key
-
-# 4) Start gateway with auth required
-llm-router start --require-auth=true
-```
+## What Each Term Means

-
+### Provider
+The service endpoint you call (OpenRouter, Anthropic, etc.).

-
-
-  "env": {
-    "ANTHROPIC_BASE_URL": "http://127.0.0.1:8787/anthropic",
-    "ANTHROPIC_AUTH_TOKEN": "gw_your_gateway_key"
-  }
-}
-```
-
-## Smart Fallback Behavior
+### Model
+The actual model ID from that provider.

-
-
-
--
--
-- Policy/moderation blocks: no retry; cross-provider fallback is disabled by default (`LLM_ROUTER_ALLOW_POLICY_FALLBACK=false`).
-- Invalid client requests (`400`, `413`, `422`): no retry and no fallback short-circuit.
+### Rate-Limit Bucket
+A request cap for a time window.
+Examples:
+- `40 requests / minute`
+- `20,000 requests / month`

-
+### Model Load Balancer
+Decides how traffic is distributed across models in an alias group.

-
-
-
-
-
-
-llm-router deploy
-llm-router worker-key
-```
+Available strategies:
+- `auto` (recommended)
+- `ordered`
+- `round-robin`
+- `weighted-rr`
+- `quota-aware-weighted-rr`

-
+### Model Alias (Group models)
+A single model name that auto route/rotate across multiple models.

-
-
-
-
-
-  --base-url=https://openrouter.ai/api/v1 \
-  --api-key=sk-or-v1-... \
-  --models=gpt-4o,claude-3-7-sonnet \
-  --format=openai \
-  --skip-probe=true
-```
+Example:
+- alias: `opus`
+- targets:
+  - `openrouter/claude-opus-4.6`
+  - `anthropic/claude-opus-4.6`

-
+Your app can use `opus` model and `llm-router` chooses target models based on your routing settings.

-
-llm-router config --operation=set-master-key --master-key=your_local_key
-# or generate a strong key automatically
-llm-router config --operation=set-master-key --generate-master-key=true
-```
+## Setup using Terminal User Interface (TUI)

-
+Open the TUI:

 ```bash
-llm-router
+llm-router
 ```

-
-
-
-
-
+Then follow this order.
+
+### 1) Add Provider
+Flow:
+1. `Config manager`
+2. `Add/Edit provider`
+3. Enter provider name, endpoint, API key
+4. Enter model list
+5. Save
+
+### 2) Configure Model Fallback (Optional)
+Flow:
+1. `Config manager`
+2. `Set model silent-fallbacks`
+3. Pick main model
+4. Pick fallback models
+5. Save
+
+### 3) Configure Rate Limits (Optional)
+Flow:
+1. `Config manager`
+2. `Manage provider rate-limit buckets`
+3. `Create bucket(s)`
+4. Set name, model scope, request cap, time window
+5. Save
+
+### 4) Group Models With Alias (Recommended)
+Flow:
+1. `Config manager`
+2. `Add/Edit model alias`
+3. Set alias ID (example: `chat.default`)
+4. Select target models
+5. Save
+
+### 5) Configure Model Load Balancer
+Flow:
+1. `Config manager`
+2. `Add/Edit model alias`
+3. Open the alias you want to balance
+4. Choose strategy (`auto` recommended)
+5. Review alias targets
+6. Save
+
+### 6) Set Gateway Key
+Flow:
+1. `Config manager`
+2. `Set worker master key`
+3. Set or generate key
+4. Save
````
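The rate-limit buckets this README configures are fixed-window request caps ("40 requests / minute" style). As a mental model only — llm-router's real accounting lives in its runtime state store, and this sketch is not its implementation — a fixed-window counter behaves like this:

```bash
#!/usr/bin/env bash
# Toy fixed-window rate-limit bucket: a tiny cap so the demo is short.
limit=3          # requests allowed per window
window_secs=60   # window length
window_start=0
count=0

allow_request() {
  local now=$1
  # Start a fresh window once the old one has fully elapsed.
  if (( now - window_start >= window_secs )); then
    window_start=$now
    count=0
  fi
  if (( count < limit )); then
    count=$((count + 1))
    echo allowed
  else
    echo blocked
  fi
}

allow_request 0    # allowed
allow_request 1    # allowed
allow_request 2    # allowed
allow_request 3    # blocked: cap of 3 reached inside this window
allow_request 61   # allowed: a new window has started
```

A real bucket also has to persist counters across restarts, which is what the new `state-store.*` runtime files in this diff suggest.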
package/README.md (continued)

````diff
+
+## Start Local Server

 ```bash
-llm-router
+llm-router start
 ```

-
-
-
+Local endpoints:
+- Unified: `http://127.0.0.1:8787/route`
+- Anthropic-style: `http://127.0.0.1:8787/anthropic`
+- OpenAI-style: `http://127.0.0.1:8787/openai`

-
-- `CLOUDFLARE_ACCOUNT_ID=<id>` or
-- `llm-router deploy --account-id=<id>`
+## Connect your coding tool

-
+After setting master key, point your app/agent to local endpoint and use that key as auth token.

-
-- Create a DNS record in Cloudflare for `llm` (usually `CNAME llm -> @`)
-- Set **Proxy status = Proxied** (orange cloud)
-- Use route target `--route-pattern=llm.example.com/* --zone-name=example.com`
-- Claude Code base URL should be `https://llm.example.com/anthropic` (**no `:8787`**; that port is local-only)
+Claude Code example (`~/.claude/settings.local.json`):

-```
-
-
-
+```json
+{
+  "env": {
+    "ANTHROPIC_BASE_URL": "http://127.0.0.1:8787",
+    "ANTHROPIC_AUTH_TOKEN": "gw_your_gateway_key",
+    "ANTHROPIC_DEFAULT_OPUS_MODEL": "provider_name/model_name_1",
+    "ANTHROPIC_DEFAULT_SONNET_MODEL": "provider_name/model_name_2",
+    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "provider_name/model_name_3"
+  }
+}
 ```

-
+## Real-Time Update Experience

-
-
-
-
-
-
-If you intentionally need to bypass weak-key checks (not recommended), add `--allow-weak-master-key=true` to `deploy` or `worker-key`.
+When local server is running:
+- open `llm-router`
+- change provider/model/load-balancer/rate-limit/alias in TUI
+- save
+- the running proxy updates instantly

-
+No stop/start cycle needed.

-##
+## Cloudflare Worker (Hosted)

-
-- `LLM_ROUTER_CONFIG_JSON`
-- `LLM_ROUTER_MASTER_KEY` (optional override)
+Use when you want a hosted endpoint instead of local server.

-
-- `ROUTE_CONFIG_JSON`
-- `LLM_ROUTER_JSON`
+Guided deploy:

-
--
-
-- `LLM_ROUTER_ORIGIN_RETRY_MAX_DELAY_MS` (default `3000`)
-- `LLM_ROUTER_ORIGIN_FALLBACK_COOLDOWN_MS` (default `45000`)
-- `LLM_ROUTER_ORIGIN_RATE_LIMIT_COOLDOWN_MS` (default `30000`)
-- `LLM_ROUTER_ORIGIN_BILLING_COOLDOWN_MS` (default `900000`)
-- `LLM_ROUTER_ORIGIN_AUTH_COOLDOWN_MS` (default `600000`)
-- `LLM_ROUTER_ORIGIN_POLICY_COOLDOWN_MS` (default `120000`)
-- `LLM_ROUTER_ALLOW_POLICY_FALLBACK` (default `false`)
-- `LLM_ROUTER_FALLBACK_CIRCUIT_FAILURES` (default `2`)
-- `LLM_ROUTER_FALLBACK_CIRCUIT_COOLDOWN_MS` (default `30000`)
-- `LLM_ROUTER_MAX_REQUEST_BODY_BYTES` (default `1048576`, min `4096`, max `20971520`)
-- `LLM_ROUTER_UPSTREAM_TIMEOUT_MS` (default `60000`, min `1000`, max `300000`)
+```bash
+llm-router deploy
+```

-
-- By default, cross-origin browser reads are denied unless explicitly allow-listed.
-- `LLM_ROUTER_CORS_ALLOWED_ORIGINS` (comma-separated exact origins, e.g. `https://app.example.com`)
-- `LLM_ROUTER_CORS_ALLOW_ALL=true` (allows any origin; not recommended for production)
+You will be guided in TUI to select account and deploy target.

-
-- `LLM_ROUTER_ALLOWED_IPS` (comma-separated client IPs; denies requests from all other IPs)
-- `LLM_ROUTER_IP_ALLOWLIST` (alias of `LLM_ROUTER_ALLOWED_IPS`)
+## Config File Location

-
+Local config file:

 `~/.llm-router.json`

-
-
-```json
-{
-  "masterKey": "local_or_worker_key",
-  "defaultModel": "openrouter/gpt-4o",
-  "providers": [
-    {
-      "id": "openrouter",
-      "name": "OpenRouter",
-      "baseUrl": "https://openrouter.ai/api/v1",
-      "apiKey": "sk-or-v1-...",
-      "formats": ["openai"],
-      "models": [{ "id": "gpt-4o" }]
-    }
-  ]
-}
-```
+## Security

-
+See [`SECURITY.md`](./SECURITY.md).

-
-npm run test:provider-smoke
-```
+## Versioning

-
+- Semver: [Semantic Versioning](https://semver.org/)
+- Release notes: [`CHANGELOG.md`](./CHANGELOG.md)
````
package/SECURITY.md
ADDED

````markdown
# Security Guide

This guide focuses on preventing unauthorized access to costly LLM resources, especially in Cloudflare Worker deployments.

## Quick Hardened Setup

1. Generate and set a strong gateway key locally:

```bash
llm-router config --operation=set-master-key --generate-master-key=true
```

2. Deploy with worker defaults already set in this repo:
   - `workers_dev = false`
   - `preview_urls = false`

3. Deploy config + secrets:

```bash
llm-router deploy --env=production
```

4. Restrict who can call the router:
   - Set `LLM_ROUTER_ALLOWED_IPS` (or `LLM_ROUTER_IP_ALLOWLIST`) to trusted source IPs.
   - Set `LLM_ROUTER_CORS_ALLOWED_ORIGINS` to explicit browser origins.
   - Keep `LLM_ROUTER_CORS_ALLOW_ALL` disabled in production.

5. Expose only a custom domain route (not `workers.dev`):

```toml
[env.production]
routes = [{ pattern = "api.example.com/*", zone_name = "example.com" }]
```

## Quick Master Key Generation

Use generated keys instead of hand-written keys:

```bash
# Local config master key
llm-router config --operation=set-master-key --generate-master-key=true

# Rotate Cloudflare worker key directly
llm-router worker-key --env=production --generate-master-key=true
```

Optional tuning:

```bash
llm-router worker-key \
  --env=production \
  --generate-master-key=true \
  --master-key-length=64 \
  --master-key-prefix=gw_
```

## Cloudflare Access (Recommended)

Protect the worker behind Cloudflare Access so clients must present a service token before hitting the router.

Suggested setup:
1. Zero Trust -> Access -> Applications -> Add application.
2. Type: Self-hosted.
3. Domain: your API hostname (for example `api.example.com`).
4. Policy: allow only a Service Token for machine-to-machine traffic.

Client calls should include:
- `CF-Access-Client-Id`
- `CF-Access-Client-Secret`

Reference:
- [Cloudflare Access service tokens](https://developers.cloudflare.com/cloudflare-one/identity/service-tokens/)

## WAF and Rate Limiting

Use WAF custom rules and rate limiting to reduce abuse blast radius.

Suggested custom rule expressions (adapt host/path to your deployment):

1. Block non-allowlisted source IPs to route endpoint:

```txt
http.host eq "api.example.com" and starts_with(http.request.uri.path, "/route") and not ip.src in $llm_router_allowed_ips
```

2. Block unexpected methods on route endpoint:

```txt
http.host eq "api.example.com" and starts_with(http.request.uri.path, "/route") and not http.request.method in {"POST" "OPTIONS"}
```

Suggested rate limit rule:
- Match expression:

```txt
http.host eq "api.example.com" and starts_with(http.request.uri.path, "/route")
```

- Threshold example:
  - 60 requests / 1 minute per source IP (tighten or loosen by workload).
  - Action: Block or Managed Challenge.

References:
- [Cloudflare WAF custom rules](https://developers.cloudflare.com/waf/custom-rules/)
- [Cloudflare WAF rate limiting rules](https://developers.cloudflare.com/waf/rate-limiting-rules/)

## Incident Response: Master Key Leak

1. Rotate worker key immediately:

```bash
llm-router worker-key --env=production --generate-master-key=true
```

2. Rotate local config key (if reused anywhere):

```bash
llm-router config --operation=set-master-key --generate-master-key=true
```

3. Revoke exposed credentials and rotate provider API keys.
4. Review Cloudflare logs/WAF events for abuse window.
5. Tighten Access policy, IP allowlist, and rate limits before reopening traffic.

## Router Runtime Hardening Knobs

- `LLM_ROUTER_MAX_REQUEST_BODY_BYTES`
- `LLM_ROUTER_UPSTREAM_TIMEOUT_MS`
- `LLM_ROUTER_ALLOWED_IPS` / `LLM_ROUTER_IP_ALLOWLIST`
- `LLM_ROUTER_CORS_ALLOWED_ORIGINS`
- `LLM_ROUTER_CORS_ALLOW_ALL` (keep `false` in production)

## Official References

- [Workers Secrets](https://developers.cloudflare.com/workers/configuration/secrets/)
- [Wrangler configuration](https://developers.cloudflare.com/workers/wrangler/configuration/)
- [workers.dev routing controls](https://developers.cloudflare.com/workers/configuration/routing/workers-dev/)
- [Preview URLs](https://developers.cloudflare.com/changelog/2024-03-14-preview-urls/)
- [Cloudflare Access service tokens](https://developers.cloudflare.com/cloudflare-one/identity/service-tokens/)
- [WAF custom rules](https://developers.cloudflare.com/waf/custom-rules/)
- [WAF rate limiting](https://developers.cloudflare.com/waf/rate-limiting-rules/)
- [API Shield sequence mitigation](https://developers.cloudflare.com/api-shield/security/sequence-mitigation/)
````
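The security guide's `--generate-master-key=true` is the supported way to mint keys. Purely for illustration of what "strong" means here — this is not what the CLI does internally — a key matching the `gw_` prefix and 64-hex-char length from the `worker-key` flags above can be built from the kernel's entropy source:

```bash
#!/usr/bin/env bash
# Illustrative only: 32 random bytes -> 64 hex chars, plus the gw_ prefix.
# Prefer the CLI's own generator for real keys.
key="gw_$(od -An -N32 -tx1 /dev/urandom | tr -d ' \n')"
echo "${#key}"   # 67 characters: "gw_" plus 64 hex chars
```

32 bytes of entropy (256 bits) is far beyond brute-force range, which is the property the weak-key checks mentioned in the old README are guarding.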
package/package.json
CHANGED

```diff
@@ -1,7 +1,19 @@
 {
   "name": "@khanglvm/llm-router",
-  "version": "1.0.5",
+  "version": "1.0.8",
   "description": "Single gateway endpoint for multi-provider LLMs with unified OpenAI+Anthropic format and seamless fallback",
+  "keywords": [
+    "llm-router",
+    "llm-gateway",
+    "ai-proxy",
+    "openai-compatible",
+    "anthropic-compatible",
+    "model-routing",
+    "fallback",
+    "load-balancing",
+    "cloudflare-workers",
+    "agent-infra"
+  ],
   "type": "module",
   "main": "src/index.js",
   "bin": {
@@ -18,9 +30,21 @@
     "test:provider-smoke": "node ./scripts/provider-smoke-suite.mjs"
   },
   "dependencies": {
-    "@levu/snap": "^0.3.
+    "@levu/snap": "^0.3.11"
   },
   "devDependencies": {
     "wrangler": "^4.68.1"
-  }
+  },
+  "publishConfig": {
+    "access": "public"
+  },
+  "files": [
+    "src/**/*.js",
+    "!src/**/*.test.js",
+    "!src/**/*.spec.js",
+    "README.md",
+    "SECURITY.md",
+    "CHANGELOG.md",
+    "wrangler.toml"
+  ]
 }
```