@khanglvm/llm-router 1.0.5 → 1.0.8
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +60 -0
- package/README.md +134 -176
- package/SECURITY.md +142 -0
- package/package.json +27 -3
- package/src/cli/router-module.js +2448 -301
- package/src/index.js +2 -2
- package/src/node/config-store.js +74 -6
- package/src/node/local-server.js +9 -3
- package/src/node/provider-probe.js +354 -97
- package/src/runtime/balancer.js +310 -0
- package/src/runtime/config.js +895 -45
- package/src/runtime/handler/cache-mapping.js +306 -0
- package/src/runtime/handler/config-loading.js +4 -1
- package/src/runtime/handler/fallback.js +10 -0
- package/src/runtime/handler/provider-call.js +40 -2
- package/src/runtime/handler/reasoning-effort.js +313 -0
- package/src/runtime/handler.js +414 -44
- package/src/runtime/rate-limits.js +317 -0
- package/src/runtime/state-store.file.js +335 -0
- package/src/runtime/state-store.js +74 -0
- package/src/runtime/state-store.memory.js +180 -0
- package/src/translator/request/claude-to-openai.js +86 -25
- package/src/translator/request/openai-to-claude.js +87 -13
- package/.env.test-suite.example +0 -19
package/CHANGELOG.md
ADDED

```markdown
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.0.8] - 2026-02-28

### Changed
- Added focused npm `keywords` metadata in `package.json` to improve package discoverability.

## [1.0.7] - 2026-02-28

### Added
- Added `llm-router ai-help` to generate an agent-oriented operating guide with live gateway checks and coding-tool patch instructions.
- Added tests covering `ai-help` discovery output and first-run setup guidance.

### Changed
- Rewrote `README.md` into a shorter setup and operations guide focused on providers, aliases, rate limits, and local/hosted usage.

## [1.0.6] - 2026-02-28

### Added
- Added a formal changelog for tracked, versioned releases.
- Added npm package publish metadata to keep public publish defaults explicit.

### Changed
- Added an explicit package `files` whitelist so npm publishes are predictable.
- Updated release workflow docs in `README.md` to require changelog + version updates before publish.

## [1.0.5] - 2026-02-27

### Fixed
- Hardened release surface and added `.npmignore` coverage for safer package publishes.

## [1.0.4] - 2026-02-26

### Changed
- Refined README guidance for routing and deployment usage.

## [1.0.3] - 2026-02-26

### Changed
- Simplified project positioning and gateway copy in docs.

## [1.0.2] - 2026-02-26

### Changed
- Documented smart fallback behavior and operational expectations.

## [1.0.1] - 2026-02-25

### Changed
- Improved fallback strategy behavior and released patch update.

## [1.0.0] - 2026-02-25

### Added
- Initial `llm-router` route release with local + Cloudflare Worker gateway flows.
```
package/README.md
CHANGED

````diff
@@ -1,230 +1,188 @@
 # llm-router

-`llm-router`
+`llm-router` exposes unified API endpoint for multiple AI providers and models.

-
-
-
-
-
+## Main feature
+
+1. Single endpoint, unified providers & models
+2. Support grouping models with rate-limit and load balancing strategy
+3. Configuration auto reload in real time, no interruption

 ## Install

 ```bash
-npm i -g @khanglvm/llm-router
+npm i -g @khanglvm/llm-router@latest
 ```

-##
+## Usage

-
-# 1) Open config TUI (default behavior) to manage providers, models, fallbacks, and auth
-llm-router
+Copy/paste this short instruction to your AI agent:

-
-llm-router
+```text
+Run `llm-router ai-help` first, then set up and operate llm-router for me using CLI commands.
 ```

-
-- Unified (Auto transform): `http://127.0.0.1:8787/route` (or `/` and `/v1`)
-- Anthropic: `http://127.0.0.1:8787/anthropic`
-- OpenAI: `http://127.0.0.1:8787/openai`
+## Main Workflow

-
+1. Add Providers + models into llm-router
+2. Optionally, group models as alias with load balancing and auto fallback support
+3. Start llm-router server, point your coding tool API and model to llm-router

-
-# Your AI Agent can help! Ask them to manage api router via this tool for you.
-
-# 1) Add provider + models + provider API key. You can ask your AI agent to do it for you, or manually via TUI or command line:
-llm-router config \
-  --operation=upsert-provider \
-  --provider-id=openrouter \
-  --name="OpenRouter" \
-  --base-url=https://openrouter.ai/api/v1 \
-  --api-key=sk-or-v1-... \
-  --models=claude-3-7-sonnet,gpt-4o \
-  --format=openai \
-  --skip-probe=true
-
-# 2) (Optional) Configure model fallback order
-llm-router config \
-  --operation=set-model-fallbacks \
-  --provider-id=openrouter \
-  --model=claude-3-7-sonnet \
-  --fallback-models=openrouter/gpt-4o
-
-# 3) Set master key (this is your gateway key for client apps)
-llm-router config --operation=set-master-key --master-key=gw_your_gateway_key
-
-# 4) Start gateway with auth required
-llm-router start --require-auth=true
-```
+## What Each Term Means

-
+### Provider
+The service endpoint you call (OpenRouter, Anthropic, etc.).

-
-
-  "env": {
-    "ANTHROPIC_BASE_URL": "http://127.0.0.1:8787/anthropic",
-    "ANTHROPIC_AUTH_TOKEN": "gw_your_gateway_key"
-  }
-}
-```
-
-## Smart Fallback Behavior
+### Model
+The actual model ID from that provider.

-
-
-
--
--
-- Policy/moderation blocks: no retry; cross-provider fallback is disabled by default (`LLM_ROUTER_ALLOW_POLICY_FALLBACK=false`).
-- Invalid client requests (`400`, `413`, `422`): no retry and no fallback short-circuit.
+### Rate-Limit Bucket
+A request cap for a time window.
+Examples:
+- `40 requests / minute`
+- `20,000 requests / month`

-
+### Model Load Balancer
+Decides how traffic is distributed across models in an alias group.

-
-
-
-
-
-
-llm-router deploy
-llm-router worker-key
-```
+Available strategies:
+- `auto` (recommended)
+- `ordered`
+- `round-robin`
+- `weighted-rr`
+- `quota-aware-weighted-rr`

-
+### Model Alias (Group models)
+A single model name that auto route/rotate across multiple models.

-
-
-
-
-
-  --base-url=https://openrouter.ai/api/v1 \
-  --api-key=sk-or-v1-... \
-  --models=gpt-4o,claude-3-7-sonnet \
-  --format=openai \
-  --skip-probe=true
-```
+Example:
+- alias: `opus`
+- targets:
+  - `openrouter/claude-opus-4.6`
+  - `anthropic/claude-opus-4.6`

-
+Your app can use `opus` model and `llm-router` chooses target models based on your routing settings.

-
-llm-router config --operation=set-master-key --master-key=your_local_key
-# or generate a strong key automatically
-llm-router config --operation=set-master-key --generate-master-key=true
-```
+## Setup using Terminal User Interface (TUI)

-
+Open the TUI:

 ```bash
-llm-router
+llm-router
 ```

-
-
-
-
-
+Then follow this order.
+
+### 1) Add Provider
+Flow:
+1. `Config manager`
+2. `Add/Edit provider`
+3. Enter provider name, endpoint, API key
+4. Enter model list
+5. Save
+
+### 2) Configure Model Fallback (Optional)
+Flow:
+1. `Config manager`
+2. `Set model silent-fallbacks`
+3. Pick main model
+4. Pick fallback models
+5. Save
+
+### 3) Configure Rate Limits (Optional)
+Flow:
+1. `Config manager`
+2. `Manage provider rate-limit buckets`
+3. `Create bucket(s)`
+4. Set name, model scope, request cap, time window
+5. Save
+
+### 4) Group Models With Alias (Recommended)
+Flow:
+1. `Config manager`
+2. `Add/Edit model alias`
+3. Set alias ID (example: `chat.default`)
+4. Select target models
+5. Save
+
+### 5) Configure Model Load Balancer
+Flow:
+1. `Config manager`
+2. `Add/Edit model alias`
+3. Open the alias you want to balance
+4. Choose strategy (`auto` recommended)
+5. Review alias targets
+6. Save
+
+### 6) Set Gateway Key
+Flow:
+1. `Config manager`
+2. `Set worker master key`
+3. Set or generate key
+4. Save
````
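The rate-limit buckets this README configures are fixed-window request caps ("40 requests / minute" style). As a mental model only — llm-router's real accounting lives in its runtime state store, and this sketch is not its implementation — a fixed-window counter behaves like this:

```bash
#!/usr/bin/env bash
# Toy fixed-window rate-limit bucket: a tiny cap so the demo is short.
limit=3          # requests allowed per window
window_secs=60   # window length
window_start=0
count=0

allow_request() {
  local now=$1
  # Start a fresh window once the old one has fully elapsed.
  if (( now - window_start >= window_secs )); then
    window_start=$now
    count=0
  fi
  if (( count < limit )); then
    count=$((count + 1))
    echo allowed
  else
    echo blocked
  fi
}

allow_request 0    # allowed
allow_request 1    # allowed
allow_request 2    # allowed
allow_request 3    # blocked: cap of 3 reached inside this window
allow_request 61   # allowed: a new window has started
```

A real bucket also has to persist counters across restarts, which is what the new `state-store.*` runtime files in this diff suggest.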
package/README.md (continued)

````diff
+
+## Start Local Server

 ```bash
-llm-router
+llm-router start
 ```

-
-
-
+Local endpoints:
+- Unified: `http://127.0.0.1:8787/route`
+- Anthropic-style: `http://127.0.0.1:8787/anthropic`
+- OpenAI-style: `http://127.0.0.1:8787/openai`

-
-- `CLOUDFLARE_ACCOUNT_ID=<id>` or
-- `llm-router deploy --account-id=<id>`
+## Connect your coding tool

-
+After setting master key, point your app/agent to local endpoint and use that key as auth token.

-
-- Create a DNS record in Cloudflare for `llm` (usually `CNAME llm -> @`)
-- Set **Proxy status = Proxied** (orange cloud)
-- Use route target `--route-pattern=llm.example.com/* --zone-name=example.com`
-- Claude Code base URL should be `https://llm.example.com/anthropic` (**no `:8787`**; that port is local-only)
+Claude Code example (`~/.claude/settings.local.json`):

-```
-
-
-
+```json
+{
+  "env": {
+    "ANTHROPIC_BASE_URL": "http://127.0.0.1:8787",
+    "ANTHROPIC_AUTH_TOKEN": "gw_your_gateway_key",
+    "ANTHROPIC_DEFAULT_OPUS_MODEL": "provider_name/model_name_1",
+    "ANTHROPIC_DEFAULT_SONNET_MODEL": "provider_name/model_name_2",
+    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "provider_name/model_name_3"
+  }
+}
 ```

-
+## Real-Time Update Experience

-
-
-
-
-
-
-If you intentionally need to bypass weak-key checks (not recommended), add `--allow-weak-master-key=true` to `deploy` or `worker-key`.
+When local server is running:
+- open `llm-router`
+- change provider/model/load-balancer/rate-limit/alias in TUI
+- save
+- the running proxy updates instantly

-
+No stop/start cycle needed.

-##
+## Cloudflare Worker (Hosted)

-
-- `LLM_ROUTER_CONFIG_JSON`
-- `LLM_ROUTER_MASTER_KEY` (optional override)
+Use when you want a hosted endpoint instead of local server.

-
-- `ROUTE_CONFIG_JSON`
-- `LLM_ROUTER_JSON`
+Guided deploy:

-
--
-
-- `LLM_ROUTER_ORIGIN_RETRY_MAX_DELAY_MS` (default `3000`)
-- `LLM_ROUTER_ORIGIN_FALLBACK_COOLDOWN_MS` (default `45000`)
-- `LLM_ROUTER_ORIGIN_RATE_LIMIT_COOLDOWN_MS` (default `30000`)
-- `LLM_ROUTER_ORIGIN_BILLING_COOLDOWN_MS` (default `900000`)
-- `LLM_ROUTER_ORIGIN_AUTH_COOLDOWN_MS` (default `600000`)
-- `LLM_ROUTER_ORIGIN_POLICY_COOLDOWN_MS` (default `120000`)
-- `LLM_ROUTER_ALLOW_POLICY_FALLBACK` (default `false`)
-- `LLM_ROUTER_FALLBACK_CIRCUIT_FAILURES` (default `2`)
-- `LLM_ROUTER_FALLBACK_CIRCUIT_COOLDOWN_MS` (default `30000`)
-- `LLM_ROUTER_MAX_REQUEST_BODY_BYTES` (default `1048576`, min `4096`, max `20971520`)
-- `LLM_ROUTER_UPSTREAM_TIMEOUT_MS` (default `60000`, min `1000`, max `300000`)
+```bash
+llm-router deploy
+```

-
-- By default, cross-origin browser reads are denied unless explicitly allow-listed.
-- `LLM_ROUTER_CORS_ALLOWED_ORIGINS` (comma-separated exact origins, e.g. `https://app.example.com`)
-- `LLM_ROUTER_CORS_ALLOW_ALL=true` (allows any origin; not recommended for production)
+You will be guided in TUI to select account and deploy target.

-
-- `LLM_ROUTER_ALLOWED_IPS` (comma-separated client IPs; denies requests from all other IPs)
-- `LLM_ROUTER_IP_ALLOWLIST` (alias of `LLM_ROUTER_ALLOWED_IPS`)
+## Config File Location

-
+Local config file:

 `~/.llm-router.json`

-
-
-```json
-{
-  "masterKey": "local_or_worker_key",
-  "defaultModel": "openrouter/gpt-4o",
-  "providers": [
-    {
-      "id": "openrouter",
-      "name": "OpenRouter",
-      "baseUrl": "https://openrouter.ai/api/v1",
-      "apiKey": "sk-or-v1-...",
-      "formats": ["openai"],
-      "models": [{ "id": "gpt-4o" }]
-    }
-  ]
-}
-```
+## Security

-
+See [`SECURITY.md`](./SECURITY.md).

-
-npm run test:provider-smoke
-```
+## Versioning

-
+- Semver: [Semantic Versioning](https://semver.org/)
+- Release notes: [`CHANGELOG.md`](./CHANGELOG.md)
````
package/SECURITY.md
ADDED

````markdown
# Security Guide

This guide focuses on preventing unauthorized access to costly LLM resources, especially in Cloudflare Worker deployments.

## Quick Hardened Setup

1. Generate and set a strong gateway key locally:

```bash
llm-router config --operation=set-master-key --generate-master-key=true
```

2. Deploy with worker defaults already set in this repo:
   - `workers_dev = false`
   - `preview_urls = false`

3. Deploy config + secrets:

```bash
llm-router deploy --env=production
```

4. Restrict who can call the router:
   - Set `LLM_ROUTER_ALLOWED_IPS` (or `LLM_ROUTER_IP_ALLOWLIST`) to trusted source IPs.
   - Set `LLM_ROUTER_CORS_ALLOWED_ORIGINS` to explicit browser origins.
   - Keep `LLM_ROUTER_CORS_ALLOW_ALL` disabled in production.

5. Expose only a custom domain route (not `workers.dev`):

```toml
[env.production]
routes = [{ pattern = "api.example.com/*", zone_name = "example.com" }]
```

## Quick Master Key Generation

Use generated keys instead of hand-written keys:

```bash
# Local config master key
llm-router config --operation=set-master-key --generate-master-key=true

# Rotate Cloudflare worker key directly
llm-router worker-key --env=production --generate-master-key=true
```

Optional tuning:

```bash
llm-router worker-key \
  --env=production \
  --generate-master-key=true \
  --master-key-length=64 \
  --master-key-prefix=gw_
```

## Cloudflare Access (Recommended)

Protect the worker behind Cloudflare Access so clients must present a service token before hitting the router.

Suggested setup:
1. Zero Trust -> Access -> Applications -> Add application.
2. Type: Self-hosted.
3. Domain: your API hostname (for example `api.example.com`).
4. Policy: allow only a Service Token for machine-to-machine traffic.

Client calls should include:
- `CF-Access-Client-Id`
- `CF-Access-Client-Secret`

Reference:
- [Cloudflare Access service tokens](https://developers.cloudflare.com/cloudflare-one/identity/service-tokens/)

## WAF and Rate Limiting

Use WAF custom rules and rate limiting to reduce abuse blast radius.

Suggested custom rule expressions (adapt host/path to your deployment):

1. Block non-allowlisted source IPs to route endpoint:

```txt
http.host eq "api.example.com" and starts_with(http.request.uri.path, "/route") and not ip.src in $llm_router_allowed_ips
```

2. Block unexpected methods on route endpoint:

```txt
http.host eq "api.example.com" and starts_with(http.request.uri.path, "/route") and not http.request.method in {"POST" "OPTIONS"}
```

Suggested rate limit rule:
- Match expression:

```txt
http.host eq "api.example.com" and starts_with(http.request.uri.path, "/route")
```

- Threshold example:
  - 60 requests / 1 minute per source IP (tighten or loosen by workload).
  - Action: Block or Managed Challenge.

References:
- [Cloudflare WAF custom rules](https://developers.cloudflare.com/waf/custom-rules/)
- [Cloudflare WAF rate limiting rules](https://developers.cloudflare.com/waf/rate-limiting-rules/)

## Incident Response: Master Key Leak

1. Rotate worker key immediately:

```bash
llm-router worker-key --env=production --generate-master-key=true
```

2. Rotate local config key (if reused anywhere):

```bash
llm-router config --operation=set-master-key --generate-master-key=true
```

3. Revoke exposed credentials and rotate provider API keys.
4. Review Cloudflare logs/WAF events for abuse window.
5. Tighten Access policy, IP allowlist, and rate limits before reopening traffic.

## Router Runtime Hardening Knobs

- `LLM_ROUTER_MAX_REQUEST_BODY_BYTES`
- `LLM_ROUTER_UPSTREAM_TIMEOUT_MS`
- `LLM_ROUTER_ALLOWED_IPS` / `LLM_ROUTER_IP_ALLOWLIST`
- `LLM_ROUTER_CORS_ALLOWED_ORIGINS`
- `LLM_ROUTER_CORS_ALLOW_ALL` (keep `false` in production)

## Official References

- [Workers Secrets](https://developers.cloudflare.com/workers/configuration/secrets/)
- [Wrangler configuration](https://developers.cloudflare.com/workers/wrangler/configuration/)
- [workers.dev routing controls](https://developers.cloudflare.com/workers/configuration/routing/workers-dev/)
- [Preview URLs](https://developers.cloudflare.com/changelog/2024-03-14-preview-urls/)
- [Cloudflare Access service tokens](https://developers.cloudflare.com/cloudflare-one/identity/service-tokens/)
- [WAF custom rules](https://developers.cloudflare.com/waf/custom-rules/)
- [WAF rate limiting](https://developers.cloudflare.com/waf/rate-limiting-rules/)
- [API Shield sequence mitigation](https://developers.cloudflare.com/api-shield/security/sequence-mitigation/)
````
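The security guide's `--generate-master-key=true` is the supported way to mint keys. Purely for illustration of what "strong" means here — this is not what the CLI does internally — a key matching the `gw_` prefix and 64-hex-char length from the `worker-key` flags above can be built from the kernel's entropy source:

```bash
#!/usr/bin/env bash
# Illustrative only: 32 random bytes -> 64 hex chars, plus the gw_ prefix.
# Prefer the CLI's own generator for real keys.
key="gw_$(od -An -N32 -tx1 /dev/urandom | tr -d ' \n')"
echo "${#key}"   # 67 characters: "gw_" plus 64 hex chars
```

32 bytes of entropy (256 bits) is far beyond brute-force range, which is the property the weak-key checks mentioned in the old README are guarding.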
package/package.json
CHANGED

```diff
@@ -1,7 +1,19 @@
 {
   "name": "@khanglvm/llm-router",
-  "version": "1.0.5",
+  "version": "1.0.8",
   "description": "Single gateway endpoint for multi-provider LLMs with unified OpenAI+Anthropic format and seamless fallback",
+  "keywords": [
+    "llm-router",
+    "llm-gateway",
+    "ai-proxy",
+    "openai-compatible",
+    "anthropic-compatible",
+    "model-routing",
+    "fallback",
+    "load-balancing",
+    "cloudflare-workers",
+    "agent-infra"
+  ],
   "type": "module",
   "main": "src/index.js",
   "bin": {
@@ -18,9 +30,21 @@
     "test:provider-smoke": "node ./scripts/provider-smoke-suite.mjs"
   },
   "dependencies": {
-    "@levu/snap": "^0.3.
+    "@levu/snap": "^0.3.11"
   },
   "devDependencies": {
     "wrangler": "^4.68.1"
-  }
+  },
+  "publishConfig": {
+    "access": "public"
+  },
+  "files": [
+    "src/**/*.js",
+    "!src/**/*.test.js",
+    "!src/**/*.spec.js",
+    "README.md",
+    "SECURITY.md",
+    "CHANGELOG.md",
+    "wrangler.toml"
+  ]
 }
```