@khanglvm/llm-router 1.0.6 → 1.0.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -5,6 +5,32 @@ All notable changes to this project will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [1.0.9] - 2026-03-03
9
+
10
+ ### Added
11
+ - Added dedicated modules for Cloudflare API preflight checks and Wrangler TOML target handling.
12
+ - Added runtime policy and route-debug helpers so stateful routing can be safely disabled by default on Cloudflare Workers.
13
+ - Added a reusable timeout-signal utility and start-command port-reclaim utilities, with test coverage.
14
+
15
+ ### Changed
16
+ - Refactored CLI deploy/runtime handler code into focused modules with cleaner boundaries.
17
+ - Updated provider-call timeout handling to support both `AbortSignal.timeout` and `AbortController` fallback.
18
+ - Documented Worker safety defaults and switched README release/security links to canonical GitHub URLs.
19
+
20
+ ## [1.0.8] - 2026-02-28
21
+
22
+ ### Changed
23
+ - Added focused npm `keywords` metadata in `package.json` to improve package discoverability.
24
+
25
+ ## [1.0.7] - 2026-02-28
26
+
27
+ ### Added
28
+ - Added `llm-router ai-help` to generate an agent-oriented operating guide with live gateway checks and coding-tool patch instructions.
29
+ - Added tests covering `ai-help` discovery output and first-run setup guidance.
30
+
31
+ ### Changed
32
+ - Rewrote `README.md` into a shorter setup and operations guide focused on providers, aliases, rate limits, and local/hosted usage.
33
+
8
34
  ## [1.0.6] - 2026-02-28
9
35
 
10
36
  ### Added
package/README.md CHANGED
@@ -1,440 +1,193 @@
1
1
  # llm-router
2
2
 
3
- `llm-router` is a gateway api proxy for accessing multiple models across any provider that supports OpenAI or Anthropic formats.
3
+ `llm-router` exposes a unified API endpoint for multiple AI providers and models.
4
4
 
5
- It supports:
6
- - local route server `llm-router start`
7
- - Cloudflare Worker route runtime deployment `llm-router deploy`
8
- - CLI + TUI management `config`, `start`, `deploy`, `worker-key`
9
- - Seamless model fallback
5
+ ## Main Features
10
6
 
11
- ## Install
12
-
13
- ```bash
14
- npm i -g @khanglvm/llm-router
15
- ```
16
-
17
- ## Versioning
7
+ 1. Single endpoint for unified providers & models
8
+ 2. Supports grouping models with rate-limit buckets and load-balancing strategies
9
+ 3. Configuration auto-reloads in real time, with no interruption
18
10
 
19
- - Follows [Semantic Versioning](https://semver.org/).
20
- - Release notes live in [`CHANGELOG.md`](./CHANGELOG.md).
21
- - npm publishes are configured for the public registry package.
22
-
23
- Release checklist:
24
- - Update `README.md` if user-facing behavior changed.
25
- - Add a dated entry in `CHANGELOG.md`.
26
- - Bump the package version before publish.
27
- - Publish with `npm publish`.
28
-
29
- ## Quick Start
11
+ ## Install
30
12
 
31
13
  ```bash
32
- # 1) Open config TUI (default behavior) to manage providers, models, fallbacks, and auth
33
- llm-router
34
-
35
- # 2) Start local route server
36
- llm-router start
14
+ npm i -g @khanglvm/llm-router@latest
37
15
  ```
38
16
 
39
- Local endpoints:
40
- - Unified (Auto transform): `http://127.0.0.1:8787/route` (or `/` and `/v1`)
41
- - Anthropic: `http://127.0.0.1:8787/anthropic`
42
- - OpenAI: `http://127.0.0.1:8787/openai`
43
-
44
- ## Usage Example
45
-
46
- ```bash
47
- # Your AI Agent can help! Ask them to manage api router via this tool for you.
48
-
49
- # 1) Add provider + models + provider API key. You can ask your AI agent to do it for you, or manually via TUI or command line:
50
- llm-router config \
51
- --operation=upsert-provider \
52
- --provider-id=openrouter \
53
- --name="OpenRouter" \
54
- --base-url=https://openrouter.ai/api/v1 \
55
- --api-key=sk-or-v1-... \
56
- --models=claude-3-7-sonnet,gpt-4o \
57
- --format=openai \
58
- --skip-probe=true
59
-
60
- # 2) (Optional) Configure model fallback order for direct provider/model requests
61
- llm-router config \
62
- --operation=set-model-fallbacks \
63
- --provider-id=openrouter \
64
- --model=claude-3-7-sonnet \
65
- --fallback-models=openrouter/gpt-4o
66
-
67
- # 3) (Optional) Create a model alias with a routing strategy and weighted targets
68
- llm-router config \
69
- --operation=upsert-model-alias \
70
- --alias-id=chat.default \
71
- --strategy=auto \
72
- --targets=openrouter/claude-3-7-sonnet@2,openrouter/gpt-4o@1 \
73
- --fallback-targets=openrouter/gpt-4o-mini
74
-
75
- # 4) (Optional) Add provider request-cap bucket (models: all)
76
- llm-router config \
77
- --operation=set-provider-rate-limits \
78
- --provider-id=openrouter \
79
- --bucket-name="Monthly cap" \
80
- --bucket-models=all \
81
- --bucket-requests=20000 \
82
- --bucket-window=month:1
83
-
84
- # 5) Set master key (this is your gateway key for client apps)
85
- llm-router config --operation=set-master-key --master-key=gw_your_gateway_key
86
-
87
- # 6) Start gateway with auth required
88
- llm-router start --require-auth=true
89
- ```
17
+ ## Usage
90
18
 
91
- Claude Code example (`~/.claude/settings.local.json`):
19
+ Copy/paste this short instruction into your AI agent:
92
20
 
93
- ```json
94
- {
95
- "env": {
96
- "ANTHROPIC_BASE_URL": "http://127.0.0.1:8787/anthropic",
97
- "ANTHROPIC_AUTH_TOKEN": "gw_your_gateway_key"
98
- }
99
- }
21
+ ```text
22
+ Run `llm-router ai-help` first, then set up and operate llm-router for me using CLI commands.
100
23
  ```
101
24
 
102
- ## Smart Fallback Behavior
103
-
104
- `llm-router` can fail over from a primary model to configured fallback models with status-aware logic:
105
- - `429` (rate-limited): immediate fallback (no origin retry), with `Retry-After` respected when present.
106
- - Temporary failures (`408`, `409`, `5xx`, network errors): origin-only bounded retries with jittered backoff, then fallback.
107
- - Billing/quota exhaustion (`402`, or provider-specific billing signals): immediate fallback with longer origin cooldown memory.
108
- - Auth and permission failures (`401` and relevant `403` cases): no retry; fallback to other providers/models when possible.
109
- - Policy/moderation blocks: no retry; cross-provider fallback is disabled by default (`LLM_ROUTER_ALLOW_POLICY_FALLBACK=false`).
110
- - Invalid client requests (`400`, `413`, `422`): no retry and no fallback short-circuit.
25
+ ## Main Workflow
111
26
 
112
- ## Model Alias Routing Strategies
27
+ 1. Add providers and models to llm-router
28
+ 2. Optionally, group models into an alias with load balancing and automatic fallback
29
+ 3. Start the llm-router server and point your coding tool's API endpoint and model at llm-router
113
30
 
114
- A model alias groups multiple models from different providers under one model name.
31
+ ## What Each Term Means
115
32
 
116
- Use `--strategy` when creating or updating a model alias:
33
+ ### Provider
34
+ The service endpoint you call (OpenRouter, Anthropic, etc.).
117
35
 
118
- - `auto`: Recommended set-and-forget mode. Automatically routes using quota, cooldown, and health signals to reduce rate-limit failures.
119
- - `ordered`: Tries targets in list order.
120
- - `round-robin`: Rotates evenly across eligible targets.
121
- - `weighted-rr`: Rotates like round-robin, but favors higher weights.
122
- - `quota-aware-weighted-rr`: Weighted routing plus remaining-capacity awareness.
123
-
124
- Example:
125
-
126
- ```bash
127
- llm-router config \
128
- --operation=upsert-model-alias \
129
- --alias-id=coding \
130
- --strategy=auto \
131
- --targets=rc/gpt-5.3-codex,zai/glm-5
132
- ```
133
-
134
- Concrete model alias example with provider-specific caps:
135
-
136
- ```bash
137
- llm-router config \
138
- --operation=upsert-model-alias \
139
- --alias-id=coding \
140
- --strategy=auto \
141
- --targets=rc/gpt-5.3-codex,zai/glm-5
142
-
143
- llm-router config \
144
- --operation=set-provider-rate-limits \
145
- --provider-id=rc \
146
- --bucket-name="Minute cap" \
147
- --bucket-models=gpt-5.3-codex \
148
- --bucket-requests=60 \
149
- --bucket-window=minute:1
150
-
151
- llm-router config \
152
- --operation=set-provider-rate-limits \
153
- --provider-id=zai \
154
- --bucket-name="5-hours cap" \
155
- --bucket-models=glm-5 \
156
- --bucket-requests=600 \
157
- --bucket-window=hour:5
158
- ```
159
-
160
- ## What Is A Bucket?
161
-
162
- A rate-limit bucket is a request cap for a time window.
36
+ ### Model
37
+ The actual model ID from that provider.
163
38
 
39
+ ### Rate-Limit Bucket
40
+ A request cap for a time window.
164
41
  Examples:
165
- - `40 req / 1 minute`
166
- - `600 req / 6 hours`
167
-
168
- Multiple buckets can apply to the same model scope at the same time. A candidate is treated as exhausted if any matching bucket is exhausted.
42
+ - `40 requests / minute`
43
+ - `20,000 requests / month`
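
Both example caps can also be expressed non-interactively. A sketch using the `set-provider-rate-limits` operation from the earlier CLI docs (the provider ID is a placeholder):

```bash
# Placeholder provider ID; flags follow the package's non-interactive config CLI.
llm-router config \
  --operation=set-provider-rate-limits \
  --provider-id=openrouter \
  --bucket-name="Minute cap" \
  --bucket-models=all \
  --bucket-requests=40 \
  --bucket-window=minute:1

llm-router config \
  --operation=set-provider-rate-limits \
  --provider-id=openrouter \
  --bucket-name="Monthly cap" \
  --bucket-models=all \
  --bucket-requests=20000 \
  --bucket-window=month:1
```

Both buckets stay active for the same model scope; a candidate is treated as exhausted when any matching bucket is exhausted.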
169
44
 
170
- ## TUI Bucket Walkthrough
45
+ ### Model Load Balancer
46
+ Decides how traffic is distributed across models in an alias group.
171
47
 
172
- Use the config manager and select:
173
- - `Manage provider rate-limit buckets`
174
- - `Create bucket(s)`
175
-
176
- The TUI now guides you through:
177
- - Bucket name (friendly label)
178
- - Model scope (`all` or selected models with multiselect checkboxes)
179
- - Request cap
180
- - Window unit (`minute`, `hour(s)`, `week`, `month`)
181
- - Window size (hours support `N`, other preset units lock to `1`)
182
- - Review + optional add-another loop for combined policies
183
-
184
- Internal bucket ids are generated automatically from the name when omitted and shown as advanced detail in review.
185
-
186
- ## Combined-Cap Recipe (`40/min` + `600/6h`)
187
-
188
- ```bash
189
- llm-router config \
190
- --operation=set-provider-rate-limits \
191
- --provider-id=openrouter \
192
- --bucket-name="Minute cap" \
193
- --bucket-models=all \
194
- --bucket-requests=40 \
195
- --bucket-window=minute:1
196
-
197
- llm-router config \
198
- --operation=set-provider-rate-limits \
199
- --provider-id=openrouter \
200
- --bucket-name="6-hours cap" \
201
- --bucket-models=all \
202
- --bucket-requests=600 \
203
- --bucket-window=hour:6
204
- ```
205
-
206
- This keeps both limits active together for the same model scope.
207
-
208
- ## Rate-Limit Troubleshooting
209
-
210
- - Check routing decisions with `LLM_ROUTER_DEBUG_ROUTING=true` and inspect `x-llm-router-skipped-candidates`.
211
- - `quota-exhausted` means proactive pre-routing skip happened before an upstream call.
212
- - For provider `429`, cooldown is tracked from `Retry-After` when present, or from `LLM_ROUTER_ORIGIN_RATE_LIMIT_COOLDOWN_MS`.
213
- - Local mode persists state by default (file backend), while Worker defaults to in-memory state.
214
-
215
- ## Main Commands
216
-
217
- ```bash
218
- llm-router config
219
- llm-router start
220
- llm-router stop
221
- llm-router reload
222
- llm-router update
223
- llm-router deploy
224
- llm-router worker-key
225
- ```
226
-
227
- ## Non-Interactive Config (Agent/CI Friendly)
228
-
229
- ```bash
230
- llm-router config \
231
- --operation=upsert-provider \
232
- --provider-id=openrouter \
233
- --name="OpenRouter" \
234
- --base-url=https://openrouter.ai/api/v1 \
235
- --api-key=sk-or-v1-... \
236
- --models=gpt-4o,claude-3-7-sonnet \
237
- --format=openai \
238
- --skip-probe=true
239
-
240
- llm-router config \
241
- --operation=upsert-model-alias \
242
- --alias-id=chat.default \
243
- --strategy=auto \
244
- --targets=openrouter/gpt-4o-mini@3,anthropic/claude-3-5-haiku@2 \
245
- --fallback-targets=openrouter/gpt-4o
246
-
247
- llm-router config \
248
- --operation=set-provider-rate-limits \
249
- --provider-id=openrouter \
250
- --bucket-name="Monthly cap" \
251
- --bucket-models=all \
252
- --bucket-requests=20000 \
253
- --bucket-window=month:1
254
- ```
255
-
256
- Alias target syntax:
257
- - `--targets` / `--fallback-targets`: `<routeRef>@<weight>` or `<routeRef>:<weight>`
258
- - route refs: direct `provider/model` or alias id
259
-
260
- Routing strategy values:
48
+ Available strategies:
261
49
  - `auto` (recommended)
262
50
  - `ordered`
263
51
  - `round-robin`
264
52
  - `weighted-rr`
265
53
  - `quota-aware-weighted-rr`
266
54
 
267
- Rate-limit bucket window syntax:
268
- - `--bucket-window=month:1`
269
- - `--bucket-window=1w`
270
- - `--bucket-window=7day`
271
-
272
- Routing summary:
273
-
274
- ```bash
275
- llm-router config --operation=list-routing
276
- ```
55
+ ### Model Alias (Group models)
56
+ A single model name that automatically routes and rotates requests across multiple models.
277
57
 
278
- Explicit schema migration with backup:
58
+ Example:
59
+ - alias: `opus`
60
+ - targets:
61
+ - `openrouter/claude-opus-4.6`
62
+ - `anthropic/claude-opus-4.6`
279
63
 
280
- ```bash
281
- llm-router config --operation=migrate-config --target-version=2 --create-backup=true
282
- ```
64
+ Your app can request the `opus` model, and `llm-router` chooses the target model based on your routing settings.
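
A non-interactive sketch of the same alias, using the `upsert-model-alias` operation from the CLI docs (model IDs as in the example above):

```bash
# Groups the two targets above under one `opus` alias with the recommended strategy.
llm-router config \
  --operation=upsert-model-alias \
  --alias-id=opus \
  --strategy=auto \
  --targets=openrouter/claude-opus-4.6,anthropic/claude-opus-4.6
```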
283
65
 
284
- Automatic version handling:
285
- - Local config loads with silent forward-migration to latest supported schema.
286
- - Migration is persisted automatically on read when possible (best-effort, no interactive prompt).
287
- - Future/newer version numbers do not fail only because of version mismatch; known fields are normalized best-effort.
66
+ ## Setup using Terminal User Interface (TUI)
288
67
 
289
- Set local auth key:
68
+ Open the TUI:
290
69
 
291
70
  ```bash
292
- llm-router config --operation=set-master-key --master-key=your_local_key
293
- # or generate a strong key automatically
294
- llm-router config --operation=set-master-key --generate-master-key=true
71
+ llm-router
295
72
  ```
296
73
 
297
- Start with auth required:
74
+ Then follow these steps in order.
75
+
76
+ ### 1) Add Provider
77
+ Flow:
78
+ 1. `Config manager`
79
+ 2. `Add/Edit provider`
80
+ 3. Enter provider name, endpoint, API key
81
+ 4. Enter model list
82
+ 5. Save
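
The same step can be scripted. A sketch with placeholder provider ID, API key, and model IDs, using the non-interactive config operations the CLI also supports:

```bash
llm-router config \
  --operation=upsert-provider \
  --provider-id=openrouter \
  --name="OpenRouter" \
  --base-url=https://openrouter.ai/api/v1 \
  --api-key=sk-or-v1-... \
  --models=claude-opus-4.6,gpt-4o \
  --format=openai \
  --skip-probe=true
```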
83
+
84
+ ### 2) Configure Model Fallback (Optional)
85
+ Flow:
86
+ 1. `Config manager`
87
+ 2. `Set model silent-fallbacks`
88
+ 3. Pick main model
89
+ 4. Pick fallback models
90
+ 5. Save
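
Scripted equivalent (provider and model IDs are placeholders):

```bash
llm-router config \
  --operation=set-model-fallbacks \
  --provider-id=openrouter \
  --model=claude-opus-4.6 \
  --fallback-models=openrouter/gpt-4o
```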
91
+
92
+ ### 3) Configure Rate Limits (Optional)
93
+ Flow:
94
+ 1. `Config manager`
95
+ 2. `Manage provider rate-limit buckets`
96
+ 3. `Create bucket(s)`
97
+ 4. Set name, model scope, request cap, time window
98
+ 5. Save
99
+
100
+ ### 4) Group Models With Alias (Recommended)
101
+ Flow:
102
+ 1. `Config manager`
103
+ 2. `Add/Edit model alias`
104
+ 3. Set alias ID (example: `chat.default`)
105
+ 4. Select target models
106
+ 5. Save
107
+
108
+ ### 5) Configure Model Load Balancer
109
+ Flow:
110
+ 1. `Config manager`
111
+ 2. `Add/Edit model alias`
112
+ 3. Open the alias you want to balance
113
+ 4. Choose strategy (`auto` recommended)
114
+ 5. Review alias targets
115
+ 6. Save
116
+
117
+ ### 6) Set Gateway Key
118
+ Flow:
119
+ 1. `Config manager`
120
+ 2. `Set worker master key`
121
+ 3. Set or generate key
122
+ 4. Save
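
Scripted equivalent, either setting an explicit key or generating one:

```bash
llm-router config --operation=set-master-key --master-key=gw_your_gateway_key
# or generate a strong key automatically
llm-router config --operation=set-master-key --generate-master-key=true
```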
123
+
124
+ ## Start Local Server
298
125
 
299
126
  ```bash
300
- llm-router start --require-auth=true
127
+ llm-router start
301
128
  ```
302
129
 
303
- ## Cloudflare Worker Deploy
130
+ Local endpoints:
131
+ - Unified: `http://127.0.0.1:8787/route`
132
+ - Anthropic-style: `http://127.0.0.1:8787/anthropic`
133
+ - OpenAI-style: `http://127.0.0.1:8787/openai`
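
A quick smoke test against the OpenAI-style endpoint. This sketch assumes the `/openai` prefix accepts the standard `/v1/chat/completions` path and a Bearer token carrying the gateway key (both are assumptions, not documented here); the model name is a placeholder alias:

```bash
# Assumptions: standard OpenAI chat-completions path under /openai,
# Bearer auth with the gateway master key, and an `opus` alias configured.
curl -s http://127.0.0.1:8787/openai/v1/chat/completions \
  -H "Authorization: Bearer gw_your_gateway_key" \
  -H "Content-Type: application/json" \
  -d '{"model": "opus", "messages": [{"role": "user", "content": "ping"}]}'
```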
304
134
 
305
- Worker project name in `wrangler.toml`: `llm-router-route`.
135
+ ## Connect Your Coding Tool
306
136
 
307
- ### Option A: Guided deploy
137
+ After setting the master key, point your app/agent at the local endpoint and use that key as the auth token.
308
138
 
309
- ```bash
310
- llm-router deploy
311
- ```
139
+ Claude Code example (`~/.claude/settings.local.json`):
312
140
 
313
- If `LLM_ROUTER_CONFIG_JSON` exceeds Cloudflare Free-tier secret size (`5 KB`), deploy now warns and requires explicit confirmation (default is `No`). In non-interactive environments, pass `--allow-large-config=true` to proceed intentionally.
141
+ ```json
142
+ {
143
+ "env": {
144
+ "ANTHROPIC_BASE_URL": "http://127.0.0.1:8787",
145
+ "ANTHROPIC_AUTH_TOKEN": "gw_your_gateway_key",
146
+ "ANTHROPIC_DEFAULT_OPUS_MODEL": "provider_name/model_name_1",
147
+ "ANTHROPIC_DEFAULT_SONNET_MODEL": "provider_name/model_name_2",
148
+ "ANTHROPIC_DEFAULT_HAIKU_MODEL": "provider_name/model_name_3"
149
+ }
150
+ }
151
+ ```
314
152
 
315
- `deploy` requires `CLOUDFLARE_API_TOKEN` for Cloudflare API access. Create a **User Profile API token** at <https://dash.cloudflare.com/profile/api-tokens> (do not use Account API Tokens), then choose preset/template `Edit Cloudflare Workers`. If the env var is missing in interactive mode, the CLI will show the guide and prompt for token input securely.
153
+ ## Real-Time Update Experience
316
154
 
317
- For multi-account tokens, set account explicitly in non-interactive runs:
318
- - `CLOUDFLARE_ACCOUNT_ID=<id>` or
319
- - `llm-router deploy --account-id=<id>`
155
+ While the local server is running:
156
+ - open `llm-router`
157
+ - change provider/model/load-balancer/rate-limit/alias in TUI
158
+ - save
159
+ - the running proxy updates instantly
320
160
 
321
- `llm-router deploy` resolves deploy target from CLI/TUI input (workers.dev or custom route), generates a temporary Wrangler config at runtime, deploys with `--config`, then removes that temporary file. Personal route/account details are not persisted back into repo `wrangler.toml`.
161
+ No stop/start cycle needed.
322
162
 
323
- For custom domains, the deploy helper now prints a DNS checklist and connectivity commands. Common setup for `llm.example.com`:
324
- - Create a DNS record in Cloudflare for `llm` (usually `CNAME llm -> @`)
325
- - Set **Proxy status = Proxied** (orange cloud)
326
- - Use route target `--route-pattern=llm.example.com/* --zone-name=example.com`
327
- - Claude Code base URL should be `https://llm.example.com/anthropic` (**no `:8787`**; that port is local-only)
163
+ ## Cloudflare Worker (Hosted)
328
164
 
329
- ```bash
330
- llm-router deploy --export-only=true --out=.llm-router.worker.json
331
- wrangler secret put LLM_ROUTER_CONFIG_JSON < .llm-router.worker.json
332
- wrangler deploy
333
- ```
165
+ Use this when you want a hosted endpoint instead of the local server.
334
166
 
335
- Rotate worker auth key quickly:
167
+ Guided deploy:
336
168
 
337
169
  ```bash
338
- llm-router worker-key --master-key=new_key
339
- # or generate and rotate immediately
340
- llm-router worker-key --env=production --generate-master-key=true
170
+ llm-router deploy
341
171
  ```
342
172
 
343
- If you intentionally need to bypass weak-key checks (not recommended), add `--allow-weak-master-key=true` to `deploy` or `worker-key`.
344
-
345
- Cloudflare hardening and incident-response checklist: see [`SECURITY.md`](./SECURITY.md).
346
-
347
- ## Runtime Secrets / Env
348
-
349
- Primary:
350
- - `LLM_ROUTER_CONFIG_JSON`
351
- - `LLM_ROUTER_MASTER_KEY` (optional override)
352
-
353
- Also supported:
354
- - `ROUTE_CONFIG_JSON`
355
- - `LLM_ROUTER_JSON`
173
+ The TUI guides you through selecting an account and a deploy target.
356
174
 
357
- Optional resilience tuning:
358
- - `LLM_ROUTER_ORIGIN_RETRY_ATTEMPTS` (default `3`)
359
- - `LLM_ROUTER_ORIGIN_RETRY_BASE_DELAY_MS` (default `250`)
360
- - `LLM_ROUTER_ORIGIN_RETRY_MAX_DELAY_MS` (default `3000`)
361
- - `LLM_ROUTER_ORIGIN_FALLBACK_COOLDOWN_MS` (default `45000`)
362
- - `LLM_ROUTER_ORIGIN_RATE_LIMIT_COOLDOWN_MS` (default `30000`)
363
- - `LLM_ROUTER_ORIGIN_BILLING_COOLDOWN_MS` (default `900000`)
364
- - `LLM_ROUTER_ORIGIN_AUTH_COOLDOWN_MS` (default `600000`)
365
- - `LLM_ROUTER_ORIGIN_POLICY_COOLDOWN_MS` (default `120000`)
366
- - `LLM_ROUTER_ALLOW_POLICY_FALLBACK` (default `false`)
367
- - `LLM_ROUTER_FALLBACK_CIRCUIT_FAILURES` (default `2`)
368
- - `LLM_ROUTER_FALLBACK_CIRCUIT_COOLDOWN_MS` (default `30000`)
369
- - `LLM_ROUTER_MAX_REQUEST_BODY_BYTES` (default `1048576`, min `4096`, max `20971520`)
370
- - `LLM_ROUTER_UPSTREAM_TIMEOUT_MS` (default `60000`, min `1000`, max `300000`)
175
+ Worker safety defaults:
176
+ - `LLM_ROUTER_STATE_BACKEND=file` is ignored on Worker (auto-fallback to in-memory state).
177
+ - Stateful timing-dependent routing features (cursor balancing, local quota counters, cooldown persistence) are auto-disabled by default to keep route flow safe across Worker isolates.
178
+ - To opt in to best-effort stateful behavior on Worker, set `LLM_ROUTER_WORKER_ALLOW_BEST_EFFORT_STATEFUL_ROUTING=true`.
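
If you deploy with your own Wrangler config instead of the guided `llm-router deploy`, one possible way to opt in (an assumption about your setup, not a documented llm-router step) is to pass the variable at deploy time:

```bash
# Assumption: you run Wrangler yourself; the variable name comes from the
# safety-defaults list above.
wrangler deploy --var LLM_ROUTER_WORKER_ALLOW_BEST_EFFORT_STATEFUL_ROUTING:true
```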
371
179
 
372
- Optional browser access (CORS):
373
- - By default, cross-origin browser reads are denied unless explicitly allow-listed.
374
- - `LLM_ROUTER_CORS_ALLOWED_ORIGINS` (comma-separated exact origins, e.g. `https://app.example.com`)
375
- - `LLM_ROUTER_CORS_ALLOW_ALL=true` (allows any origin; not recommended for production)
180
+ ## Config File Location
376
181
 
377
- Optional source IP allowlist (recommended for Worker deployments):
378
- - `LLM_ROUTER_ALLOWED_IPS` (comma-separated client IPs; denies requests from all other IPs)
379
- - `LLM_ROUTER_IP_ALLOWLIST` (alias of `LLM_ROUTER_ALLOWED_IPS`)
380
-
381
- ## Default Config Path
182
+ Local config file:
382
183
 
383
184
  `~/.llm-router.json`
384
185
 
385
- Minimal shape:
386
-
387
- ```json
388
- {
389
- "version": 2,
390
- "masterKey": "local_or_worker_key",
391
- "defaultModel": "chat.default",
392
- "modelAliases": {
393
- "chat.default": {
394
- "strategy": "auto",
395
- "targets": [
396
- { "ref": "openrouter/gpt-4o" },
397
- { "ref": "anthropic/claude-3-5-haiku" }
398
- ],
399
- "fallbackTargets": [
400
- { "ref": "openrouter/gpt-4o-mini" }
401
- ]
402
- }
403
- },
404
- "providers": [
405
- {
406
- "id": "openrouter",
407
- "name": "OpenRouter",
408
- "baseUrl": "https://openrouter.ai/api/v1",
409
- "apiKey": "sk-or-v1-...",
410
- "formats": ["openai"],
411
- "models": [{ "id": "gpt-4o" }],
412
- "rateLimits": [
413
- {
414
- "id": "openrouter-all-month",
415
- "name": "Monthly cap",
416
- "models": ["all"],
417
- "requests": 20000,
418
- "window": { "unit": "month", "size": 1 }
419
- }
420
- ]
421
- }
422
- ]
423
- }
424
- ```
186
+ ## Security
425
187
 
426
- Direct vs model alias routing:
427
- - Direct route: request `model=provider/model` and optional model-level `fallbackModels` applies.
428
- - Model alias route: request `model=alias.id` (or set as `defaultModel`) and the model alias `targets` + `strategy` drive balancing. `auto` is the recommended default for new model aliases.
188
+ See [`SECURITY.md`](https://github.com/khanglvm/llm-router/blob/master/SECURITY.md).
429
189
 
430
- State durability caveats:
431
- - Local Node (`llm-router start`): routing state defaults to file-backed local persistence, so cooldowns/caps survive restarts.
432
- - Cloudflare Worker: default state is in-memory per isolate for now; long-window counters are best-effort until a durable Worker backend is configured.
433
-
434
- ## Smoke Test
435
-
436
- ```bash
437
- npm run test:provider-smoke
438
- ```
190
+ ## Versioning
439
191
 
440
- Use `.env.test-suite.example` as template for provider-based smoke tests.
192
+ - Semver: [Semantic Versioning](https://semver.org/)
193
+ - Release notes: [`CHANGELOG.md`](https://github.com/khanglvm/llm-router/blob/master/CHANGELOG.md)
package/package.json CHANGED
@@ -1,7 +1,19 @@
1
1
  {
2
2
  "name": "@khanglvm/llm-router",
3
- "version": "1.0.6",
3
+ "version": "1.0.9",
4
4
  "description": "Single gateway endpoint for multi-provider LLMs with unified OpenAI+Anthropic format and seamless fallback",
5
+ "keywords": [
6
+ "llm-router",
7
+ "llm-gateway",
8
+ "ai-proxy",
9
+ "openai-compatible",
10
+ "anthropic-compatible",
11
+ "model-routing",
12
+ "fallback",
13
+ "load-balancing",
14
+ "cloudflare-workers",
15
+ "agent-infra"
16
+ ],
5
17
  "type": "module",
6
18
  "main": "src/index.js",
7
19
  "bin": {