claw-llm-router 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +336 -0
- package/classifier.ts +516 -0
- package/docs/ARCHITECTURE.md +82 -0
- package/docs/CLASSIFIER.md +146 -0
- package/docs/PROVIDERS.md +228 -0
- package/index.ts +602 -0
- package/models.ts +104 -0
- package/openclaw.plugin.json +55 -0
- package/package.json +52 -0
- package/provider.ts +30 -0
- package/providers/anthropic.ts +332 -0
- package/providers/gateway.ts +128 -0
- package/providers/index.ts +135 -0
- package/providers/model-override.ts +81 -0
- package/providers/openai-compatible.ts +126 -0
- package/providers/types.ts +29 -0
- package/proxy.ts +282 -0
- package/router-logger.ts +101 -0
- package/tier-config.ts +288 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Donn Felker

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,336 @@
# Claw LLM Router

An [OpenClaw](https://openclaw.ai) plugin that cuts LLM costs **40–80%** by classifying prompts and routing them to the cheapest capable model. Simple questions go to fast/cheap models (Gemini Flash at ~$0.15/1M tokens); complex tasks go to frontier models. All routing happens locally in <1ms.

## Why

LLM costs add up fast when every prompt hits a frontier model. Most prompts don't need one. "What is the capital of France?" doesn't need Claude Opus — Gemini Flash answers it for 100x less. The router makes this automatic: you interact with a single model (`claw-llm-router/auto`) and the classifier picks the right backend.

## How It Works

```mermaid
flowchart TD
    A[Incoming Request] --> B{Model ID?}
    B -->|simple, medium, complex, reasoning| C[Forced Tier]
    B -->|auto| D[Rule-Based Classifier]
    D --> F[Assigned Tier]
    F --> H[Load Tier Config]
    C --> H
    H --> I[Resolve Provider]
    I --> J{Provider Type}
    J -->|Google, OpenAI, Groq, etc.| K[Direct API Call]
    J -->|Anthropic + API Key| L[Direct Anthropic API]
    J -->|Anthropic + OAuth| M{Router is Primary?}
    M -->|No| N[Gateway Fallback]
    M -->|Yes| O[Gateway + Model Override]
    K --> P[Response]
    L --> P
    N --> P
    O --> P
```

### The Four Tiers

| Tier          | Default Model                         | When It's Used                                                                   |
| ------------- | ------------------------------------- | -------------------------------------------------------------------------------- |
| **SIMPLE**    | `google/gemini-2.5-flash`             | Factual lookups, definitions, translations, greetings, yes/no, simple math       |
| **MEDIUM**    | `anthropic/claude-haiku-4-5-20251001` | Code snippets, explanations, summaries, moderate analysis                        |
| **COMPLEX**   | `anthropic/claude-sonnet-4-6`         | Multi-file code, architecture, long-form analysis, detailed technical work       |
| **REASONING** | `anthropic/claude-opus-4-6`           | Mathematical proofs, formal logic, multi-step derivations, deep chain-of-thought |

Every tier is configurable. Any OpenAI-compatible provider works, plus Anthropic's native Messages API.

### Classification

The classifier scores prompts across 15 weighted dimensions:

| Dimension             | Weight | What It Detects                                       |
| --------------------- | ------ | ----------------------------------------------------- |
| Reasoning markers     | 0.18   | "prove", "theorem", "derive", "step by step"          |
| Code presence         | 0.15   | `function`, `class`, `import`, backtick blocks        |
| Technical terms       | 0.13   | "algorithm", "kubernetes", "distributed"              |
| Multi-step patterns   | 0.10   | "first...then", "step 1", numbered lists              |
| Token count           | 0.08   | Short prompts pull toward SIMPLE, long toward COMPLEX |
| Agentic tasks         | 0.06   | "read file", "edit", "deploy", "fix", "debug"         |
| Imperative verbs      | 0.05   | "build", "create", "implement", "design"              |
| Creative markers      | 0.04   | "story", "poem", "brainstorm", "write a"              |
| Question complexity   | 0.04   | Multiple question marks                               |
| Constraint indicators | 0.04   | "at most", "within", "budget", "maximum"              |
| Output format         | 0.03   | "json", "yaml", "table", "csv"                        |
| Simple indicators     | 0.02   | "what is", "define", "who is", "capital of"           |
| Reference complexity  | 0.02   | "the code", "above", "the api"                        |
| Domain specificity    | 0.02   | "quantum", "fpga", "genomics", "zero-knowledge"       |
| Negation complexity   | 0.01   | "don't", "avoid", "except", "exclude"                 |

Scores map to tiers via boundaries (SIMPLE < 0.0, MEDIUM < 0.3, COMPLEX < 0.5, REASONING >= 0.5). The MEDIUM band is intentionally wide (0.30) so ambiguous prompts land confidently within it — no external LLM calls needed.
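The scoring scheme can be sketched in a few lines of TypeScript. This is an illustrative reconstruction, not the actual `classifier.ts` code: the detector regexes and the `dimensions` shape shown here are hypothetical, while the weights and tier boundaries are taken from the tables in this README.

```typescript
type Tier = "SIMPLE" | "MEDIUM" | "COMPLEX" | "REASONING";

// Each dimension contributes weight * signal, where signal is in [-1, 1].
// Negative signals (e.g. "what is") pull the total toward SIMPLE.
// Only three of the fifteen dimensions are shown; detectors are hypothetical.
const dimensions: Array<{ weight: number; signal: (p: string) => number }> = [
  { weight: 0.18, signal: (p) => (/\b(prove|theorem|derive|step by step)\b/i.test(p) ? 1 : 0) },
  { weight: 0.15, signal: (p) => (/```|\b(function|class|import)\b/.test(p) ? 1 : 0) },
  { weight: 0.02, signal: (p) => (/\b(what is|define|who is|capital of)\b/i.test(p) ? -1 : 0) },
  // ...remaining dimensions from the table above
];

function classify(prompt: string): Tier {
  const score = dimensions.reduce((sum, d) => sum + d.weight * d.signal(prompt), 0);
  // Boundaries from this README: SIMPLE < 0.0, MEDIUM < 0.3, COMPLEX < 0.5.
  if (score < 0.0) return "SIMPLE";
  if (score < 0.3) return "MEDIUM";
  if (score < 0.5) return "COMPLEX";
  return "REASONING";
}
```

Under this sketch, "What is the capital of France?" trips only the simple-indicator dimension, scoring below 0.0 and routing to SIMPLE.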
### Fallback Chain

If a provider call fails, the router tries the next tier up:

```
SIMPLE → MEDIUM → COMPLEX
MEDIUM → COMPLEX
COMPLEX → REASONING
REASONING → (no fallback)
```
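The escalation order above amounts to walking a linked chain of tiers until a call succeeds. A minimal sketch (function and parameter names are hypothetical, not the plugin's actual API):

```typescript
type Tier = "SIMPLE" | "MEDIUM" | "COMPLEX" | "REASONING";

// Next tier up for each tier, per the chain above.
const nextTier: Record<Tier, Tier | null> = {
  SIMPLE: "MEDIUM",
  MEDIUM: "COMPLEX",
  COMPLEX: "REASONING",
  REASONING: null, // no fallback
};

// Try the assigned tier; on failure, escalate until a call succeeds
// or the chain is exhausted.
async function completeWithFallback(
  tier: Tier,
  call: (tier: Tier) => Promise<string>,
): Promise<string> {
  let current: Tier | null = tier;
  let lastError: unknown;
  while (current !== null) {
    try {
      return await call(current);
    } catch (err) {
      lastError = err;
      current = nextTier[current];
    }
  }
  throw lastError;
}
```

Because REASONING maps to `null`, a failure at the top of the chain propagates to the caller rather than retrying.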
## Architecture

See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for provider strategy, resolution logic, and the OAuth model override mechanism.

## Quickstart

### 1. Install the plugin

```bash
openclaw plugins install claw-llm-router
```

Or install from a local directory during development:

```bash
openclaw plugins install -l ./claw-llm-router
```

### 2. Set up API keys

The router needs at least one provider API key. Set keys for the providers you want to use:

| Provider  | Environment Variable                     | Tier Suggestion            |
| --------- | ---------------------------------------- | -------------------------- |
| Google    | `GEMINI_API_KEY`                         | SIMPLE                     |
| Anthropic | `ANTHROPIC_API_KEY` or OAuth via `/auth` | MEDIUM, COMPLEX, REASONING |
| OpenAI    | `OPENAI_API_KEY`                         | MEDIUM, COMPLEX            |
| Groq      | `GROQ_API_KEY`                           | SIMPLE                     |
| xAI       | `XAI_API_KEY`                            | MEDIUM                     |

You can also add credentials through OpenClaw's `/auth` command.

> **Tip:** At minimum, set `GEMINI_API_KEY` for the SIMPLE tier and authenticate Anthropic via `/auth` for the other tiers. This covers all four tiers.

### 3. Restart and verify

```bash
openclaw gateway restart
```

Then run the doctor to verify everything is working:

```
/router doctor
```

### 4. Set as primary model (recommended)

To route all prompts through the router automatically, set it as the primary model in `~/.openclaw/openclaw.json`:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "claw-llm-router/auto"
      }
    }
  }
}
```

Restart the gateway after this change.

### Configure Tiers

Use the `/router` command in any OpenClaw chat:

```
/router setup                  # Show current config + suggestions
/router set SIMPLE google/gemini-2.5-flash
/router set MEDIUM anthropic/claude-haiku-4-5-20251001
/router set COMPLEX anthropic/claude-sonnet-4-6
/router set REASONING anthropic/claude-opus-4-6
/router doctor                 # Diagnose config, auth, and proxy issues
```

### API Key Resolution

The router reads API keys from OpenClaw's existing auth stores (never stores its own). Priority order:

1. Environment variable (e.g., `GEMINI_API_KEY`, `ANTHROPIC_API_KEY`, `XAI_API_KEY`, `MOONSHOT_API_KEY`)
2. `~/.openclaw/agents/main/agent/auth-profiles.json`
3. `~/.openclaw/agents/main/agent/auth.json`
4. `~/.openclaw/openclaw.json` `env.vars` section
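The lookup order above can be sketched as a fall-through chain. This is an assumption-laden illustration, not the plugin's `tier-config.ts`: the JSON shape read from the auth files (`{ [provider]: { apiKey } }`) and the helper names are hypothetical; only the file paths and the priority order come from this README.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Read a JSON file, returning null if it is missing or malformed.
function readJson(file: string): any {
  try {
    return JSON.parse(fs.readFileSync(file, "utf8"));
  } catch {
    return null;
  }
}

function resolveApiKey(envVar: string, provider: string): string | null {
  // 1. Environment variable wins outright.
  if (process.env[envVar]) return process.env[envVar]!;

  // 2–3. OpenClaw auth stores, in priority order.
  const agentDir = path.join(os.homedir(), ".openclaw", "agents", "main", "agent");
  for (const file of ["auth-profiles.json", "auth.json"]) {
    const data = readJson(path.join(agentDir, file));
    const key = data?.[provider]?.apiKey; // assumed shape
    if (key) return key;
  }

  // 4. env.vars section of the main config file.
  const config = readJson(path.join(os.homedir(), ".openclaw", "openclaw.json"));
  return config?.env?.vars?.[envVar] ?? null;
}
```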
## Usage

### `/router` Command

The `/router` slash command manages the router from any OpenClaw chat. No gateway restart needed for tier changes — they take effect on the next request.

| Command                               | What It Does                                                |
| ------------------------------------- | ----------------------------------------------------------- |
| `/router`                             | Show status: uptime, proxy health, current tier assignments |
| `/router help`                        | List all subcommands with examples                          |
| `/router setup`                       | Show current config with suggested models per tier          |
| `/router set <TIER> <provider/model>` | Change a tier's model                                       |
| `/router doctor`                      | Diagnose config, API keys, base URLs, and proxy health      |

Common workflows:

```
# Check what's running
/router

# Something broken? Run the doctor
/router doctor

# Swap SIMPLE tier to a different model
/router set SIMPLE groq/llama-3.3-70b-versatile

# See all options and suggested models
/router setup
```

### Model Selection

Switch to the router model in any chat:

```
/model claw-llm-router/auto
```

Or force a specific tier (useful for testing or when you know the complexity):

```
/model claw-llm-router/simple      # Always use the cheapest model
/model claw-llm-router/complex     # Skip classification, go straight to capable
/model claw-llm-router/reasoning   # Force frontier reasoning model
```

### Via curl

The proxy runs on `http://127.0.0.1:8401` and speaks the OpenAI chat completions API:

```bash
# Auto-classify
curl -s http://127.0.0.1:8401/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "What is 2+2?"}],
    "max_tokens": 50
  }'

# Force a tier
curl -s http://127.0.0.1:8401/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "complex",
    "messages": [{"role": "user", "content": "Design a microservice architecture"}],
    "max_tokens": 500
  }'

# Streaming
curl -s http://127.0.0.1:8401/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
```

### Endpoints

| Endpoint               | Method | Description                          |
| ---------------------- | ------ | ------------------------------------ |
| `/v1/chat/completions` | POST   | Chat completions (OpenAI-compatible) |
| `/v1/models`           | GET    | List available models                |
| `/health`              | GET    | Health check                         |
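The same endpoints can be exercised from Node with `fetch`. A minimal smoke-test sketch; it assumes the proxy is running on port 8401, and logs response bodies as-is because this README only specifies OpenAI compatibility, not exact shapes:

```typescript
const BASE = "http://127.0.0.1:8401";

// Pure helper so the URLs can be checked without a live proxy.
function endpointUrl(endpoint: "/v1/chat/completions" | "/v1/models" | "/health"): string {
  return `${BASE}${endpoint}`;
}

async function smokeTest(): Promise<void> {
  const health = await fetch(endpointUrl("/health"));
  console.log("health status:", health.status);

  const models = await fetch(endpointUrl("/v1/models"));
  console.log("models:", await models.json());
}

// Uncomment once the proxy is up:
// smokeTest().catch(console.error);
```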
### Virtual Model IDs

| Model ID    | Behavior                         |
| ----------- | -------------------------------- |
| `auto`      | Classify and route automatically |
| `simple`    | Force SIMPLE tier                |
| `medium`    | Force MEDIUM tier                |
| `complex`   | Force COMPLEX tier               |
| `reasoning` | Force REASONING tier             |

## Troubleshooting

Run the built-in doctor to check your setup:

```
/router doctor
```

It verifies:

- Config file exists and all 4 tiers are configured
- Each tier has a valid `provider/model-id` format, a resolvable base URL, and an available API key
- Proxy is running and healthy on port 8401
- Whether the router is set as the primary model

Any issues are flagged with fix instructions (e.g., which env var to set, how to add a custom provider).

## Project Structure

```
claw-llm-router/
├── index.ts                   # Plugin entry point, OpenClaw registration
├── proxy.ts                   # HTTP proxy server, request routing
├── classifier.ts              # Rule-based prompt classifier (15 dimensions)
├── tier-config.ts             # Tier-to-model config, API key loading
├── models.ts                  # Model definitions, provider constants
├── provider.ts                # OpenClaw provider plugin definition
├── router-config.json         # Tier configuration (auto-generated)
├── router-logger.ts           # RouterLogger class — centralized log formatting
├── openclaw.plugin.json       # Plugin manifest
├── docs/
│   ├── ARCHITECTURE.md        # Provider strategy, OAuth override mechanism
│   ├── CLASSIFIER.md          # Classifier dimensions, weights, extraction
│   └── PROVIDERS.md           # Step-by-step guide for adding providers
├── providers/
│   ├── types.ts               # LLMProvider interface, shared types
│   ├── openai-compatible.ts   # Google, OpenAI, Groq, Mistral, etc.
│   ├── anthropic.ts           # Anthropic Messages API (direct key)
│   ├── gateway.ts             # OpenClaw gateway fallback (OAuth)
│   ├── model-override.ts      # In-process override store for recursion prevention
│   └── index.ts               # Provider registry + resolution
└── tests/
    ├── classifier.test.ts
    ├── proxy.test.ts
    ├── tier-config.test.ts
    └── providers/
        ├── anthropic.test.ts
        ├── gateway.test.ts
        ├── openai-compatible.test.ts
        ├── model-override.test.ts
        └── registry.test.ts
```

## Testing

Tests use the Node.js built-in test runner (`node:test`). No external test dependencies.

```bash
# Run all tests
npx tsx --test tests/providers/*.test.ts tests/classifier.test.ts tests/proxy.test.ts tests/tier-config.test.ts

# Provider tests only
npx tsx --test tests/providers/*.test.ts

# Classifier tests only
npx tsx --test tests/classifier.test.ts
```

## Adding a New Provider

See [docs/PROVIDERS.md](docs/PROVIDERS.md) for a step-by-step guide to implementing a new provider.

## License

MIT