@relayplane/proxy 0.1.10 → 0.2.0
- package/README.md +112 -298
- package/__tests__/server.test.ts +512 -0
- package/__tests__/telemetry.test.ts +126 -0
- package/dist/cli.d.ts +35 -0
- package/dist/cli.d.ts.map +1 -0
- package/dist/cli.js +262 -3115
- package/dist/cli.js.map +1 -1
- package/dist/config.d.ts +80 -0
- package/dist/config.d.ts.map +1 -0
- package/dist/config.js +208 -0
- package/dist/config.js.map +1 -0
- package/dist/index.d.ts +25 -1130
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +72 -3096
- package/dist/index.js.map +1 -1
- package/dist/server.d.ts +209 -0
- package/dist/server.d.ts.map +1 -0
- package/dist/server.js +1089 -0
- package/dist/server.js.map +1 -0
- package/dist/streaming.d.ts +80 -0
- package/dist/streaming.d.ts.map +1 -0
- package/dist/streaming.js +271 -0
- package/dist/streaming.js.map +1 -0
- package/dist/telemetry.d.ts +111 -0
- package/dist/telemetry.d.ts.map +1 -0
- package/dist/telemetry.js +315 -0
- package/dist/telemetry.js.map +1 -0
- package/package.json +21 -46
- package/src/cli.ts +341 -0
- package/src/config.ts +206 -0
- package/src/index.ts +82 -0
- package/src/server.ts +1328 -0
- package/src/streaming.ts +331 -0
- package/src/telemetry.ts +343 -0
- package/tsconfig.json +19 -0
- package/vitest.config.ts +21 -0
- package/LICENSE +0 -21
- package/dist/cli.d.mts +0 -1
- package/dist/cli.mjs +0 -3134
- package/dist/cli.mjs.map +0 -1
- package/dist/index.d.mts +0 -1141
- package/dist/index.mjs +0 -3039
- package/dist/index.mjs.map +0 -1
package/README.md
CHANGED
|
@@ -1,371 +1,185 @@
|
|
|
1
1
|
# @relayplane/proxy
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Intelligent AI model routing proxy for cost optimization and observability.
|
|
4
4
|
|
|
5
|
-
|
|
6
|
-
|
|
7
|
-
> **Note:** Designed for standard API key users (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`). MAX subscription OAuth is not currently supported — MAX users should continue using their provider directly.
|
|
8
|
-
|
|
9
|
-
> ⚠️ **Cost Monitoring Required**
|
|
10
|
-
>
|
|
11
|
-
> RelayPlane routes requests to LLM providers using your API keys. **This incurs real costs.**
|
|
12
|
-
>
|
|
13
|
-
> - Set up billing alerts with your providers (Anthropic, OpenAI, etc.)
|
|
14
|
-
> - Monitor usage through your provider's dashboard
|
|
15
|
-
> - Use `/relayplane stats` or `curl localhost:3001/control/stats` to track usage
|
|
16
|
-
> - Start with test requests to understand routing behavior
|
|
17
|
-
>
|
|
18
|
-
> RelayPlane provides cost *optimization*, not cost *elimination*. You are responsible for monitoring your actual spending.
|
|
19
|
-
|
|
20
|
-
[](https://github.com/RelayPlane/proxy/actions/workflows/ci.yml)
|
|
21
|
-
[](https://www.npmjs.com/package/@relayplane/proxy)
|
|
22
|
-
[](https://opensource.org/licenses/MIT)
|
|
23
|
-
|
|
24
|
-
## Install
|
|
25
|
-
|
|
26
|
-
```bash
|
|
27
|
-
npm install @relayplane/proxy
|
|
28
|
-
```
|
|
29
|
-
|
|
30
|
-
Or run directly:
|
|
5
|
+
## Installation
|
|
31
6
|
|
|
32
7
|
```bash
|
|
33
|
-
|
|
34
|
-
```
|
|
35
|
-
|
|
36
|
-
## CLI Commands
|
|
37
|
-
|
|
38
|
-
```bash
|
|
39
|
-
# Start the proxy server
|
|
40
|
-
npx @relayplane/proxy
|
|
41
|
-
|
|
42
|
-
# Start on custom port
|
|
43
|
-
npx @relayplane/proxy --port 8080
|
|
44
|
-
|
|
45
|
-
# View routing statistics
|
|
46
|
-
npx @relayplane/proxy stats
|
|
47
|
-
|
|
48
|
-
# View stats for last 30 days
|
|
49
|
-
npx @relayplane/proxy stats --days 30
|
|
50
|
-
|
|
51
|
-
# Show help
|
|
52
|
-
npx @relayplane/proxy --help
|
|
53
|
-
```
|
|
54
|
-
|
|
55
|
-
## OpenClaw Slash Commands
|
|
56
|
-
|
|
57
|
-
If you're using OpenClaw, these chat commands are available:
|
|
58
|
-
|
|
59
|
-
| Command | Description |
|
|
60
|
-
|---------|-------------|
|
|
61
|
-
| `/relayplane stats` | Show usage statistics and cost savings |
|
|
62
|
-
| `/relayplane status` | Show proxy health and configuration |
|
|
63
|
-
| `/relayplane switch <mode>` | Change routing mode (auto\|cost\|fast\|quality) |
|
|
64
|
-
| `/relayplane models` | List available routing models |
|
|
65
|
-
|
|
66
|
-
Example:
|
|
67
|
-
```
|
|
68
|
-
/relayplane stats
|
|
69
|
-
/relayplane switch cost
|
|
8
|
+
npm install -g @relayplane/proxy
|
|
70
9
|
```
|
|
71
10
|
|
|
72
11
|
## Quick Start
|
|
73
12
|
|
|
74
|
-
### 1. Set your API keys
|
|
75
|
-
|
|
76
13
|
```bash
|
|
77
|
-
|
|
78
|
-
export
|
|
79
|
-
|
|
80
|
-
```
|
|
81
|
-
|
|
82
|
-
### 2. Start the proxy
|
|
83
|
-
|
|
84
|
-
```bash
|
|
85
|
-
npx @relayplane/proxy --port 3001
|
|
86
|
-
```
|
|
14
|
+
# Set your API keys
|
|
15
|
+
export ANTHROPIC_API_KEY=your-key
|
|
16
|
+
export OPENAI_API_KEY=your-key
|
|
87
17
|
|
|
88
|
-
|
|
18
|
+
# Start the proxy
|
|
19
|
+
relayplane-proxy
|
|
89
20
|
|
|
90
|
-
|
|
21
|
+
# Configure your tools to use the proxy
|
|
91
22
|
export ANTHROPIC_BASE_URL=http://localhost:3001
|
|
92
23
|
export OPENAI_BASE_URL=http://localhost:3001
|
|
93
24
|
|
|
94
|
-
#
|
|
95
|
-
openclaw
|
|
96
|
-
```
|
|
97
|
-
|
|
98
|
-
That's it. All API calls now route through RelayPlane for intelligent model selection.
|
|
99
|
-
|
|
100
|
-
## How It Works
|
|
101
|
-
|
|
102
|
-
```
|
|
103
|
-
Your Tool (OpenClaw, Cursor, etc.)
|
|
104
|
-
│
|
|
105
|
-
▼
|
|
106
|
-
RelayPlane Proxy
|
|
107
|
-
├── Infers task type (code_review, analysis, etc.)
|
|
108
|
-
├── Checks routing rules
|
|
109
|
-
├── Selects optimal model (Haiku for simple, Opus for complex)
|
|
110
|
-
├── Tracks outcomes (success/failure/latency)
|
|
111
|
-
└── Learns patterns → improves over time
|
|
112
|
-
│
|
|
113
|
-
▼
|
|
114
|
-
Provider (Anthropic, OpenAI, etc.)
|
|
25
|
+
# Run your AI tools (Claude Code, Cursor, Aider, etc.)
|
|
115
26
|
```
|
|
116
27
|
|
|
117
|
-
##
|
|
28
|
+
## Features
|
|
118
29
|
|
|
119
|
-
|
|
30
|
+
- **Intelligent Routing**: Routes requests to the optimal model based on task type
|
|
31
|
+
- **Cost Tracking**: Tracks and reports API costs across all providers
|
|
32
|
+
- **Provider Agnostic**: Works with Anthropic, OpenAI, Gemini, xAI, and more
|
|
33
|
+
- **Local Learning**: Learns from your usage patterns to improve routing
|
|
34
|
+
- **Privacy First**: Never sees your prompts or responses
|
|
120
35
|
|
|
121
|
-
|
|
122
|
-
- **Pattern Detection** — Identifies what works for your specific codebase
|
|
123
|
-
- **Continuous Improvement** — Routing gets smarter the more you use it
|
|
124
|
-
- **Local Intelligence** — All learning happens in your local SQLite DB
|
|
36
|
+
## CLI Options
|
|
125
37
|
|
|
126
38
|
```bash
|
|
127
|
-
|
|
128
|
-
npx @relayplane/proxy stats
|
|
39
|
+
relayplane-proxy [command] [options]
|
|
129
40
|
|
|
130
|
-
|
|
131
|
-
|
|
41
|
+
Commands:
|
|
42
|
+
(default) Start the proxy server
|
|
43
|
+
telemetry [on|off|status] Manage telemetry settings
|
|
44
|
+
stats Show usage statistics
|
|
45
|
+
config Show configuration
|
|
132
46
|
|
|
133
|
-
|
|
134
|
-
|
|
47
|
+
Options:
|
|
48
|
+
--port <number> Port to listen on (default: 3001)
|
|
49
|
+
--host <string> Host to bind to (default: 127.0.0.1)
|
|
50
|
+
--offline Disable all network calls except LLM endpoints
|
|
51
|
+
--audit Show telemetry payloads before sending
|
|
52
|
+
-v, --verbose Enable verbose logging
|
|
53
|
+
-h, --help Show this help message
|
|
54
|
+
--version Show version
|
|
135
55
|
```
|
|
136
56
|
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
## Supported Providers
|
|
57
|
+
## Telemetry
|
|
140
58
|
|
|
141
|
-
|
|
142
|
-
|----------|--------|-----------|-------|
|
|
143
|
-
| **Anthropic** | Claude 3.5 Haiku, Sonnet 4, Opus 4.5 | ✓ | ✓ |
|
|
144
|
-
| **OpenAI** | GPT-4o, GPT-4o-mini, GPT-4.1, o1, o3 | ✓ | ✓ |
|
|
145
|
-
| **Google** | Gemini 2.0 Flash, Gemini Pro | ✓ | ✓ |
|
|
146
|
-
| **xAI** | Grok (grok-*) | ✓ | ✓ |
|
|
147
|
-
| **Moonshot** | Moonshot v1 (8k, 32k, 128k) | ✓ | ✓ |
|
|
59
|
+
RelayPlane collects anonymous telemetry to improve model routing. This data helps us understand usage patterns and optimize routing decisions.
|
|
148
60
|
|
|
149
|
-
|
|
61
|
+
### What We Collect (Exact Schema)
|
|
150
62
|
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
|
|
156
|
-
|
|
157
|
-
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
|
|
161
|
-
|
|
162
|
-
|
|
163
|
-
| Pay Opus token rates for simple tasks | Route simple tasks to Haiku (1/10 the cost) |
|
|
164
|
-
| Static model selection | Learns from outcomes over time |
|
|
165
|
-
| Manual optimization | Automatic cost-quality balance |
|
|
166
|
-
| No visibility into spend | Built-in savings tracking |
|
|
63
|
+
```json
|
|
64
|
+
{
|
|
65
|
+
"device_id": "anon_8f3a...",
|
|
66
|
+
"task_type": "code_review",
|
|
67
|
+
"model": "claude-3-5-haiku",
|
|
68
|
+
"tokens_in": 1847,
|
|
69
|
+
"tokens_out": 423,
|
|
70
|
+
"latency_ms": 2341,
|
|
71
|
+
"success": true,
|
|
72
|
+
"cost_usd": 0.02
|
|
73
|
+
}
|
|
74
|
+
```
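
For readers who want to check audited payloads against this schema, here is a minimal TypeScript sketch. The `TelemetryEvent` type and the `isTelemetryEvent` guard are hypothetical names introduced for illustration and are not taken from the package source; only the field names and types mirror the example payload above.

```typescript
// Hypothetical type mirroring the JSON payload shown in the README.
interface TelemetryEvent {
  device_id: string;
  task_type: string;
  model: string;
  tokens_in: number;
  tokens_out: number;
  latency_ms: number;
  success: boolean;
  cost_usd: number;
}

// Expected `typeof` result for each documented field.
const EXPECTED_TYPES: Record<keyof TelemetryEvent, string> = {
  device_id: "string",
  task_type: "string",
  model: "string",
  tokens_in: "number",
  tokens_out: "number",
  latency_ms: "number",
  success: "boolean",
  cost_usd: "number",
};

// Illustrative guard: accepts a value only if it has exactly the documented
// fields, each with the documented type. Extra keys (for example a prompt)
// cause rejection.
function isTelemetryEvent(value: unknown): value is TelemetryEvent {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const expectedKeys = Object.keys(EXPECTED_TYPES);
  if (Object.keys(record).length !== expectedKeys.length) return false;
  return expectedKeys.every(
    (key) => typeof record[key] === EXPECTED_TYPES[key as keyof TelemetryEvent],
  );
}
```

A guard like this could be run over the payloads printed by `--audit` to confirm that nothing beyond the eight listed fields is present.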
|
|
167
75
|
|
|
168
|
-
|
|
76
|
+
### Field Descriptions
|
|
169
77
|
|
|
170
|
-
|
|
171
|
-
|
|
172
|
-
|
|
173
|
-
|
|
174
|
-
|
|
78
|
+
| Field | Type | Description |
|
|
79
|
+
|-------|------|-------------|
|
|
80
|
+
| `device_id` | string | Anonymous random ID (not fingerprintable) |
|
|
81
|
+
| `task_type` | string | Inferred from token patterns, NOT prompt content |
|
|
82
|
+
| `model` | string | The model that handled the request |
|
|
83
|
+
| `tokens_in` | number | Input token count |
|
|
84
|
+
| `tokens_out` | number | Output token count |
|
|
85
|
+
| `latency_ms` | number | Request latency in milliseconds |
|
|
86
|
+
| `success` | boolean | Whether the request succeeded |
|
|
87
|
+
| `cost_usd` | number | Estimated cost in USD |
|
|
175
88
|
|
|
176
|
-
|
|
89
|
+
### Task Types
|
|
177
90
|
|
|
178
|
-
|
|
179
|
-
import { startProxy, RelayPlane, calculateSavings } from '@relayplane/proxy';
|
|
91
|
+
Task types are inferred from request characteristics (token counts, ratios, etc.) - never from prompt content:
|
|
180
92
|
|
|
181
|
-
|
|
182
|
-
|
|
93
|
+
- `quick_task` - Short input/output (< 500 tokens each)
|
|
94
|
+
- `code_review` - Medium-long input, medium output
|
|
95
|
+
- `generation` - High output/input ratio
|
|
96
|
+
- `classification` - Low output/input ratio, short output
|
|
97
|
+
- `long_context` - Input > 10,000 tokens
|
|
98
|
+
- `content_generation` - Output > 1,000 tokens
|
|
99
|
+
- `tool_use` - Request includes tool calls
|
|
100
|
+
- `general` - Default classification
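
The list above can be sketched as a small classifier. This is an illustration only, not the package's implementation: the `inferTaskType` function, the `RequestShape` type, the order in which rules are checked, and every threshold other than the 500, 1,000 and 10,000 token limits stated in the list are assumptions.

```typescript
type TaskType =
  | "quick_task"
  | "code_review"
  | "generation"
  | "classification"
  | "long_context"
  | "content_generation"
  | "tool_use"
  | "general";

// Hypothetical summary of a request: token counts and whether tools were used.
interface RequestShape {
  tokensIn: number;
  tokensOut: number;
  hasToolCalls: boolean;
}

// Classifies a request from its shape alone; prompt text is never inspected.
function inferTaskType(shape: RequestShape): TaskType {
  const { tokensIn, tokensOut, hasToolCalls } = shape;

  if (hasToolCalls) return "tool_use";
  if (tokensIn > 10000) return "long_context"; // limit from the list
  if (tokensOut > 1000) return "content_generation"; // limit from the list
  if (tokensIn < 500 && tokensOut < 500) return "quick_task"; // limit from the list

  const ratio = tokensOut / Math.max(tokensIn, 1);
  if (ratio > 2) return "generation"; // assumed cutoff for "high ratio"
  if (ratio < 0.1 && tokensOut < 100) return "classification"; // assumed cutoffs
  if (tokensIn >= 1000 && tokensOut >= 200) return "code_review"; // assumed cutoffs

  return "general";
}
```

With these assumed cutoffs, the sample payload from the schema section (1847 tokens in, 423 out, no tool calls) is classified as `code_review`.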
|
|
183
101
|
|
|
184
|
-
|
|
185
|
-
const relay = new RelayPlane({});
|
|
186
|
-
const result = await relay.run({ prompt: 'Review this code...' });
|
|
187
|
-
console.log(result.taskType); // 'code_review'
|
|
188
|
-
console.log(result.model); // 'anthropic:claude-3-5-haiku-latest'
|
|
102
|
+
### What We NEVER Collect
|
|
189
103
|
|
|
190
|
-
|
|
191
|
-
|
|
192
|
-
|
|
104
|
+
- ❌ Your prompts
|
|
105
|
+
- ❌ Model responses
|
|
106
|
+
- ❌ File paths or contents
|
|
107
|
+
- ❌ Anything that could identify you or your project
|
|
193
108
|
|
|
194
|
-
|
|
195
|
-
```
|
|
109
|
+
### Verification
|
|
196
110
|
|
|
197
|
-
|
|
111
|
+
You can verify exactly what data is collected:
|
|
198
112
|
|
|
199
113
|
```bash
|
|
200
|
-
|
|
201
|
-
|
|
202
|
-
Options:
|
|
203
|
-
--port <number> Port to listen on (default: 3001)
|
|
204
|
-
--host <string> Host to bind to (default: 127.0.0.1)
|
|
205
|
-
-v, --verbose Enable verbose logging
|
|
206
|
-
-h, --help Show help
|
|
207
|
-
```
|
|
208
|
-
|
|
209
|
-
## REST API
|
|
210
|
-
|
|
211
|
-
The proxy exposes control endpoints for stats and monitoring:
|
|
114
|
+
# See telemetry payloads before they're sent
|
|
115
|
+
relayplane-proxy --audit
|
|
212
116
|
|
|
213
|
-
|
|
117
|
+
# Disable all telemetry transmission
|
|
118
|
+
relayplane-proxy --offline
|
|
214
119
|
|
|
215
|
-
|
|
216
|
-
|
|
217
|
-
```bash
|
|
218
|
-
curl http://localhost:3001/control/status
|
|
120
|
+
# View the source code
|
|
121
|
+
# https://github.com/RelayPlane/proxy
|
|
219
122
|
```
|
|
220
123
|
|
|
221
|
-
|
|
222
|
-
{
|
|
223
|
-
"enabled": true,
|
|
224
|
-
"mode": "cascade",
|
|
225
|
-
"modelOverrides": {}
|
|
226
|
-
}
|
|
227
|
-
```
|
|
124
|
+
### Opt-Out
|
|
228
125
|
|
|
229
|
-
|
|
230
|
-
|
|
231
|
-
Aggregated statistics and routing counts.
|
|
126
|
+
To disable telemetry completely:
|
|
232
127
|
|
|
233
128
|
```bash
|
|
234
|
-
|
|
129
|
+
relayplane-proxy telemetry off
|
|
235
130
|
```
|
|
236
131
|
|
|
237
|
-
|
|
238
|
-
{
|
|
239
|
-
"uptimeMs": 3600000,
|
|
240
|
-
"uptimeFormatted": "60m 0s",
|
|
241
|
-
"totalRequests": 142,
|
|
242
|
-
"successfulRequests": 138,
|
|
243
|
-
"failedRequests": 4,
|
|
244
|
-
"successRate": "97.2%",
|
|
245
|
-
"avgLatencyMs": 1203,
|
|
246
|
-
"escalations": 12,
|
|
247
|
-
"routingCounts": {
|
|
248
|
-
"auto": 100,
|
|
249
|
-
"cost": 30,
|
|
250
|
-
"passthrough": 12
|
|
251
|
-
},
|
|
252
|
-
"modelCounts": {
|
|
253
|
-
"anthropic/claude-3-5-haiku-latest": 98,
|
|
254
|
-
"anthropic/claude-sonnet-4-20250514": 44
|
|
255
|
-
}
|
|
256
|
-
}
|
|
257
|
-
```
|
|
258
|
-
|
|
259
|
-
### `POST /control/enable` / `POST /control/disable`
|
|
260
|
-
|
|
261
|
-
Enable or disable routing (passthrough mode when disabled).
|
|
132
|
+
To re-enable:
|
|
262
133
|
|
|
263
134
|
```bash
|
|
264
|
-
|
|
265
|
-
curl -X POST http://localhost:3001/control/disable
|
|
135
|
+
relayplane-proxy telemetry on
|
|
266
136
|
```
|
|
267
137
|
|
|
268
|
-
|
|
269
|
-
|
|
270
|
-
Update configuration (hot-reload, merges with existing).
|
|
138
|
+
Check current status:
|
|
271
139
|
|
|
272
140
|
```bash
|
|
273
|
-
|
|
274
|
-
-H "Content-Type: application/json" \
|
|
275
|
-
-d '{"routing": {"mode": "cascade"}}'
|
|
141
|
+
relayplane-proxy telemetry status
|
|
276
142
|
```
|
|
277
143
|
|
|
278
144
|
## Configuration
|
|
279
145
|
|
|
280
|
-
|
|
281
|
-
|
|
282
|
-
```json
|
|
283
|
-
{
|
|
284
|
-
"enabled": true,
|
|
285
|
-
"routing": {
|
|
286
|
-
"mode": "cascade",
|
|
287
|
-
"cascade": {
|
|
288
|
-
"enabled": true,
|
|
289
|
-
"models": [
|
|
290
|
-
"claude-3-haiku-20240307",
|
|
291
|
-
"claude-3-5-sonnet-20241022",
|
|
292
|
-
"claude-3-opus-20240229"
|
|
293
|
-
],
|
|
294
|
-
"escalateOn": "uncertainty",
|
|
295
|
-
"maxEscalations": 1
|
|
296
|
-
},
|
|
297
|
-
"complexity": {
|
|
298
|
-
"enabled": true,
|
|
299
|
-
"simple": "claude-3-haiku-20240307",
|
|
300
|
-
"moderate": "claude-3-5-sonnet-20241022",
|
|
301
|
-
"complex": "claude-3-opus-20240229"
|
|
302
|
-
}
|
|
303
|
-
},
|
|
304
|
-
"reliability": {
|
|
305
|
-
"cooldowns": {
|
|
306
|
-
"enabled": true,
|
|
307
|
-
"allowedFails": 3,
|
|
308
|
-
"windowSeconds": 60,
|
|
309
|
-
"cooldownSeconds": 120
|
|
310
|
-
}
|
|
311
|
-
},
|
|
312
|
-
"modelOverrides": {}
|
|
313
|
-
}
|
|
314
|
-
```
|
|
315
|
-
|
|
316
|
-
**Edit and save — changes apply instantly** (hot-reload, no restart needed).
|
|
317
|
-
|
|
318
|
-
### Configuration Options
|
|
319
|
-
|
|
320
|
-
| Field | Description |
|
|
321
|
-
|-------|-------------|
|
|
322
|
-
| `enabled` | Enable/disable routing (false = passthrough mode) |
|
|
323
|
-
| `routing.mode` | `"cascade"` or `"standard"` |
|
|
324
|
-
| `routing.cascade.models` | Ordered list of models to try (cheapest first) |
|
|
325
|
-
| `routing.cascade.escalateOn` | When to escalate: `"uncertainty"`, `"refusal"`, or `"error"` |
|
|
326
|
-
| `routing.complexity.simple/moderate/complex` | Models for each complexity level |
|
|
327
|
-
| `reliability.cooldowns` | Auto-disable failing providers temporarily |
|
|
328
|
-
| `modelOverrides` | Map input model names to different targets |
|
|
146
|
+
Configuration is stored in `~/.relayplane/config.json`.
|
|
329
147
|
|
|
330
|
-
###
|
|
148
|
+
### Set API Key (Pro Features)
|
|
331
149
|
|
|
332
|
-
|
|
333
|
-
|
|
334
|
-
{
|
|
335
|
-
"routing": {
|
|
336
|
-
"complexity": {
|
|
337
|
-
"complex": "gpt-4o"
|
|
338
|
-
}
|
|
339
|
-
}
|
|
340
|
-
}
|
|
150
|
+
```bash
|
|
151
|
+
relayplane-proxy config set-key your-api-key
|
|
341
152
|
```
|
|
342
153
|
|
|
343
|
-
|
|
344
|
-
|
|
345
|
-
|
|
346
|
-
|
|
347
|
-
"claude-3-opus": "claude-3-5-sonnet-20241022"
|
|
348
|
-
}
|
|
349
|
-
}
|
|
154
|
+
### View Configuration
|
|
155
|
+
|
|
156
|
+
```bash
|
|
157
|
+
relayplane-proxy config
|
|
350
158
|
```
|
|
351
159
|
|
|
352
|
-
##
|
|
160
|
+
## Usage Statistics
|
|
353
161
|
|
|
354
|
-
|
|
162
|
+
View your usage statistics:
|
|
355
163
|
|
|
356
164
|
```bash
|
|
357
|
-
|
|
358
|
-
sqlite3 ~/.relayplane/data.db "SELECT * FROM runs ORDER BY created_at DESC LIMIT 10"
|
|
359
|
-
|
|
360
|
-
# Check routing rules
|
|
361
|
-
sqlite3 ~/.relayplane/data.db "SELECT * FROM routing_rules"
|
|
165
|
+
relayplane-proxy stats
|
|
362
166
|
```
|
|
363
167
|
|
|
364
|
-
|
|
168
|
+
This shows:
|
|
169
|
+
- Total requests and cost
|
|
170
|
+
- Success rate
|
|
171
|
+
- Breakdown by model
|
|
172
|
+
- Breakdown by task type
|
|
173
|
+
|
|
174
|
+
## Environment Variables
|
|
365
175
|
|
|
366
|
-
|
|
367
|
-
|
|
368
|
-
|
|
176
|
+
| Variable | Description |
|
|
177
|
+
|----------|-------------|
|
|
178
|
+
| `ANTHROPIC_API_KEY` | Anthropic API key |
|
|
179
|
+
| `OPENAI_API_KEY` | OpenAI API key |
|
|
180
|
+
| `GEMINI_API_KEY` | Google Gemini API key |
|
|
181
|
+
| `XAI_API_KEY` | xAI/Grok API key |
|
|
182
|
+
| `MOONSHOT_API_KEY` | Moonshot API key |
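
As a quick pre-flight check, the sketch below reports which providers have a key set. The variable names come from the table above; the `configuredProviders` function and the provider labels are hypothetical and are not part of the package.

```typescript
// Environment variable names from the table, paired with illustrative labels.
const PROVIDER_ENV_VARS: ReadonlyArray<[string, string]> = [
  ["ANTHROPIC_API_KEY", "Anthropic"],
  ["OPENAI_API_KEY", "OpenAI"],
  ["GEMINI_API_KEY", "Google Gemini"],
  ["XAI_API_KEY", "xAI"],
  ["MOONSHOT_API_KEY", "Moonshot"],
];

// Returns the labels of providers whose key is present and non-blank.
function configuredProviders(
  env: Record<string, string | undefined>,
): string[] {
  return PROVIDER_ENV_VARS
    .filter(([name]) => (env[name] ?? "").trim() !== "")
    .map(([, provider]) => provider);
}
```

In Node.js the function can be called with `process.env`; it takes the environment as a parameter so it can also be exercised with a plain object.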
|
|
369
183
|
|
|
370
184
|
## License
|
|
371
185
|
|