@relayplane/proxy 0.1.10 → 0.2.0

package/README.md CHANGED
@@ -1,371 +1,185 @@
1
1
  # @relayplane/proxy
2
2
 
3
- **100% Local. Zero Cloud. Full Control.**
3
+ Intelligent AI model routing proxy for cost optimization and observability.
4
4
 
5
- Intelligent AI model routing that cuts costs by 50-80% while maintaining quality.
6
-
7
- > **Note:** Designed for standard API key users (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`). MAX subscription OAuth is not currently supported — MAX users should continue using their provider directly.
8
-
9
- > ⚠️ **Cost Monitoring Required**
10
- >
11
- > RelayPlane routes requests to LLM providers using your API keys. **This incurs real costs.**
12
- >
13
- > - Set up billing alerts with your providers (Anthropic, OpenAI, etc.)
14
- > - Monitor usage through your provider's dashboard
15
- > - Use `/relayplane stats` or `curl localhost:3001/control/stats` to track usage
16
- > - Start with test requests to understand routing behavior
17
- >
18
- > RelayPlane provides cost *optimization*, not cost *elimination*. You are responsible for monitoring your actual spending.
19
-
20
- [![CI](https://github.com/RelayPlane/proxy/actions/workflows/ci.yml/badge.svg)](https://github.com/RelayPlane/proxy/actions/workflows/ci.yml)
21
- [![npm version](https://img.shields.io/npm/v/@relayplane/proxy)](https://www.npmjs.com/package/@relayplane/proxy)
22
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
23
-
24
- ## Install
25
-
26
- ```bash
27
- npm install @relayplane/proxy
28
- ```
29
-
30
- Or run directly:
5
+ ## Installation
31
6
 
32
7
  ```bash
33
- npx @relayplane/proxy
34
- ```
35
-
36
- ## CLI Commands
37
-
38
- ```bash
39
- # Start the proxy server
40
- npx @relayplane/proxy
41
-
42
- # Start on custom port
43
- npx @relayplane/proxy --port 8080
44
-
45
- # View routing statistics
46
- npx @relayplane/proxy stats
47
-
48
- # View stats for last 30 days
49
- npx @relayplane/proxy stats --days 30
50
-
51
- # Show help
52
- npx @relayplane/proxy --help
53
- ```
54
-
55
- ## OpenClaw Slash Commands
56
-
57
- If you're using OpenClaw, these chat commands are available:
58
-
59
- | Command | Description |
60
- |---------|-------------|
61
- | `/relayplane stats` | Show usage statistics and cost savings |
62
- | `/relayplane status` | Show proxy health and configuration |
63
- | `/relayplane switch <mode>` | Change routing mode (auto\|cost\|fast\|quality) |
64
- | `/relayplane models` | List available routing models |
65
-
66
- Example:
67
- ```
68
- /relayplane stats
69
- /relayplane switch cost
8
+ npm install -g @relayplane/proxy
70
9
  ```
71
10
 
72
11
  ## Quick Start
73
12
 
74
- ### 1. Set your API keys
75
-
76
13
  ```bash
77
- export ANTHROPIC_API_KEY="sk-ant-..."
78
- export OPENAI_API_KEY="sk-..."
79
- # Optional: GEMINI_API_KEY, XAI_API_KEY, MOONSHOT_API_KEY
80
- ```
81
-
82
- ### 2. Start the proxy
83
-
84
- ```bash
85
- npx @relayplane/proxy --port 3001
86
- ```
14
+ # Set your API keys
15
+ export ANTHROPIC_API_KEY=your-key
16
+ export OPENAI_API_KEY=your-key
87
17
 
88
- ### 3. Point your tools to the proxy
18
+ # Start the proxy
19
+ relayplane-proxy
89
20
 
90
- ```bash
21
+ # Configure your tools to use the proxy
91
22
  export ANTHROPIC_BASE_URL=http://localhost:3001
92
23
  export OPENAI_BASE_URL=http://localhost:3001
93
24
 
94
- # Now run OpenClaw, Cursor, Aider, or any tool
95
- openclaw
96
- ```
97
-
98
- That's it. All API calls now route through RelayPlane for intelligent model selection.
99
-
100
- ## How It Works
101
-
102
- ```
103
- Your Tool (OpenClaw, Cursor, etc.)
104
-
105
-
106
- RelayPlane Proxy
107
- ├── Infers task type (code_review, analysis, etc.)
108
- ├── Checks routing rules
109
- ├── Selects optimal model (Haiku for simple, Opus for complex)
110
- ├── Tracks outcomes (success/failure/latency)
111
- └── Learns patterns → improves over time
112
-
113
-
114
- Provider (Anthropic, OpenAI, etc.)
25
+ # Run your AI tools (Claude Code, Cursor, Aider, etc.)
115
26
  ```
116
27
 
117
- ## Learning & Adaptation
28
+ ## Features
118
29
 
119
- RelayPlane doesn't just route; it **learns from every request**:
30
+ - **Intelligent Routing**: Routes requests to the optimal model based on task type
31
+ - **Cost Tracking**: Tracks and reports API costs across all providers
32
+ - **Provider Agnostic**: Works with Anthropic, OpenAI, Gemini, xAI, and more
33
+ - **Local Learning**: Learns from your usage patterns to improve routing
34
+ - **Privacy First**: Never sees your prompts or responses
120
35
 
121
- - **Outcome Tracking** — Records success/failure for each route decision
122
- - **Pattern Detection** — Identifies what works for your specific codebase
123
- - **Continuous Improvement** — Routing gets smarter the more you use it
124
- - **Local Intelligence** — All learning happens in your local SQLite DB
36
+ ## CLI Options
125
37
 
126
38
  ```bash
127
- # View your routing stats (last 7 days)
128
- npx @relayplane/proxy stats
39
+ relayplane-proxy [command] [options]
129
40
 
130
- # View last 30 days
131
- npx @relayplane/proxy stats --days 30
41
+ Commands:
42
+ (default) Start the proxy server
43
+ telemetry [on|off|status] Manage telemetry settings
44
+ stats Show usage statistics
45
+ config Show configuration
132
46
 
133
- # Query the raw data directly
134
- sqlite3 ~/.relayplane/data.db "SELECT model, task_type, COUNT(*) FROM runs GROUP BY model, task_type"
47
+ Options:
48
+ --port <number> Port to listen on (default: 3001)
49
+ --host <string> Host to bind to (default: 127.0.0.1)
50
+ --offline Disable all network calls except LLM endpoints
51
+ --audit Show telemetry payloads before sending
52
+ -v, --verbose Enable verbose logging
53
+ -h, --help Show this help message
54
+ --version Show version
135
55
  ```
136
56
 
137
- Unlike static routing rules, RelayPlane adapts to **your** usage patterns.
138
-
139
- ## Supported Providers
57
+ ## Telemetry
140
58
 
141
- | Provider | Models | Streaming | Tools |
142
- |----------|--------|-----------|-------|
143
- | **Anthropic** | Claude 3.5 Haiku, Sonnet 4, Opus 4.5 | ✓ | ✓ |
144
- | **OpenAI** | GPT-4o, GPT-4o-mini, GPT-4.1, o1, o3 | ✓ | ✓ |
145
- | **Google** | Gemini 2.0 Flash, Gemini Pro | ✓ | ✓ |
146
- | **xAI** | Grok (grok-*) | ✓ | ✓ |
147
- | **Moonshot** | Moonshot v1 (8k, 32k, 128k) | ✓ | ✓ |
59
+ RelayPlane collects anonymous telemetry to improve model routing. This data helps us understand usage patterns and optimize routing decisions.
148
60
 
149
- ## Routing Modes
61
+ ### What We Collect (Exact Schema)
150
62
 
151
- | Model | Description |
152
- |-------|-------------|
153
- | `relayplane:auto` | Infers task type, routes to optimal model |
154
- | `relayplane:cost` | Prioritizes cheapest models (maximum savings) |
155
- | `relayplane:quality` | Uses best available model |
156
-
157
- Or pass through explicit models: `claude-3-5-sonnet-latest`, `gpt-4o`, etc.
158
-
159
- ## Why RelayPlane?
160
-
161
- | Without RelayPlane | With RelayPlane |
162
- |-------------------|-----------------|
163
- | Pay Opus token rates for simple tasks | Route simple tasks to Haiku (1/10 the cost) |
164
- | Static model selection | Learns from outcomes over time |
165
- | Manual optimization | Automatic cost-quality balance |
166
- | No visibility into spend | Built-in savings tracking |
63
+ ```json
64
+ {
65
+ "device_id": "anon_8f3a...",
66
+ "task_type": "code_review",
67
+ "model": "claude-3-5-haiku",
68
+ "tokens_in": 1847,
69
+ "tokens_out": 423,
70
+ "latency_ms": 2341,
71
+ "success": true,
72
+ "cost_usd": 0.02
73
+ }
74
+ ```
167
75
 
168
- ## Key Features
76
+ ### Field Descriptions
169
77
 
170
- - **100% Local**: All data in SQLite (`~/.relayplane/data.db`)
171
- - **Zero Friction** — Set 2 env vars, done
172
- - **Learning**: Improves routing based on outcomes
173
- - **Full Streaming**: SSE support for all providers
174
- - **Tool Calls**: Function calling across providers
78
+ | Field | Type | Description |
79
+ |-------|------|-------------|
80
+ | `device_id` | string | Anonymous random ID (not fingerprintable) |
81
+ | `task_type` | string | Inferred from token patterns, NOT prompt content |
82
+ | `model` | string | The model that handled the request |
83
+ | `tokens_in` | number | Input token count |
84
+ | `tokens_out` | number | Output token count |
85
+ | `latency_ms` | number | Request latency in milliseconds |
86
+ | `success` | boolean | Whether the request succeeded |
87
+ | `cost_usd` | number | Estimated cost in USD |
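
The schema and field table added in 0.2.0 can be turned into a quick check when reviewing `--audit` output. The following is a minimal sketch, not code from the package: the `TelemetryEvent` interface and the `checkTelemetryPayload` helper are hypothetical names, and the field names and types are copied from the table above.

```typescript
// Hypothetical type mirroring the telemetry payload documented in the README.
interface TelemetryEvent {
  device_id: string;
  task_type: string;
  model: string;
  tokens_in: number;
  tokens_out: number;
  latency_ms: number;
  success: boolean;
  cost_usd: number;
}

// Documented type of each field, as listed in the "Field Descriptions" table.
const EXPECTED_FIELDS: Record<keyof TelemetryEvent, "string" | "number" | "boolean"> = {
  device_id: "string",
  task_type: "string",
  model: "string",
  tokens_in: "number",
  tokens_out: "number",
  latency_ms: "number",
  success: "boolean",
  cost_usd: "number",
};

// Returns a list of problems. An empty list means the payload contains
// exactly the documented fields, each with the documented type.
function checkTelemetryPayload(payload: Record<string, unknown>): string[] {
  const problems: string[] = [];
  const expectedNames = Object.keys(EXPECTED_FIELDS) as (keyof TelemetryEvent)[];

  for (const field of expectedNames) {
    if (!Object.prototype.hasOwnProperty.call(payload, field)) {
      problems.push(`missing field: ${field}`);
    } else if (typeof payload[field] !== EXPECTED_FIELDS[field]) {
      problems.push(`wrong type for ${field}: expected ${EXPECTED_FIELDS[field]}`);
    }
  }

  // Anything outside the documented schema (for example a prompt) is flagged.
  for (const field of Object.keys(payload)) {
    if (!Object.prototype.hasOwnProperty.call(EXPECTED_FIELDS, field)) {
      problems.push(`unexpected field: ${field}`);
    }
  }
  return problems;
}

// The sample payload from the README passes the check.
const samplePayload = {
  device_id: "anon_8f3a...",
  task_type: "code_review",
  model: "claude-3-5-haiku",
  tokens_in: 1847,
  tokens_out: 423,
  latency_ms: 2341,
  success: true,
  cost_usd: 0.02,
};
console.log(checkTelemetryPayload(samplePayload)); // prints []
```

Flagging unexpected fields matters as much as checking the documented ones, since the privacy claim is that nothing beyond these eight fields is sent.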
175
88
 
176
- ## Programmatic Usage
89
+ ### Task Types
177
90
 
178
- ```typescript
179
- import { startProxy, RelayPlane, calculateSavings } from '@relayplane/proxy';
91
+ Task types are inferred from request characteristics (token counts, ratios, etc.) - never from prompt content:
180
92
 
181
- // Start the proxy
182
- await startProxy({ port: 3001, verbose: true });
93
+ - `quick_task` - Short input/output (< 500 tokens each)
94
+ - `code_review` - Medium-long input, medium output
95
+ - `generation` - High output/input ratio
96
+ - `classification` - Low output/input ratio, short output
97
+ - `long_context` - Input > 10,000 tokens
98
+ - `content_generation` - Output > 1,000 tokens
99
+ - `tool_use` - Request includes tool calls
100
+ - `general` - Default classification
183
101
 
184
- // Or use RelayPlane directly
185
- const relay = new RelayPlane({});
186
- const result = await relay.run({ prompt: 'Review this code...' });
187
- console.log(result.taskType); // 'code_review'
188
- console.log(result.model); // 'anthropic:claude-3-5-haiku-latest'
102
+ ### What We NEVER Collect
189
103
 
190
- // Check savings
191
- const savings = calculateSavings(relay.store, 30);
192
- console.log(`Saved ${savings.savingsPercent}% this month`);
104
+ - Your prompts
105
+ - Model responses
106
+ - File paths or contents
107
+ - ❌ Anything that could identify you or your project
193
108
 
194
- relay.close();
195
- ```
109
+ ### Verification
196
110
 
197
- ## CLI Options
111
+ You can verify exactly what data is collected:
198
112
 
199
113
  ```bash
200
- npx @relayplane/proxy [options]
201
-
202
- Options:
203
- --port <number> Port to listen on (default: 3001)
204
- --host <string> Host to bind to (default: 127.0.0.1)
205
- -v, --verbose Enable verbose logging
206
- -h, --help Show help
207
- ```
208
-
209
- ## REST API
210
-
211
- The proxy exposes control endpoints for stats and monitoring:
114
+ # See telemetry payloads before they're sent
115
+ relayplane-proxy --audit
212
116
 
213
- ### `GET /control/status`
117
+ # Disable all telemetry transmission
118
+ relayplane-proxy --offline
214
119
 
215
- Proxy status and current configuration.
216
-
217
- ```bash
218
- curl http://localhost:3001/control/status
120
+ # View the source code
121
+ # https://github.com/RelayPlane/proxy
219
122
  ```
220
123
 
221
- ```json
222
- {
223
- "enabled": true,
224
- "mode": "cascade",
225
- "modelOverrides": {}
226
- }
227
- ```
124
+ ### Opt-Out
228
125
 
229
- ### `GET /control/stats`
230
-
231
- Aggregated statistics and routing counts.
126
+ To disable telemetry completely:
232
127
 
233
128
  ```bash
234
- curl http://localhost:3001/control/stats
129
+ relayplane-proxy telemetry off
235
130
  ```
236
131
 
237
- ```json
238
- {
239
- "uptimeMs": 3600000,
240
- "uptimeFormatted": "60m 0s",
241
- "totalRequests": 142,
242
- "successfulRequests": 138,
243
- "failedRequests": 4,
244
- "successRate": "97.2%",
245
- "avgLatencyMs": 1203,
246
- "escalations": 12,
247
- "routingCounts": {
248
- "auto": 100,
249
- "cost": 30,
250
- "passthrough": 12
251
- },
252
- "modelCounts": {
253
- "anthropic/claude-3-5-haiku-latest": 98,
254
- "anthropic/claude-sonnet-4-20250514": 44
255
- }
256
- }
257
- ```
258
-
259
- ### `POST /control/enable` / `POST /control/disable`
260
-
261
- Enable or disable routing (passthrough mode when disabled).
132
+ To re-enable:
262
133
 
263
134
  ```bash
264
- curl -X POST http://localhost:3001/control/enable
265
- curl -X POST http://localhost:3001/control/disable
135
+ relayplane-proxy telemetry on
266
136
  ```
267
137
 
268
- ### `POST /control/config`
269
-
270
- Update configuration (hot-reload, merges with existing).
138
+ Check current status:
271
139
 
272
140
  ```bash
273
- curl -X POST http://localhost:3001/control/config \
274
- -H "Content-Type: application/json" \
275
- -d '{"routing": {"mode": "cascade"}}'
141
+ relayplane-proxy telemetry status
276
142
  ```
277
143
 
278
144
  ## Configuration
279
145
 
280
- RelayPlane creates a config file on first run at `~/.relayplane/config.json`:
281
-
282
- ```json
283
- {
284
- "enabled": true,
285
- "routing": {
286
- "mode": "cascade",
287
- "cascade": {
288
- "enabled": true,
289
- "models": [
290
- "claude-3-haiku-20240307",
291
- "claude-3-5-sonnet-20241022",
292
- "claude-3-opus-20240229"
293
- ],
294
- "escalateOn": "uncertainty",
295
- "maxEscalations": 1
296
- },
297
- "complexity": {
298
- "enabled": true,
299
- "simple": "claude-3-haiku-20240307",
300
- "moderate": "claude-3-5-sonnet-20241022",
301
- "complex": "claude-3-opus-20240229"
302
- }
303
- },
304
- "reliability": {
305
- "cooldowns": {
306
- "enabled": true,
307
- "allowedFails": 3,
308
- "windowSeconds": 60,
309
- "cooldownSeconds": 120
310
- }
311
- },
312
- "modelOverrides": {}
313
- }
314
- ```
315
-
316
- **Edit and save — changes apply instantly** (hot-reload, no restart needed).
317
-
318
- ### Configuration Options
319
-
320
- | Field | Description |
321
- |-------|-------------|
322
- | `enabled` | Enable/disable routing (false = passthrough mode) |
323
- | `routing.mode` | `"cascade"` or `"standard"` |
324
- | `routing.cascade.models` | Ordered list of models to try (cheapest first) |
325
- | `routing.cascade.escalateOn` | When to escalate: `"uncertainty"`, `"refusal"`, or `"error"` |
326
- | `routing.complexity.simple/moderate/complex` | Models for each complexity level |
327
- | `reliability.cooldowns` | Auto-disable failing providers temporarily |
328
- | `modelOverrides` | Map input model names to different targets |
146
+ Configuration is stored in `~/.relayplane/config.json`.
329
147
 
330
- ### Examples
148
+ ### Set API Key (Pro Features)
331
149
 
332
- Use GPT-4o for complex tasks:
333
- ```json
334
- {
335
- "routing": {
336
- "complexity": {
337
- "complex": "gpt-4o"
338
- }
339
- }
340
- }
150
+ ```bash
151
+ relayplane-proxy config set-key your-api-key
341
152
  ```
342
153
 
343
- Override a specific model:
344
- ```json
345
- {
346
- "modelOverrides": {
347
- "claude-3-opus": "claude-3-5-sonnet-20241022"
348
- }
349
- }
154
+ ### View Configuration
155
+
156
+ ```bash
157
+ relayplane-proxy config
350
158
  ```
351
159
 
352
- ## Data Storage
160
+ ## Usage Statistics
353
161
 
354
- All data stored locally at `~/.relayplane/data.db` (SQLite).
162
+ View your usage statistics:
355
163
 
356
164
  ```bash
357
- # View recent runs
358
- sqlite3 ~/.relayplane/data.db "SELECT * FROM runs ORDER BY created_at DESC LIMIT 10"
359
-
360
- # Check routing rules
361
- sqlite3 ~/.relayplane/data.db "SELECT * FROM routing_rules"
165
+ relayplane-proxy stats
362
166
  ```
363
167
 
364
- ## Links
168
+ This shows:
169
+ - Total requests and cost
170
+ - Success rate
171
+ - Breakdown by model
172
+ - Breakdown by task type
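
The four items reported by `stats` can all be derived from per-request records with the fields shown in the telemetry schema. The following is a rough sketch of that aggregation, assuming such records are available in memory. `UsageRecord`, `UsageSummary`, and `summarizeUsage` are hypothetical names; how the real command stores and reads its data is not described in this diff.

```typescript
// Hypothetical per-request record, using field names from the telemetry schema.
interface UsageRecord {
  model: string;
  task_type: string;
  success: boolean;
  cost_usd: number;
}

interface UsageSummary {
  totalRequests: number;
  totalCostUsd: number;
  successRate: number; // fraction between 0 and 1
  byModel: Record<string, number>;
  byTaskType: Record<string, number>;
}

function summarizeUsage(records: UsageRecord[]): UsageSummary {
  const summary: UsageSummary = {
    totalRequests: 0,
    totalCostUsd: 0,
    successRate: 0,
    byModel: {},
    byTaskType: {},
  };
  let successes = 0;

  for (const record of records) {
    summary.totalRequests += 1;
    summary.totalCostUsd += record.cost_usd;
    if (record.success) successes += 1;
    // Count requests per model and per task type.
    summary.byModel[record.model] = (summary.byModel[record.model] || 0) + 1;
    summary.byTaskType[record.task_type] = (summary.byTaskType[record.task_type] || 0) + 1;
  }

  // Avoid dividing by zero when there are no records yet.
  summary.successRate = records.length > 0 ? successes / records.length : 0;
  return summary;
}

// Made-up records for illustration only.
const exampleRecords: UsageRecord[] = [
  { model: "claude-3-5-haiku", task_type: "quick_task", success: true, cost_usd: 0.25 },
  { model: "claude-3-5-haiku", task_type: "code_review", success: true, cost_usd: 0.5 },
  { model: "gpt-4o", task_type: "code_review", success: false, cost_usd: 0.25 },
];
console.log(summarizeUsage(exampleRecords));
```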
173
+
174
+ ## Environment Variables
365
175
 
366
- - [RelayPlane Proxy](https://relayplane.com/integrations/openclaw)
367
- - [GitHub](https://github.com/RelayPlane/proxy)
368
- - [RelayPlane](https://relayplane.com/)
176
+ | Variable | Description |
177
+ |----------|-------------|
178
+ | `ANTHROPIC_API_KEY` | Anthropic API key |
179
+ | `OPENAI_API_KEY` | OpenAI API key |
180
+ | `GEMINI_API_KEY` | Google Gemini API key |
181
+ | `XAI_API_KEY` | xAI/Grok API key |
182
+ | `MOONSHOT_API_KEY` | Moonshot API key |
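
Before starting the proxy it can be useful to confirm which of these keys are actually set. The helper below is a small sketch under that assumption; `configuredProviders` and the provider labels are made up for illustration, and only the variable names come from the table above.

```typescript
// Variable names from the README table, mapped to illustrative provider labels.
const PROVIDER_ENV_VARS: Record<string, string> = {
  ANTHROPIC_API_KEY: "anthropic",
  OPENAI_API_KEY: "openai",
  GEMINI_API_KEY: "gemini",
  XAI_API_KEY: "xai",
  MOONSHOT_API_KEY: "moonshot",
};

// Returns the labels of providers whose key is present and non-empty.
// The environment is passed in as a plain object so the function is easy to test.
function configuredProviders(env: Record<string, string | undefined>): string[] {
  const providers: string[] = [];
  for (const variable of Object.keys(PROVIDER_ENV_VARS)) {
    const value = env[variable];
    if (typeof value === "string" && value.trim().length > 0) {
      providers.push(PROVIDER_ENV_VARS[variable]);
    }
  }
  return providers;
}

console.log(configuredProviders({ ANTHROPIC_API_KEY: "your-key", OPENAI_API_KEY: "" })); // prints [ 'anthropic' ]
```

In a Node.js script the same function could be called with `process.env`, which has a compatible shape.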
369
183
 
370
184
  ## License
371
185