@relayplane/proxy 1.8.16 → 1.8.17
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +57 -57
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -3,28 +3,28 @@
|
|
|
3
3
|
[](https://www.npmjs.com/package/@relayplane/proxy)
|
|
4
4
|
[](https://github.com/RelayPlane/proxy/blob/main/LICENSE)
|
|
5
5
|
|
|
6
|
-
An open-source LLM proxy that sits between your AI agents and providers. Tracks every request, shows where the money goes, and offers configurable task-aware routing
|
|
6
|
+
An open-source LLM proxy that sits between your AI agents and providers. Tracks every request, shows where the money goes, and offers configurable task-aware routing - all running **locally, for free**.
|
|
7
7
|
|
|
8
8
|
**Free, open-source proxy features:**
|
|
9
9
|
- 📊 Per-request cost tracking across 11 providers
|
|
10
|
-
- 💰 **Cache-aware cost tracking**
|
|
10
|
+
- 💰 **Cache-aware cost tracking** - accurately tracks Anthropic prompt caching with cache read savings, creation costs, and true per-request costs
|
|
11
11
|
- 🔀 Configurable task-aware routing (complexity-based, cascade, model overrides)
|
|
12
|
-
- 🛡️ Circuit breaker
|
|
13
|
-
- 📈 **Local dashboard** at `localhost:4100`
|
|
14
|
-
- 💵 **Budget enforcement**
|
|
15
|
-
- 🔍 **Anomaly detection**
|
|
16
|
-
- 🔔 **Cost alerts**
|
|
17
|
-
- ⬇️ **Auto-downgrade**
|
|
18
|
-
- 📦 **Aggressive cache**
|
|
19
|
-
- 🤖 **Per-agent cost tracking**
|
|
20
|
-
- 📝 **Content logging**
|
|
21
|
-
- 🔐 **OAuth passthrough**
|
|
22
|
-
- 🧠 **Osmosis mesh**
|
|
23
|
-
- 🔧 **systemd/launchd service**
|
|
24
|
-
- 🏥 **Health watchdog**
|
|
25
|
-
- 🛡️ **Config resilience**
|
|
26
|
-
|
|
27
|
-
> **Cloud dashboard available separately**
|
|
12
|
+
- 🛡️ Circuit breaker - if the proxy fails, your agent doesn't notice
|
|
13
|
+
- 📈 **Local dashboard** at `localhost:4100` - cost breakdown, savings analysis, provider health, agent breakdown
|
|
14
|
+
- 💵 **Budget enforcement** - daily/hourly/per-request spend limits with block, warn, downgrade, or alert actions
|
|
15
|
+
- 🔍 **Anomaly detection** - catches runaway agent loops, cost spikes, and token explosions in real time
|
|
16
|
+
- 🔔 **Cost alerts** - threshold alerts at configurable percentages, webhook delivery, alert history
|
|
17
|
+
- ⬇️ **Auto-downgrade** - automatically switches to cheaper models when budget thresholds are hit
|
|
18
|
+
- 📦 **Aggressive cache** - exact-match response caching with gzipped disk persistence
|
|
19
|
+
- 🤖 **Per-agent cost tracking** - identifies agents by system prompt fingerprint and tracks cost per agent
|
|
20
|
+
- 📝 **Content logging** - dashboard shows system prompt preview, user message, and response preview per request
|
|
21
|
+
- 🔐 **OAuth passthrough** - correctly forwards `user-agent` and `x-app` headers for Claude Max subscription users (OpenClaw compatible)
|
|
22
|
+
- 🧠 **Osmosis mesh** - collective learning layer that shares anonymized routing signals across users (on by default, opt-out: `relayplane mesh off`)
|
|
23
|
+
- 🔧 **systemd/launchd service** - `relayplane service install` for always-on operation with auto-restart
|
|
24
|
+
- 🏥 **Health watchdog** - `/health` endpoint with uptime tracking and active probing
|
|
25
|
+
- 🛡️ **Config resilience** - atomic writes, automatic backup/restore, credential separation
|
|
26
|
+
|
|
27
|
+
> **Cloud dashboard available separately** - see [Cloud Dashboard & Pro Features](#cloud-dashboard--pro-features) below. Your prompts always stay local.
|
|
28
28
|
|
|
29
29
|
## Quick Start
|
|
30
30
|
|
|
@@ -37,7 +37,7 @@ relayplane start
|
|
|
37
37
|
|
|
38
38
|
Works with any agent framework that talks to OpenAI or Anthropic APIs. Point your client at `http://localhost:4100` (set `ANTHROPIC_BASE_URL` or `OPENAI_BASE_URL`) and the proxy handles the rest.
|
|
39
39
|
|
|
40
|
-
## What's New in v1.
|
|
40
|
+
## What's New in v1.8.14+
|
|
41
41
|
|
|
42
42
|
**Breaking changes for upgraders:**
|
|
43
43
|
|
|
@@ -77,7 +77,7 @@ A minimal config file:
|
|
|
77
77
|
}
|
|
78
78
|
```
|
|
79
79
|
|
|
80
|
-
All configuration is optional
|
|
80
|
+
All configuration is optional - sensible defaults are applied for every field. The proxy merges your config with its defaults via deep merge, so you only need to specify what you want to change.
|
|
81
81
|
|
|
82
82
|
## Architecture
|
|
83
83
|
|
|
@@ -118,12 +118,12 @@ Provider APIs (Anthropic/OpenAI/Gemini/xAI/...)
|
|
|
118
118
|
RelayPlane is a local HTTP proxy. You point your agent at `localhost:4100` by setting `ANTHROPIC_BASE_URL` or `OPENAI_BASE_URL`. The proxy:
|
|
119
119
|
|
|
120
120
|
1. **Intercepts** your LLM API requests
|
|
121
|
-
2. **Classifies** the task using heuristics (token count, prompt patterns, keyword matching
|
|
121
|
+
2. **Classifies** the task using heuristics (token count, prompt patterns, keyword matching - no LLM calls)
|
|
122
122
|
3. **Routes** to the configured model based on classification and your routing rules (or passes through to the original model by default)
|
|
123
123
|
4. **Forwards** the request directly to the LLM provider (your prompts go straight to the provider, not through RelayPlane servers)
|
|
124
124
|
5. **Records** token counts, latency, and cost locally for your dashboard
|
|
125
125
|
|
|
126
|
-
**Default behavior is passthrough**
|
|
126
|
+
**Default behavior is passthrough** - requests go to whatever model your agent requested. Routing (cascade, complexity-based) is configurable and must be explicitly enabled.
|
|
127
127
|
|
|
128
128
|
## Complexity-Based Routing
|
|
129
129
|
|
|
@@ -144,11 +144,11 @@ The proxy classifies incoming requests by complexity (simple, moderate, complex)
|
|
|
144
144
|
|
|
145
145
|
**How classification works:**
|
|
146
146
|
|
|
147
|
-
- **Simple**
|
|
148
|
-
- **Moderate**
|
|
149
|
-
- **Complex**
|
|
147
|
+
- **Simple** - Short prompts, straightforward Q&A, basic code tasks
|
|
148
|
+
- **Moderate** - Multi-step reasoning, code review, analysis with context
|
|
149
|
+
- **Complex** - Architecture decisions, large codebases, tasks with many tools, long prompts with evaluation/comparison language
|
|
150
150
|
|
|
151
|
-
The classifier scores requests based on message count, total token length, tool usage, and content patterns (e.g., words like "analyze", "compare", "evaluate" increase the score). This happens locally
|
|
151
|
+
The classifier scores requests based on message count, total token length, tool usage, and content patterns (e.g., words like "analyze", "compare", "evaluate" increase the score). This happens locally - no prompt content is sent anywhere.
|
|
152
152
|
|
|
153
153
|
## Model Overrides
|
|
154
154
|
|
|
@@ -209,8 +209,8 @@ Use semantic model names instead of provider-specific IDs:
|
|
|
209
209
|
| `rp:fast` | `anthropic/claude-3-5-haiku` | OpenRouter |
|
|
210
210
|
| `rp:cheap` | `google/gemini-2.0-flash-001` | OpenRouter |
|
|
211
211
|
| `rp:balanced` | `anthropic/claude-3-5-haiku` | OpenRouter |
|
|
212
|
-
| `relayplane:auto` | Same as `rp:balanced` |
|
|
213
|
-
| `rp:auto` | Same as `rp:balanced` |
|
|
212
|
+
| `relayplane:auto` | Same as `rp:balanced` | - |
|
|
213
|
+
| `rp:auto` | Same as `rp:balanced` | - |
|
|
214
214
|
|
|
215
215
|
Use these as the `model` field in your API requests:
|
|
216
216
|
|
|
@@ -238,7 +238,7 @@ Append `:cost`, `:fast`, or `:quality` to any model name to hint at routing pref
|
|
|
238
238
|
| `:fast` | Optimize for lowest latency |
|
|
239
239
|
| `:quality` | Optimize for best output quality |
|
|
240
240
|
|
|
241
|
-
The suffix is stripped before provider lookup
|
|
241
|
+
The suffix is stripped before provider lookup - the base model must still be valid. Suffixes influence routing decisions when the proxy has multiple options.
|
|
242
242
|
|
|
243
243
|
## Provider Cooldowns / Reliability
|
|
244
244
|
|
|
@@ -303,12 +303,12 @@ relayplane telemetry off
|
|
|
303
303
|
|
|
304
304
|
The proxy sends anonymized metadata to `api.relayplane.com`:
|
|
305
305
|
|
|
306
|
-
- **device_id**
|
|
307
|
-
- **task_type**
|
|
308
|
-
- **model**
|
|
309
|
-
- **tokens_in/out**
|
|
310
|
-
- **latency_ms**
|
|
311
|
-
- **cost_usd**
|
|
306
|
+
- **device_id** - Random anonymous hash (no PII)
|
|
307
|
+
- **task_type** - Heuristic classification label (e.g., "code_generation", "summarization")
|
|
308
|
+
- **model** - Which model was used
|
|
309
|
+
- **tokens_in/out** - Token counts
|
|
310
|
+
- **latency_ms** - Response time
|
|
311
|
+
- **cost_usd** - Estimated cost
|
|
312
312
|
|
|
313
313
|
**Never collected:** prompts, responses, file paths, or anything that could identify you or your project. Your prompts go directly to LLM providers, never through RelayPlane servers. Mesh (on by default) shares anonymized metadata: model, tokens, cost, latency, success/fail. Opt out: `relayplane mesh off`.
|
|
314
314
|
|
|
@@ -316,7 +316,7 @@ The proxy sends anonymized metadata to `api.relayplane.com`:
|
|
|
316
316
|
|
|
317
317
|
When the proxy connects and telemetry is enabled, it will confirm:
|
|
318
318
|
```
|
|
319
|
-
[RelayPlane] Cloud dashboard connected
|
|
319
|
+
[RelayPlane] Cloud dashboard connected - telemetry enabled.
|
|
320
320
|
Your prompts stay local. Only anonymous metadata (model, tokens, cost) is sent.
|
|
321
321
|
Disable anytime: relayplane telemetry off
|
|
322
322
|
```
|
|
@@ -343,19 +343,19 @@ The built-in dashboard runs at [http://localhost:4100](http://localhost:4100) (o
|
|
|
343
343
|
|
|
344
344
|
- Total requests, success rate, average latency
|
|
345
345
|
- Cost breakdown by model and provider (with provider column to distinguish `anthropic` vs `openrouter` for same model names)
|
|
346
|
-
- **Agent Cost Breakdown**
|
|
346
|
+
- **Agent Cost Breakdown** - per-agent spend table identifying agents by system prompt fingerprint
|
|
347
347
|
- Recent request history with agent column and expandable rows (state persists across the 5-second auto-refresh)
|
|
348
|
-
- **Content previews**
|
|
349
|
-
- **Honest savings breakdown**
|
|
350
|
-
- Error detail capture
|
|
348
|
+
- **Content previews** - system prompt preview, user message, and response preview in expandable rows
|
|
349
|
+
- **Honest savings breakdown** - routing savings (RelayPlane's contribution) vs cache savings (Anthropic's feature), with tooltip explaining the calculation
|
|
350
|
+
- Error detail capture - failed requests show the error message and HTTP status code
|
|
351
351
|
- Provider health status
|
|
352
352
|
- Wider 1600px layout for dense data views
|
|
353
353
|
|
|
354
354
|
### Per-Agent Cost Tracking
|
|
355
355
|
|
|
356
|
-
RelayPlane v1.7 identifies each agent by fingerprinting its system prompt. This groups all requests from the same agent together
|
|
356
|
+
RelayPlane v1.7 identifies each agent by fingerprinting its system prompt. This groups all requests from the same agent together - even across sessions - so you can see exactly which agent is responsible for which costs.
|
|
357
357
|
|
|
358
|
-
The Agent Cost Breakdown table in the dashboard shows total spend, request count, and average cost per request for each distinct agent. No configuration required
|
|
358
|
+
The Agent Cost Breakdown table in the dashboard shows total spend, request count, and average cost per request for each distinct agent. No configuration required - fingerprinting happens automatically.
|
|
359
359
|
|
|
360
360
|
### Content Logging
|
|
361
361
|
|
|
@@ -365,7 +365,7 @@ When content logging is enabled, the dashboard stores and displays:
|
|
|
365
365
|
- The first user message in the conversation
|
|
366
366
|
- A preview of the model's response
|
|
367
367
|
|
|
368
|
-
This makes it easy to correlate a cost spike with the actual request that caused it. Content is stored locally only
|
|
368
|
+
This makes it easy to correlate a cost spike with the actual request that caused it. Content is stored locally only - nothing is sent to RelayPlane servers.
|
|
369
369
|
|
|
370
370
|
### Auth Passthrough (Claude Max / OpenClaw Users)
|
|
371
371
|
|
|
@@ -573,7 +573,7 @@ relayplane cache on/off # Toggle caching
|
|
|
573
573
|
|
|
574
574
|
## Osmosis Mesh
|
|
575
575
|
|
|
576
|
-
Opt-in collective learning layer. Share anonymized routing signals (model, task type, tokens, cost
|
|
576
|
+
Opt-in collective learning layer. Share anonymized routing signals (model, task type, tokens, cost - never prompts) and benefit from the network's routing intelligence.
|
|
577
577
|
|
|
578
578
|
```json
|
|
579
579
|
{
|
|
@@ -617,14 +617,14 @@ The service unit includes `WatchdogSec=30` (systemd) and `KeepAlive` (launchd) f
|
|
|
617
617
|
|
|
618
618
|
Configuration is protected against corruption:
|
|
619
619
|
|
|
620
|
-
- **Atomic writes**
|
|
621
|
-
- **Automatic backup**
|
|
622
|
-
- **Auto-restore**
|
|
623
|
-
- **Credential separation**
|
|
620
|
+
- **Atomic writes** - config is written to a `.tmp` file then renamed (no partial writes)
|
|
621
|
+
- **Automatic backup** - `config.json.bak` is updated before every save
|
|
622
|
+
- **Auto-restore** - if `config.json` is corrupt/missing, the proxy restores from backup
|
|
623
|
+
- **Credential separation** - API keys live in `credentials.json`, surviving config resets
|
|
624
624
|
|
|
625
625
|
## Circuit Breaker
|
|
626
626
|
|
|
627
|
-
If the proxy ever fails, all traffic automatically bypasses it
|
|
627
|
+
If the proxy ever fails, all traffic automatically bypasses it - your agent talks directly to the provider. When RelayPlane recovers, traffic resumes. No manual intervention needed.
|
|
628
628
|
|
|
629
629
|
## CLI Reference
|
|
630
630
|
|
|
@@ -657,9 +657,9 @@ relayplane [command] [options]
|
|
|
657
657
|
|------|---------|-------------|
|
|
658
658
|
| `--port <n>` | `4100` | Port to listen on |
|
|
659
659
|
| `--host <s>` | `127.0.0.1` | Host to bind to |
|
|
660
|
-
| `--offline` |
|
|
661
|
-
| `--audit` |
|
|
662
|
-
| `-v, --verbose` |
|
|
660
|
+
| `--offline` | - | No network calls except LLM endpoints |
|
|
661
|
+
| `--audit` | - | Show telemetry payloads before sending |
|
|
662
|
+
| `-v, --verbose` | - | Verbose logging |
|
|
663
663
|
|
|
664
664
|
## Cloud Dashboard & Pro Features
|
|
665
665
|
|
|
@@ -669,9 +669,9 @@ Cloud dashboard is **free for all signed-up users**. Just `relayplane login`. Fo
|
|
|
669
669
|
|
|
670
670
|
| Feature | Plan |
|
|
671
671
|
|---------|------|
|
|
672
|
-
| Cloud dashboard
|
|
672
|
+
| Cloud dashboard - run history, cost trends, analytics | Free (all tiers) |
|
|
673
673
|
| 30-day cloud history, weekly cost digest, routing recommendations | Starter ($9/mo) |
|
|
674
|
-
| Full mesh intelligence
|
|
674
|
+
| Full mesh intelligence - routing signals from thousands of agents | Pro ($29/mo) |
|
|
675
675
|
| 90-day history, data export, cost spike alerts | Pro |
|
|
676
676
|
| Private team mesh, per-agent spend limits, approval flows | Max ($99/mo) |
|
|
677
677
|
| Governance & compliance rules, audit logs | Max |
|
|
@@ -681,18 +681,18 @@ Cloud dashboard is **free for all signed-up users**. Just `relayplane login`. Fo
|
|
|
681
681
|
### Connecting to Cloud
|
|
682
682
|
|
|
683
683
|
```bash
|
|
684
|
-
relayplane login # authenticate
|
|
684
|
+
relayplane login # authenticate - unlocks cloud dashboard (free)
|
|
685
685
|
```
|
|
686
686
|
|
|
687
687
|
Telemetry is on by default. The cloud dashboard requires it to display your data. Disable anytime: `relayplane telemetry off`.
|
|
688
688
|
|
|
689
|
-
> **Privacy-first:** Telemetry sends only anonymous metadata
|
|
689
|
+
> **Privacy-first:** Telemetry sends only anonymous metadata - model name, token counts, cost, latency. Your prompts, inputs, and outputs **never leave your machine**. Mesh is also on by default; opt out: `relayplane mesh off`.
|
|
690
690
|
|
|
691
691
|
---
|
|
692
692
|
|
|
693
693
|
## Your Keys Stay Yours
|
|
694
694
|
|
|
695
|
-
RelayPlane requires your own provider API keys. Your prompts go directly to LLM providers
|
|
695
|
+
RelayPlane requires your own provider API keys. Your prompts go directly to LLM providers - never through RelayPlane servers. All proxy execution is local. Mesh telemetry (anonymous metadata only) is on by default. Opt out: `relayplane mesh off`. Your prompts always go directly to providers.
|
|
696
696
|
|
|
697
697
|
## License
|
|
698
698
|
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@relayplane/proxy",
|
|
3
|
-
"version": "1.8.
|
|
3
|
+
"version": "1.8.17",
|
|
4
4
|
"description": "Open source cost intelligence proxy for AI agents. Cut LLM costs ~80% with smart model routing. Dashboard, policy engine, 11 providers. MIT licensed.",
|
|
5
5
|
"homepage": "https://relayplane.com",
|
|
6
6
|
"repository": {
|