@relayplane/proxy 0.2.1 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,185 +1,286 @@
  # @relayplane/proxy

- Intelligent AI model routing proxy for cost optimization and observability.
+ Local LLM proxy server for RelayPlane - route requests through multiple AI providers.
+
+ ## What's New in 1.1
+
+ - 🩺 **Health Endpoint** — `GET /health` with uptime, stats, and provider status
+ - ⚠️ **Usage Warnings** — Console and header warnings at 80%, 90%, and 100% of limits
+ - 📊 **Response Headers** — `X-RelayPlane-Daily-Usage`, `X-RelayPlane-Monthly-Usage`, `X-RelayPlane-Usage-Warning`
+ - 💰 **Spending Limits** — Configure `limits.daily` and `limits.monthly`; requests beyond a limit get HTTP 429
+ - 🏷️ **Model Aliases** — `rp:fast`, `rp:cheap`, `rp:best`, `rp:balanced` shortcuts
+
+ ## Features
+
+ - **OpenAI-compatible API** - Drop-in replacement for the OpenAI SDK
+ - **Multi-provider routing** - Automatically routes to OpenAI, Anthropic, Groq, Together, and OpenRouter
+ - **Model aliases** - `rp:fast`, `rp:cheap`, `rp:best` shortcuts
+ - **Dry-run mode** - Test routing without making API calls
+ - **Usage tracking** - Track tokens, cost, and latency
+ - **Spending limits** - Daily/monthly cost limits with warnings
+ - **Health endpoint** - `/health` for monitoring and uptime checks

  ## Installation

  ```bash
- npm install -g @relayplane/proxy
+ npm install @relayplane/proxy
  ```

- ## Quick Start
+ Or run it via the CLI:

  ```bash
- # Set your API keys
- export ANTHROPIC_API_KEY=your-key
- export OPENAI_API_KEY=your-key
+ npm install -g @relayplane/cli
+ relayplane proxy start
+ ```

- # Start the proxy
- relayplane-proxy
+ ## Quick Start

- # Configure your tools to use the proxy
- export ANTHROPIC_BASE_URL=http://localhost:3001
- export OPENAI_BASE_URL=http://localhost:3001
+ ```bash
+ # Set API keys
+ export OPENAI_API_KEY=sk-...
+ export ANTHROPIC_API_KEY=sk-ant-...

- # Run your AI tools (Claude Code, Cursor, Aider, etc.)
+ # Start the proxy
+ npx @relayplane/proxy
+
+ # Make requests
+ curl http://localhost:8787/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "gpt-4o",
+     "messages": [{"role": "user", "content": "Hello!"}]
+   }'
  ```

- ## Features
+ ## Endpoints

- - **Intelligent Routing**: Routes requests to the optimal model based on task type
- - **Cost Tracking**: Tracks and reports API costs across all providers
- - **Provider Agnostic**: Works with Anthropic, OpenAI, Gemini, xAI, and more
- - **Local Learning**: Learns from your usage patterns to improve routing
- - **Privacy First**: Never sees your prompts or responses
+ ### `GET /health`

- ## CLI Options
+ Health check endpoint for monitoring.

  ```bash
- relayplane-proxy [command] [options]
-
- Commands:
-   (default)                  Start the proxy server
-   telemetry [on|off|status]  Manage telemetry settings
-   stats                      Show usage statistics
-   config                     Show configuration
-
- Options:
-   --port <number>   Port to listen on (default: 3001)
-   --host <string>   Host to bind to (default: 127.0.0.1)
-   --offline         Disable all network calls except LLM endpoints
-   --audit           Show telemetry payloads before sending
-   -v, --verbose     Enable verbose logging
-   -h, --help        Show this help message
-   --version         Show version
+ curl http://localhost:8787/health
  ```

- ## Telemetry
-
- RelayPlane collects anonymous telemetry to improve model routing. This data helps us understand usage patterns and optimize routing decisions.
-
- ### What We Collect (Exact Schema)
-
+ Response:
  ```json
  {
-   "device_id": "anon_8f3a...",
-   "task_type": "code_review",
-   "model": "claude-3-5-haiku",
-   "tokens_in": 1847,
-   "tokens_out": 423,
-   "latency_ms": 2341,
-   "success": true,
-   "cost_usd": 0.02
+   "status": "ok",
+   "uptime": 3600,
+   "version": "1.1.0",
+   "providers": {
+     "openai": "configured",
+     "anthropic": "configured",
+     "groq": "not_configured",
+     "together": "not_configured",
+     "openrouter": "not_configured"
+   },
+   "requestsHandled": 150,
+   "requestsSuccessful": 148,
+   "requestsFailed": 2,
+   "dailyCost": 1.25,
+   "dailyLimit": 10.00,
+   "monthlyCost": 25.50,
+   "monthlyLimit": 100.00,
+   "usage": {
+     "inputTokens": 50000,
+     "outputTokens": 25000,
+     "totalCost": 1.25
+   }
  }
  ```
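The `/health` payload above is easy to consume from a monitoring script. A minimal client-side sketch (the helper names are hypothetical; only the response fields come from the README):

```python
import json
import urllib.request

def summarize_health(health: dict) -> dict:
    """Distill a /health payload into an ops-friendly summary."""
    configured = [provider for provider, state in health["providers"].items()
                  if state == "configured"]
    daily_pct = 100 * health["dailyCost"] / health["dailyLimit"]
    return {
        "ok": health["status"] == "ok",
        "configured": configured,
        "daily_pct": round(daily_pct, 1),
    }

def check_health(base_url: str = "http://localhost:8787") -> dict:
    """Fetch GET /health from a running proxy and summarize it."""
    with urllib.request.urlopen(f"{base_url}/health") as resp:
        return summarize_health(json.load(resp))
```

`check_health()` assumes the proxy is running on the default port 8787; `summarize_health` is pure, so it can be reused against stored health snapshots.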

- ### Field Descriptions
+ ### `GET /v1/models`

- | Field | Type | Description |
- |-------|------|-------------|
- | `device_id` | string | Anonymous random ID (not fingerprintable) |
- | `task_type` | string | Inferred from token patterns, NOT prompt content |
- | `model` | string | The model that handled the request |
- | `tokens_in` | number | Input token count |
- | `tokens_out` | number | Output token count |
- | `latency_ms` | number | Request latency in milliseconds |
- | `success` | boolean | Whether the request succeeded |
- | `cost_usd` | number | Estimated cost in USD |
+ List available models, including aliases.

- ### Task Types
+ ```bash
+ curl http://localhost:8787/v1/models
+ ```

- Task types are inferred from request characteristics (token counts, ratios, etc.) - never from prompt content:
+ ### `POST /v1/chat/completions`

- - `quick_task` - Short input/output (< 500 tokens each)
- - `code_review` - Medium-long input, medium output
- - `generation` - High output/input ratio
- - `classification` - Low output/input ratio, short output
- - `long_context` - Input > 10,000 tokens
- - `content_generation` - Output > 1,000 tokens
- - `tool_use` - Request includes tool calls
- - `general` - Default classification
+ OpenAI-compatible chat completions.

- ### What We NEVER Collect
+ ```bash
+ curl http://localhost:8787/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "rp:best",
+     "messages": [{"role": "user", "content": "Hello!"}]
+   }'
+ ```

- - ❌ Your prompts
- - ❌ Model responses
- - ❌ File paths or contents
- - ❌ Anything that could identify you or your project
+ ## Model Aliases

- ### Verification
+ | Alias | Resolves To | Provider | Use Case |
+ |-------|-------------|----------|----------|
+ | `rp:fast` | llama-3.1-8b-instant | Groq | Lowest latency |
+ | `rp:cheap` | llama-3.1-8b-instant | Groq | Lowest cost |
+ | `rp:best` | claude-3-5-sonnet-20241022 | Anthropic | Highest quality |
+ | `rp:balanced` | gpt-4o-mini | OpenAI | Good balance |

- You can verify exactly what data is collected:
+ ## Dry-Run Mode

- ```bash
- # See telemetry payloads before they're sent
- relayplane-proxy --audit
+ Test routing logic without making API calls:

- # Disable all telemetry transmission
- relayplane-proxy --offline
+ ```bash
+ curl http://localhost:8787/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -H "X-Dry-Run: true" \
+   -d '{
+     "model": "gpt-4o",
+     "messages": [{"role": "user", "content": "Hello!"}]
+   }'
+ ```

- # View the source code
- # https://github.com/RelayPlane/proxy
+ Response:
+ ```json
+ {
+   "dry_run": true,
+   "routing": {
+     "model": "gpt-4o",
+     "provider": "openai",
+     "endpoint": "https://api.openai.com/v1/chat/completions"
+   },
+   "estimate": {
+     "inputTokens": 10,
+     "expectedOutputTokens": 500,
+     "estimatedCost": 0.0125,
+     "currency": "USD"
+   },
+   "limits": {
+     "daily": 10.00,
+     "dailyUsed": 1.25,
+     "dailyRemaining": 8.75,
+     "monthly": 100.00,
+     "monthlyUsed": 25.50,
+     "monthlyRemaining": 74.50
+   }
+ }
  ```
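The dry-run response lends itself to a budget preflight: estimate the cost of a request, and only send it for real when it fits the remaining limits. A hedged sketch (the `preflight`/`can_afford` helpers are illustrative, not part of the package; the header and response fields are from the README):

```python
import json
import urllib.request

def can_afford(dry_run_response: dict) -> bool:
    """True if the estimated cost fits inside both remaining budgets."""
    cost = dry_run_response["estimate"]["estimatedCost"]
    limits = dry_run_response["limits"]
    return cost <= limits["dailyRemaining"] and cost <= limits["monthlyRemaining"]

def preflight(payload: dict, base_url: str = "http://localhost:8787") -> bool:
    """POST with X-Dry-Run to get a routing/cost estimate without spending."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "X-Dry-Run": "true"},
    )
    with urllib.request.urlopen(req) as resp:
        return can_afford(json.load(resp))
```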

- ### Opt-Out
+ ## Response Headers

- To disable telemetry completely:
+ The proxy adds usage information to response headers:

- ```bash
- relayplane-proxy telemetry off
+ | Header | Description |
+ |--------|-------------|
+ | `X-RelayPlane-Cost` | Cost of this request |
+ | `X-RelayPlane-Latency` | Request latency in ms |
+ | `X-RelayPlane-Daily-Usage` | Daily usage (e.g., "1.25/10.00") |
+ | `X-RelayPlane-Monthly-Usage` | Monthly usage (e.g., "25.50/100.00") |
+ | `X-RelayPlane-Usage-Warning` | Warning when approaching limits (80%, 90%, 100%) |
+
+ Example warning header:
+ ```
+ X-RelayPlane-Usage-Warning: ⚠️ You've used $8.50 of your $10 daily limit
  ```

- To re-enable:
+ Console warnings are also logged when approaching limits:
+ ```
+ ⚠️ Daily spending at 80%: $8.00 / $10
+ ⚠️ Daily spending at 90%: $9.00 / $10
+ ⚠️ DAILY LIMIT REACHED: $10.00 / $10 (100%)
+ ```
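Clients can mirror the proxy's thresholds by parsing the usage headers above. A small sketch (helper names are hypothetical; the "used/limit" header format and the 80%/90%/100% thresholds come from the README):

```python
def parse_usage(header_value: str) -> float:
    """Parse an X-RelayPlane-*-Usage header value ("used/limit") into a fraction."""
    used, limit = (float(part) for part in header_value.split("/"))
    return used / limit

def warn_level(fraction: float):
    """Return the highest crossed threshold (80%, 90%, 100%), or None."""
    for threshold in (1.0, 0.9, 0.8):
        if fraction >= threshold:
            return threshold
    return None
```

For example, `warn_level(parse_usage("8.50/10.00"))` reports the 80% threshold from the warning-header example above.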

- ```bash
- relayplane-proxy telemetry on
+ ## Spending Limits
+
+ Configure limits in `~/.relayplane/config.json`:
+
+ ```json
+ {
+   "limits": {
+     "daily": 10.00,
+     "monthly": 100.00
+   }
+ }
  ```

- Check current status:
+ When limits are reached, the proxy returns HTTP `429 Too Many Requests`:

- ```bash
- relayplane-proxy telemetry status
+ ```json
+ {
+   "error": {
+     "message": "Daily spending limit reached ($10.00 / $10.00)",
+     "code": "spending_limit_exceeded",
+     "type": "rate_limit_error"
+   }
+ }
  ```

- ## Configuration
+ Headers included with the 429 response:
+ - `Retry-After: 86400` (seconds until the daily reset)
+ - `X-RelayPlane-Daily-Usage: 10.00/10.00`
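Since the 429 response carries `Retry-After`, a client can back off until the limit resets. A minimal sketch (the helper and the one-hour fallback are assumptions, not proxy behavior):

```python
def retry_delay(status: int, headers: dict):
    """Seconds to wait before retrying a limited request, or None if not limited."""
    if status != 429:
        return None
    # Fall back to an hour if the proxy omitted Retry-After (assumption).
    return int(headers.get("Retry-After", 3600))
```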

- Configuration is stored in `~/.relayplane/config.json`.
+ ## Usage Tracking

- ### Set API Key (Pro Features)
+ Usage is logged to `~/.relayplane/usage.jsonl`:

- ```bash
- relayplane-proxy config set-key your-api-key
+ ```jsonl
+ {"timestamp":"2024-01-15T12:00:00Z","model":"gpt-4o","provider":"openai","inputTokens":100,"outputTokens":50,"cost":0.00125,"latencyMs":1500,"success":true}
  ```

- ### View Configuration
+ Daily totals are tracked in `~/.relayplane/daily-usage.json`:

- ```bash
- relayplane-proxy config
+ ```json
+ {
+   "date": "2024-01-15",
+   "cost": 1.25,
+   "requests": 50
+ }
  ```

- ## Usage Statistics
-
- View your usage statistics:
+ Monthly totals are tracked in `~/.relayplane/monthly-usage.json`:

- ```bash
- relayplane-proxy stats
+ ```json
+ {
+   "month": "2024-01",
+   "cost": 25.50,
+   "requests": 1200
+ }
  ```
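Because `usage.jsonl` is append-only JSON Lines, per-model reports are a few lines of scripting. A sketch (the `summarize_usage` helper is illustrative; the record fields are those shown above):

```python
import json
from collections import defaultdict

def summarize_usage(lines):
    """Aggregate usage.jsonl records into per-model cost/token/request totals."""
    totals = defaultdict(lambda: {"cost": 0.0, "tokens": 0, "requests": 0})
    for line in lines:
        record = json.loads(line)
        entry = totals[record["model"]]
        entry["cost"] += record["cost"]
        entry["tokens"] += record["inputTokens"] + record["outputTokens"]
        entry["requests"] += 1
    return dict(totals)
```

Run it over the log file directly, e.g. `summarize_usage(open(os.path.expanduser("~/.relayplane/usage.jsonl")))`.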

- This shows:
- - Total requests and cost
- - Success rate
- - Breakdown by model
- - Breakdown by task type
-
  ## Environment Variables

- | Variable | Description |
- |----------|-------------|
- | `ANTHROPIC_API_KEY` | Anthropic API key |
- | `OPENAI_API_KEY` | OpenAI API key |
- | `GEMINI_API_KEY` | Google Gemini API key |
- | `XAI_API_KEY` | xAI/Grok API key |
- | `MOONSHOT_API_KEY` | Moonshot API key |
+ | Variable | Default | Description |
+ |----------|---------|-------------|
+ | `RELAYPLANE_PROXY_PORT` | 8787 | Port to listen on |
+ | `RELAYPLANE_PROXY_HOST` | 127.0.0.1 | Host to bind to |
+ | `RELAYPLANE_CONFIG_DIR` | ~/.relayplane | Config directory |
+ | `OPENAI_API_KEY` | - | OpenAI API key |
+ | `ANTHROPIC_API_KEY` | - | Anthropic API key |
+ | `GROQ_API_KEY` | - | Groq API key |
+ | `TOGETHER_API_KEY` | - | Together AI API key |
+ | `OPENROUTER_API_KEY` | - | OpenRouter API key |
+
+ ## Provider Detection
+
+ Models are automatically routed to the correct provider:
+
+ | Pattern | Provider |
+ |---------|----------|
+ | `gpt-*`, `o1-*` | OpenAI |
+ | `claude-*` | Anthropic |
+ | `llama-*`, `mixtral-*` | Groq |
+ | `meta-llama/*`, `mistralai/*` | Together |
+ | Contains `/` | OpenRouter |
+
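For illustration, the routing table above can be sketched as a prefix matcher. This is a hypothetical reimplementation of the table, not the package's actual routing code; note the `meta-llama/*`/`mistralai/*` checks must run before the generic contains-`/` fallback:

```python
def detect_provider(model: str) -> str:
    """Map a model name to a provider, following the routing table above."""
    if model.startswith(("gpt-", "o1-")):
        return "openai"
    if model.startswith("claude-"):
        return "anthropic"
    if model.startswith(("llama-", "mixtral-")):
        return "groq"
    if model.startswith(("meta-llama/", "mistralai/")):
        return "together"
    if "/" in model:
        # Any other org/model path falls through to OpenRouter.
        return "openrouter"
    raise ValueError(f"unrecognized model: {model}")
```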
+ ## Using with OpenAI SDK
+
+ ```python
+ from openai import OpenAI
+
+ client = OpenAI(
+     base_url="http://localhost:8787/v1",
+     api_key="not-needed"  # API keys are configured on the proxy
+ )
+
+ response = client.chat.completions.create(
+     model="rp:best",  # Uses Claude 3.5 Sonnet
+     messages=[{"role": "user", "content": "Hello!"}]
+ )
+ ```

  ## License

@@ -0,0 +1,13 @@
+ #!/usr/bin/env node
+ /**
+  * RelayPlane Local LLM Proxy Server
+  *
+  * Routes OpenAI-compatible requests to multiple providers.
+  * Features:
+  * - /health endpoint for monitoring
+  * - Usage tracking with spending warnings
+  * - Model aliases (rp:fast, rp:cheap, rp:best)
+  * - Dry-run mode for testing
+  */
+ export {};
+ //# sourceMappingURL=server.d.ts.map
@@ -0,0 +1 @@
+ {"version":3,"file":"server.d.ts","sourceRoot":"","sources":["../src/server.ts"],"names":[],"mappings":";AACA;;;;;;;;;GASG"}