npm - cto-ai-cli - Versions diffs - 3.2.0 → 4.0.0 - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (25)

package/DOCS.md +201 -0
package/README.md +70 -2
package/dist/action/index.js +607 -83
package/dist/api/dashboard.js +85 -23
package/dist/api/dashboard.js.map +1 -1
package/dist/api/server.js +86 -24
package/dist/api/server.js.map +1 -1
package/dist/cli/gateway.js +2925 -0
package/dist/cli/score.js +2656 -217
package/dist/cli/v2/index.js +111 -49
package/dist/cli/v2/index.js.map +1 -1
package/dist/engine/index.d.ts +85 -1
package/dist/engine/index.js +643 -42
package/dist/engine/index.js.map +1 -1
package/dist/gateway/index.d.ts +281 -0
package/dist/gateway/index.js +2803 -0
package/dist/gateway/index.js.map +1 -0
package/dist/govern/index.d.ts +45 -4
package/dist/govern/index.js +318 -33
package/dist/govern/index.js.map +1 -1
package/dist/interact/index.js +86 -24
package/dist/interact/index.js.map +1 -1
package/dist/mcp/v2.js +108 -46
package/dist/mcp/v2.js.map +1 -1
package/package.json +3 -2

package/DOCS.md CHANGED

@@ -6,6 +6,7 @@
 - [CLI Commands](#cli-commands)
 - [Security Audit](#security-audit---audit)
+- [Context Gateway](#context-gateway)
 - [MCP Server](#mcp-server)
 - [API Server](#api-server)
 - [Programmatic API](#programmatic-api)
@@ -236,6 +237,206 @@ Based on findings, CTO generates actionable recommendations:
 ---
+## Context Gateway
+A transparent HTTP proxy that sits between your application and any LLM provider. It intercepts every request, scans for secrets, optimizes context, tracks costs, and enforces budgets — all with zero external dependencies.
+### Quick start
+```bash
+npx cto-gateway                    # Start on port 8787
+npx cto-gateway --port 9000        # Custom port
+npx cto-gateway --block-secrets    # Hard block on critical secrets
+npx cto-gateway --budget-daily 10  # $10/day limit
+```
+Then point your app:
+```bash
+export OPENAI_BASE_URL=http://localhost:8787
+```
+Every request must include the target provider URL as a header:
+```
+x-cto-target: https://api.openai.com/v1/chat/completions
+```
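Note (not part of the package diff): the header requirement above can be pictured with a small client-side sketch. `buildGatewayRequest` and its field names are hypothetical, and posting to the gateway base URL is an assumption; only the `x-cto-target` header and the default port come from the docs.

```typescript
// Hypothetical helper: assembles a request that is routed through the gateway.
// Only the `x-cto-target` header name is taken from the documentation above.
interface GatewayRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildGatewayRequest(
  gatewayBase: string, // e.g. "http://localhost:8787"
  targetUrl: string,   // the real provider endpoint
  apiKey: string,
  payload: unknown,
): GatewayRequest {
  return {
    url: gatewayBase, // assumption: the gateway accepts POSTs at its base URL
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${apiKey}`,
      "x-cto-target": targetUrl,
    },
    body: JSON.stringify(payload),
  };
}

// Usage (network call left commented out so the sketch has no side effects):
// const req = buildGatewayRequest("http://localhost:8787",
//   "https://api.openai.com/v1/chat/completions", "sk-...", { model: "gpt-4o", messages: [] });
// await fetch(req.url, { method: "POST", headers: req.headers, body: req.body });
```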
+### Architecture
+```
+Your App → Gateway (localhost:8787) → Provider (api.openai.com)
+                │
+                ├── Secret Scanner  → redacts/blocks secrets in messages
+                ├── Context Optimizer → injects CTO-selected context
+                ├── Cost Tracker → logs per-request cost to JSONL
+                ├── Budget Guard → rejects requests over limit (429)
+                └── Dashboard → live web UI at /__cto
+```
+### CLI flags
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--port <n>` | `8787` | Port to listen on |
+| `--host <addr>` | `127.0.0.1` | Host to bind to |
+| `--project <path>` | `.` | Project to analyze for context optimization |
+| `--block-secrets` | off | Hard block requests with critical secrets (403) |
+| `--budget-daily <$>` | unlimited | Max cost per day — returns 429 when exceeded |
+| `--budget-monthly <$>` | unlimited | Max cost per month |
+| `--no-optimize` | optimization enabled | Disable CTO context injection |
+| `--no-redact` | redaction enabled | Disable secret redaction |
+| `--no-dashboard` | dashboard enabled | Disable web dashboard |
+### Provider detection
+The Gateway auto-detects providers from the `x-cto-target` URL and request headers:
+| Provider | Detection | Auth header |
+|----------|-----------|-------------|
+| OpenAI | `api.openai.com` or `/v1/chat/completions` | `Authorization: Bearer sk-...` |
+| Anthropic | `api.anthropic.com` or `anthropic-version` header | `x-api-key` |
+| Google AI | `generativelanguage.googleapis.com` | `x-goog-api-key` |
+| Azure OpenAI | `*.openai.azure.com` or `api-key` header | `api-key` |
+| Custom | Fallback — assumes OpenAI-compatible | `Authorization` |
+### Model pricing
+Built-in pricing for accurate cost tracking:
+| Model | Input ($/M tokens) | Output ($/M tokens) | Context window |
+|-------|--------------------|--------------------|----------------|
+| gpt-4o | $2.50 | $10.00 | 128K |
+| gpt-4o-mini | $0.15 | $0.60 | 128K |
+| o1 | $15.00 | $60.00 | 200K |
+| o3-mini | $1.10 | $4.40 | 200K |
+| claude-sonnet-4 | $3.00 | $15.00 | 200K |
+| claude-3.5-haiku | $0.80 | $4.00 | 200K |
+| gemini-2.5-pro | $1.25 | $10.00 | 1M |
+| gemini-2.0-flash | $0.10 | $0.40 | 1M |
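Note (not part of the package diff): a worked example of how per-request cost presumably follows from the table above. The formula (tokens times price per million) is an assumption about how the gateway computes cost, and `sketchCostUSD` and `PRICES` are hypothetical names, not the package's `estimateCost`.

```typescript
// Prices copied from three rows of the table above, in USD per million tokens.
interface ModelPrice {
  inputPerM: number;
  outputPerM: number;
}

const PRICES: Record<string, ModelPrice> = {
  "gpt-4o": { inputPerM: 2.5, outputPerM: 10 },
  "gpt-4o-mini": { inputPerM: 0.15, outputPerM: 0.6 },
  "claude-sonnet-4": { inputPerM: 3, outputPerM: 15 },
};

// Hypothetical helper: returns undefined for models missing from the table.
function sketchCostUSD(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number | undefined {
  const price = PRICES[model];
  if (!price) return undefined;
  return (inputTokens * price.inputPerM + outputTokens * price.outputPerM) / 1e6;
}

// 1200 input + 350 output tokens on gpt-4o:
// (1200 * 2.50 + 350 * 10.00) / 1,000,000 = 0.0065 USD
const exampleCost = sketchCostUSD("gpt-4o", 1200, 350);
```

Under this formula the result agrees with the `costUSD` value in the usage record example in this file.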
+### Request lifecycle
+1. **Receive** — Client sends POST request to Gateway
+2. **Budget check** — If daily/monthly budget exceeded → 429
+3. **Parse** — Detect provider, extract messages from provider-specific format
+4. **Scan secrets** — Run 30+ patterns against all message content
+5. **Redact or block** — Replace secrets with `***REDACTED***` or return 403
+6. **Optimize context** — If analysis ready, inject CTO-selected files into system prompt
+7. **Forward** — Proxy to provider (streaming SSE or buffered)
+8. **Track** — Log cost, tokens, savings, latency to JSONL
+9. **Respond** — Forward provider response to client (zero-copy for streams)
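Note (not part of the package diff): steps 4 and 5 of the lifecycle above might look roughly like the sketch below. The two patterns are illustrative stand-ins for the 30+ patterns the docs mention, every match is treated as critical for simplicity, and `scanAndRedact` is a hypothetical name rather than the package's interceptor.

```typescript
interface Message {
  role: string;
  content: string;
}

interface ScanResult {
  messages: Message[];
  secretsFound: number;
  status: 200 | 403;
}

// Illustrative patterns only; the gateway's real pattern set is not shown in the docs.
const SECRET_PATTERNS: RegExp[] = [
  /sk-[A-Za-z0-9_-]{8,}/g, // OpenAI-style key shape
  /AKIA[0-9A-Z]{16}/g,     // AWS access key ID shape
];

function scanAndRedact(messages: Message[], blockOnSecrets: boolean): ScanResult {
  let secretsFound = 0;
  const redacted = messages.map((m) => {
    let content = m.content;
    for (const pattern of SECRET_PATTERNS) {
      // The replacer runs once per match, so it doubles as a counter.
      content = content.replace(pattern, () => {
        secretsFound += 1;
        return "***REDACTED***";
      });
    }
    return { ...m, content };
  });
  // Assumption: in blocking mode any detected secret rejects the request.
  const status = blockOnSecrets && secretsFound > 0 ? 403 : 200;
  return { messages: redacted, secretsFound, status };
}
```

Returning the redacted messages even in the 403 case keeps the raw secret out of any later logging, which seems the safer default for a sketch like this.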
+### Streaming support
+The Gateway fully supports Server-Sent Events (SSE) streaming:
+- Detects streaming from `Content-Type: text/event-stream`
+- Zero-copy passthrough: chunks are forwarded to client as they arrive
+- Async token tracking: parses SSE events in background without blocking the stream
+- Usage data extracted from final SSE chunk (when provider includes it)
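Note (not part of the package diff): a minimal sketch of pulling usage data out of an SSE body, assuming OpenAI-style events where a late chunk carries a `usage` object with `prompt_tokens` and `completion_tokens`. `extractUsageFromSSE` is a hypothetical name, and it works on a complete string; a real passthrough would have to buffer partial lines across chunks.

```typescript
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Scans every `data:` line and keeps the last usage object seen.
function extractUsageFromSSE(sseText: string): Usage | undefined {
  let usage: Usage | undefined;
  for (const line of sseText.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue;
    const data = line.slice("data:".length).trim();
    if (data === "" || data === "[DONE]") continue;
    try {
      const event = JSON.parse(data);
      const u = event?.usage;
      if (u && typeof u.prompt_tokens === "number" && typeof u.completion_tokens === "number") {
        usage = { inputTokens: u.prompt_tokens, outputTokens: u.completion_tokens };
      }
    } catch {
      // Non-JSON payloads are skipped; they carry no usage data.
    }
  }
  return usage;
}
```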
+### Dashboard
+Available at `http://localhost:8787/__cto` (configurable via `--dashboardPath`).
+Shows:
+- **Today**: requests, cost, tokens saved, secrets redacted
+- **This month**: totals + budget progress
+- **Feature status**: optimization, redaction, tracking, audit log
+- **By model**: breakdown of requests, tokens, and cost per model
+- **By provider**: requests and cost per provider
+- Auto-refreshes every 30 seconds
+### Usage storage
+| File | Format | Description |
+|------|--------|-------------|
+| `.cto/gateway/usage/YYYY-MM.jsonl` | JSON Lines | One line per request. Monthly files. |
+Each line:
+```json
+{
+  "id": "a1b2c3d4",
+  "timestamp": "2026-02-24T23:52:00.000Z",
+  "provider": "openai",
+  "model": "gpt-4o",
+  "inputTokens": 1200,
+  "outputTokens": 350,
+  "costUSD": 0.0065,
+  "originalTokens": 6200,
+  "optimizedTokens": 1200,
+  "savedTokens": 5000,
+  "savedUSD": 0.0130,
+  "secretsRedacted": 2,
+  "secretsBlocked": false,
+  "latencyMs": 152,
+  "stream": true
+}
+```
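Note (not part of the package diff): because each line is an independent JSON object, a monthly file can be summarized with a simple fold. The sketch below assumes the record fields shown above; `summarizeUsage` and `UsageSummary` are hypothetical names, not the package's `UsageTracker`.

```typescript
interface UsageRecord {
  model: string;
  costUSD: number;
  savedTokens: number;
}

interface UsageSummary {
  totalRequests: number;
  totalCostUSD: number;
  totalSavedTokens: number;
}

// Takes the text of a JSONL file (one record per line) and totals it.
// A malformed line makes JSON.parse throw; a real reader might skip it instead.
function summarizeUsage(jsonl: string): UsageSummary {
  const summary: UsageSummary = { totalRequests: 0, totalCostUSD: 0, totalSavedTokens: 0 };
  for (const line of jsonl.split(/\r?\n/)) {
    if (line.trim() === "") continue; // tolerate a trailing newline
    const record = JSON.parse(line) as Partial<UsageRecord>;
    summary.totalRequests += 1;
    summary.totalCostUSD += record.costUSD ?? 0;
    summary.totalSavedTokens += record.savedTokens ?? 0;
  }
  return summary;
}
```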
+### Budget enforcement
+When a budget is set, the Gateway checks cost totals before every request:
+| Condition | HTTP response |
+|-----------|---------------|
+| Daily budget exceeded | `429 Too Many Requests` + `{ "error": "Daily budget exceeded", "budget": 10, "current": 10.42 }` |
+| Monthly budget exceeded | `429 Too Many Requests` + `{ "error": "Monthly budget exceeded" }` |
+| Critical secrets + `--block-secrets` | `403 Forbidden` + `{ "error": "Request blocked: secrets detected" }` |
+Budget alerts are emitted at 80% of limit (configurable via `alertThreshold`).
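Note (not part of the package diff): the daily check and the 80% alert can be sketched as a pure function. Treating "exceeded" as `current >= budget` is an assumption, as are the names `checkDailyBudget` and `BudgetDecision`; the 429 body mirrors the table above.

```typescript
type BudgetDecision =
  | { status: 200; alert: boolean }
  | { status: 429; body: { error: string; budget: number; current: number } };

// budgetUSD === undefined models the "unlimited" default.
function checkDailyBudget(
  currentUSD: number,
  budgetUSD: number | undefined,
  alertThreshold: number = 0.8,
): BudgetDecision {
  if (budgetUSD === undefined) return { status: 200, alert: false };
  if (currentUSD >= budgetUSD) {
    return {
      status: 429,
      body: { error: "Daily budget exceeded", budget: budgetUSD, current: currentUSD },
    };
  }
  // Under budget: flag an alert once spending crosses the threshold fraction.
  return { status: 200, alert: currentUSD >= budgetUSD * alertThreshold };
}
```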
+### Programmatic API
+```typescript
+import { ContextGateway, UsageTracker } from 'cto-ai-cli/gateway';
+const gateway = new ContextGateway({
+  port: 8787,
+  projectPath: '/path/to/project',
+  redactSecrets: true,
+  blockOnSecrets: false,
+  budgetDaily: 20,
+  budgetMonthly: 500,
+});
+// Listen to events
+gateway.onEvent((event) => {
+  if (event.type === 'request') console.log(`${event.record.model}: $${event.record.costUSD}`);
+  if (event.type === 'budget-alert') console.log(`Budget warning: ${event.period}`);
+});
+await gateway.start();
+// Get usage summary
+const tracker = gateway.getTracker();
+const summary = tracker.getSummary('month');
+console.log(`This month: ${summary.totalRequests} requests, $${summary.totalCostUSD}`);
+// Provider detection (standalone)
+import { detectProvider, estimateCost } from 'cto-ai-cli/gateway';
+const provider = detectProvider('https://api.openai.com/v1/chat/completions', {});
+const cost = estimateCost(provider, 'gpt-4o', 5000, 1000);
+```
+### Interceptor (standalone)
+```typescript
+import { interceptRequest } from 'cto-ai-cli/gateway';
+import type { Message, GatewayConfig } from 'cto-ai-cli/gateway';
+const messages: Message[] = [
+  { role: 'user', content: 'Deploy with key sk-live_abc123...' },
+];
+const result = await interceptRequest(messages, config, analysis);
+// result.secretsRedacted → 1
+// result.messages[0].content → 'Deploy with key sk-l**********23...'
+// result.decisions → ['Redacted 1 secret(s) in user message: api-key']
+```
+---
 ## MCP Server
 CTO exposes 19 tools via the Model Context Protocol.

package/README.md CHANGED

@@ -3,7 +3,7 @@
 > **Early access** — This is a test version. We'd love your feedback.
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
-[![Tests](https://img.shields.io/badge/tests-449_passing-brightgreen.svg)](#)
+[![Tests](https://img.shields.io/badge/tests-573_passing-brightgreen.svg)](#)
 ## Try it now (zero install)
@@ -301,6 +301,72 @@ Every day, developers accidentally send secrets to AI tools:
 ---
+## 🌐 Context Gateway — AI proxy for your entire team
+Every AI API call from your team passes through the Gateway. It sits between your app and any LLM provider, automatically optimizing context, redacting secrets, and tracking costs.
+```bash
+npx cto-gateway
+```
+```
+  ⚡ CTO Context Gateway v4.0.0
+  🌐 Proxy:      http://127.0.0.1:8787
+  📊 Dashboard:  http://127.0.0.1:8787/__cto
+  📁 Project:    /your/project
+  ✅ Context optimization
+  ✅ Secret redaction
+  ✅ Cost tracking
+  ⬜ Daily budget (unlimited)
+  How to connect:
+    OPENAI_BASE_URL=http://127.0.0.1:8787
+    + set header: x-cto-target: https://api.openai.com/v1/chat/completions
+  Waiting for requests...
+  18:52:34  openai/gpt-4o  1200 tokens  $0.0075 (saved 5.2K tokens, $0.0130) [2 secrets redacted]  152ms
+```
+### What it does
+| Feature | Description |
+|---------|-------------|
+| **Secret redaction** | Scans every message for API keys, tokens, passwords → auto-redacts before sending to the LLM |
+| **Secret blocking** | Optional hard block — reject requests that contain critical secrets |
+| **Context optimization** | Injects CTO-selected files, type definitions, and hub modules into system prompts |
+| **Cost tracking** | Tracks per-request cost by model and provider. Persistent JSONL logs. |
+| **Budget enforcement** | Set daily/monthly limits. Gateway returns 429 when exceeded. |
+| **Live dashboard** | Dark-theme web UI at `/__cto` — today's stats, monthly breakdown, model costs |
+| **SSE streaming** | Full passthrough of streaming responses with zero-copy. No added latency. |
+| **Multi-provider** | OpenAI, Anthropic, Google AI, Azure OpenAI, and any OpenAI-compatible API |
+### Supported providers & models
+| Provider | Models | Pricing tracked |
+|----------|--------|----------------|
+| **OpenAI** | GPT-4o, GPT-4o Mini, o1, o1-mini, o3-mini | ✅ |
+| **Anthropic** | Claude Sonnet 4, Claude 3.5 Haiku, Claude 3 Opus | ✅ |
+| **Google** | Gemini 2.5 Pro, Gemini 2.0 Flash, Gemini 1.5 Pro | ✅ |
+| **Azure OpenAI** | Same as OpenAI (different hosting) | ✅ |
+| **Custom** | Any OpenAI-compatible API (Ollama, LiteLLM, etc.) | Manual |
+### Configuration
+```bash
+cto-gateway --port 9000                  # Custom port
+cto-gateway --block-secrets              # Hard block on critical secrets
+cto-gateway --budget-daily 10            # Max $10/day
+cto-gateway --budget-monthly 200         # Max $200/month
+cto-gateway --project ./my-app           # Analyze a specific project
+cto-gateway --no-optimize                # Disable context injection
+cto-gateway --no-redact                  # Disable secret redaction
+```
+---
 ## What you can do with CTO
 | Use case | How |
@@ -309,12 +375,14 @@ Every day, developers accidentally send secrets to AI tools:
 | **Auto-optimize context** | `npx cto-ai-cli --fix` → generates `.cto/context.md` to paste into AI |
 | **Task-specific context** | `npx cto-ai-cli --context "refactor auth"` → optimized for your task |
 | **Security audit** | `npx cto-ai-cli --audit` → detect secrets & PII before AI sees them |
+| **AI proxy (Gateway)** | `npx cto-gateway` → proxy with secret redaction + cost tracking |
 | **Shareable report** | `npx cto-ai-cli --report` → markdown report + README badge |
 | **Compare vs open source** | `npx cto-ai-cli --compare` → your score vs Zod, Next.js, Express |
 | **Compare strategies** | `npx cto-ai-cli --benchmark` → CTO vs naive vs random |
 | **Get context for a task** | `cto2 interact "your task"` |
 | **Use in your AI editor** | Add MCP server (see setup above) |
 | **Block secrets in CI** | `CI=true npx cto-ai-cli --audit` |
+| **Budget control** | `cto-gateway --budget-daily 10 --budget-monthly 200` |
 | **JSON output (scripting)** | `npx cto-ai-cli --json` |
 ---
@@ -350,7 +418,7 @@ git clone <repo-url>
 cd cto
 npm install
 npm run build
-npm test          # 449 tests
+npm test          # 573 tests
 npm run typecheck # strict TypeScript
 ```