npm - switchboard-fyi - Versions diffs - 0.1.0 - Mend

switchboard-fyi 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12) hide show

package/LICENSE +21 -0
package/README.md +538 -0
package/bin/switchboard-gateway.mjs +5543 -0
package/bin/switchboard-inspector.mjs +814 -0
package/bin/switchboard.mjs +6936 -0
package/docs/codex-subscription-provider-proxy.md +133 -0
package/docs/known-limitations.md +69 -0
package/docs/mvp-usage.md +207 -0
package/docs/routing-api.md +197 -0
package/docs/smoke-test.md +190 -0
package/lib/switchboard-core.mjs +779 -0
package/package.json +50 -0

package/docs/codex-subscription-provider-proxy.md ADDED Viewed

@@ -0,0 +1,133 @@
+# Codex Subscription Provider Proxy
+This is the canonical Codex integration for Switchboard.
+Use the Codex model-provider boundary:
+```text
+codex -> Switchboard /v1/responses -> chatgpt.com/backend-api/codex/responses
+```
+Do not use the Codex app-server `turn/start` boundary for core routing. That
+boundary can only choose one model for a whole user turn. The model-provider
+boundary sees the internal `/v1/responses` calls Codex makes during an agent
+loop, so Switchboard can route each call independently.
+## Why This Works
+Codex supports custom model providers:
+```toml
+model_provider = "switchboard"
+[model_providers.switchboard]
+name = "Switchboard"
+base_url = "http://127.0.0.1:8787/v1"
+wire_api = "responses"
+requires_openai_auth = true
+request_max_retries = 0
+stream_max_retries = 0
+```
+With ChatGPT/Codex login, Codex sends the user's local ChatGPT bearer auth to
+the configured provider. Switchboard must keep this local and forward it only to
+the ChatGPT Codex backend:
+```text
+https://chatgpt.com/backend-api/codex/responses
+```
+Do not forward Codex ChatGPT auth to:
+```text
+https://api.openai.com/v1/responses
+```
+That public Platform endpoint rejects the token with missing API scopes.
+Switchboard v1 is subscription-only for Codex: OpenAI API-key auth is rejected
+locally before routing and must not fall through to `api.openai.com`.
+## Local Proof
+Verified locally on 2026-05-18 with Codex CLI `0.130.0`.
+Command shape:
+```bash
+node ./bin/switchboard-gateway.mjs start \
+  --port 8787 \
+  --forward \
+  --codex-chatgpt-upstream
+codex exec \
+  --skip-git-repo-check \
+  --dangerously-bypass-approvals-and-sandbox \
+  -c 'model_provider="switchboard"' \
+  -c 'model="gpt-5.5"' \
+  -c 'model_providers.switchboard.name="Switchboard"' \
+  -c 'model_providers.switchboard.base_url="http://127.0.0.1:8787/v1"' \
+  -c 'model_providers.switchboard.wire_api="responses"' \
+  -c 'model_providers.switchboard.requires_openai_auth=true' \
+  -c 'model_providers.switchboard.request_max_retries=0' \
+  -c 'model_providers.switchboard.stream_max_retries=0' \
+  "Reply with exactly: subscription-provider-ok"
+```
+Observed:
+- Codex sent `POST /v1/responses` to Switchboard.
+- The request included a ChatGPT bearer auth header.
+- Switchboard forwarded to `/backend-api/codex/responses`.
+- The upstream returned `200`.
+## Per-Internal-Call Proof
+A single Codex user prompt that asked it to run a failing test, fix a bug, rerun
+the test, and return a final phrase produced 6 separate Codex `model_request`
+events in `~/.switchboard/harnesses/codex/events.jsonl`.
+Each request was:
+```text
+POST /v1/responses -> /backend-api/codex/responses
+```
+All returned `status: 200`.
+This proves Switchboard can observe and route Codex's internal model calls when
+Codex is configured through the model-provider boundary.
+## Model Mutation Proof
+The same path accepted a forced model mutation:
+```bash
+node ./bin/switchboard-gateway.mjs start \
+  --port 8787 \
+  --forward \
+  --codex-chatgpt-upstream \
+  --force-model gpt-5.4-mini
+```
+Switchboard logged:
+```json
+{
+  "requestedModel": "gpt-5.5",
+  "forwardedModel": "gpt-5.4-mini",
+  "status": 200
+}
+```
+That proves the provider proxy can do the actual routing action, not just
+observe requests.
+## Product Rule
+For Codex subscription support:
+- run Switchboard locally on `127.0.0.1`
+- configure Codex with a temporary custom `model_provider`
+- forward Codex ChatGPT auth only to the ChatGPT Codex backend
+- route each `/v1/responses` request independently
+- never collect ChatGPT tokens in hosted Switchboard infrastructure

package/docs/known-limitations.md ADDED Viewed

@@ -0,0 +1,69 @@
+# Known Limitations
+This is still a local-first MVP, not a hosted service.
+## CLI Usage
+- `switchboard install` creates PATH shims for normal `codex` and `claude`
+  usage. Open a new terminal after install so the managed PATH block is active.
+- If `~/.switchboard/bin` is not before the real tool locations in PATH, plain
+  `codex` or `claude` will bypass Switchboard. Check `switchboard status`.
+## Routing
+- The Switchboard API owns the Workers AI difficulty-classifier prompt, model
+  choice, and route decision.
+- The CLI does not run a local classifier or fallback router.
+- It can still miss nuance.
+- The CLI routes when the API directs it to route, then protects the developer
+  with automatic original-call fallback when a routed provider call fails before
+  useful output is sent.
+- Codex routing should happen at the Responses model-provider boundary, not
+  `turn/start`. The local subscription probe saw multiple `/v1/responses` calls
+  inside one Codex user prompt.
+## Tokens
+- Token counts are approximate when upstream usage is unavailable.
+- Dashboard totals focus on requests and tokens processed.
+- When upstream usage is missing, the CLI falls back to local estimates.
+## Logs
+- Logs are local JSONL files under `~/.switchboard`.
+- Prompt previews are local-only and known secrets are sanitized.
+- Set `logging.redactPrompts` to `true` if you want local previews replaced
+  with placeholders.
+- Log rotation is size-based and controlled by config.
+## Process Lifecycle
+- The wrappers start local proxies for each wrapped session.
+- They stop those proxies when the wrapped tool exits.
+- `switchboard status` shows detected Switchboard proxy processes,
+  but it does not kill anything yet.
+## Codex App-Server
+- The app-server remote WebSocket flow is not the supported routing path.
+- It proves `turn/start.params.model` can be rewritten, but that only controls
+  one model choice for an entire Codex turn.
+- Do not use it as the core routing path.
+## Codex Subscription Provider Proxy
+- The canonical Codex path uses a custom `model_provider` with
+  `wire_api = "responses"`.
+- Switchboard receives each Codex `/v1/responses` request and forwards it to
+  `https://chatgpt.com/backend-api/codex/responses`.
+- This path preserves ChatGPT/Codex subscription auth locally and enables
+  per-internal-call routing.
+- This is still an experimental local integration. Do not collect or forward
+  ChatGPT tokens to a hosted Switchboard service.
+## Claude Code
+- The Claude path uses `ANTHROPIC_BASE_URL` and forwards through the local
+  gateway.
+- Claude Code may make background/internal requests. Switchboard logs and routes
+  those separately from user-visible prompts.

package/docs/mvp-usage.md ADDED Viewed

@@ -0,0 +1,207 @@
+# Switchboard MVP Usage
+Install Switchboard shims once, then keep using normal Codex and Claude
+commands:
+```bash
+switchboard install
+```
+The installer writes shims to `~/.switchboard/bin`, records the real binary
+paths in `~/.switchboard/install.json`, and adds a managed PATH block to your
+shell rc file. Open a new terminal after install.
+## Codex
+```bash
+codex --no-alt-screen "Reply with exactly: codex-ok"
+```
+Interactive:
+```bash
+codex
+```
+Switchboard starts a local OpenAI Responses-compatible HTTP gateway, launches
+the real Codex CLI with a temporary custom `model_provider`, and mutates
+`model` on each forwarded `/v1/responses` request.
+Canonical Codex path:
+```text
+codex -> Switchboard /v1/responses -> chatgpt.com/backend-api/codex/responses
+```
+This is the subscription-preserving, per-internal-call path. Do not build new
+Codex routing work on the app-server `turn/start` proxy; that boundary can only
+route a whole user turn.
+Switchboard API classifier mode:
+```bash
+switchboard login
+switchboard balance
+switchboard config use-switchboard-api https://api.switchboard.fyi
+```
+After that, normal wrapped Codex runs use the Switchboard Cloudflare API as the
+only classifier:
+```bash
+codex \
+  exec --skip-git-repo-check --dangerously-bypass-approvals-and-sandbox \
+  "Reply with exactly: router-ok"
+```
+The API checks credits, calls the Cloudflare Workers AI difficulty classifier,
+and returns the request difficulty, reason code, flags, and a short
+dashboard-safe task label.
+If the configured Codex proxy port is busy, Switchboard automatically chooses
+the next free local port unless you explicitly pass `--port`.
+## Claude Code
+```bash
+claude -p "Reply with exactly: claude-ok" --model sonnet
+```
+Interactive:
+```bash
+claude
+```
+Switchboard starts a local Anthropic-compatible HTTP gateway, launches the real
+Claude Code with `ANTHROPIC_BASE_URL` pointed at it, and mutates the forwarded
+`model` when the route is cheap-safe.
+Claude Code uses the same Switchboard API classifier and account credits as
+Codex.
+## Start And Observe
+```bash
+switchboard start codex
+switchboard observe codex
+```
+- `start`: routes requests and opens the live dashboard.
+- `observe`: calls the same API for the recommendation and opens the same live
+  dashboard, but preserves the requested model locally. The API recommendation
+  still spends one request credit.
+Each harness has a single active owner. Starting Codex or Claude Code again
+while it is already in `start` or `observe` mode prints the owning pid and exits;
+closing that terminal clears the owner.
+Wrapped Codex and Claude Code sessions are single-owner too: a second routed
+session for the same harness is refused until the first terminal exits.
+## Useful Overrides
+Codex:
+```bash
+codex --cheap-model gpt-5.4-mini
+codex --force-model gpt-5.4-mini
+```
+The Codex wrapper injects these temporary config overrides:
+```bash
+-c 'model_provider="switchboard"'
+-c 'model_providers.switchboard.base_url="http://127.0.0.1:<port>/v1"'
+-c 'model_providers.switchboard.wire_api="responses"'
+-c 'model_providers.switchboard.requires_openai_auth=true'
+```
+Claude:
+```bash
+claude --cheap-model claude-haiku-4-5
+```
+## Logs
+```bash
+switchboard logs codex
+switchboard logs claude-code
+tail -f ~/.switchboard/harnesses/codex/events.jsonl
+tail -f ~/.switchboard/harnesses/claude/events.jsonl
+```
+Each harness `events.jsonl` includes:
+```text
+classification_api_payload
+classification_api_result
+route_applied
+forward_result
+upstream_response
+```
+`upstream_response` is the result-side audit event. It shares the route
+`decisionId` and records the upstream status, total latency, body bytes,
+response ids, usage when the provider exposes it, stream event counts, error
+summary, and a redacted output preview.
+## Config And Status
+```bash
+switchboard config init
+switchboard config show
+switchboard status
+```
+Config lives at:
+```bash
+~/.switchboard/config.json
+```
+The config controls default mode, default cheap models, dashboard refresh,
+prompt preview logging, and log rotation. Prompt previews are shown locally
+with known secrets sanitized. Set `logging.redactPrompts` to `true` only if
+you want local previews replaced with placeholders.
+Authenticated CLI and gateway crashes are reported best-effort to the
+Switchboard API for reliability triage. Expected user states such as missing
+credits, missing login, invalid flags, missing harness binaries, and local
+configuration refusals are filtered out. Reports are sanitized, truncated, and
+sent with a short timeout; failed reporting never changes CLI exit behavior.
+Set `diagnostics.errorReporting.enabled` to `false` or
+`SWITCHBOARD_ERROR_REPORTING=0` to disable remote error reporting.
+Wrapped Codex and Claude exits are quiet by default. Set
+`sessionReceipt.enabled` to `true` to print a session receipt with calls
+observed, preserved requests, forced overrides, observe-only calls, and tokens
+processed.
+## Dashboard
+```bash
+switchboard dashboard codex
+switchboard dashboard claude-code
+switchboard watch codex
+switchboard watch claude-code
+switchboard inspect
+switchboard inspect --harness codex
+switchboard inspect --harness claude-code
+switchboard dashboard claude-code --interval 1000
+```
+The dashboard reads harness-scoped JSONL route logs under
+`~/.switchboard/harnesses/<harness>/`. Use `codex` or `claude-code` to inspect
+one harness at a time, or `all` for the aggregate compatibility view. The top
+section shows requests and tokens processed, plus routing health. The middle
+section shows route mix and model changes from `x -> y`. The bottom section is
+the live call feed, which now shows the actual model used. Use `--live` while another
+terminal runs the matching harness through Switchboard. Live mode uses an
+alternate-screen watch view, so it does not push your terminal scrollback. Press
+`q` or Ctrl-C to exit.
+`switchboard inspect` starts the local web inspector for routing API debugging. It
+groups events by `decisionId` and shows payload, decision, applied route,
+forwarding status, and raw events in separate tabs.

package/docs/routing-api.md ADDED Viewed

@@ -0,0 +1,197 @@
+# Classification API
+Switchboard now uses the API for one thing only:
+- classify how hard a request is
+The local CLI owns model selection.
+## Product Boundary
+The request flow is:
+1. the gateway intercepts a Codex or Claude request
+2. it derives a compact normalized packet
+3. it sends that packet to `POST /v1/classify`
+4. the API returns difficulty `1..5`, confidence, reason code, flags, and a short task label
+5. the local CLI maps that difficulty to the configured model
+6. the gateway forwards the provider request using that chosen model
+The API does not return a target model.
+## Endpoint
+```text
+POST /v1/classify
+```
+Authentication and billing work the same way as the old routing endpoint: each
+successful new classification request debits one request credit.
+## Request Shape
+The gateway sends a compact packet with only the information that materially
+changes difficulty classification.
+Example:
+```json
+{
+  "request_id": "uuid",
+  "packet": {
+    "schemaVersion": 1,
+    "decisionId": "uuid",
+    "integration": "codex",
+    "observed": {
+      "requestedModel": "gpt-5.5",
+      "reasoningEffort": "xhigh",
+      "stream": true,
+      "inputItemCount": 173,
+      "estimatedInputTokens": 183232,
+      "bodyBytes": 732926,
+      "toolCount": 17
+    },
+    "capabilities": {
+      "hasShellTool": true,
+      "hasFileEditTool": true,
+      "hasWebTool": true,
+      "isSubagentCall": false,
+      "hasParentAgent": true,
+      "isLikelyContinuationAfterTool": true
+    },
+    "context": {
+      "latestUserText": "fix the failing login form",
+      "recentAssistantText": "I found the stack trace in auth.ts",
+      "recentToolResultText": "TypeError: cannot read properties of undefined",
+      "recentErrorText": "TypeError in auth.ts:142",
+      "filesOrSymbolsMentioned": ["auth.ts", "LoginForm.tsx"]
+    },
+    "features": {
+      "exact": false,
+      "error": true,
+      "stack": true,
+      "tests": false,
+      "risk": true,
+      "db": false,
+      "auth": true,
+      "billing": false,
+      "infra": false,
+      "ci": false,
+      "deploy": false,
+      "routing": false,
+      "multiFile": true,
+      "statefulMissingContext": false
+    }
+  },
+  "settings": {
+    "harness": "codex",
+    "observe": false,
+    "blockedTargets": []
+  },
+  "metadata": {
+    "harness": "codex",
+    "requested_model": "gpt-5.5",
+    "estimated_input_tokens": 183232,
+    "body_bytes": 732926
+  }
+}
+```
+Notes:
+- do not send the full provider request body
+- do not send the full conversation history
+- `settings` may include local model profiles, difficulty maps, blocked
+  targets, and observe mode for accounting and analytics
+- the API does not choose or return the provider target model; local CLI policy
+  remains the execution source of truth
+## Response Shape
+Example:
+```json
+{
+  "allowed": true,
+  "classification": {
+    "difficulty": 4,
+    "confidence": 0.81,
+    "reason_code": "multi_file_debugging",
+    "reason": "Multi-file debugging with tool use and moderate ambiguity.",
+    "task_label": "🔎 Fix login failure",
+    "flags": {
+      "high_risk": false,
+      "long_context": false,
+      "tool_heavy": true,
+      "continuation": true
+    }
+  },
+  "credit": {
+    "status": "allowed",
+    "debited": true,
+    "chargeable": true
+  }
+}
+```
+The gateway then performs local model selection based on the harness difficulty
+map.
+`task_label` is a dashboard-only summary generated by the classifier. It should
+be one leading emoji plus 3-6 words, sanitized so it does not echo secrets,
+tokens, emails, URLs, raw code, or stack traces. New CLIs prefer this label in
+the local live feed and fall back to the raw task preview when it is absent.
+## Difficulty Meaning
+Difficulty means:
+> the minimum model class likely to succeed reliably
+Typical guidance:
+- `1`: tiny exact task
+- `2`: light bounded task
+- `3`: routine coding or analysis work
+- `4`: multi-step or multi-file work
+- `5`: hard debugging, architecture, large-context reasoning, or high-risk work
+## Observe Mode
+Observe mode remains local-only.
+The gateway still calls `/v1/classify`, but preserves the requested model
+locally. Logs and dashboards show the difficulty result and the model that
+would have been chosen from the local difficulty map.
+## Logs
+The main classification events are:
+```text
+classification_api_payload
+classification_api_result
+model_request
+route_applied
+forward_result
+upstream_response
+```
+Inspect them locally:
+```bash
+tail -f ~/.switchboard/harnesses/codex/events.jsonl
+tail -f ~/.switchboard/harnesses/claude/events.jsonl
+switchboard inspect
+```
+## Dashboard
+The dashboard should be read as:
+- what difficulty did the API assign?
+- what concise task label did the classifier assign?
+- what model did the local CLI choose for that difficulty?
+- how much did that save?
+It is no longer a mode-or-tier product.