switchboard-fyi 0.1.0

@@ -0,0 +1,133 @@
+ # Codex Subscription Provider Proxy
+
+ This is the canonical Codex integration for Switchboard.
+
+ Use the Codex model-provider boundary:
+
+ ```text
+ codex -> Switchboard /v1/responses -> chatgpt.com/backend-api/codex/responses
+ ```
+
+ Do not use the Codex app-server `turn/start` boundary for core routing. That
+ boundary can only choose one model for a whole user turn. The model-provider
+ boundary sees the internal `/v1/responses` calls Codex makes during an agent
+ loop, so Switchboard can route each call independently.
+
+ ## Why This Works
+
+ Codex supports custom model providers:
+
+ ```toml
+ model_provider = "switchboard"
+
+ [model_providers.switchboard]
+ name = "Switchboard"
+ base_url = "http://127.0.0.1:8787/v1"
+ wire_api = "responses"
+ requires_openai_auth = true
+ request_max_retries = 0
+ stream_max_retries = 0
+ ```
+
+ With ChatGPT/Codex login, Codex sends the user's local ChatGPT bearer auth to
+ the configured provider. Switchboard must keep this local and forward it only to
+ the ChatGPT Codex backend:
+
+ ```text
+ https://chatgpt.com/backend-api/codex/responses
+ ```
+
+ Do not forward Codex ChatGPT auth to:
+
+ ```text
+ https://api.openai.com/v1/responses
+ ```
+
+ That public Platform endpoint rejects the token because it lacks API scopes.
+ Switchboard v1 is subscription-only for Codex: OpenAI API-key auth is rejected
+ locally before routing and must not fall through to `api.openai.com`.
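
The forwarding rule above can be sketched as a small guard, written in JavaScript because the gateway is a Node script. The function name `chooseCodexUpstream`, the `reason` strings, and the idea that Platform API keys can be recognized by an `sk-` prefix are assumptions made for illustration; none of them are taken from the Switchboard source.

```javascript
// Hypothetical sketch: decide where a Codex request may be forwarded.
// Assumption: OpenAI Platform API keys start with "sk-"; any other bearer
// token is treated as ChatGPT/Codex subscription auth.
const CHATGPT_CODEX_UPSTREAM = "https://chatgpt.com/backend-api/codex/responses";

function chooseCodexUpstream(authorizationHeader) {
  const match = /^Bearer\s+(\S+)$/i.exec(authorizationHeader ?? "");
  if (!match) {
    return { ok: false, reason: "missing_bearer_auth" };
  }
  if (match[1].startsWith("sk-")) {
    // Reject locally; never fall through to api.openai.com.
    return { ok: false, reason: "api_key_auth_not_supported" };
  }
  // Subscription auth goes to exactly one upstream.
  return { ok: true, upstream: CHATGPT_CODEX_UPSTREAM };
}
```

Returning a single hard-coded upstream, rather than accepting one as a parameter, keeps the token from reaching any other host by construction.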
+
+ ## Local Proof
+
+ Verified locally on 2026-05-18 with Codex CLI `0.130.0`.
+
+ Command shape:
+
+ ```bash
+ node ./bin/switchboard-gateway.mjs start \
+   --port 8787 \
+   --forward \
+   --codex-chatgpt-upstream
+
+ codex exec \
+   --skip-git-repo-check \
+   --dangerously-bypass-approvals-and-sandbox \
+   -c 'model_provider="switchboard"' \
+   -c 'model="gpt-5.5"' \
+   -c 'model_providers.switchboard.name="Switchboard"' \
+   -c 'model_providers.switchboard.base_url="http://127.0.0.1:8787/v1"' \
+   -c 'model_providers.switchboard.wire_api="responses"' \
+   -c 'model_providers.switchboard.requires_openai_auth=true' \
+   -c 'model_providers.switchboard.request_max_retries=0' \
+   -c 'model_providers.switchboard.stream_max_retries=0' \
+   "Reply with exactly: subscription-provider-ok"
+ ```
+
+ Observed:
+
+ - Codex sent `POST /v1/responses` to Switchboard.
+ - The request included a ChatGPT bearer auth header.
+ - Switchboard forwarded to `/backend-api/codex/responses`.
+ - The upstream returned `200`.
+
+ ## Per-Internal-Call Proof
+
+ A single Codex user prompt that asked it to run a failing test, fix a bug, rerun
+ the test, and return a final phrase produced 6 separate Codex `model_request`
+ events in `~/.switchboard/harnesses/codex/events.jsonl`.
+
+ Each request was:
+
+ ```text
+ POST /v1/responses -> /backend-api/codex/responses
+ ```
+
+ All returned `status: 200`.
+
+ This proves Switchboard can observe and route Codex's internal model calls when
+ Codex is configured through the model-provider boundary.
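
One way to reproduce that count is to tally event names in the JSONL log. This is only a sketch: the helper name `countEvents` is hypothetical, and the assumption that each record stores its event name under an `event` key is not confirmed by these docs, so the key is left configurable.

```javascript
// Hypothetical sketch: count events by name in JSONL log text.
// Assumption: each line is a JSON object with the event name under `event`;
// pass a different key if the real log uses another field.
function countEvents(jsonlText, key = "event") {
  const counts = {};
  for (const line of jsonlText.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines
    let record;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // ignore partially written or corrupt lines
    }
    if (record === null || typeof record !== "object") continue;
    const name = record[key] ?? "unknown";
    counts[name] = (counts[name] ?? 0) + 1;
  }
  return counts;
}
```

Under that assumption, reading `events.jsonl` after the run described above and calling `countEvents(text).model_request` would be expected to give the 6 requests reported.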
+
+ ## Model Mutation Proof
+
+ The same path accepted a forced model mutation:
+
+ ```bash
+ node ./bin/switchboard-gateway.mjs start \
+   --port 8787 \
+   --forward \
+   --codex-chatgpt-upstream \
+   --force-model gpt-5.4-mini
+ ```
+
+ Switchboard logged:
+
+ ```json
+ {
+   "requestedModel": "gpt-5.5",
+   "forwardedModel": "gpt-5.4-mini",
+   "status": 200
+ }
+ ```
+
+ That proves the provider proxy can do the actual routing action, not just
+ observe requests.
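
The mutation itself amounts to rewriting one field of the request body before forwarding. The sketch below shows that idea only; `applyForcedModel` is a hypothetical name, and how the real gateway parses and re-serializes the body is not described in these docs.

```javascript
// Hypothetical sketch: rewrite the `model` field of a Responses request body.
function applyForcedModel(requestBody, forcedModel) {
  const requestedModel = requestBody.model;
  // Copy instead of mutating so the original request stays available for logging.
  const forwardedBody = { ...requestBody, model: forcedModel ?? requestedModel };
  return {
    forwardedBody,
    // Same field names as the log record shown above; status is added after the upstream replies.
    logEntry: { requestedModel, forwardedModel: forwardedBody.model },
  };
}
```

Every other field of the body passes through untouched, which is what lets the same upstream accept the rewritten request.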
+
+ ## Product Rule
+
+ For Codex subscription support:
+
+ - run Switchboard locally on `127.0.0.1`
+ - configure Codex with a temporary custom `model_provider`
+ - forward Codex ChatGPT auth only to the ChatGPT Codex backend
+ - route each `/v1/responses` request independently
+ - never collect ChatGPT tokens in hosted Switchboard infrastructure
@@ -0,0 +1,69 @@
+ # Known Limitations
+
+ This is still a local-first MVP, not a hosted service.
+
+ ## CLI Usage
+
+ - `switchboard install` creates PATH shims for normal `codex` and `claude`
+   usage. Open a new terminal after install so the managed PATH block is active.
+ - If `~/.switchboard/bin` is not before the real tool locations in PATH, plain
+   `codex` or `claude` will bypass Switchboard. Check `switchboard status`.
+
+ ## Routing
+
+ - The Switchboard API owns the Workers AI difficulty-classifier prompt, model
+   choice, and route decision.
+ - The CLI does not run a local classifier or fallback router.
+ - The classifier can still miss nuance.
+ - The CLI routes when the API directs it to route, then protects the developer
+   with automatic original-call fallback when a routed provider call fails before
+   useful output is sent.
+ - Codex routing should happen at the Responses model-provider boundary, not
+   `turn/start`. The local subscription probe saw multiple `/v1/responses` calls
+   inside one Codex user prompt.
+
+ ## Tokens
+
+ - Token counts are approximate when upstream usage is unavailable.
+ - Dashboard totals focus on requests and tokens processed.
+ - When upstream usage is missing, the CLI falls back to local estimates.
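
These docs do not say how the local estimate is computed. The sketch below assumes the common rule of thumb of roughly 4 bytes per token; that ratio happens to match the Classification API example packet (`bodyBytes` 732926, `estimatedInputTokens` 183232), but whether Switchboard uses exactly this formula is an assumption. The function name `inputTokens` is hypothetical, and `input_tokens` is the usage field reported by the OpenAI Responses API.

```javascript
// Hypothetical sketch: prefer upstream usage, else estimate from body size.
// Assumption: roughly 4 bytes per token, a common rule of thumb.
const BYTES_PER_TOKEN = 4;

function inputTokens(upstreamUsage, bodyBytes) {
  if (upstreamUsage && Number.isFinite(upstreamUsage.input_tokens)) {
    return { tokens: upstreamUsage.input_tokens, estimated: false };
  }
  // No usage from the provider: fall back to a size-based estimate.
  return { tokens: Math.ceil(bodyBytes / BYTES_PER_TOKEN), estimated: true };
}
```

Carrying the `estimated` flag alongside the number lets a dashboard mark approximate totals instead of presenting them as exact.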
+
+ ## Logs
+
+ - Logs are local JSONL files under `~/.switchboard`.
+ - Prompt previews are local-only and known secrets are sanitized.
+ - Set `logging.redactPrompts` to `true` if you want local previews replaced
+   with placeholders.
+ - Log rotation is size-based and controlled by config.
+
+ ## Process Lifecycle
+
+ - The wrappers start local proxies for each wrapped session.
+ - They stop those proxies when the wrapped tool exits.
+ - `switchboard status` shows detected Switchboard proxy processes,
+   but it does not kill anything yet.
+
+ ## Codex App-Server
+
+ - The app-server remote WebSocket flow is not the supported routing path.
+ - It proves `turn/start.params.model` can be rewritten, but that only controls
+   one model choice for an entire Codex turn.
+ - Do not use it as the core routing path.
+
+ ## Codex Subscription Provider Proxy
+
+ - The canonical Codex path uses a custom `model_provider` with
+   `wire_api = "responses"`.
+ - Switchboard receives each Codex `/v1/responses` request and forwards it to
+   `https://chatgpt.com/backend-api/codex/responses`.
+ - This path preserves ChatGPT/Codex subscription auth locally and enables
+   per-internal-call routing.
+ - This is still an experimental local integration. Do not collect or forward
+   ChatGPT tokens to a hosted Switchboard service.
+
+ ## Claude Code
+
+ - The Claude path uses `ANTHROPIC_BASE_URL` and forwards through the local
+   gateway.
+ - Claude Code may make background/internal requests. Switchboard logs and routes
+   those separately from user-visible prompts.
@@ -0,0 +1,207 @@
+ # Switchboard MVP Usage
+
+ Install Switchboard shims once, then keep using normal Codex and Claude
+ commands:
+
+ ```bash
+ switchboard install
+ ```
+
+ The installer writes shims to `~/.switchboard/bin`, records the real binary
+ paths in `~/.switchboard/install.json`, and adds a managed PATH block to your
+ shell rc file. Open a new terminal after install.
+
+ ## Codex
+
+ ```bash
+ codex --no-alt-screen "Reply with exactly: codex-ok"
+ ```
+
+ Interactive:
+
+ ```bash
+ codex
+ ```
+
+ Switchboard starts a local OpenAI Responses-compatible HTTP gateway, launches
+ the real Codex CLI with a temporary custom `model_provider`, and mutates
+ `model` on each forwarded `/v1/responses` request.
+
+ Canonical Codex path:
+
+ ```text
+ codex -> Switchboard /v1/responses -> chatgpt.com/backend-api/codex/responses
+ ```
+
+ This is the subscription-preserving, per-internal-call path. Do not build new
+ Codex routing work on the app-server `turn/start` proxy; that boundary can only
+ route a whole user turn.
+
+ Switchboard API classifier mode:
+
+ ```bash
+ switchboard login
+ switchboard balance
+ switchboard config use-switchboard-api https://api.switchboard.fyi
+ ```
+
+ After that, normal wrapped Codex runs use the Switchboard Cloudflare API as the
+ only classifier:
+
+ ```bash
+ codex \
+   exec --skip-git-repo-check --dangerously-bypass-approvals-and-sandbox \
+   "Reply with exactly: router-ok"
+ ```
+
+ The API checks credits, calls the Cloudflare Workers AI difficulty classifier,
+ and returns the request difficulty, reason code, flags, and a short
+ dashboard-safe task label.
+
+ If the configured Codex proxy port is busy, Switchboard automatically chooses
+ the next free local port unless you explicitly pass `--port`.
+
+ ## Claude Code
+
+ ```bash
+ claude -p "Reply with exactly: claude-ok" --model sonnet
+ ```
+
+ Interactive:
+
+ ```bash
+ claude
+ ```
+
+ Switchboard starts a local Anthropic-compatible HTTP gateway, launches the real
+ Claude Code with `ANTHROPIC_BASE_URL` pointed at it, and mutates the forwarded
+ `model` when the route is cheap-safe.
+
+ Claude Code uses the same Switchboard API classifier and account credits as
+ Codex.
+
+ ## Start And Observe
+
+ ```bash
+ switchboard start codex
+ switchboard observe codex
+ ```
+
+ - `start`: routes requests and opens the live dashboard.
+ - `observe`: calls the same API for the recommendation and opens the same live
+   dashboard, but preserves the requested model locally. The API recommendation
+   still spends one request credit.
+
+ Each harness has a single active owner. Starting Codex or Claude Code again
+ while it is already in `start` or `observe` mode prints the owning pid and exits;
+ closing that terminal clears the owner.
+
+ Wrapped Codex and Claude Code sessions are single-owner too: a second routed
+ session for the same harness is refused until the first terminal exits.
+
+ ## Useful Overrides
+
+ Codex:
+
+ ```bash
+ codex --cheap-model gpt-5.4-mini
+ codex --force-model gpt-5.4-mini
+ ```
+
+ The Codex wrapper injects these temporary config overrides:
+
+ ```bash
+ -c 'model_provider="switchboard"'
+ -c 'model_providers.switchboard.base_url="http://127.0.0.1:<port>/v1"'
+ -c 'model_providers.switchboard.wire_api="responses"'
+ -c 'model_providers.switchboard.requires_openai_auth=true'
+ ```
+
+ Claude:
+
+ ```bash
+ claude --cheap-model claude-haiku-4-5
+ ```
+
+ ## Logs
+
+ ```bash
+ switchboard logs codex
+ switchboard logs claude-code
+ tail -f ~/.switchboard/harnesses/codex/events.jsonl
+ tail -f ~/.switchboard/harnesses/claude/events.jsonl
+ ```
+
+ Each harness `events.jsonl` includes:
+
+ ```text
+ classification_api_payload
+ classification_api_result
+ route_applied
+ forward_result
+ upstream_response
+ ```
+
+ `upstream_response` is the result-side audit event. It shares the route
+ `decisionId` and records the upstream status, total latency, body bytes,
+ response ids, usage when the provider exposes it, stream event counts, error
+ summary, and a redacted output preview.
+
+ ## Config And Status
+
+ ```bash
+ switchboard config init
+ switchboard config show
+ switchboard status
+ ```
+
+ Config lives at:
+
+ ```bash
+ ~/.switchboard/config.json
+ ```
+
+ The config controls default mode, default cheap models, dashboard refresh,
+ prompt preview logging, and log rotation. Prompt previews are shown locally
+ with known secrets sanitized. Set `logging.redactPrompts` to `true` only if
+ you want local previews replaced with placeholders.
+
+ Authenticated CLI and gateway crashes are reported best-effort to the
+ Switchboard API for reliability triage. Expected user states such as missing
+ credits, missing login, invalid flags, missing harness binaries, and local
+ configuration refusals are filtered out. Reports are sanitized, truncated, and
+ sent with a short timeout; failed reporting never changes CLI exit behavior.
+ Set `diagnostics.errorReporting.enabled` to `false` or
+ `SWITCHBOARD_ERROR_REPORTING=0` to disable remote error reporting.
+
+ Wrapped Codex and Claude exits are quiet by default. Set
+ `sessionReceipt.enabled` to `true` to print a session receipt with calls
+ observed, preserved requests, forced overrides, observe-only calls, and tokens
+ processed.
+
+ ## Dashboard
+
+ ```bash
+ switchboard dashboard codex
+ switchboard dashboard claude-code
+ switchboard watch codex
+ switchboard watch claude-code
+ switchboard inspect
+ switchboard inspect --harness codex
+ switchboard inspect --harness claude-code
+ switchboard dashboard claude-code --interval 1000
+ ```
+
+ The dashboard reads harness-scoped JSONL route logs under
+ `~/.switchboard/harnesses/<harness>/`. Use `codex` or `claude-code` to inspect
+ one harness at a time, or `all` for the aggregate compatibility view. The top
+ section shows requests and tokens processed, plus routing health. The middle
+ section shows route mix and model changes from `x -> y`. The bottom section is
+ the live call feed, which now shows the actual model used. Use `--live` while another
+ terminal runs the matching harness through Switchboard. Live mode uses an
+ alternate-screen watch view, so it does not push your terminal scrollback. Press
+ `q` or Ctrl-C to exit.
+
+ `switchboard inspect` starts the local web inspector for routing API debugging. It
+ groups events by `decisionId` and shows payload, decision, applied route,
+ forwarding status, and raw events in separate tabs.
@@ -0,0 +1,197 @@
+ # Classification API
+
+ Switchboard now uses the API for one thing only:
+
+ - classify how hard a request is
+
+ The local CLI owns model selection.
+
+ ## Product Boundary
+
+ The request flow is:
+
+ 1. the gateway intercepts a Codex or Claude request
+ 2. it derives a compact normalized packet
+ 3. it sends that packet to `POST /v1/classify`
+ 4. the API returns difficulty `1..5`, confidence, reason code, flags, and a short task label
+ 5. the local CLI maps that difficulty to the configured model
+ 6. the gateway forwards the provider request using that chosen model
+
+ The API does not return a target model.
+
+ ## Endpoint
+
+ ```text
+ POST /v1/classify
+ ```
+
+ Authentication and billing work the same way as the old routing endpoint: each
+ successful new classification request debits one request credit.
+
+ ## Request Shape
+
+ The gateway sends a compact packet with only the information that materially
+ changes difficulty classification.
+
+ Example:
+
+ ```json
+ {
+   "request_id": "uuid",
+   "packet": {
+     "schemaVersion": 1,
+     "decisionId": "uuid",
+     "integration": "codex",
+     "observed": {
+       "requestedModel": "gpt-5.5",
+       "reasoningEffort": "xhigh",
+       "stream": true,
+       "inputItemCount": 173,
+       "estimatedInputTokens": 183232,
+       "bodyBytes": 732926,
+       "toolCount": 17
+     },
+     "capabilities": {
+       "hasShellTool": true,
+       "hasFileEditTool": true,
+       "hasWebTool": true,
+       "isSubagentCall": false,
+       "hasParentAgent": true,
+       "isLikelyContinuationAfterTool": true
+     },
+     "context": {
+       "latestUserText": "fix the failing login form",
+       "recentAssistantText": "I found the stack trace in auth.ts",
+       "recentToolResultText": "TypeError: cannot read properties of undefined",
+       "recentErrorText": "TypeError in auth.ts:142",
+       "filesOrSymbolsMentioned": ["auth.ts", "LoginForm.tsx"]
+     },
+     "features": {
+       "exact": false,
+       "error": true,
+       "stack": true,
+       "tests": false,
+       "risk": true,
+       "db": false,
+       "auth": true,
+       "billing": false,
+       "infra": false,
+       "ci": false,
+       "deploy": false,
+       "routing": false,
+       "multiFile": true,
+       "statefulMissingContext": false
+     }
+   },
+   "settings": {
+     "harness": "codex",
+     "observe": false,
+     "blockedTargets": []
+   },
+   "metadata": {
+     "harness": "codex",
+     "requested_model": "gpt-5.5",
+     "estimated_input_tokens": 183232,
+     "body_bytes": 732926
+   }
+ }
+ ```
+
+ Notes:
+
+ - do not send the full provider request body
+ - do not send the full conversation history
+ - `settings` may include local model profiles, difficulty maps, blocked
+   targets, and observe mode for accounting and analytics
+ - the API does not choose or return the provider target model; local CLI policy
+   remains the execution source of truth
+
+ ## Response Shape
+
+ Example:
+
+ ```json
+ {
+   "allowed": true,
+   "classification": {
+     "difficulty": 4,
+     "confidence": 0.81,
+     "reason_code": "multi_file_debugging",
+     "reason": "Multi-file debugging with tool use and moderate ambiguity.",
+     "task_label": "🔎 Fix login failure",
+     "flags": {
+       "high_risk": false,
+       "long_context": false,
+       "tool_heavy": true,
+       "continuation": true
+     }
+   },
+   "credit": {
+     "status": "allowed",
+     "debited": true,
+     "chargeable": true
+   }
+ }
+ ```
+
+ The gateway then performs local model selection based on the harness difficulty
+ map.
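
A minimal sketch of that local selection step, assuming a difficulty map keyed by the levels `1` to `5`. The map contents, the function name `selectModel`, and the choice to keep the requested model when the API refuses the request or the level is unmapped are all illustrative assumptions; real maps come from local configuration.

```javascript
// Hypothetical sketch: map a classification response to a model locally.
// The map contents are illustrative only.
const exampleDifficultyMap = {
  1: "gpt-5.4-mini",
  2: "gpt-5.4-mini",
  3: "gpt-5.4-mini",
  4: "gpt-5.5",
  5: "gpt-5.5",
};

function selectModel(response, requestedModel, difficultyMap, { observe = false } = {}) {
  const difficulty = response?.classification?.difficulty;
  // Assumption: keep the requested model when the API refuses or the level is unmapped.
  const mapped = response?.allowed ? difficultyMap[difficulty] : undefined;
  const recommendedModel = mapped ?? requestedModel;
  return {
    difficulty,
    recommendedModel,
    // Observe mode records the recommendation but forwards the requested model.
    forwardedModel: observe ? requestedModel : recommendedModel,
  };
}
```

With the example response above (difficulty `4`) and this illustrative map, a request for `gpt-5.5` would be forwarded unchanged, while a difficulty `2` result would be forwarded as `gpt-5.4-mini` unless observe mode is on.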
+
+ `task_label` is a dashboard-only summary generated by the classifier. It should
+ be one leading emoji plus 3-6 words, sanitized so it does not echo secrets,
+ tokens, emails, URLs, raw code, or stack traces. New CLIs prefer this label in
+ the local live feed and fall back to the raw task preview when it is absent.
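
A client that wants to double-check a label before showing it could apply a shape check like the one below. It is a sketch, not the classifier's own sanitizer: `isValidTaskLabel` is a hypothetical name, "emoji" is approximated as one leading `Extended_Pictographic` code point, and only URLs and email-like strings are screened, which covers just part of the list above.

```javascript
// Hypothetical sketch: check a task label against the documented shape.
// Assumptions: "emoji" means one leading Extended_Pictographic code point,
// and the content checks are limited to URLs and email-like strings.
function isValidTaskLabel(label) {
  const match = /^(\p{Extended_Pictographic}\uFE0F?)\s+(.+)$/u.exec(label);
  if (!match) return false; // no leading emoji followed by text
  const text = match[2];
  const words = text.trim().split(/\s+/);
  if (words.length < 3 || words.length > 6) return false; // 3-6 words
  if (/https?:\/\//i.test(text)) return false; // no URLs
  if (/\S+@\S+\.\S+/.test(text)) return false; // no email-like strings
  return true;
}
```

A label that fails the check can be dropped in favor of the raw task preview, matching the fallback the paragraph above describes for an absent label.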
+
+ ## Difficulty Meaning
+
+ Difficulty means:
+
+ > the minimum model class likely to succeed reliably
+
+ Typical guidance:
+
+ - `1`: tiny exact task
+ - `2`: light bounded task
+ - `3`: routine coding or analysis work
+ - `4`: multi-step or multi-file work
+ - `5`: hard debugging, architecture, large-context reasoning, or high-risk work
+
+ ## Observe Mode
+
+ Observe mode remains local-only.
+
+ The gateway still calls `/v1/classify`, but preserves the requested model
+ locally. Logs and dashboards show the difficulty result and the model that
+ would have been chosen from the local difficulty map.
+
+ ## Logs
+
+ The main classification events are:
+
+ ```text
+ classification_api_payload
+ classification_api_result
+ model_request
+ route_applied
+ forward_result
+ upstream_response
+ ```
+
+ Inspect them locally:
+
+ ```bash
+ tail -f ~/.switchboard/harnesses/codex/events.jsonl
+ tail -f ~/.switchboard/harnesses/claude/events.jsonl
+ switchboard inspect
+ ```
+
+ ## Dashboard
+
+ The dashboard should be read as:
+
+ - what difficulty did the API assign?
+ - what concise task label did the classifier assign?
+ - what model did the local CLI choose for that difficulty?
+ - how much did that save?
+
+ It is no longer a mode-or-tier product.