@legioncodeinc/rflectr 0.1.0 → 0.1.1

Files changed (34)
  1. package/README.md +1 -5
  2. package/dist/cli.js +1 -1
  3. package/library/README.md +39 -39
  4. package/library/issues/README.md +46 -46
  5. package/library/issues/backlog/README.md +26 -26
  6. package/library/issues/completed/README.md +13 -13
  7. package/library/issues/in-work/README.md +13 -13
  8. package/library/knowledge/README.md +34 -34
  9. package/library/knowledge/private/README.md +40 -40
  10. package/library/knowledge/private/standards/documentation-framework.md +154 -154
  11. package/library/knowledge/public/README.md +49 -49
  12. package/library/notes/README.md +21 -21
  13. package/library/requirements/README.md +51 -51
  14. package/library/requirements/backlog/README.md +30 -30
  15. package/library/requirements/completed/README.md +14 -14
  16. package/library/requirements/completed/prd-002-provider-registry/prd-002-provider-registry-index.md +263 -0
  17. package/library/requirements/completed/prd-003-model-discovery-classification/prd-003-model-discovery-classification-index.md +260 -0
  18. package/library/requirements/completed/prd-004-translation-layer/prd-004-translation-layer-index.md +196 -0
  19. package/library/requirements/completed/prd-005-local-proxy-catalog-routing/prd-005-local-proxy-catalog-routing-index.md +176 -0
  20. package/library/requirements/completed/prd-006-credential-storage/prd-006-credential-storage-index.md +190 -0
  21. package/library/requirements/completed/prd-006-credential-storage/qa/.gitkeep +0 -0
  22. package/library/requirements/completed/prd-007-oauth-device-flows/prd-007-oauth-device-flows-index.md +208 -0
  23. package/library/requirements/completed/prd-008-preferences-tiers-favorites/prd-008-preferences-tiers-favorites-index.md +249 -0
  24. package/library/requirements/completed/prd-008-preferences-tiers-favorites/qa/.gitkeep +0 -0
  25. package/library/requirements/completed/prd-009-codex-integration/prd-009-codex-integration-index.md +212 -0
  26. package/library/requirements/completed/prd-009-codex-integration/qa/.gitkeep +0 -0
  27. package/library/requirements/completed/prd-010-gemini-cli-integration/prd-010-gemini-cli-integration-index.md +211 -0
  28. package/library/requirements/completed/prd-010-gemini-cli-integration/qa/.gitkeep +0 -0
  29. package/library/requirements/completed/prd-011-claude-desktop-integration/prd-011-claude-desktop-integration-index.md +228 -0
  30. package/library/requirements/completed/prd-012-server-gateway/prd-012-server-gateway-index.md +356 -0
  31. package/library/requirements/completed/prd-012-server-gateway/qa/.gitkeep +0 -0
  32. package/library/requirements/in-work/README.md +19 -19
  33. package/library/requirements/reports/README.md +31 -31
  34. package/package.json +1 -1
@@ -0,0 +1,196 @@
1
+ # PRD-004: Translation Layer (Vercel AI SDK Adapter) *(Retroactive)*
2
+
3
+ > **Status:** Shipped
4
+ > **Priority:** —
5
+ > **Effort:** —
6
+ > **Written:** June 2026
7
+ > **Retroactive:** Yes — written after implementation (rflectr v0.2.7).
8
+ > **Source:** `src/sdk-adapter.ts`, `src/provider-factory.ts`, `src/openai-adapter.ts`, `src/gemini-parts.ts`
9
+
10
+ ---
11
+
12
+ ## Overview
13
+
14
+ A Claude Code / Codex / Gemini host speaks one wire format (Anthropic `/v1/messages`, OpenAI Responses, or Gemini REST). The alternative model backends that rflectr re-points those hosts at speak many formats: OpenAI Chat Completions, OpenAI Responses, Gemini `v1beta`, xAI, Mistral, DeepSeek, OpenRouter, openai-compatible endpoints, and so on. The translation layer is the single code path that bridges the host's format to whatever the selected model actually wants.
15
+
16
+ The defining decision is that there is **exactly one translation path**, not N. Every non-Anthropic provider routes through the Vercel AI SDK (`ai` + `@ai-sdk/*`) — the same packages OpenCode loads — which owns wire format, endpoint selection, and provider quirks. rflectr never hand-rolls an Anthropic→OpenAI or Anthropic→Gemini translator per provider. Adding a provider once makes it usable from every host.
17
+
18
+ Anthropic-native models skip the adapter entirely and are forwarded raw; `isSdkMigratedNpm(npm)` (`src/provider-factory.ts:68`) is the gate — true for any npm except `@ai-sdk/anthropic`.
19
+
20
+ See the knowledge doc: [`../../../knowledge/private/ai/translation-layer.md`](../../../knowledge/private/ai/translation-layer.md).
21
+
22
+ ## What Was Built
23
+
24
+ - A provider factory (`createLanguageModel`) that maps OpenCode's `api.npm` package name to a Vercel AI SDK `LanguageModel` via dynamic `import(npm)` and `create*`-export discovery (`src/provider-factory.ts:102`).
25
+ - A Responses-vs-Chat API selector (`modelPrefersResponsesApi`) that routes newer OpenAI/xAI reasoning models through the Responses API (`src/provider-factory.ts:34`).
26
+ - The Anthropic-facing adapter: `translateRequest` (request → SDK call params), `streamAnthropicResponse` (SDK `fullStream` → Anthropic SSE), and `generateAnthropicResponse` (non-streaming) (`src/sdk-adapter.ts:222`, `:413`, `:431`).
27
+ - Inline `role:'system'` folding — Claude Code's mid-conversation skills list and system-reminders are merged into the system prompt instead of being dropped (`src/sdk-adapter.ts:90`, `:232`).
28
+ - The `thought_signature` round-trip — smuggled through the tool-use id so reasoning models (especially Gemini) get their signature echoed back verbatim (`src/proxy-shared.ts:93`, `:72`).
29
+ - Per-provider reasoning capability + effort translation (`getReasoningCapabilities`, `effortProviderOptions`, `thinkingProviderOptions`, `deepMergeProviderOptions`) (`src/provider-factory.ts:469`, `:633`, `:741`, `:725`).
30
+ - Three more host-facing directions reusing the same factory + SDK model: the OpenAI Chat Completions adapter (`src/openai-adapter.ts`), the Codex Responses adapter (`src/codex-responses-adapter.ts`), and the Gemini parts translator (`src/gemini-parts.ts`).
31
+
32
+ ## Goals
33
+
34
+ - One translation path for all non-Anthropic providers — eliminate the combinatorial per-provider translator mess.
35
+ - Let the SDK own wire format, endpoint selection, and provider quirks (message ordering, reasoning signatures, tool-call encoding).
36
+ - Preserve host behavior that the Anthropic wire format alone would lose — inline system messages and reasoning signatures.
37
+ - Keep `dist/cli.js` small by loading SDK provider packages on demand from `node_modules`.
38
+ - Make every host (Claude Code, Codex, Gemini, Claude Desktop gateway) converge on the same factory so a provider added once works everywhere.
39
+
40
+ ## Non-Goals
41
+
42
+ - Hand-rolled per-provider wire translation — explicitly rejected in favor of the SDK.
43
+ - Owning the agent tool loop. The adapter is strictly **one turn per request**; Claude Code owns the loop (`src/sdk-adapter.ts:3`).
44
+ - Bundling SDK provider packages into the CLI. They ship as `dependencies` / `external` and resolve at runtime (`src/provider-factory.ts:81`).
45
+ - Routing Anthropic-native models — those bypass the adapter and are forwarded raw (passthrough handled by the proxy, see PRD-005).
46
+ - Accurate cost reporting for non-Anthropic models (Claude Code applies its own pricing table; documented limitation).
47
+
48
+ ## Features
49
+
50
+ | # | Feature | Source |
51
+ |---|---------|--------|
52
+ | F1 | Single-path gate (`isSdkMigratedNpm`): SDK for all npm except `@ai-sdk/anthropic` | `src/provider-factory.ts:68` |
53
+ | F2 | `createLanguageModel({npm,modelId,apiKey,baseURL})` factory via dynamic `import(npm)` + `create*` discovery | `src/provider-factory.ts:102`, `:72`, `:81` |
54
+ | F3 | Responses-vs-Chat API selection (`modelPrefersResponsesApi`) for OpenAI/xAI | `src/provider-factory.ts:34` |
55
+ | F4 | `translateRequest` — messages, tools, `tool_choice`, system | `src/sdk-adapter.ts:222` |
56
+ | F5 | Inline `role:'system'` folding into the system prompt | `src/sdk-adapter.ts:90`, `:232` |
57
+ | F6 | `streamAnthropicResponse` — SDK `fullStream` → Anthropic SSE | `src/sdk-adapter.ts:413`, `:273` |
58
+ | F7 | `generateAnthropicResponse` — non-streaming case | `src/sdk-adapter.ts:431` |
59
+ | F8 | `thought_signature` round-trip via tool-use id (encode/decode) | `src/proxy-shared.ts:93`, `:72`; `src/sdk-adapter.ts:190`, `:352` |
60
+ | F9 | Per-provider reasoning effort/thinking translation | `src/provider-factory.ts:469`, `:633`, `:741` |
61
+ | F10 | OpenAI Chat Completions host direction (reuses factory + SDK model) | `src/openai-adapter.ts:38` |
62
+ | F11 | Gemini parts → Anthropic blocks translation | `src/gemini-parts.ts:16`, `:45` |
63
+ | F12 | Codex Responses host direction (reuses factory + SDK model) | `src/codex-responses-adapter.ts:243` |
64
+
65
+ ## Architecture & Implementation
66
+
67
+ ### One translation path
68
+
69
+ ```mermaid
70
+ flowchart TD
71
+ host["Host wire format<br/>(Anthropic / Responses / Gemini)"]
72
+ host --> gate{"isSdkMigratedNpm(npm)?"}
73
+ gate -- "no (@ai-sdk/anthropic)" --> raw["raw passthrough (PRD-005)"]
74
+ gate -- "yes" --> trans["translate*Request() — host body → SdkCallParams"]
75
+ trans --> factory["createLanguageModel({npm, modelId, apiKey, baseURL})"]
76
+ factory --> model["Vercel AI SDK LanguageModel"]
77
+ model --> run["streamText / generateText"]
78
+ run --> back["map SDK fullStream → host SSE / JSON"]
79
+ back --> host
80
+ ```
81
+
82
+ The classifier in PRD-003 supplies `npm` and `modelFormat`; the proxy in PRD-005 dispatches into this layer.
83
+
84
+ ### Request translation (Anthropic → SDK)
85
+
86
+ `translateRequest(body, npm, options?)` (`src/sdk-adapter.ts:222`) builds an `SdkCallParams` object:
87
+
88
+ - **Messages** — `translateMessages` (`src/sdk-adapter.ts:153`) walks Anthropic blocks into SDK `ModelMessage[]`: text, images (`imagePart`, `:103`), `tool_use` → `tool-call`, `tool_result` → a `tool` role message, and `thinking` → SDK `reasoning` parts. Tool-result messages need a tool name, resolved by `annotateToolNames` (`:116`) which builds an id→name map first.
89
+ - **Tools** — `translateTools` (`:204`) wraps each Anthropic tool as an SDK `tool({ description, inputSchema: jsonSchema(...) })`.
90
+ - **`tool_choice`** — `translateToolChoice` (`:214`) maps `auto`→`'auto'`, `any`→`'required'`, `tool`→`{type:'tool',toolName}`.
91
+ - **System folding** — `systemToString` (`:82`) flattens the top-level `system`; `inlineSystemText` (`:90`) collects mid-conversation `role:'system'` messages (Claude Code injects the skills list and system-reminders this way) and joins them into the system prompt so they are not dropped (`:232`).
92
+ - **Reasoning** — effort is read from `output_config.effort` (`anthropicEffortFromRequest`, `:65`) or `options.defaultEffort` (the Claude Desktop gateway omits effort); `providerOptions` is the deep-merge of `thinkingProviderOptions(npm)` and `effortProviderOptions(...)` (`:244`).
93
+ - **ChatGPT Codex OAuth branch** — when `openAiOAuth` is set, the system prompt moves into `providerOptions.openai.instructions`, `system`/`maxOutputTokens` are cleared, and a default `"You are a coding assistant."` is used if no system text exists (`:235`, `:251`, `:258`).
94
+
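The `tool_choice` mapping and the inline-system folding described in the list above can be sketched roughly as follows. This is a minimal illustration, not the adapter's code: the type shapes, the function names `mapToolChoice` and `foldInlineSystem`, and the blank-line join are all assumptions made for the example.

```typescript
// Hypothetical, simplified shapes; the real types live in the adapter and the AI SDK.
type AnthropicToolChoice =
  | { type: "auto" }
  | { type: "any" }
  | { type: "tool"; name: string };

type SdkToolChoice = "auto" | "required" | { type: "tool"; toolName: string };

// Mirrors the mapping in the list above: auto -> 'auto', any -> 'required', tool -> named tool.
function mapToolChoice(choice: AnthropicToolChoice): SdkToolChoice {
  switch (choice.type) {
    case "auto":
      return "auto";
    case "any":
      return "required";
    case "tool":
      return { type: "tool", toolName: choice.name };
  }
}

type Message = { role: "user" | "assistant" | "system"; content: string };

// Folds mid-conversation system messages into the top-level system prompt
// (joined with blank lines, which is an assumption) and returns the remaining turns.
function foldInlineSystem(
  system: string | undefined,
  messages: Message[],
): { system: string; messages: Message[] } {
  const inline = messages.filter((m) => m.role === "system").map((m) => m.content);
  const parts = [system ?? "", ...inline].filter((s) => s.length > 0);
  return {
    system: parts.join("\n\n"),
    messages: messages.filter((m) => m.role !== "system"),
  };
}
```

The point of folding rather than filtering is that the skills list and system-reminders survive translation even when the target SDK call accepts only a single system string.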
95
+ ### Factory discovery (`createLanguageModel`)
96
+
97
+ `createLanguageModel(spec)` (`src/provider-factory.ts:102`) is async and routes by `npm`:
98
+
99
+ | npm | Behavior | Source |
100
+ |---|---|---|
101
+ | `@ai-sdk/google-vertex/anthropic` (`VERTEX_ANTHROPIC_NPM`) | Claude on Vertex AI via ADC (no apiKey) | `:105` |
102
+ | `@ai-sdk/openai` | OAuth → ChatGPT Codex backend (`https://chatgpt.com/backend-api/codex`); API key → direct. `modelPrefersResponsesApi()` picks `openai.responses(id)` vs `openai.chat(id)` | `:117` |
103
+ | `@ai-sdk/xai` | Direct; also consults `modelPrefersResponsesApi()` | `:134` |
104
+ | `@ai-sdk/google` | Direct — ignores `baseURL`, uses native `v1beta` (passing the OpenAI-compat discovery URL would 404) | `:142` |
105
+ | `@ai-sdk/anthropic` | Direct; strips a trailing `/v1` from `baseURL`, re-appends `/v1` for the SDK | `:149` |
106
+ | `@ai-sdk/openai-compatible`, `@openrouter/ai-sdk-provider` | Routed via `baseURL` | `:160`, `:167` |
107
+ | anything else | `loadSdkProviderFactory(npm)` → dynamic `import(npm)` → `findCreateFactory` finds the `create*()` export | `:170`, `:81`, `:72` |
108
+
109
+ `findCreateFactory` (`:72`) scans the module's exports for a function whose name starts with `create`. `loadSdkProviderFactory` (`:81`) caches the promise per npm and, on `ERR_MODULE_NOT_FOUND`, raises an install hint. Reasoning models matching `/deepseek-r1|think|reasoning|qwq/` are wrapped with `extractReasoningMiddleware({ tagName: 'think' })` (`:176`). The `@ai-sdk/*` packages are npm `dependencies` marked `external` in `tsup.config.ts`, so `dist/cli.js` stays small.
110
+
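A rough sketch of the `create*` discovery and per-package promise cache described above. The names `findCreateExport` and `loadFactoryCached`, the injectable `importer` parameter, and the wording of the install hint are assumptions for illustration; only the overall behavior (scan exports for a `create`-prefixed function, cache the promise per npm name, turn `ERR_MODULE_NOT_FOUND` into an install hint) comes from the text.

```typescript
type ProviderFactory = (options: Record<string, unknown>) => unknown;

// Scan a module namespace for the first exported function whose name starts with "create".
function findCreateExport(mod: Record<string, unknown>): ProviderFactory | undefined {
  for (const [name, value] of Object.entries(mod)) {
    if (name.startsWith("create") && typeof value === "function") {
      return value as ProviderFactory;
    }
  }
  return undefined;
}

// One promise per package name, so concurrent requests share a single import.
const factoryCache = new Map<string, Promise<ProviderFactory>>();

function loadFactoryCached(
  npm: string,
  // Injectable so the sketch can be exercised without installing provider packages.
  importer: (name: string) => Promise<Record<string, unknown>> = (name) => import(name),
): Promise<ProviderFactory> {
  let cached = factoryCache.get(npm);
  if (!cached) {
    cached = importer(npm).then(
      (mod) => {
        const factory = findCreateExport(mod);
        if (!factory) throw new Error(`${npm} exports no create*() factory`);
        return factory;
      },
      (err: unknown) => {
        const code = (err as { code?: string } | null)?.code;
        if (code === "ERR_MODULE_NOT_FOUND") {
          // Hypothetical hint text.
          throw new Error(`Provider package ${npm} is not installed. Try: npm install ${npm}`);
        }
        throw err;
      },
    );
    factoryCache.set(npm, cached);
  }
  return cached;
}
```

Caching the promise rather than the resolved factory means two requests arriving during the first import wait on the same load instead of triggering it twice.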
111
+ ### Responses-vs-Chat selection
112
+
113
+ `modelPrefersResponsesApi(modelId)` (`src/provider-factory.ts:34`) returns true when a model must use `/v1/responses` rather than `/v1/chat/completions`:
114
+
115
+ | Pattern | Examples | Matched by |
116
+ |---|---|---|
117
+ | `gpt-5.4` / `gpt-5.5` prefixes | `gpt-5.4`, `gpt-5.5-fast` | `RESPONSES_ONLY_PREFIXES` exact/`-`-prefix (`:12`, `:36`) |
118
+ | `gpt-5-pro` / `gpt-5.2-pro` | `gpt-5-pro` | same |
119
+ | `gpt-5-codex` and versioned codex | `gpt-5-codex`, `gpt-5.3-codex` | prefix + `gpt-*-codex` check (`:40`) |
120
+ | o-series | `o3`, `o4`, `o3-mini` | prefix list (`:18`) |
121
+ | xAI multi-agent | `grok-4.20-multi-agent`, `grok-4.2-multiagent` | `grok-*` + `multi-agent`/`multiagent` (`:42`) |
122
+
123
+ `upstreamModelId` (from PRD-003) carries OpenCode's `api.id` because catalog ids may differ from upstream API ids (e.g. `gpt-5.5-fast` → `gpt-5.5`).
124
+
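The selection rules in the table can be approximated with a small predicate. This is a sketch under stated assumptions: the prefix list below is reconstructed from the table's examples and is not the real `RESPONSES_ONLY_PREFIXES`, and `prefersResponsesApiSketch` is a hypothetical name.

```typescript
// Assumed contents, inferred from the table above; the real list is not reproduced here.
const ASSUMED_RESPONSES_ONLY_PREFIXES = [
  "gpt-5.4",
  "gpt-5.5",
  "gpt-5-pro",
  "gpt-5.2-pro",
  "gpt-5-codex",
  "o3",
  "o4",
];

function prefersResponsesApiSketch(modelId: string): boolean {
  const id = modelId.toLowerCase();
  // Exact match, or the prefix followed by a "-" variant suffix (e.g. "gpt-5.5-fast").
  if (ASSUMED_RESPONSES_ONLY_PREFIXES.some((p) => id === p || id.startsWith(`${p}-`))) {
    return true;
  }
  // Versioned codex ids such as "gpt-5.3-codex".
  if (/^gpt-.*-codex/.test(id)) return true;
  // xAI multi-agent ids, with or without the hyphen.
  if (id.startsWith("grok-") && /multi-?agent/.test(id)) return true;
  return false;
}
```

Matching on "exact or prefix plus `-`" rather than a bare `startsWith` keeps an id like `o30` from being treated as an `o3` variant.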
125
+ ### Response translation (SDK → Anthropic SSE)
126
+
127
+ `writeAnthropicStream` (`src/sdk-adapter.ts:273`) consumes the SDK `fullStream` and emits Anthropic SSE events. It tracks one open content block at a time and maps SDK parts:
128
+
129
+ - `reasoning-start` / `reasoning-delta` → `thinking` block + `thinking_delta`; `reasoning-end` captures the round-trip signature emitted later as a `signature_delta` on close (`:321`, `:331`, `:302`).
130
+ - `text-start` / `text-delta` → `text` block + `text_delta` (`:337`).
131
+ - `tool-input-start` / `-delta` → `tool_use` block (id encoded with the thought signature) + `input_json_delta`; `tool-call` handles the non-streamed case (`:349`, `:365`).
132
+ - `finish` maps `finishReason` and usage; `error` closes the open block and emits an Anthropic `error` event (`:381`, `:393`).
133
+
134
+ `streamAnthropicResponse` (`:413`) wires `streamText` to `writeAnthropicStream` and swallows stream-property rejections; `generateAnthropicResponse` (`:431`) runs `generateText` and builds a single Anthropic message JSON.
135
+
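The "one open content block at a time" mapping can be illustrated with a trimmed-down converter. The `StreamPart` union here is hypothetical and much smaller than the AI SDK's real `fullStream` union; tool calls, signatures, usage, and the `message_start` / `message_delta` events are left out. The Anthropic event names (`content_block_start`, `content_block_delta`, `content_block_stop`, `message_stop`) and delta types (`text_delta`, `thinking_delta`) follow the public streaming format.

```typescript
// Hypothetical, simplified stream parts for illustration only.
type StreamPart =
  | { type: "reasoning-delta"; text: string }
  | { type: "text-delta"; text: string }
  | { type: "finish" };

type BlockKind = "thinking" | "text";

// Formats one server-sent event frame.
function sseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Converts parts to Anthropic-style SSE frames, keeping at most one block open.
function toAnthropicSseSketch(parts: StreamPart[]): string[] {
  const out: string[] = [];
  let open: BlockKind | null = null;
  let index = -1;

  const close = (): void => {
    if (open !== null) {
      out.push(sseFrame("content_block_stop", { type: "content_block_stop", index }));
      open = null;
    }
  };

  const ensure = (kind: BlockKind): void => {
    if (open === kind) return;
    close(); // switching block kinds closes the previous block first
    index += 1;
    open = kind;
    const content_block =
      kind === "text" ? { type: "text", text: "" } : { type: "thinking", thinking: "" };
    out.push(sseFrame("content_block_start", { type: "content_block_start", index, content_block }));
  };

  for (const part of parts) {
    if (part.type === "text-delta") {
      ensure("text");
      const delta = { type: "text_delta", text: part.text };
      out.push(sseFrame("content_block_delta", { type: "content_block_delta", index, delta }));
    } else if (part.type === "reasoning-delta") {
      ensure("thinking");
      const delta = { type: "thinking_delta", thinking: part.text };
      out.push(sseFrame("content_block_delta", { type: "content_block_delta", index, delta }));
    } else {
      close();
      out.push(sseFrame("message_stop", { type: "message_stop" }));
    }
  }
  return out;
}
```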
136
+ ### thought_signature round-trip
137
+
138
+ Reasoning models (especially Gemini) require their `thought_signature` echoed back verbatim on the next turn, but the Anthropic wire format has no field for it. rflectr smuggles it through the tool-use id:
139
+
140
+ - **Encode** — `encodeToolUseId(rawId, signature)` (`src/proxy-shared.ts:93`) base64url-encodes the signature and appends it after a `__ts__` separator: `{rawId}__ts__{base64url(signature)}`. When emitting blocks, the adapter calls it at `tool-input-start` / `tool-call` with the signature from `grabRoundTripSignature` (`src/sdk-adapter.ts:352`, `:371`).
141
+ - **Decode** — `splitToolUseId(id)` (`src/proxy-shared.ts:72`) recovers `{ rawId, thoughtSignature }`. On the next request, `translateMessages` decodes it and, for Google, sets `providerOptions.google.thoughtSignature` on the `tool-call` part (`src/sdk-adapter.ts:190`); the SDK then handles Gemini's strict echo-back.
142
+
143
+ `grabRoundTripSignature` (`src/proxy-shared.ts:22`) reads the signature from provider metadata — `google.thoughtSignature` / `thought_signature` for Gemini, `openai.reasoningEncryptedContent` for OpenAI Responses. On the Gemini host direction, `partThoughtSignature` (`src/gemini-parts.ts:8`) pulls it off the function-call part and `parseGeminiPart` re-encodes it into the tool-use id (`:30`).
144
+
145
+ > **Separator note:** the live separator is `__ts__` with a base64url-encoded payload (`src/proxy-shared.ts:38`, `:93`). `::ts::` is retained only as a legacy decode fallback for sessions started before the change (`:82`). The knowledge doc still describes the legacy `::ts::` form.
146
+
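A minimal sketch of the id encoding described above, assuming the signature can be treated as a UTF-8 string and that the first occurrence of the separator marks the split point (neither detail is stated in the source). The `*Sketch` function names are hypothetical stand-ins, not the real `encodeToolUseId` / `splitToolUseId`.

```typescript
import { Buffer } from "node:buffer";

const SEPARATOR = "__ts__";
const LEGACY_SEPARATOR = "::ts::";

// Appends the signature to the raw id as {rawId}__ts__{base64url(signature)}.
function encodeToolUseIdSketch(rawId: string, signature?: string): string {
  if (!signature) return rawId;
  const payload = Buffer.from(signature, "utf8").toString("base64url");
  return `${rawId}${SEPARATOR}${payload}`;
}

// Recovers the raw id and, when present, the signature.
function splitToolUseIdSketch(id: string): { rawId: string; thoughtSignature?: string } {
  const at = id.indexOf(SEPARATOR);
  if (at !== -1) {
    const payload = id.slice(at + SEPARATOR.length);
    return {
      rawId: id.slice(0, at),
      thoughtSignature: Buffer.from(payload, "base64url").toString("utf8"),
    };
  }
  // Legacy fallback: the signature was appended raw, without base64url encoding.
  const legacyAt = id.indexOf(LEGACY_SEPARATOR);
  if (legacyAt !== -1) {
    return {
      rawId: id.slice(0, legacyAt),
      thoughtSignature: id.slice(legacyAt + LEGACY_SEPARATOR.length),
    };
  }
  return { rawId: id };
}
```

Because base64url uses only `A-Z`, `a-z`, `0-9`, `-`, and `_`, the encoded payload can never contain `::ts::`, which is one reason the encoded form is safer than the legacy raw form.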
147
+ ### Reusing the factory for the other host directions
148
+
149
+ The same `createLanguageModel` + `streamText`/`generateText` underpins all hosts; only the host-facing translation differs:
150
+
151
+ - **OpenAI Chat Completions** (`src/openai-adapter.ts`): `translateOpenAiRequest` (`:38`) builds `SdkCallParams` from an OpenAI body; `generateOpenAiResponse` (`:131`) / `streamOpenAiResponse` (`:161`) emit the OpenAI JSON / SSE shape.
152
+ - **Codex Responses API** (`src/codex-responses-adapter.ts`): `translateResponsesRequest` (`:243`) / `translateResponsesInput` (`:166`) / `translateResponsesTools` (`:230`) build SDK params from a Responses body; `streamResponsesResponse` (`:575`) / `generateResponsesResponse` (`:593`) emit Responses SSE/JSON.
153
+ - **Gemini REST** (`src/gemini-parts.ts`): `parseGeminiPart` (`:16`), `collectAnthropicBlocksFromGeminiParts` (`:45`), and `mapGeminiUsage` (`:76`) translate Gemini parts and usage.
154
+
155
+ ## Acceptance Criteria
156
+
157
+ - [x] All non-Anthropic providers route through the Vercel AI SDK; `@ai-sdk/anthropic` is the only npm that bypasses it (`isSdkMigratedNpm`, `src/provider-factory.ts:68`).
158
+ - [x] `createLanguageModel` resolves an SDK `LanguageModel` from `{npm, modelId, apiKey, baseURL}` via dynamic `import(npm)` + `create*` discovery (`src/provider-factory.ts:102`, `:72`, `:81`).
159
+ - [x] Special factory branches exist for Vertex Anthropic, OpenAI (OAuth Codex backend + Responses/chat), xAI, Google `v1beta`, Anthropic `/v1` normalization, openai-compatible, and OpenRouter (`src/provider-factory.ts:105`–`:174`).
160
+ - [x] `modelPrefersResponsesApi` returns true for GPT-5.4+/5.5/pro, `*-codex`, the o-series, and xAI multi-agent models (`src/provider-factory.ts:34`).
161
+ - [x] `translateRequest` maps messages, tools, `tool_choice`, and system into `SdkCallParams` (`src/sdk-adapter.ts:222`).
162
+ - [x] Inline `role:'system'` messages are folded into the system prompt rather than dropped (`src/sdk-adapter.ts:90`, `:232`).
163
+ - [x] `streamAnthropicResponse` maps the SDK `fullStream` to Anthropic SSE and `generateAnthropicResponse` handles the non-streaming case (`src/sdk-adapter.ts:413`, `:431`).
164
+ - [x] `thought_signature` is encoded into the tool-use id and decoded back into `providerOptions.google.thoughtSignature` (`src/proxy-shared.ts:93`, `:72`; `src/sdk-adapter.ts:190`).
165
+ - [x] Per-provider reasoning effort/thinking is translated via `getReasoningCapabilities` / `effortProviderOptions` / `thinkingProviderOptions` / `deepMergeProviderOptions` (`src/provider-factory.ts:469`, `:633`, `:741`, `:725`).
166
+ - [x] The Codex Responses and OpenAI/Gemini host directions reuse the same factory + SDK model (`src/codex-responses-adapter.ts:243`, `src/openai-adapter.ts:38`, `src/gemini-parts.ts:16`).
167
+ - [x] SDK provider packages load on demand and stay `external` so `dist/cli.js` remains small (`src/provider-factory.ts:81`; `tsup.config.ts`).
168
+
169
+ ## Files
170
+
171
+ | File | Role |
172
+ |------|------|
173
+ | `src/sdk-adapter.ts` | Anthropic `/v1/messages` ↔ SDK; `translateRequest`, `translateMessages`, `streamAnthropicResponse`, `generateAnthropicResponse`, inline-system folding |
174
+ | `src/provider-factory.ts` | `createLanguageModel`, `isSdkMigratedNpm`, `modelPrefersResponsesApi`, reasoning capability + effort translation |
175
+ | `src/openai-adapter.ts` | OpenAI Chat Completions host direction (`translateOpenAiRequest`, `generate`/`streamOpenAiResponse`) |
176
+ | `src/gemini-parts.ts` | Gemini content-part → Anthropic block translation + usage mapping |
177
+ | `src/codex-responses-adapter.ts` | Codex Responses API host direction (translation aspects) |
178
+ | `src/proxy-shared.ts` | `encodeToolUseId` / `splitToolUseId` (thought_signature round-trip), `grabRoundTripSignature`, SSE helpers |
179
+ | `tests/sdk-adapter.test.ts`, `tests/provider-factory.test.ts` | Unit coverage for the pure translation functions |
180
+
181
+ ## Risks & Known Limitations
182
+
183
+ - **`thought_signature` separator collision.** The signature is appended after a separator in the tool-use id. The (live) `__ts__` separator carries a base64url payload, so a collision would require the encoded payload to itself contain `__ts__` — extremely unlikely. The legacy `::ts::` form (kept as a decode fallback) would break only if a raw signature literally contained `::ts::` (`src/proxy-shared.ts:38`, `:82`).
184
+ - **Gemini strict echo-back.** Gemini rejects requests that don't echo `thought_signature` verbatim. This is why the hand-rolled Gemini-native path was retired — the SDK handles the echo-back once the signature round-trips (`src/sdk-adapter.ts:190`, `src/gemini-parts.ts:8`).
185
+ - **Cost display inaccuracy.** Claude Code applies its own pricing table, so the cost it displays for non-Anthropic models does not reflect actual provider pricing (documented, by design).
186
+ - **One turn per request.** The adapter never loops; if a host expected the adapter to drive a tool loop it would break. Claude Code owns the loop (`src/sdk-adapter.ts:3`).
187
+ - **`@ai-sdk/github-copilot` is unsupported.** OpenCode loads it from internal `@opencode-ai/core`, not a public npm factory the dynamic `import(npm)` can resolve (documented limitation).
188
+ - **Provider-specific reasoning mappings are heuristic.** Effort levels are mapped per provider (e.g. xAI has no `medium`; DeepSeek `low/medium`→`high`), so a requested effort may snap to the nearest valid value (`src/provider-factory.ts:424`, `:359`).
189
+
190
+ ## Related
191
+
192
+ - [`../../../knowledge/private/ai/translation-layer.md`](../../../knowledge/private/ai/translation-layer.md) — the knowledge doc this PRD documents.
193
+ - [`../prd-003-model-discovery-classification/prd-003-model-discovery-classification-index.md`](../prd-003-model-discovery-classification/prd-003-model-discovery-classification-index.md) — classification supplies the `npm` and `modelFormat` that drive factory routing and the `isSdkMigratedNpm` gate.
194
+ - [`../prd-005-local-proxy-catalog-routing/prd-005-local-proxy-catalog-routing-index.md`](../prd-005-local-proxy-catalog-routing/prd-005-local-proxy-catalog-routing-index.md) — the local proxy dispatches Anthropic-format requests into this adapter (or raw passthrough for `@ai-sdk/anthropic`).
195
+ - [`../prd-009-codex-integration/prd-009-codex-integration-index.md`](../prd-009-codex-integration/prd-009-codex-integration-index.md) — Codex reuses the factory + SDK model via the Responses host direction.
196
+ - [`../prd-010-gemini-cli-integration/prd-010-gemini-cli-integration-index.md`](../prd-010-gemini-cli-integration/prd-010-gemini-cli-integration-index.md) — Gemini CLI reuses the factory via the Gemini REST host direction.
@@ -0,0 +1,176 @@
1
+ # PRD-005: Local Proxy & Catalog Routing *(Retroactive)*
2
+
3
+ > **Status:** Shipped
4
+ > **Priority:** —
5
+ > **Effort:** —
6
+ > **Written:** June 2026
7
+ > **Retroactive:** Yes — written after implementation (rflectr v0.2.7).
8
+ > **Source:** `src/proxy.ts`, `src/catalog.ts`, `src/proxy-shared.ts`, `src/upstream-forward.ts`
9
+
10
+ ---
11
+
12
+ ## Overview
13
+
14
+ rflectr re-points a host tool (Claude Code) at an alternative model backend by setting `ANTHROPIC_BASE_URL` to the address of a throwaway HTTP server it spins up on `127.0.0.1:<random ephemeral port>`. This **local proxy** accepts requests in Anthropic's wire format (`POST /v1/messages`, `GET /v1/models`) and, per request, either forwards them raw to a provider that already speaks Anthropic, or hands them to the Vercel AI SDK adapter (PRD-004) for any other provider.
15
+
16
+ The proxy is created at launch and torn down when the host process exits. It exists in two shapes: a **single-model** wrapper (`startProxy`) for an ordinary launch, and a **multi-route catalog** (`startProxyCatalog`) for switch-menu sessions where Claude Code's `/model` picker can hop between a starting model and the user's favorites. Catalog routing depends on a small set of pure route-builder functions in `src/catalog.ts` and an alias scheme (`aliasModelId`) that rewrites non-`claude-*` model ids into a form Claude Code's gateway model discovery will accept.
17
+
18
+ This PRD documents the proxy server, its request-dispatch model, the `ProxyRoute` carrier type, the synthetic model catalog, the alias scheme, the catalog route builders, and the shared upstream-forwarding helpers.
19
+
20
+ ## What Was Built
21
+
22
+ - A local HTTP server bound to `127.0.0.1` on an OS-chosen ephemeral port (`server.listen(0, '127.0.0.1', …)`, `src/proxy.ts:294`) that serves `HEAD /`, `GET /v1/models`, `GET /v1/models/:id`, and `POST /v1/messages`.
23
+ - A per-request dispatch that resolves a `ProxyRoute` by model id and branches on `route.modelFormat`: `anthropic` → raw passthrough; otherwise → SDK adapter (`src/proxy.ts:211`, `src/proxy.ts:230`).
24
+ - Two entry points: `startProxyCatalog(routes, defaultAliasId, debug)` (`src/proxy.ts:112`) and the single-model wrapper `startProxy(completionsUrl, modelId, debug, contextWindow?, sdk?, apiKey?)` (`src/proxy.ts:315`), which builds a one-route catalog.
25
+ - A `ProxyHandle` (`{ port, token, close() }`) whose `token` (a `randomUUID()`) becomes the child's `ANTHROPIC_API_KEY`, so only the launched child can call the proxy (`src/proxy.ts:117`, `src/proxy.ts:177`).
26
+ - The `aliasModelId(realId, providerId)` rewrite that makes non-`claude-*` ids gateway-discovery-safe as `anthropic-{provider}__{id}` (`src/proxy.ts:96`).
27
+ - A synthetic `GET /v1/models` catalog, one entry per route, each carrying `context_window` via `formatAnthropicModelEntry` / `formatAnthropicModelList` (`src/proxy.ts:138`, `src/proxy.ts:162`; `src/server/models.ts:58`).
28
+ - Catalog route builders in `src/catalog.ts`: `localModelToRoute`, `zenGoModelToRoute`, `makeRouteResolver`, and `buildCatalogRoutes` (capped at `MAX_MODEL_CATALOG = 20`, `src/constants.ts:51`).
29
+ - Shared upstream forwarding in `src/upstream-forward.ts` (`relayAnthropicMessages`, `postJsonUpstream`, `anthropicUpstreamHeaders`, `UpstreamUnreachableError`), reused by both this proxy and the `server` command's router (PRD-012).
30
+ - Format-agnostic glue in `src/proxy-shared.ts` (`sseChunk`, `encodeToolUseId`/`splitToolUseId`, `grabRoundTripSignature`, `silenceSdkWarnings`, …) and request/response shapes in `src/proxy-types.ts`.
31
+
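The ephemeral-port server and per-session token from the list above can be sketched with Node's built-in `http` module. This is an illustration under assumptions: `startSketchProxy` and `extractApiKeySketch` are hypothetical names, the error body shape is invented, and request dispatch is replaced by a `501` placeholder.

```typescript
import { createServer, type IncomingHttpHeaders } from "node:http";
import { randomUUID } from "node:crypto";

// Pulls the caller's key from `x-api-key` or an `Authorization: Bearer ...` header.
function extractApiKeySketch(headers: IncomingHttpHeaders): string | undefined {
  const direct = headers["x-api-key"];
  if (typeof direct === "string" && direct.length > 0) return direct;
  const auth = headers.authorization;
  if (typeof auth === "string" && auth.startsWith("Bearer ")) {
    return auth.slice("Bearer ".length);
  }
  return undefined;
}

interface ProxyHandleSketch {
  port: number;
  token: string;
  close(): void;
}

function startSketchProxy(): Promise<ProxyHandleSketch> {
  const token = randomUUID(); // handed to the child as its API key
  const server = createServer((req, res) => {
    if (req.method === "HEAD" && req.url === "/") {
      res.writeHead(200).end(); // startup ping
      return;
    }
    if (req.method === "POST" && req.url === "/v1/messages") {
      if (extractApiKeySketch(req.headers) !== token) {
        res.writeHead(401, { "content-type": "application/json" });
        res.end(JSON.stringify({ error: "Invalid proxy token" }));
        return;
      }
      res.writeHead(501).end(); // route dispatch omitted in this sketch
      return;
    }
    res.writeHead(404).end();
  });

  return new Promise((resolve, reject) => {
    server.once("error", reject);
    // Port 0 asks the OS for a free ephemeral port; binding 127.0.0.1 keeps it local.
    server.listen(0, "127.0.0.1", () => {
      const address = server.address();
      if (address === null || typeof address === "string") {
        reject(new Error("unexpected listen address"));
        return;
      }
      resolve({ port: address.port, token, close: () => server.close() });
    });
  });
}
```

A launcher would then set the child's `ANTHROPIC_BASE_URL` to `http://127.0.0.1:${port}` and its `ANTHROPIC_API_KEY` to `token`, and call `close()` when the child exits.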
32
+ ## Goals
33
+
34
+ - Let the host tool talk to any registered backend without modifying `settings.json` — env-var-only, child-process-scoped (see PRD-001).
35
+ - Translate only when necessary: a provider that already speaks Anthropic gets a raw byte-for-byte relay; everything else goes through the single SDK translation path (PRD-004).
36
+ - Support mid-session model switching by advertising a multi-model catalog the host can pick from, while keeping each model's real upstream id and key hidden behind a stable alias.
37
+ - Report accurate context windows to the host's status bar in single-model launches.
38
+ - Lock the proxy to the launched child via a per-session token.
39
+
40
+ ## Non-Goals
41
+
42
+ - **Translation internals.** Wire-format mapping, endpoint selection, and provider quirks belong to the SDK adapter and provider factory (PRD-004). The proxy only chooses *which* path to dispatch to.
43
+ - **The Codex (`/v1/responses`) and Gemini (`/v1beta/...`) proxies.** Those are sibling servers that share `proxy-shared.ts` but expose different endpoints — see PRD-009 and PRD-010.
44
+ - **The standalone `server` gateway.** That long-lived multi-provider gateway reuses `upstream-forward.ts` but is its own surface (PRD-012).
45
+ - **Favorites collection / persistence.** How favorites are chosen and stored is PRD-008; this PRD consumes a `FavoriteModel[]` to build routes.
46
+ - **Live-switch context-window accuracy.** In gateway-discovery (switch-menu) mode the host fetches `/v1/models` once at startup; the displayed window reflects the launch model only (see Risks).
47
+
48
+ ## Features
49
+
50
+ | Feature | Description | Source |
51
+ | --- | --- | --- |
52
+ | Ephemeral local server | Binds `127.0.0.1:0`; OS picks the port; returned in `ProxyHandle.port`. | `src/proxy.ts:294` |
53
+ | Per-session token auth | `POST /v1/messages` requires `x-api-key`/`Bearer` == the proxy's `randomUUID()` token, else 401. | `src/proxy.ts:117`, `src/proxy.ts:177` |
54
+ | Health-check ping | `HEAD /` → 200 (Claude Code startup ping). | `src/proxy.ts:148` |
55
+ | Synthetic model list | `GET /v1/models` returns one entry per route with `context_window`; `GET /v1/models/:id` returns a single entry or 404. | `src/proxy.ts:155` |
56
+ | Anthropic passthrough | `modelFormat === 'anthropic'` → `relayAnthropicMessages` to `{baseUrl}/v1/messages`, forwarding `anthropic-beta`. | `src/proxy.ts:211` |
57
+ | SDK-backed dispatch | `isSdkMigratedNpm(route.npm)` → `createLanguageModel` + `streamAnthropicResponse`/`generateAnthropicResponse`. | `src/proxy.ts:230` |
58
+ | Streaming + non-streaming | Honors `body.stream`; streams Anthropic SSE or returns JSON. | `src/proxy.ts:255` |
59
+ | `aliasModelId` | Rewrites non-`claude-*` ids to `anthropic-{providerId}__{id}` for gateway discovery; `claude-*` pass through. | `src/proxy.ts:96` |
60
+ | Alias-tolerant route lookup | `routeLookupIds` resolves prefix/suffix/`models/`-prefixed variants to the same route. | `src/proxy.ts:103` |
61
+ | Single-model wrapper | `startProxy` builds a one-route catalog from a completions URL + optional `sdk` carrier. | `src/proxy.ts:315` |
62
+ | Catalog route builders | `localModelToRoute`, `zenGoModelToRoute`, `makeRouteResolver`, `buildCatalogRoutes` (cap 20, dedup vs. starting route). | `src/catalog.ts:11`–`100` |
63
+ | Shared upstream forwarding | `relayAnthropicMessages`, `postJsonUpstream`, `anthropicUpstreamHeaders`, `UpstreamUnreachableError`. | `src/upstream-forward.ts` |
64
+ | Trace logging | When `debug`, redacted secure log via `appendSecureLog` (0600). | `src/proxy.ts:24`, `src/proxy.ts:40` |
65
+
66
+ ## Architecture & Implementation
67
+
68
+ ### Request dispatch flow
69
+
70
+ ```mermaid
71
+ flowchart TD
72
+ req["POST /v1/messages (Anthropic format)"]
73
+ req --> auth{"x-api-key == proxy token?"}
74
+ auth -->|no| e401["401 Invalid proxy token"]
75
+ auth -->|yes| lookup["lookupRoute(byAlias, body.model)\nfall back to defaultRoute"]
76
+ lookup --> fmt{"route.modelFormat"}
77
+ fmt -->|anthropic| fwd["relayAnthropicMessages()\n→ {upstreamUrl}/v1/messages (raw)"]
78
+ fmt -->|"openai (else)"| sdkguard{"isSdkMigratedNpm(route.npm)"}
79
+ sdkguard -->|true| adapter["createLanguageModel +\nstream/generateAnthropicResponse"]
80
+ sdkguard -->|false| e500["500 No SDK provider configured"]
81
+ fwd --> resp["Anthropic SSE / JSON to host"]
82
+ adapter --> resp
83
+ ```
84
+
85
+ - **Token gate.** `extractApiKey(req)` (from `x-api-key` or `Bearer`) must equal `proxyToken`, else `401 Invalid proxy token` (`src/proxy.ts:176`). The token is a `randomUUID()` generated per proxy (`src/proxy.ts:117`) and handed to the child as its `ANTHROPIC_API_KEY`.
86
+ - **Route resolution.** `const route = lookupRoute(byAlias, originalModel) ?? defaultRoute` (`src/proxy.ts:195`). `lookupRoute` tries each id produced by `routeLookupIds` (`src/proxy.ts:103`), which strips the `[1m]` context suffix and handles a leading `models/` prefix so Claude Code's id variants resolve to one route.
87
+ - **Anthropic passthrough.** The raw Anthropic body (with `model` swapped to `route.realModelId`) is relayed to `${upstreamUrl}/v1/messages`, forwarding the inbound `anthropic-beta` header (`src/proxy.ts:211`–`224`). Failures surface as `502` and, for network errors, `UpstreamUnreachableError` (`src/upstream-forward.ts:45`).
88
+ - **SDK path.** `sdkTranslateRequest(body, route.npm, …)` builds SDK params, `createLanguageModel({ npm, modelId, apiKey, baseURL, … })` resolves the provider, then `streamAnthropicResponse`/`generateAnthropicResponse` map to Anthropic output (`src/proxy.ts:230`–`281`). A non-`anthropic` route with no SDK-migrated npm is a misconfiguration → `500` (`src/proxy.ts:284`).
89
+ - **Body decoding.** `readBody` honors `Content-Encoding` (gzip/deflate/br/zstd) and caps the body at 50 MB (`src/http-utils.ts:34`).
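
The dispatch order described in the bullets above can be sketched as a pure decision function. This is a minimal illustration, not the code in `src/proxy.ts`: `dispatchSketch`, `RouteSketch`, and `DispatchSketch` are hypothetical names, the route lookup is reduced to a plain `Map` lookup (the real `lookupRoute` also tries id variants), and the position of the empty-key check is inferred from the cited line numbers.

```typescript
// Hypothetical types for illustration; the real ProxyRoute carries more fields.
interface RouteSketch {
  aliasId: string;
  realModelId: string;
  upstreamUrl: string;
  apiKey: string;
  modelFormat: "anthropic" | "openai";
  npm?: string;
}

type DispatchSketch =
  | { kind: "error"; status: 401 | 500; message: string }
  | { kind: "anthropic-relay"; url: string; model: string }
  | { kind: "sdk-adapter"; npm: string; model: string };

function dispatchSketch(
  presentedKey: string | undefined,
  proxyToken: string,
  requestedModel: string,
  byAlias: Map<string, RouteSketch>,
  defaultRoute: RouteSketch,
  isSdkMigratedNpm: (npm: string) => boolean,
): DispatchSketch {
  // 1. Token gate: the presented key must equal the per-proxy token.
  if (presentedKey !== proxyToken) {
    return { kind: "error", status: 401, message: "Invalid proxy token" };
  }
  // 2. Route resolution: fall back to the default route on a miss.
  const route = byAlias.get(requestedModel) ?? defaultRoute;
  // 3. A route with an empty per-route key cannot be served.
  if (route.apiKey === "") {
    return { kind: "error", status: 401, message: "Missing API key" };
  }
  // 4. Anthropic-format routes are relayed raw; the relay appends /v1/messages.
  if (route.modelFormat === "anthropic") {
    return {
      kind: "anthropic-relay",
      url: `${route.upstreamUrl}/v1/messages`,
      model: route.realModelId,
    };
  }
  // 5. Everything else needs an SDK-migrated npm package, else it is a misconfiguration.
  if (route.npm !== undefined && isSdkMigratedNpm(route.npm)) {
    return { kind: "sdk-adapter", npm: route.npm, model: route.realModelId };
  }
  return { kind: "error", status: 500, message: "No SDK provider configured" };
}
```

Returning a tagged outcome instead of writing to the response keeps the ordering of the checks easy to test in isolation.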
90
+
91
+ ### The `ProxyRoute` carrier
92
+
93
+ Each route is self-contained — it carries everything needed to serve a request (`src/proxy.ts:72`): `aliasId` (advertised id), `realModelId` (sent upstream), `displayName`, `upstreamUrl`, `apiKey` (per-route; empty → 401), `modelFormat`, `contextWindow`, and the SDK/provider fields `npm`, `baseURL`, `providerId`, `authType`, `oauthAccountId`, `supportedParameters`, `reasoning`, `interleavedReasoningField`. `upstreamUrl` is a full chat-completions URL for openai-format routes, or a base URL **without** `/v1` for anthropic routes (the relay appends `/v1/messages`).
94
+
95
+ ### Route resolution & catalog assembly (`src/catalog.ts`)
96
+
97
+ - `localModelToRoute(lp, model)` (`src/catalog.ts:11`) maps a discovered local-provider model to a `ProxyRoute`, returning `null` for unserveable models (anthropic without `baseUrl`; openai without an SDK npm and without a `completionsUrl`).
98
+ - `zenGoModelToRoute(model, apiKey)` (`src/catalog.ts:33`) maps a Zen/Go cloud model; `unsupported` formats return `null`. openai-format Zen/Go models route through `@ai-sdk/openai-compatible` with `baseURL = ${backend.baseUrl}/v1`; anthropic-format stay direct passthrough (no `npm`).
99
+ - `makeRouteResolver(localProviders, zenModels, goModels, zenGoApiKey)` (`src/catalog.ts:53`) returns a `(providerId, modelId) => ProxyRoute | undefined` closure that dispatches `zen`/`go` to the cloud builder and anything else to the local builder.
100
+ - `buildCatalogRoutes(startingRoute, favorites, resolveRoute, max = 20)` (`src/catalog.ts:81`) resolves each favorite, dedupes against the starting route's `aliasId`, caps at `MAX_MODEL_CATALOG`, and reports `droppedFavorites` (stale/unresolvable). The starting route is always first.
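
As a rough sketch of the assembly rules listed above (starting route first, dedupe, cap, report drops), under stated assumptions: `buildCatalogRoutesSketch`, `CatalogRouteSketch`, and `FavoriteSketch` are hypothetical names, the favorite shape is a guess, and this version dedupes against every alias already in the catalog rather than only the starting route's.

```typescript
// Hypothetical shapes; the real ProxyRoute and favorite records carry more fields.
interface CatalogRouteSketch {
  aliasId: string;
  realModelId: string;
}

interface FavoriteSketch {
  providerId: string;
  modelId: string;
}

function buildCatalogRoutesSketch(
  startingRoute: CatalogRouteSketch,
  favorites: FavoriteSketch[],
  resolveRoute: (providerId: string, modelId: string) => CatalogRouteSketch | undefined,
  max = 20,
): { routes: CatalogRouteSketch[]; droppedFavorites: FavoriteSketch[] } {
  // The starting route is always first in the catalog.
  const routes: CatalogRouteSketch[] = [startingRoute];
  const seen = new Set<string>([startingRoute.aliasId]);
  const droppedFavorites: FavoriteSketch[] = [];

  for (const favorite of favorites) {
    const route = resolveRoute(favorite.providerId, favorite.modelId);
    if (route === undefined) {
      // Stale or unresolvable favorite.
      droppedFavorites.push(favorite);
      continue;
    }
    if (seen.has(route.aliasId)) {
      // Duplicate of a route already in the catalog: skipped, not reported.
      continue;
    }
    if (routes.length >= max) {
      // Beyond the catalog cap.
      droppedFavorites.push(favorite);
      continue;
    }
    seen.add(route.aliasId);
    routes.push(route);
  }
  return { routes, droppedFavorites };
}
```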
101
+
102
+ Both builders run the model id and `contextWindow` through `claudeCodeClientModelId(aliasModelId(id, providerId), window)` so the advertised alias is gateway-safe and carries the `[1m]` suffix when the window exceeds the default.
103
+
104
+ ### Alias scheme
105
+
106
+ `aliasModelId(realId, providerId)` (`src/proxy.ts:96`) leaves `claude-*` ids unchanged and rewrites everything else to `anthropic-{slug}__{realId}`, where `slug` is the provider id lowercased and non-alphanumerics collapsed to `-`. Using the stable provider **id** (not display name) means renaming a provider does not break the alias. Claude Code's gateway model discovery only surfaces ids beginning `claude` or `anthropic`; this rewrite is what makes a third-party model selectable in the `/model` picker. (A side effect: after a switch-menu session a bare `claude` may still show a relay alias, because Claude Code caches the gateway id.)
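
The rewrite rule can be illustrated with a small sketch. `aliasModelIdSketch` and `slugifyProviderIdSketch` are hypothetical stand-ins for the real function in `src/proxy.ts`; in particular, whether the real slug trims leading or trailing dashes is not stated, so this version does not.

```typescript
// Lowercase the provider id and collapse each run of non-alphanumerics to "-".
function slugifyProviderIdSketch(providerId: string): string {
  return providerId.toLowerCase().replace(/[^a-z0-9]+/g, "-");
}

// Leave claude-* ids unchanged; rewrite everything else so the id begins with "anthropic".
function aliasModelIdSketch(realId: string, providerId: string): string {
  if (realId.startsWith("claude-")) {
    return realId;
  }
  return `anthropic-${slugifyProviderIdSketch(providerId)}__${realId}`;
}

// Example: a provider id of "opencode-go" and a model id of "deepseek-v4-flash".
const exampleAlias = aliasModelIdSketch("deepseek-v4-flash", "opencode-go");
```

Because the slug comes from the provider id rather than the display name, the same input pair always yields the same alias even after the provider is renamed.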
107
+
108
+ ### Single-model vs. catalog launch
109
+
110
+ `startProxy` (`src/proxy.ts:315`) is a thin wrapper that constructs one `ProxyRoute` from a completions URL plus an optional `sdk` carrier (`{ npm, baseURL, upstreamModelId, providerId, authType, … }`) and an `apiKey`, then calls `startProxyCatalog` with that single route as the default. Switch-menu launches instead call `buildCatalogRoutes` first and pass the full route array to `startProxyCatalog` (consumed by PRD-001 / PRD-008).
111
+
112
+ ## API Surface
113
+
114
+ The proxy listens on `http://127.0.0.1:<ProxyHandle.port>`.
115
+
116
+ ### `HEAD /`
117
+ Health-check ping → `200`, empty body (`src/proxy.ts:148`).
118
+
119
+ ### `GET /v1/models`
120
+ Returns the synthetic catalog: `{ data: [...], has_more: false, first_id, last_id }`, one entry per route, each with `context_window` resolved by `resolveContextWindow` (`src/proxy.ts:138`, `src/server/models.ts:89`). No auth required.
121
+
122
+ ### `GET /v1/models/:id`
123
+ Returns a single formatted entry for the resolved route, or `404 not_found_error` if the id matches no route (`src/proxy.ts:156`–`166`).
124
+
125
+ ### `POST /v1/messages`
126
+ The main translation path. Requires the proxy token (`401` otherwise). Body is Anthropic `messages` format. Response is Anthropic SSE when `body.stream` is truthy, else Anthropic JSON. Error envelope is always `{ type: 'error', error: { type, message } }` — `400` invalid JSON, `401` bad/missing key, `500` misconfigured route, `502` upstream failure (`src/proxy.ts:175`–`285`).
127
+
128
+ Any other method/path → `404 Unknown endpoint` (`src/proxy.ts:289`).
129
+
130
+ ## Acceptance Criteria
131
+
132
+ - [x] Proxy binds `127.0.0.1` on an OS-chosen ephemeral port and returns it in `ProxyHandle.port` (`src/proxy.ts:294`).
133
+ - [x] `POST /v1/messages` rejects requests whose key ≠ the per-session token with `401` (`src/proxy.ts:176`).
134
+ - [x] `HEAD /` returns `200` for the host's startup health check (`src/proxy.ts:148`).
135
+ - [x] `GET /v1/models` returns one entry per route with a `context_window` field (`src/proxy.ts:138`, `src/server/models.ts:58`).
136
+ - [x] `GET /v1/models/:id` returns the matching entry or `404` (`src/proxy.ts:156`).
137
+ - [x] `modelFormat === 'anthropic'` routes relay raw to `{upstreamUrl}/v1/messages`, forwarding `anthropic-beta` (`src/proxy.ts:211`).
138
+ - [x] Non-anthropic routes with an SDK-migrated `npm` dispatch through `createLanguageModel` + the SDK adapter (`src/proxy.ts:230`).
139
+ - [x] A non-anthropic route without a registered SDK npm returns `500` (`src/proxy.ts:284`).
140
+ - [x] Streaming is honored via `body.stream`; SSE for streaming, JSON otherwise (`src/proxy.ts:255`).
141
+ - [x] `aliasModelId` leaves `claude-*` unchanged and rewrites others to `anthropic-{slug}__{realId}`, where `slug` is derived from the provider id (`src/proxy.ts:96`).
142
+ - [x] `startProxy` is a single-route wrapper around `startProxyCatalog` (`src/proxy.ts:315`).
143
+ - [x] `buildCatalogRoutes` dedupes against the starting route, caps at `MAX_MODEL_CATALOG` (20), and reports dropped favorites (`src/catalog.ts:81`).
144
+ - [x] `localModelToRoute` / `zenGoModelToRoute` return `null` for unserveable / `unsupported` models (`src/catalog.ts:12`, `src/catalog.ts:34`).
145
+ - [x] `relayAnthropicMessages` distinguishes a network failure (`UpstreamUnreachableError`) from an upstream error response (`src/upstream-forward.ts:45`, `src/upstream-forward.ts:72`).
146
+ - [x] Upstream forwarding is shared with the `server` router via `src/upstream-forward.ts` (PRD-012).
147
+
148
+ ## Files
149
+
150
+ | File | Role |
151
+ | --- | --- |
152
+ | `src/proxy.ts` | The Anthropic-facing proxy: `startProxyCatalog`, `startProxy`, `aliasModelId`, `ProxyRoute`, `ProxyHandle`, request dispatch, synthetic `/v1/models`, token auth, trace logging. |
153
+ | `src/catalog.ts` | Route builders: `localModelToRoute`, `zenGoModelToRoute`, `makeRouteResolver`, `buildCatalogRoutes`. |
154
+ | `src/proxy-shared.ts` | Format-agnostic glue shared across the Anthropic/Responses/Gemini proxies (SSE, tool-use id round-trip, signature grab, SDK-warning silencer). |
155
+ | `src/proxy-types.ts` | Anthropic/Gemini/OpenAI request & response shapes. |
156
+ | `src/upstream-forward.ts` | Shared upstream forwarding: `relayAnthropicMessages`, `postJsonUpstream`, Anthropic header helpers, `UpstreamUnreachableError`. |
157
+ | `src/http-utils.ts` | `readBody` (Content-Encoding decode + 50 MB cap), `extractApiKey`, `sendJson`. |
158
+ | `src/server/models.ts` | `formatAnthropicModelEntry` / `formatAnthropicModelList` / `resolveContextWindow` (consumed for the synthetic catalog). |
159
+ | `src/context-model-id.ts` | `claudeCodeClientModelId`, `routeLookupIds`, `stripOneMContextSuffix` (alias id derivation + lookup tolerance). |
160
+ | `src/constants.ts` | `MAX_MODEL_CATALOG = 20`. |
161
+
162
+ ## Risks & Known Limitations
163
+
164
+ - **Switch-menu context window reflects the launch model only.** In gateway-discovery mode Claude Code fetches `/v1/models` once at startup and its discovery payload carries no `context_window`; only `CLAUDE_CODE_MAX_CONTEXT_TOKENS` (fixed at launch) drives the status bar. Single-model launches show the correct window. (Documented in CLAUDE.md / the knowledge doc.)
165
+ - **Cached gateway alias.** After a switch-menu session, a bare `claude` may show a relay alias (e.g. `anthropic-opencode-go__deepseek-v4-flash`) because Claude Code caches the gateway id at `~/.claude/cache/gateway-models.json`. Reset with `claude --model sonnet`.
166
+ - **Catalog cap.** Catalogs are capped at 20 routes (`MAX_MODEL_CATALOG`); favorites beyond the cap, or unresolvable ones, are dropped from the catalog and reported only through `droppedFavorites`.
167
+ - **Empty per-route key → 401.** A route with an empty `apiKey` returns `401 Missing API key` (`src/proxy.ts:203`); OAuth-only / placeholder-key gaps surface here.
168
+ - **`thought_signature` separator collision.** Tool-use ids encode a thought signature as `{id}__ts__{base64url}`; an id literally containing the separator would break round-tripping (extremely unlikely; legacy `::ts::` form still parsed for in-flight sessions). See PRD-004.
169
+
170
+ ## Related
171
+
172
+ - [`../../../knowledge/private/integrations/local-proxy.md`](../../../knowledge/private/integrations/local-proxy.md) — knowledge doc this PRD is grounded in.
173
+ - [`../prd-004-translation-layer/prd-004-translation-layer-index.md`](../prd-004-translation-layer/prd-004-translation-layer-index.md) — the SDK adapter the proxy dispatches non-anthropic routes to.
174
+ - [`../prd-001-cli-core-launch-orchestration/prd-001-cli-core-launch-orchestration-index.md`](../prd-001-cli-core-launch-orchestration/prd-001-cli-core-launch-orchestration-index.md) — launch flow that starts and tears down the proxy.
175
+ - [`../prd-008-preferences-tiers-favorites/prd-008-preferences-tiers-favorites-index.md`](../prd-008-preferences-tiers-favorites/prd-008-preferences-tiers-favorites-index.md) — favorites that feed `buildCatalogRoutes`.
176
+ - [`../prd-012-server-gateway/prd-012-server-gateway-index.md`](../prd-012-server-gateway/prd-012-server-gateway-index.md) — the standalone gateway that reuses `upstream-forward.ts`.
@@ -0,0 +1,190 @@
1
+ # PRD-006: Credential Storage & API Key Management *(Retroactive)*
2
+
3
+ > **Status:** Shipped
4
+ > **Priority:** —
5
+ > **Effort:** —
6
+ > **Written:** June 2026
7
+ > **Retroactive:** Yes — written after implementation (rflectr v0.2.7).
8
+ > **Source:** `src/key-setup.ts`, `src/registry/auth-broker.ts`, `src/registry/provider-auth.ts`
9
+
10
+ ---
11
+
12
+ ## Overview
13
+
14
+ `rflectr` re-points Claude Code / Codex / Gemini at alternative model backends, which means it must hold API keys and OAuth tokens for the OpenCode Zen/Go cloud backend and for every registry provider. This PRD documents the credential subsystem: where secrets live, how they are resolved at launch, the interactive key-collection flow, and the per-platform save options.
15
+
16
+ The design principle is **secrets never touch the config files**. `providers.json` and `config.json` hold only an `authRef` pointer; the actual secret lives in an environment variable, the OS keyring (via `@napi-rs/keyring`), or — only when the user opts in — a plaintext shell profile / persistent env var. The OS keyring is the default everywhere it is available, and a missing native keyring binary degrades gracefully rather than crashing because the module is loaded through a dynamic `import()` (`src/env.ts:141`, `src/env.ts:155`).
17
+
18
+ This is the security-critical surface of the project. The canonical narrative lives in the knowledge base at [`../../../knowledge/private/security/credential-storage.md`](../../../knowledge/private/security/credential-storage.md).
19
+
20
+ ---
21
+
22
+ ## What Was Built
23
+
24
+ - A cross-platform OS credential store backed by `@napi-rs/keyring` (service `rflectr`), with `getPassword` / `setPassword` / `deletePassword` wrappers that never throw — failures are classified into human-readable reasons by `classifyKeyringError()` (`src/env.ts:73`).
25
+ - A silent startup read: `resolveOrCollectApiKey()` first calls `resolveApiKey()` and then `readFromCredentialStore()` so that when a key already exists no prompt is shown (`src/key-setup.ts:38`, `src/key-setup.ts:58`).
26
+ - An interactive key-collection prompt for the OpenCode Zen/Go key with **platform-specific save options** (macOS / Windows / Linux desktop / Linux headless), implemented in `resolveOrCollectApiKey()` (`src/key-setup.ts:86`–`src/key-setup.ts:185`).
27
+ - Immediate session activation: `process.env['OPENCODE_API_KEY']` is set the moment a key is resolved or collected, regardless of the persistence choice (`src/key-setup.ts:62`, `src/key-setup.ts:187`).
28
+ - A layered resolution order for *provider* keys via `resolveProviderCredential(providerId, authRef)`: namespaced env var → `env:`-ref → `global:opencode` chain → per-provider keyring account (`src/env.ts:241`).
29
+ - A keyring migration protocol (read → write → verify → delete) that lifts legacy `rflectr` / `opencode-starter` entries into the canonical `global:opencode` account only after the new entry verifies (`src/env.ts:199`).
30
+ - An OAuth auth broker that delegates login to the OpenCode CLI and copies the resulting tokens into the rflectr keychain (`src/registry/auth-broker.ts:17`), plus a native-vs-broker selector (`src/registry/provider-auth.ts:151`).
31
+ - Key validation before import: `validateImportKey()` rejects placeholder/empty/invalid keys by actually probing the provider's model endpoint (`src/registry/validate-import-key.ts:30`).
32
+
33
+ ---
34
+
35
+ ## Goals
36
+
37
+ - Keep secrets out of plaintext config files; store only an `authRef` pointer in the registry.
38
+ - Default to the OS-native secure store on every platform, and degrade gracefully when it is unavailable.
39
+ - Never block startup on a missing native module — keyring failures are caught and surfaced as diagnostics, not crashes.
40
+ - Resolve a usable key at launch with a deterministic, documented priority order.
41
+ - Make the key active for the *current* session immediately, independent of where (or whether) it is persisted.
42
+ - Validate provider keys before importing them so the registry never holds a known-bad credential.
43
+
44
+ ## Non-Goals
45
+
46
+ - A custom encryption-at-rest scheme — the OS keyring is the trust anchor.
47
+ - Server-mode network authentication — the `server` command's password gate is owned by PRD-012 (`src/server/auth.ts`).
48
+ - OAuth device-flow mechanics themselves — token acquisition is PRD-007; this PRD covers only how the resulting tokens are *stored and resolved*.
49
+ - Rotating or expiring API keys on a schedule.
50
+
51
+ ---
52
+
53
+ ## Features
54
+
55
+ | # | Feature | Source |
56
+ |---|---------|--------|
57
+ | F1 | `@napi-rs/keyring` OS credential store via dynamic `import()` (graceful degrade) | `src/env.ts:139`–`src/env.ts:173` |
58
+ | F2 | Silent startup read — no prompt when a key already exists | `src/key-setup.ts:38`, `src/key-setup.ts:58` |
59
+ | F3 | Per-platform save options (macOS / Windows / Linux desktop / headless) | `src/key-setup.ts:86`–`src/key-setup.ts:185` |
60
+ | F4 | `OPENCODE_API_KEY` set in `process.env` immediately, regardless of save choice | `src/key-setup.ts:62`, `src/key-setup.ts:187` |
61
+ | F5 | Layered provider-key resolution (`resolveProviderCredential`) | `src/env.ts:241` |
62
+ | F6 | `global:opencode` fallback chain (env → keyring → legacy services) | `src/env.ts:176` |
63
+ | F7 | Legacy-entry migration (read → write → verify → delete) | `src/env.ts:199` |
64
+ | F8 | Keyring error classification (never throws) | `src/env.ts:73` |
65
+ | F9 | Secret Service availability probe (Linux) | `src/env.ts:367`, `src/key-setup.ts:79` |
66
+ | F10 | OAuth broker → keychain copy | `src/registry/auth-broker.ts:17`, `src/registry/provider-auth.ts:160` |
67
+ | F11 | Pre-import key validation (placeholder / invalid / manual-auth) | `src/registry/validate-import-key.ts:30` |
68
+ | F12 | `--dry-run` simulates save without writing | `src/key-setup.ts:123`–`src/key-setup.ts:133` |
69
+
70
+ ---
71
+
72
+ ## Architecture & Implementation
73
+
74
+ ### Where secrets live
75
+
76
+ Secrets are never written to `providers.json` or `config.json` — those hold only an `authRef` pointer. The actual secret lives in one of three places (`src/env.ts:241`):
77
+
78
+ 1. **Env var** — the namespaced `RFLECTR_KEY_<PROVIDER_ID_UPPER>` (highest priority, `rflectrKeyEnvVar()` at `src/env.ts:129`), or whatever `env:VAR_NAME` the `authRef` names (`src/env.ts:252`).
79
+ 2. **OS keyring** — service `rflectr` via `@napi-rs/keyring`. Accounts: `provider:<id>` (`src/env.ts:96`), `oauth:provider:<id>` (`src/env.ts:100`, a JSON `StoredOAuthCredential`), and `global:opencode` (`src/env.ts:94`).
80
+ 3. **Legacy keyring entries** — `rflectr` / `opencode-starter` accounts, auto-migrated on first successful read (`src/env.ts:199`).
81
+
82
+ ### Key resolution order
83
+
84
+ For a **provider** key, `resolveProviderCredential(providerId, authRef, diag?)` resolves in this order (`src/env.ts:241`):
85
+
86
+ 1. Namespaced env var `RFLECTR_KEY_<ID>` (`src/env.ts:246`).
87
+ 2. If `authRef` is an `env:` ref → read that env var (`src/env.ts:252`).
88
+ 3. If the keyring account is `global:opencode` → run the `readGlobalOpencodeCredential()` chain (`src/env.ts:256`).
89
+ 4. Otherwise → read the per-provider keyring account, decoding/refreshing OAuth JSON if present (`src/env.ts:260`, `src/env.ts:324`).
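
The four-step order above can be sketched as follows. This is a simplified, synchronous illustration with injected dependencies, not the function in `src/env.ts`: the `...Sketch` names are hypothetical, the env-var name derivation (uppercasing, non-alphanumerics to `_`) is an assumption, treating a non-`env:` `authRef` as the keyring account name is an assumption, and OAuth JSON decoding and refresh are omitted.

```typescript
// Injected lookups so the ordering can be exercised without a real keyring.
interface CredentialSourcesSketch {
  env: Record<string, string | undefined>;
  readGlobalOpencode: () => string | null;
  readKeyringAccount: (account: string) => string | null;
}

// Assumption: RFLECTR_KEY_<ID> uppercases the id and maps non-alphanumerics to "_".
function namespacedEnvVarSketch(providerId: string): string {
  return `RFLECTR_KEY_${providerId.toUpperCase().replace(/[^A-Z0-9]+/g, "_")}`;
}

function resolveProviderCredentialSketch(
  providerId: string,
  authRef: string,
  sources: CredentialSourcesSketch,
): string | null {
  // 1. The namespaced env var has the highest priority.
  const namespaced = sources.env[namespacedEnvVarSketch(providerId)];
  if (namespaced) return namespaced;
  // 2. An "env:VAR_NAME" ref reads that env var.
  if (authRef.startsWith("env:")) {
    return sources.env[authRef.slice("env:".length)] ?? null;
  }
  // 3. The shared account runs the global:opencode fallback chain.
  if (authRef === "global:opencode") {
    return sources.readGlobalOpencode();
  }
  // 4. Otherwise read the per-provider keyring account.
  return sources.readKeyringAccount(authRef);
}
```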
90
+
91
+ For the shared **OpenCode Zen/Go** key, `readGlobalOpencodeCredential()` tries, in order (`src/env.ts:176`): `OPENCODE_API_KEY` env (`resolveApiKey()`, `src/env.ts:20`) → keyring `global:opencode` → legacy keyring `rflectr` → oldest legacy service `opencode-starter`. On a successful legacy read, `migrateGlobalOpencodeCredential()` rewrites it to `global:opencode` using a read → write → verify → delete protocol — the old entry is deleted only after the new one verifies (`src/env.ts:217`–`src/env.ts:233`).
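
The read → write → verify → delete protocol can be sketched against an abstract store. This is an illustration under stated assumptions: `KeyStoreSketch` and `migrateLegacySketch` are hypothetical names, and the store is flattened to a single account namespace, whereas the real legacy entries are distinguished by keyring service and account.

```typescript
// Minimal store interface; a real implementation would wrap the OS keyring.
interface KeyStoreSketch {
  read(account: string): string | null;
  write(account: string, secret: string): void;
  remove(account: string): void;
}

type MigrationOutcomeSketch = "migrated" | "kept-legacy" | "nothing-to-migrate";

function migrateLegacySketch(
  store: KeyStoreSketch,
  legacyAccount: string,
  canonicalAccount = "global:opencode",
): MigrationOutcomeSketch {
  // Read: nothing to do when the legacy entry is absent.
  const legacySecret = store.read(legacyAccount);
  if (legacySecret === null) return "nothing-to-migrate";

  // Write: copy the secret to the canonical account.
  store.write(canonicalAccount, legacySecret);

  // Verify: the new entry must read back identical before anything is deleted.
  if (store.read(canonicalAccount) !== legacySecret) return "kept-legacy";

  // Delete: only now is the legacy entry removed.
  store.remove(legacyAccount);
  return "migrated";
}
```

Deleting last means a failed or partial write leaves the original secret in place, at the cost of a possible duplicate entry.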
92
+
93
+ ### Per-platform storage matrix
94
+
95
+ `resolveOrCollectApiKey()` builds the option list per platform (`src/key-setup.ts:86`). The default selection is keychain/credential-manager/secret-service where available, else profile (`src/key-setup.ts:118`).
96
+
97
+ | Platform | Options | Source |
98
+ |----------|---------|--------|
99
+ | **macOS** | Keychain only · Keychain + `~/.zshrc` (or profile) auto-load · shell profile (plaintext) · session only | `src/key-setup.ts:87`–`src/key-setup.ts:93` |
100
+ | **Windows** | Windows Credential Manager · `setx` user env var (plaintext) · session only | `src/key-setup.ts:95`–`src/key-setup.ts:100` |
101
+ | **Linux desktop** | Secret Service (GNOME Keyring / KWallet) · shell profile (plaintext) · session only | `src/key-setup.ts:103`–`src/key-setup.ts:111` |
102
+ | **Linux headless** | shell profile · session only — shown with a note explaining why secure storage is unavailable | `src/key-setup.ts:105`–`src/key-setup.ts:111` |
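
The matrix above can be expressed as a small lookup, sketched here with hypothetical names. `saveOptionsSketch` and the option identifiers are illustrative and do not come from `src/key-setup.ts`; only the option counts and the default rule (secure store where available, else profile) are taken from the table and the paragraph before it.

```typescript
type PlatformSketch = "darwin" | "win32" | "linux";

// Hypothetical identifiers for the choices shown in the prompt.
type SaveOptionSketch =
  | "keychain"
  | "keychain+profile"
  | "credential-manager"
  | "secret-service"
  | "setx"
  | "profile"
  | "session";

function saveOptionsSketch(
  platform: PlatformSketch,
  secretServiceAvailable: boolean,
): { options: SaveOptionSketch[]; defaultOption: SaveOptionSketch } {
  if (platform === "darwin") {
    return {
      options: ["keychain", "keychain+profile", "profile", "session"],
      defaultOption: "keychain",
    };
  }
  if (platform === "win32") {
    return {
      options: ["credential-manager", "setx", "session"],
      defaultOption: "credential-manager",
    };
  }
  // Linux: the secure option is offered only when the Secret Service probe succeeds.
  if (secretServiceAvailable) {
    return {
      options: ["secret-service", "profile", "session"],
      defaultOption: "secret-service",
    };
  }
  return { options: ["profile", "session"], defaultOption: "profile" };
}
```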
103
+
104
+ Notes grounded in code:
105
+
106
+ - The macOS auto-load line uses the `security` CLI directly so the shell can source it (`src/key-setup.ts:143`): `export OPENCODE_API_KEY="$(security find-generic-password -s rflectr -a global:opencode -w 2>/dev/null)"`. It is appended only if not already present (`src/key-setup.ts:145`).
107
+ - `setx` is invoked with piped stdio (`stdio: ['pipe','pipe','pipe']`) to suppress its "SUCCESS" stdout (`src/key-setup.ts:164`).
108
+ - Secret Service availability is probed with a test `getPassword()` against a throwaway `rflectr-probe` entry (`isSecretServiceAvailable()`, `src/env.ts:367`); if the daemon isn't running the secure option is hidden and a note is shown (`src/key-setup.ts:103`–`src/key-setup.ts:107`).
109
+ - `detectShellProfile()` chooses the right profile file per platform/shell — `~/.zshrc`, `~/.bash_profile`, `~/.bashrc`, or `~/.profile` (`src/key-setup.ts:21`).
110
+ - The plaintext shell-profile path single-quotes and escapes the key before appending (`src/key-setup.ts:179`–`src/key-setup.ts:180`).
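
For the last note, one common way to single-quote a value for a POSIX shell is sketched below. The source only says the key is single-quoted and escaped; the specific `'\''` idiom and the names `shellSingleQuoteSketch` and `profileExportLineSketch` are assumptions for illustration.

```typescript
// Wrap a value in single quotes; an embedded ' is written as '\'' (close, escaped quote, reopen).
function shellSingleQuoteSketch(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build the line that would be appended to the shell profile.
function profileExportLineSketch(key: string): string {
  return `export OPENCODE_API_KEY=${shellSingleQuoteSketch(key)}`;
}
```

Single quotes stop the shell from expanding `$`, backticks, or backslashes inside the key, so only the quote character itself needs special handling.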
111
+
112
+ ### Immediate session activation
113
+
114
+ In every code path — found in store, freshly pasted, or save failed — `process.env['OPENCODE_API_KEY']` is set so the key is live for the current process: at the store-hit branch (`src/key-setup.ts:62`) and at the end of collection (`src/key-setup.ts:187`). This is the one documented mutation of the parent environment; `buildChildEnv()` otherwise only mutates the child (`src/env.ts:48`).
115
+
116
+ ### Graceful degradation
117
+
118
+ Every keyring operation goes through `readKeyringAccount` / `writeKeyringAccount` / `deleteKeyringAccount`, each wrapping a dynamic `import('@napi-rs/keyring')` in try/catch and routing failures through `classifyKeyringError()` into a `diag?` callback (`src/env.ts:139`–`src/env.ts:173`). A missing native binary therefore yields a warning ("native keyring module not available"), not a crash. `@napi-rs/keyring` ships as an `optionalDependency` and is marked `external` in the bundle so it resolves from `node_modules` at runtime.
119
+
120
+ ### OAuth token storage
121
+
122
+ OAuth tokens are stored in the same keyring under `oauth:provider:<id>` as a serialized `OpencodeOAuthCredential` JSON (`oauthCredentialToKeychainJson()`, `src/registry/opencode-auth.ts:110`). `authenticateProvider()` saves them via `saveProviderCredential(oauthAuthRef(registryId), …)` (`src/registry/provider-auth.ts:160`, `src/registry/provider-auth.ts:194`) and warns — without failing — if the write doesn't land. The broker path (`runOpencodeAuthBroker()`, `src/registry/auth-broker.ts:17`) delegates the actual login to `opencode auth login`, then reads the token back out of OpenCode's `auth.json` (`src/registry/opencode-auth.ts:80`). On resolution, OAuth JSON in a keyring account is decoded and, when near expiry, refreshed in place (`src/env.ts:290`, `src/env.ts:324`). The acquisition mechanics are PRD-007.
123
+
124
+ ### Key validation before import
125
+
126
+ `validateImportKey()` (`src/registry/validate-import-key.ts:30`) gates registry imports: OAuth providers pass through (`:34`); empty keys are rejected (`:38`); gcloud/AWS/Azure providers are flagged `untested-manual` (`:44`); otherwise the key is probed against the provider's real model endpoint and rejected as `placeholder-key` or `invalid-key` if the API refuses it (`:83`–`:107`). Placeholder keys are recognized by `isLikelyPlaceholderKey()` / `isPlaceholderProviderKey()` (`src/registry/refresh-credentials.ts:25`, `:30`), and a small env-fallback table lets `anthropic`/`openai` fall back to their standard SDK env vars when OpenCode supplied only a placeholder (`src/registry/refresh-credentials.ts:20`, `:56`).
127
+
128
+ ---
129
+
130
+ ## Security Considerations
131
+
132
+ - **Plaintext options are opt-in and clearly labelled.** The `setx` and shell-profile choices write the key in cleartext; their prompt hints say so explicitly ("plaintext", "visible in System Properties → Environment Variables") (`src/key-setup.ts:91`, `src/key-setup.ts:98`, `src/key-setup.ts:109`). The default selection is always the secure store when available (`src/key-setup.ts:118`).
133
+ - **Keyring is the default trust anchor.** Secrets live in the OS keyring by default; config files hold only `authRef` pointers (`src/env.ts:241`).
134
+ - **The provider's real key never reaches the child when proxying.** The child gets a proxy token while the local proxy holds the real key (env contract, PRD-001 / PRD-005). `buildChildEnv()` confirms this: it sets `ANTHROPIC_API_KEY` to whatever the caller passes, which is the proxy token on proxy routes (`src/env.ts:55`).
135
+ - **What is never logged:** the key value itself is never written to the trace log. The trace path uses `writeSecureLogLine()` and logs only the *reason* string from a keyring diagnostic, never the secret (`src/key-setup.ts:52`–`src/key-setup.ts:56`). Dry-run output masks the value (`setx OPENCODE_API_KEY ***`, `src/key-setup.ts:128`). The interactive prompt uses `p.password()` so the paste is not echoed (`src/key-setup.ts:69`).
136
+ - **Migration is non-destructive on failure.** The legacy entry is deleted only after the new `global:opencode` entry reads back identical; a verification mismatch keeps the legacy entry and warns (`src/env.ts:220`–`src/env.ts:224`).
137
+ - **OAuth file permission hygiene.** When reading OpenCode's `auth.json`, a warning is emitted if the file is group/world-readable (`authFilePermissionWarning()`, `src/registry/opencode-auth.ts:66`).
138
+ - **`--dry-run` writes nothing.** All persistence branches are skipped and replaced by `[dry-run]` log lines (`src/key-setup.ts:123`).
139
+
140
+ ---
141
+
142
+ ## Acceptance Criteria
143
+
144
+ - [x] Secrets are stored in the OS keyring (or opt-in plaintext), never in `providers.json` / `config.json` — registry holds only `authRef` (`src/env.ts:241`).
145
+ - [x] `@napi-rs/keyring` is loaded via dynamic `import()` and a missing native binary degrades gracefully without crashing (`src/env.ts:141`, `src/env.ts:155`).
146
+ - [x] On startup, an existing key is read silently and no prompt is shown (`src/key-setup.ts:38`, `src/key-setup.ts:58`).
147
+ - [x] macOS offers 4 save options (Keychain · Keychain + auto-load · profile · session) (`src/key-setup.ts:87`).
148
+ - [x] Windows offers 3 save options (Credential Manager · `setx` · session) (`src/key-setup.ts:95`).
149
+ - [x] Linux desktop offers Secret Service · profile · session; headless offers profile · session with an explanatory note (`src/key-setup.ts:103`).
150
+ - [x] `process.env['OPENCODE_API_KEY']` is set immediately on resolve/collect regardless of save choice (`src/key-setup.ts:62`, `src/key-setup.ts:187`).
151
+ - [x] Provider keys resolve via the documented order: namespaced env → `env:`-ref → `global:opencode` chain → per-provider keyring (`src/env.ts:241`).
152
+ - [x] Legacy keyring entries migrate via read → write → verify → delete (`src/env.ts:199`).
153
+ - [x] OAuth tokens are stored in the keychain and warn (not fail) on write failure (`src/registry/provider-auth.ts:160`, `:195`).
154
+ - [x] Provider keys are validated against the live endpoint before import; placeholder/invalid keys are rejected (`src/registry/validate-import-key.ts:30`).
155
+ - [x] The key value is never written to the trace log (only the diagnostic reason) and is masked in dry-run output (`src/key-setup.ts:55`, `src/key-setup.ts:128`).
156
+ - [x] `--dry-run` performs no writes (`src/key-setup.ts:123`).
157
+
158
+ ---
159
+
160
+ ## Files
161
+
162
+ | File | Role |
163
+ |------|------|
164
+ | `src/key-setup.ts` | Interactive Zen/Go key collection, per-platform save options, shell-profile detection, dry-run simulation |
165
+ | `src/env.ts` | Credential store wrappers, `resolveProviderCredential`, `global:opencode` chain, migration, `classifyKeyringError`, `isSecretServiceAvailable`, `buildChildEnv` env contract |
166
+ | `src/registry/auth-broker.ts` | Delegate OAuth login to OpenCode CLI, read token back from `auth.json` |
167
+ | `src/registry/provider-auth.ts` | Native-vs-broker OAuth selector; saves tokens to keychain; upserts registry provider |
168
+ | `src/registry/opencode-auth.ts` | Read/decode OpenCode `auth.json`; OAuth JSON (de)serialization; file-permission warning |
169
+ | `src/registry/refresh-credentials.ts` | Placeholder-key detection; env-fallback table for refresh |
170
+ | `src/registry/validate-import-key.ts` | Pre-import key validation against live endpoints |
171
+ | `src/cli.ts` | Calls `resolveOrCollectApiKey` / `readGlobalOpencodeCredential` in the launch flow (`src/cli.ts:13`, `:888`) |
172
+
173
+ ---
174
+
175
+ ## Risks & Known Limitations
176
+
177
+ - **Plaintext persistence is user-selectable.** `setx` and shell-profile options store the key in cleartext by design, for users without a working keyring. Mitigated by clear labelling and a secure default (`src/key-setup.ts:91`, `:98`, `:109`).
178
+ - **Keyring dependency is optional and native.** If `@napi-rs/keyring` fails to load, no secure storage is available and the user is steered to session-only or plaintext (`src/env.ts:141`). The probe (`src/env.ts:367`) catches this on Linux before showing the option.
179
+ - **OAuth broker requires the OpenCode CLI.** A provider without native OAuth cannot complete broker login on a machine where the OpenCode CLI is not installed (`src/registry/auth-broker.ts:22`, `src/registry/provider-auth.ts:167`).
180
+ - **gcloud/AWS/Azure providers are not importable by API key.** They are flagged `untested-manual` and must be configured via OpenCode env auth (`src/registry/validate-import-key.ts:44`).
181
+ - **Server-mode exposure.** When the `server` command binds beyond localhost, its single password is the only gate — out of scope here, owned by PRD-012.
182
+
183
+ ---
184
+
185
+ ## Related
186
+
187
+ - Knowledge: [`../../../knowledge/private/security/credential-storage.md`](../../../knowledge/private/security/credential-storage.md) — credential storage & environment isolation narrative.
188
+ - [PRD-001 — CLI Core & Launch Orchestration](../prd-001-cli-core-launch-orchestration/prd-001-cli-core-launch-orchestration-index.md) — the `buildChildEnv()` env contract and scrubbed child environment.
189
+ - [PRD-002 — Provider Registry](../prd-002-provider-registry/prd-002-provider-registry-index.md) — the registry that stores per-provider `authRef` pointers and triggers import-time validation.
190
+ - [PRD-007 — OAuth Device Flows](../prd-007-oauth-device-flows/prd-007-oauth-device-flows-index.md) — the other credential path: how OAuth tokens are *acquired* before they land in the keychain documented here.