npm - @legioncodeinc/rflectr - Versions diffs - 0.1.0 - Mend

@legioncodeinc/rflectr 0.1.0


Files changed (69)

package/library/README.md ADDED

@@ -0,0 +1,39 @@
+---
+ai_description: |
+  This is the root of the repository's documentation library (schema v2).
+  You own everything under library/ except notes/, which is human-only.
+  Sub-trees: knowledge/ (public and private docs), requirements/ (product
+  work: PRDs), issues/ (reactive bug/incident work: IRDs), notes/ (junk
+  drawer, read-only to agents).
+  Schema reference: legion-shared/standards/library-schema-v2.md.
+  Standardize script: pnpm standardize-library --repository <name>.
+human_description: |
+  Root of this repository's documentation library.
+  - knowledge/: reference documentation split by audience (public vs private)
+  - requirements/: planned product work (PRDs) with backlog/in-work/completed lifecycle
+  - issues/: reactive bug and incident work (IRDs) with same lifecycle
+  - notes/: unstructured scratch space — only humans write here
+  Run `pnpm standardize-library --repository <name>` to scaffold any missing structure.
+---
+# Library
+Documentation root for this repository. Schema version: **v2**.
+See [`legion-shared/standards/library-schema-v2.md`](../../legion-shared/standards/library-schema-v2.md) for the full specification.
+## Top-level layout
+| Folder | What goes here |
+|---|---|
+| `knowledge/public/` | End-user / customer-facing docs: overviews, guides, FAQs |
+| `knowledge/private/` | Internal engineering and business docs: ADRs, standards, domain knowledge |
+| `requirements/` | Product and feature work: PRDs in backlog/in-work/completed |
+| `issues/` | Reactive bug and incident work: IRDs in backlog/in-work/completed |
+| `notes/` | Human-only scratch space |
+## What does NOT belong here
+- Brand assets → `legion-shared/brands/`
+- Wiki entity pages → `legion-wiki/<repo>/wiki/` (derived, never edit)
+- Library mirrors → `legion-wiki/<repo>/library/` (derived, never edit)

package/library/issues/README.md ADDED

@@ -0,0 +1,46 @@
+---
+ai_description: |
+  This folder contains all reactive bug and incident work (IRDs).
+  It is a PEER of requirements/, not nested under it.
+  Sub-folders: backlog/, in-work/, completed/ — same lifecycle as requirements/.
+  IRD folder naming: ird-<###>-<kebab-slug>/
+  IRD numbers match the GitHub issue number for this repo.
+  Never invent IRD numbers — a GitHub issue must exist first.
+  IRDs are single-scope: one issue per IRD, no sub-IRDs.
+  Do NOT put PRDs here — those go in requirements/.
+human_description: |
+  Reactive bug and incident work (IRDs), organized by lifecycle stage.
+  - backlog/: tracked issues with a fix plan, not yet started
+  - in-work/: issues currently being fixed
+  - completed/: resolved issues (move entire folder)
+  IRD numbers match GitHub issue numbers. Create an IRD only after the
+  GitHub issue exists.
+---
+# Issues
+Reactive bug and incident work (IRDs), organized by lifecycle state.
+## Sub-folders
+| Folder | State | Description |
+|---|---|---|
+| `backlog/` | Tracked | IRDs with a fix plan, not yet in progress |
+| `in-work/` | Active | Issues currently being resolved |
+| `completed/` | Resolved | Entire IRD folder moves here when the issue closes |
+## IRD folder structure
+```
+ird-042-stale-cache/
+  ird-042-stale-cache-index.md     single-scope fix plan
+  qa/
+    ird-042-stale-cache-qa.md      QA audit (written by quality-guardian)
+```
+## Naming rules
+- Folder: `ird-<###>-<kebab-slug>/`
+- Index: `ird-<###>-<kebab-slug>-index.md`
+- IRD number = GitHub issue number (never invented)
+- No sub-IRDs (scope one issue per IRD)

package/library/issues/backlog/README.md ADDED

@@ -0,0 +1,26 @@
+---
+ai_description: |
+  Contains IRD folders for tracked issues not yet in active fix work.
+  Create a new IRD here only AFTER the GitHub issue exists for this repo.
+  IRD folder: ird-<###>-<slug>/ where ### = GitHub issue number.
+  Must contain: ird-<###>-<slug>-index.md (the fix plan) and qa/ folder.
+  IRDs are single-scope: do not add sub-IRDs.
+human_description: |
+  IRDs planned but not yet in active fix work. Create IRDs here.
+  - Naming: ird-042-stale-cache/ with ird-042-stale-cache-index.md inside
+  - IRD number must match the GitHub issue number
+  - Create only after the GitHub issue exists
+  Move to in-work/ when fix work begins.
+---
+# Issues — Backlog
+Tracked issues with a fix plan, not yet in active resolution.
+## Creating a new IRD
+1. Confirm the GitHub issue number (e.g., #42).
+2. Create `ird-042-<kebab-slug>/`.
+3. Create `ird-042-<slug>-index.md` — the single-scope fix plan.
+4. Create `qa/` subfolder (empty; `quality-guardian` writes here).
+5. No sub-IRDs — keep scope to one issue.

package/library/issues/completed/README.md ADDED

@@ -0,0 +1,13 @@
+---
+ai_description: |
+  Resolved IRD folders. Entire ird-<###>-<slug>/ folders move here from
+  in-work/ when the corresponding GitHub issue is closed and verified.
+  Read-only after landing — do NOT edit or re-open IRDs here.
+human_description: |
+  Resolved issue folders. Move entire ird-NNN-slug/ here from in-work/
+  when the GitHub issue is closed and the fix is confirmed. Read-only.
+---
+# Issues — Completed
+Resolved IRD folders. Entire `ird-<###>-<slug>/` folders land here when the GitHub issue closes and the fix is confirmed. Do not edit files here after landing.

package/library/issues/in-work/README.md ADDED

@@ -0,0 +1,13 @@
+---
+ai_description: |
+  IRD folders actively being resolved. Mirror of requirements/in-work/
+  but for issues. Move entire ird-<###>-<slug>/ folder from backlog/
+  here when fix work begins, then to completed/ when the issue closes.
+human_description: |
+  IRDs currently being fixed. Move folder from backlog/ here when work
+  starts, and to completed/ when the GitHub issue is closed.
+---
+# Issues — In Work
+IRDs currently being resolved. Move from `backlog/` → here when fix work starts, then `completed/` when the GitHub issue closes.

package/library/knowledge/README.md ADDED

@@ -0,0 +1,34 @@
+---
+ai_description: |
+  This folder contains all reference documentation for this repository,
+  split by intended audience: public/ for end-users, private/ for internal
+  team and AI agents. When filing a new doc, default to private/. Promote
+  to public/ only when the content is intentionally customer-facing.
+  Allowed writes: knowledge/public/<domain>/<slug>.md and
+  knowledge/private/<domain>/<slug>.md. ADRs always go in
+  knowledge/private/architecture/ADR-<n>-<slug>.md.
+  Never write to knowledge/ itself (write to the sub-folders).
+human_description: |
+  Reference documentation split by audience.
+  - public/: docs that will eventually be surfaced to customers or published
+  - private/: internal engineering, architecture, business, and strategy docs
+  When adding a new doc, pick the right subdomain folder inside public/ or
+  private/. If the domain doesn't exist yet, create it.
+---
+# Knowledge
+Reference documentation for this repository, organized by audience.
+## Sub-folders
+| Folder | Audience | Typical content |
+|---|---|---|
+| `public/` | End-users, customers, external | Overviews, user guides, FAQs |
+| `private/` | Internal team + AI agents | ADRs, standards, architecture, domain engineering docs |
+## Decision rule: public vs private
+> "Would I publish this on a help center or product docs site?"
+Yes → `public/`. No → `private/`. When in doubt, `private/`.

package/library/knowledge/private/README.md ADDED

@@ -0,0 +1,40 @@
+---
+ai_description: |
+  This folder contains internal engineering and business documentation.
+  ADRs MUST live in architecture/ADR-<n>-<kebab-slug>.md.
+  Engineering standards MUST live in standards/documentation-framework.md.
+  Other domain folders (<domain>/) are repo-specific and may be created as
+  needed (ai/, auth/, data/, frontend/, infrastructure/, integrations/,
+  marketing/, operations/, personas/, reporting/, roadmap/, scanners/,
+  security/, strategy/, etc.).
+  Do NOT file customer-facing content here (that goes in knowledge/public/).
+  Write path: library/knowledge/private/<domain>/<kebab-slug>.md.
+human_description: |
+  Internal engineering and business documentation.
+  - architecture/: Architecture Decision Records (ADRs)
+  - standards/: Documentation framework and coding standards
+  - <domain>/: Any repo-specific knowledge domain (ai/, auth/, data/, etc.)
+  Default landing zone for any doc that does not need to be customer-facing.
+  When creating a new domain folder, add a README.md explaining what belongs.
+---
+# Knowledge — Private
+Internal documentation for engineers, product, and AI agents.
+## Required sub-folders (always present)
+| Folder | Contents |
+|---|---|
+| `architecture/` | ADRs: `ADR-<n>-<kebab-slug>.md`. Locked decisions with context, alternatives, consequences. |
+| `standards/` | `documentation-framework.md` and any repo-specific writing rules. |
+## Optional domain folders
+Create any of these as needed: `ai/`, `auth/`, `data/`, `frontend/`, `infrastructure/`, `integrations/`, `marketing/`, `operations/`, `personas/`, `reporting/`, `roadmap/`, `scanners/`, `security/`, `strategy/`, `reference/`, `<product>-ux-ui/`.
+## What does NOT belong here
+- Customer-facing content (put in `knowledge/public/`)
+- PRDs or IRDs (put in `requirements/` or `issues/`)
+- Brand assets (put in `legion-shared/brands/`)

package/library/knowledge/private/ai/README.md ADDED

@@ -0,0 +1,8 @@
+# AI
+The model-routing brain: wire-format translation and model discovery/classification.
+| Doc | Covers |
+|---|---|
+| [`translation-layer.md`](translation-layer.md) | The Vercel AI SDK adapter + provider factory — the single translation path; Responses-API selection; thought_signature round-trip. |
+| [`model-discovery-classification.md`](model-discovery-classification.md) | `classifyModelFormat`, the two-source merge, context-window resolution, cost-display limitation. |

package/library/knowledge/private/ai/model-discovery-classification.md ADDED

@@ -0,0 +1,81 @@
+# Model Discovery & Classification
+> Category: AI | Version: 1.0 | Date: June 2026 | Status: Active
+How `rflectr` builds the model list a user picks from, and how it decides whether each model is forwarded raw or translated. Read [`translation-layer.md`](translation-layer.md) for what happens *after* a model is classified.
+**Related:**
+- [`translation-layer.md`](translation-layer.md)
+- [`../data/provider-registry.md`](../data/provider-registry.md)
+- Source: `src/constants.ts` (`classifyModelFormat`), `src/models.ts`, `src/context-window.ts`, `src/context-model-id.ts`, `src/registry/materialize.ts`
+---
+## The format decision
+Every model carries a `modelFormat` that drives the launch branch (`'anthropic'` = direct passthrough, anything else = SDK adapter proxy). It is computed by `classifyModelFormat(modelId, providerNpm)` in `src/constants.ts`:
+```ts
+if (providerNpm === '@ai-sdk/anthropic') return 'anthropic';
+if (providerNpm === '@ai-sdk/openai')    return 'unsupported';
+if (providerNpm === '@ai-sdk/google')    return 'unsupported';
+// Fallback: ID-prefix heuristics when no cache npm is known
+if (id.startsWith('claude-'))  return 'anthropic';
+if (id.startsWith('gpt-'))     return 'unsupported';
+if (id.startsWith('gemini-'))  return 'unsupported';
+return 'openai';
+```
+The three values mean:
+| `modelFormat` | Meaning |
+|---|---|
+| `anthropic` | Direct passthrough to the provider's Anthropic endpoint. `isAnthropicNative` is true. |
+| `openai` | Routed through the SDK adapter via the local proxy. The catch-all for everything that isn't natively Anthropic. |
+| `unsupported` | Hidden in the **cloud OpenCode wizard** only. GPT/Gemini through OpenCode Zen/Go's proxy layer needs model-specific endpoints that the cloud path can't provide. |
+> **Important nuance:** `unsupported` is a *cloud-wizard* restriction, not a global one. To use GPT or Gemini models, configure the **local OpenAI / Google provider** (which carries the real `@ai-sdk/openai` / `@ai-sdk/google` npm) — those route through the SDK adapter normally. The `unsupported` classification only blocks the OpenCode Zen/Go proxy layer where direct OpenAI/Google access isn't available.
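A minimal, self-contained sketch of the format decision described above, wrapping the excerpt in a function. Assumptions: `id` in the excerpt is taken to be the lower-cased `modelId`, `providerNpm` may be absent, and `launchBranch` is a hypothetical helper that restates the cloud-wizard rule. The real code in `src/constants.ts` may differ.

```typescript
type ModelFormat = 'anthropic' | 'openai' | 'unsupported';

// Sketch of classifyModelFormat. Assumption: `id` is the lower-cased model id.
function classifyModelFormat(modelId: string, providerNpm?: string): ModelFormat {
  if (providerNpm === '@ai-sdk/anthropic') return 'anthropic';
  if (providerNpm === '@ai-sdk/openai') return 'unsupported';
  if (providerNpm === '@ai-sdk/google') return 'unsupported';

  // Fallback: ID-prefix heuristics when no cache npm is known.
  const id = modelId.toLowerCase();
  if (id.startsWith('claude-')) return 'anthropic';
  if (id.startsWith('gpt-')) return 'unsupported';
  if (id.startsWith('gemini-')) return 'unsupported';
  return 'openai';
}

// Hypothetical helper: what the cloud wizard does with each format.
function launchBranch(format: ModelFormat): 'direct' | 'proxy' | 'hidden' {
  if (format === 'anthropic') return 'direct'; // forwarded raw, no proxy
  if (format === 'unsupported') return 'hidden'; // not offered in the cloud wizard
  return 'proxy'; // routed through the SDK adapter
}
```

Under these assumptions, a model with an unrecognised npm and no matching prefix falls through to `'openai'`, which is the catch-all described in the table.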
+---
+## The two-source merge
+The cloud (OpenCode Zen/Go) model list is built from two sources merged together:
+```mermaid
+flowchart TD
+    api["GET {backendUrl}/v1/models — available ids (no auth)"]
+    cache["~/.cache/opencode/models.json — name, family, cost, provider.npm"]
+    api --> merge["mergeModels()"]
+    cache --> merge
+    merge --> enrich["enriched ModelInfo with modelFormat, sourceBackend, contextWindow"]
+```
+- **Primary:** `GET {backendUrl}/v1/models` returns the available model ids (no auth required).
+- **Enrichment:** `~/.cache/opencode/models.json` (written by the OpenCode CLI, path in `OPENCODE_CACHE_PATH`) supplies `name`, `family`, `cost`, and `provider.npm`. It is optional enrichment, never a runtime dependency.
+`sourceBackend` is set from the backend that was queried. This matters for the `go` subscription tier, which shows Zen free models *and* Go paid models in one combined list — `sourceBackend` lets the launcher set the correct `ANTHROPIC_BASE_URL` per selected model.
+### Stale free models
+`STALE_FREE_MODELS` in `src/constants.ts` lists models whose free promotion has ended but that the API still returns. `mergeModels()` filters them out. (Historically this held `qwen3.6-plus-free`; treat the constant as the source of truth.)
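The merge and the stale-model filter can be sketched roughly as follows. This is an illustration only: the `CacheEntry` and `MergedModel` shapes, the `mergeModelsSketch` name, and the placeholder stale id are assumptions, not the actual types or values in `src/models.ts` or `src/constants.ts`.

```typescript
// Hypothetical shapes; the real cache file and ModelInfo carry more fields.
interface CacheEntry { name?: string; family?: string; npm?: string }
interface MergedModel {
  id: string;
  name: string;
  family: string | undefined;
  npm: string | undefined;
  sourceBackend: string;
}

// Placeholder stand-in for STALE_FREE_MODELS; the real list lives in src/constants.ts.
const STALE_FREE_MODELS_SKETCH = new Set<string>(['example-stale-free']);

function mergeModelsSketch(
  apiIds: string[],
  cache: Record<string, CacheEntry>,
  sourceBackend: string,
): MergedModel[] {
  return apiIds
    .filter((id) => !STALE_FREE_MODELS_SKETCH.has(id)) // drop ended promotions
    .map((id) => {
      const extra = cache[id]; // may be missing: the cache is optional enrichment
      return {
        id,
        name: extra?.name ?? id, // fall back to the raw id when not enriched
        family: extra?.family,
        npm: extra?.npm,
        sourceBackend, // records which backend was queried
      };
    });
}
```

The API list drives membership and the cache only decorates it, so an empty cache object still yields a usable list in this sketch.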
+---
+## Registry models
+Registry providers (`~/.rflectr/providers.json`) carry their own `CachedModel[]`, each already stamped with `modelFormat`, `npm`, `upstreamModelId`, `contextWindow`, `cost`, `supportedParameters`, and `reasoning`. `materializeRegistry` (`src/registry/materialize.ts`) converts those into runtime `LocalProviderModel`s and applies per-agent hiding via `shouldHideModel()` — e.g. Zen/Go favorites are hidden from Codex, which has no gateway path for them. See [`../data/provider-registry.md`](../data/provider-registry.md).
+---
+## Context window resolution
+The status bar in Claude Code shows remaining context, which depends on a correct window. `resolveContextWindow(modelId, contextWindow?)` (`src/context-window.ts`) picks the window; `buildChildEnv` writes it to `CLAUDE_CODE_MAX_CONTEXT_TOKENS`. The proxy's synthetic `GET /v1/models` includes `context_window` per model (`formatAnthropicModelEntry`) so the host can render it.
+A model id may carry a `[1m]` suffix to denote a 1-million-token context variant; `stripOneMContextSuffix` / `claudeCodeClientModelId` (`src/context-model-id.ts`) separate the wire id from the display id and the context hint. In switch-menu mode the window is fixed at launch and does not change on live `/model` switch (see [`../architecture/launch-flow-claude.md`](../architecture/launch-flow-claude.md#the-context-window-caveat)).
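As a rough illustration of the suffix handling described above, the sketch below separates a `[1m]` suffix from the wire id and picks a window. Everything here is hypothetical: the helper names, the resolution order (explicit window, then suffix hint, then fallback), the exact value of 1,000,000 tokens, and the 200,000-token fallback are assumptions rather than the behaviour of `src/context-window.ts` or `src/context-model-id.ts`.

```typescript
const ONE_M_SUFFIX = '[1m]';
const ONE_M_TOKENS = 1_000_000; // assumed numeric value of the 1m hint

interface ParsedModelId { wireId: string; contextHint: number | undefined }

// Hypothetical helper illustrating the idea behind stripOneMContextSuffix.
function parseContextSuffix(modelId: string): ParsedModelId {
  if (modelId.endsWith(ONE_M_SUFFIX)) {
    return {
      wireId: modelId.slice(0, -ONE_M_SUFFIX.length), // id sent upstream
      contextHint: ONE_M_TOKENS,
    };
  }
  return { wireId: modelId, contextHint: undefined };
}

// Assumed order: explicit window, then suffix hint, then an invented fallback.
function resolveWindowSketch(
  modelId: string,
  contextWindow?: number,
  fallback = 200_000,
): number {
  return contextWindow ?? parseContextSuffix(modelId).contextHint ?? fallback;
}
```

The resulting number is what a launcher would stringify into `CLAUDE_CODE_MAX_CONTEXT_TOKENS`.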
+---
+## Cost display is inaccurate for non-Anthropic models
+Claude Code applies its own internal pricing table to whatever model id it sees, so the cost it shows for a Groq/DeepSeek/Gemini model is wrong. This is a documented, unfixable-from-here limitation — the host owns its pricing display.

package/library/knowledge/private/ai/translation-layer.md ADDED

@@ -0,0 +1,88 @@
+# The Translation Layer
+> Category: AI | Version: 1.0 | Date: June 2026 | Status: Active
+The single path that lets a Claude Code / Codex / Gemini host talk to any non-Anthropic model. This doc explains the Vercel AI SDK adapter (`src/sdk-adapter.ts`) and the provider factory (`src/provider-factory.ts`) that feeds it. Read [`../architecture/system-overview.md`](../architecture/system-overview.md) first.
+**Related:**
+- [`model-discovery-classification.md`](model-discovery-classification.md)
+- [`../integrations/local-proxy.md`](../integrations/local-proxy.md)
+- [`../integrations/harnesses.md`](../integrations/harnesses.md)
+- Source: `src/sdk-adapter.ts`, `src/provider-factory.ts`, `src/proxy-shared.ts`, `src/codex-responses-adapter.ts`, `src/gemini-proxy.ts`
+---
+## Why there is exactly one translation path
+A naive launcher would hand-roll a translator per provider: Anthropic→OpenAI, Anthropic→Gemini, and so on. That is a combinatorial mess and every provider has quirks (message ordering, reasoning signatures, tool-call encoding). `rflectr` instead routes **all** non-Anthropic providers through the Vercel AI SDK (`ai` + `@ai-sdk/*`) — the same packages OpenCode loads. The SDK owns wire format, endpoint selection, and provider quirks, so there is one translation path, not N.
+```mermaid
+flowchart TD
+    host["Host wire format (Anthropic / Responses / Gemini)"]
+    host --> trans["translate*Request() — host body → SDK call params"]
+    trans --> factory["createLanguageModel({npm, modelId, apiKey, baseURL})"]
+    factory --> model["Vercel AI SDK LanguageModel"]
+    model --> stream["streamText / generateText"]
+    stream --> back["map SDK fullStream → host SSE/response"]
+    back --> host
+```
+The rule that decides whether to translate at all: `isSdkMigratedNpm(npm)` (`src/provider-factory.ts`) is true for **any** npm except `@ai-sdk/anthropic`. Anthropic-format models skip the adapter and are forwarded raw.
+---
+## provider-factory.ts — npm → LanguageModel
+`createLanguageModel(spec)` (async) is the factory. `spec` carries `{ npm, modelId, apiKey, baseURL?, providerId?, authType?, oauthAccountId?, vertex? }`. It dynamically `import(npm)`s the SDK package and discovers its `create*` factory. The router has special branches:
+| npm | Behaviour |
+|---|---|
+| `@ai-sdk/google-vertex/anthropic` (`VERTEX_ANTHROPIC_NPM`) | Claude on Google Vertex AI via gcloud Application Default Credentials (no apiKey). |
+| `@ai-sdk/openai` | OAuth → ChatGPT Codex backend (`https://chatgpt.com/backend-api/codex`); API key → direct OpenAI. `modelPrefersResponsesApi()` picks `openai.responses(id)` vs `openai.chat(id)`. |
+| `@ai-sdk/xai` | Direct; also consults `modelPrefersResponsesApi()`. |
+| `@ai-sdk/google` | Direct — ignores `baseURL`, uses the native `/v1beta` endpoint. |
+| `@ai-sdk/anthropic` | Direct; strips a trailing `/v1` from `baseURL` if present. |
+| `@ai-sdk/openai-compatible`, `@openrouter/ai-sdk-provider` | Routed via `baseURL`. |
+| anything else | `loadSdkProviderFactory(npm)` finds the `create*()` export dynamically. |
+The `@ai-sdk/*` provider packages ship as npm **`dependencies`** and are marked `external` in `tsup.config.ts`, so they resolve from `node_modules` at runtime and keep `dist/cli.js` small.
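To make the fallback row of the table concrete, here is a small sketch of discovering a `create*` export on an already-loaded module object, plus the translate-or-not rule quoted earlier. `findCreateFactory` and `isSdkMigratedNpmSketch` are hypothetical names; how `loadSdkProviderFactory` actually chooses among several `create*` exports is not stated in the source, so picking the first match is an assumption.

```typescript
type ProviderFactory = (options: Record<string, unknown>) => unknown;

// Scan a module's exports for a function whose name starts with "create".
// Assumption: the first such export is the provider factory.
function findCreateFactory(
  moduleExports: Record<string, unknown>,
): ProviderFactory | undefined {
  for (const [name, value] of Object.entries(moduleExports)) {
    if (name.startsWith('create') && typeof value === 'function') {
      return value as ProviderFactory;
    }
  }
  return undefined;
}

// Rule from the text: every npm except @ai-sdk/anthropic goes through the adapter.
function isSdkMigratedNpmSketch(npm: string): boolean {
  return npm !== '@ai-sdk/anthropic';
}
```

In a real launcher the `moduleExports` argument would be the namespace object returned by a dynamic `import(npm)`; the sketch takes a plain object so it can run without any provider package installed.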
+### The Responses API selector
+`modelPrefersResponsesApi(modelId)` returns true for OpenAI/xAI models that require the Responses API rather than Chat Completions: GPT-5.4+, GPT-5.5, `gpt-5-pro` / `gpt-5.2-pro`, `*-codex`, the o-series (`o3`, `o4`, …), and xAI `grok-*-multi-agent`. Newer OpenAI reasoning models only round-trip correctly through Responses, so this selection is load-bearing — not cosmetic.
+OpenCode catalog ids may differ from upstream API ids (e.g. `gpt-5.5-fast` → `gpt-5.5`); `upstreamModelId` carries OpenCode's `api.id` for the actual upstream call.
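The selector could look something like the sketch below. It is an approximation built only from the list above: the regular expressions, the lower-casing, the reading of "GPT-5.4+" as any `gpt-5.N` with N at least 4, and the reading of `*-codex` as a suffix match are all assumptions, and the real `modelPrefersResponsesApi` may match differently.

```typescript
// Hypothetical approximation of modelPrefersResponsesApi, derived from the prose list.
function prefersResponsesApiSketch(modelId: string): boolean {
  const id = modelId.toLowerCase();
  if (id.endsWith('-codex')) return true; // *-codex
  if (/^gpt-5(\.\d+)?-pro$/.test(id)) return true; // gpt-5-pro, gpt-5.2-pro
  if (/^o\d/.test(id)) return true; // o-series: o3, o4, ...
  if (/^grok-.*-multi-agent$/.test(id)) return true; // xAI multi-agent
  const minor = /^gpt-5\.(\d+)/.exec(id); // GPT-5.4 and later minors
  return minor !== null && Number(minor[1]) >= 4;
}

// Picks which SDK entry point a caller would use for an OpenAI-style provider.
function pickOpenAiEndpoint(modelId: string): 'responses' | 'chat' {
  return prefersResponsesApiSketch(modelId) ? 'responses' : 'chat';
}
```

The second helper mirrors the `openai.responses(id)` versus `openai.chat(id)` choice from the table without importing the SDK.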
+### Reasoning capabilities
+`getReasoningCapabilities(npm, modelId, metadata)` returns a `ReasoningCapabilities` describing whether a model exposes controllable effort/thinking, internal-only reasoning, or none — covering Claude 4.6+, Gemini 2.5+/3, Mistral reasoners, xAI reasoners, DeepSeek V4, Kimi, and OpenRouter. The adapter uses this (with `effortProviderOptions` / `thinkingProviderOptions` / `deepMergeProviderOptions`) to translate a host's thinking/effort request into the right provider option block.
+---
+## sdk-adapter.ts — Anthropic ↔ SDK
+The adapter handles the Claude-Code-facing direction. Its contract: **one turn per request.** Claude Code owns the tool loop, so the adapter never loops; it translates a single request and streams a single response.
+- `translateRequest(body, npm, options?)` builds the SDK call params from an Anthropic request — messages, tools, `tool_choice`, system. Critically, it **folds inline `role:'system'` messages into the system prompt**: Claude Code injects the skills list and system-reminders as system-role messages mid-conversation, and dropping them would break behaviour. `TranslateRequestOptions` carries `defaultEffort` (fallback when the client omits effort, e.g. the Claude Desktop gateway), `reasoningMetadata`, and `openAiOAuth` (ChatGPT Codex OAuth manages its own output limit and requires `instructions`).
+- `streamAnthropicResponse` maps the SDK `fullStream` to Anthropic SSE.
+- `generateAnthropicResponse` handles the non-streaming case.
+### The thought_signature round-trip
+Reasoning models (especially Gemini) require their `thought_signature` to be echoed back verbatim on the next turn. Anthropic's wire format has no field for it, so `rflectr` smuggles it through the tool-use id:
+- **Encode:** `encodeToolUseId(rawId, signature)` (`src/proxy-shared.ts`) produces `{id}::ts::{signature}`.
+- **Decode:** `splitToolUseId(id)` recovers `{ rawId, thoughtSignature }`, which is fed back into `providerOptions.google.thoughtSignature`.
+Gemini puts the signature on tool-call parts (captured at `tool-input-start`); the SDK then handles Gemini's strict echo-back. This is the reason the old hand-rolled Gemini-native path was retired. The `::ts::` separator would break only if a signature literally contained `::ts::` — extremely unlikely, and a documented edge.
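A minimal sketch of the encode/decode pair, assuming the `{id}::ts::{signature}` layout given above. The function names carry a `Sketch` suffix because the bodies are guesses: in particular, returning the raw id unchanged when there is no signature, and splitting on the last separator, are assumptions about `src/proxy-shared.ts` rather than confirmed behaviour.

```typescript
const TS_SEPARATOR = '::ts::';

// Append the signature to the tool-use id; leave the id alone when there is none.
function encodeToolUseIdSketch(rawId: string, signature?: string): string {
  return signature ? `${rawId}${TS_SEPARATOR}${signature}` : rawId;
}

interface SplitToolUseId { rawId: string; thoughtSignature: string | undefined }

// Recover the raw id and signature. Splits on the LAST separator, so a signature
// that itself contains "::ts::" is split in the wrong place (the documented edge).
function splitToolUseIdSketch(id: string): SplitToolUseId {
  const at = id.lastIndexOf(TS_SEPARATOR);
  if (at === -1) return { rawId: id, thoughtSignature: undefined };
  return {
    rawId: id.slice(0, at),
    thoughtSignature: id.slice(at + TS_SEPARATOR.length),
  };
}
```

Because the signature travels inside an id the host already echoes back on the next turn, no extra field is needed in the Anthropic wire format.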
+---
+## The other two host directions
+The same factory + SDK model is reused; only the host-facing translation differs:
+- **Codex Responses API** (`src/codex-responses-adapter.ts`): `translateResponsesRequest` / `translateResponsesInput` / `translateResponsesTools` build SDK params from a Responses body; `streamResponsesResponse` / `generateResponsesResponse` emit the Responses SSE/JSON shape.
+- **Gemini REST** (`src/gemini-proxy.ts` + `src/gemini-parts.ts`): `translateGeminiRequest` extracts system instruction, contents, tools, and generation config; `parseGeminiPart` / `collectAnthropicBlocksFromGeminiParts` / `mapGeminiUsage` translate parts and usage.
+All three converge on `createLanguageModel` + `streamText`/`generateText`. That convergence is the whole point: add a provider once and every host can use it.

package/library/knowledge/private/architecture/README.md ADDED

@@ -0,0 +1,10 @@
+# Architecture
+System-level design docs and Architecture Decision Records (ADRs).
+| Doc | Covers |
+|---|---|
+| [`system-overview.md`](system-overview.md) | What rflectr is, its surfaces, the shared translation core, env isolation. **Start here.** |
+| [`launch-flow-claude.md`](launch-flow-claude.md) | The `rflectr claude` flow: single-model vs switch-menu launch paths. |
+ADRs (when added) live here as `ADR-<n>-<kebab-slug>.md`.

package/library/knowledge/private/architecture/launch-flow-claude.md ADDED

@@ -0,0 +1,93 @@
+# Claude Code Launch Flow
+> Category: Architecture | Version: 1.0 | Date: June 2026 | Status: Active
+How `rflectr claude` goes from a command line to a running Claude Code process pointed at the chosen model. Read [`system-overview.md`](system-overview.md) first. This doc traces `runClaudeCommand` in `src/cli.ts` and its two launch paths.
+**Related:**
+- [`system-overview.md`](system-overview.md)
+- [`../integrations/local-proxy.md`](../integrations/local-proxy.md)
+- [`../ai/translation-layer.md`](../ai/translation-layer.md)
+- [`../data/preferences-config.md`](../data/preferences-config.md)
+- Source: `src/cli.ts` (`runClaudeCommand`, `runModelsCommand`, `launchClaudeViaCatalog`), `src/env.ts` (`buildChildEnv`), `src/catalog.ts`
+---
+## The two modes
+The launch has two shapes, decided by one line in `runClaudeCommand`:
+```ts
+const switchMenuActive = favorites.length > 0 && !launchPlan.skip;
+```
+- **Single-model mode** (no favorites saved): one model, one route. Launch and exit.
+- **Switch-menu mode** (`rflectr models` has saved at least one favorite): a multi-route catalog proxy is started and Claude Code's gateway model discovery is enabled, so the in-session `/model` command lists the starting model plus every favorite for live switching.
+Favorites are managed by `runModelsCommand` (also in `src/cli.ts`) and persisted to `~/.rflectr/config.json`. The cap is `MAX_MODEL_CATALOG = 20` (`src/constants.ts`).
+---
+## Single-model flow
+```mermaid
+flowchart TD
+    start["rflectr claude"] --> bin["findClaudeBinary()"]
+    bin --> firstrun["needsFirstRunSetup? → runFirstRunWizard()"]
+    firstrun --> catalog["fetchProviderCatalog()"]
+    catalog --> pickP["p.select: which provider?"]
+    pickP --> pickM["pickLocalModel(provider)"]
+    pickM --> fmt{"selectedModel.modelFormat"}
+    fmt -->|anthropic| direct["buildChildEnv(model.baseUrl, ...) — no proxy"]
+    fmt -->|openai| proxy["startProxy(...) → buildChildEnv(127.0.0.1:port, ...)"]
+    direct --> launch["launchClaude(env, model, args)"]
+    proxy --> launch
+    launch --> wait["Claude Code runs (stdio inherited)"]
+    wait --> close["proxyHandle.close() on exit"]
+```
+The format branch is the heart of it (`src/cli.ts`):
+- `modelFormat === 'anthropic'` → **direct passthrough.** `buildChildEnv(selectedModel.baseUrl!, selectedModel.id, launchApiKey, undefined, contextWindow)`. No proxy; Claude Code talks straight to the provider's Anthropic-compatible endpoint. `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1` is set so beta headers are stripped on the direct hop.
+- otherwise → **SDK adapter proxy.** `startProxy(completionsUrl, modelId, trace, contextWindow, { npm, baseURL, upstreamModelId, providerId, authType, oauthAccountId, supportedParameters, reasoning, interleavedReasoningField }, apiKey)` returns a `ProxyHandle`; the child env points `ANTHROPIC_BASE_URL` at `http://127.0.0.1:<port>`.
+The provider API key is resolved by `resolveLocalProviderApiKey(activeProvider)` (`src/provider-catalog.ts`). If none is found, launch aborts with a message pointing at `rflectr providers`.
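The two branches can be pictured as producing slightly different child environments. The sketch below is not `buildChildEnv`: the `LaunchModel` shape, the `sketchChildEnv` name and parameter order, and the choice to pass the same key in proxy mode are assumptions. Only the variable names and the `http://127.0.0.1:<port>` form come from the text.

```typescript
interface LaunchModel {
  id: string;
  modelFormat: 'anthropic' | 'openai';
  baseUrl?: string;
}

// Hypothetical stand-in for the env that each launch branch hands to Claude Code.
function sketchChildEnv(
  model: LaunchModel,
  apiKey: string,
  contextWindow: number,
  proxyPort?: number,
): Record<string, string> {
  let baseUrl: string;
  if (model.modelFormat === 'anthropic') {
    // Direct passthrough: talk straight to the provider endpoint.
    if (!model.baseUrl) throw new Error('anthropic-format model needs a baseUrl');
    baseUrl = model.baseUrl;
  } else {
    // SDK adapter proxy: point the child at the local proxy.
    if (proxyPort === undefined) throw new Error('proxy port required');
    baseUrl = `http://127.0.0.1:${proxyPort}`;
  }
  const env: Record<string, string> = {
    ANTHROPIC_BASE_URL: baseUrl,
    ANTHROPIC_API_KEY: apiKey, // assumption: same variable in both branches
    ANTHROPIC_MODEL: model.id,
    CLAUDE_CODE_MAX_CONTEXT_TOKENS: String(contextWindow),
  };
  if (model.modelFormat === 'anthropic') {
    env['CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS'] = '1'; // direct hop only
  }
  return env;
}
```

For the authoritative list of variables, including the ones that are removed, the credential-storage document referenced at the end of this file remains the source of truth.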
+---
+## Switch-menu (catalog) flow
+When favorites exist, `runClaudeCommand` delegates to `launchClaudeViaCatalog`:
+1. `makeRouteResolver(localProviders, zenModels, goModels, zenGoApiKey)` (`src/catalog.ts`) builds a function that maps a `(providerId, modelId)` pair to a `ProxyRoute`.
+2. `buildCatalogRoutes(startingRoute, favorites, resolveRoute)` builds the route list — **starting model + favorites only**, never the full catalog — and reports `droppedFavorites` (favorites whose provider/model is no longer available, silently skipped).
+3. `startProxyCatalog(catalogRoutes, startingRoute.aliasId, trace)` starts one proxy serving all routes.
+4. `buildChildEnv(..., enableGatewayDiscovery = true)` sets `CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY=1` so Claude Code fetches `/v1/models` from the proxy and populates `/model`.
+Each route's id is rewritten by `aliasModelId()` (`src/proxy.ts`) so Claude Code sees unique, gateway-safe names (e.g. `anthropic-opencode-go__deepseek-v4-flash`) in the picker. See [`../integrations/local-proxy.md`](../integrations/local-proxy.md).
+A `__favorites__` pseudo-provider is unshifted onto the provider picker in switch-menu mode; selecting it loads the Favorites Catalog and uses `resolveFirstAvailableFavorite` as the starting model.
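Steps 1 and 2 of the catalog flow can be sketched as below. The `Favorite` and `Route` shapes, the de-duplication by alias, and applying the cap of 20 at this point are assumptions made for illustration; the real `buildCatalogRoutes` in `src/catalog.ts` may organise this differently.

```typescript
interface Favorite { providerId: string; modelId: string }
interface Route { aliasId: string; providerId: string; modelId: string }
type RouteResolver = (providerId: string, modelId: string) => Route | undefined;

const MAX_MODEL_CATALOG_SKETCH = 20; // mirrors the documented cap

// Build "starting model + favorites only", collecting favorites that no longer resolve.
function buildCatalogRoutesSketch(
  startingRoute: Route,
  favorites: Favorite[],
  resolveRoute: RouteResolver,
): { routes: Route[]; droppedFavorites: Favorite[] } {
  const routes: Route[] = [startingRoute];
  const droppedFavorites: Favorite[] = [];
  const seen = new Set<string>([startingRoute.aliasId]);
  for (const fav of favorites.slice(0, MAX_MODEL_CATALOG_SKETCH)) {
    const route = resolveRoute(fav.providerId, fav.modelId);
    if (!route) {
      droppedFavorites.push(fav); // provider or model is gone: skip, but report
      continue;
    }
    if (seen.has(route.aliasId)) continue; // starting model is already listed
    seen.add(route.aliasId);
    routes.push(route);
  }
  return { routes, droppedFavorites };
}
```

Returning the dropped favorites alongside the routes lets a caller decide whether to warn the user or stay silent, as the text describes.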
+---
+## The context-window caveat
+In switch-menu mode the displayed context window reflects the **launch** model and does **not** update on a live `/model` switch. Claude Code's gateway model discovery only carries `id` + `display_name` (no `context_window`) and fetches `/v1/models` once at startup, so `CLAUDE_CODE_MAX_CONTEXT_TOKENS` — fixed at launch by `buildChildEnv` — is the only lever. Single-model launches show the correct window. This is a documented, by-design limitation.
+---
+## Flags and boot shortcuts
+`parseArgs` (`src/cli.ts`) recognises starter flags (`--dry-run`, `--setup`, `--trace`, `--help`, `--version`) and relay launch flags (`--provider`, `--model`). Everything after `--`, and any unrecognised flag, is forwarded verbatim to Claude Code (`claudeArgs`).
+- `--provider X --model Y` (or print mode `-p`) skips the wizard entirely via `planLaunchWizard` / `findProviderAndModel` (`src/launch-target.ts`).
+- `--dry-run` runs the whole wizard but prints a preview (`printDryRun`) and writes nothing — it ignores saved env keys, keyring, tier, and preferences, simulating a fresh first run.
+- `--trace` writes a debug log under `~/.rflectr/logs/` and prints errors on exit (`prepareClaudeTraceLog` / `printTraceLog`).
+Clean-stdout agent mode (`wantsCleanAgentStdout`, `setAgentStdoutMode`) suppresses the interactive intro/spinners when Claude Code is run in print/pipe mode.
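The forwarding rule for flags can be illustrated with a short parser sketch. This is not the real `parseArgs`: the `ParsedArgs` shape, the handling of `--provider` and `--model` as space-separated value flags, and forwarding bare positional arguments are assumptions. Only the flag names and the rule that everything after `--` and every unrecognised flag is forwarded come from the text.

```typescript
interface ParsedArgs {
  flags: Set<string>;
  provider: string | undefined;
  model: string | undefined;
  claudeArgs: string[];
}

const STARTER_FLAGS = new Set(['--dry-run', '--setup', '--trace', '--help', '--version']);

function parseArgsSketch(argv: string[]): ParsedArgs {
  const out: ParsedArgs = {
    flags: new Set<string>(),
    provider: undefined,
    model: undefined,
    claudeArgs: [],
  };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i] as string;
    if (arg === '--') {
      out.claudeArgs.push(...argv.slice(i + 1)); // everything after -- is forwarded
      break;
    }
    if (STARTER_FLAGS.has(arg)) {
      out.flags.add(arg);
      continue;
    }
    if (arg === '--provider' || arg === '--model') {
      const value = argv[i + 1];
      if (value !== undefined) {
        if (arg === '--provider') out.provider = value;
        else out.model = value;
        i++; // consume the value
      }
      continue;
    }
    out.claudeArgs.push(arg); // unrecognised: forwarded verbatim
  }
  return out;
}
```

With this shape, a print-mode flag such as `-p` is not interpreted by the launcher in the sketch and simply ends up in `claudeArgs`.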
+---
+## What the child process receives
+For the exact env contract (`ANTHROPIC_BASE_URL`, `ANTHROPIC_API_KEY`, `ANTHROPIC_MODEL`, `CLAUDE_CODE_MAX_CONTEXT_TOKENS`, the tool-search / system-prompt compat vars, and the removed conflict vars), see [`../security/credential-storage.md`](../security/credential-storage.md#child-process-environment).

package/library/knowledge/private/architecture/system-overview.md ADDED

@@ -0,0 +1,108 @@
+# System Overview
+> Category: Architecture | Version: 1.0 | Date: June 2026 | Status: Active
+Read this first. It explains what `rflectr` is, the surfaces it exposes, and the single translation core that every surface shares. New engineers should read this before diving into any domain doc.
+**Related:**
+- [`launch-flow-claude.md`](launch-flow-claude.md)
+- [`../ai/translation-layer.md`](../ai/translation-layer.md)
+- [`../ai/model-discovery-classification.md`](../ai/model-discovery-classification.md)
+- [`../integrations/local-proxy.md`](../integrations/local-proxy.md)
+- [`../integrations/harnesses.md`](../integrations/harnesses.md)
+- [`../data/provider-registry.md`](../data/provider-registry.md)
+- Source: `src/cli.ts`, `src/constants.ts`
+---
+## Why this exists
+The major agentic coding tools — Claude Code, OpenAI Codex (CLI and desktop), Google Gemini CLI, and Claude Desktop — each speak only to their vendor's own API by default. `rflectr` is a launcher that re-points each of those tools at a *different* model backend without the tool noticing. You can run Claude Code against a Groq Llama model, Codex against DeepSeek, or Gemini CLI against a local Ollama endpoint — picking the model from an interactive wizard, then handing the host tool an environment that makes it believe it is talking to its native API.
+The published npm package and CLI binary are both named `rflectr` (`package.json` is the single source of truth for the version; see [`CLAUDE.md`](../../../../CLAUDE.md) for the release workflow). The repository directory is `rflectr`.
+The hard problem `rflectr` solves is **wire-format translation**: Claude Code emits Anthropic `/v1/messages`, Codex emits the OpenAI Responses API, Gemini CLI emits the Gemini REST protocol — but the chosen model may speak any of those formats (or none of them directly). A local HTTP proxy sits between the host tool and the upstream model and translates in both directions. All non-Anthropic translation flows through one path: the **Vercel AI SDK adapter** (see [`../ai/translation-layer.md`](../ai/translation-layer.md)).
+---
+## The surfaces
+Every surface is a subcommand of the `rflectr` CLI, dispatched from `parseArgs` / `main` in `src/cli.ts`.
+| Command | What it launches | Host wire format | Doc |
+|---|---|---|---|
+| `rflectr claude` | Claude Code CLI | Anthropic `/v1/messages` | [`launch-flow-claude.md`](launch-flow-claude.md) |
+| `rflectr codex` | OpenAI Codex CLI | OpenAI Responses (`/v1/responses`) | [`../integrations/harnesses.md`](../integrations/harnesses.md) |
+| `rflectr codex-app` | Codex desktop app | OpenAI Responses | [`../integrations/harnesses.md`](../integrations/harnesses.md) |
+| `rflectr gemini` | Gemini CLI | Gemini REST (`:generateContent`) | [`../integrations/harnesses.md`](../integrations/harnesses.md) |
+| `rflectr claude-app` | Claude Desktop app | Anthropic (gateway config) | [`../integrations/harnesses.md`](../integrations/harnesses.md) |
+| `rflectr server` | Foreground API gateway | Anthropic + OpenAI compatible | [`../infrastructure/server-gateway.md`](../infrastructure/server-gateway.md) |
+| `rflectr providers` | Provider registry manager | — | [`../data/provider-registry.md`](../data/provider-registry.md) |
+| `rflectr models` / `favorites` | Favorite-model manager | — | [`launch-flow-claude.md`](launch-flow-claude.md) |
+```mermaid
+flowchart TD
+    subgraph hosts["Host tools (unmodified)"]
+        cc["Claude Code"]
+        cx["Codex CLI / app"]
+        gm["Gemini CLI"]
+        cd["Claude Desktop"]
+    end
+    subgraph relay["rflectr"]
+        proxy["Local proxy 127.0.0.1:random"]
+        adapter["Vercel AI SDK adapter"]
+        registry["Provider registry ~/.rflectr/providers.json"]
+    end
+    subgraph upstream["Upstream models"]
+        anth["Anthropic-format endpoints"]
+        sdk["Any @ai-sdk provider (OpenAI, Groq, Gemini, xAI, ...)"]
+    end
+    cc -->|ANTHROPIC_BASE_URL| proxy
+    cx -->|OPENAI base_url| proxy
+    gm -->|GOOGLE_GEMINI_BASE_URL| proxy
+    cd -->|gateway config| proxy
+    proxy -->|"anthropic format"| anth
+    proxy -->|"everything else"| adapter
+    adapter --> sdk
+    registry --> proxy
+```
+---
+## The shared core
+Three subsystems are reused by every surface:
+**1. The provider registry** (`src/registry/`, `src/provider-catalog.ts`). The list of providers and their models lives in `~/.rflectr/providers.json`. It is the single source of truth for what shows up in every wizard. Built-in templates (Groq, Mistral, OpenAI, Ollama, …) are defined in `src/provider-templates.ts`; OpenCode Zen / Go are always available even with no registry. See [`../data/provider-registry.md`](../data/provider-registry.md).
+**2. The translation layer** (`src/sdk-adapter.ts`, `src/provider-factory.ts`). `createLanguageModel({ npm, modelId, apiKey, baseURL })` takes the npm package that OpenCode or the registry assigned to the provider and builds a Vercel AI SDK `LanguageModel` from it. The adapter then maps the host's wire format to and from that model, one turn per request; the host tool always owns its own tool loop. See [`../ai/translation-layer.md`](../ai/translation-layer.md).
+**3. The local proxy** (`src/proxy.ts` and the per-protocol variants `src/codex-proxy.ts`, `src/gemini-proxy.ts`). A throwaway HTTP server on `127.0.0.1:<random port>` that the host tool is pointed at. Anthropic-format models are forwarded raw; everything else goes through the SDK adapter. `aliasModelId()` rewrites non-`claude-*` ids to a gateway-safe form. See [`../integrations/local-proxy.md`](../integrations/local-proxy.md).
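The proxy's routing rule (Anthropic-format upstreams are forwarded raw, everything else goes through the SDK adapter) can be sketched as a small pure function. This is an illustration under assumptions, not the code in `src/proxy.ts`: `UpstreamModel`, `Route`, and `routeUpstream` are hypothetical names, and the model id is a placeholder.

```typescript
// Hypothetical description of the model the user picked in the wizard.
interface UpstreamModel {
  format: "anthropic" | "other"; // wire format the upstream endpoint speaks
  npm: string;                   // AI SDK provider package assigned to it
  modelId: string;
}

// Hypothetical routing result.
type Route =
  | { kind: "forward-raw" }               // pass the request body through as-is
  | { kind: "sdk-adapter"; npm: string }; // translate via the SDK adapter

function routeUpstream(model: UpstreamModel): Route {
  if (model.format === "anthropic") {
    return { kind: "forward-raw" };
  }
  return { kind: "sdk-adapter", npm: model.npm };
}

const rawRoute = routeUpstream({
  format: "anthropic",
  npm: "@ai-sdk/anthropic",
  modelId: "example-anthropic-model", // placeholder id
});
const adapterRoute = routeUpstream({
  format: "other",
  npm: "@ai-sdk/groq",
  modelId: "example-groq-model", // placeholder id
});
console.log(rawRoute.kind, adapterRoute.kind);
```

Keeping the decision in one place means each per-protocol proxy only has to parse its own host format before asking which path to take.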
+---
+## Environment isolation, not config editing
+For the CLI hosts, `rflectr` never edits the host tool's settings file. It launches the child process with a purpose-built environment (`buildChildEnv` in `src/env.ts`) that:
+- **Removes** the 17 conflicting Anthropic/Vertex/Bedrock/AWS/Foundry env vars listed in `CONFLICTING_ENV_VARS` (`src/constants.ts`), so stale cloud config can't leak in.
+- **Sets** `ANTHROPIC_BASE_URL`, `ANTHROPIC_API_KEY`, `ANTHROPIC_MODEL`, and `CLAUDE_CODE_MAX_CONTEXT_TOKENS`.
+This avoids the backup/restore problem that settings-file rewriters have. The one caveat: Claude Code itself persists the launched model to `~/.claude/settings.json`, so a later bare `claude` may still show an `rflectr` model alias. See [`../security/credential-storage.md`](../security/credential-storage.md) for the full env contract.
+The two desktop apps (`claude-app`, `codex-app`) are the exception — they *do* write config files, because the apps have no env to inherit. Those writes are backed up and restored on exit via lock files. See [`../integrations/harnesses.md`](../integrations/harnesses.md).
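The remove-then-set approach for the CLI hosts can be sketched as below. This is a minimal sketch, not the real `buildChildEnv`: `buildChildEnvSketch`, `LaunchTarget`, and `CONFLICTING_SAMPLE` are hypothetical, the three variable names in the sample list are assumptions chosen for illustration (the actual 17 entries live in `CONFLICTING_ENV_VARS`), and the port, key, and model values are placeholders.

```typescript
type Env = Record<string, string | undefined>;

// Illustrative subset only (assumed names); not the real CONFLICTING_ENV_VARS.
const CONFLICTING_SAMPLE = [
  "ANTHROPIC_AUTH_TOKEN",
  "CLAUDE_CODE_USE_BEDROCK",
  "CLAUDE_CODE_USE_VERTEX",
];

// Hypothetical bundle of values the launcher knows after the wizard runs.
interface LaunchTarget {
  proxyUrl: string; // local proxy address, without a trailing /v1
  apiKey: string;
  model: string;
  maxContextTokens: number;
}

function buildChildEnvSketch(parent: Env, target: LaunchTarget): Env {
  // Copy first so the parent environment is never mutated.
  const child: Env = { ...parent };
  // Remove stale cloud configuration that could override the proxy.
  for (const name of CONFLICTING_SAMPLE) {
    delete child[name];
  }
  // Point the host tool at the local proxy.
  child["ANTHROPIC_BASE_URL"] = target.proxyUrl;
  child["ANTHROPIC_API_KEY"] = target.apiKey;
  child["ANTHROPIC_MODEL"] = target.model;
  child["CLAUDE_CODE_MAX_CONTEXT_TOKENS"] = String(target.maxContextTokens);
  return child;
}

const parentEnv: Env = { PATH: "/usr/bin", CLAUDE_CODE_USE_BEDROCK: "1" };
const childEnv = buildChildEnvSketch(parentEnv, {
  proxyUrl: "http://127.0.0.1:49152", // placeholder port
  apiKey: "placeholder-key",
  model: "example-model-alias",
  maxContextTokens: 128000,
});
console.log(childEnv);
```

The returned object would be passed as the `env` option when spawning the host tool, so nothing on disk has to be backed up or restored afterwards.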
+---
+## A critical URL constraint
+`BACKENDS.baseUrl` in `src/constants.ts` must **not** include `/v1`. The Anthropic SDK appends `/v1/messages` automatically, so `https://opencode.ai/zen/v1` would produce requests to `/zen/v1/v1/messages` → 404. The same rule applies anywhere an Anthropic-format `baseUrl` is built. This is the single most common configuration footgun in the codebase.
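The doubled path is easy to reproduce. The sketch below is illustrative only: `joinMessagesUrl` is a hypothetical stand-in that imitates a client appending `/v1/messages` to its base URL, and `assertNoV1Suffix` is a hypothetical guard, not a function from `src/constants.ts`.

```typescript
// Hypothetical stand-in for a client that appends "/v1/messages"
// to whatever base URL it is configured with.
function joinMessagesUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "") + "/v1/messages";
}

// Hypothetical guard: reject a base URL that already ends in "/v1".
function assertNoV1Suffix(baseUrl: string): string {
  if (/\/v1\/*$/.test(baseUrl)) {
    throw new Error(`baseUrl must not end in /v1: ${baseUrl}`);
  }
  return baseUrl;
}

const good = joinMessagesUrl("https://opencode.ai/zen");
const bad = joinMessagesUrl("https://opencode.ai/zen/v1");

console.log(good); // https://opencode.ai/zen/v1/messages
console.log(bad);  // https://opencode.ai/zen/v1/v1/messages
```

A guard of this kind, run wherever an Anthropic-format base URL is assembled, would surface the mistake at configuration time instead of as a 404 at request time.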
+---
+## Where to go next
+- To trace a launch end to end: [`launch-flow-claude.md`](launch-flow-claude.md).
+- To understand how a Groq or DeepSeek model gets spoken to as if it were Anthropic: [`../ai/translation-layer.md`](../ai/translation-layer.md).
+- To understand how the model list is built and classified: [`../ai/model-discovery-classification.md`](../ai/model-discovery-classification.md).
+- To understand credential handling: [`../security/credential-storage.md`](../security/credential-storage.md) and [`../auth/oauth-device-flows.md`](../auth/oauth-device-flows.md).

package/library/knowledge/private/auth/README.md ADDED

@@ -0,0 +1,9 @@
+# Auth
+Provider sign-in flows (distinct from credential storage, which is under `security/`).
+| Doc | Covers |
+|---|---|
+| [`oauth-device-flows.md`](oauth-device-flows.md) | RFC 8628 device flows for OpenAI/ChatGPT, xAI/Grok, GitHub Copilot; token storage and refresh; PKCE. |
+For where keys/tokens are stored and the env contract, see [`../security/credential-storage.md`](../security/credential-storage.md).