@legioncodeinc/rflectr 0.1.0 → 0.1.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +1 -5
- package/dist/cli.js +1 -1
- package/library/README.md +39 -39
- package/library/issues/README.md +46 -46
- package/library/issues/backlog/README.md +26 -26
- package/library/issues/completed/README.md +13 -13
- package/library/issues/in-work/README.md +13 -13
- package/library/knowledge/README.md +34 -34
- package/library/knowledge/private/README.md +40 -40
- package/library/knowledge/private/standards/documentation-framework.md +154 -154
- package/library/knowledge/public/README.md +49 -49
- package/library/notes/README.md +21 -21
- package/library/requirements/README.md +51 -51
- package/library/requirements/backlog/README.md +30 -30
- package/library/requirements/completed/README.md +14 -14
- package/library/requirements/completed/prd-002-provider-registry/prd-002-provider-registry-index.md +263 -0
- package/library/requirements/completed/prd-003-model-discovery-classification/prd-003-model-discovery-classification-index.md +260 -0
- package/library/requirements/completed/prd-004-translation-layer/prd-004-translation-layer-index.md +196 -0
- package/library/requirements/completed/prd-005-local-proxy-catalog-routing/prd-005-local-proxy-catalog-routing-index.md +176 -0
- package/library/requirements/completed/prd-006-credential-storage/prd-006-credential-storage-index.md +190 -0
- package/library/requirements/completed/prd-006-credential-storage/qa/.gitkeep +0 -0
- package/library/requirements/completed/prd-007-oauth-device-flows/prd-007-oauth-device-flows-index.md +208 -0
- package/library/requirements/completed/prd-008-preferences-tiers-favorites/prd-008-preferences-tiers-favorites-index.md +249 -0
- package/library/requirements/completed/prd-008-preferences-tiers-favorites/qa/.gitkeep +0 -0
- package/library/requirements/completed/prd-009-codex-integration/prd-009-codex-integration-index.md +212 -0
- package/library/requirements/completed/prd-009-codex-integration/qa/.gitkeep +0 -0
- package/library/requirements/completed/prd-010-gemini-cli-integration/prd-010-gemini-cli-integration-index.md +211 -0
- package/library/requirements/completed/prd-010-gemini-cli-integration/qa/.gitkeep +0 -0
- package/library/requirements/completed/prd-011-claude-desktop-integration/prd-011-claude-desktop-integration-index.md +228 -0
- package/library/requirements/completed/prd-012-server-gateway/prd-012-server-gateway-index.md +356 -0
- package/library/requirements/completed/prd-012-server-gateway/qa/.gitkeep +0 -0
- package/library/requirements/in-work/README.md +19 -19
- package/library/requirements/reports/README.md +31 -31
- package/package.json +1 -1
|
@@ -1,14 +1,14 @@
----
-ai_description: |
-Contains shipped PRD folders. Entire prd-<###>-<slug>/ folders move
-here from in-work/ when the work ships. Read-only after landing here —
-do NOT edit or move files out of completed/.
-The PRD index, sub-PRDs, and qa/ sub-folder all travel together.
-human_description: |
-Shipped PRD folders. Move entire prd-NNN-slug/ here from in-work/ when
-the feature ships. Read-only — do not edit completed PRDs.
----
-
-# Requirements — Completed
-
-Shipped PRD folders. Entire `prd-<###>-<slug>/` folders land here after the work ships and is confirmed in production. Do not edit files here after landing.
+---
+ai_description: |
+Contains shipped PRD folders. Entire prd-<###>-<slug>/ folders move
+here from in-work/ when the work ships. Read-only after landing here —
+do NOT edit or move files out of completed/.
+The PRD index, sub-PRDs, and qa/ sub-folder all travel together.
+human_description: |
+Shipped PRD folders. Move entire prd-NNN-slug/ here from in-work/ when
+the feature ships. Read-only — do not edit completed PRDs.
+---
+
+# Requirements — Completed
+
+Shipped PRD folders. Entire `prd-<###>-<slug>/` folders land here after the work ships and is confirmed in production. Do not edit files here after landing.
package/library/requirements/completed/prd-002-provider-registry/prd-002-provider-registry-index.md
ADDED
|
@@ -0,0 +1,263 @@
|
|
|
1
|
+
# PRD-002: Provider Registry *(Retroactive)*
|
|
2
|
+
|
|
3
|
+
> **Status:** Shipped
|
|
4
|
+
> **Priority:** —
|
|
5
|
+
> **Effort:** —
|
|
6
|
+
> **Written:** June 2026
|
|
7
|
+
> **Retroactive:** Yes — written after implementation (rflectr v0.2.7).
|
|
8
|
+
> **Source:** `src/registry/*`, `src/provider-templates.ts`, `src/provider-catalog.ts`, `src/providers-command.ts`
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Overview
|
|
13
|
+
|
|
14
|
+
The Provider Registry is the on-disk catalog of AI providers and their cached models — the single source of truth that feeds every launch wizard in rflectr (`claude`, `codex`, `gemini`, `server`). It is persisted to `~/.rflectr/providers.json`, a JSON document that holds provider metadata and model caches but **never** holds secrets: each entry carries an `authRef` pointer into the OS keyring rather than a key.
|
|
15
|
+
|
|
16
|
+
Earlier versions of rflectr discovered providers by spawning OpenCode's `opencode serve` and reading `/config/providers` on *every* launch — slow, and a hard runtime coupling to a running OpenCode process. The registry replaces that model: providers are imported or added **once**, persisted, and read directly on every launch. OpenCode import (`rflectr providers import`) becomes a one-time operation rather than a per-launch dependency, while OpenCode Zen / Go remain available even against an empty registry (via a live OpenCode API key).
|
|
17
|
+
|
|
18
|
+
See the authoritative knowledge doc: [`../../../knowledge/private/data/provider-registry.md`](../../../knowledge/private/data/provider-registry.md).
|
|
19
|
+
|
|
20
|
+
## What Was Built
|
|
21
|
+
|
|
22
|
+
- A typed, versioned on-disk schema (`ProviderRegistry` / `RegistryProvider` / `CachedModel`) persisted to `~/.rflectr/providers.json` with secure permissions (`0o600` file in a `0o700` dir), atomic writes, and a `.bak` backup.
|
|
23
|
+
- A library of **27 built-in provider templates** (Groq, Mistral, OpenAI, Google, Ollama, LM Studio, OpenRouter, Anthropic, Zen/Go stubs, etc.) so a user can add a provider without hand-entering base URLs.
|
|
24
|
+
- A full CRUD surface via the `rflectr providers` command: hub, `add`, `import`, `list`, `remove`, `refresh-models`, `auth`.
|
|
25
|
+
- One-time **OpenCode import** that merges API-key and OAuth providers from `opencode serve` + `auth.json`, with duplicate-provider migration and interactive conflict resolution.
|
|
26
|
+
- **Materialization**: registry entries → runtime `LocalProvider[]` consumed by every wizard, with credential resolution, per-agent model hiding, and Google id normalization.
|
|
27
|
+
- **models.dev capability enrichment** (bundled snapshot + optional live refresh) for reasoning/tool-call metadata.
|
|
28
|
+
- An **SSRF URL-security guard** for custom-endpoint base URLs (DNS resolution + private-range blocking + metadata-host blocklist).
|
|
29
|
+
- **Schema migrations** for legacy cloud ids (`opencode`→`zen`, `opencode-go`→`go`) and OAuth provider id splits (`openai`→`openai-oauth`, `xai`→`xai-oauth`).
|
|
30
|
+
|
|
31
|
+
## Goals
|
|
32
|
+
|
|
33
|
+
1. Make provider discovery a one-time persisted operation, decoupling launch from a running OpenCode process.
|
|
34
|
+
2. Store provider/model metadata locally with secure file permissions and **no secrets on disk**.
|
|
35
|
+
3. Offer a curated template catalog so common providers can be added with a key alone.
|
|
36
|
+
4. Provide a single materialization path that every wizard (Claude/Codex/Gemini/Server) consumes identically.
|
|
37
|
+
5. Keep OpenCode as the source of truth for *which* models a provider exposes; rflectr maintains no per-package allowlist beyond the templates.
|
|
38
|
+
6. Guard custom-endpoint URLs against SSRF.
|
|
39
|
+
|
|
40
|
+
## Non-Goals
|
|
41
|
+
|
|
42
|
+
- Storing API keys or OAuth tokens in `providers.json` (credentials live in the keyring — see [PRD-006](../prd-006-credential-storage/) / [PRD-007](../prd-007-oauth-device-flows/)).
|
|
43
|
+
- Maintaining a per-npm-package model allowlist beyond the template catalog (OpenCode/provider API is authoritative).
|
|
44
|
+
- Supporting providers whose auth cannot be reduced to a single forwarded API key or OAuth token (Bedrock/Azure/Vertex are reference-only templates; Vertex is served only through the dedicated `server --vertex` path).
|
|
45
|
+
- Model classification/format heuristics themselves (owned by [PRD-003](../prd-003-model-discovery-classification/)).
|
|
46
|
+
|
|
47
|
+
## Features
|
|
48
|
+
|
|
49
|
+
| # | Feature | Source | Acceptance |
|---|---------|--------|------------|
| 1 | On-disk schema + secure persistence | `src/registry/types.ts`, `src/registry/io.ts` | [AC-1](#acceptance-criteria) |
| 2 | Built-in template catalog (27 templates) | `src/provider-templates.ts` | [AC-2](#acceptance-criteria) |
| 3 | `rflectr providers` CRUD command | `src/providers-command.ts`, `src/registry/crud.ts` | [AC-3](#acceptance-criteria) |
| 4 | Template add (key test + model fetch) | `src/registry/add-template.ts` | [AC-4](#acceptance-criteria) |
| 5 | Custom-endpoint add (OpenAI/Anthropic) | `src/registry/custom-endpoint.ts` | [AC-5](#acceptance-criteria) |
| 6 | OpenCode one-time import + conflict resolution | `src/registry/import-opencode.ts`, `src/registry/import-build.ts` | [AC-6](#acceptance-criteria) |
| 7 | Materialization → `LocalProvider[]` | `src/registry/materialize.ts`, `src/registry/load.ts` | [AC-7](#acceptance-criteria) |
| 8 | Schema migrations | `src/registry/migrate.ts` | [AC-8](#acceptance-criteria) |
| 9 | models.dev capability enrichment | `src/registry/models-dev.ts` | [AC-9](#acceptance-criteria) |
| 10 | URL-security SSRF guard | `src/registry/url-security.ts` | [AC-10](#acceptance-criteria) |
| 11 | Model-list refresh per `modelSource` | `src/registry/refresh-models.ts` | [AC-11](#acceptance-criteria) |
| 12 | Catalog adapters for pickers/server | `src/provider-catalog.ts` | [AC-12](#acceptance-criteria) |
|
|
63
|
+
|
|
64
|
+
## Architecture & Implementation
|
|
65
|
+
|
|
66
|
+
### Data flow: registry entry → runtime provider
|
|
67
|
+
|
|
68
|
+
```
~/.rflectr/providers.json
  → loadRegistry()                   [io.ts:116 — parse, validate, apply migrations]
  → loadRegistryProviders()          [load.ts:12 — resolve each authRef → credential]
      resolveProviderCredential()    [env.ts — env → keyring → OAuth refresh]
  → materializeRegistry()            [materialize.ts:84 — CachedModel → LocalProviderModel]
      skip disabled / no-models / no-credential
      shouldHideModel() per agent    [model-compatibility.ts]
  → LocalProvider[]                  (the wizard list)
```
|
|
78
|
+
|
|
79
|
+
### On-disk schema & persistence
|
|
80
|
+
|
|
81
|
+
`ProviderRegistry`, `RegistryProvider`, and `CachedModel` are defined in `src/registry/types.ts:9-57`; `REGISTRY_SCHEMA_VERSION = 1` at `types.ts:5`. The provider id pattern `PROVIDER_ID_PATTERN = /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/` lives in `src/registry/validate.ts:6`, with `slugifyProviderId` / `customProviderId` helpers at `validate.ts:12-26`.
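
To make the id rule concrete, here is a minimal sketch of how a display name could be normalized so that it satisfies `PROVIDER_ID_PATTERN`. The pattern is quoted from the text above; `slugifyProviderIdSketch` and `isValidProviderId` are hypothetical names, and the normalization steps are an assumption rather than the actual helper in `validate.ts`.

```typescript
// Pattern quoted from the PRD text (src/registry/validate.ts:6).
const PROVIDER_ID_PATTERN = /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/;

// Hypothetical normalizer (assumption, not the real slugifyProviderId):
// lower-case, collapse each run of non-alphanumerics into one hyphen,
// then trim hyphens from both ends.
function slugifyProviderIdSketch(input: string): string {
  return input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// An id is valid when it starts and ends with [a-z0-9] and contains
// only [a-z0-9-] in between.
function isValidProviderId(id: string): boolean {
  return PROVIDER_ID_PATTERN.test(id);
}

const exampleId = slugifyProviderIdSketch("LM Studio");
```

Note that an input with no alphanumeric characters slugifies to the empty string, which the pattern rejects, so a caller would still need to validate the result.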
|
|
82
|
+
|
|
83
|
+
`loadRegistry(path?)` (`src/registry/io.ts:116`) returns an empty registry when the file is missing or unparseable, runs the three migrations, and persists if any migration changed the data. `parseRegistry`/`parseProvider` (`io.ts:52-114`) defensively validate every field and drop malformed providers. `saveRegistry` (`io.ts:139`) writes a `.bak` copy, writes to a `.tmp` file via `writeSecureFile` (`io.ts:36` — `openSync` with `0o600`, then `chmodSync`), then `renameSync`s atomically. `ensureSecureAppHome` (`io.ts:26`) creates `~/.rflectr` at `0o700`. The JSON is 2-space indented with a trailing newline (`io.ts:140`). **No secrets** are written: `authRef` is a pointer string only.
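
The save sequence described above (backup, temp file at `0o600`, `chmod`, atomic rename) can be sketched with Node's `fs` module. This is an illustration under stated assumptions, not the code in `io.ts`: `saveJsonAtomically` is a hypothetical name, and the demo writes to a throwaway temp directory rather than `~/.rflectr`.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper mirroring the described sequence (assumption).
function saveJsonAtomically(filePath: string, data: unknown): void {
  // Create the parent directory with owner-only permissions if missing.
  fs.mkdirSync(path.dirname(filePath), { recursive: true, mode: 0o700 });

  // 1. Keep the previous version as a .bak copy, if one exists.
  if (fs.existsSync(filePath)) {
    fs.copyFileSync(filePath, `${filePath}.bak`);
  }

  // 2. Write 2-space-indented JSON plus a trailing newline to a temp file.
  const tmpPath = `${filePath}.tmp`;
  const json = JSON.stringify(data, null, 2) + "\n";
  const fd = fs.openSync(tmpPath, "w", 0o600);
  try {
    fs.writeSync(fd, json);
  } finally {
    fs.closeSync(fd);
  }
  // The mode passed to openSync only applies when the file is created,
  // so set it explicitly as well.
  fs.chmodSync(tmpPath, 0o600);

  // 3. Rename over the target; on POSIX this replaces the file atomically.
  fs.renameSync(tmpPath, filePath);
}

// Demo against a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "registry-demo-"));
const demoFile = path.join(demoDir, "providers.json");
saveJsonAtomically(demoFile, { schemaVersion: 1, providers: [] });
saveJsonAtomically(demoFile, { schemaVersion: 1, providers: [{ id: "groq" }] });
```

After the second call the target holds the new document and the `.bak` file holds the previous one, so a crash between steps leaves either the old or the new file in place, never a half-written one.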
|
|
84
|
+
|
|
85
|
+
### Built-in templates
|
|
86
|
+
|
|
87
|
+
`src/provider-templates.ts:26` defines `PROVIDER_TEMPLATES` (27 entries). Each `ProviderTemplate` (`provider-templates.ts:8-23`) carries `id`, `name`, `authType` (`api`/`oauth`/`none`), `npm` (SDK package), optional `defaultBaseUrl`/`signupUrl`/`urlPrompt`/`apiKeyOptional`, `modelSource` (`api-list` | `static-seed` | `manual-only` | `zen-go-api`), and `supported`/`addable`/`unsupportedReason` flags. Query helpers: `listSupportedTemplates` (`:310`, filters to `supported && authType==='api' && addable!==false`), `listAddableTemplates` (`:317`, removes already-configured ids and collapses Zen/Go under `opencode-cloud`), `getTemplateById` (`:327`), `filterTemplates` (`:331`).
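
As an illustration of the `listSupportedTemplates` filter quoted above, the sketch below applies the same predicate to a reduced template shape. `TemplateSketch`, `listSupportedSketch`, and the sample rows are hypothetical; the sample flag values are invented for the demo and are not taken from `PROVIDER_TEMPLATES`.

```typescript
type AuthType = "api" | "oauth" | "none";

// Reduced, hypothetical subset of the ProviderTemplate fields.
interface TemplateSketch {
  id: string;
  authType: AuthType;
  supported: boolean;
  addable?: boolean;
}

// Predicate as described in the PRD:
// supported && authType === 'api' && addable !== false.
function listSupportedSketch(templates: TemplateSketch[]): TemplateSketch[] {
  return templates.filter(
    (t) => t.supported && t.authType === "api" && t.addable !== false,
  );
}

// Invented sample data, for illustration only.
const sampleTemplates: TemplateSketch[] = [
  { id: "groq", authType: "api", supported: true },
  { id: "bedrock", authType: "api", supported: false },
  { id: "zen", authType: "api", supported: true, addable: false },
  { id: "github-copilot", authType: "oauth", supported: true },
];

const supportedIds = listSupportedSketch(sampleTemplates).map((t) => t.id);
```

Treating a missing `addable` flag as addable (`addable !== false`) means only templates that opt out explicitly are hidden.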
|
|
88
|
+
|
|
89
|
+
Reference-only (not addable) templates carry `supported:false` + `unsupportedReason`: `bedrock` (`:242`), `azure` (`:250`), `vertex` (`:259`). Zen/Go stubs (`zen`/`go`) are `addable:false` (`:285`/`:295`); `opencode-cloud` (`:269`) is the addable proxy that routes to them. `github-copilot` is an `oauth` template (`:299`).
|
|
90
|
+
|
|
91
|
+
### The `providers` command
|
|
92
|
+
|
|
93
|
+
`src/providers-command.ts` parses args (`parseProvidersArgs`, `:51`) and dispatches (`runProvidersCommand`, `:798`):
|
|
94
|
+
|
|
95
|
+
| Command | Function | Notes |
|---|---|---|
| `rflectr providers` | `runProvidersHub` (`:727`) | Interactive hub loop |
| `… add` | `runProvidersAdd` (`:564`) | Import / template / custom endpoint |
| `… import` | `runProvidersImport` (`:131`) | One-time OpenCode import w/ conflict panel |
| `… list` | `runProvidersList` (`:304`) | Tabular, via `resolveProvidersForDisplay` |
| `… remove <id>` | `runProvidersRemove` (`:607`) → `removeProviderFromRegistry` | Entry + keyring cleanup |
| `… refresh-models [id]` | `runProvidersRefreshModels` (`:236`) | Re-fetch model cache |
| `… auth <id> [--native\|--broker]` | `runProvidersAuth` (`:221`) | OAuth sign-in — see [PRD-007](../prd-007-oauth-device-flows/) |
|
|
104
|
+
|
|
105
|
+
CRUD primitives live in `src/registry/crud.ts`: `removeProviderFromRegistry` (`:23`, with safe keyring deletion gated by `credentialStillReferenced`, `:18`), `toggleProviderEnabled` (`:87`), `addZenRegistryStub`/`addGoRegistryStub` (`:54`/`:66`), `setRegistrySubscriptionFilter` (`:76`).
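
The "safe keyring deletion" idea can be sketched as follows: a credential is deleted only when no remaining entry still points at the same `authRef`. The function names echo the ones above, but the bodies, the `EntrySketch` shape, and the `deleteCredential` callback are assumptions made for illustration; the real `crud.ts` may differ.

```typescript
// Hypothetical reduced entry shape.
interface EntrySketch {
  id: string;
  authRef: string;
}

// True when some remaining entry still uses this credential pointer.
function credentialStillReferencedSketch(remaining: EntrySketch[], authRef: string): boolean {
  return remaining.some((e) => e.authRef === authRef);
}

// Removes the entry and deletes its keyring credential only if the
// pointer is a keyring reference that nothing else shares (assumption).
function removeProviderSketch(
  entries: EntrySketch[],
  id: string,
  deleteCredential: (authRef: string) => void,
): EntrySketch[] {
  const target = entries.find((e) => e.id === id);
  if (target === undefined) return entries;
  const remaining = entries.filter((e) => e.id !== id);
  const isKeyringRef = target.authRef.startsWith("keyring:");
  if (isKeyringRef && !credentialStillReferencedSketch(remaining, target.authRef)) {
    deleteCredential(target.authRef);
  }
  return remaining;
}

// Demo: Zen and Go share one global key, Groq has its own.
const deletedRefs: string[] = [];
const recordDeletion = (ref: string): void => {
  deletedRefs.push(ref);
};
let registryEntries: EntrySketch[] = [
  { id: "zen", authRef: "keyring:global:opencode" },
  { id: "go", authRef: "keyring:global:opencode" },
  { id: "groq", authRef: "keyring:provider:groq" },
];
registryEntries = removeProviderSketch(registryEntries, "zen", recordDeletion);
registryEntries = removeProviderSketch(registryEntries, "groq", recordDeletion);
```

In the demo, removing `zen` leaves the shared key in place because `go` still references it, while removing `groq` deletes its dedicated key.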
|
|
106
|
+
|
|
107
|
+
### Template add
|
|
108
|
+
|
|
109
|
+
`addProviderFromTemplate` (`src/registry/add-template.ts:42`) probes that the SDK package is importable (`probeTemplatePackage`, `:27`, guarded by `isSdkMigratedNpm`), rejects an existing provider unless `replaceExisting` is set, fetches the live model list (`fetchTemplateModels`), saves the credential to `keyring:provider:<id>`, enriches the models with pricing (`enrichModelsWithPricing`), writes the entry, and kicks off async pricing enrichment (`enrichPricingAsync`).
|
|
110
|
+
|
|
111
|
+
### Custom endpoints
|
|
112
|
+
|
|
113
|
+
`addCustomEndpointProvider` (`src/registry/custom-endpoint.ts:136`) validates the base URL through the SSRF guard, allocates a unique `custom-…` id (`uniqueProviderId`, `:121`), fetches models via the Anthropic path (`fetchAnthropicModels`, `:41`) or the OpenAI-compatible template path, saves the key (or the placeholder `'local'` for keyless local servers), and writes a `custom-anthropic`/`custom-openai` entry.
|
|
114
|
+
|
|
115
|
+
### OpenCode import
|
|
116
|
+
|
|
117
|
+
`importFromOpencode` (`src/registry/import-opencode.ts:102`) fetches raw providers from `opencode serve`, reads `auth.json`, and merges API-key + OAuth providers via `buildImportProviderList` (`src/registry/import-build.ts:40`). It runs the legacy-cloud migration, validates each key (`validateImportKey`), resolves conflicts through an injected `resolveConflict` callback (the hub renders a panel and prompts keep/import/skip), saves credentials (API key → `keyring:provider:<id>`; OAuth → `keyring:oauth:provider:<id>`), and records skip reasons. OAuth provider ids are split via `toOAuthRegistryId` (`import-build.ts:23` — `openai`→`openai-oauth`, `xai`→`xai-oauth`). `listCredentialSkippedProviders` (`import-build.ts:112`) surfaces only actionable gaps (OAuth sign-in needed, or a provider already in the registry). `localProviderToRegistry` (`src/registry/convert.ts:28`) performs the structural conversion (no secret write).
|
|
118
|
+
|
|
119
|
+
### Materialization
|
|
120
|
+
|
|
121
|
+
`materializeRegistry` (`src/registry/materialize.ts:84`) iterates providers and calls `materializeOne` (`:54`): skips disabled / invalid-id providers, converts each `CachedModel` via `cachedModelToLocal` (`:21`), drops models whose endpoint can't be resolved (`resolveEndpoint`, `src/providers.ts:29`) and models hidden by `shouldHideModel`, and finally drops the whole provider when it has **no models** or **no credential** (`:69-72`). Google ids/display names are normalized (`normalizeGoogleModelId`/`normalizeGoogleDisplayName`); context windows and reasoning fall back to `resolveContextWindow` and the models.dev row (`findModelsDevModel`). `loadRegistryProviders` (`src/registry/load.ts:12`) resolves credentials + OAuth account ids before materializing.
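
The drop rules in this section can be condensed into a small sketch: skip disabled entries, filter hidden models, then drop any provider left with no models or no credential. All type and function names below are hypothetical and heavily simplified (no endpoint resolution, Google normalization, or models.dev lookup); the order of checks is an assumption.

```typescript
// Hypothetical, simplified shapes.
interface EntryToMaterialize {
  id: string;
  enabled: boolean;
  modelIds: string[];
  credential?: string; // already resolved from authRef, if any
}

interface LocalProviderSketch {
  id: string;
  apiKey: string;
  modelIds: string[];
}

function materializeSketch(
  entries: EntryToMaterialize[],
  shouldHide: (providerId: string, modelId: string) => boolean,
): LocalProviderSketch[] {
  const out: LocalProviderSketch[] = [];
  for (const entry of entries) {
    if (!entry.enabled) continue; // disabled providers are skipped
    const visible = entry.modelIds.filter((m) => !shouldHide(entry.id, m));
    // Drop the whole provider when nothing usable is left.
    if (visible.length === 0 || entry.credential === undefined) continue;
    out.push({ id: entry.id, apiKey: entry.credential, modelIds: visible });
  }
  return out;
}

// Demo with an invented hide rule: hide any model whose id ends in "-image".
const materialized = materializeSketch(
  [
    { id: "groq", enabled: true, modelIds: ["llama", "gen-image"], credential: "k1" },
    { id: "mistral", enabled: false, modelIds: ["large"], credential: "k2" },
    { id: "openai-oauth", enabled: true, modelIds: ["gpt"] },
    { id: "custom-x", enabled: true, modelIds: ["only-image"], credential: "k3" },
  ],
  (_providerId, modelId) => modelId.endsWith("-image"),
);
```

Only `groq` survives in the demo: `mistral` is disabled, `openai-oauth` has no credential, and `custom-x` loses its only model to the hide rule.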
|
|
122
|
+
|
|
123
|
+
### Schema migrations
|
|
124
|
+
|
|
125
|
+
`src/registry/migrate.ts`, applied inside `loadRegistry`:
|
|
126
|
+
- `migrateLegacyCloudProviders` (`:10`) — `opencode`→`zen`, `opencode-go`→`go` (collapsing a duplicate, otherwise rewriting id/templateId/name and clearing `api`).
|
|
127
|
+
- `migrateOAuthOpenAiProvider` (`:37`) — `{id:'openai', authType:'oauth'}`→`openai-oauth` so it can coexist with the API-key `openai`, preserving the original `authRef`.
|
|
128
|
+
- `migrateOAuthXaiProvider` (`:56`) — `{id:'xai', authType:'oauth'}`→`xai-oauth`.
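
The three migrations listed above follow two patterns, sketched below: a legacy-id rename that collapses duplicates, and an OAuth id split that keeps `authRef` untouched. The function names and the `ProviderSketch` shape are hypothetical, and the sketch rewrites only `id` and `templateId` (the text notes the real migration also rewrites `name` and clears `api`).

```typescript
// Hypothetical reduced provider shape.
interface ProviderSketch {
  id: string;
  templateId: string;
  authType?: "api" | "oauth" | "none";
  authRef: string;
}

// Legacy cloud ids and their replacements, as listed in the PRD.
const LEGACY_CLOUD_IDS = new Map<string, string>([
  ["opencode", "zen"],
  ["opencode-go", "go"],
]);

function migrateLegacyCloudSketch(providers: ProviderSketch[]): ProviderSketch[] {
  const result: ProviderSketch[] = [];
  for (const p of providers) {
    const target = LEGACY_CLOUD_IDS.get(p.id);
    if (target === undefined) {
      result.push(p);
    } else if (!providers.some((other) => other.id === target)) {
      // No entry with the new id yet: rename in place.
      result.push({ ...p, id: target, templateId: target });
    }
    // Otherwise the new id already exists, so the legacy duplicate is dropped.
  }
  return result;
}

// Splits an OAuth entry off to "<baseId>-oauth" so it can coexist with
// the API-key entry of the same base id; authRef is preserved as-is.
function splitOAuthIdSketch(providers: ProviderSketch[], baseId: "openai" | "xai"): ProviderSketch[] {
  return providers.map((p) =>
    p.id === baseId && p.authType === "oauth" ? { ...p, id: `${baseId}-oauth` } : p,
  );
}

const migrated = splitOAuthIdSketch(
  migrateLegacyCloudSketch([
    { id: "opencode", templateId: "opencode", authRef: "keyring:global:opencode" },
    { id: "opencode-go", templateId: "opencode-go", authRef: "keyring:global:opencode" },
    { id: "go", templateId: "go", authRef: "keyring:global:opencode" },
    { id: "openai", templateId: "openai", authType: "oauth", authRef: "keyring:oauth:provider:openai" },
  ]),
  "openai",
);
```

Both functions return new arrays instead of mutating their input, which makes it straightforward for a caller to detect whether anything changed before persisting.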
|
|
129
|
+
|
|
130
|
+
### models.dev enrichment
|
|
131
|
+
|
|
132
|
+
`src/registry/models-dev.ts` ships a bundled snapshot (`loadBundledModelsDevCache`, `:92`, from `src/data/models-dev-cache.json`) and supports an optional live refresh (`fetchModelsDevCache`, `:179`, 15s timeout, written `0o600` to `~/.rflectr/models-dev-cache.json`). `findModelsDevModel` (`:212`) resolves the provider slug via `REGISTRY_TO_MODELS_DEV` (`:60`) and matches normalized model-id candidates; `shouldHideByModelsDevCapabilities` (`:229`) is the conservative auto-hide rule (non-text output, `tool_call===false`, interaction-only models). `refreshModelsDevCacheAsync` (`:205`) refreshes in the background.
|
|
133
|
+
|
|
134
|
+
### URL-security (SSRF guard)
|
|
135
|
+
|
|
136
|
+
`validateCustomEndpointUrl` (`src/registry/url-security.ts:65`): requires HTTPS (HTTP only when `allowInsecureLocal` and the host resolves to loopback), blocks a metadata-host blocklist (`169.254.169.254`, `metadata.google.internal`, ECS task metadata — `:20`), DNS-resolves the hostname, and blocks any address in loopback/private/link-local/unique-local/CGNAT ranges via `isBlockedIp` (`:27`, using `ipaddr.js`). Unparseable inputs fail closed. Returns a normalized, trailing-slash-stripped URL on success.
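
To show what the address check amounts to, here is a dependency-free sketch of an IPv4-only blocklist that fails closed on unparseable input. It is a simplification under stated assumptions: the described implementation uses `ipaddr.js` and also covers IPv6 ranges such as unique-local, which this sketch does not; `parseIpv4` and `isBlockedIpv4Sketch` are hypothetical names.

```typescript
type Octets = [number, number, number, number];

// Parses dotted-quad IPv4; returns null for anything else.
function parseIpv4(ip: string): Octets | null {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(ip);
  if (m === null) return null;
  const octets: Octets = [Number(m[1]), Number(m[2]), Number(m[3]), Number(m[4])];
  return octets.every((o) => o <= 255) ? octets : null;
}

// Hypothetical IPv4-only stand-in for the isBlockedIp check.
function isBlockedIpv4Sketch(ip: string): boolean {
  const octets = parseIpv4(ip);
  if (octets === null) return true; // fail closed on unparseable input
  const [a, b] = octets;
  return (
    a === 0 || // 0.0.0.0/8 "this network"
    a === 10 || // 10.0.0.0/8 private
    a === 127 || // 127.0.0.0/8 loopback
    (a === 169 && b === 254) || // 169.254.0.0/16 link-local (incl. cloud metadata)
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 private
    (a === 192 && b === 168) || // 192.168.0.0/16 private
    (a === 100 && b >= 64 && b <= 127) // 100.64.0.0/10 CGNAT
  );
}
```

A guard of this kind would be applied to every address returned by DNS resolution, not just the first, since a hostname can resolve to a mix of public and private addresses.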
|
|
137
|
+
|
|
138
|
+
### Model-list refresh
|
|
139
|
+
|
|
140
|
+
`refreshProviderModels` (`src/registry/refresh-models.ts:321`) branches on `resolveModelSource`: `manual-only` is skipped with a hint; `zen-go-api` re-fetches via `getModels(BACKENDS[...])` (`refreshZenGoProvider`, `:76`); OAuth providers use the live-or-seed strategy (`refreshOAuthProvider`, `:97`, with the OpenAI 3-tier ChatGPT-backend strategy at `:182` and the xAI strategy at `:223`); everything else hits the API list (`refreshApiListProvider`, `:248`) with placeholder-key detection and a "keep cached models" fallback on auth rejection. `refreshAllProviderModels` (`:445`) auto-seeds Zen/Go stubs when a global OpenCode key exists, then refreshes every enabled provider.
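
The branching described above can be sketched as a strategy picker plus the "keep cached models" fallback. The `ModelSource` values come from the template description earlier in this document; everything else (type names, the order of checks, treating HTTP 401/403 as "auth rejection", and throwing on other failures) is an assumption made for illustration.

```typescript
type ModelSource = "api-list" | "static-seed" | "manual-only" | "zen-go-api";
type RefreshStrategy = "skip" | "zen-go" | "oauth-live-or-seed" | "api-list";

// Hypothetical reduced input shape.
interface RefreshTargetSketch {
  modelSource: ModelSource;
  authType: "api" | "oauth" | "none";
}

function pickRefreshStrategy(p: RefreshTargetSketch): RefreshStrategy {
  if (p.modelSource === "manual-only") return "skip";
  if (p.modelSource === "zen-go-api") return "zen-go";
  if (p.authType === "oauth") return "oauth-live-or-seed";
  return "api-list"; // everything else hits the provider's model-list API
}

type FetchResult =
  | { ok: true; models: string[] }
  | { ok: false; status: number };

// On auth rejection, keep the cached list and surface a warning
// instead of wiping the provider's models.
function mergeRefreshResult(
  cached: string[],
  result: FetchResult,
): { models: string[]; warning?: string } {
  if (result.ok) return { models: result.models };
  if (result.status === 401 || result.status === 403) {
    return { models: cached, warning: "auth rejected; keeping cached models" };
  }
  throw new Error(`model refresh failed with status ${result.status}`);
}
```

Keeping the stale cache on auth rejection trades freshness for availability, which matches the limitation noted later under stale cached models.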
|
|
141
|
+
|
|
142
|
+
### Catalog adapters
|
|
143
|
+
|
|
144
|
+
`src/provider-catalog.ts` adapts materialized providers for consumers: `resolveLocalProviders`/`fetchProviderCatalog` (`:38`/`:44`), `providersForPicker` (`:85`, merges live Zen/Go when not already in the registry, sorts), `resolveLocalProviderApiKey` (`:109`), `formatRegistryAuthLabel` (`:120`), `resolveProvidersForDisplay` (`:161`, the `providers list`/hub row builder), `localProvidersToServerModels`/`zenGoModelsToServerModels` (`:228`/`:259`, used by the server gateway — see [PRD-012](../prd-012-server-gateway/)).
|
|
145
|
+
|
|
146
|
+
## Configuration & Data Shapes
|
|
147
|
+
|
|
148
|
+
Path: `getProvidersPath()` → `~/.rflectr/providers.json` (override the home with `RFLECTR_HOME`).
|
|
149
|
+
|
|
150
|
+
```ts
// src/registry/types.ts
interface ProviderRegistry {
  schemaVersion: number            // currently 1 (REGISTRY_SCHEMA_VERSION)
  providers: RegistryProvider[]
  importedAt?: string              // last OpenCode import (ISO)
  pricingCacheAt?: string
}

interface RegistryProvider {
  id: string                       // PROVIDER_ID_PATTERN: /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/
  templateId: string               // origin template, e.g. 'groq', 'custom-openai'
  name: string                     // display name
  enabled: boolean
  authRef: string                  // pointer ONLY: 'keyring:provider:groq' |
                                   // 'keyring:global:opencode' | 'keyring:oauth:provider:xai-oauth' | 'env:…'
  authType?: 'api' | 'oauth' | 'none'
  subscriptionFilter?: 'free' | 'zen' | 'go'
  api: { npm?: string; url?: string; id?: string }
  modelsCache?: { fetchedAt: string; models: CachedModel[] }
  addedAt: string                  // ISO
  refreshedAt?: string
}

interface CachedModel {
  id: string
  name: string
  upstreamModelId: string          // provider's native id for the wire call
  family?: string; brand?: string
  contextWindow?: number
  cost?: { input: number; output: number }
  modelFormat: 'anthropic' | 'openai'
  npm?: string                     // per-model override of provider.api.npm
  apiUrl?: string                  // per-model override of provider.api.url
  sourceBackend?: string
  supportedParameters?: string[]
  reasoning?: boolean
  interleavedReasoningField?: string
}
```
|
|
190
|
+
|
|
191
|
+
**`authRef` forms** (resolved by `resolveProviderCredential`): `keyring:provider:<id>` (per-provider key), `keyring:global:opencode` (shared OpenCode key, used by Zen/Go), `keyring:oauth:provider:<id>` (OAuth credential JSON), `env:VARNAME` (env-var fallback).
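
The four `authRef` forms can be told apart with a small parser, sketched below. The prefixes are taken from the list above; the `ParsedAuthRef` union and `parseAuthRef` are hypothetical names, and returning `null` for unknown or empty forms is an assumption about how a caller might want to fail.

```typescript
// Hypothetical tagged union covering the four documented forms.
type ParsedAuthRef =
  | { kind: "provider-key"; providerId: string }
  | { kind: "global-opencode" }
  | { kind: "oauth"; providerId: string }
  | { kind: "env"; variable: string };

function parseAuthRef(ref: string): ParsedAuthRef | null {
  const oauthPrefix = "keyring:oauth:provider:";
  const providerPrefix = "keyring:provider:";
  const envPrefix = "env:";

  if (ref === "keyring:global:opencode") return { kind: "global-opencode" };
  if (ref.startsWith(oauthPrefix)) {
    const providerId = ref.slice(oauthPrefix.length);
    return providerId.length > 0 ? { kind: "oauth", providerId } : null;
  }
  if (ref.startsWith(providerPrefix)) {
    const providerId = ref.slice(providerPrefix.length);
    return providerId.length > 0 ? { kind: "provider-key", providerId } : null;
  }
  if (ref.startsWith(envPrefix)) {
    const variable = ref.slice(envPrefix.length);
    return variable.length > 0 ? { kind: "env", variable } : null;
  }
  return null; // unknown form: let the caller treat it as "no credential"
}
```

Because the value is only ever a pointer, a parser like this never touches secret material; resolving the pointer to an actual key remains the job of the credential layer.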
|
|
192
|
+
|
|
193
|
+
## Acceptance Criteria
|
|
194
|
+
|
|
195
|
+
- [x] **AC-1 — Schema & secure persistence.** `providers.json` round-trips through `loadRegistry`/`saveRegistry`; the file is written `0o600` inside a `0o700` dir via atomic tmp+rename with a `.bak` backup; malformed/missing files yield an empty registry; **no secrets** are stored (`authRef` is a pointer). `src/registry/io.ts:26,36,116,139`; covered by `tests/registry.test.ts`.
|
|
196
|
+
- [x] **AC-2 — Template catalog.** 27 built-in templates exist; `listSupportedTemplates`/`listAddableTemplates`/`getTemplateById`/`filterTemplates` filter by support, configured-state, and query; reference-only templates (Bedrock/Azure/Vertex) carry `unsupportedReason`. `src/provider-templates.ts:26,310-340`; covered by `tests/provider-templates.test.ts`.
|
|
197
|
+
- [x] **AC-3 — CRUD command.** `parseProvidersArgs` resolves `hub`/`add`/`import`/`list`/`remove`/`refresh-models`/`auth` with argument validation and help text. `src/providers-command.ts:51,104,798`; covered by `tests/providers-command.test.ts`.
|
|
198
|
+
- [x] **AC-4 — Template add.** `addProviderFromTemplate` probes the SDK package, tests the key by fetching models, rejects empty keys and existing providers (unless replacing), persists credential + entry, and enriches pricing. `src/registry/add-template.ts:42`; covered by `tests/registry-add-template.test.ts`.
|
|
199
|
+
- [x] **AC-5 — Custom endpoints.** `addCustomEndpointProvider` validates the URL, allocates a unique `custom-…` id, supports OpenAI-compatible and Anthropic-style servers, and persists a keyless `'local'` placeholder for local servers. `src/registry/custom-endpoint.ts:121,136`.
|
|
200
|
+
- [x] **AC-6 — OpenCode import.** `importFromOpencode` merges API-key + OAuth providers, validates keys, resolves duplicate-provider conflicts (keep/import/skip), saves credentials to the keyring, and reports skip reasons; OAuth ids split to `<id>-oauth`. `src/registry/import-opencode.ts:102`, `src/registry/import-build.ts:23,40,112`; covered by `tests/import-opencode.test.ts`.
|
|
201
|
+
- [x] **AC-7 — Materialization.** `materializeRegistry` converts enabled entries to `LocalProvider[]`, dropping providers with no credential or no cached models and applying per-agent `shouldHideModel`. `src/registry/materialize.ts:54,84`, `src/registry/load.ts:12`.
|
|
202
|
+
- [x] **AC-8 — Migrations.** Legacy `opencode`/`opencode-go` ids migrate to `zen`/`go`; OAuth `openai`/`xai` split to `openai-oauth`/`xai-oauth`, preserving `authRef`. `src/registry/migrate.ts:10,37,56`; run inside `loadRegistry` (`io.ts:123`).
|
|
203
|
+
- [x] **AC-9 — models.dev enrichment.** A bundled snapshot ships in the binary; an optional live refresh writes `0o600` to `~/.rflectr/models-dev-cache.json`; `findModelsDevModel` supplies reasoning/interleaved metadata during materialization. `src/registry/models-dev.ts:92,179,212`.
|
|
204
|
+
- [x] **AC-10 — SSRF guard.** `validateCustomEndpointUrl` blocks metadata hosts and private/loopback/link-local/CGNAT addresses after DNS resolution, allows loopback HTTP only with `allowInsecureLocal`, and fails closed on parse errors. `src/registry/url-security.ts:20,27,65`; covered by `tests/url-security.test.ts`.
|
|
205
|
+
- [x] **AC-11 — Model refresh.** `refreshProviderModels` re-fetches per `modelSource` (zen-go / OAuth live-or-seed / api-list), detects placeholder keys, and keeps cached models on auth rejection. `src/registry/refresh-models.ts:248,321,445`; covered by `tests/registry-refresh-models.test.ts`.
|
|
206
|
+
- [x] **AC-12 — Catalog adapters.** `providersForPicker` / `resolveProvidersForDisplay` / `localProvidersToServerModels` / `zenGoModelsToServerModels` adapt materialized providers for the CLI pickers, `providers list`, and the server gateway. `src/provider-catalog.ts:85,161,228,259`; covered by `tests/provider-catalog-display.test.ts`.
|
|
207
|
+
|
|
208
|
+
## Files
|
|
209
|
+
|
|
210
|
+
### Primary
|
|
211
|
+
- `src/registry/types.ts` — schema (`ProviderRegistry`, `RegistryProvider`, `CachedModel`, `REGISTRY_SCHEMA_VERSION`).
|
|
212
|
+
- `src/registry/io.ts` — load/save with secure perms, parse/validate, migration trigger.
|
|
213
|
+
- `src/registry/load.ts` — credential resolution + materialization entry (`loadRegistryProviders`).
|
|
214
|
+
- `src/registry/materialize.ts` — registry entries → `LocalProvider[]`.
|
|
215
|
+
- `src/registry/crud.ts` — add/remove/toggle, Zen/Go stubs, subscription filter.
|
|
216
|
+
- `src/registry/validate.ts` — provider-id pattern + slugify/custom-id helpers.
|
|
217
|
+
- `src/registry/builtins.ts` — Zen/Go registry stub entries.
|
|
218
|
+
- `src/registry/migrate.ts` — legacy-cloud + OAuth-id migrations.
|
|
219
|
+
- `src/registry/import-opencode.ts` + `src/registry/import-build.ts` — one-time OpenCode import + merge logic.
|
|
220
|
+
- `src/registry/convert.ts` — `LocalProvider` ↔ `RegistryProvider`.
|
|
221
|
+
- `src/registry/add-template.ts` — template add flow.
|
|
222
|
+
- `src/registry/custom-endpoint.ts` — custom OpenAI/Anthropic endpoint add + `fetchAnthropicModels`.
|
|
223
|
+
- `src/registry/resolve-template.ts` — imported-id → template + default base URL resolution.
|
|
224
|
+
- `src/registry/refresh-models.ts` — model-list refresh per `modelSource`.
|
|
225
|
+
- `src/registry/models-dev.ts` — models.dev capability cache (bundled + live).
|
|
226
|
+
- `src/registry/url-security.ts` — SSRF guard for custom URLs.
|
|
227
|
+
- `src/provider-templates.ts` — the 27 built-in templates + query helpers.
|
|
228
|
+
- `src/provider-catalog.ts` — picker/display/server adapters.
|
|
229
|
+
- `src/providers-command.ts` — the `rflectr providers` command.
|
|
230
|
+
|
|
231
|
+
### Supporting
|
|
232
|
+
- `src/registry/index.ts` — public barrel re-exports.
|
|
233
|
+
- `src/registry/fetch-template-models.ts` — live model-list fetch per template.
|
|
234
|
+
- `src/registry/google-model-id.ts` — Google model-id / display-name normalization.
|
|
235
|
+
- `src/registry/model-source.ts` — `resolveModelSource(provider)`.
|
|
236
|
+
- `src/registry/pricing.ts` — pricing index + async enrichment.
|
|
237
|
+
- `src/registry/refresh-credentials.ts` — credential resolution + placeholder-key detection for refresh.
|
|
238
|
+
- `src/registry/validate-import-key.ts` — import key validation.
|
|
239
|
+
- `src/registry/opencode-auth.ts` — `auth.json` read + OAuth credential shaping.
|
|
240
|
+
- `src/registry/provider-auth.ts`, `src/registry/auth-broker.ts` — provider OAuth (see [PRD-007](../prd-007-oauth-device-flows/)).
|
|
241
|
+
- `src/providers.ts` — `resolveEndpoint` + `normalizeProviders` (shared with import).
|
|
242
|
+
- `src/paths.ts` — `getProvidersPath` / `getAppHome` / `RFLECTR_HOME`.
|
|
243
|
+
- `src/data/models-dev-cache.json` — bundled models.dev snapshot.
|
|
244
|
+
|
|
245
|
+
## Risks & Known Limitations
|
|
246
|
+
|
|
247
|
+
- **Cost display inaccuracy** — pricing enrichment is best-effort; Claude Code applies its own pricing table for non-Anthropic models, so displayed cost is always inaccurate for them (documented, by design).
|
|
248
|
+
- **OAuth-only providers without a stored token** are silently skipped at materialization (no credential → provider dropped).
|
|
249
|
+
- **`@ai-sdk/github-copilot` model factory is unavailable** — OpenCode loads it from internal `@opencode-ai/core`, not a shippable public npm factory. OAuth login works; the model provider does not.
|
|
250
|
+
- **Bedrock / Azure / Vertex are reference-only** templates (`supported:false`); they need env-based auth beyond a forwarded API key. Vertex is supported only via the dedicated `server --vertex` path — see [PRD-012](../prd-012-server-gateway/).
|
|
251
|
+
- **SSRF guard resolves DNS at validation time**, not at request time — a TOCTOU rebinding window exists between validation and use (mitigated by the request itself going through the SDK adapter / proxy).
|
|
252
|
+
- **models.dev live refresh is silent on failure** — falls back to the bundled snapshot, so capability metadata may lag the upstream catalog when offline.
|
|
253
|
+
- **Stale cached models** persist when a refresh fails on auth rejection (deliberate — keeps the user functional, surfaced with a warning).
|
|
254
|
+
|
|
255
|
+
## Related
|
|
256
|
+
|
|
257
|
+
- Knowledge: [`provider-registry.md`](../../../knowledge/private/data/provider-registry.md)
|
|
258
|
+
- [PRD-001 — CLI Core & Launch Orchestration](../prd-001-cli-core-launch-orchestration/) (consumes materialized providers at launch)
|
|
259
|
+
- [PRD-003 — Model Discovery & Classification](../prd-003-model-discovery-classification/) (`resolveEndpoint`, format classification, `getModels`)
|
|
260
|
+
- [PRD-006 — Credential Storage & API Key Management](../prd-006-credential-storage/) (`authRef` resolution, keyring)
|
|
261
|
+
- [PRD-007 — OAuth Device Flows](../prd-007-oauth-device-flows/) (`providers auth`, OAuth import, id splits)
|
|
262
|
+
- [PRD-008 — Preferences, Tiers & Favorites](../prd-008-preferences-tiers-favorites/) (`subscriptionFilter`, model hiding)
|
|
263
|
+
- [PRD-012 — Server Gateway](../prd-012-server-gateway/) (`localProvidersToServerModels`, Vertex path)
|
|
@@ -0,0 +1,260 @@
|
|
|
1
|
+
# PRD-003: Model Discovery & Classification *(Retroactive)*
|
|
2
|
+
|
|
3
|
+
> **Status:** Shipped
|
|
4
|
+
> **Priority:** —
|
|
5
|
+
> **Effort:** —
|
|
6
|
+
> **Written:** June 2026
|
|
7
|
+
> **Retroactive:** Yes — written after implementation (rflectr v0.2.7).
|
|
8
|
+
> **Source:** `src/models.ts`, `src/model-compatibility.ts`, `src/context-window.ts`, `src/reasoning-capabilities.ts`
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Overview
|
|
13
|
+
|
|
14
|
+
`rflectr` re-points Claude Code (and the other agents) at alternative model backends. Before a user can pick a model, the launcher has to build a list of models that are actually usable, decide for each one whether it can be forwarded raw to an Anthropic-compatible endpoint or must be translated, and stamp it with the metadata Claude Code needs to render an accurate status bar (context window, cost, reasoning controls).
|
|
15
|
+
|
|
16
|
+
This PRD documents that subsystem: how the cloud (OpenCode Zen/Go) model list is assembled from two merged sources, how every model's `modelFormat` is classified, how context windows and reasoning capabilities are resolved, and how unusable models (incompatible, deprecated, stale-free) are filtered out.
|
|
17
|
+
|
|
18
|
+
The single most load-bearing output is `modelFormat`. It is the branch point for the entire launch flow — `'anthropic'` means direct passthrough, anything else routes through the SDK adapter proxy (see [PRD-004 — Translation Layer](../prd-004-translation-layer/prd-004-translation-layer-index.md)).
|
|
19
|
+
|
|
20
|
+
Knowledge doc: [`model-discovery-classification.md`](../../../knowledge/private/ai/model-discovery-classification.md).
|
|
21
|
+
|
|
22
|
+
---
|
|
23
|
+
|
|
24
|
+
## What Was Built
|
|
25
|
+
|
|
26
|
+
- A **two-source merge** that builds the cloud model list from a live API call (`GET {backendUrl}/v1/models`, no auth) enriched by the local OpenCode CLI cache (`~/.cache/opencode/models.json`) — `src/models.ts:131` (`getModels`), `src/models.ts:85` (`mergeModels`).
|
|
27
|
+
- A pure **format classifier** `classifyModelFormat(modelId, providerNpm)` returning `'anthropic' | 'openai' | 'unsupported'` from provider npm first, then ID-prefix heuristics — `src/constants.ts:58`.
|
|
28
|
+
- A **Go-backend override** that demotes any `'anthropic'` classification to `'openai'`, because the Go gateway is OpenAI-compatible and an `@ai-sdk/anthropic` npm in the cache is a metadata error — `src/models.ts:50`, `src/models.ts:95`.
|
|
29
|
+
- **`sourceBackend` stamping** so a combined Zen-free + Go-paid list (the `go` tier) can resolve the correct base URL per selected model — `src/models.ts:51`, `src/models.ts:96`.
|
|
30
|
+
- **Context-window resolution** from cache `limit.context` → ID heuristics → 200K default, plus a `[1m]` suffix convention for million-token variants — `src/context-window.ts`, `src/context-model-id.ts`.
|
|
31
|
+
- **Reasoning-capability metadata** (levels, default level, mode, wire format) resolved per provider npm + model id — `src/reasoning-capabilities.ts`, `src/provider-factory.ts:469`.
|
|
32
|
+
- **Incompatibility filtering** via a curated JSON blacklist plus conservative models.dev capability checks — `src/model-compatibility.ts`, `src/data/model-incompatible.json`.
|
|
33
|
+
- **Graceful degradation**: API failure falls back to the cache, then to caller-supplied fallback models, then errors — `src/models.ts:140`.
|
|
34
|
+
|
|
35
|
+
---
|
|
36
|
+
|
|
37
|
+
## Goals
|
|
38
|
+
|
|
39
|
+
- Show the user only models that will actually work in a coding agent (no embedding/image/audio/video/deprecated/managed-agent models).
|
|
40
|
+
- Decide passthrough-vs-translate deterministically and at the model level, not the provider level.
|
|
41
|
+
- Render an accurate remaining-context status bar in Claude Code for non-Anthropic models.
|
|
42
|
+
- Surface reasoning-effort controls where the provider supports them.
|
|
43
|
+
- Never hard-depend on the OpenCode cache: it is enrichment, not a runtime requirement.
|
|
44
|
+
- Degrade gracefully when the model API is unreachable.
|
|
45
|
+
|
|
46
|
+
## Non-Goals
|
|
47
|
+
|
|
48
|
+
- Live `/model` switching context-window updates — fixed at launch in switch-menu mode (documented limitation).
|
|
49
|
+
- Accurate cost display for non-Anthropic models — Claude Code owns its pricing table; this is unfixable from rflectr.
|
|
50
|
+
- Registry-provider model materialization — owned by [PRD-002 — Provider Registry](../prd-002-provider-registry/prd-002-provider-registry-index.md). This PRD covers the cloud Zen/Go path; registry models arrive pre-stamped.
|
|
51
|
+
- Backend selection / tier logic — owned by [PRD-008 — Preferences, Tiers & Favorites](../prd-008-preferences-tiers-favorites/prd-008-preferences-tiers-favorites-index.md). This PRD consumes `backend.id` / `sourceBackend`.
|
|
52
|
+
- The translation that happens *after* a model is classified `'openai'` — owned by [PRD-004 — Translation Layer](../prd-004-translation-layer/prd-004-translation-layer-index.md).
|
|
53
|
+
|
|
54
|
+
---
|
|
55
|
+
|
|
56
|
+
## Features
|
|
57
|
+
|
|
58
|
+
| # | Feature | Source |
|
|
59
|
+
|---|---------|--------|
|
|
60
|
+
| F1 | Two-source merge (live API ids + OpenCode cache enrichment) | `src/models.ts:85`, `src/models.ts:131` |
|
|
61
|
+
| F2 | `classifyModelFormat` (npm → id-prefix heuristic) | `src/constants.ts:58` |
|
|
62
|
+
| F3 | Go-backend anthropic→openai demotion | `src/models.ts:50`, `src/models.ts:95` |
|
|
63
|
+
| F4 | `sourceBackend` per-model stamping | `src/models.ts:51`, `src/models.ts:96` |
|
|
64
|
+
| F5 | `isFree` detection (`cost.input === 0 && cost.output === 0`) | `src/models.ts:44` |
|
|
65
|
+
| F6 | Brand derivation for grouping | `src/models.ts:23` (`deriveBrand`) |
|
|
66
|
+
| F7 | Context-window resolution (cache → heuristics → 200K) | `src/context-window.ts:131` |
|
|
67
|
+
| F8 | `[1m]` context-suffix convention | `src/context-model-id.ts` |
|
|
68
|
+
| F9 | Reasoning-capability metadata resolution | `src/reasoning-capabilities.ts:24`, `src/provider-factory.ts:469` |
|
|
69
|
+
| F10 | Curated incompatibility blacklist filtering | `src/model-compatibility.ts:46`, `src/data/model-incompatible.json` |
|
|
70
|
+
| F11 | models.dev conservative capability filtering | `src/model-compatibility.ts:60`, `src/registry/models-dev.ts:229` |
|
|
71
|
+
| F12 | Deprecated/stale model exclusion | `src/models.ts:43` (cache `status`), `model-incompatible.json` (`stale_promotion`) |
|
|
72
|
+
| F13 | Graceful fallback chain on API failure | `src/models.ts:140` |
|
|
73
|
+
|
|
74
|
+
---
|
|
75
|
+
|
|
76
|
+
## Architecture & Implementation
|
|
77
|
+
|
|
78
|
+
### The two-source merge
|
|
79
|
+
|
|
80
|
+
The cloud (OpenCode Zen / Go) model list is built from two sources merged together in `getModels` — `src/models.ts:131`:
|
|
81
|
+
|
|
82
|
+
1. **Primary — live API.** `fetchModelsFromApi(backend)` does `GET {backend.baseUrl}/v1/models` (5s abort timeout, no real auth — sends a placeholder `Bearer test`) and returns the available model ids — `src/models.ts:69`. This is the authority on *what exists*.
|
|
83
|
+
2. **Enrichment — OpenCode cache.** `readModelsFromCache(backendId)` reads `~/.cache/opencode/models.json` (path `OPENCODE_CACHE_PATH`, `src/constants.ts:48`) and indexes the entries under the `opencode` (Zen) or `opencode-go` (Go) provider keys — `src/models.ts:31`. Each cache entry supplies `name`, `family`, `cost`, `provider.npm`, and `limit.context`. The cache is optional; a missing or unparseable file simply yields `null` (`loadOpencodeCache`, `src/context-window.ts:63`).
|
|
84
|
+
|
|
85
|
+
`mergeModels(apiIds, cache, backendId)` — `src/models.ts:85` — drives the join:
|
|
86
|
+
|
|
87
|
+
```
|
|
88
|
+
for each apiId:
|
|
89
|
+
  if shouldHideModel(...) → drop (incompatibility filter)
|
|
90
|
+
  if cache has apiId      → spread cached ModelInfo, re-stamp sourceBackend + modelFormat
|
|
91
|
+
  else                    → synthesize a bare ModelInfo (name = id, brand 'Other',
|
|
92
|
+
                            classifyModelFormat(id, undefined), resolveContextWindow(id))
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
So the API ids are the spine; the cache only *enriches* ids that already appear in the live response. An id present only in the cache is never shown.
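
As a rough illustration of the join described above, the following TypeScript sketch walks the live ids and enriches them from the cache. It is not the shipped `mergeModels`: the `MergeDeps` bundle, the `mergeModelsSketch` name, and the trimmed `ModelInfo` shape are assumptions made so the example is self-contained, and applying the Go demotion only on the cache-hit branch is a guess.

```typescript
// Sketch only: field names follow this PRD; helper bodies are injected stand-ins.
type ModelFormat = "anthropic" | "openai" | "unsupported";
type BackendId = "zen" | "go";

interface ModelInfo {
  id: string;
  name: string;
  brand: string;
  isFree: boolean;
  sourceBackend: BackendId;
  modelFormat: ModelFormat;
  contextWindow?: number;
}

// Hypothetical dependency bundle so the sketch stays self-contained.
interface MergeDeps {
  shouldHide: (id: string) => boolean;
  classify: (id: string) => ModelFormat;
  resolveContextWindow: (id: string) => number;
}

function mergeModelsSketch(
  apiIds: string[],
  cache: Map<string, ModelInfo> | null,
  backendId: BackendId,
  deps: MergeDeps,
): ModelInfo[] {
  const merged: ModelInfo[] = [];
  for (const id of apiIds) {
    if (deps.shouldHide(id)) continue; // incompatibility filter
    const cached = cache ? cache.get(id) : undefined;
    if (cached) {
      // Cache hit: keep the enrichment, re-stamp backend and format.
      const demote = backendId === "go" && cached.modelFormat === "anthropic";
      merged.push({
        ...cached,
        sourceBackend: backendId,
        modelFormat: demote ? "openai" : cached.modelFormat,
      });
    } else {
      // Cache miss: synthesize a bare entry from the id alone.
      merged.push({
        id,
        name: id,
        brand: "Other",
        isFree: false,
        sourceBackend: backendId,
        modelFormat: deps.classify(id),
        contextWindow: deps.resolveContextWindow(id),
      });
    }
  }
  return merged; // ids that exist only in the cache never appear
}
```

Because the loop iterates `apiIds` and only looks up the cache by id, a cache-only id has no way to reach the output, which matches the "API ids are the spine" rule.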
|
|
96
|
+
|
|
97
|
+
```mermaid
|
|
98
|
+
flowchart TD
|
|
99
|
+
api["GET {backendUrl}/v1/models — available ids (no auth)"]
|
|
100
|
+
cache["~/.cache/opencode/models.json — name, family, cost, npm, limit.context"]
|
|
101
|
+
api --> filter["shouldHideModel() — blacklist + models.dev"]
|
|
102
|
+
filter --> merge["mergeModels()"]
|
|
103
|
+
cache --> merge
|
|
104
|
+
merge --> out["ModelInfo[] — modelFormat, sourceBackend, contextWindow, cost, isFree"]
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
### The classification heuristic
|
|
108
|
+
|
|
109
|
+
`classifyModelFormat(modelId, providerNpm)` — `src/constants.ts:58` — is a pure function. Provider npm (from the cache) wins; if absent, an ID-prefix heuristic is applied to the lowercased id:
|
|
110
|
+
|
|
111
|
+
| Order | Condition | Result | Rationale |
|
|
112
|
+
|---|---|---|---|
|
|
113
|
+
| 1 | `providerNpm === '@ai-sdk/anthropic'` | `anthropic` | Direct Anthropic passthrough. |
|
|
114
|
+
| 2 | `providerNpm === '@ai-sdk/openai'` | `unsupported` | Cloud Zen/Go proxy can't do model-specific OpenAI endpoints. |
|
|
115
|
+
| 3 | `providerNpm === '@ai-sdk/google'` | `unsupported` | Needs Gemini-specific endpoints the cloud path lacks. |
|
|
116
|
+
| 4 | id starts with `claude-` | `anthropic` | Heuristic fallback when no cache npm. |
|
|
117
|
+
| 5 | id starts with `gpt-` | `unsupported` | Heuristic fallback. |
|
|
118
|
+
| 6 | id starts with `gemini-` | `unsupported` | Heuristic fallback. |
|
|
119
|
+
| 7 | (everything else) | `openai` | Catch-all → SDK adapter via local proxy. |
|
|
120
|
+
|
|
121
|
+
**Nuance:** `unsupported` is a *cloud-wizard* restriction, not a global one. The cloud OpenCode Zen/Go proxy layer can't reach GPT/Gemini directly, so those are hidden in the wizard. The same GPT/Gemini models *are* usable through the **local OpenAI / Google provider** (PRD-002), which carries the real `@ai-sdk/openai` / `@ai-sdk/google` npm and routes through the SDK adapter normally.
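
The ordered table above translates almost directly into a pure function. This is a minimal sketch, not the code at `src/constants.ts:58`; the `classifyModelFormatSketch` name is hypothetical, and it assumes that an npm value outside the three listed ones simply falls through to the id-prefix rules, as the table order suggests.

```typescript
type ModelFormat = "anthropic" | "openai" | "unsupported";

function classifyModelFormatSketch(modelId: string, providerNpm?: string): ModelFormat {
  // Rules 1-3: the provider npm from the cache wins when it is recognized.
  if (providerNpm === "@ai-sdk/anthropic") return "anthropic";
  if (providerNpm === "@ai-sdk/openai") return "unsupported";
  if (providerNpm === "@ai-sdk/google") return "unsupported";

  // Rules 4-6: id-prefix heuristics on the lowercased id.
  const id = modelId.toLowerCase();
  if (id.startsWith("claude-")) return "anthropic";
  if (id.startsWith("gpt-")) return "unsupported";
  if (id.startsWith("gemini-")) return "unsupported";

  // Rule 7: everything else goes through the SDK adapter via the local proxy.
  return "openai";
}
```

Keeping the function free of I/O means the launch branch can be unit-tested with nothing but string inputs.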
|
|
122
|
+
|
|
123
|
+
### The Go-backend override
|
|
124
|
+
|
|
125
|
+
The Go gateway is OpenAI-compatible. If the cache labels a Go model `@ai-sdk/anthropic` (a metadata error), `mergeModels` and `readModelsFromCache` both demote `'anthropic'` → `'openai'` so it routes through the translation proxy rather than attempting a raw Anthropic passthrough that would fail — `src/models.ts:50` and `src/models.ts:95`.
|
|
126
|
+
|
|
127
|
+
### `sourceBackend` and the combined-list problem
|
|
128
|
+
|
|
129
|
+
Every `ModelInfo` carries `sourceBackend: 'zen' | 'go'`, set from the backend that was queried (`src/models.ts:51`, `:96`). The `go` subscription tier shows Zen *free* models **and** Go *paid* models in one combined list; `sourceBackend` is what lets the launcher set the correct `ANTHROPIC_BASE_URL` per selected model rather than per session.
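
To make the per-model lookup concrete, here is a small sketch of a combined `go`-tier list and the base-URL selection it enables. The URLs are placeholders, the `TierModel` shape and both function names are hypothetical, and treating every Go model as paid is an assumption.

```typescript
type BackendId = "zen" | "go";

interface TierModel {
  id: string;
  isFree: boolean;
  sourceBackend: BackendId;
}

// Placeholder URLs for illustration only; the real backend URLs are not given in this PRD.
const BASE_URL_BY_BACKEND: Record<BackendId, string> = {
  zen: "https://zen.example.invalid",
  go: "https://go.example.invalid",
};

// `go` tier: Zen free models plus the Go list (assumed here to be all paid).
function combinedGoTierList(zen: TierModel[], go: TierModel[]): TierModel[] {
  return [...zen.filter((m) => m.isFree), ...go];
}

// The base URL is chosen per selected model, not per session.
function baseUrlForModel(model: TierModel): string {
  return BASE_URL_BY_BACKEND[model.sourceBackend];
}
```

Without the stamp, a combined list would have no way to tell which gateway a selected id belongs to.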
|
|
130
|
+
|
|
131
|
+
### `isFree` and brand grouping
|
|
132
|
+
|
|
133
|
+
`isFree` is true when the cache cost is explicitly `input === 0 && output === 0` — `src/models.ts:44`. Bare (cache-miss) models default `isFree: false`. `deriveBrand(family)` maps a family string to a display brand via the `BRAND_MAP` prefix table (Claude, GPT, Gemini, DeepSeek, Qwen, MiniMax, Kimi, GLM, MiMo, Grok, Nemotron, else `'Other'`) — `src/models.ts:9`. `groupModels` then splits free models out and buckets the rest by brand, each sorted by id — `src/models.ts:111`.
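
The three helpers described above can be sketched as follows. The prefix table is an abbreviated stand-in for `BRAND_MAP` (its actual keys are not shown in this PRD), and the `*Sketch` names and the `GroupableModel` shape are hypothetical.

```typescript
interface GroupableModel {
  id: string;
  brand: string;
  isFree: boolean;
}

// Abbreviated stand-in for BRAND_MAP; the prefixes themselves are assumptions.
const BRAND_PREFIXES_SKETCH: ReadonlyArray<readonly [string, string]> = [
  ["claude", "Claude"],
  ["gpt", "GPT"],
  ["gemini", "Gemini"],
  ["deepseek", "DeepSeek"],
  ["qwen", "Qwen"],
];

function deriveBrandSketch(family?: string): string {
  const f = (family ?? "").toLowerCase();
  for (const [prefix, brand] of BRAND_PREFIXES_SKETCH) {
    if (f.startsWith(prefix)) return brand;
  }
  return "Other";
}

// Free only when the cache cost is present and both prices are exactly zero.
function isFreeSketch(cost?: { input: number; output: number }): boolean {
  return cost !== undefined && cost.input === 0 && cost.output === 0;
}

const byId = (a: GroupableModel, b: GroupableModel): number =>
  a.id < b.id ? -1 : a.id > b.id ? 1 : 0;

function groupModelsSketch(models: GroupableModel[]): {
  free: GroupableModel[];
  byBrand: Map<string, GroupableModel[]>;
} {
  // filter() returns a new array, so sorting does not mutate the input.
  const free = models.filter((m) => m.isFree).sort(byId);
  const byBrand = new Map<string, GroupableModel[]>();
  for (const m of models) {
    if (m.isFree) continue;
    const bucket = byBrand.get(m.brand) ?? [];
    bucket.push(m);
    byBrand.set(m.brand, bucket);
  }
  byBrand.forEach((bucket) => bucket.sort(byId));
  return { free, byBrand };
}
```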
|
|
134
|
+
|
|
135
|
+
### Context-window resolution
|
|
136
|
+
|
|
137
|
+
`resolveContextWindow(modelId, explicit?)` — `src/context-window.ts:131` — resolves in priority order:
|
|
138
|
+
|
|
139
|
+
1. An explicit positive value (cache `limit.context`, or a pre-resolved number) wins.
|
|
140
|
+
2. Otherwise `lookupContextWindow` consults a model-id → context map built from the cache. The map prefers the `opencode` / `opencode-go` provider keys (`CACHE_PROVIDER_PRIORITY`), then falls back to the max context seen across any provider for that id — `src/context-window.ts:75`.
|
|
141
|
+
3. Otherwise ID-pattern heuristics (`HEURISTIC_RULES`, ordered most-specific-first) — `src/context-window.ts:30`.
|
|
142
|
+
4. Otherwise `DEFAULT_CONTEXT_WINDOW = 200_000` — Claude Code's own fallback for unknown models — `src/context-window.ts:12`.
|
|
143
|
+
|
|
144
|
+
The resolved window is written to `CLAUDE_CODE_MAX_CONTEXT_TOKENS` (by `buildChildEnv`) and emitted in the proxy's synthetic `GET /v1/models` (`context_window` per model) so the host can render remaining context.
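
The four-step priority order can be sketched as a single function. This is an illustration under stated assumptions: the real `resolveContextWindow` takes `(modelId, explicit?)` and builds its index from the cache itself, whereas here the index is passed in as a third parameter, and the two heuristic patterns are invented placeholders rather than the contents of `HEURISTIC_RULES`.

```typescript
const DEFAULT_CONTEXT_WINDOW = 200_000;

// Hypothetical rules for illustration, ordered most-specific-first.
const HEURISTIC_RULES_SKETCH: ReadonlyArray<{ pattern: RegExp; contextWindow: number }> = [
  { pattern: /^example-long/, contextWindow: 1_000_000 },
  { pattern: /^example/, contextWindow: 128_000 },
];

function resolveContextWindowSketch(
  modelId: string,
  explicit?: number,
  cacheIndex: Map<string, number> = new Map(),
): number {
  // 1. An explicit positive value wins.
  if (explicit !== undefined && explicit > 0) return explicit;

  // 2. Model-id lookup in the index built from the cache.
  const cached = cacheIndex.get(modelId);
  if (cached !== undefined && cached > 0) return cached;

  // 3. First matching id-pattern heuristic.
  const id = modelId.toLowerCase();
  for (const rule of HEURISTIC_RULES_SKETCH) {
    if (rule.pattern.test(id)) return rule.contextWindow;
  }

  // 4. Fall back to the 200K default.
  return DEFAULT_CONTEXT_WINDOW;
}
```

Ordering the rules most-specific-first matters because the loop returns on the first match; reversing the two placeholder rules would make the long-context pattern unreachable.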
|
|
145
|
+
|
|
146
|
+
### The `[1m]` suffix convention
|
|
147
|
+
|
|
148
|
+
Claude Code treats third-party routes as 200K unless the model id ends with `[1m]` — `src/context-model-id.ts:3`. `claudeCodeClientModelId(modelId, contextWindow?)` strips any existing suffix, resolves the real window, and re-appends `[1m]` when the window exceeds 200K so the host renders the larger window — `src/context-model-id.ts:17`. `routeLookupIds` generates the suffix/`models/`-prefix variants so inbound ids still match catalog aliases — `src/context-model-id.ts:27`.
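
A minimal sketch of the suffix handling follows. It assumes the window is already resolved and passed in (the real `claudeCodeClientModelId` takes it as optional and resolves it), and the exact set of lookup variants produced by `routeLookupIds` is not specified in this PRD, so the list below is a guess at a reasonable one. All function names here are hypothetical.

```typescript
const ONE_M_SUFFIX = "[1m]";
const SUFFIX_THRESHOLD = 200_000;

function stripContextSuffix(modelId: string): string {
  return modelId.endsWith(ONE_M_SUFFIX)
    ? modelId.slice(0, -ONE_M_SUFFIX.length)
    : modelId;
}

// Strip any existing suffix, then re-append it only for windows above 200K.
function clientModelIdSketch(modelId: string, contextWindow: number): string {
  const base = stripContextSuffix(modelId);
  return contextWindow > SUFFIX_THRESHOLD ? base + ONE_M_SUFFIX : base;
}

// Candidate ids to try when matching an inbound id against catalog aliases.
function routeLookupIdsSketch(inboundId: string): string[] {
  const base = stripContextSuffix(inboundId);
  const bare = base.startsWith("models/") ? base.slice("models/".length) : base;
  const candidates = [inboundId, base, bare, `models/${bare}`];
  // De-duplicate while preserving order.
  return candidates.filter((id, i) => candidates.indexOf(id) === i);
}
```

Stripping before re-appending keeps the function idempotent, so an id that already carries the suffix is not suffixed twice.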
|
|
149
|
+
|
|
150
|
+
### Reasoning-capability metadata
|
|
151
|
+
|
|
152
|
+
`resolveReasoningCapabilities({ npm, modelId, ...metadata })` — `src/reasoning-capabilities.ts:24` — delegates to `getReasoningCapabilities(npm, modelId, metadata)` in `src/provider-factory.ts:469`. The result (`ReasoningCapabilities`, `src/provider-factory.ts:206`) carries effort `levels`, a `defaultLevel`, `supportsSummaries`, a `mode` (`none | internal-only | controllable`), a provenance `source`, a `confidence`, and an optional `wireFormat` discriminated union (openrouter / openai-effort / anthropic-thinking / google-thinking-config / mistral-effort / deepseek-thinking) — `src/provider-factory.ts:187`. Per-provider effort vocabularies differ (e.g. OpenAI `low/medium/high/xhigh`, xAI `none/low/medium/high`, DeepSeek `high/max/off`) — `src/provider-factory.ts:216`.
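
For orientation, the shape described above might look roughly like this in TypeScript. Only the field names and the six wire-format names come from this PRD; the `kind` discriminant, the `string` types for `source` and `confidence`, and the `pickEffortLevel` helper are assumptions added for illustration.

```typescript
type ReasoningMode = "none" | "internal-only" | "controllable";

type WireFormatKind =
  | "openrouter"
  | "openai-effort"
  | "anthropic-thinking"
  | "google-thinking-config"
  | "mistral-effort"
  | "deepseek-thinking";

interface ReasoningCapabilitiesSketch {
  levels: string[];
  defaultLevel: string;
  supportsSummaries: boolean;
  mode: ReasoningMode;
  source: string; // provenance; exact type not given in the PRD
  confidence: string; // exact type not given in the PRD
  wireFormat?: { kind: WireFormatKind }; // discriminant name is an assumption
}

// Hypothetical helper: choose a level the provider's vocabulary actually supports.
function pickEffortLevel(
  caps: ReasoningCapabilitiesSketch,
  requested?: string,
): string | undefined {
  if (caps.mode !== "controllable") return undefined;
  if (requested !== undefined && caps.levels.indexOf(requested) !== -1) return requested;
  return caps.defaultLevel;
}
```

A helper like this is one way to cope with the differing vocabularies: a level that is valid for one provider (for example `max`) falls back to the default on a provider that does not list it.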
|
|
153
|
+
|
|
154
|
+
### Incompatibility filtering
|
|
155
|
+
|
|
156
|
+
`shouldHideModel(ctx)` — `src/model-compatibility.ts:68` — hides a model when `hideReason` returns non-null. Two layers:
|
|
157
|
+
|
|
158
|
+
1. **Curated blacklist** (`src/data/model-incompatible.json`) keyed by `{provider, modelId, agents?}`. `provider: '*'` matches any provider; an absent/empty `agents` matches every agent. Categories include `image_generation`, `audio_only`, `video_generation`, `embedding`, `managed_agent`, `deprecated`, `gated_access`, and `stale_promotion` — `src/model-compatibility.ts:46`.
|
|
159
|
+
2. **models.dev conservative capabilities** — `shouldHideByModelsDevCapabilities` hides a model only when its models.dev row is explicit: non-text-only output modalities, `tool_call === false`, or `interactions === true && chat === false` — `src/registry/models-dev.ts:229`.
|
|
160
|
+
|
|
161
|
+
`mergeModels` calls this with `agent: 'claude'` for the cloud path — `src/models.ts:91`. Cache entries with `status === 'deprecated'` are dropped at read time too — `src/models.ts:43`.
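
The two layers can be sketched as below. Several details are assumptions: the real `shouldHideModel` takes a single `ctx` argument, whereas this version receives the blacklist entries and the models.dev row explicitly; model ids are matched exactly; the row's field layout (`outputModalities` in particular) is illustrative; and the string returned by `hideReasonSketch` is a made-up label.

```typescript
interface IncompatibleModelEntry {
  provider: string; // '*' matches any provider
  modelId: string;
  category: string;
  reason: string;
  agents?: string[]; // absent or empty matches every agent
}

// Assumed context shape; the PRD only shows that `agent: 'claude'` is passed.
interface HideContext {
  provider: string;
  modelId: string;
  agent: string;
}

// Assumed subset of a models.dev row; the field layout is illustrative.
interface ModelsDevRowSketch {
  outputModalities?: string[];
  tool_call?: boolean;
  interactions?: boolean;
  chat?: boolean;
}

function findBlacklistEntrySketch(
  entries: IncompatibleModelEntry[],
  ctx: HideContext,
): IncompatibleModelEntry | undefined {
  return entries.find(
    (e) =>
      (e.provider === "*" || e.provider === ctx.provider) &&
      e.modelId === ctx.modelId &&
      (!e.agents || e.agents.length === 0 || e.agents.indexOf(ctx.agent) !== -1),
  );
}

// Conservative: hide only when the row says so explicitly.
function hideByCapabilitiesSketch(row?: ModelsDevRowSketch): boolean {
  if (!row) return false; // no row means no explicit signal, so keep the model
  if (row.outputModalities && row.outputModalities.some((m) => m !== "text")) return true;
  if (row.tool_call === false) return true;
  return row.interactions === true && row.chat === false;
}

function hideReasonSketch(
  entries: IncompatibleModelEntry[],
  ctx: HideContext,
  row?: ModelsDevRowSketch,
): string | null {
  const entry = findBlacklistEntrySketch(entries, ctx);
  if (entry) return entry.category;
  return hideByCapabilitiesSketch(row) ? "models_dev_capabilities" : null;
}

function shouldHideModelSketch(
  entries: IncompatibleModelEntry[],
  ctx: HideContext,
  row?: ModelsDevRowSketch,
): boolean {
  return hideReasonSketch(entries, ctx, row) !== null;
}
```

Note that an `undefined` capability field never hides a model; only an explicit `false` or a non-text modality does, which is what keeps the second layer conservative.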
|
|
162
|
+
|
|
163
|
+
### Stale-free models
|
|
164
|
+
|
|
165
|
+
The knowledge doc and `CLAUDE.md` reference a `STALE_FREE_MODELS` constant for models whose free promotion has ended but which the API still returns. In the shipped v0.2.7 tree this responsibility lives in the **incompatibility blacklist** instead: `qwen3.6-plus-free` is listed with `category: "stale_promotion"` (`src/data/model-incompatible.json`) and filtered through the same `shouldHideModel` path. There is no separate `STALE_FREE_MODELS` symbol in `src/constants.ts` in this version.
|
|
166
|
+
|
|
167
|
+
### Graceful degradation
|
|
168
|
+
|
|
169
|
+
`getModels` — `src/models.ts:131` — tries the live API first. On any throw it falls back: cache (if non-empty) → caller `fallbackModels` (if any) → a thrown error instructing the user to check network / OpenCode status. The returned `fromCache` flag tells callers the list is stale.
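
The fallback chain can be sketched with the live fetch and the cache read injected as functions. This is an illustration rather than the shipped `getModels`: the `getModelsSketch` signature is hypothetical, the error text is paraphrased, and setting `fromCache: true` for caller-supplied fallback models is an assumption, since this PRD does not say what the flag is in that case.

```typescript
interface GetModelsResult<M> {
  models: M[];
  fromCache: boolean;
}

async function getModelsSketch<M>(
  fetchLive: () => Promise<M[]>,
  readCache: () => M[] | null,
  fallbackModels?: M[],
): Promise<GetModelsResult<M>> {
  try {
    // Happy path: the live API is the authority.
    return { models: await fetchLive(), fromCache: false };
  } catch {
    // Any throw from the live call triggers the chain below.
    const cached = readCache();
    if (cached && cached.length > 0) {
      return { models: cached, fromCache: true };
    }
    if (fallbackModels && fallbackModels.length > 0) {
      // Assumption: fallback lists are also flagged as not live.
      return { models: fallbackModels, fromCache: true };
    }
    throw new Error("Could not load models: check your network and OpenCode status.");
  }
}
```

Injecting the two sources keeps the chain testable without network access or a real cache file.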
|
|
170
|
+
|
|
171
|
+
---
|
|
172
|
+
|
|
173
|
+
## Data Shapes
|
|
174
|
+
|
|
175
|
+
`ModelInfo` (cloud Zen/Go path) — `src/types.ts:22`:
|
|
176
|
+
|
|
177
|
+
| Field | Type | Notes |
|
|
178
|
+
|---|---|---|
|
|
179
|
+
| `id` | `string` | Model id from the API. |
|
|
180
|
+
| `name` | `string` | Display name (cache `name`, else id). |
|
|
181
|
+
| `isFree` | `boolean` | Cache cost both zero. |
|
|
182
|
+
| `brand` | `string` | From `deriveBrand(family)`. |
|
|
183
|
+
| `sourceBackend` | `'zen' \| 'go'` | Backend that was queried; drives base-URL selection. |
|
|
184
|
+
| `modelFormat` | `'anthropic' \| 'openai' \| 'unsupported'` | Launch branch point (`ModelFormat`, `src/types.ts:5`). |
|
|
185
|
+
| `cost` | `ModelCost?` | `{ input, output, cache_read?, cache_write? }`. |
|
|
186
|
+
| `contextWindow` | `number?` | Resolved window for the status bar. |
|
|
187
|
+
|
|
188
|
+
`ModelFormat` semantics:
|
|
189
|
+
|
|
190
|
+
| Value | Meaning |
|
|
191
|
+
|---|---|
|
|
192
|
+
| `anthropic` | Direct passthrough to the provider's Anthropic endpoint. |
|
|
193
|
+
| `openai` | Routed through the SDK adapter via the local proxy. |
|
|
194
|
+
| `unsupported` | Hidden in the cloud OpenCode wizard only (not a global ban). |
|
|
195
|
+
|
|
196
|
+
`ReasoningCapabilities` — `src/provider-factory.ts:206`: `{ levels: string[], defaultLevel: string, supportsSummaries: boolean, mode: 'none'|'internal-only'|'controllable', source, confidence, wireFormat? }`.
|
|
197
|
+
|
|
198
|
+
`IncompatibleModelEntry` — `src/model-compatibility.ts:20`: `{ provider, modelId, category, reason, agents?, sources?, verifiedAt? }`.
|
|
199
|
+
|
|
200
|
+
---
|
|
201
|
+
|
|
202
|
+
## Acceptance Criteria
|
|
203
|
+
|
|
204
|
+
- [x] Cloud model list is built by merging live API ids with OpenCode cache enrichment — `src/models.ts:131`, `:85`.
|
|
205
|
+
- [x] API ids are authoritative; cache-only ids are not shown — `src/models.ts:90`.
|
|
206
|
+
- [x] OpenCode cache is optional — a missing/unparseable file degrades to `null` without error — `src/context-window.ts:63`.
|
|
207
|
+
- [x] `classifyModelFormat` resolves npm first, then id-prefix heuristics, returning one of three `ModelFormat` values — `src/constants.ts:58`.
|
|
208
|
+
- [x] Go backend demotes `anthropic` → `openai` — `src/models.ts:50`, `:95`.
|
|
209
|
+
- [x] `sourceBackend` is stamped on every model for per-model base-URL resolution — `src/models.ts:51`, `:96`.
|
|
210
|
+
- [x] `isFree` is true only when cache cost input and output are both 0 — `src/models.ts:44`.
|
|
211
|
+
- [x] Context window resolves cache `limit.context` → cache index → id heuristics → 200K default — `src/context-window.ts:131`.
|
|
212
|
+
- [x] `[1m]` suffix is appended for >200K windows so Claude Code renders the larger window — `src/context-model-id.ts:17`.
|
|
213
|
+
- [x] Reasoning capabilities are resolved per npm + model id with a per-provider wire format — `src/reasoning-capabilities.ts:24`, `src/provider-factory.ts:469`.
|
|
214
|
+
- [x] Incompatible models (image/audio/video/embedding/managed-agent/deprecated/gated/stale) are filtered via the curated blacklist — `src/model-compatibility.ts:46`, `src/data/model-incompatible.json`.
|
|
215
|
+
- [x] models.dev capability checks hide non-text/no-tool/interactions-only models conservatively — `src/registry/models-dev.ts:229`.
|
|
216
|
+
- [x] Cache entries marked `status: 'deprecated'` are dropped at read time — `src/models.ts:43`.
|
|
217
|
+
- [x] API failure degrades to cache → fallback models → error, with a `fromCache` flag — `src/models.ts:140`.
|
|
218
|
+
|
|
219
|
+
---
|
|
220
|
+
|
|
221
|
+
## Files
|
|
222
|
+
|
|
223
|
+
**Primary**
|
|
224
|
+
|
|
225
|
+
- `src/models.ts` — two-source merge, `getModels`, `mergeModels`, `readModelsFromCache`, `fetchModelsFromApi`, `deriveBrand`, `groupModels`, `isFree`, `sourceBackend`, Go override.
|
|
226
|
+
- `src/constants.ts` — `classifyModelFormat` (`:58`), `OPENCODE_CACHE_PATH` (`:48`).
|
|
227
|
+
- `src/context-window.ts` — `resolveContextWindow`, `lookupContextWindow`, `buildContextWindowIndex`, `contextWindowFromHeuristics`, `loadOpencodeCache`, `DEFAULT_CONTEXT_WINDOW`.
|
|
228
|
+
- `src/context-model-id.ts` — `[1m]` suffix handling, `claudeCodeClientModelId`, `routeLookupIds`.
|
|
229
|
+
- `src/reasoning-capabilities.ts` — `resolveReasoningCapabilities`, `effortProviderOptions` (re-exports from `provider-factory`).
|
|
230
|
+
- `src/model-compatibility.ts` — `shouldHideModel`, `hideReason`, `findBlacklistEntry`.
|
|
231
|
+
|
|
232
|
+
**Supporting**
|
|
233
|
+
|
|
234
|
+
- `src/data/model-incompatible.json` — curated incompatibility blacklist (image/audio/video/embedding/managed-agent/deprecated/gated/stale entries).
|
|
235
|
+
- `src/data/models-dev-cache.json` — relayed models.dev snapshot used for conservative capability filtering and enrichment.
|
|
236
|
+
- `src/data/pricing-cache.json` — relayed pricing snapshot.
|
|
237
|
+
- `src/registry/models-dev.ts` — `findModelsDevModel`, `loadModelsDevCache`, `shouldHideByModelsDevCapabilities`.
|
|
238
|
+
- `src/provider-factory.ts` — `getReasoningCapabilities`, `ReasoningCapabilities` / `ReasoningMetadata` types, effort vocabularies, wire formats.
|
|
239
|
+
- `src/types.ts` — `ModelInfo`, `ModelFormat`, `ModelCost`.
|
|
240
|
+
|
|
241
|
+
---
|
|
242
|
+
|
|
243
|
+
## Risks & Known Limitations
|
|
244
|
+
|
|
245
|
+
- **Cost display is inaccurate for non-Anthropic models.** Claude Code applies its own internal pricing table to whatever model id it sees, so the cost shown for a Groq/DeepSeek/Gemini model is wrong. Unfixable from rflectr — the host owns its pricing display.
|
|
246
|
+
- **GPT and Gemini models are `unsupported` in the cloud wizard.** They are hidden from the OpenCode Zen/Go wizard because the cloud proxy layer can't reach those model-specific endpoints. Workaround is the local OpenAI/Google provider (PRD-002), which routes through the SDK adapter.
|
|
247
|
+
- **Context window is fixed at launch in switch-menu mode.** Claude Code's gateway model discovery carries only id + display name (no `context_window`) and fetches `/v1/models` once at startup, so a live `/model` switch does not update the displayed window. Single-model launches show the correct window.
|
|
248
|
+
- **OpenCode cache is enrichment, not authority.** Without it, models still appear (id-only) but lose `name`, `family`, `cost`, and accurate per-model context/format from npm — classification then relies on id-prefix heuristics, which can misclassify novel ids.
|
|
249
|
+
- **Heuristic windows can drift.** `HEURISTIC_RULES` and the brand map are hand-maintained; a new model family with no cache entry falls to the 200K default or a coarse pattern match until the rules are updated.
|
|
250
|
+
- **Blacklist is point-in-time.** Entries carry `verifiedAt` dates (mostly 2026-06); a model that changes capabilities upstream won't be re-evaluated until the JSON is updated.
|
|
251
|
+
- **`STALE_FREE_MODELS` is documented but not present as a constant** in v0.2.7. The stale-free responsibility moved into `model-incompatible.json` (`category: stale_promotion`), so docs that still reference the constant are out of sync with the code.
|
|
252
|
+
|
|
253
|
+
---
|
|
254
|
+
|
|
255
|
+
## Related
|
|
256
|
+
|
|
257
|
+
- [Knowledge: Model Discovery & Classification](../../../knowledge/private/ai/model-discovery-classification.md)
|
|
258
|
+
- [PRD-002 — Provider Registry](../prd-002-provider-registry/prd-002-provider-registry-index.md) — registry providers arrive with models pre-stamped (`modelFormat`, `npm`, `contextWindow`, `reasoning`).
|
|
259
|
+
- [PRD-004 — Translation Layer](../prd-004-translation-layer/prd-004-translation-layer-index.md) — what happens after a model is classified `'openai'`.
|
|
260
|
+
- [PRD-008 — Preferences, Tiers & Favorites](../prd-008-preferences-tiers-favorites/prd-008-preferences-tiers-favorites-index.md) — subscription tiers that drive which backends are queried and combined.
|