@oh-my-pi/pi-catalog 15.12.4 → 15.13.1
- package/CHANGELOG.md +34 -1
- package/dist/types/identity/classify.d.ts +21 -3
- package/dist/types/identity/family.d.ts +22 -0
- package/dist/types/model-manager.d.ts +2 -0
- package/dist/types/provider-models/descriptors.d.ts +8 -8
- package/dist/types/types.d.ts +6 -0
- package/dist/types/wire/gemini-headers.d.ts +1 -0
- package/package.json +3 -3
- package/src/compat/anthropic.ts +7 -1
- package/src/compat/openai.ts +1 -0
- package/src/identity/classify.ts +59 -6
- package/src/identity/family.ts +54 -1
- package/src/model-manager.ts +7 -4
- package/src/models.json +88 -26
- package/src/provider-models/descriptors.ts +8 -8
- package/src/provider-models/openai-compat.ts +7 -21
- package/src/types.ts +6 -0
- package/src/wire/gemini-headers.ts +2 -0
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,36 @@
 
 ## [Unreleased]
 
+## [15.13.1] - 2026-06-15
+
+### Added
+
+- Added `modelFamilyToken(modelId)` to `@oh-my-pi/pi-catalog/identity`: a coarse vendor-lineage token (`anthropic`/`openai`/`gemini`/`kimi`/…) for "are two models the same family?" comparisons, backed by `parseKnownModel` canonical-id normalization. Opaque and comparison-only; kind/variant collapsed onto the vendor token ([#2406](https://github.com/can1357/oh-my-pi/issues/2406))
+
+### Changed
+
+- Changed catalog metadata to update a model’s per-token pricing to input 0.09 and output 0.18
+- Changed the same cataloged model’s maximum token limit from 384000 to 65536
+
+### Fixed
+
+- Fixed MiniMax-M3 catalog context for `minimax` and `minimax-cn` to report the documented 1M long-context tier instead of the upstream 512K pricing boundary ([#2576](https://github.com/can1357/oh-my-pi/issues/2576)).
+- Fixed OpenCode Go MiMo catalog metadata so title generation and other tool-enabled calls omit unsupported `tool_choice` instead of triggering provider 400s ([#2509](https://github.com/can1357/oh-my-pi/issues/2509)).
+- Fixed OpenCode Go `kimi-k2.7-code` catalog metadata so resolve-gate requests use automatic tool selection instead of Moonshot-rejected forced `tool_choice` ([#2546](https://github.com/can1357/oh-my-pi/issues/2546)).
+- Fixed Anthropic compat for the `github-copilot` host so `supportsEagerToolInputStreaming` defaults to `false` there, matching the Copilot proxy which rejects the per-tool `eager_input_streaming` field ([#2558](https://github.com/can1357/oh-my-pi/issues/2558)).
+- Scoped vLLM model cache validity to the discovery base URL so changed endpoints refetch immediately, and bounded built-in vLLM discovery requests with a timeout.
+
+## [15.12.6] - 2026-06-14
+
+### Added
+
+- Added GLM-5.2 to the bundled zai (GLM Coding Plan) catalog as the selectable 1M served model.
+
+### Changed
+
+- Pinned zai `glm-5.2` to 1M context during catalog generation so endpoint discovery and older fallbacks cannot regress it to 200k.
+- Replaced the hand-maintained `zhipu-coding-plan` GLM reasoning allowlist and vision regex with a `parseGlmModel` family classifier in `identity/classify.ts` (variant + vision + version), surfaced as `isReasoningGlmModelId` / `isGlmVisionModelId`. Discovery now derives reasoning/vision capability from the GLM family instead of a per-id list, so newly-bumped integers (`glm-5.3`, `glm-6`, …) are covered automatically while `-flash`/`-preview` and the vision `…v` shape stay correctly classified.
+
 ## [15.12.4] - 2026-06-13
 
 ### Added
@@ -28,6 +58,7 @@
 - Fixed Antigravity `gemini-3.1-pro --thinking high` failing with `Cloud Code Assist API error (400): Request contains an invalid argument.` — the upstream `gemini-3.1-pro-high` deployment rejects every `streamGenerateContent` request on both CCA endpoints while discovery still advertises it. High effort now routes to `gemini-pro-agent` (the same "Gemini 3.1 Pro (High)" model, verified accepting the identical request body), and the model-cache fingerprint version was bumped (`merge-v2` → `merge-v3`) so existing fresh caches refetch discovery and pick up the corrected routing immediately.
 
 ## [15.11.7] - 2026-06-12
+
 ### Added
 
 - Added effort-tier variant collapsing (`variant-collapse`): providers that expose one logical model as several effort/thinking-suffixed upstream ids (Antigravity CCA `gemini-3.5-flash-extra-low`/`-low`/`gemini-3-flash-agent`, `gemini-3[.1]-pro-low|high`, `claude-*[-thinking]` pairs, `gpt-oss-120b-medium`) collapse into one logical entry carrying per-effort upstream routing in `thinking.effortRouting` (plus `thinking.suppressWhenOff` for Cloud Code Assist ids whose baked server default re-applies when `thinkingConfig` is omitted). Request-time code resolves the outbound id via `resolveWireModelId(model, effort)`; selection, caching, and usage attribution key on the logical id.
@@ -52,11 +83,13 @@
 ### Fixed
 
 - Fixed MiniMax M2-family and OpenAI gpt-oss model metadata so OpenAI-compatible catalog entries declare only `low|medium|high` thinking efforts. Their upstreams reject `minimal`, `xhigh`, and Fireworks' `minimal → none` wire mapping, so `fireworks/minimax-m2.7` as the smol auto-thinking classifier model 400ed on every turn. OpenAI-compatible provider effort maps (`Groq qwen/qwen3-32b`, DeepSeek-family, OpenRouter Anthropic adaptive, Fireworks `minimal → none`) now bake into `thinking.effortMap` in catalog metadata instead of `buildOpenAICompat`, and request builders read that field directly. Regenerated `models.json` now makes `disableReasoning` choose `low` for those families while leaving GLM-5.x and other Fireworks models on the existing `minimal → none` path ([#2315](https://github.com/can1357/oh-my-pi/issues/2315)).
+
 ### Added
 
 - Added `requiresJuiceZeroHack` Responses-API compat flag, resolved by `buildOpenAIResponsesCompat` from GPT-5-family model names and overridable via sparse model `compat` config. Replaces the request-time `model.name.startsWith("gpt-5")` sniff that gated the trailing `# Juice: 0 !important` no-reasoning developer item.
 
 ## [15.11.3] - 2026-06-11
+
 ### Added
 
 - Added `requestModelId` on `Model` to represent the upstream model id used when a catalog entry is a local variant
@@ -134,4 +167,4 @@
 
 ### Removed
 
-- Removed the runtime enrichment layer: `enrichModelThinking` (and its non-enumerable memo-slot cache), `refreshModelThinking`, `modelOmitsReasoningEffort`, and the `model-thinking` re-exports of generator-only policies. Thinking metadata is resolved exactly once inside `buildModel`; runtime helpers (`getSupportedEfforts`, `clampThinkingLevelForModel`, `requireSupportedEffort`, the effort mappers) are pure field reads.
+- Removed the runtime enrichment layer: `enrichModelThinking` (and its non-enumerable memo-slot cache), `refreshModelThinking`, `modelOmitsReasoningEffort`, and the `model-thinking` re-exports of generator-only policies. Thinking metadata is resolved exactly once inside `buildModel`; runtime helpers (`getSupportedEfforts`, `clampThinkingLevelForModel`, `requireSupportedEffort`, the effort mappers) are pure field reads.
|
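The 15.13.1 changelog entry describes `modelFamilyToken` as an opaque, comparison-only token. A minimal sketch of how a caller might use such a token to pick a reviewer from a different model family could look like the following. The names `toyFamilyToken`, `Candidate`, `lineageKey`, and `pickCrossFamilyReviewer` are hypothetical and exist only for illustration; `toyFamilyToken` is a simplified stand-in, not the package's implementation, and the provider fallback for an empty token follows the behaviour the package's doc comment attributes to callers.

```typescript
// Hypothetical stand-in for modelFamilyToken: returns a coarse vendor token,
// or "" when the id cannot be classified. NOT the package's implementation.
function toyFamilyToken(modelId: string): string {
  const bare = modelId.slice(modelId.lastIndexOf("/") + 1).toLowerCase();
  if (bare.startsWith("claude-")) return "anthropic";
  if (/^gpt-\d/.test(bare)) return "openai";
  if (bare.startsWith("gemini-")) return "gemini";
  return "";
}

interface Candidate {
  provider: string;
  modelId: string;
}

// Comparison key: the family token when known, otherwise the provider id.
function lineageKey(c: Candidate, familyToken: (id: string) => string): string {
  const token = familyToken(c.modelId);
  return token !== "" ? `family:${token}` : `provider:${c.provider}`;
}

// Return the first candidate whose lineage differs from the author's.
function pickCrossFamilyReviewer(
  author: Candidate,
  candidates: readonly Candidate[],
  familyToken: (id: string) => string = toyFamilyToken,
): Candidate | undefined {
  const authorKey = lineageKey(author, familyToken);
  return candidates.find(c => lineageKey(c, familyToken) !== authorKey);
}
```

Because the token is compared and never stored, the sketch keeps it inside `lineageKey` and does not expose it as a persistent identifier.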
package/dist/types/identity/classify.d.ts
CHANGED
@@ -12,6 +12,7 @@ export type SemVer = {
 export type GeminiKind = "pro" | "flash";
 export type AnthropicKind = "opus" | "sonnet" | "fable" | "mythos";
 export type OpenAIVariant = "base" | "codex" | "codex-max" | "codex-mini" | "codex-spark" | "mini" | "max" | "nano";
+export type GlmVariant = "base" | "air" | "turbo" | "flash" | "flashx" | "preview";
 export interface GeminiModel {
 family: "gemini";
 kind: GeminiKind;
@@ -27,6 +28,14 @@ export interface OpenAIModel {
 variant: OpenAIVariant;
 version: SemVer;
 }
+export interface GlmModel {
+family: "glm";
+/** Suffix variant (`-air`, `-turbo`, `-flash`, `-flashx`, `-preview`); `base` when none. */
+variant: GlmVariant;
+/** Vision SKU — the `v` that attaches directly to the version (`glm-4v`, `glm-4.5v`). */
+vision: boolean;
+version: SemVer;
+}
 export interface UnknownModel {
 family: "unknown";
 id: string;
@@ -35,9 +44,18 @@ export type ParsedModel = GeminiModel | AnthropicModel | OpenAIModel | UnknownMo
 /** Strip a provider namespace prefix (`openai/gpt-5.4` → `gpt-5.4`). */
 export declare function bareModelId(modelId: string): string;
 export declare function parseKnownModel(modelId: string): ParsedModel;
-export declare
-export declare
-export declare
+export declare const parseGeminiModel: (modelId: string) => GeminiModel | null;
+export declare const parseAnthropicModel: (modelId: string) => AnthropicModel | null;
+export declare const parseOpenAIModel: (modelId: string) => OpenAIModel | null;
+/**
+* Parse a GLM (Zhipu / Z.AI) model id into family + variant + vision + version.
+* Shape: `glm-<version>[v][-<variant>]` — e.g. `glm-4.5`, `glm-4.5-air`,
+* `glm-5-turbo`, `glm-4.5v`, `glm-5-preview`. The `v` (vision) attaches to the
+* version; other variants are `-` suffixes. Standalone like `parseAnthropicModel`
+* is used in family.ts — GLM needs no global thinking policy, so it stays out of
+* `parseKnownModel`.
+*/
+export declare const parseGlmModel: (modelId: string) => GlmModel | null;
 export declare function isFableOrMythos(kind: AnthropicKind): boolean;
 export declare function parseSemVer(version: string): SemVer | null;
 export declare function semverGte(left: SemVer | string, right: SemVer | string): boolean;
|
package/dist/types/identity/family.d.ts
CHANGED
@@ -37,6 +37,28 @@ export declare function isMinimaxM2FamilyModelId(modelId: string): boolean;
 * and `none`.
 */
 export declare function isOpenAIGptOssModelId(modelId: string): boolean;
+/**
+* Reasoning-capable GLM coding SKUs: glm-4.5 and up on the base / `-air` /
+* `-turbo` lines. Excludes the vision (`…v`) shape, the non-reasoning
+* `-flash`/`-flashx`/`-preview` variants, and pre-4.5 ids. Matching the family
+* keeps newly-bumped integers (`glm-5.3`, `glm-6`, …) covered without a per-id
+* allowlist.
+*/
+export declare function isReasoningGlmModelId(modelId: string): boolean;
+/** GLM vision SKUs — the `v` that attaches to the version (`glm-4v`, `glm-4.5v`). */
+export declare function isGlmVisionModelId(modelId: string): boolean;
+/**
+* Coarse vendor-lineage token for "are two models the same family?" checks
+* (e.g. picking a cross-family reviewer). All Claude point releases share a token,
+* Claude and GPT differ; namespace prefixes and aggregator mirrors fold onto the
+* lineage via {@link parseKnownModel}'s `bareModelId` normalization. Opaque and
+* comparison-only — not a stable key to persist, since the vocabulary tracks new
+* releases. Returns `""` for ids it cannot classify; callers fall back to the provider.
+*
+* Vendor-only by design: a model's kind/variant (opus vs sonnet, codex vs base) is
+* collapsed onto the single vendor token; use {@link parseKnownModel} for finer breakdowns.
+*/
+export declare function modelFamilyToken(modelId: string): string;
 /**
 * Adaptive thinking `display` is supported starting with Claude Opus 4.7 and
 * the Claude Fable/Mythos 5 generation. Older adaptive-thinking models
|
package/dist/types/model-manager.d.ts
CHANGED
@@ -22,6 +22,8 @@ export interface ModelManagerOptions<TApi extends Api = Api, TModelsDevPayload =
 staticModels?: readonly ModelSpec<TApi>[];
 /** Optional override for the cache database path. Default: <agent-dir>/models.db. */
 cacheDbPath?: string;
+/** Optional provider id override for cache namespacing. Defaults to providerId. */
+cacheProviderId?: string;
 /** Maximum cache age in milliseconds before considered stale. Default: 24h. */
 cacheTtlMs?: number;
 /** When true, a successful dynamic fetch is the complete provider catalog and prunes static-only models. */
|
package/dist/types/provider-models/descriptors.d.ts
CHANGED
@@ -25,10 +25,10 @@ export declare const CATALOG_PROVIDERS: readonly [{
 };
 }, {
 readonly id: "amazon-bedrock";
-readonly defaultModel: "us.anthropic.claude-opus-4-
+readonly defaultModel: "us.anthropic.claude-opus-4-8";
 }, {
 readonly id: "anthropic";
-readonly defaultModel: "claude-opus-4-
+readonly defaultModel: "claude-opus-4-8";
 readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"anthropic-messages", unknown>;
 }, {
 readonly id: "cerebras";
@@ -136,7 +136,7 @@ export declare const CATALOG_PROVIDERS: readonly [{
 };
 }, {
 readonly id: "litellm";
-readonly defaultModel: "claude-opus-4-
+readonly defaultModel: "claude-opus-4-8";
 readonly envVars: readonly ["LITELLM_API_KEY"];
 readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"openai-completions", unknown>;
 readonly catalogDiscovery: {
@@ -176,7 +176,7 @@ export declare const CATALOG_PROVIDERS: readonly [{
 };
 }, {
 readonly id: "nanogpt";
-readonly defaultModel: "openai/gpt-5.
+readonly defaultModel: "openai/gpt-5.5";
 readonly envVars: readonly ["NANO_GPT_API_KEY"];
 readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"openai-completions", unknown>;
 readonly catalogDiscovery: {
@@ -207,12 +207,12 @@ export declare const CATALOG_PROVIDERS: readonly [{
 };
 }, {
 readonly id: "openai";
-readonly defaultModel: "gpt-5.
+readonly defaultModel: "gpt-5.5";
 readonly envVars: readonly ["OPENAI_API_KEY"];
 readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"openai-responses", unknown>;
 }, {
 readonly id: "openai-codex";
-readonly defaultModel: "gpt-5.
+readonly defaultModel: "gpt-5.5";
 readonly envVars: readonly ["OPENAI_CODEX_OAUTH_TOKEN"];
 readonly specialModelManager: true;
 }, {
@@ -227,7 +227,7 @@ export declare const CATALOG_PROVIDERS: readonly [{
 readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<import("..").Api, unknown>;
 }, {
 readonly id: "openrouter";
-readonly defaultModel: "openai/gpt-5.
+readonly defaultModel: "openai/gpt-5.5";
 readonly envVars: readonly ["OPENROUTER_API_KEY"];
 readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"openai-completions", unknown>;
 readonly catalogDiscovery: {
@@ -361,7 +361,7 @@ export declare const CATALOG_PROVIDERS: readonly [{
 };
 }, {
 readonly id: "zenmux";
-readonly defaultModel: "anthropic/claude-opus-4.
+readonly defaultModel: "anthropic/claude-opus-4.8";
 readonly envVars: readonly ["ZENMUX_API_KEY"];
 readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<import("..").Api, unknown>;
 readonly catalogDiscovery: {
|
package/dist/types/types.d.ts
CHANGED
@@ -161,6 +161,12 @@ export interface OpenAICompat {
 requiresAssistantContentForToolCalls?: boolean;
 /** Whether the provider supports the `tool_choice` parameter. Default: true. */
 supportsToolChoice?: boolean;
+/**
+* Whether forced `tool_choice` values (`"required"` or named tools) are accepted.
+* When false, request builders keep tools available but downgrade forced choices
+* to provider-default auto selection. Default: true.
+*/
+supportsForcedToolChoice?: boolean;
 /**
 * Drop reasoning fields (`reasoning_effort`, OpenRouter `reasoning`) for
 * the request when `tool_choice` forces a tool call. Mirrors the Anthropic
|
package/dist/types/wire/gemini-headers.d.ts
CHANGED
@@ -9,6 +9,7 @@ export declare const getGeminiCliHeaders: (modelId?: string) => {
 "Client-Metadata": string;
 };
 export declare const ANTIGRAVITY_SYSTEM_INSTRUCTION: string;
+export declare const ANTIGRAVITY_NO_PREAMBLE_INSTRUCTION = "CRITICAL: NEVER output rule checks, formatting guidelines, constraint checklists (e.g. \"No emdashes\"), or your thinking/personality preambles in the final response. Output only the final response.";
 /**
 * Antigravity / Cloud Code Assist user agent. Lives in its own file so discovery
 * and usage code can read it without pulling the heavy google-gemini-cli provider
|
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
 "type": "module",
 "name": "@oh-my-pi/pi-catalog",
-"version": "15.
+"version": "15.13.1",
 "description": "Model catalog for omp: bundled model database, provider discovery descriptors, model identity, classification, and equivalence",
 "homepage": "https://omp.sh",
 "author": "Can Boluk",
@@ -34,11 +34,11 @@
 },
 "dependencies": {
 "@bufbuild/protobuf": "^2.12.0",
-"@oh-my-pi/pi-utils": "15.
+"@oh-my-pi/pi-utils": "15.13.1",
 "zod": "^4"
 },
 "devDependencies": {
-"@oh-my-pi/pi-ai": "15.
+"@oh-my-pi/pi-ai": "15.13.1",
 "@types/bun": "^1.3.14"
 },
 "engines": {
|
package/src/compat/anthropic.ts
CHANGED
@@ -34,11 +34,17 @@ export function buildAnthropicCompat(spec: ModelSpec<"anthropic-messages">): Res
 const official = isOfficialAnthropicApiUrl(baseUrl);
 // Z.AI's Anthropic-compatible proxy lives at `api.z.ai/api/anthropic`.
 const isZai = modelMatchesHost(spec, "zai");
+// GitHub Copilot's Anthropic-compatible proxy (api.githubcopilot.com/v1/messages)
+// rejects the per-tool `eager_input_streaming` field with
+// `tools.0.custom.eager_input_streaming: Extra inputs are not permitted` and
+// doesn't whitelist the `fine-grained-tool-streaming-2025-05-14` beta either
+// (issue #2558), so eager tool-input streaming is unavailable on this host.
+const isCopilot = modelMatchesHost(spec, "githubCopilot");
 const compat: ResolvedAnthropicCompat = {
 officialEndpoint: official,
 disableStrictTools: false,
 disableAdaptiveThinking: false,
-supportsEagerToolInputStreaming:
+supportsEagerToolInputStreaming: !isCopilot,
 // Long cache retention is only sent to the official API by default;
 // proxies opt in explicitly via `compat.supportsLongCacheRetention: true`.
 supportsLongCacheRetention: official,
|
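The hunk above only sets the `supportsEagerToolInputStreaming` flag; the request builder that consumes it is not part of this diff. As a rough sketch of how such a flag could gate the per-tool field, assuming a flat `eager_input_streaming` property on each tool (the exact wire nesting is not shown in the diff), with `CompatSketch`, `ToolDef`, `WireTool`, and `toWireTool` being hypothetical names:

```typescript
// Hypothetical, reduced view of the resolved compat object.
interface CompatSketch {
  supportsEagerToolInputStreaming: boolean;
}

interface ToolDef {
  name: string;
  description: string;
}

// Assumed wire shape: the optional field sits directly on the tool.
interface WireTool extends ToolDef {
  eager_input_streaming?: boolean;
}

// Emit the per-tool field only when the host accepts it, so strict proxies
// never see an unexpected key.
function toWireTool(tool: ToolDef, compat: CompatSketch): WireTool {
  const wire: WireTool = { name: tool.name, description: tool.description };
  if (compat.supportsEagerToolInputStreaming) {
    wire.eager_input_streaming = true;
  }
  return wire;
}
```

Omitting the key entirely, instead of sending `false`, matters for hosts that reject unknown fields regardless of their value.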
package/src/compat/openai.ts
CHANGED
@@ -217,6 +217,7 @@ export function buildOpenAICompat(spec: ModelSpec<"openai-completions">): Resolv
 disableReasoningOnForcedToolChoice: isKimiModel || isAnthropicModel,
 disableReasoningOnToolChoice: isDeepseekFamily && Boolean(spec.reasoning) && !isOpenRouter,
 supportsToolChoice: !isDirectDeepseekReasoning,
+supportsForcedToolChoice: true,
 maxTokensField: useMaxTokens ? "max_tokens" : "max_completion_tokens",
 requiresToolResultName: isMistral,
 requiresAssistantAfterToolResult: false,
|
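The new `supportsForcedToolChoice` flag is documented as downgrading forced choices to automatic selection while keeping tools available. A minimal sketch of that decision, assuming an OpenAI-style `tool_choice` value, might look like this. `ToolChoice`, `ToolChoiceCompat`, and `resolveToolChoice` are hypothetical names, and returning `"auto"` rather than omitting the field is an assumption, since the diff does not show the request builder:

```typescript
type ToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; function: { name: string } };

// Both flags default to true when absent, as the doc comments state.
interface ToolChoiceCompat {
  supportsToolChoice?: boolean;
  supportsForcedToolChoice?: boolean;
}

// Decide what to send as `tool_choice`; `undefined` means "omit the field".
function resolveToolChoice(
  requested: ToolChoice | undefined,
  compat: ToolChoiceCompat,
): ToolChoice | undefined {
  if (requested === undefined) return undefined;
  // Provider rejects the parameter outright: drop it.
  if (compat.supportsToolChoice === false) return undefined;
  // "required" and named tools are the forced forms.
  const forced = requested === "required" || typeof requested === "object";
  if (forced && compat.supportsForcedToolChoice === false) {
    return "auto";
  }
  return requested;
}
```

With the default compat, a forced choice passes through unchanged; only providers that opt out see the downgrade.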
package/src/identity/classify.ts
CHANGED
@@ -14,6 +14,7 @@ export type SemVer = {
 export type GeminiKind = "pro" | "flash";
 export type AnthropicKind = "opus" | "sonnet" | "fable" | "mythos";
 export type OpenAIVariant = "base" | "codex" | "codex-max" | "codex-mini" | "codex-spark" | "mini" | "max" | "nano";
+export type GlmVariant = "base" | "air" | "turbo" | "flash" | "flashx" | "preview";
 
 export interface GeminiModel {
 family: "gemini";
@@ -33,6 +34,15 @@ export interface OpenAIModel {
 version: SemVer;
 }
 
+export interface GlmModel {
+family: "glm";
+/** Suffix variant (`-air`, `-turbo`, `-flash`, `-flashx`, `-preview`); `base` when none. */
+variant: GlmVariant;
+/** Vision SKU — the `v` that attaches directly to the version (`glm-4v`, `glm-4.5v`). */
+vision: boolean;
+version: SemVer;
+}
+
 export interface UnknownModel {
 family: "unknown";
 id: string;
@@ -55,8 +65,26 @@ export function parseKnownModel(modelId: string): ParsedModel {
 );
 }
 
+/**
+* Wrap a parse function in a per-id memo cache. Caches the `null` result too, so
+* repeated misses (the common case — ids of other families) stay O(1) and never
+* re-run the regex/semver work.
+*/
+function parser<T>(parse: (modelId: string) => T | null): (modelId: string) => T | null {
+const cache = new Map<string, T | null>();
+return modelId => {
+const hit = cache.get(modelId);
+if (hit !== undefined || cache.has(modelId)) {
+return hit ?? null;
+}
+const result = parse(modelId);
+cache.set(modelId, result);
+return result;
+};
+}
+
 const GEMINI_SUFFIX = "-preview";
-export
+export const parseGeminiModel = parser((modelId): GeminiModel | null => {
 if (modelId.endsWith(GEMINI_SUFFIX)) {
 modelId = modelId.slice(0, -GEMINI_SUFFIX.length);
 }
@@ -69,9 +97,9 @@ export function parseGeminiModel(modelId: string): GeminiModel | null {
 return null;
 }
 return { family: "gemini", kind: match[2] as GeminiKind, version };
-}
+});
 
-export
+export const parseAnthropicModel = parser((modelId): AnthropicModel | null => {
 const match = /claude-(opus|sonnet|fable|mythos)-(\d{1,2}(?:[.-]\d{1,2}){0,2})\b/.exec(modelId);
 if (!match) {
 return null;
@@ -81,9 +109,9 @@ export function parseAnthropicModel(modelId: string): AnthropicModel | null {
 return null;
 }
 return { family: "anthropic", kind: match[1] as AnthropicKind, version };
-}
+});
 
-export
+export const parseOpenAIModel = parser((modelId): OpenAIModel | null => {
 const match = /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?\b/.exec(modelId);
 if (!match) {
 return null;
@@ -93,7 +121,32 @@ export function parseOpenAIModel(modelId: string): OpenAIModel | null {
 return null;
 }
 return { family: "openai", variant: (match[2] as OpenAIVariant | undefined) ?? "base", version };
-}
+});
+
+/**
+* Parse a GLM (Zhipu / Z.AI) model id into family + variant + vision + version.
+* Shape: `glm-<version>[v][-<variant>]` — e.g. `glm-4.5`, `glm-4.5-air`,
+* `glm-5-turbo`, `glm-4.5v`, `glm-5-preview`. The `v` (vision) attaches to the
+* version; other variants are `-` suffixes. Standalone like `parseAnthropicModel`
+* is used in family.ts — GLM needs no global thinking policy, so it stays out of
+* `parseKnownModel`.
+*/
+export const parseGlmModel = parser((modelId): GlmModel | null => {
+const match = /glm-(\d{1,2}(?:\.\d+)?)(v)?(?:-(air|turbo|flashx|flash|preview))?\b/.exec(modelId);
+if (!match) {
+return null;
+}
+const version = parseSemVer(match[1]);
+if (!version) {
+return null;
+}
+return {
+family: "glm",
+variant: (match[3] as GlmVariant | undefined) ?? "base",
+vision: match[2] === "v",
+version,
+};
+});
 
 export function isFableOrMythos(kind: AnthropicKind): boolean {
 return kind === "fable" || kind === "mythos";
|
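The `parser` wrapper added in this file memoizes parse results per model id, including `null` misses. A small standalone sketch of the same idea, with a call counter to show that a repeated miss does not re-run the parse function, could look like this. `memoizeParser`, `parseMajor`, and `callCount` are hypothetical names used only for this illustration:

```typescript
// Memoize a nullable parse function. `Map.has` distinguishes "cached null"
// from "never seen", so misses are cached as well as hits.
function memoizeParser<T>(parse: (id: string) => T | null): (id: string) => T | null {
  const cache = new Map<string, T | null>();
  return id => {
    if (cache.has(id)) {
      return cache.get(id) ?? null;
    }
    const result = parse(id);
    cache.set(id, result);
    return result;
  };
}

// Counts how often the underlying parse function actually runs.
let calls = 0;
function callCount(): number {
  return calls;
}

// Toy parser: extracts the major version of a `glm-<n>` id, else null.
const parseMajor = memoizeParser((id: string): number | null => {
  calls += 1;
  const match = /^glm-(\d+)/.exec(id);
  return match ? Number(match[1]) : null;
});
```

Calling `parseMajor` twice with an id from another family runs the regex once; the second call is served from the cached `null`. Note that the cache in this sketch is unbounded, which is reasonable only when the set of distinct ids is small.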
package/src/identity/family.ts
CHANGED
@@ -7,7 +7,14 @@
 * here.
 */
 
-import {
+import {
+bareModelId,
+isFableOrMythos,
+parseAnthropicModel,
+parseGlmModel,
+parseKnownModel,
+semverGte,
+} from "./classify";
 
 /** Kimi family ids in any namespace form (`moonshotai/kimi-*`, `kimi-k2.6`, `vendor/kimi.x`). */
 export function isKimiModelId(modelId: string): boolean {
@@ -71,6 +78,52 @@ export function isOpenAIGptOssModelId(modelId: string): boolean {
 return /(^|\/)gpt-oss[-:]/i.test(modelId);
 }
 
+/**
+* Reasoning-capable GLM coding SKUs: glm-4.5 and up on the base / `-air` /
+* `-turbo` lines. Excludes the vision (`…v`) shape, the non-reasoning
+* `-flash`/`-flashx`/`-preview` variants, and pre-4.5 ids. Matching the family
+* keeps newly-bumped integers (`glm-5.3`, `glm-6`, …) covered without a per-id
+* allowlist.
+*/
+export function isReasoningGlmModelId(modelId: string): boolean {
+const glm = parseGlmModel(bareModelId(modelId));
+if (!glm || glm.vision) {
+return false;
+}
+if (glm.variant !== "base" && glm.variant !== "air" && glm.variant !== "turbo") {
+return false;
+}
+return semverGte(glm.version, "4.5");
+}
+
+/** GLM vision SKUs — the `v` that attaches to the version (`glm-4v`, `glm-4.5v`). */
+export function isGlmVisionModelId(modelId: string): boolean {
+return parseGlmModel(bareModelId(modelId))?.vision === true;
+}
+/**
+* Coarse vendor-lineage token for "are two models the same family?" checks
+* (e.g. picking a cross-family reviewer). All Claude point releases share a token,
+* Claude and GPT differ; namespace prefixes and aggregator mirrors fold onto the
+* lineage via {@link parseKnownModel}'s `bareModelId` normalization. Opaque and
+* comparison-only — not a stable key to persist, since the vocabulary tracks new
+* releases. Returns `""` for ids it cannot classify; callers fall back to the provider.
+*
+* Vendor-only by design: a model's kind/variant (opus vs sonnet, codex vs base) is
+* collapsed onto the single vendor token; use {@link parseKnownModel} for finer breakdowns.
+*/
+export function modelFamilyToken(modelId: string): string {
+const parsed = parseKnownModel(modelId);
+if (parsed.family !== "unknown") return parsed.family;
+if (isKimiModelId(modelId)) return "kimi";
+if (isQwenModelId(modelId)) return "qwen";
+if (isMinimaxM2FamilyModelId(modelId)) return "minimax";
+if (isOpenAIGptOssModelId(modelId)) return "gpt-oss";
+if (isDeepseekModelIdOrName(modelId)) return "deepseek";
+if (isMimoModelIdOrName(modelId)) return "mimo";
+if (parseGlmModel(bareModelId(modelId))) return "glm";
+return "";
+}
+
 /**
 * Adaptive thinking `display` is supported starting with Claude Opus 4.7 and
 * the Claude Fable/Mythos 5 generation. Older adaptive-thinking models
|
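The GLM classifier above combines a regex parse with a variant, vision, and version gate. The following is a self-contained sketch of that logic for experimenting with ids outside the package. It reuses the regex from the diff, but `parseGlm`, `isReasoningGlm`, and the major/minor comparison are hypothetical simplifications: the real code goes through `bareModelId`, `parseSemVer`, and `semverGte`, whose exact behaviour is not shown here.

```typescript
type GlmVariant = "base" | "air" | "turbo" | "flash" | "flashx" | "preview";

interface GlmSketch {
  variant: GlmVariant;
  vision: boolean;
  major: number;
  minor: number;
}

// Same shape as the regex in classify.ts: glm-<version>[v][-<variant>].
const GLM_RE = /glm-(\d{1,2}(?:\.\d+)?)(v)?(?:-(air|turbo|flashx|flash|preview))?\b/;

function parseGlm(modelId: string): GlmSketch | null {
  // Assumption: stripping everything up to the last "/" approximates bareModelId.
  const bare = modelId.slice(modelId.lastIndexOf("/") + 1);
  const match = GLM_RE.exec(bare);
  if (!match) return null;
  const [majorText = "0", minorText = "0"] = (match[1] ?? "").split(".");
  return {
    variant: (match[3] as GlmVariant | undefined) ?? "base",
    vision: match[2] === "v",
    major: Number(majorText),
    minor: Number(minorText),
  };
}

// Reasoning-capable: non-vision, base/air/turbo line, version 4.5 or newer.
function isReasoningGlm(modelId: string): boolean {
  const glm = parseGlm(modelId);
  if (!glm || glm.vision) return false;
  if (glm.variant !== "base" && glm.variant !== "air" && glm.variant !== "turbo") {
    return false;
  }
  return glm.major > 4 || (glm.major === 4 && glm.minor >= 5);
}
```

Under these assumptions, `glm-4.5-air`, `glm-5.2`, and `glm-6` pass the gate, while `glm-4.5v`, `glm-5-flash`, and `glm-4` do not. In the regex, `flashx` is listed before `flash` so the longer suffix is tried first.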
package/src/model-manager.ts
CHANGED
|
@@ -33,6 +33,8 @@ export interface ModelManagerOptions<TApi extends Api = Api, TModelsDevPayload =
|
|
|
33
33
|
staticModels?: readonly ModelSpec<TApi>[];
|
|
34
34
|
/** Optional override for the cache database path. Default: <agent-dir>/models.db. */
|
|
35
35
|
cacheDbPath?: string;
|
|
36
|
+
/** Optional provider id override for cache namespacing. Defaults to providerId. */
|
|
37
|
+
cacheProviderId?: string;
|
|
36
38
|
/** Maximum cache age in milliseconds before considered stale. Default: 24h. */
|
|
37
39
|
cacheTtlMs?: number;
|
|
38
40
|
/** When true, a successful dynamic fetch is the complete provider catalog and prunes static-only models. */
|
|
@@ -107,13 +109,14 @@ export async function resolveProviderModels<TApi extends Api = Api, TModelsDevPa
|
|
|
107
109
|
options: ModelManagerOptions<TApi, TModelsDevPayload>,
|
|
108
110
|
strategy: ModelRefreshStrategy = "online-if-uncached",
|
|
109
111
|
): Promise<ModelResolutionResult<TApi>> {
|
|
112
|
+
const cacheProviderId = options.cacheProviderId ?? options.providerId;
|
|
110
113
|
const now = options.now ?? Date.now;
|
|
111
114
|
const ttlMs = options.cacheTtlMs ?? DEFAULT_CACHE_TTL_MS;
|
|
112
115
|
const dbPath = options.cacheDbPath;
|
|
113
116
|
const staticModels = options.staticModels
|
|
114
117
|
? passModelList<TApi>(options.staticModels)
|
|
115
118
|
: (getBundledModels(options.providerId as GeneratedProvider) as Model<TApi>[]);
|
|
116
|
-
const cache = readModelCache<TApi>(
|
|
119
|
+
const cache = readModelCache<TApi>(cacheProviderId, ttlMs, now, dbPath);
|
|
117
120
|
const dynamicModelsAuthoritative = options.dynamicModelsAuthoritative ?? false;
|
|
118
121
|
const staticFingerprint = fingerprintStatic(staticModels, dynamicModelsAuthoritative);
|
|
119
122
|
const cacheFingerprintMatches = cache?.staticFingerprint === staticFingerprint && staticFingerprint.length > 0;
|
|
@@ -160,7 +163,7 @@ export async function resolveProviderModels<TApi extends Api = Api, TModelsDevPa
|
|
|
160
163
|
? retainModelIds(mergedSnapshot, dynamicModels)
|
|
161
164
|
: mergedSnapshot;
|
|
162
165
|
writeModelCache(
|
|
163
|
-
|
|
166
|
+
cacheProviderId,
|
|
164
167
|
now(),
|
|
165
168
|
collapseBuiltModelVariants(snapshotModels),
|
|
166
169
|
true,
|
|
@@ -170,9 +173,9 @@ export async function resolveProviderModels<TApi extends Api = Api, TModelsDevPa
|
|
|
170
173
|
} else {
|
|
171
174
|
// Dynamic fetch failed — update cache with a non-authoritative snapshot so
|
|
172
175
|
// stale state remains visible while retry backoff still applies.
|
|
173
|
-
const latestCache = readModelCache<TApi>(
|
|
176
|
+
const latestCache = readModelCache<TApi>(cacheProviderId, ttlMs, now, dbPath);
|
|
174
177
|
writeModelCache(
|
|
175
|
-
|
|
178
|
+
cacheProviderId,
|
|
176
179
|
now(),
|
|
177
180
|
collapseBuiltModelVariants(
|
|
178
181
|
mergeDynamicModels(
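
The hunks above route every cache read and write through a single `cacheProviderId`, which falls back to `providerId` when the option is unset. The following is a minimal sketch of that fallback, assuming a simplified in-memory cache. `SketchOptions`, `CacheEntry`, `cacheStore`, `resolveCacheKey`, `writeCache`, and `readFreshCache` are hypothetical names that do not exist in the package, and hiding stale entries is a simplification (the real `readModelCache` may report staleness differently).

```typescript
// Hypothetical, simplified stand-ins for illustration only; the real
// ModelManagerOptions and cache layer in the package carry more fields.
interface SketchOptions {
  providerId: string;
  cacheProviderId?: string;
  cacheTtlMs?: number;
  now?: () => number;
}

interface CacheEntry {
  updatedAt: number;
  modelIds: string[];
}

// 24h, matching the documented default of `cacheTtlMs`.
const DEFAULT_CACHE_TTL_MS = 24 * 60 * 60 * 1000;

// In-memory replacement for the on-disk cache (assumption for this sketch).
const cacheStore = new Map<string, CacheEntry>();

// Mirrors `options.cacheProviderId ?? options.providerId` from the diff.
function resolveCacheKey(options: SketchOptions): string {
  return options.cacheProviderId ?? options.providerId;
}

// Writes under the resolved key, so reads and writes always agree.
function writeCache(options: SketchOptions, modelIds: string[]): void {
  const now = options.now ?? Date.now;
  cacheStore.set(resolveCacheKey(options), { updatedAt: now(), modelIds });
}

// Returns the entry only while it is no older than the TTL.
function readFreshCache(options: SketchOptions): CacheEntry | undefined {
  const now = options.now ?? Date.now;
  const ttlMs = options.cacheTtlMs ?? DEFAULT_CACHE_TTL_MS;
  const entry = cacheStore.get(resolveCacheKey(options));
  if (!entry) return undefined;
  return now() - entry.updatedAt <= ttlMs ? entry : undefined;
}
```

Because both helpers resolve the key the same way, two configurations that share a `providerId` but set different `cacheProviderId` values never see each other's snapshots.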
|
package/src/models.json
CHANGED
|
@@ -4259,7 +4259,8 @@
|
|
|
4259
4259
|
"cacheWrite": 0
|
|
4260
4260
|
},
|
|
4261
4261
|
"contextWindow": null,
|
|
4262
|
-
"maxTokens": null
|
|
4262
|
+
"maxTokens": null,
|
|
4263
|
+
"contextPromotionTarget": "aimlapi/gpt-5.4-2026-03-05"
|
|
4263
4264
|
},
|
|
4264
4265
|
"gpt-5.5-pro-2026-04-23": {
|
|
4265
4266
|
"id": "gpt-5.5-pro-2026-04-23",
|
|
@@ -4278,7 +4279,8 @@
|
|
|
4278
4279
|
"cacheWrite": 0
|
|
4279
4280
|
},
|
|
4280
4281
|
"contextWindow": null,
|
|
4281
|
-
"maxTokens": null
|
|
4282
|
+
"maxTokens": null,
|
|
4283
|
+
"contextPromotionTarget": "aimlapi/gpt-5.4-2026-03-05"
|
|
4282
4284
|
},
|
|
4283
4285
|
"gpt-oss-120b": {
|
|
4284
4286
|
"id": "gpt-oss-120b",
|
|
@@ -9577,7 +9579,8 @@
|
|
|
9577
9579
|
"high",
|
|
9578
9580
|
"xhigh"
|
|
9579
9581
|
]
|
|
9580
|
-
}
|
|
9582
|
+
},
|
|
9583
|
+
"contextPromotionTarget": "amazon-bedrock/openai.gpt-5.4"
|
|
9581
9584
|
},
|
|
9582
9585
|
"openai.gpt-oss-120b": {
|
|
9583
9586
|
"id": "openai.gpt-oss-120b",
|
|
@@ -12202,7 +12205,8 @@
|
|
|
12202
12205
|
"high",
|
|
12203
12206
|
"xhigh"
|
|
12204
12207
|
]
|
|
12205
|
-
}
|
|
12208
|
+
},
|
|
12209
|
+
"contextPromotionTarget": "cloudflare-ai-gateway/openai/gpt-5.4"
|
|
12206
12210
|
},
|
|
12207
12211
|
"openai/o1": {
|
|
12208
12212
|
"id": "openai/o1",
|
|
@@ -22904,6 +22908,7 @@
|
|
|
22904
22908
|
"disableReasoningOnForcedToolChoice": true,
|
|
22905
22909
|
"disableReasoningOnToolChoice": false,
|
|
22906
22910
|
"supportsToolChoice": true,
|
|
22911
|
+
"supportsForcedToolChoice": true,
|
|
22907
22912
|
"maxTokensField": "max_completion_tokens",
|
|
22908
22913
|
"requiresToolResultName": false,
|
|
22909
22914
|
"requiresAssistantAfterToolResult": false,
|
|
@@ -24595,7 +24600,8 @@
|
|
|
24595
24600
|
"high",
|
|
24596
24601
|
"xhigh"
|
|
24597
24602
|
]
|
|
24598
|
-
}
|
|
24603
|
+
},
|
|
24604
|
+
"contextPromotionTarget": "kilo/openai/gpt-5.4"
|
|
24599
24605
|
},
|
|
24600
24606
|
"openai/gpt-5.5-pro": {
|
|
24601
24607
|
"id": "openai/gpt-5.5-pro",
|
|
@@ -24624,7 +24630,8 @@
|
|
|
24624
24630
|
"high",
|
|
24625
24631
|
"xhigh"
|
|
24626
24632
|
]
|
|
24627
|
-
}
|
|
24633
|
+
},
|
|
24634
|
+
"contextPromotionTarget": "kilo/openai/gpt-5.4"
|
|
24628
24635
|
},
|
|
24629
24636
|
"openai/gpt-audio": {
|
|
24630
24637
|
"id": "openai/gpt-audio",
|
|
@@ -25327,6 +25334,7 @@
|
|
|
25327
25334
|
"disableReasoningOnForcedToolChoice": false,
|
|
25328
25335
|
"disableReasoningOnToolChoice": false,
|
|
25329
25336
|
"supportsToolChoice": true,
|
|
25337
|
+
"supportsForcedToolChoice": true,
|
|
25330
25338
|
"maxTokensField": "max_completion_tokens",
|
|
25331
25339
|
"requiresToolResultName": false,
|
|
25332
25340
|
"requiresAssistantAfterToolResult": false,
|
|
@@ -25778,6 +25786,7 @@
|
|
|
25778
25786
|
"disableReasoningOnForcedToolChoice": false,
|
|
25779
25787
|
"disableReasoningOnToolChoice": false,
|
|
25780
25788
|
"supportsToolChoice": true,
|
|
25789
|
+
"supportsForcedToolChoice": true,
|
|
25781
25790
|
"maxTokensField": "max_completion_tokens",
|
|
25782
25791
|
"requiresToolResultName": false,
|
|
25783
25792
|
"requiresAssistantAfterToolResult": false,
|
|
@@ -26058,6 +26067,7 @@
|
|
|
26058
26067
|
"disableReasoningOnForcedToolChoice": false,
|
|
26059
26068
|
"disableReasoningOnToolChoice": false,
|
|
26060
26069
|
"supportsToolChoice": true,
|
|
26070
|
+
"supportsForcedToolChoice": true,
|
|
26061
26071
|
"maxTokensField": "max_completion_tokens",
|
|
26062
26072
|
"requiresToolResultName": false,
|
|
26063
26073
|
"requiresAssistantAfterToolResult": false,
|
|
@@ -28539,7 +28549,7 @@
|
|
|
28539
28549
|
"cacheRead": 0.12,
|
|
28540
28550
|
"cacheWrite": 0
|
|
28541
28551
|
},
|
|
28542
|
-
"contextWindow":
|
|
28552
|
+
"contextWindow": 1000000,
|
|
28543
28553
|
"maxTokens": 128000,
|
|
28544
28554
|
"thinking": {
|
|
28545
28555
|
"mode": "budget",
|
|
@@ -28781,7 +28791,7 @@
|
|
|
28781
28791
|
"cacheRead": 0.12,
|
|
28782
28792
|
"cacheWrite": 0
|
|
28783
28793
|
},
|
|
28784
|
-
"contextWindow":
|
|
28794
|
+
"contextWindow": 1000000,
|
|
28785
28795
|
"maxTokens": 128000,
|
|
28786
28796
|
"thinking": {
|
|
28787
28797
|
"mode": "budget",
|
|
@@ -39194,8 +39204,8 @@
|
|
|
39194
39204
|
"cacheRead": 0,
|
|
39195
39205
|
"cacheWrite": 0
|
|
39196
39206
|
},
|
|
39197
|
-
"contextWindow":
|
|
39198
|
-
"maxTokens":
|
|
39207
|
+
"contextWindow": 128000,
|
|
39208
|
+
"maxTokens": 16384
|
|
39199
39209
|
},
|
|
39200
39210
|
"openai/gpt-5-codex": {
|
|
39201
39211
|
"id": "openai/gpt-5-codex",
|
|
@@ -39763,7 +39773,8 @@
|
|
|
39763
39773
|
"high",
|
|
39764
39774
|
"xhigh"
|
|
39765
39775
|
]
|
|
39766
|
-
}
|
|
39776
|
+
},
|
|
39777
|
+
"contextPromotionTarget": "nanogpt/openai/gpt-5.4"
|
|
39767
39778
|
},
|
|
39768
39779
|
"openai/gpt-chat-latest": {
|
|
39769
39780
|
"id": "openai/gpt-chat-latest",
|
|
@@ -51042,6 +51053,9 @@
|
|
|
51042
51053
|
},
|
|
51043
51054
|
"contextWindow": 262144,
|
|
51044
51055
|
"maxTokens": 262144,
|
|
51056
|
+
"compat": {
|
|
51057
|
+
"supportsForcedToolChoice": false
|
|
51058
|
+
},
|
|
51045
51059
|
"thinking": {
|
|
51046
51060
|
"mode": "effort",
|
|
51047
51061
|
"efforts": [
|
|
@@ -51081,6 +51095,9 @@
|
|
|
51081
51095
|
"high",
|
|
51082
51096
|
"xhigh"
|
|
51083
51097
|
]
|
|
51098
|
+
},
|
|
51099
|
+
"compat": {
|
|
51100
|
+
"supportsToolChoice": false
|
|
51084
51101
|
}
|
|
51085
51102
|
},
|
|
51086
51103
|
"mimo-v2-pro": {
|
|
@@ -51110,6 +51127,9 @@
|
|
|
51110
51127
|
"high",
|
|
51111
51128
|
"xhigh"
|
|
51112
51129
|
]
|
|
51130
|
+
},
|
|
51131
|
+
"compat": {
|
|
51132
|
+
"supportsToolChoice": false
|
|
51113
51133
|
}
|
|
51114
51134
|
},
|
|
51115
51135
|
"mimo-v2.5": {
|
|
@@ -51131,6 +51151,9 @@
|
|
|
51131
51151
|
},
|
|
51132
51152
|
"contextWindow": 1000000,
|
|
51133
51153
|
"maxTokens": 128000,
|
|
51154
|
+
"compat": {
|
|
51155
|
+
"supportsToolChoice": false
|
|
51156
|
+
},
|
|
51134
51157
|
"thinking": {
|
|
51135
51158
|
"mode": "effort",
|
|
51136
51159
|
"efforts": [
|
|
@@ -51160,6 +51183,9 @@
|
|
|
51160
51183
|
},
|
|
51161
51184
|
"contextWindow": 1048576,
|
|
51162
51185
|
"maxTokens": 128000,
|
|
51186
|
+
"compat": {
|
|
51187
|
+
"supportsToolChoice": false
|
|
51188
|
+
},
|
|
51163
51189
|
"thinking": {
|
|
51164
51190
|
"mode": "effort",
|
|
51165
51191
|
"efforts": [
|
|
@@ -55012,13 +55038,13 @@
|
|
|
55012
55038
|
"text"
|
|
55013
55039
|
],
|
|
55014
55040
|
"cost": {
|
|
55015
|
-
"input": 0.
|
|
55016
|
-
"output": 0.
|
|
55041
|
+
"input": 0.09,
|
|
55042
|
+
"output": 0.18,
|
|
55017
55043
|
"cacheRead": 0.02,
|
|
55018
55044
|
"cacheWrite": 0
|
|
55019
55045
|
},
|
|
55020
55046
|
"contextWindow": 1048576,
|
|
55021
|
-
"maxTokens":
|
|
55047
|
+
"maxTokens": 65536,
|
|
55022
55048
|
"thinking": {
|
|
55023
55049
|
"mode": "effort",
|
|
55024
55050
|
"efforts": [
|
|
@@ -57075,9 +57101,9 @@
|
|
|
57075
57101
|
"image"
|
|
57076
57102
|
],
|
|
57077
57103
|
"cost": {
|
|
57078
|
-
"input": 0.
|
|
57079
|
-
"output":
|
|
57080
|
-
"cacheRead": 0.
|
|
57104
|
+
"input": 0.75,
|
|
57105
|
+
"output": 3.5,
|
|
57106
|
+
"cacheRead": 0.16,
|
|
57081
57107
|
"cacheWrite": 0
|
|
57082
57108
|
},
|
|
57083
57109
|
"contextWindow": 262144,
|
|
@@ -58513,7 +58539,8 @@
|
|
|
58513
58539
|
"high",
|
|
58514
58540
|
"xhigh"
|
|
58515
58541
|
]
|
|
58516
|
-
}
|
|
58542
|
+
},
|
|
58543
|
+
"contextPromotionTarget": "openrouter/openai/gpt-5.4"
|
|
58517
58544
|
},
|
|
58518
58545
|
"openai/gpt-5.5-pro": {
|
|
58519
58546
|
"id": "openai/gpt-5.5-pro",
|
|
@@ -58542,7 +58569,8 @@
|
|
|
58542
58569
|
"high",
|
|
58543
58570
|
"xhigh"
|
|
58544
58571
|
]
|
|
58545
|
-
}
|
|
58572
|
+
},
|
|
58573
|
+
"contextPromotionTarget": "openrouter/openai/gpt-5.4"
|
|
58546
58574
|
},
|
|
58547
58575
|
"openai/gpt-audio": {
|
|
58548
58576
|
"id": "openai/gpt-audio",
|
|
@@ -59989,7 +60017,7 @@
|
|
|
59989
60017
|
"cacheWrite": 0
|
|
59990
60018
|
},
|
|
59991
60019
|
"contextWindow": 262144,
|
|
59992
|
-
"maxTokens":
|
|
60020
|
+
"maxTokens": 16384
|
|
59993
60021
|
},
|
|
59994
60022
|
"qwen/qwen3-next-80b-a3b-thinking": {
|
|
59995
60023
|
"id": "qwen/qwen3-next-80b-a3b-thinking",
|
|
@@ -64583,7 +64611,7 @@
|
|
|
64583
64611
|
"cacheWrite": 0
|
|
64584
64612
|
},
|
|
64585
64613
|
"contextWindow": 128000,
|
|
64586
|
-
"maxTokens":
|
|
64614
|
+
"maxTokens": 16384,
|
|
64587
64615
|
"compat": {
|
|
64588
64616
|
"supportsUsageInStreaming": false
|
|
64589
64617
|
}
|
|
@@ -69051,7 +69079,8 @@
|
|
|
69051
69079
|
"high",
|
|
69052
69080
|
"xhigh"
|
|
69053
69081
|
]
|
|
69054
|
-
}
|
|
69082
|
+
},
|
|
69083
|
+
"contextPromotionTarget": "vercel-ai-gateway/openai/gpt-5.4"
|
|
69055
69084
|
},
|
|
69056
69085
|
"openai/gpt-5.5-pro": {
|
|
69057
69086
|
"id": "openai/gpt-5.5-pro",
|
|
@@ -69080,7 +69109,8 @@
|
|
|
69080
69109
|
"high",
|
|
69081
69110
|
"xhigh"
|
|
69082
69111
|
]
|
|
69083
|
-
}
|
|
69112
|
+
},
|
|
69113
|
+
"contextPromotionTarget": "vercel-ai-gateway/openai/gpt-5.4"
|
|
69084
69114
|
},
|
|
69085
69115
|
"openai/gpt-oss-120b": {
|
|
69086
69116
|
"id": "openai/gpt-oss-120b",
|
|
@@ -72205,6 +72235,35 @@
|
|
|
72205
72235
|
]
|
|
72206
72236
|
}
|
|
72207
72237
|
},
|
|
72238
|
+
"glm-5.2": {
|
|
72239
|
+
"id": "glm-5.2",
|
|
72240
|
+
"name": "GLM-5.2",
|
|
72241
|
+
"api": "anthropic-messages",
|
|
72242
|
+
"provider": "zai",
|
|
72243
|
+
"baseUrl": "https://api.z.ai/api/anthropic",
|
|
72244
|
+
"reasoning": true,
|
|
72245
|
+
"input": [
|
|
72246
|
+
"text"
|
|
72247
|
+
],
|
|
72248
|
+
"cost": {
|
|
72249
|
+
"input": 0,
|
|
72250
|
+
"output": 0,
|
|
72251
|
+
"cacheRead": 0,
|
|
72252
|
+
"cacheWrite": 0
|
|
72253
|
+
},
|
|
72254
|
+
"contextWindow": 1000000,
|
|
72255
|
+
"maxTokens": 131072,
|
|
72256
|
+
"thinking": {
|
|
72257
|
+
"mode": "budget",
|
|
72258
|
+
"efforts": [
|
|
72259
|
+
"minimal",
|
|
72260
|
+
"low",
|
|
72261
|
+
"medium",
|
|
72262
|
+
"high",
|
|
72263
|
+
"xhigh"
|
|
72264
|
+
]
|
|
72265
|
+
}
|
|
72266
|
+
},
|
|
72208
72267
|
"glm-5v-turbo": {
|
|
72209
72268
|
"id": "glm-5v-turbo",
|
|
72210
72269
|
"name": "GLM-5V-Turbo",
|
|
@@ -75112,7 +75171,8 @@
|
|
|
75112
75171
|
"high",
|
|
75113
75172
|
"xhigh"
|
|
75114
75173
|
]
|
|
75115
|
-
}
|
|
75174
|
+
},
|
|
75175
|
+
"contextPromotionTarget": "zenmux/openai/gpt-5.4"
|
|
75116
75176
|
},
|
|
75117
75177
|
"openai/gpt-5.5-instant": {
|
|
75118
75178
|
"id": "openai/gpt-5.5-instant",
|
|
@@ -75141,7 +75201,8 @@
|
|
|
75141
75201
|
"high",
|
|
75142
75202
|
"xhigh"
|
|
75143
75203
|
]
|
|
75144
|
-
}
|
|
75204
|
+
},
|
|
75205
|
+
"contextPromotionTarget": "zenmux/openai/gpt-5.4"
|
|
75145
75206
|
},
|
|
75146
75207
|
"openai/gpt-5.5-pro": {
|
|
75147
75208
|
"id": "openai/gpt-5.5-pro",
|
|
@@ -75170,7 +75231,8 @@
|
|
|
75170
75231
|
"high",
|
|
75171
75232
|
"xhigh"
|
|
75172
75233
|
]
|
|
75173
|
-
}
|
|
75234
|
+
},
|
|
75235
|
+
"contextPromotionTarget": "zenmux/openai/gpt-5.4"
|
|
75174
75236
|
},
|
|
75175
75237
|
"openai/gpt-image-1.5": {
|
|
75176
75238
|
"id": "openai/gpt-image-1.5",
|
|
package/src/provider-models/descriptors.ts
CHANGED
|
@@ -68,11 +68,11 @@ export const CATALOG_PROVIDERS = [
|
|
|
68
68
|
},
|
|
69
69
|
{
|
|
70
70
|
id: "amazon-bedrock",
|
|
71
|
-
defaultModel: "us.anthropic.claude-opus-4-
|
|
71
|
+
defaultModel: "us.anthropic.claude-opus-4-8",
|
|
72
72
|
},
|
|
73
73
|
{
|
|
74
74
|
id: "anthropic",
|
|
75
|
-
defaultModel: "claude-opus-4-
|
|
75
|
+
defaultModel: "claude-opus-4-8",
|
|
76
76
|
createModelManagerOptions: (config: ModelManagerConfig) => anthropicModelManagerOptions(config),
|
|
77
77
|
},
|
|
78
78
|
{
|
|
@@ -177,7 +177,7 @@ export const CATALOG_PROVIDERS = [
|
|
|
177
177
|
},
|
|
178
178
|
{
|
|
179
179
|
id: "litellm",
|
|
180
|
-
defaultModel: "claude-opus-4-
|
|
180
|
+
defaultModel: "claude-opus-4-8",
|
|
181
181
|
envVars: ["LITELLM_API_KEY"],
|
|
182
182
|
createModelManagerOptions: (config: ModelManagerConfig) => litellmModelManagerOptions(config),
|
|
183
183
|
catalogDiscovery: { label: "LiteLLM", allowUnauthenticated: true },
|
|
@@ -219,7 +219,7 @@ export const CATALOG_PROVIDERS = [
|
|
|
219
219
|
},
|
|
220
220
|
{
|
|
221
221
|
id: "nanogpt",
|
|
222
|
-
defaultModel: "openai/gpt-5.
|
|
222
|
+
defaultModel: "openai/gpt-5.5",
|
|
223
223
|
envVars: ["NANO_GPT_API_KEY"],
|
|
224
224
|
createModelManagerOptions: (config: ModelManagerConfig) => nanoGptModelManagerOptions(config),
|
|
225
225
|
catalogDiscovery: { label: "NanoGPT" },
|
|
@@ -247,13 +247,13 @@ export const CATALOG_PROVIDERS = [
|
|
|
247
247
|
},
|
|
248
248
|
{
|
|
249
249
|
id: "openai",
|
|
250
|
-
defaultModel: "gpt-5.
|
|
250
|
+
defaultModel: "gpt-5.5",
|
|
251
251
|
envVars: ["OPENAI_API_KEY"],
|
|
252
252
|
createModelManagerOptions: (config: ModelManagerConfig) => openaiModelManagerOptions(config),
|
|
253
253
|
},
|
|
254
254
|
{
|
|
255
255
|
id: "openai-codex",
|
|
256
|
-
defaultModel: "gpt-5.
|
|
256
|
+
defaultModel: "gpt-5.5",
|
|
257
257
|
envVars: ["OPENAI_CODEX_OAUTH_TOKEN"],
|
|
258
258
|
specialModelManager: true,
|
|
259
259
|
},
|
|
@@ -271,7 +271,7 @@ export const CATALOG_PROVIDERS = [
|
|
|
271
271
|
},
|
|
272
272
|
{
|
|
273
273
|
id: "openrouter",
|
|
274
|
-
defaultModel: "openai/gpt-5.
|
|
274
|
+
defaultModel: "openai/gpt-5.5",
|
|
275
275
|
envVars: ["OPENROUTER_API_KEY"],
|
|
276
276
|
createModelManagerOptions: (config: ModelManagerConfig) => openrouterModelManagerOptions(config),
|
|
277
277
|
catalogDiscovery: { label: "OpenRouter", allowUnauthenticated: true },
|
|
@@ -403,7 +403,7 @@ export const CATALOG_PROVIDERS = [
|
|
|
403
403
|
},
|
|
404
404
|
{
|
|
405
405
|
id: "zenmux",
|
|
406
|
-
defaultModel: "anthropic/claude-opus-4.
|
|
406
|
+
defaultModel: "anthropic/claude-opus-4.8",
|
|
407
407
|
envVars: ["ZENMUX_API_KEY"],
|
|
408
408
|
createModelManagerOptions: (config: ModelManagerConfig) => zenmuxModelManagerOptions(config),
|
|
409
409
|
catalogDiscovery: { label: "ZenMux" },
|
package/src/provider-models/openai-compat.ts
CHANGED
|
@@ -5,6 +5,7 @@ import {
|
|
|
5
5
|
} from "../discovery/openai-compatible";
|
|
6
6
|
import { Effort } from "../effort";
|
|
7
7
|
import { toFireworksPublicModelId } from "../fireworks-model-id";
|
|
8
|
+
import { isGlmVisionModelId, isReasoningGlmModelId } from "../identity/family";
|
|
8
9
|
import type { ModelManagerOptions } from "../model-manager";
|
|
9
10
|
import { getBundledModels } from "../models";
|
|
10
11
|
import type { Api, FetchImpl, Model, ModelSpec, Provider, ThinkingConfig } from "../types";
|
|
@@ -1030,8 +1031,8 @@ export function zhipuCodingPlanModelManagerOptions(
|
|
|
1030
1031
|
const id = defaults.id;
|
|
1031
1032
|
return {
|
|
1032
1033
|
...defaults,
|
|
1033
|
-
reasoning:
|
|
1034
|
-
input:
|
|
1034
|
+
reasoning: isReasoningGlmModelId(id) || id.includes("thinking"),
|
|
1035
|
+
input: isGlmVisionModelId(id) ? (["text", "image"] as const) : ["text"],
|
|
1035
1036
|
compat: {
|
|
1036
1037
|
thinkingFormat: "zai",
|
|
1037
1038
|
reasoningContentField: "reasoning_content",
|
|
@@ -1045,25 +1046,6 @@ export function zhipuCodingPlanModelManagerOptions(
|
|
|
1045
1046
|
};
|
|
1046
1047
|
}
|
|
1047
1048
|
|
|
1048
|
-
// Reasoning-capable GLM models on the BigModel coding-plan SKU. Keep this
|
|
1049
|
-
// explicit rather than regex-matching `glm-[45]\.\d` so newly-added integers
|
|
1050
|
-
// like `glm-5` / `glm-5-turbo` are covered and unrelated future SKUs (e.g.
|
|
1051
|
-
// `glm-5-preview`) do not silently flip into thinking mode.
|
|
1052
|
-
const ZHIPU_REASONING_MODELS: Readonly<Record<string, true>> = {
|
|
1053
|
-
"glm-4.5": true,
|
|
1054
|
-
"glm-4.5-air": true,
|
|
1055
|
-
"glm-4.6": true,
|
|
1056
|
-
"glm-4.7": true,
|
|
1057
|
-
"glm-5": true,
|
|
1058
|
-
"glm-5-turbo": true,
|
|
1059
|
-
"glm-5.1": true,
|
|
1060
|
-
};
|
|
1061
|
-
|
|
1062
|
-
// Vision-capable GLM models follow the `glm-<N>[.<N>]v[-<variant>]` shape
|
|
1063
|
-
// (e.g. `glm-4v`, `glm-4.5v`, `glm-4v-plus`). The previous `id.includes("v")`
|
|
1064
|
-
// check matched anything with a `v` — including the non-vision `glm-5-preview`.
|
|
1065
|
-
const ZHIPU_VISION_PATTERN = /^glm-[45](?:\.\d+)?v(?:-|$)/;
|
|
1066
|
-
|
|
1067
1049
|
// ---------------------------------------------------------------------------
|
|
1068
1050
|
// 7.5 Fireworks
|
|
1069
1051
|
// ---------------------------------------------------------------------------
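
The removed block above kept reasoning support as an explicit id table and vision support as an anchored pattern. The sketch below shows how those two checks classify ids, reusing the removed constants as they appear in the diff. The wrapper names `isZhipuReasoning` and `isZhipuVision` are hypothetical, and the replacement helpers `isReasoningGlmModelId` and `isGlmVisionModelId` imported from `../identity/family` are not shown in this hunk and may behave differently.

```typescript
// Constants copied from the removed lines of the hunk.
const ZHIPU_REASONING_MODELS: Readonly<Record<string, true>> = {
  "glm-4.5": true,
  "glm-4.5-air": true,
  "glm-4.6": true,
  "glm-4.7": true,
  "glm-5": true,
  "glm-5-turbo": true,
  "glm-5.1": true,
};

const ZHIPU_VISION_PATTERN = /^glm-[45](?:\.\d+)?v(?:-|$)/;

// Hypothetical wrapper: exact-id lookup, so unknown SKUs stay non-reasoning.
function isZhipuReasoning(id: string): boolean {
  return ZHIPU_REASONING_MODELS[id] === true;
}

// Hypothetical wrapper: the `v` must directly follow the version number.
function isZhipuVision(id: string): boolean {
  return ZHIPU_VISION_PATTERN.test(id);
}

// Ids taken from the comments in the removed block.
const sampleIds = ["glm-4v", "glm-4.5v", "glm-4v-plus", "glm-5-preview", "glm-5.1"];
for (const id of sampleIds) {
  console.log(id, { reasoning: isZhipuReasoning(id), vision: isZhipuVision(id) });
}
```

The anchored pattern is what keeps `glm-5-preview` out of the vision set: a plain `id.includes("v")` would match the `v` in `preview`.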
|
|
@@ -2393,6 +2375,8 @@ export function litellmModelManagerOptions(
|
|
|
2393
2375
|
// 22. vLLM
|
|
2394
2376
|
// ---------------------------------------------------------------------------
|
|
2395
2377
|
|
|
2378
|
+
const VLLM_DISCOVERY_TIMEOUT_MS = 10_000;
|
|
2379
|
+
|
|
2396
2380
|
export interface VllmModelManagerConfig {
|
|
2397
2381
|
apiKey?: string;
|
|
2398
2382
|
baseUrl?: string;
|
|
@@ -2405,6 +2389,7 @@ export function vllmModelManagerOptions(config?: VllmModelManagerConfig): ModelM
|
|
|
2405
2389
|
const references = createBundledReferenceMap<"openai-completions">("vllm" as Parameters<typeof getBundledModels>[0]);
|
|
2406
2390
|
return {
|
|
2407
2391
|
providerId: "vllm",
|
|
2392
|
+
cacheProviderId: `vllm:${Bun.hash(baseUrl).toString(36)}`,
|
|
2408
2393
|
fetchDynamicModels: () =>
|
|
2409
2394
|
fetchOpenAICompatibleModels({
|
|
2410
2395
|
api: "openai-completions",
|
|
@@ -2419,6 +2404,7 @@ export function vllmModelManagerOptions(config?: VllmModelManagerConfig): ModelM
|
|
|
2419
2404
|
};
|
|
2420
2405
|
},
|
|
2421
2406
|
fetch: config?.fetch,
|
|
2407
|
+
signal: AbortSignal.timeout(VLLM_DISCOVERY_TIMEOUT_MS),
|
|
2422
2408
|
}),
|
|
2423
2409
|
};
|
|
2424
2410
|
}
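
The vLLM hunks above scope the model cache to the configured endpoint and bound discovery with a timeout. The sketch below illustrates both ideas outside the Bun runtime. `Bun.hash` is replaced by a hand-written 32-bit FNV-1a hash purely as a stand-in, so the keys it produces will not match the package's. `fnv1a`, `endpointCacheKey`, `FetchLike`, and `fetchWithTimeout` are hypothetical names.

```typescript
// Same value as VLLM_DISCOVERY_TIMEOUT_MS in the diff.
const DISCOVERY_TIMEOUT_MS = 10_000;

// 32-bit FNV-1a; a stand-in for Bun.hash so the sketch runs anywhere.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value unsigned
  }
  return hash;
}

// One cache namespace per base URL, e.g. "vllm:<base36 hash>".
function endpointCacheKey(providerId: string, baseUrl: string): string {
  return `${providerId}:${fnv1a(baseUrl).toString(36)}`;
}

// Minimal fetch shape for the sketch; the global `fetch` satisfies it.
type FetchLike = (url: string, init?: { signal?: AbortSignal }) => Promise<unknown>;

// AbortSignal.timeout aborts the request once the deadline passes,
// so an unreachable server cannot stall discovery indefinitely.
function fetchWithTimeout(url: string, fetchImpl: FetchLike): Promise<unknown> {
  return fetchImpl(url, { signal: AbortSignal.timeout(DISCOVERY_TIMEOUT_MS) });
}
```

Keying the cache by endpoint means that pointing the provider at a different vLLM server does not serve the previous server's model list from cache.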
|
package/src/types.ts
CHANGED
|
@@ -186,6 +186,12 @@ export interface OpenAICompat {
|
|
|
186
186
|
requiresAssistantContentForToolCalls?: boolean;
|
|
187
187
|
/** Whether the provider supports the `tool_choice` parameter. Default: true. */
|
|
188
188
|
supportsToolChoice?: boolean;
|
|
189
|
+
/**
|
|
190
|
+
* Whether forced `tool_choice` values (`"required"` or named tools) are accepted.
|
|
191
|
+
* When false, request builders keep tools available but downgrade forced choices
|
|
192
|
+
* to provider-default auto selection. Default: true.
|
|
193
|
+
*/
|
|
194
|
+
supportsForcedToolChoice?: boolean;
|
|
189
195
|
/**
|
|
190
196
|
* Drop reasoning fields (`reasoning_effort`, OpenRouter `reasoning`) for
|
|
191
197
|
* the request when `tool_choice` forces a tool call. Mirrors the Anthropic
|
|
package/src/wire/gemini-headers.ts
CHANGED
|
@@ -20,6 +20,8 @@ export const ANTIGRAVITY_SYSTEM_INSTRUCTION =
|
|
|
20
20
|
"You are pair programming with a USER to solve their coding task. The task may require creating a new codebase, modifying or debugging an existing codebase, or simply answering a question." +
|
|
21
21
|
"**Absolute paths only**" +
|
|
22
22
|
"**Proactiveness**";
|
|
23
|
+
export const ANTIGRAVITY_NO_PREAMBLE_INSTRUCTION =
|
|
24
|
+
'CRITICAL: NEVER output rule checks, formatting guidelines, constraint checklists (e.g. "No emdashes"), or your thinking/personality preambles in the final response. Output only the final response.';
|
|
23
25
|
/**
|
|
24
26
|
* Antigravity / Cloud Code Assist user agent. Lives in its own file so discovery
|
|
25
27
|
* and usage code can read it without pulling the heavy google-gemini-cli provider
|