@oh-my-pi/pi-catalog 15.12.4 → 15.13.0

package/CHANGELOG.md CHANGED
@@ -2,109 +2,20 @@
 
  ## [Unreleased]
 
- ## [15.12.4] - 2026-06-13
-
  ### Added
 
+ - Added `modelFamilyToken(modelId)` to `@oh-my-pi/pi-catalog/identity`: a coarse vendor-lineage token (`anthropic`/`openai`/`gemini`/`kimi`/…) for "are two models the same family?" comparisons, backed by `parseKnownModel` canonical-id normalization. Opaque and comparison-only; kind/variant collapsed onto the vendor token ([#2406](https://github.com/can1357/oh-my-pi/issues/2406))
+ - Added GLM-5.2 to the bundled zai (GLM Coding Plan) catalog as the selectable 1M served model.
  - Added bundled Fireworks models `deepseek-v4-flash`, `kimi-k2.7-code`, `minimax-m2.5`, `minimax-m3`, `nemotron-3-ultra-nvfp4`, `qwen3.6-plus`, and `qwen3.7-plus`
  - Changed
-
- ### Changed
-
- - Model `contextWindow`/`maxTokens` are now `number | null`; discovery emits `null` when a provider reports no limit, replacing the `222222`/`8888` (`UNK_CONTEXT_WINDOW`/`UNK_MAX_TOKENS`) sentinels (now removed). Bundled `models.json` unknown limits are `null`.
- - Changed the `github-copilot` model context window to `524288` tokens
- - Changed Fireworks model discovery to source the control-plane `List Models` API (`GET /v1/accounts/fireworks/models?filter=supports_serverless=true`) instead of the OpenAI-compatible `/v1/models` inference listing. The inference endpoint returns a sparse, account-specific subset that omits on-demand serverless models (e.g. `kimi-k2.7-code`), so newly published serverless models stayed invisible in the picker until hand-added to the bundled catalog. The control-plane catalog enumerates every serverless model with capability metadata (`supportsServerless`/`supportsTools`/`supportsImageInput`/`contextLength`/`displayName`), paginated and filtered to tool-capable `READY` entries, then merged with bundled/models.dev references — the Kimi K2 max-output clamp and DeepSeek V4 thinking-toggle strip are preserved, and unbundled models default to reasoning so `buildModel` derives the Fireworks effort map. New serverless releases now surface automatically with no catalog edits.
-
- ### Fixed
-
- - Filled missing `contextWindow` and `maxTokens` in generated `models.json` for proxy/reseller variants by inheriting limits from canonical-family and segment-reference models
- - Ignored zero-cost `x-ai` subscription entries as reference sources when backfilling limits so inflated values are not propagated
- - Fixed the model cache opening with `PRAGMA journal_mode=WAL` before `PRAGMA busy_timeout`, so concurrent omp startups could crash inside `getDb()` on `SQLITE_BUSY` during WAL recovery instead of waiting through the transient lock. The busy handler is now installed before the first lock-taking statement ([#2421](https://github.com/can1357/oh-my-pi/issues/2421)).
-
- ## [15.11.8] - 2026-06-12
-
- ### Fixed
-
- - Fixed Antigravity `gemini-3.1-pro --thinking high` failing with `Cloud Code Assist API error (400): Request contains an invalid argument.` — the upstream `gemini-3.1-pro-high` deployment rejects every `streamGenerateContent` request on both CCA endpoints while discovery still advertises it. High effort now routes to `gemini-pro-agent` (the same "Gemini 3.1 Pro (High)" model, verified accepting the identical request body), and the model-cache fingerprint version was bumped (`merge-v2` → `merge-v3`) so existing fresh caches refetch discovery and pick up the corrected routing immediately.
-
- ## [15.11.7] - 2026-06-12
- ### Added
-
  - Added effort-tier variant collapsing (`variant-collapse`): providers that expose one logical model as several effort/thinking-suffixed upstream ids (Antigravity CCA `gemini-3.5-flash-extra-low`/`-low`/`gemini-3-flash-agent`, `gemini-3[.1]-pro-low|high`, `claude-*[-thinking]` pairs, `gpt-oss-120b-medium`) collapse into one logical entry carrying per-effort upstream routing in `thinking.effortRouting` (plus `thinking.suppressWhenOff` for Cloud Code Assist ids whose baked server default re-applies when `thinkingConfig` is omitted). Request-time code resolves the outbound id via `resolveWireModelId(model, effort)`; selection, caching, and usage attribution key on the logical id.
  - Added the automatic `X`/`X-thinking` pair rule (`deriveThinkingPairFamilies`): any provider's live bare/thinking twin collapses into the bare id, routing thinking-enabled requests to the `-thinking` backing id (trailing or infix token, so `kimi-k2-thinking-turbo` pairs with `kimi-k2-turbo`). Gated on same api and compatible pricing — all-zero cost rows count as unknown, while twins that both carry real, differing prices remain separate SKUs.
  - Added `collapseBuiltModelVariants` and wired collapsing at every materialization point — Antigravity discovery, the catalog generator, and the model-manager merge — so stale sources (old static beside collapsed dynamic results, mixed cache rows) converge on logical entries instead of unioning raw tier ids back into the catalog.
  - Added `thinking.requiresEffort`, baked for reasoning-only upstreams — Gemini 3.x (levels only, no off), Gemini 2.5 Pro (thinkingBudget floors at 128, rejects 0), OpenAI o-series, MiniMax M2, and thinking-variant SKUs (`*-thinking`/`*-reasoner`/`*-reasoning`, with a negation-aware token grammar so `non-thinking` ids never match). Identity derivation bakes it for new entries and `fillThinkingWireDefaults` backfills explicit/cached metadata; `minimumSupportedEffort` exposes the canonical floor. Pair-collapsed twins drop member flags (their off routes to the bare SKU), while identity re-flags pairs whose logical id is itself mandatory
-
- ### Changed
-
- - Changed model display names to drop model-extrinsic decorations: gateway author prefixes (`OpenAI: …`, `Google: …`), `(latest)` alias markers, `(Antigravity)` provider attribution, price tiers (`($$$$)`), and promo/lifecycle tags (`(20% off)`, `(retires …)`). `cleanModelName` is applied in `buildModel` (covers live discovery and stale caches) and as a catalog-generator pass; Antigravity discovery no longer appends `(Antigravity)` to display names. Variant tags that map to distinct wire ids (`(Thinking)`, `(free)`, `(Fast)`, dates, regions) are preserved.
- - Changed the `google-antigravity` default model from `gemini-3-pro-high` to `gemini-3.1-pro`
- - Changed `gemini-2.5-flash-thinking` handling from discovery-denylist to collapsing into `gemini-2.5-flash` (thinking-enabled requests route to the `-thinking` backing id)
- - Bumped the model cache schema to v5 so rows predating effort-tier variant collapsing (raw `-low`/`-high`/`-thinking` member ids) are invalidated
-
- ### Fixed
-
- - Fixed catalog generation to apply effort-tier variant collapsing before provider grouping to ensure collapsed model families are consistently materialized without being impacted by in-loop mutation
- - Fixed Kimi K2.6 OpenAI-compatible compat metadata to use a 300s stream watchdog floor, covering Fire Pass router ids as well as public `kimi-k2.6` ids so long reasoning starts do not hit the generic first-event timeout ([#2366](https://github.com/can1357/oh-my-pi/issues/2366)).
-
- ## [15.11.4] - 2026-06-12
-
- ### Fixed
-
- - Fixed MiniMax M2-family and OpenAI gpt-oss model metadata so OpenAI-compatible catalog entries declare only `low|medium|high` thinking efforts. Their upstreams reject `minimal`, `xhigh`, and Fireworks' `minimal → none` wire mapping, so `fireworks/minimax-m2.7` as the smol auto-thinking classifier model 400ed on every turn. OpenAI-compatible provider effort maps (`Groq qwen/qwen3-32b`, DeepSeek-family, OpenRouter Anthropic adaptive, Fireworks `minimal → none`) now bake into `thinking.effortMap` in catalog metadata instead of `buildOpenAICompat`, and request builders read that field directly. Regenerated `models.json` now makes `disableReasoning` choose `low` for those families while leaving GLM-5.x and other Fireworks models on the existing `minimal → none` path ([#2315](https://github.com/can1357/oh-my-pi/issues/2315)).
- ### Added
-
  - Added `requiresJuiceZeroHack` Responses-API compat flag, resolved by `buildOpenAIResponsesCompat` from GPT-5-family model names and overridable via sparse model `compat` config. Replaces the request-time `model.name.startsWith("gpt-5")` sniff that gated the trailing `# Juice: 0 !important` no-reasoning developer item.
-
- ## [15.11.3] - 2026-06-11
- ### Added
-
  - Added `requestModelId` on `Model` to represent the upstream model id used when a catalog entry is a local variant
  - Added synthetic GitHub Copilot long-context model variants with `-1m` suffixes when tiered token pricing is advertised
-
- ### Changed
-
- - Changed GitHub Copilot discovery to request `X-GitHub-Api-Version: 2026-06-01` from `api.githubcopilot.com`
- - Changed GitHub Copilot discovery to cap base model `contextWindow` to the default token tier and keep long-context access as the separate `-1m` model entry
- - Changed Copilot model mapping to omit non-chat `/models` entries and enable image input for models whose capabilities indicate vision support
-
- ### Fixed
-
- - Fixed long-context variant pricing to use `billing.token_prices.long_context` rates instead of default model pricing
- - Fixed `mapModel` handling in OpenAI-compatible discovery so returning `null` now skips a model entry rather than falling back to defaults
- - Fixed model ID precedence so a real upstream Copilot model id is kept when it conflicts with a synthesized `-1m` variant
-
- ## [15.11.1] - 2026-06-11
-
- ### Fixed
-
- - Fixed NVIDIA NIM Qwen turns failing with `400 Validation: Unsupported parameter(s): enable_thinking`. NIM's chat-completions schema is `additionalProperties: false` and exposes thinking via the vLLM convention `chat_template_kwargs.enable_thinking`; `buildOpenAICompat` was sending top-level `enable_thinking` for every `qwen/*` id regardless of host. Registered `nvidia` as a known host (`integrate.api.nvidia.com`) and routed NVIDIA-hosted Qwen models to `thinkingFormat: "qwen-chat-template"` ([#2299](https://github.com/can1357/oh-my-pi/issues/2299)).
- - Fixed Moonshot/Kimi native OpenAI-compatible request metadata so Kimi K2 uses `max_tokens` and omits OpenAI-only `store`, restoring first-turn output with `MOONSHOT_API_KEY` ([#2289](https://github.com/can1357/oh-my-pi/issues/2289)).
-
- ## [15.11.0] - 2026-06-10
-
- ### Fixed
-
- - Fixed `buildModel` so malformed explicit thinking metadata without `efforts` is treated as sparse input and inferred instead of crashing during model resolution ([#2251](https://github.com/can1357/oh-my-pi/issues/2251)).
-
- ## [15.10.12] - 2026-06-10
-
- ### Added
-
  - Added `grok-composer-2.5-fast` (Cursor "Composer 2.5 Fast") to the xAI Grok OAuth (SuperGrok) catalog: non-reasoning, text-only, 200K context.
-
- ### Changed
-
- - Set every xAI Grok OAuth (SuperGrok) curated model's max output tokens to mirror its context window (`grok-build`, `grok-4.3`, `grok-4.20-0309-{reasoning,non-reasoning}`, `grok-4.20-multi-agent-0309`, `grok-composer-2.5-fast`), replacing the `8888` `UNK_MAX_TOKENS` placeholder (and a stale `30000` on three grok-4.x entries). xAI's OAuth `/v1/models` reports no per-request output limit, so the curated catalog now owns `maxTokens` like `contextWindow`, deterministic on both the static-seed and online-overlay paths; the `openai-responses` wire still clamps the actual request to `OPENAI_MAX_OUTPUT_TOKENS` (64k).
-
- ### Fixed
-
- - Excluded zero-cost `xai-oauth` subscription entries from the model reference indexes (`buildModelReferenceIndex`, `createReferenceResolver`), so their zero pricing and context-window-sized `maxTokens` cannot outrank paid/public Grok references when resolving custom-provider model identities.
-
- ## [15.10.11] - 2026-06-10
-
- ### Added
-
  - Added `hostMatchesUrl`, `modelMatchesHost`, and endpoint-shape helpers in the new `hosts` module for consistent provider/baseUrl matching
  - `buildModel(spec)` (`build.ts`) is now the single Model constructor: it materializes the fully-resolved compat record and canonical thinking metadata exactly once (compat first, thinking derived from identity + resolved compat), so `Model.compat` is a required, complete `CompatOf<TApi>` (`ResolvedOpenAICompat`/`ResolvedOpenAIResponsesCompat`/`ResolvedAnthropicCompat`) and request-path code reads fields with zero URL parsing and zero per-request allocation. Sparse user/config overrides live on the new `ModelSpec<TApi>` input shape and survive on `Model.compatConfig` for introspection.
  - Added `ResolvedAnthropicCompat.supportsSamplingParams` (Opus 4.7+/Fable/Mythos reject `temperature`/`top_p`/`top_k` with a 400), baked at build time from model identity so the request path stops re-parsing model ids.
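Reviewer note: the effort-tier collapsing entry in this hunk describes one logical model id whose outbound id depends on the requested effort. A minimal sketch of that lookup follows. The `SketchModel` and `Effort` types and the function body are assumptions made for illustration, since this diff names `thinking.effortRouting` and `resolveWireModelId(model, effort)` but does not show their shapes. The two routed ids are the ones the changelog mentions for `gemini-3.1-pro`.

```typescript
// Hypothetical types: the changelog names `thinking.effortRouting` and
// `resolveWireModelId(model, effort)` but this diff does not show their shapes.
type Effort = "minimal" | "low" | "medium" | "high" | "xhigh";

interface SketchModel {
  /** Logical id: the key for selection, caching, and usage attribution. */
  id: string;
  thinking?: {
    /** Per-effort upstream ids; efforts without an entry use the logical id. */
    effortRouting?: Partial<Record<Effort, string>>;
  };
}

// Assumed behaviour: route to the per-effort backing id when one exists,
// otherwise send the logical id unchanged.
function resolveWireModelIdSketch(model: SketchModel, effort?: Effort): string {
  if (effort === undefined) return model.id;
  return model.thinking?.effortRouting?.[effort] ?? model.id;
}

const gemini31Pro: SketchModel = {
  id: "gemini-3.1-pro",
  thinking: {
    effortRouting: { low: "gemini-3.1-pro-low", high: "gemini-pro-agent" },
  },
};

const wireHigh = resolveWireModelIdSketch(gemini31Pro, "high");
const wireMedium = resolveWireModelIdSketch(gemini31Pro, "medium");
console.log(wireHigh, wireMedium);
```

Keeping the logical id as the only key that callers see is what lets stale caches and mixed sources converge on one entry; only the request builder needs to know about the backing ids.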
@@ -116,6 +27,21 @@
 
  ### Changed
 
+ - Changed catalog metadata to update a model’s per-token pricing to input 0.09 and output 0.18
+ - Changed the same cataloged model’s maximum token limit from 384000 to 65536
+ - Pinned zai `glm-5.2` to 1M context during catalog generation so endpoint discovery and older fallbacks cannot regress it to 200k.
+ - Replaced the hand-maintained `zhipu-coding-plan` GLM reasoning allowlist and vision regex with a `parseGlmModel` family classifier in `identity/classify.ts` (variant + vision + version), surfaced as `isReasoningGlmModelId` / `isGlmVisionModelId`. Discovery now derives reasoning/vision capability from the GLM family instead of a per-id list, so newly-bumped integers (`glm-5.3`, `glm-6`, …) are covered automatically while `-flash`/`-preview` and the vision `…v` shape stay correctly classified.
+ - Model `contextWindow`/`maxTokens` are now `number | null`; discovery emits `null` when a provider reports no limit, replacing the `222222`/`8888` (`UNK_CONTEXT_WINDOW`/`UNK_MAX_TOKENS`) sentinels (now removed). Bundled `models.json` unknown limits are `null`.
+ - Changed the `github-copilot` model context window to `524288` tokens
+ - Changed Fireworks model discovery to source the control-plane `List Models` API (`GET /v1/accounts/fireworks/models?filter=supports_serverless=true`) instead of the OpenAI-compatible `/v1/models` inference listing. The inference endpoint returns a sparse, account-specific subset that omits on-demand serverless models (e.g. `kimi-k2.7-code`), so newly published serverless models stayed invisible in the picker until hand-added to the bundled catalog. The control-plane catalog enumerates every serverless model with capability metadata (`supportsServerless`/`supportsTools`/`supportsImageInput`/`contextLength`/`displayName`), paginated and filtered to tool-capable `READY` entries, then merged with bundled/models.dev references — the Kimi K2 max-output clamp and DeepSeek V4 thinking-toggle strip are preserved, and unbundled models default to reasoning so `buildModel` derives the Fireworks effort map. New serverless releases now surface automatically with no catalog edits.
+ - Changed model display names to drop model-extrinsic decorations: gateway author prefixes (`OpenAI: …`, `Google: …`), `(latest)` alias markers, `(Antigravity)` provider attribution, price tiers (`($$$$)`), and promo/lifecycle tags (`(20% off)`, `(retires …)`). `cleanModelName` is applied in `buildModel` (covers live discovery and stale caches) and as a catalog-generator pass; Antigravity discovery no longer appends `(Antigravity)` to display names. Variant tags that map to distinct wire ids (`(Thinking)`, `(free)`, `(Fast)`, dates, regions) are preserved.
+ - Changed the `google-antigravity` default model from `gemini-3-pro-high` to `gemini-3.1-pro`
+ - Changed `gemini-2.5-flash-thinking` handling from discovery-denylist to collapsing into `gemini-2.5-flash` (thinking-enabled requests route to the `-thinking` backing id)
+ - Bumped the model cache schema to v5 so rows predating effort-tier variant collapsing (raw `-low`/`-high`/`-thinking` member ids) are invalidated
+ - Changed GitHub Copilot discovery to request `X-GitHub-Api-Version: 2026-06-01` from `api.githubcopilot.com`
+ - Changed GitHub Copilot discovery to cap base model `contextWindow` to the default token tier and keep long-context access as the separate `-1m` model entry
+ - Changed Copilot model mapping to omit non-chat `/models` entries and enable image input for models whose capabilities indicate vision support
+ - Set every xAI Grok OAuth (SuperGrok) curated model's max output tokens to mirror its context window (`grok-build`, `grok-4.3`, `grok-4.20-0309-{reasoning,non-reasoning}`, `grok-4.20-multi-agent-0309`, `grok-composer-2.5-fast`), replacing the `8888` `UNK_MAX_TOKENS` placeholder (and a stale `30000` on three grok-4.x entries). xAI's OAuth `/v1/models` reports no per-request output limit, so the curated catalog now owns `maxTokens` like `contextWindow`, deterministic on both the static-seed and online-overlay paths; the `openai-responses` wire still clamps the actual request to `OPENAI_MAX_OUTPUT_TOKENS` (64k).
  - Changed OpenAI compatibility detection to use shared host classifiers (`modelMatchesHost`/`hostMatchesUrl`) with normalized matching instead of raw URL substring checks
  - Changed `hostMatchesUrl`/`modelMatchesHost` usage in compatibility detection to reduce mismatches across case variants and provider alias hosts
  - Provider catalog entries now carry the runtime API-key env fallback as an ordered `envVars` list; `catalogDiscovery.envVars` became an optional generation-time override (only `cursor` and `vercel-ai-gateway` differ) and `PROVIDER_DESCRIPTORS` materializes the resolved list for `generate-models.ts`.
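Reviewer note: the `number | null` change to `contextWindow`/`maxTokens` in this hunk means consumers can no longer compare against the removed `222222`/`8888` sentinels and have to branch on `null` instead. The sketch below shows one way a caller might do that. `LimitsSketch`, `clampOutputTokens`, and `describeContextWindow` are hypothetical names that are not part of the package, and the numeric limits are illustrative values borrowed from unrelated entries in this changelog.

```typescript
// Hypothetical slice of a model record: only the two nullable limit fields.
interface LimitsSketch {
  contextWindow: number | null;
  maxTokens: number | null;
}

// Clamp a requested output budget only when the model reports a limit.
// `null` means "unknown", so the request passes through unchanged.
function clampOutputTokens(model: LimitsSketch, requested: number): number {
  return model.maxTokens === null ? requested : Math.min(requested, model.maxTokens);
}

// Render the context window for a picker without inventing a number.
function describeContextWindow(model: LimitsSketch): string {
  return model.contextWindow === null ? "unknown" : `${model.contextWindow} tokens`;
}

const withLimits: LimitsSketch = { contextWindow: 524288, maxTokens: 65536 };
const withoutLimits: LimitsSketch = { contextWindow: null, maxTokens: null };

console.log(clampOutputTokens(withLimits, 100000), describeContextWindow(withLimits));
console.log(clampOutputTokens(withoutLimits, 100000), describeContextWindow(withoutLimits));
```

Code that previously checked `maxTokens === 8888` or `contextWindow === 222222` would silently stop matching after this release, so an explicit `=== null` branch is the safer migration.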
@@ -126,6 +52,25 @@
 
  ### Fixed
 
+ - Fixed MiniMax-M3 catalog context for `minimax` and `minimax-cn` to report the documented 1M long-context tier instead of the upstream 512K pricing boundary ([#2576](https://github.com/can1357/oh-my-pi/issues/2576)).
+ - Fixed OpenCode Go MiMo catalog metadata so title generation and other tool-enabled calls omit unsupported `tool_choice` instead of triggering provider 400s ([#2509](https://github.com/can1357/oh-my-pi/issues/2509)).
+ - Fixed OpenCode Go `kimi-k2.7-code` catalog metadata so resolve-gate requests use automatic tool selection instead of Moonshot-rejected forced `tool_choice` ([#2546](https://github.com/can1357/oh-my-pi/issues/2546)).
+ - Fixed Anthropic compat for the `github-copilot` host so `supportsEagerToolInputStreaming` defaults to `false` there, matching the Copilot proxy which rejects the per-tool `eager_input_streaming` field ([#2558](https://github.com/can1357/oh-my-pi/issues/2558)).
+ - Scoped vLLM model cache validity to the discovery base URL so changed endpoints refetch immediately, and bounded built-in vLLM discovery requests with a timeout.
+ - Filled missing `contextWindow` and `maxTokens` in generated `models.json` for proxy/reseller variants by inheriting limits from canonical-family and segment-reference models
+ - Ignored zero-cost `x-ai` subscription entries as reference sources when backfilling limits so inflated values are not propagated
+ - Fixed the model cache opening with `PRAGMA journal_mode=WAL` before `PRAGMA busy_timeout`, so concurrent omp startups could crash inside `getDb()` on `SQLITE_BUSY` during WAL recovery instead of waiting through the transient lock. The busy handler is now installed before the first lock-taking statement ([#2421](https://github.com/can1357/oh-my-pi/issues/2421)).
+ - Fixed Antigravity `gemini-3.1-pro --thinking high` failing with `Cloud Code Assist API error (400): Request contains an invalid argument.` — the upstream `gemini-3.1-pro-high` deployment rejects every `streamGenerateContent` request on both CCA endpoints while discovery still advertises it. High effort now routes to `gemini-pro-agent` (the same "Gemini 3.1 Pro (High)" model, verified accepting the identical request body), and the model-cache fingerprint version was bumped (`merge-v2` → `merge-v3`) so existing fresh caches refetch discovery and pick up the corrected routing immediately.
+ - Fixed catalog generation to apply effort-tier variant collapsing before provider grouping to ensure collapsed model families are consistently materialized without being impacted by in-loop mutation
+ - Fixed Kimi K2.6 OpenAI-compatible compat metadata to use a 300s stream watchdog floor, covering Fire Pass router ids as well as public `kimi-k2.6` ids so long reasoning starts do not hit the generic first-event timeout ([#2366](https://github.com/can1357/oh-my-pi/issues/2366)).
+ - Fixed MiniMax M2-family and OpenAI gpt-oss model metadata so OpenAI-compatible catalog entries declare only `low|medium|high` thinking efforts. Their upstreams reject `minimal`, `xhigh`, and Fireworks' `minimal → none` wire mapping, so `fireworks/minimax-m2.7` as the smol auto-thinking classifier model 400ed on every turn. OpenAI-compatible provider effort maps (`Groq qwen/qwen3-32b`, DeepSeek-family, OpenRouter Anthropic adaptive, Fireworks `minimal → none`) now bake into `thinking.effortMap` in catalog metadata instead of `buildOpenAICompat`, and request builders read that field directly. Regenerated `models.json` now makes `disableReasoning` choose `low` for those families while leaving GLM-5.x and other Fireworks models on the existing `minimal → none` path ([#2315](https://github.com/can1357/oh-my-pi/issues/2315)).
+ - Fixed long-context variant pricing to use `billing.token_prices.long_context` rates instead of default model pricing
+ - Fixed `mapModel` handling in OpenAI-compatible discovery so returning `null` now skips a model entry rather than falling back to defaults
+ - Fixed model ID precedence so a real upstream Copilot model id is kept when it conflicts with a synthesized `-1m` variant
+ - Fixed NVIDIA NIM Qwen turns failing with `400 Validation: Unsupported parameter(s): enable_thinking`. NIM's chat-completions schema is `additionalProperties: false` and exposes thinking via the vLLM convention `chat_template_kwargs.enable_thinking`; `buildOpenAICompat` was sending top-level `enable_thinking` for every `qwen/*` id regardless of host. Registered `nvidia` as a known host (`integrate.api.nvidia.com`) and routed NVIDIA-hosted Qwen models to `thinkingFormat: "qwen-chat-template"` ([#2299](https://github.com/can1357/oh-my-pi/issues/2299)).
+ - Fixed Moonshot/Kimi native OpenAI-compatible request metadata so Kimi K2 uses `max_tokens` and omits OpenAI-only `store`, restoring first-turn output with `MOONSHOT_API_KEY` ([#2289](https://github.com/can1357/oh-my-pi/issues/2289)).
+ - Fixed `buildModel` so malformed explicit thinking metadata without `efforts` is treated as sparse input and inferred instead of crashing during model resolution ([#2251](https://github.com/can1357/oh-my-pi/issues/2251)).
+ - Excluded zero-cost `xai-oauth` subscription entries from the model reference indexes (`buildModelReferenceIndex`, `createReferenceResolver`), so their zero pricing and context-window-sized `maxTokens` cannot outrank paid/public Grok references when resolving custom-provider model identities.
  - Fixed Anthropic official-endpoint detection to require strict HTTPS hostname matching so non-official or lookalike URLs are no longer treated as official Anthropic hosts
  - Fixed Ollama Cloud dynamic discovery so same-id matches from other providers no longer supply context-window or max-output-token limits for discovered models.
  - Wired `@oh-my-pi/pi-catalog` into the release publish package list, tarball install smoke test, and root `bun generate-models` script.
@@ -134,4 +79,26 @@
 
  ### Removed
 
- - Removed the runtime enrichment layer: `enrichModelThinking` (and its non-enumerable memo-slot cache), `refreshModelThinking`, `modelOmitsReasoningEffort`, and the `model-thinking` re-exports of generator-only policies. Thinking metadata is resolved exactly once inside `buildModel`; runtime helpers (`getSupportedEfforts`, `clampThinkingLevelForModel`, `requireSupportedEffort`, the effort mappers) are pure field reads.
+ - Removed the runtime enrichment layer: `enrichModelThinking` (and its non-enumerable memo-slot cache), `refreshModelThinking`, `modelOmitsReasoningEffort`, and the `model-thinking` re-exports of generator-only policies. Thinking metadata is resolved exactly once inside `buildModel`; runtime helpers (`getSupportedEfforts`, `clampThinkingLevelForModel`, `requireSupportedEffort`, the effort mappers) are pure field reads.
+
+ ## [15.13.0] - 2026-06-14
+
+ ## [15.12.6] - 2026-06-14
+
+ ## [15.12.4] - 2026-06-13
+
+ ## [15.11.8] - 2026-06-12
+
+ ## [15.11.7] - 2026-06-12
+
+ ## [15.11.4] - 2026-06-12
+
+ ## [15.11.3] - 2026-06-11
+
+ ## [15.11.1] - 2026-06-11
+
+ ## [15.11.0] - 2026-06-10
+
+ ## [15.10.12] - 2026-06-10
+
+ ## [15.10.11] - 2026-06-10
@@ -12,6 +12,7 @@ export type SemVer = {
  export type GeminiKind = "pro" | "flash";
  export type AnthropicKind = "opus" | "sonnet" | "fable" | "mythos";
  export type OpenAIVariant = "base" | "codex" | "codex-max" | "codex-mini" | "codex-spark" | "mini" | "max" | "nano";
+ export type GlmVariant = "base" | "air" | "turbo" | "flash" | "flashx" | "preview";
  export interface GeminiModel {
  family: "gemini";
  kind: GeminiKind;
@@ -27,6 +28,14 @@ export interface OpenAIModel {
  variant: OpenAIVariant;
  version: SemVer;
  }
+ export interface GlmModel {
+ family: "glm";
+ /** Suffix variant (`-air`, `-turbo`, `-flash`, `-flashx`, `-preview`); `base` when none. */
+ variant: GlmVariant;
+ /** Vision SKU — the `v` that attaches directly to the version (`glm-4v`, `glm-4.5v`). */
+ vision: boolean;
+ version: SemVer;
+ }
  export interface UnknownModel {
  family: "unknown";
  id: string;
@@ -35,9 +44,18 @@ export type ParsedModel = GeminiModel | AnthropicModel | OpenAIModel | UnknownMo
  /** Strip a provider namespace prefix (`openai/gpt-5.4` → `gpt-5.4`). */
  export declare function bareModelId(modelId: string): string;
  export declare function parseKnownModel(modelId: string): ParsedModel;
- export declare function parseGeminiModel(modelId: string): GeminiModel | null;
- export declare function parseAnthropicModel(modelId: string): AnthropicModel | null;
- export declare function parseOpenAIModel(modelId: string): OpenAIModel | null;
+ export declare const parseGeminiModel: (modelId: string) => GeminiModel | null;
+ export declare const parseAnthropicModel: (modelId: string) => AnthropicModel | null;
+ export declare const parseOpenAIModel: (modelId: string) => OpenAIModel | null;
+ /**
+ * Parse a GLM (Zhipu / Z.AI) model id into family + variant + vision + version.
+ * Shape: `glm-<version>[v][-<variant>]` — e.g. `glm-4.5`, `glm-4.5-air`,
+ * `glm-5-turbo`, `glm-4.5v`, `glm-5-preview`. The `v` (vision) attaches to the
+ * version; other variants are `-` suffixes. Standalone like `parseAnthropicModel`
+ * is used in family.ts — GLM needs no global thinking policy, so it stays out of
+ * `parseKnownModel`.
+ */
+ export declare const parseGlmModel: (modelId: string) => GlmModel | null;
  export declare function isFableOrMythos(kind: AnthropicKind): boolean;
  export declare function parseSemVer(version: string): SemVer | null;
  export declare function semverGte(left: SemVer | string, right: SemVer | string): boolean;
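Reviewer note: the `parseGlmModel` doc comment in this hunk specifies the id grammar `glm-<version>[v][-<variant>]`, and the changelog says the reasoning check covers glm-4.5 and up on the base, `-air`, and `-turbo` lines. The sketch below is one possible reading of that grammar and is not the package's implementation. The `SemVerSketch` field names are an assumption (the diff shows only the opening of the real `SemVer` type), and every function name ending in `Sketch` is hypothetical.

```typescript
type GlmVariant = "base" | "air" | "turbo" | "flash" | "flashx" | "preview";

// Assumed SemVer shape; the diff shows only the opening of the real type.
interface SemVerSketch {
  major: number;
  minor: number;
  patch: number;
}

interface GlmModelSketch {
  family: "glm";
  variant: GlmVariant;
  vision: boolean;
  version: SemVerSketch;
}

// glm-<version>[v][-<variant>]: the `v` attaches to the version, variants are suffixes.
const GLM_ID = /^glm-(\d+)(?:\.(\d+))?(?:\.(\d+))?(v)?(?:-(air|turbo|flashx|flash|preview))?$/;

function parseGlmModelSketch(modelId: string): GlmModelSketch | null {
  // Drop a provider namespace prefix such as `zai/` before matching.
  const bare = modelId.slice(modelId.lastIndexOf("/") + 1).toLowerCase();
  const match = GLM_ID.exec(bare);
  if (match === null) return null;
  const [, major, minor, patch, vision, variant] = match;
  return {
    family: "glm",
    variant: (variant ?? "base") as GlmVariant,
    vision: vision !== undefined,
    version: { major: Number(major), minor: Number(minor ?? 0), patch: Number(patch ?? 0) },
  };
}

// Reasoning-capable per the changelog: 4.5 and up, base/air/turbo, never vision.
function isReasoningGlmSketch(modelId: string): boolean {
  const glm = parseGlmModelSketch(modelId);
  if (glm === null || glm.vision) return false;
  if (glm.variant !== "base" && glm.variant !== "air" && glm.variant !== "turbo") return false;
  const { major, minor } = glm.version;
  return major > 4 || (major === 4 && minor >= 5);
}

console.log(parseGlmModelSketch("glm-4.5-air"), isReasoningGlmSketch("glm-6"));
```

Classifying by family rather than by a per-id allowlist is what lets a future id such as `glm-5.3` or `glm-6` pass the reasoning check without a catalog edit, while `-flash`, `-preview`, and the vision shape stay excluded.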
@@ -37,6 +37,28 @@ export declare function isMinimaxM2FamilyModelId(modelId: string): boolean;
  * and `none`.
  */
  export declare function isOpenAIGptOssModelId(modelId: string): boolean;
+ /**
+ * Reasoning-capable GLM coding SKUs: glm-4.5 and up on the base / `-air` /
+ * `-turbo` lines. Excludes the vision (`…v`) shape, the non-reasoning
+ * `-flash`/`-flashx`/`-preview` variants, and pre-4.5 ids. Matching the family
+ * keeps newly-bumped integers (`glm-5.3`, `glm-6`, …) covered without a per-id
+ * allowlist.
+ */
+ export declare function isReasoningGlmModelId(modelId: string): boolean;
+ /** GLM vision SKUs — the `v` that attaches to the version (`glm-4v`, `glm-4.5v`). */
+ export declare function isGlmVisionModelId(modelId: string): boolean;
+ /**
+ * Coarse vendor-lineage token for "are two models the same family?" checks
+ * (e.g. picking a cross-family reviewer). All Claude point releases share a token,
+ * Claude and GPT differ; namespace prefixes and aggregator mirrors fold onto the
+ * lineage via {@link parseKnownModel}'s `bareModelId` normalization. Opaque and
+ * comparison-only — not a stable key to persist, since the vocabulary tracks new
+ * releases. Returns `""` for ids it cannot classify; callers fall back to the provider.
+ *
+ * Vendor-only by design: a model's kind/variant (opus vs sonnet, codex vs base) is
+ * collapsed onto the single vendor token; use {@link parseKnownModel} for finer breakdowns.
+ */
+ export declare function modelFamilyToken(modelId: string): string;
  /**
  * Adaptive thinking `display` is supported starting with Claude Opus 4.7 and
  * the Claude Fable/Mythos 5 generation. Older adaptive-thinking models
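Reviewer note: the `modelFamilyToken` doc comment in this hunk describes a comparison-only token, an empty string for unclassifiable ids, and a caller-side fallback to the provider. The sketch below illustrates that calling pattern with a stand-in classifier. The regex table, the token strings it returns, and the names `modelFamilyTokenSketch`, `lineageOrProvider`, and `pickCrossFamilyReviewer` are all assumptions; the real function is documented as backed by `parseKnownModel`, which is not reproduced here.

```typescript
// Stand-in rules: the real function is documented as backed by
// `parseKnownModel`; these regexes are illustrative only.
const LINEAGE_RULES: ReadonlyArray<readonly [RegExp, string]> = [
  [/^claude-/, "anthropic"],
  [/^(gpt-|o\d)/, "openai"],
  [/^gemini-/, "gemini"],
  [/^kimi-/, "kimi"],
];

function modelFamilyTokenSketch(modelId: string): string {
  // Fold namespace prefixes (`openai/gpt-5.5`) onto the bare id first.
  const bare = modelId.slice(modelId.lastIndexOf("/") + 1).toLowerCase();
  for (const [pattern, token] of LINEAGE_RULES) {
    if (pattern.test(bare)) return token;
  }
  return ""; // unclassified: the caller decides what to do
}

interface Candidate {
  id: string;
  provider: string;
}

// Documented fallback: when the token is empty, compare by provider instead.
function lineageOrProvider(candidate: Candidate): string {
  return modelFamilyTokenSketch(candidate.id) || candidate.provider;
}

// Pick the first candidate whose lineage differs from the author's.
function pickCrossFamilyReviewer(author: Candidate, candidates: readonly Candidate[]): Candidate | undefined {
  const authorLineage = lineageOrProvider(author);
  return candidates.find((candidate) => lineageOrProvider(candidate) !== authorLineage);
}

const author: Candidate = { id: "claude-opus-4-8", provider: "anthropic" };
const reviewer = pickCrossFamilyReviewer(author, [
  { id: "anthropic/claude-sonnet-4-5", provider: "openrouter" },
  { id: "openai/gpt-5.5", provider: "openrouter" },
]);
console.log(reviewer?.id);
```

Because the doc comment calls the token opaque and not a stable key, the sketch only ever compares tokens for equality and never stores or switches on their values.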
@@ -22,6 +22,8 @@ export interface ModelManagerOptions<TApi extends Api = Api, TModelsDevPayload =
  staticModels?: readonly ModelSpec<TApi>[];
  /** Optional override for the cache database path. Default: <agent-dir>/models.db. */
  cacheDbPath?: string;
+ /** Optional provider id override for cache namespacing. Defaults to providerId. */
+ cacheProviderId?: string;
  /** Maximum cache age in milliseconds before considered stale. Default: 24h. */
  cacheTtlMs?: number;
  /** When true, a successful dynamic fetch is the complete provider catalog and prunes static-only models. */
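Reviewer note: the new `cacheProviderId` option is documented only as a cache-namespacing override that defaults to `providerId`. The sketch below shows what that default could look like and why a separate namespace keeps rows apart. `CacheOptionsSketch`, `cacheNamespace`, the in-memory `Map`, and the example ids are hypothetical; the `providerId` field is inferred from the doc comment and is not visible in this hunk.

```typescript
// Hypothetical slice of the options: `providerId` is inferred from the doc
// comment ("Defaults to providerId") and is not shown in this hunk.
interface CacheOptionsSketch {
  providerId: string;
  cacheProviderId?: string;
}

// Assumed default rule: use the override when present, else the provider id.
function cacheNamespace(options: CacheOptionsSketch): string {
  return options.cacheProviderId ?? options.providerId;
}

// In-memory stand-in for the cache table, keyed by namespace.
const cacheRows = new Map<string, string[]>();

function writeCache(options: CacheOptionsSketch, modelIds: string[]): void {
  cacheRows.set(cacheNamespace(options), modelIds);
}

function readCache(options: CacheOptionsSketch): string[] | undefined {
  return cacheRows.get(cacheNamespace(options));
}

// Two managers for the same provider, pointed at different endpoints.
const local: CacheOptionsSketch = { providerId: "vllm", cacheProviderId: "vllm@localhost" };
const remote: CacheOptionsSketch = { providerId: "vllm", cacheProviderId: "vllm@gpu-box" };

writeCache(local, ["model-a"]);
writeCache(remote, ["model-b"]);
console.log(readCache(local), readCache(remote));
```

Without the override both managers would resolve to the `vllm` namespace and overwrite each other's rows.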
@@ -25,10 +25,10 @@ export declare const CATALOG_PROVIDERS: readonly [{
  };
  }, {
  readonly id: "amazon-bedrock";
- readonly defaultModel: "us.anthropic.claude-opus-4-6-v1";
+ readonly defaultModel: "us.anthropic.claude-opus-4-8";
  }, {
  readonly id: "anthropic";
- readonly defaultModel: "claude-opus-4-6";
+ readonly defaultModel: "claude-opus-4-8";
  readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"anthropic-messages", unknown>;
  }, {
  readonly id: "cerebras";
@@ -136,7 +136,7 @@ export declare const CATALOG_PROVIDERS: readonly [{
  };
  }, {
  readonly id: "litellm";
- readonly defaultModel: "claude-opus-4-6";
+ readonly defaultModel: "claude-opus-4-8";
  readonly envVars: readonly ["LITELLM_API_KEY"];
  readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"openai-completions", unknown>;
  readonly catalogDiscovery: {
@@ -176,7 +176,7 @@ export declare const CATALOG_PROVIDERS: readonly [{
  };
  }, {
  readonly id: "nanogpt";
- readonly defaultModel: "openai/gpt-5.4";
+ readonly defaultModel: "openai/gpt-5.5";
  readonly envVars: readonly ["NANO_GPT_API_KEY"];
  readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"openai-completions", unknown>;
  readonly catalogDiscovery: {
@@ -207,12 +207,12 @@ export declare const CATALOG_PROVIDERS: readonly [{
207
207
  };
208
208
  }, {
209
209
  readonly id: "openai";
210
- readonly defaultModel: "gpt-5.4";
210
+ readonly defaultModel: "gpt-5.5";
211
211
  readonly envVars: readonly ["OPENAI_API_KEY"];
212
212
  readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"openai-responses", unknown>;
213
213
  }, {
214
214
  readonly id: "openai-codex";
215
- readonly defaultModel: "gpt-5.4";
215
+ readonly defaultModel: "gpt-5.5";
216
216
  readonly envVars: readonly ["OPENAI_CODEX_OAUTH_TOKEN"];
217
217
  readonly specialModelManager: true;
218
218
  }, {
@@ -227,7 +227,7 @@ export declare const CATALOG_PROVIDERS: readonly [{
227
227
  readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<import("..").Api, unknown>;
228
228
  }, {
229
229
  readonly id: "openrouter";
230
- readonly defaultModel: "openai/gpt-5.4";
230
+ readonly defaultModel: "openai/gpt-5.5";
231
231
  readonly envVars: readonly ["OPENROUTER_API_KEY"];
232
232
  readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"openai-completions", unknown>;
233
233
  readonly catalogDiscovery: {
@@ -361,7 +361,7 @@ export declare const CATALOG_PROVIDERS: readonly [{
361
361
  };
362
362
  }, {
363
363
  readonly id: "zenmux";
364
- readonly defaultModel: "anthropic/claude-opus-4.6";
364
+ readonly defaultModel: "anthropic/claude-opus-4.8";
365
365
  readonly envVars: readonly ["ZENMUX_API_KEY"];
366
366
  readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<import("..").Api, unknown>;
367
367
  readonly catalogDiscovery: {
@@ -161,6 +161,12 @@ export interface OpenAICompat {
161
161
  requiresAssistantContentForToolCalls?: boolean;
162
162
  /** Whether the provider supports the `tool_choice` parameter. Default: true. */
163
163
  supportsToolChoice?: boolean;
164
+ /**
165
+ * Whether forced `tool_choice` values (`"required"` or named tools) are accepted.
166
+ * When false, request builders keep tools available but downgrade forced choices
167
+ * to provider-default auto selection. Default: true.
168
+ */
169
+ supportsForcedToolChoice?: boolean;
164
170
  /**
165
171
  * Drop reasoning fields (`reasoning_effort`, OpenRouter `reasoning`) for
166
172
  * the request when `tool_choice` forces a tool call. Mirrors the Anthropic
@@ -9,6 +9,7 @@ export declare const getGeminiCliHeaders: (modelId?: string) => {
9
9
  "Client-Metadata": string;
10
10
  };
11
11
  export declare const ANTIGRAVITY_SYSTEM_INSTRUCTION: string;
12
+ export declare const ANTIGRAVITY_NO_PREAMBLE_INSTRUCTION = "CRITICAL: NEVER output rule checks, formatting guidelines, constraint checklists (e.g. \"No emdashes\"), or your thinking/personality preambles in the final response. Output only the final response.";
12
13
  /**
13
14
  * Antigravity / Cloud Code Assist user agent. Lives in its own file so discovery
14
15
  * and usage code can read it without pulling the heavy google-gemini-cli provider
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "type": "module",
3
3
  "name": "@oh-my-pi/pi-catalog",
4
- "version": "15.12.4",
4
+ "version": "15.13.0",
5
5
  "description": "Model catalog for omp: bundled model database, provider discovery descriptors, model identity, classification, and equivalence",
6
6
  "homepage": "https://omp.sh",
7
7
  "author": "Can Boluk",
@@ -34,11 +34,11 @@
34
34
  },
35
35
  "dependencies": {
36
36
  "@bufbuild/protobuf": "^2.12.0",
37
- "@oh-my-pi/pi-utils": "15.12.4",
37
+ "@oh-my-pi/pi-utils": "15.13.0",
38
38
  "zod": "^4"
39
39
  },
40
40
  "devDependencies": {
41
- "@oh-my-pi/pi-ai": "15.12.4",
41
+ "@oh-my-pi/pi-ai": "15.13.0",
42
42
  "@types/bun": "^1.3.14"
43
43
  },
44
44
  "engines": {
@@ -34,11 +34,17 @@ export function buildAnthropicCompat(spec: ModelSpec<"anthropic-messages">): Res
34
34
  const official = isOfficialAnthropicApiUrl(baseUrl);
35
35
  // Z.AI's Anthropic-compatible proxy lives at `api.z.ai/api/anthropic`.
36
36
  const isZai = modelMatchesHost(spec, "zai");
37
+ // GitHub Copilot's Anthropic-compatible proxy (api.githubcopilot.com/v1/messages)
38
+ // rejects the per-tool `eager_input_streaming` field with
39
+ // `tools.0.custom.eager_input_streaming: Extra inputs are not permitted` and
40
+ // doesn't whitelist the `fine-grained-tool-streaming-2025-05-14` beta either
41
+ // (issue #2558), so eager tool-input streaming is unavailable on this host.
42
+ const isCopilot = modelMatchesHost(spec, "githubCopilot");
37
43
  const compat: ResolvedAnthropicCompat = {
38
44
  officialEndpoint: official,
39
45
  disableStrictTools: false,
40
46
  disableAdaptiveThinking: false,
41
- supportsEagerToolInputStreaming: true,
47
+ supportsEagerToolInputStreaming: !isCopilot,
42
48
  // Long cache retention is only sent to the official API by default;
43
49
  // proxies opt in explicitly via `compat.supportsLongCacheRetention: true`.
44
50
  supportsLongCacheRetention: official,
@@ -217,6 +217,7 @@ export function buildOpenAICompat(spec: ModelSpec<"openai-completions">): Resolv
217
217
  disableReasoningOnForcedToolChoice: isKimiModel || isAnthropicModel,
218
218
  disableReasoningOnToolChoice: isDeepseekFamily && Boolean(spec.reasoning) && !isOpenRouter,
219
219
  supportsToolChoice: !isDirectDeepseekReasoning,
220
+ supportsForcedToolChoice: true,
220
221
  maxTokensField: useMaxTokens ? "max_tokens" : "max_completion_tokens",
221
222
  requiresToolResultName: isMistral,
222
223
  requiresAssistantAfterToolResult: false,
@@ -14,6 +14,7 @@ export type SemVer = {
14
14
  export type GeminiKind = "pro" | "flash";
15
15
  export type AnthropicKind = "opus" | "sonnet" | "fable" | "mythos";
16
16
  export type OpenAIVariant = "base" | "codex" | "codex-max" | "codex-mini" | "codex-spark" | "mini" | "max" | "nano";
17
+ export type GlmVariant = "base" | "air" | "turbo" | "flash" | "flashx" | "preview";
17
18
 
18
19
  export interface GeminiModel {
19
20
  family: "gemini";
@@ -33,6 +34,15 @@ export interface OpenAIModel {
33
34
  version: SemVer;
34
35
  }
35
36
 
37
+ export interface GlmModel {
38
+ family: "glm";
39
+ /** Suffix variant (`-air`, `-turbo`, `-flash`, `-flashx`, `-preview`); `base` when none. */
40
+ variant: GlmVariant;
41
+ /** Vision SKU — the `v` that attaches directly to the version (`glm-4v`, `glm-4.5v`). */
42
+ vision: boolean;
43
+ version: SemVer;
44
+ }
45
+
36
46
  export interface UnknownModel {
37
47
  family: "unknown";
38
48
  id: string;
@@ -55,8 +65,26 @@ export function parseKnownModel(modelId: string): ParsedModel {
55
65
  );
56
66
  }
57
67
 
68
+ /**
69
+ * Wrap a parse function in a per-id memo cache. Caches the `null` result too, so
70
+ * repeated misses (the common case — ids of other families) stay O(1) and never
71
+ * re-run the regex/semver work.
72
+ */
73
+ function parser<T>(parse: (modelId: string) => T | null): (modelId: string) => T | null {
74
+ const cache = new Map<string, T | null>();
75
+ return modelId => {
76
+ const hit = cache.get(modelId);
77
+ if (hit !== undefined || cache.has(modelId)) {
78
+ return hit ?? null;
79
+ }
80
+ const result = parse(modelId);
81
+ cache.set(modelId, result);
82
+ return result;
83
+ };
84
+ }
85
+
58
86
  const GEMINI_SUFFIX = "-preview";
59
- export function parseGeminiModel(modelId: string): GeminiModel | null {
87
+ export const parseGeminiModel = parser((modelId): GeminiModel | null => {
60
88
  if (modelId.endsWith(GEMINI_SUFFIX)) {
61
89
  modelId = modelId.slice(0, -GEMINI_SUFFIX.length);
62
90
  }
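Reviewer note on the `parser` wrapper added in this hunk: the point of checking `cache.has` is that a cached `null` (a miss) is a real cache entry, so ids from other families are only parsed once. The following is a minimal sketch of that behaviour, not the package's code; `memoizeParser`, `parseDemo`, and the `calls` counter are hypothetical names introduced for illustration.

```typescript
// Sketch of a per-id memo that also caches misses (null results).
// `memoizeParser` mirrors the shape of the diff's `parser` helper; it is a stand-in.
function memoizeParser<T>(parse: (id: string) => T | null): (id: string) => T | null {
  const cache = new Map<string, T | null>();
  return (id: string): T | null => {
    // `has` distinguishes "cached null" from "never seen".
    if (cache.has(id)) {
      return cache.get(id) ?? null;
    }
    const result = parse(id);
    cache.set(id, result);
    return result;
  };
}

// Hypothetical parser with a call counter, used only to observe cache hits.
let calls = 0;
const parseDemo = memoizeParser((id: string): { major: number } | null => {
  calls++;
  const match = /^demo-(\d+)$/.exec(id);
  return match ? { major: Number(match[1]) } : null;
});
```

Calling `parseDemo("other")` twice should run the underlying parse function once, because the first `null` is stored and reused.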
@@ -69,9 +97,9 @@ export function parseGeminiModel(modelId: string): GeminiModel | null {
69
97
  return null;
70
98
  }
71
99
  return { family: "gemini", kind: match[2] as GeminiKind, version };
72
- }
100
+ });
73
101
 
74
- export function parseAnthropicModel(modelId: string): AnthropicModel | null {
102
+ export const parseAnthropicModel = parser((modelId): AnthropicModel | null => {
75
103
  const match = /claude-(opus|sonnet|fable|mythos)-(\d{1,2}(?:[.-]\d{1,2}){0,2})\b/.exec(modelId);
76
104
  if (!match) {
77
105
  return null;
@@ -81,9 +109,9 @@ export function parseAnthropicModel(modelId: string): AnthropicModel | null {
81
109
  return null;
82
110
  }
83
111
  return { family: "anthropic", kind: match[1] as AnthropicKind, version };
84
- }
112
+ });
85
113
 
86
- export function parseOpenAIModel(modelId: string): OpenAIModel | null {
114
+ export const parseOpenAIModel = parser((modelId): OpenAIModel | null => {
87
115
  const match = /gpt-(\d+(?:\.\d+){0,2})(?:-(codex-spark|codex-mini|codex-max|codex|mini|max|nano))?\b/.exec(modelId);
88
116
  if (!match) {
89
117
  return null;
@@ -93,7 +121,32 @@ export function parseOpenAIModel(modelId: string): OpenAIModel | null {
93
121
  return null;
94
122
  }
95
123
  return { family: "openai", variant: (match[2] as OpenAIVariant | undefined) ?? "base", version };
96
- }
124
+ });
125
+
126
+ /**
127
+ * Parse a GLM (Zhipu / Z.AI) model id into family + variant + vision + version.
128
+ * Shape: `glm-<version>[v][-<variant>]` — e.g. `glm-4.5`, `glm-4.5-air`,
129
+ * `glm-5-turbo`, `glm-4.5v`, `glm-5-preview`. The `v` (vision) attaches to the
130
+ * version; other variants are `-` suffixes. Standalone parser, called directly
131
+ * from family.ts the same way `parseAnthropicModel` is. GLM needs no global
132
+ * thinking policy, so it stays out of `parseKnownModel`.
133
+ */
134
+ export const parseGlmModel = parser((modelId): GlmModel | null => {
135
+ const match = /glm-(\d{1,2}(?:\.\d+)?)(v)?(?:-(air|turbo|flashx|flash|preview))?\b/.exec(modelId);
136
+ if (!match) {
137
+ return null;
138
+ }
139
+ const version = parseSemVer(match[1]);
140
+ if (!version) {
141
+ return null;
142
+ }
143
+ return {
144
+ family: "glm",
145
+ variant: (match[3] as GlmVariant | undefined) ?? "base",
146
+ vision: match[2] === "v",
147
+ version,
148
+ };
149
+ });
97
150
 
98
151
  export function isFableOrMythos(kind: AnthropicKind): boolean {
99
152
  return kind === "fable" || kind === "mythos";
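Reviewer note on `parseGlmModel`: the id shape `glm-<version>[v][-<variant>]` can be exercised in isolation. The sketch below reuses the regex from the hunk but replaces the package's `parseSemVer` with a plain major/minor split, and it folds in the reasoning rule described for `isReasoningGlmModelId` (4.5 and up, base/`-air`/`-turbo`, no vision). `parseGlmSketch`, `isReasoningGlmSketch`, and the `GlmSketch` type are hypothetical stand-ins, and namespace stripping via `bareModelId` is not reproduced.

```typescript
// Hypothetical stand-in types; the real GlmModel carries a SemVer object instead.
type GlmVariantSketch = "base" | "air" | "turbo" | "flash" | "flashx" | "preview";

interface GlmSketch {
  variant: GlmVariantSketch;
  vision: boolean;
  major: number;
  minor: number;
}

// Regex copied from the hunk above.
const GLM_PATTERN = /glm-(\d{1,2}(?:\.\d+)?)(v)?(?:-(air|turbo|flashx|flash|preview))?\b/;

function parseGlmSketch(modelId: string): GlmSketch | null {
  const match = GLM_PATTERN.exec(modelId);
  if (!match) return null;
  const versionText = match[1] ?? "";
  const [majorText, minorText = "0"] = versionText.split(".");
  return {
    variant: (match[3] as GlmVariantSketch | undefined) ?? "base",
    vision: match[2] === "v",
    major: Number(majorText),
    minor: Number(minorText),
  };
}

// Assumed reading of the reasoning rule: >= 4.5, not vision, base/air/turbo only.
function isReasoningGlmSketch(modelId: string): boolean {
  const glm = parseGlmSketch(modelId);
  if (!glm || glm.vision) return false;
  if (glm.variant !== "base" && glm.variant !== "air" && glm.variant !== "turbo") return false;
  return glm.major > 4 || (glm.major === 4 && glm.minor >= 5);
}
```

Under these assumptions `glm-5.2` and `glm-4.5-air` classify as reasoning-capable, while `glm-4.5v`, `glm-5-preview`, and `glm-4` do not.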
@@ -7,7 +7,14 @@
7
7
  * here.
8
8
  */
9
9
 
10
- import { bareModelId, isFableOrMythos, parseAnthropicModel, semverGte } from "./classify";
10
+ import {
11
+ bareModelId,
12
+ isFableOrMythos,
13
+ parseAnthropicModel,
14
+ parseGlmModel,
15
+ parseKnownModel,
16
+ semverGte,
17
+ } from "./classify";
11
18
 
12
19
  /** Kimi family ids in any namespace form (`moonshotai/kimi-*`, `kimi-k2.6`, `vendor/kimi.x`). */
13
20
  export function isKimiModelId(modelId: string): boolean {
@@ -71,6 +78,52 @@ export function isOpenAIGptOssModelId(modelId: string): boolean {
71
78
  return /(^|\/)gpt-oss[-:]/i.test(modelId);
72
79
  }
73
80
 
81
+ /**
82
+ * Reasoning-capable GLM coding SKUs: glm-4.5 and up on the base / `-air` /
83
+ * `-turbo` lines. Excludes the vision (`…v`) shape, the non-reasoning
84
+ * `-flash`/`-flashx`/`-preview` variants, and pre-4.5 ids. Matching the family
85
+ * keeps newly-bumped integers (`glm-5.3`, `glm-6`, …) covered without a per-id
86
+ * allowlist.
87
+ */
88
+ export function isReasoningGlmModelId(modelId: string): boolean {
89
+ const glm = parseGlmModel(bareModelId(modelId));
90
+ if (!glm || glm.vision) {
91
+ return false;
92
+ }
93
+ if (glm.variant !== "base" && glm.variant !== "air" && glm.variant !== "turbo") {
94
+ return false;
95
+ }
96
+ return semverGte(glm.version, "4.5");
97
+ }
98
+
99
+ /** GLM vision SKUs — the `v` that attaches to the version (`glm-4v`, `glm-4.5v`). */
100
+ export function isGlmVisionModelId(modelId: string): boolean {
101
+ return parseGlmModel(bareModelId(modelId))?.vision === true;
102
+ }
103
+ /**
104
+ * Coarse vendor-lineage token for "are two models the same family?" checks
105
+ * (e.g. picking a cross-family reviewer). All Claude point releases share a token,
106
+ * Claude and GPT differ; namespace prefixes and aggregator mirrors fold onto the
107
+ * lineage via {@link parseKnownModel}'s `bareModelId` normalization. Opaque and
108
+ * comparison-only — not a stable key to persist, since the vocabulary tracks new
109
+ * releases. Returns `""` for ids it cannot classify; callers fall back to the provider.
110
+ *
111
+ * Vendor-only by design: a model's kind/variant (opus vs sonnet, codex vs base) is
112
+ * collapsed onto the single vendor token; use {@link parseKnownModel} for finer breakdowns.
113
+ */
114
+ export function modelFamilyToken(modelId: string): string {
115
+ const parsed = parseKnownModel(modelId);
116
+ if (parsed.family !== "unknown") return parsed.family;
117
+ if (isKimiModelId(modelId)) return "kimi";
118
+ if (isQwenModelId(modelId)) return "qwen";
119
+ if (isMinimaxM2FamilyModelId(modelId)) return "minimax";
120
+ if (isOpenAIGptOssModelId(modelId)) return "gpt-oss";
121
+ if (isDeepseekModelIdOrName(modelId)) return "deepseek";
122
+ if (isMimoModelIdOrName(modelId)) return "mimo";
123
+ if (parseGlmModel(bareModelId(modelId))) return "glm";
124
+ return "";
125
+ }
126
+
74
127
  /**
75
128
  * Adaptive thinking `display` is supported starting with Claude Opus 4.7 and
76
129
  * the Claude Fable/Mythos 5 generation. Older adaptive-thinking models
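Reviewer note on `modelFamilyToken`: the doc comment names "picking a cross-family reviewer" as the motivating use. A rough sketch of that comparison follows. It does not call the package; `familyTokenSketch` is a deliberately crude regex stand-in for the real token function, and `pickCrossFamilyReviewer` is a hypothetical caller. Only the comparison pattern is the point: tokens are compared for equality and never persisted.

```typescript
// Crude stand-in for modelFamilyToken: strips a namespace prefix, then matches a few lineages.
// The real function goes through parseKnownModel and several family predicates.
function familyTokenSketch(modelId: string): string {
  const bare = modelId.slice(modelId.lastIndexOf("/") + 1).toLowerCase();
  if (/claude-/.test(bare)) return "anthropic";
  if (/^gpt-oss[-:]/.test(bare)) return "gpt-oss";
  if (/gpt-\d/.test(bare)) return "openai";
  if (/^glm-\d/.test(bare)) return "glm";
  return ""; // unclassified, same convention as the doc comment
}

// Hypothetical caller: choose the first candidate from a different, known family.
function pickCrossFamilyReviewer(authorId: string, candidates: readonly string[]): string | undefined {
  const authorToken = familyTokenSketch(authorId);
  if (authorToken === "") {
    // Per the doc comment, callers fall back to comparing providers here.
    return undefined;
  }
  return candidates.find(candidate => {
    const token = familyTokenSketch(candidate);
    return token !== "" && token !== authorToken;
  });
}
```

With an author of `claude-opus-4-8` and candidates `anthropic/claude-opus-4.8` and `openai/gpt-5.5`, the sketch skips the first candidate (same token) and returns the second.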
@@ -33,6 +33,8 @@ export interface ModelManagerOptions<TApi extends Api = Api, TModelsDevPayload =
33
33
  staticModels?: readonly ModelSpec<TApi>[];
34
34
  /** Optional override for the cache database path. Default: <agent-dir>/models.db. */
35
35
  cacheDbPath?: string;
36
+ /** Optional provider id override for cache namespacing. Defaults to providerId. */
37
+ cacheProviderId?: string;
36
38
  /** Maximum cache age in milliseconds before considered stale. Default: 24h. */
37
39
  cacheTtlMs?: number;
38
40
  /** When true, a successful dynamic fetch is the complete provider catalog and prunes static-only models. */
@@ -107,13 +109,14 @@ export async function resolveProviderModels<TApi extends Api = Api, TModelsDevPa
107
109
  options: ModelManagerOptions<TApi, TModelsDevPayload>,
108
110
  strategy: ModelRefreshStrategy = "online-if-uncached",
109
111
  ): Promise<ModelResolutionResult<TApi>> {
112
+ const cacheProviderId = options.cacheProviderId ?? options.providerId;
110
113
  const now = options.now ?? Date.now;
111
114
  const ttlMs = options.cacheTtlMs ?? DEFAULT_CACHE_TTL_MS;
112
115
  const dbPath = options.cacheDbPath;
113
116
  const staticModels = options.staticModels
114
117
  ? passModelList<TApi>(options.staticModels)
115
118
  : (getBundledModels(options.providerId as GeneratedProvider) as Model<TApi>[]);
116
- const cache = readModelCache<TApi>(options.providerId, ttlMs, now, dbPath);
119
+ const cache = readModelCache<TApi>(cacheProviderId, ttlMs, now, dbPath);
117
120
  const dynamicModelsAuthoritative = options.dynamicModelsAuthoritative ?? false;
118
121
  const staticFingerprint = fingerprintStatic(staticModels, dynamicModelsAuthoritative);
119
122
  const cacheFingerprintMatches = cache?.staticFingerprint === staticFingerprint && staticFingerprint.length > 0;
@@ -160,7 +163,7 @@ export async function resolveProviderModels<TApi extends Api = Api, TModelsDevPa
160
163
  ? retainModelIds(mergedSnapshot, dynamicModels)
161
164
  : mergedSnapshot;
162
165
  writeModelCache(
163
- options.providerId,
166
+ cacheProviderId,
164
167
  now(),
165
168
  collapseBuiltModelVariants(snapshotModels),
166
169
  true,
@@ -170,9 +173,9 @@ export async function resolveProviderModels<TApi extends Api = Api, TModelsDevPa
170
173
  } else {
171
174
  // Dynamic fetch failed — update cache with a non-authoritative snapshot so
172
175
  // stale state remains visible while retry backoff still applies.
173
- const latestCache = readModelCache<TApi>(options.providerId, ttlMs, now, dbPath);
176
+ const latestCache = readModelCache<TApi>(cacheProviderId, ttlMs, now, dbPath);
174
177
  writeModelCache(
175
- options.providerId,
178
+ cacheProviderId,
176
179
  now(),
177
180
  collapseBuiltModelVariants(
178
181
  mergeDynamicModels(
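Reviewer note on `cacheProviderId`: every cache read and write in `resolveProviderModels` now goes through `options.cacheProviderId ?? options.providerId`, while bundled-model lookup still uses `providerId`. The sketch below illustrates why that matters when two endpoints share one provider id. It uses an in-memory `Map` in place of the real `models.db` cache, and `CacheOptionsSketch`, `cacheKeyFor`, `writeSketch`, and `readSketch` are hypothetical names.

```typescript
// Hypothetical subset of ModelManagerOptions relevant to cache namespacing.
interface CacheOptionsSketch {
  providerId: string;
  cacheProviderId?: string;
}

// Same fallback expression as the diff: the override wins, otherwise the provider id.
function cacheKeyFor(options: CacheOptionsSketch): string {
  return options.cacheProviderId ?? options.providerId;
}

// In-memory stand-in for the on-disk model cache.
const store = new Map<string, string[]>();

function writeSketch(options: CacheOptionsSketch, modelIds: readonly string[]): void {
  store.set(cacheKeyFor(options), [...modelIds]);
}

function readSketch(options: CacheOptionsSketch): string[] | undefined {
  return store.get(cacheKeyFor(options));
}
```

Two configurations with `providerId: "vllm"` but different `cacheProviderId` values keep separate snapshots, which appears to be the intent of the per-base-URL namespace added for vLLM later in this diff.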
package/src/models.json CHANGED
@@ -4259,7 +4259,8 @@
4259
4259
  "cacheWrite": 0
4260
4260
  },
4261
4261
  "contextWindow": null,
4262
- "maxTokens": null
4262
+ "maxTokens": null,
4263
+ "contextPromotionTarget": "aimlapi/gpt-5.4-2026-03-05"
4263
4264
  },
4264
4265
  "gpt-5.5-pro-2026-04-23": {
4265
4266
  "id": "gpt-5.5-pro-2026-04-23",
@@ -4278,7 +4279,8 @@
4278
4279
  "cacheWrite": 0
4279
4280
  },
4280
4281
  "contextWindow": null,
4281
- "maxTokens": null
4282
+ "maxTokens": null,
4283
+ "contextPromotionTarget": "aimlapi/gpt-5.4-2026-03-05"
4282
4284
  },
4283
4285
  "gpt-oss-120b": {
4284
4286
  "id": "gpt-oss-120b",
@@ -9577,7 +9579,8 @@
9577
9579
  "high",
9578
9580
  "xhigh"
9579
9581
  ]
9580
- }
9582
+ },
9583
+ "contextPromotionTarget": "amazon-bedrock/openai.gpt-5.4"
9581
9584
  },
9582
9585
  "openai.gpt-oss-120b": {
9583
9586
  "id": "openai.gpt-oss-120b",
@@ -12202,7 +12205,8 @@
12202
12205
  "high",
12203
12206
  "xhigh"
12204
12207
  ]
12205
- }
12208
+ },
12209
+ "contextPromotionTarget": "cloudflare-ai-gateway/openai/gpt-5.4"
12206
12210
  },
12207
12211
  "openai/o1": {
12208
12212
  "id": "openai/o1",
@@ -22904,6 +22908,7 @@
22904
22908
  "disableReasoningOnForcedToolChoice": true,
22905
22909
  "disableReasoningOnToolChoice": false,
22906
22910
  "supportsToolChoice": true,
22911
+ "supportsForcedToolChoice": true,
22907
22912
  "maxTokensField": "max_completion_tokens",
22908
22913
  "requiresToolResultName": false,
22909
22914
  "requiresAssistantAfterToolResult": false,
@@ -24595,7 +24600,8 @@
24595
24600
  "high",
24596
24601
  "xhigh"
24597
24602
  ]
24598
- }
24603
+ },
24604
+ "contextPromotionTarget": "kilo/openai/gpt-5.4"
24599
24605
  },
24600
24606
  "openai/gpt-5.5-pro": {
24601
24607
  "id": "openai/gpt-5.5-pro",
@@ -24624,7 +24630,8 @@
24624
24630
  "high",
24625
24631
  "xhigh"
24626
24632
  ]
24627
- }
24633
+ },
24634
+ "contextPromotionTarget": "kilo/openai/gpt-5.4"
24628
24635
  },
24629
24636
  "openai/gpt-audio": {
24630
24637
  "id": "openai/gpt-audio",
@@ -25327,6 +25334,7 @@
25327
25334
  "disableReasoningOnForcedToolChoice": false,
25328
25335
  "disableReasoningOnToolChoice": false,
25329
25336
  "supportsToolChoice": true,
25337
+ "supportsForcedToolChoice": true,
25330
25338
  "maxTokensField": "max_completion_tokens",
25331
25339
  "requiresToolResultName": false,
25332
25340
  "requiresAssistantAfterToolResult": false,
@@ -25778,6 +25786,7 @@
25778
25786
  "disableReasoningOnForcedToolChoice": false,
25779
25787
  "disableReasoningOnToolChoice": false,
25780
25788
  "supportsToolChoice": true,
25789
+ "supportsForcedToolChoice": true,
25781
25790
  "maxTokensField": "max_completion_tokens",
25782
25791
  "requiresToolResultName": false,
25783
25792
  "requiresAssistantAfterToolResult": false,
@@ -26058,6 +26067,7 @@
26058
26067
  "disableReasoningOnForcedToolChoice": false,
26059
26068
  "disableReasoningOnToolChoice": false,
26060
26069
  "supportsToolChoice": true,
26070
+ "supportsForcedToolChoice": true,
26061
26071
  "maxTokensField": "max_completion_tokens",
26062
26072
  "requiresToolResultName": false,
26063
26073
  "requiresAssistantAfterToolResult": false,
@@ -28539,7 +28549,7 @@
28539
28549
  "cacheRead": 0.12,
28540
28550
  "cacheWrite": 0
28541
28551
  },
28542
- "contextWindow": 512000,
28552
+ "contextWindow": 1000000,
28543
28553
  "maxTokens": 128000,
28544
28554
  "thinking": {
28545
28555
  "mode": "budget",
@@ -28781,7 +28791,7 @@
28781
28791
  "cacheRead": 0.12,
28782
28792
  "cacheWrite": 0
28783
28793
  },
28784
- "contextWindow": 512000,
28794
+ "contextWindow": 1000000,
28785
28795
  "maxTokens": 128000,
28786
28796
  "thinking": {
28787
28797
  "mode": "budget",
@@ -39194,8 +39204,8 @@
39194
39204
  "cacheRead": 0,
39195
39205
  "cacheWrite": 0
39196
39206
  },
39197
- "contextWindow": null,
39198
- "maxTokens": null
39207
+ "contextWindow": 128000,
39208
+ "maxTokens": 16384
39199
39209
  },
39200
39210
  "openai/gpt-5-codex": {
39201
39211
  "id": "openai/gpt-5-codex",
@@ -39763,7 +39773,8 @@
39763
39773
  "high",
39764
39774
  "xhigh"
39765
39775
  ]
39766
- }
39776
+ },
39777
+ "contextPromotionTarget": "nanogpt/openai/gpt-5.4"
39767
39778
  },
39768
39779
  "openai/gpt-chat-latest": {
39769
39780
  "id": "openai/gpt-chat-latest",
@@ -51042,6 +51053,9 @@
51042
51053
  },
51043
51054
  "contextWindow": 262144,
51044
51055
  "maxTokens": 262144,
51056
+ "compat": {
51057
+ "supportsForcedToolChoice": false
51058
+ },
51045
51059
  "thinking": {
51046
51060
  "mode": "effort",
51047
51061
  "efforts": [
@@ -51081,6 +51095,9 @@
51081
51095
  "high",
51082
51096
  "xhigh"
51083
51097
  ]
51098
+ },
51099
+ "compat": {
51100
+ "supportsToolChoice": false
51084
51101
  }
51085
51102
  },
51086
51103
  "mimo-v2-pro": {
@@ -51110,6 +51127,9 @@
51110
51127
  "high",
51111
51128
  "xhigh"
51112
51129
  ]
51130
+ },
51131
+ "compat": {
51132
+ "supportsToolChoice": false
51113
51133
  }
51114
51134
  },
51115
51135
  "mimo-v2.5": {
@@ -51131,6 +51151,9 @@
51131
51151
  },
51132
51152
  "contextWindow": 1000000,
51133
51153
  "maxTokens": 128000,
51154
+ "compat": {
51155
+ "supportsToolChoice": false
51156
+ },
51134
51157
  "thinking": {
51135
51158
  "mode": "effort",
51136
51159
  "efforts": [
@@ -51160,6 +51183,9 @@
51160
51183
  },
51161
51184
  "contextWindow": 1048576,
51162
51185
  "maxTokens": 128000,
51186
+ "compat": {
51187
+ "supportsToolChoice": false
51188
+ },
51163
51189
  "thinking": {
51164
51190
  "mode": "effort",
51165
51191
  "efforts": [
@@ -55012,13 +55038,13 @@
55012
55038
  "text"
55013
55039
  ],
55014
55040
  "cost": {
55015
- "input": 0.098,
55016
- "output": 0.196,
55041
+ "input": 0.09,
55042
+ "output": 0.18,
55017
55043
  "cacheRead": 0.02,
55018
55044
  "cacheWrite": 0
55019
55045
  },
55020
55046
  "contextWindow": 1048576,
55021
- "maxTokens": 384000,
55047
+ "maxTokens": 65536,
55022
55048
  "thinking": {
55023
55049
  "mode": "effort",
55024
55050
  "efforts": [
@@ -57075,9 +57101,9 @@
57075
57101
  "image"
57076
57102
  ],
57077
57103
  "cost": {
57078
- "input": 0.95,
57079
- "output": 4,
57080
- "cacheRead": 0.19,
57104
+ "input": 0.75,
57105
+ "output": 3.5,
57106
+ "cacheRead": 0.16,
57081
57107
  "cacheWrite": 0
57082
57108
  },
57083
57109
  "contextWindow": 262144,
@@ -58513,7 +58539,8 @@
58513
58539
  "high",
58514
58540
  "xhigh"
58515
58541
  ]
58516
- }
58542
+ },
58543
+ "contextPromotionTarget": "openrouter/openai/gpt-5.4"
58517
58544
  },
58518
58545
  "openai/gpt-5.5-pro": {
58519
58546
  "id": "openai/gpt-5.5-pro",
@@ -58542,7 +58569,8 @@
58542
58569
  "high",
58543
58570
  "xhigh"
58544
58571
  ]
58545
- }
58572
+ },
58573
+ "contextPromotionTarget": "openrouter/openai/gpt-5.4"
58546
58574
  },
58547
58575
  "openai/gpt-audio": {
58548
58576
  "id": "openai/gpt-audio",
@@ -59989,7 +60017,7 @@
59989
60017
  "cacheWrite": 0
59990
60018
  },
59991
60019
  "contextWindow": 262144,
59992
- "maxTokens": null
60020
+ "maxTokens": 16384
59993
60021
  },
59994
60022
  "qwen/qwen3-next-80b-a3b-thinking": {
59995
60023
  "id": "qwen/qwen3-next-80b-a3b-thinking",
@@ -64583,7 +64611,7 @@
64583
64611
  "cacheWrite": 0
64584
64612
  },
64585
64613
  "contextWindow": 128000,
64586
- "maxTokens": null,
64614
+ "maxTokens": 16384,
64587
64615
  "compat": {
64588
64616
  "supportsUsageInStreaming": false
64589
64617
  }
@@ -69051,7 +69079,8 @@
69051
69079
  "high",
69052
69080
  "xhigh"
69053
69081
  ]
69054
- }
69082
+ },
69083
+ "contextPromotionTarget": "vercel-ai-gateway/openai/gpt-5.4"
69055
69084
  },
69056
69085
  "openai/gpt-5.5-pro": {
69057
69086
  "id": "openai/gpt-5.5-pro",
@@ -69080,7 +69109,8 @@
69080
69109
  "high",
69081
69110
  "xhigh"
69082
69111
  ]
69083
- }
69112
+ },
69113
+ "contextPromotionTarget": "vercel-ai-gateway/openai/gpt-5.4"
69084
69114
  },
69085
69115
  "openai/gpt-oss-120b": {
69086
69116
  "id": "openai/gpt-oss-120b",
@@ -72205,6 +72235,35 @@
72205
72235
  ]
72206
72236
  }
72207
72237
  },
72238
+ "glm-5.2": {
72239
+ "id": "glm-5.2",
72240
+ "name": "GLM-5.2",
72241
+ "api": "anthropic-messages",
72242
+ "provider": "zai",
72243
+ "baseUrl": "https://api.z.ai/api/anthropic",
72244
+ "reasoning": true,
72245
+ "input": [
72246
+ "text"
72247
+ ],
72248
+ "cost": {
72249
+ "input": 0,
72250
+ "output": 0,
72251
+ "cacheRead": 0,
72252
+ "cacheWrite": 0
72253
+ },
72254
+ "contextWindow": 1000000,
72255
+ "maxTokens": 131072,
72256
+ "thinking": {
72257
+ "mode": "budget",
72258
+ "efforts": [
72259
+ "minimal",
72260
+ "low",
72261
+ "medium",
72262
+ "high",
72263
+ "xhigh"
72264
+ ]
72265
+ }
72266
+ },
72208
72267
  "glm-5v-turbo": {
72209
72268
  "id": "glm-5v-turbo",
72210
72269
  "name": "GLM-5V-Turbo",
@@ -75112,7 +75171,8 @@
75112
75171
  "high",
75113
75172
  "xhigh"
75114
75173
  ]
75115
- }
75174
+ },
75175
+ "contextPromotionTarget": "zenmux/openai/gpt-5.4"
75116
75176
  },
75117
75177
  "openai/gpt-5.5-instant": {
75118
75178
  "id": "openai/gpt-5.5-instant",
@@ -75141,7 +75201,8 @@
75141
75201
  "high",
75142
75202
  "xhigh"
75143
75203
  ]
75144
- }
75204
+ },
75205
+ "contextPromotionTarget": "zenmux/openai/gpt-5.4"
75145
75206
  },
75146
75207
  "openai/gpt-5.5-pro": {
75147
75208
  "id": "openai/gpt-5.5-pro",
@@ -75170,7 +75231,8 @@
75170
75231
  "high",
75171
75232
  "xhigh"
75172
75233
  ]
75173
- }
75234
+ },
75235
+ "contextPromotionTarget": "zenmux/openai/gpt-5.4"
75174
75236
  },
75175
75237
  "openai/gpt-image-1.5": {
75176
75238
  "id": "openai/gpt-image-1.5",
@@ -68,11 +68,11 @@ export const CATALOG_PROVIDERS = [
68
68
  },
69
69
  {
70
70
  id: "amazon-bedrock",
71
- defaultModel: "us.anthropic.claude-opus-4-6-v1",
71
+ defaultModel: "us.anthropic.claude-opus-4-8",
72
72
  },
73
73
  {
74
74
  id: "anthropic",
75
- defaultModel: "claude-opus-4-6",
75
+ defaultModel: "claude-opus-4-8",
76
76
  createModelManagerOptions: (config: ModelManagerConfig) => anthropicModelManagerOptions(config),
77
77
  },
78
78
  {
@@ -177,7 +177,7 @@ export const CATALOG_PROVIDERS = [
177
177
  },
178
178
  {
179
179
  id: "litellm",
180
- defaultModel: "claude-opus-4-6",
180
+ defaultModel: "claude-opus-4-8",
181
181
  envVars: ["LITELLM_API_KEY"],
182
182
  createModelManagerOptions: (config: ModelManagerConfig) => litellmModelManagerOptions(config),
183
183
  catalogDiscovery: { label: "LiteLLM", allowUnauthenticated: true },
@@ -219,7 +219,7 @@ export const CATALOG_PROVIDERS = [
219
219
  },
220
220
  {
221
221
  id: "nanogpt",
222
- defaultModel: "openai/gpt-5.4",
222
+ defaultModel: "openai/gpt-5.5",
223
223
  envVars: ["NANO_GPT_API_KEY"],
224
224
  createModelManagerOptions: (config: ModelManagerConfig) => nanoGptModelManagerOptions(config),
225
225
  catalogDiscovery: { label: "NanoGPT" },
@@ -247,13 +247,13 @@ export const CATALOG_PROVIDERS = [
247
247
  },
248
248
  {
249
249
  id: "openai",
250
- defaultModel: "gpt-5.4",
250
+ defaultModel: "gpt-5.5",
251
251
  envVars: ["OPENAI_API_KEY"],
252
252
  createModelManagerOptions: (config: ModelManagerConfig) => openaiModelManagerOptions(config),
253
253
  },
254
254
  {
255
255
  id: "openai-codex",
256
- defaultModel: "gpt-5.4",
256
+ defaultModel: "gpt-5.5",
257
257
  envVars: ["OPENAI_CODEX_OAUTH_TOKEN"],
258
258
  specialModelManager: true,
259
259
  },
@@ -271,7 +271,7 @@ export const CATALOG_PROVIDERS = [
271
271
  },
272
272
  {
273
273
  id: "openrouter",
274
- defaultModel: "openai/gpt-5.4",
274
+ defaultModel: "openai/gpt-5.5",
275
275
  envVars: ["OPENROUTER_API_KEY"],
276
276
  createModelManagerOptions: (config: ModelManagerConfig) => openrouterModelManagerOptions(config),
277
277
  catalogDiscovery: { label: "OpenRouter", allowUnauthenticated: true },
@@ -403,7 +403,7 @@ export const CATALOG_PROVIDERS = [
403
403
  },
404
404
  {
405
405
  id: "zenmux",
406
- defaultModel: "anthropic/claude-opus-4.6",
406
+ defaultModel: "anthropic/claude-opus-4.8",
407
407
  envVars: ["ZENMUX_API_KEY"],
408
408
  createModelManagerOptions: (config: ModelManagerConfig) => zenmuxModelManagerOptions(config),
409
409
  catalogDiscovery: { label: "ZenMux" },
@@ -5,6 +5,7 @@ import {
5
5
  } from "../discovery/openai-compatible";
6
6
  import { Effort } from "../effort";
7
7
  import { toFireworksPublicModelId } from "../fireworks-model-id";
8
+ import { isGlmVisionModelId, isReasoningGlmModelId } from "../identity/family";
8
9
  import type { ModelManagerOptions } from "../model-manager";
9
10
  import { getBundledModels } from "../models";
10
11
  import type { Api, FetchImpl, Model, ModelSpec, Provider, ThinkingConfig } from "../types";
@@ -1030,8 +1031,8 @@ export function zhipuCodingPlanModelManagerOptions(
1030
1031
  const id = defaults.id;
1031
1032
  return {
1032
1033
  ...defaults,
1033
- reasoning: ZHIPU_REASONING_MODELS[id] === true || id.includes("thinking"),
1034
- input: ZHIPU_VISION_PATTERN.test(id) ? (["text", "image"] as const) : ["text"],
1034
+ reasoning: isReasoningGlmModelId(id) || id.includes("thinking"),
1035
+ input: isGlmVisionModelId(id) ? (["text", "image"] as const) : ["text"],
1035
1036
  compat: {
1036
1037
  thinkingFormat: "zai",
1037
1038
  reasoningContentField: "reasoning_content",
@@ -1045,25 +1046,6 @@ export function zhipuCodingPlanModelManagerOptions(
1045
1046
  };
1046
1047
  }
1047
1048
 
1048
- // Reasoning-capable GLM models on the BigModel coding-plan SKU. Keep this
1049
- // explicit rather than regex-matching `glm-[45]\.\d` so newly-added integers
1050
- // like `glm-5` / `glm-5-turbo` are covered and unrelated future SKUs (e.g.
1051
- // `glm-5-preview`) do not silently flip into thinking mode.
1052
- const ZHIPU_REASONING_MODELS: Readonly<Record<string, true>> = {
1053
- "glm-4.5": true,
1054
- "glm-4.5-air": true,
1055
- "glm-4.6": true,
1056
- "glm-4.7": true,
1057
- "glm-5": true,
1058
- "glm-5-turbo": true,
1059
- "glm-5.1": true,
1060
- };
1061
-
1062
- // Vision-capable GLM models follow the `glm-<N>[.<N>]v[-<variant>]` shape
1063
- // (e.g. `glm-4v`, `glm-4.5v`, `glm-4v-plus`). The previous `id.includes("v")`
1064
- // check matched anything with a `v` — including the non-vision `glm-5-preview`.
1065
- const ZHIPU_VISION_PATTERN = /^glm-[45](?:\.\d+)?v(?:-|$)/;
1066
-
1067
1049
  // ---------------------------------------------------------------------------
1068
1050
  // 7.5 Fireworks
1069
1051
  // ---------------------------------------------------------------------------
@@ -2393,6 +2375,8 @@ export function litellmModelManagerOptions(
2393
2375
  // 22. vLLM
2394
2376
  // ---------------------------------------------------------------------------
2395
2377
 
2378
+ const VLLM_DISCOVERY_TIMEOUT_MS = 10_000;
2379
+
2396
2380
  export interface VllmModelManagerConfig {
2397
2381
  apiKey?: string;
2398
2382
  baseUrl?: string;
@@ -2405,6 +2389,7 @@ export function vllmModelManagerOptions(config?: VllmModelManagerConfig): ModelM
2405
2389
  const references = createBundledReferenceMap<"openai-completions">("vllm" as Parameters<typeof getBundledModels>[0]);
2406
2390
  return {
2407
2391
  providerId: "vllm",
2392
+ cacheProviderId: `vllm:${Bun.hash(baseUrl).toString(36)}`,
2408
2393
  fetchDynamicModels: () =>
2409
2394
  fetchOpenAICompatibleModels({
2410
2395
  api: "openai-completions",
@@ -2419,6 +2404,7 @@ export function vllmModelManagerOptions(config?: VllmModelManagerConfig): ModelM
2419
2404
  };
2420
2405
  },
2421
2406
  fetch: config?.fetch,
2407
+ signal: AbortSignal.timeout(VLLM_DISCOVERY_TIMEOUT_MS),
2422
2408
  }),
2423
2409
  };
2424
2410
  }
package/src/types.ts CHANGED
@@ -186,6 +186,12 @@ export interface OpenAICompat {
186
186
  requiresAssistantContentForToolCalls?: boolean;
187
187
  /** Whether the provider supports the `tool_choice` parameter. Default: true. */
188
188
  supportsToolChoice?: boolean;
189
+ /**
190
+ * Whether forced `tool_choice` values (`"required"` or named tools) are accepted.
191
+ * When false, request builders keep tools available but downgrade forced choices
192
+ * to provider-default auto selection. Default: true.
193
+ */
194
+ supportsForcedToolChoice?: boolean;
189
195
  /**
190
196
  * Drop reasoning fields (`reasoning_effort`, OpenRouter `reasoning`) for
191
197
  * the request when `tool_choice` forces a tool call. Mirrors the Anthropic
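Reviewer note on `supportsForcedToolChoice`: the doc comment says request builders keep tools available but downgrade forced choices when the flag is false. The request-builder side is not part of this diff, so the following is only a sketch of one plausible reading. `ToolChoiceSketch`, `CompatSketch`, and `resolveToolChoice` are hypothetical, and returning the literal `"auto"` (rather than omitting the field) is an assumption.

```typescript
// Hypothetical OpenAI-style tool_choice shape.
type ToolChoiceSketch =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; function: { name: string } };

// Hypothetical subset of OpenAICompat; both flags default to true when unset.
interface CompatSketch {
  supportsToolChoice?: boolean;
  supportsForcedToolChoice?: boolean;
}

function resolveToolChoice(
  choice: ToolChoiceSketch | undefined,
  compat: CompatSketch,
): ToolChoiceSketch | undefined {
  if (choice === undefined) return undefined;
  // Provider rejects the parameter entirely: send nothing.
  if (compat.supportsToolChoice === false) return undefined;
  // "required" and named tools are the forced forms described in the doc comment.
  const forced = choice === "required" || typeof choice === "object";
  if (forced && compat.supportsForcedToolChoice === false) {
    return "auto"; // assumption: downgrade is expressed as explicit "auto"
  }
  return choice;
}
```

In this reading the two flags are independent: `supportsToolChoice: false` drops the parameter, while `supportsForcedToolChoice: false` only softens `"required"` and named-tool choices.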
@@ -20,6 +20,8 @@ export const ANTIGRAVITY_SYSTEM_INSTRUCTION =
20
20
  "You are pair programming with a USER to solve their coding task. The task may require creating a new codebase, modifying or debugging an existing codebase, or simply answering a question." +
21
21
  "**Absolute paths only**" +
22
22
  "**Proactiveness**";
23
+ export const ANTIGRAVITY_NO_PREAMBLE_INSTRUCTION =
24
+ 'CRITICAL: NEVER output rule checks, formatting guidelines, constraint checklists (e.g. "No emdashes"), or your thinking/personality preambles in the final response. Output only the final response.';
23
25
  /**
24
26
  * Antigravity / Cloud Code Assist user agent. Lives in its own file so discovery
25
27
  * and usage code can read it without pulling the heavy google-gemini-cli provider