npm - @oh-my-pi/pi-catalog - Versions diffs - 15.11.6 → 15.11.8 - Mend

@oh-my-pi/pi-catalog 15.11.6 → 15.11.8


Files changed (23)

package/CHANGELOG.md +26 -0
package/README.md +43 -0
package/dist/types/identity/family.d.ts +21 -1
package/dist/types/index.d.ts +1 -0
package/dist/types/model-thinking.d.ts +12 -0
package/dist/types/provider-models/descriptors.d.ts +1 -1
package/dist/types/types.d.ts +22 -0
package/dist/types/utils.d.ts +6 -0
package/dist/types/variant-collapse.d.ts +126 -0
package/package.json +3 -3
package/src/build.ts +2 -0
package/src/compat/openai.ts +10 -6
package/src/discovery/antigravity.ts +9 -10
package/src/identity/family.ts +43 -2
package/src/index.ts +1 -0
package/src/model-cache.ts +4 -3
package/src/model-manager.ts +19 -8
package/src/model-thinking.ts +56 -1
package/src/models.json +2426 -1809
package/src/provider-models/descriptors.ts +1 -1
package/src/types.ts +22 -0
package/src/utils.ts +24 -0
package/src/variant-collapse.ts +622 -0

package/CHANGELOG.md CHANGED

@@ -2,6 +2,32 @@
 ## [Unreleased]
+## [15.11.8] - 2026-06-12
+### Fixed
+- Fixed Antigravity `gemini-3.1-pro --thinking high` failing with `Cloud Code Assist API error (400): Request contains an invalid argument.` — the upstream `gemini-3.1-pro-high` deployment rejects every `streamGenerateContent` request on both CCA endpoints while discovery still advertises it. High effort now routes to `gemini-pro-agent` (the same "Gemini 3.1 Pro (High)" model, verified accepting the identical request body), and the model-cache fingerprint version was bumped (`merge-v2` → `merge-v3`) so existing fresh caches refetch discovery and pick up the corrected routing immediately.
+## [15.11.7] - 2026-06-12
+### Added
+- Added effort-tier variant collapsing (`variant-collapse`): providers that expose one logical model as several effort/thinking-suffixed upstream ids (Antigravity CCA `gemini-3.5-flash-extra-low`/`-low`/`gemini-3-flash-agent`, `gemini-3[.1]-pro-low|high`, `claude-*[-thinking]` pairs, `gpt-oss-120b-medium`) collapse into one logical entry carrying per-effort upstream routing in `thinking.effortRouting` (plus `thinking.suppressWhenOff` for Cloud Code Assist ids whose baked server default re-applies when `thinkingConfig` is omitted). Request-time code resolves the outbound id via `resolveWireModelId(model, effort)`; selection, caching, and usage attribution key on the logical id.
+- Added the automatic `X`/`X-thinking` pair rule (`deriveThinkingPairFamilies`): any provider's live bare/thinking twin collapses into the bare id, routing thinking-enabled requests to the `-thinking` backing id (trailing or infix token, so `kimi-k2-thinking-turbo` pairs with `kimi-k2-turbo`). Gated on same api and compatible pricing — all-zero cost rows count as unknown, while twins that both carry real, differing prices remain separate SKUs.
+- Added `collapseBuiltModelVariants` and wired collapsing at every materialization point — Antigravity discovery, the catalog generator, and the model-manager merge — so stale sources (old static beside collapsed dynamic results, mixed cache rows) converge on logical entries instead of unioning raw tier ids back into the catalog.
+- Added `thinking.requiresEffort`, baked for reasoning-only upstreams — Gemini 3.x (levels only, no off), Gemini 2.5 Pro (thinkingBudget floors at 128, rejects 0), OpenAI o-series, MiniMax M2, and thinking-variant SKUs (`*-thinking`/`*-reasoner`/`*-reasoning`, with a negation-aware token grammar so `non-thinking` ids never match). Identity derivation bakes it for new entries and `fillThinkingWireDefaults` backfills explicit/cached metadata; `minimumSupportedEffort` exposes the canonical floor. Pair-collapsed twins drop member flags (their off routes to the bare SKU), while identity re-flags pairs whose logical id is itself mandatory
+### Changed
+- Changed model display names to drop model-extrinsic decorations: gateway author prefixes (`OpenAI: …`, `Google: …`), `(latest)` alias markers, `(Antigravity)` provider attribution, price tiers (`($$$$)`), and promo/lifecycle tags (`(20% off)`, `(retires …)`). `cleanModelName` is applied in `buildModel` (covers live discovery and stale caches) and as a catalog-generator pass; Antigravity discovery no longer appends `(Antigravity)` to display names. Variant tags that map to distinct wire ids (`(Thinking)`, `(free)`, `(Fast)`, dates, regions) are preserved.
+- Changed the `google-antigravity` default model from `gemini-3-pro-high` to `gemini-3.1-pro`
+- Changed `gemini-2.5-flash-thinking` handling from discovery-denylist to collapsing into `gemini-2.5-flash` (thinking-enabled requests route to the `-thinking` backing id)
+- Bumped the model cache schema to v5 so rows predating effort-tier variant collapsing (raw `-low`/`-high`/`-thinking` member ids) are invalidated
+### Fixed
+- Fixed catalog generation to apply effort-tier variant collapsing before provider grouping to ensure collapsed model families are consistently materialized without being impacted by in-loop mutation
+- Fixed Kimi K2.6 OpenAI-compatible compat metadata to use a 300s stream watchdog floor, covering Fire Pass router ids as well as public `kimi-k2.6` ids so long reasoning starts do not hit the generic first-event timeout ([#2366](https://github.com/can1357/oh-my-pi/issues/2366)).
 ## [15.11.4] - 2026-06-12
 ### Fixed

package/README.md ADDED

@@ -0,0 +1,43 @@
+# @oh-my-pi/pi-catalog
+Model catalog for [oh-my-pi](https://github.com/can1357/oh-my-pi): bundled model database, provider discovery, model identity, classification, and equivalence.
+## What's inside
+| Module | Purpose |
+| --- | --- |
+| `models.json` + `models` | Bundled model database (pricing, context windows, modalities, thinking support) |
+| `provider-models` | Provider catalog descriptors (`CATALOG_PROVIDERS`), per-provider model resolution rules |
+| `discovery` | Runtime model discovery for OpenAI-compatible endpoints, Gemini, Codex, Cursor, Antigravity, Ollama |
+| `identity` | Model id parsing and classification (family/version), reference resolution, equivalence, selection priority |
+| `model-thinking` | Thinking/reasoning metadata and generated per-model policies |
+| `model-manager` / `model-cache` | Runtime model registry with discovery refresh and on-disk caching |
+| `variant-collapse` | Collapsing provider-specific variants of the same underlying model |
+| `compat` | Request/response compatibility fixups for OpenAI- and Anthropic-shaped APIs |
+| `wire` | Wire-level helpers: Codex, Gemini headers, GitHub Copilot |
+| `effort` | Reasoning-effort level definitions |
+Import from subpaths (`@oh-my-pi/pi-catalog/<module>`) or the root barrel.
+## models.json is generated
+Never edit `src/models.json` by hand — it is produced from upstream sources (models.dev, provider catalog discovery, OpenCode docs) by `scripts/generate-models.ts` and the resolvers in `src/provider-models/`. Regenerate with:
+```sh
+bun --cwd=packages/catalog run generate-models
+```
+To change an entry, fix the source: resolver overrides in `provider-models/openai-compat.ts`, provider entries in `provider-models/descriptors.ts`, generator fixups in `scripts/generate-models.ts`, or thinking policies in `model-thinking.ts`.
+## Install
+```sh
+bun add @oh-my-pi/pi-catalog
+```
+Ships TypeScript source directly (no build step); requires Bun ≥ 1.3.14.
+## References
+- [Monorepo README](https://github.com/can1357/oh-my-pi#readme)
+- [CHANGELOG](./CHANGELOG.md)

package/dist/types/identity/family.d.ts CHANGED

@@ -8,7 +8,7 @@
  */
 /** Kimi family ids in any namespace form (`moonshotai/kimi-*`, `kimi-k2.6`, `vendor/kimi.x`). */
 export declare function isKimiModelId(modelId: string): boolean;
-/** Kimi K2.6 specifically (preserved-thinking transport on Moonshot-native hosts). */
+/** Kimi K2.6 specifically, including router ids that spell the version `k2p6`. */
 export declare function isKimiK26ModelId(modelId: string): boolean;
 /** Claude ids in any namespace form (`claude-*`, `vendor/claude.x`). */
 export declare function isClaudeModelId(modelId: string): boolean;
@@ -60,3 +60,23 @@ export declare function hasOpus47ApiRestrictions(modelId: string): boolean;
  */
 export declare function supportsMidConversationSystemMessages(modelId: string): boolean;
 export declare function isAnthropicFableOrMythosModel(modelId: string): boolean;
+/** Thinking-variant token location inside a model id. */
+export interface ThinkingVariantToken {
+    index: number;
+    length: number;
+}
+/**
+ * Locates the first thinking-variant token (`-thinking`, `-reasoner`,
+ * `-reasoning`; trailing or infix) in a model id. The token ends at the id
+ * end or any non-alphanumeric boundary, and negated forms (`non-thinking`,
+ * `no-thinking`) never match — those name the NON-thinking SKU.
+ */
+export declare function findThinkingVariantToken(modelId: string): ThinkingVariantToken | undefined;
+/**
+ * Removes the located thinking-variant token: `kimi-k2-thinking` → `kimi-k2`,
+ * `mimo-v2-flash-thinking-original` → `mimo-v2-flash-original`,
+ * `grok-4.1-fast-reasoning` → `grok-4.1-fast`. Returns `undefined` when no
+ * token exists or nothing would remain. Callers MUST verify the result names
+ * a live model.
+ */
+export declare function stripThinkingVariantToken(modelId: string): string | undefined;

package/dist/types/index.d.ts CHANGED

@@ -10,6 +10,7 @@ export * from "./models";
 export * from "./provider-models";
 export * from "./types";
 export * from "./utils";
+export * from "./variant-collapse";
 export * from "./wire/codex";
 export * from "./wire/gemini-headers";
 export * from "./wire/github-copilot";

package/dist/types/model-thinking.d.ts CHANGED

@@ -64,4 +64,16 @@ export declare function mapEffortToGoogleThinkingLevel(effort: Effort): "MINIMAL
  * the model's baked `thinking.effortMap` (identity for unmapped efforts).
  */
 export declare function mapEffortToAnthropicAdaptiveEffort<TApi extends Api>(model: ApiModel<TApi>, effort: Effort): "low" | "medium" | "high" | "xhigh" | "max";
+/**
+ * Resolves the upstream wire model id for a request at the given effort
+ * (`undefined` = thinking off). Collapsed effort-tier variants route through
+ * `thinking.effortRouting`; everything else falls back to
+ * `requestModelId ?? id`.
+ */
+export declare function resolveWireModelId<TApi extends Api>(model: ApiModel<TApi>, effort: Effort | undefined): string;
+/**
+ * Lowest supported effort in canonical order — the clamp target for
+ * thinking-off requests on `thinking.requiresEffort` models.
+ */
+export declare function minimumSupportedEffort<TApi extends Api>(model: ApiModel<TApi>): Effort | undefined;
 export {};
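
The fallback chain described in the `resolveWireModelId` doc comment can be sketched as follows. This is a minimal illustration, not the package's implementation: `EffortSketch`, `WireModelSketch`, and `resolveWireModelIdSketch` are hypothetical stand-ins for the real `Effort` and `ApiModel` types, the effort names are assumed, and the routing values in the usage data are illustrative.

```typescript
// Assumed effort names; the real `Effort` type lives in `./effort` and may differ.
type EffortSketch = "minimal" | "low" | "medium" | "high" | "xhigh";

// Hypothetical, trimmed-down model shape with only the fields the lookup needs.
interface WireModelSketch {
	id: string;
	requestModelId?: string;
	thinking?: { effortRouting?: Partial<Record<EffortSketch | "off", string>> };
}

// `undefined` effort means thinking is off, which maps to the "off" routing key.
function resolveWireModelIdSketch(model: WireModelSketch, effort: EffortSketch | undefined): string {
	const key: EffortSketch | "off" = effort ?? "off";
	// Missing routing keys fall back to `requestModelId ?? id`.
	return model.thinking?.effortRouting?.[key] ?? model.requestModelId ?? model.id;
}

// Illustrative data only: a collapsed entry with two routed efforts.
const collapsedExample: WireModelSketch = {
	id: "gemini-3.1-pro",
	requestModelId: "gemini-3.1-pro-low",
	thinking: { effortRouting: { low: "gemini-3.1-pro-low", high: "gemini-pro-agent" } },
};

// A model without routing keeps its own id on the wire.
const plainExample: WireModelSketch = { id: "some-plain-model" };
```

With this data, a `high` request resolves to `gemini-pro-agent`, while `medium` and thinking-off requests fall back to `requestModelId`.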

package/dist/types/provider-models/descriptors.d.ts CHANGED

@@ -93,7 +93,7 @@ export declare const CATALOG_PROVIDERS: readonly [{
     readonly createModelManagerOptions: (config: ModelManagerConfig) => import("..").ModelManagerOptions<"google-generative-ai", unknown>;
 }, {
     readonly id: "google-antigravity";
-    readonly defaultModel: "gemini-3-pro-high";
+    readonly defaultModel: "gemini-3.1-pro";
     readonly specialModelManager: true;
 }, {
     readonly id: "google-gemini-cli";

package/dist/types/types.d.ts CHANGED

@@ -27,6 +27,28 @@ export interface ThinkingConfig {
      * 5). Also implies native interleaved thinking — no beta header needed.
      */
     supportsDisplay?: boolean;
+    /**
+     * Per-effort upstream wire-id routing for collapsed effort-tier variants
+     * (`variant-collapse.ts`). Keyed by pi effort; `"off"` applies when
+     * thinking is disabled. Missing keys fall back to `requestModelId ?? id`.
+     */
+    effortRouting?: Readonly<Partial<Record<Effort | "off", string>>>;
+    /**
+     * When true, a thinking-off request MUST explicitly suppress thinking on
+     * the wire (google-level: `thinkingLevel: "MINIMAL"` + `includeThoughts:
+     * false`; budget: `thinkingBudget: 0`) instead of omitting thinkingConfig —
+     * Cloud Code Assist re-applies the per-id baked server default when the
+     * config is absent.
+     */
+    suppressWhenOff?: boolean;
+    /**
+     * Reasoning is mandatory upstream: the endpoint rejects disabled or
+     * omitted thinking (e.g. OpenRouter Gemini 3.x — "Reasoning is mandatory
+     * for this endpoint and cannot be disabled"). Request mapping clamps
+     * thinking-off to the lowest supported effort unless `suppressWhenOff`
+     * provides an explicit wire off-path.
+     */
+    requiresEffort?: boolean;
 }
 export type Provider = string;
 /** Token budgets for each thinking level (token-based providers only) */
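
The interaction between `requiresEffort` and `suppressWhenOff` for a thinking-off request might look roughly like this. It is a sketch under stated assumptions: the `efforts` field, the canonical effort order, and the `planThinkingOff` helper are all hypothetical and are not taken from the package source.

```typescript
// Assumed effort names and canonical order (lowest first).
type ClampEffort = "minimal" | "low" | "medium" | "high" | "xhigh";
const ASSUMED_CANONICAL_ORDER: readonly ClampEffort[] = ["minimal", "low", "medium", "high", "xhigh"];

// Hypothetical slice of ThinkingConfig; `efforts` is an assumed field name.
interface ThinkingFlagsSketch {
	efforts: readonly ClampEffort[];
	requiresEffort?: boolean;
	suppressWhenOff?: boolean;
}

type ThinkingOffPlan =
	| { kind: "omit" } // leave thinkingConfig out of the request
	| { kind: "suppress" } // send an explicit wire-level off
	| { kind: "clamp"; effort: ClampEffort }; // run at the lowest supported effort

// Lowest supported effort in the assumed canonical order.
function minimumSupportedEffortSketch(thinking: ThinkingFlagsSketch): ClampEffort | undefined {
	return ASSUMED_CANONICAL_ORDER.find(effort => thinking.efforts.includes(effort));
}

function planThinkingOff(thinking: ThinkingFlagsSketch): ThinkingOffPlan {
	// An explicit wire off-path takes precedence over clamping.
	if (thinking.suppressWhenOff) return { kind: "suppress" };
	if (thinking.requiresEffort) {
		const floor = minimumSupportedEffortSketch(thinking);
		if (floor !== undefined) return { kind: "clamp", effort: floor };
	}
	return { kind: "omit" };
}
```

The ordering of the two checks mirrors the doc comment: clamping applies only when no explicit off-path exists.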

package/dist/types/utils.d.ts CHANGED

@@ -3,3 +3,9 @@ export declare function toNumber(value: unknown): number | undefined;
 export declare function toPositiveNumber(value: unknown, fallback: number): number;
 export declare function toBoolean(value: unknown): boolean | undefined;
 export declare function isAnthropicOAuthToken(key: string): boolean;
+/**
+ * Normalize a model display name: drop the gateway author prefix and
+ * model-extrinsic decorations. Returns the input verbatim when nothing
+ * matches (or when stripping would leave an empty name).
+ */
+export declare function cleanModelName(name: string): string;
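
The body of `cleanModelName` is not part of the hunks shown here, so the following is only a rough sketch of the documented contract (strip the author prefix and extrinsic tags, return the input verbatim if nothing would remain). Both regular expressions and the function name `cleanModelNameSketch` are assumptions made for illustration; the real patterns may be broader or stricter.

```typescript
// Assumed pattern for gateway author prefixes such as "OpenAI: " or "Google: ".
const ASSUMED_AUTHOR_PREFIX_RE = /^[A-Za-z0-9 .\-]+:\s+/;
// Assumed pattern for a few extrinsic tags named in the changelog:
// (latest), (Antigravity), price tiers like ($$$$), (20% off), (retires ...).
const ASSUMED_EXTRINSIC_TAG_RE = /\s*\((?:latest|Antigravity|\$+|\d+% off|retires [^)]*)\)/gi;

function cleanModelNameSketch(name: string): string {
	const cleaned = name.replace(ASSUMED_AUTHOR_PREFIX_RE, "").replace(ASSUMED_EXTRINSIC_TAG_RE, "").trim();
	// Never return an empty display name: fall back to the original input.
	return cleaned.length > 0 ? cleaned : name;
}
```

Tags that map to distinct wire ids, such as `(Thinking)` or `(High)`, are left alone because they do not appear in the assumed tag list.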

package/dist/types/variant-collapse.d.ts ADDED

@@ -0,0 +1,126 @@
+import { Effort } from "./effort";
+import type { Api, Model, ModelSpec, Provider, ThinkingConfig } from "./types";
+/**
+ * Structural bound for collapse inputs: both raw `ModelSpec`s and built
+ * `Model`s qualify. (`Model.compat` is the resolved record, not the sparse
+ * config, so the two are not mutually assignable — collapsing never touches
+ * `compat`.)
+ */
+export type VariantSpecLike = Omit<ModelSpec<Api>, "compat"> & {
+    compat?: unknown;
+};
+/** One collapsed family: logical id + member wire ids + per-effort routing. */
+export interface EffortVariantFamily {
+    /** Collapsed logical id (may equal a member id — e.g. bare/thinking pairs). */
+    id: string;
+    /** Final display name, no tier marker. */
+    name: string;
+    /**
+     * Member wire ids in priority order. The first member present in the input
+     * becomes the collapsed spec's default wire id (`requestModelId`; omitted
+     * when it equals the logical id).
+     */
+    members: readonly string[];
+    /**
+     * Wire ids upstream no longer serves (e.g. a deployment killed while
+     * discovery still advertises it). Fresh collapsing never routes to them,
+     * and stale collapsed snapshots (bundled catalog, cache rows,
+     * previous-generation fallbacks) get routing/`requestModelId` entries that
+     * target them re-pointed through `routing`. Keep retired ids in `members`
+     * so the raw upstream spec is still consumed and aliased.
+     */
+    retiredMembers?: readonly string[];
+    /**
+     * Per-effort upstream wire id; `"off"` applies when thinking is disabled.
+     * Entries whose target member is absent from the input are dropped — those
+     * efforts fall back to `requestModelId ?? id`.
+     */
+    routing: Readonly<Partial<Record<Effort | "off", string>>>;
+    /** Explicit capability surface for the collapsed spec — no inference. */
+    thinking: Readonly<Omit<ThinkingConfig, "effortRouting" | "suppressWhenOff">>;
+    /** Thinking-off requests must explicitly suppress thinking on the wire. */
+    suppressWhenOff?: boolean;
+    /** Retired/recycled selector ids that alias to this family without being members. */
+    extraAliases?: readonly string[];
+}
+export interface VariantCollapseTable {
+    families: readonly EffortVariantFamily[];
+}
+/**
+ * Shared by `google-antigravity` and `google-gemini-cli` — both serve the
+ * Antigravity discovery list (`fetchAntigravityDiscoveryModels`).
+ */
+export declare const ANTIGRAVITY_VARIANT_COLLAPSE_TABLE: VariantCollapseTable;
+/** Provider id → hand collapse table. Both CCA providers share one table. */
+export declare const VARIANT_COLLAPSE_TABLES: Readonly<Record<string, VariantCollapseTable>>;
+/**
+ * The global automatic rule: derive an `X` + `X-thinking` family for every
+ * pair where both ids are live in `specs` (trailing or infix token). Gates:
+ * - both members share the same `api`,
+ * - known pricing must match — all-zero cost rows count as unknown
+ *   (aggregators routinely ship them), but twins that BOTH carry real,
+ *   differing prices are distinct SKUs and never merge,
+ * - ids claimed by the provider's hand `table` are skipped (curation wins).
+ * The capability surface prefers the thinking member's metadata, then the
+ * bare member's, then the canonical deriver (aggregators often ship
+ * `reasoning: false` and no thinking config on the twin), then a budget
+ * default. `off` routes to the bare id; every supported effort routes to the
+ * thinking id.
+ */
+export declare function deriveThinkingPairFamilies<TSpec extends VariantSpecLike>(specs: readonly TSpec[], table?: VariantCollapseTable): EffortVariantFamily[];
+/**
+ * True when `spec` is the output of collapsing rather than a raw upstream
+ * member. `thinking.effortRouting` is written only by collapsing; the
+ * `requestModelId` arm is scoped to the provider's hand-table family ids so
+ * unrelated carriers (GitHub Copilot `-1m` context variants) never match.
+ */
+export declare function isVariantCollapsedSpec(spec: VariantSpecLike): boolean;
+/**
+ * Collapse every family in `table` found in `specs`. Non-member specs pass
+ * through verbatim (by reference), order preserved; the collapsed spec
+ * replaces the first occurrence of its family.
+ */
+export declare function collapseEffortVariants<TSpec extends VariantSpecLike>(specs: readonly TSpec[], table: VariantCollapseTable): TSpec[];
+/**
+ * Collapse a full mixed-provider list: per provider, the hand table (when
+ * registered) plus the automatic `X`/`X-thinking` pair rule. Used by the
+ * catalog generator; the runtime equivalent lives at the model-manager merge
+ * point. Output is regrouped by provider — callers re-sort.
+ */
+export declare function collapseEffortVariantsAcrossProviders<TSpec extends VariantSpecLike>(specs: readonly TSpec[]): TSpec[];
+/**
+ * Runtime entry point for already-built `Model` lists (the model-manager
+ * merge point, coding-agent registry custom providers): collapses hand
+ * tables plus derived pairs, then re-runs `buildModel` on freshly created
+ * logical specs so thinking wire defaults stay resolved. Untouched entries
+ * pass through by reference.
+ */
+export declare function collapseBuiltModelVariants<TApi extends Api>(models: readonly Model<TApi>[]): Model<TApi>[];
+/**
+ * Resolve a retired effort-tier variant id (collapsed member, recycled id) to
+ * its replacement model id for `provider` via the hand table. Returns
+ * `undefined` when the id is not a known alias; derived `X-thinking` members
+ * resolve through `stripThinkingVariantToken` instead. Callers must try an
+ * exact model lookup first — a live model always wins over an alias.
+ */
+export declare function resolveVariantAlias(provider: Provider, modelId: string): string | undefined;
+/** Bare-id alias hit: replacement id plus the providers declaring it. */
+export interface BareVariantAliasHit {
+    id: string;
+    /** Providers whose table declares the alias — candidates from these win ties. */
+    providers: readonly Provider[];
+}
+/**
+ * Provider-agnostic hand-table alias lookup for bare-id selectors. Returns
+ * the declaring providers so callers can prefer their models when the
+ * replacement id exists on unrelated providers too (e.g. a retired Cursor
+ * tier id must not resolve to `openai/gpt-5.4`).
+ */
+export declare function resolveBareVariantAlias(modelId: string): BareVariantAliasHit | undefined;
+/**
+ * Reverse alias lookup: the retired ids that resolve to `modelId` for
+ * `provider` via the hand table. Used to re-key config keyed by raw member
+ * ids (models.yml `modelOverrides`, suppressed selectors) onto the collapsed
+ * model. Empty for providers without a table.
+ */
+export declare function getVariantAliasSources(provider: Provider, modelId: string): readonly string[];
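
The pricing gate documented on `deriveThinkingPairFamilies` can be illustrated with a small predicate. This is a sketch only: the `PairCostSketch` field names and both helper functions are hypothetical, and the real gate may compare more or different fields.

```typescript
// Hypothetical cost row; the field names are assumptions.
interface PairCostSketch {
	input: number;
	output: number;
	cacheRead: number;
	cacheWrite: number;
}

// All-zero rows count as "unknown" pricing rather than "free".
function isUnknownCostSketch(cost: PairCostSketch): boolean {
	return cost.input === 0 && cost.output === 0 && cost.cacheRead === 0 && cost.cacheWrite === 0;
}

// A bare/thinking twin may pair when either side is unknown, or when both
// sides carry real prices that match exactly.
function pricingAllowsPairing(bare: PairCostSketch, thinking: PairCostSketch): boolean {
	if (isUnknownCostSketch(bare) || isUnknownCostSketch(thinking)) return true;
	return (
		bare.input === thinking.input &&
		bare.output === thinking.output &&
		bare.cacheRead === thinking.cacheRead &&
		bare.cacheWrite === thinking.cacheWrite
	);
}
```

Twins that both carry real, differing prices fail the predicate and stay separate SKUs, which matches the rule stated in the doc comment.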

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
 	"type": "module",
 	"name": "@oh-my-pi/pi-catalog",
-	"version": "15.11.6",
+	"version": "15.11.8",
 	"description": "Model catalog for omp: bundled model database, provider discovery descriptors, model identity, classification, and equivalence",
 	"homepage": "https://omp.sh",
 	"author": "Can Boluk",
@@ -34,11 +34,11 @@
 	},
 	"dependencies": {
 		"@bufbuild/protobuf": "^2.12.0",
-		"@oh-my-pi/pi-utils": "15.11.6",
+		"@oh-my-pi/pi-utils": "15.11.8",
 		"zod": "4.4.3"
 	},
 	"devDependencies": {
-		"@oh-my-pi/pi-ai": "15.11.6",
+		"@oh-my-pi/pi-ai": "15.11.8",
 		"@types/bun": "^1.3.14"
 	},
 	"engines": {

package/src/build.ts CHANGED

@@ -13,11 +13,13 @@ import { buildAnthropicCompat } from "./compat/anthropic";
 import { buildOpenAICompat, buildOpenAIResponsesCompat } from "./compat/openai";
 import { resolveModelThinking } from "./model-thinking";
 import type { Api, CompatOf, Model, ModelSpec } from "./types";
+import { cleanModelName } from "./utils";
 export function buildModel<TApi extends Api>(spec: ModelSpec<TApi>): Model<TApi> {
 	const compat = buildCompat(spec) as CompatOf<TApi>;
 	return {
 		...spec,
+		name: cleanModelName(spec.name),
 		thinking: resolveModelThinking(spec, compat),
 		compat,
 		compatConfig: spec.compat,

package/src/compat/openai.ts CHANGED

@@ -25,6 +25,8 @@ const GLM_CODING_PLAN_MODEL_PATTERN = /^glm-5(?:[.-]|$)/i;
 const GLM_CODING_PLAN_STREAM_IDLE_TIMEOUT_MS = 600_000;
 /** Direct DeepSeek reasoning models stall between thinking and answer phases. */
 const DEEPSEEK_REASONING_STREAM_IDLE_TIMEOUT_MS = 300_000;
+/** Kimi K2.6 can spend several minutes reasoning before the first visible token. */
+const KIMI_K26_REASONING_STREAM_IDLE_TIMEOUT_MS = 300_000;
 /**
  * OpenCode's gateways (https://opencode.ai/zen|go) gate `reasoning_content`
@@ -178,15 +180,17 @@ export function buildOpenAICompat(spec: ModelSpec<"openai-completions">): Resolv
 			isCopilotHost ||
 			isZenmuxHost);
-	// Stream-watchdog floor: GLM coding-plan SKUs and direct DeepSeek reasoning
-	// models idle for minutes mid-reasoning; widen the idle timeout so warm-ups
-	// stop aborting and retrying.
+	// Stream-watchdog floor: GLM coding-plan SKUs, Kimi K2.6, and direct
+	// DeepSeek reasoning models can idle for minutes while reasoning; widen the
+	// idle timeout so warm-ups stop aborting and retrying.
 	const streamIdleTimeoutMs =
 		GLM_CODING_PLAN_MODEL_PATTERN.test(spec.id) && (isZai || isZhipu)
 			? GLM_CODING_PLAN_STREAM_IDLE_TIMEOUT_MS
-			: spec.reasoning && isDirectDeepseekApi
-				? DEEPSEEK_REASONING_STREAM_IDLE_TIMEOUT_MS
-				: undefined;
+			: spec.reasoning && isKimiK26ModelId(spec.id)
+				? KIMI_K26_REASONING_STREAM_IDLE_TIMEOUT_MS
+				: spec.reasoning && isDirectDeepseekApi
+					? DEEPSEEK_REASONING_STREAM_IDLE_TIMEOUT_MS
+					: undefined;
 	const compat: ResolvedOpenAICompat = {
 		supportsStore: !isNonStandard,
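
The precedence of the nested ternary above may be easier to follow as a flat function. This sketch reuses the two regular expressions that appear in the diff; the `StreamTimeoutInput` shape, the `isGlmCodingPlanHost` flag (standing in for `isZai || isZhipu`), and the function name are hypothetical.

```typescript
// Patterns as they appear in the hunks for compat/openai.ts and identity/family.ts.
const GLM_CODING_PLAN_PATTERN_SKETCH = /^glm-5(?:[.-]|$)/i;
const KIMI_K26_PATTERN_SKETCH = /(^|\/)kimi-k2(?:\.6|p6)(?:[-:]|$)/i;

// Hypothetical input bundle; the real code reads these from local variables.
interface StreamTimeoutInput {
	id: string;
	reasoning: boolean;
	isGlmCodingPlanHost: boolean;
	isDirectDeepseekApi: boolean;
}

// First matching rule wins: GLM coding plan, then Kimi K2.6, then direct DeepSeek.
function pickStreamIdleTimeoutMs(input: StreamTimeoutInput): number | undefined {
	if (GLM_CODING_PLAN_PATTERN_SKETCH.test(input.id) && input.isGlmCodingPlanHost) return 600_000;
	if (input.reasoning && KIMI_K26_PATTERN_SKETCH.test(input.id)) return 300_000;
	if (input.reasoning && input.isDirectDeepseekApi) return 300_000;
	return undefined;
}
```

Note that the Kimi rule does not depend on the host, so a router id spelled with `k2p6` gets the same 300s floor as a public `kimi-k2.6` id.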

package/src/discovery/antigravity.ts CHANGED

@@ -1,6 +1,7 @@
 import * as z from "zod/v4";
 import type { ModelSpec } from "../types";
 import { toPositiveNumber } from "../utils";
+import { ANTIGRAVITY_VARIANT_COLLAPSE_TABLE, collapseEffortVariants } from "../variant-collapse";
 import { getAntigravityUserAgent } from "../wire/gemini-headers";
 const DEFAULT_ANTIGRAVITY_DISCOVERY_ENDPOINTS = [
@@ -11,13 +12,7 @@ const FETCH_AVAILABLE_MODELS_PATH = "/v1internal:fetchAvailableModels";
 const DEFAULT_CONTEXT_WINDOW = 200_000;
 const DEFAULT_MAX_TOKENS = 64_000;
-const ANTIGRAVITY_DISCOVERY_DENYLIST = new Set([
-	"chat_20706",
-	"chat_23310",
-	"gemini-2.5-flash-thinking",
-	"gemini-3-pro-low",
-	"gemini-2.5-pro",
-]);
+const ANTIGRAVITY_DISCOVERY_DENYLIST = new Set(["chat_20706", "chat_23310", "gemini-2.5-pro"]);
 /**
  * Raw model metadata returned by Antigravity's `fetchAvailableModels` endpoint.
@@ -224,7 +219,7 @@ export async function fetchAntigravityDiscoveryModels(
 			const supportsImages = model.supportsImages === true;
 			models.push({
 				id: modelId,
-				name: model.displayName ? `${model.displayName} (Antigravity)` : modelId,
+				name: model.displayName || modelId,
 				api: "google-gemini-cli",
 				provider: "google-antigravity",
 				baseUrl: endpoint,
@@ -241,8 +236,12 @@ export async function fetchAntigravityDiscoveryModels(
 			});
 		}
-		models.sort((a, b) => a.name.localeCompare(b.name) || a.id.localeCompare(b.id));
-		return models;
+		// Collapse effort-tier variants at the source so runtime discovery,
+		// the gemini-cli re-provision, and the catalog generator all see
+		// logical ids only.
+		const collapsed = collapseEffortVariants(models, ANTIGRAVITY_VARIANT_COLLAPSE_TABLE);
+		collapsed.sort((a, b) => a.name.localeCompare(b.name) || a.id.localeCompare(b.id));
+		return collapsed;
 	}
 	return null;

package/src/identity/family.ts CHANGED

@@ -14,9 +14,9 @@ export function isKimiModelId(modelId: string): boolean {
 	return modelId.includes("moonshotai/kimi") || /(^|\/)kimi[-.]/i.test(modelId);
 }
-/** Kimi K2.6 specifically (preserved-thinking transport on Moonshot-native hosts). */
+/** Kimi K2.6 specifically, including router ids that spell the version `k2p6`. */
 export function isKimiK26ModelId(modelId: string): boolean {
-	return /(^|\/)kimi-k2\.6(?:[-:]|$)/i.test(modelId);
+	return /(^|\/)kimi-k2(?:\.6|p6)(?:[-:]|$)/i.test(modelId);
 }
 /** Claude ids in any namespace form (`claude-*`, `vendor/claude.x`). */
@@ -113,3 +113,44 @@ export function isAnthropicFableOrMythosModel(modelId: string): boolean {
 	const parsed = parseAnthropicModel(bareModelId(modelId));
 	return parsed !== null && isFableOrMythos(parsed.kind);
 }
+/** Thinking-variant token location inside a model id. */
+export interface ThinkingVariantToken {
+	index: number;
+	length: number;
+}
+const THINKING_VARIANT_TOKEN_RE = /-(?:thinking|reasoner|reasoning)(?=$|[^a-z0-9])/gi;
+/**
+ * Locates the first thinking-variant token (`-thinking`, `-reasoner`,
+ * `-reasoning`; trailing or infix) in a model id. The token ends at the id
+ * end or any non-alphanumeric boundary, and negated forms (`non-thinking`,
+ * `no-thinking`) never match — those name the NON-thinking SKU.
+ */
+export function findThinkingVariantToken(modelId: string): ThinkingVariantToken | undefined {
+	THINKING_VARIANT_TOKEN_RE.lastIndex = 0;
+	let match = THINKING_VARIANT_TOKEN_RE.exec(modelId);
+	while (match !== null) {
+		const preceding = /([a-z0-9]+)$/i.exec(modelId.slice(0, match.index))?.[1]?.toLowerCase();
+		if (preceding !== "non" && preceding !== "no") {
+			return { index: match.index, length: match[0].length };
+		}
+		match = THINKING_VARIANT_TOKEN_RE.exec(modelId);
+	}
+	return undefined;
+}
+/**
+ * Removes the located thinking-variant token: `kimi-k2-thinking` → `kimi-k2`,
+ * `mimo-v2-flash-thinking-original` → `mimo-v2-flash-original`,
+ * `grok-4.1-fast-reasoning` → `grok-4.1-fast`. Returns `undefined` when no
+ * token exists or nothing would remain. Callers MUST verify the result names
+ * a live model.
+ */
+export function stripThinkingVariantToken(modelId: string): string | undefined {
+	const token = findThinkingVariantToken(modelId);
+	if (!token) return undefined;
+	const stripped = modelId.slice(0, token.index) + modelId.slice(token.index + token.length);
+	return stripped.length > 0 ? stripped : undefined;
+}
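
To see how the negation-aware matching behaves, the two functions from the hunk above can be exercised in isolation. The function bodies below are copied from the diff; the `TokenSketch` interface name is a local stand-in for `ThinkingVariantToken`, and the ids in the usage note are the ones named in the doc comments and changelog, plus one negated id chosen for illustration.

```typescript
interface TokenSketch {
	index: number;
	length: number;
}

const THINKING_VARIANT_TOKEN_RE = /-(?:thinking|reasoner|reasoning)(?=$|[^a-z0-9])/gi;

function findThinkingVariantToken(modelId: string): TokenSketch | undefined {
	// The regex is global, so reset its cursor before each scan.
	THINKING_VARIANT_TOKEN_RE.lastIndex = 0;
	let match = THINKING_VARIANT_TOKEN_RE.exec(modelId);
	while (match !== null) {
		// Look at the alphanumeric run right before the token to detect `non-`/`no-`.
		const preceding = /([a-z0-9]+)$/i.exec(modelId.slice(0, match.index))?.[1]?.toLowerCase();
		if (preceding !== "non" && preceding !== "no") {
			return { index: match.index, length: match[0].length };
		}
		match = THINKING_VARIANT_TOKEN_RE.exec(modelId);
	}
	return undefined;
}

function stripThinkingVariantToken(modelId: string): string | undefined {
	const token = findThinkingVariantToken(modelId);
	if (!token) return undefined;
	const stripped = modelId.slice(0, token.index) + modelId.slice(token.index + token.length);
	return stripped.length > 0 ? stripped : undefined;
}
```

For example, `stripThinkingVariantToken("kimi-k2-thinking-turbo")` yields `kimi-k2-turbo`, while an id such as `grok-4-fast-non-reasoning` returns `undefined` because the token is preceded by `non`.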

package/src/index.ts CHANGED

@@ -10,6 +10,7 @@ export * from "./models";
 export * from "./provider-models";
 export * from "./types";
 export * from "./utils";
+export * from "./variant-collapse";
 export * from "./wire/codex";
 export * from "./wire/gemini-headers";
 export * from "./wire/github-copilot";

package/src/model-cache.ts CHANGED

@@ -7,9 +7,10 @@ import { getModelDbPath } from "@oh-my-pi/pi-utils";
 import type { Api, Model, ModelSpec } from "./types";
 // Rows persist ModelSpec JSON (sparse `compat`, never the resolved record);
-// the model manager rebuilds via `buildModel` on load. v4 invalidates rows
-// carrying the pre-efforts ThinkingConfig shape (minLevel/maxLevel/levels).
-const CACHE_SCHEMA_VERSION = 4;
+// the model manager rebuilds via `buildModel` on load. v5 invalidates rows
+// predating effort-tier variant collapsing (raw `-low`/`-high`/`-thinking`
+// member ids); v4 dropped the pre-efforts ThinkingConfig shape.
+const CACHE_SCHEMA_VERSION = 5;
 interface CacheRow {
 	provider_id: string;

package/src/model-manager.ts CHANGED

@@ -3,6 +3,7 @@ import { readModelCache, writeModelCache } from "./model-cache";
 import { type GeneratedProvider, getBundledModels } from "./models";
 import type { Api, Model, ModelSpec, Provider } from "./types";
 import { isRecord } from "./utils";
+import { collapseBuiltModelVariants } from "./variant-collapse";
 const DEFAULT_CACHE_TTL_MS = 2 * 60 * 60 * 1000;
 const NON_AUTHORITATIVE_RETRY_MS = 5 * 60 * 1000;
@@ -134,7 +135,7 @@ export async function resolveProviderModels<TApi extends Api = Api, TModelsDevPa
 	// Re-running `mergeDynamicModels(static, cache)` would just rebuild the same
 	// objects (~800ms in the steady-state cold-start profile for `omp -p hi`).
 	if (!shouldFetchFromNetwork && cache?.fresh && hasAuthoritativeCache && cacheFingerprintMatches) {
-		return { models: passModelList<TApi>(cache.models), stale: false };
+		return { models: collapseBuiltModelVariants(passModelList<TApi>(cache.models)), stale: false };
 	}
 	const [fetchedModelsDevModels, fetchedDynamicModels] = shouldFetchFromNetwork
@@ -148,8 +149,9 @@ export async function resolveProviderModels<TApi extends Api = Api, TModelsDevPa
 	const dynamicModels = fetchedDynamicModels ?? [];
 	const mergedWithCache = mergeDynamicModels(mergeModelSources(staticModels, modelsDevModels), cacheModels);
 	const mergedModels = mergeDynamicModels(mergedWithCache, dynamicModels);
-	const models =
-		dynamicModelsAuthoritative && dynamicFetchSucceeded ? retainModelIds(mergedModels, dynamicModels) : mergedModels;
+	const models = collapseBuiltModelVariants(
+		dynamicModelsAuthoritative && dynamicFetchSucceeded ? retainModelIds(mergedModels, dynamicModels) : mergedModels,
+	);
 	const dynamicAuthoritative = !hasDynamicFetcher || dynamicFetchSucceeded || shouldUseFreshCacheAsAuthoritative;
 	if (shouldFetchFromNetwork) {
 		if (dynamicFetchSucceeded) {
@@ -157,7 +159,14 @@ export async function resolveProviderModels<TApi extends Api = Api, TModelsDevPa
 			const snapshotModels = dynamicModelsAuthoritative
 				? retainModelIds(mergedSnapshot, dynamicModels)
 				: mergedSnapshot;
-			writeModelCache(options.providerId, now(), snapshotModels, true, staticFingerprint, dbPath);
+			writeModelCache(
+				options.providerId,
+				now(),
+				collapseBuiltModelVariants(snapshotModels),
+				true,
+				staticFingerprint,
+				dbPath,
+			);
 		} else {
 			// Dynamic fetch failed — update cache with a non-authoritative snapshot so
 			// stale state remains visible while retry backoff still applies.
@@ -165,9 +174,11 @@ export async function resolveProviderModels<TApi extends Api = Api, TModelsDevPa
 			writeModelCache(
 				options.providerId,
 				now(),
-				mergeDynamicModels(
-					mergeModelSources(staticModels, modelsDevModels),
-					normalizeModelList<TApi>(latestCache?.models ?? cache?.models ?? []),
+				collapseBuiltModelVariants(
+					mergeDynamicModels(
+						mergeModelSources(staticModels, modelsDevModels),
+						normalizeModelList<TApi>(latestCache?.models ?? cache?.models ?? []),
+					),
 				),
 				false,
 				staticFingerprint,
@@ -290,7 +301,7 @@ function retainModelIds<TApi extends Api>(
  * arms calling `resolveProviderModels` with the same `staticModels` array)
  * skip the JSON+hash work after the first call.
  */
-const MODEL_CACHE_FINGERPRINT_VERSION = "merge-v2";
+const MODEL_CACHE_FINGERPRINT_VERSION = "merge-v3";
 const kStaticFingerprint = Symbol("model-manager.staticFingerprint");
 type ModelArrayWithFingerprint = readonly Model<Api>[] & { [kStaticFingerprint]?: string };
 function fingerprintStatic<TApi extends Api>(