npm - pi-sap-aicore - Versions diffs - 0.1.0 - Mend

pi-sap-aicore 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/LICENSE +21 -0
package/README.md +296 -0
package/index.ts +68 -0
package/package.json +40 -0
package/scripts/diagnose-streaming.mjs +99 -0
package/scripts/list-sap-models.mjs +92 -0
package/scripts/update-models.mjs +107 -0
package/src/auth.ts +104 -0
package/src/foundation-params.ts +55 -0
package/src/models-config.ts +93 -0
package/src/models-snapshot.json +527 -0
package/src/stream-foundation.ts +361 -0
package/src/stream.ts +1051 -0
package/src/to-pi-model.ts +21 -0
package/src/translate-foundation.ts +154 -0
package/src/translate.ts +218 -0
package/tsconfig.json +16 -0

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Tim Pearson
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,296 @@
+# pi-sap-aicore
+A custom provider extension for the [pi coding agent](https://pi.dev) that routes
+inference through **SAP AI Core** — via the **orchestration** service (every model
+from a single deployment) and/or **direct foundation deployments** (per-model
+Azure OpenAI endpoints with native streaming). Both register at once and share one
+login, so you pick the route per model. See
+[Orchestration vs. Foundation](#orchestration-vs-foundation).
+## Prerequisites
+- pi **0.78.0 or newer** installed (`npm install -g @earendil-works/pi-coding-agent`)
+- An SAP BTP account with AI Core entitlement and an **orchestration deployment**
+- *(optional, for the foundation provider)* one or more **foundation-models
+  deployments** — one per OpenAI model you want to route directly
+- The service key JSON for your AI Core service binding
+## Credentials
+The extension looks for the SAP BTP service-key JSON in this order:
+1. **Pi's auth store** — `~/.pi/agent/auth.json`, populated by `/login` (see below).
+   Persisted across sessions, file-permission-locked by pi.
+2. **`AICORE_SERVICE_KEY` environment variable** — per-shell override. Useful for
+   testing against a different tenant for one session without re-running `/login`.
+If neither is present, inference fails with a clear "no service key configured" error.
+Both providers — `sap-aicore` (orchestration) and `sap-aicore-foundation` — use the
+**same** service key, so a single `/login` (or one `AICORE_SERVICE_KEY`) covers
+both. pi keys stored credentials per provider, so the foundation provider reads the
+shared login from pi's auth store directly; you never log in twice.
+### Recommended: `/login`
+From inside pi:
+```
+/login
+```
+Then:
+1. Pick **Use a subscription**.
+2. Pick **SAP AI Core**.
+3. At the prompt, paste your BTP service-key JSON as a single line and press Enter.
+   It's validated immediately — if anything is missing or malformed, you'll get a
+   specific error pointing at the field, so you can re-run `/login` and fix it.
+Pi stores the JSON in `~/.pi/agent/auth.json` for future sessions.
+To get the JSON: BTP cockpit → your AI Core service instance → Service Keys
+→ View. Copy the entire JSON object.
+> **Why "Use a subscription" and not "Use an API key"?** SAP service keys contain
+> a `$` in their `clientsecret`. Since pi 0.77, keys stored via "Use an API key"
+> are run through a `$`-interpolating template resolver that mangles them. The
+> extension registers credentials through pi's `oauth` mechanism instead, which
+> stores and returns the key verbatim. It's not real OAuth — it's just the path
+> that keeps your key intact. (See [pi issue #5095](https://github.com/earendil-works/pi/issues/5095).)
+> **Upgrading from an older install?** If you previously logged in via
+> "Use an API key" (stored as `{"type":"api_key"}` in `auth.json`), re-run
+> `/login` **once** via **Use a subscription** to convert the stored credential.
+> A single re-login is all that's needed.
+### Alternative: `AICORE_SERVICE_KEY` env var
+```bash
+export AICORE_SERVICE_KEY='{"clientid":"...","clientsecret":"...","url":"https://...authentication.sap.hana.ondemand.com","serviceurls":{"AI_API_URL":"https://api.ai.<region>.ml.hana.ondemand.com"}}'
+```
+The `@sap-ai-sdk/orchestration` SDK reads this directly for XSUAA auth, token
+caching, and deployment resolution — no manual token plumbing needed.
+## Install
+### From npm (recommended)
+```bash
+pi install npm:pi-sap-aicore
+```
+pi downloads the package under `~/.pi/agent/npm/`, runs `npm install` to pull the
+SAP AI SDKs, and auto-loads the extension on every startup. Run the one command on
+each machine; `pi update` keeps it current. Pin a version with
+`pi install npm:pi-sap-aicore@<version>` (pinned specs are skipped by `pi update`).
+Then configure credentials with `/login` (see [Credentials](#credentials)) and
+confirm the models are visible:
+```bash
+pi --list-models | grep sap-aicore
+```
+### Local development (this repo)
+```bash
+npm install
+pi -e ./index.ts --list-models
+```
+You'll see the orchestration models under `sap-aicore/` (Claude, GPT-5*, Gemini),
+plus any direct **foundation** models under `sap-aicore-foundation/`:
+- `sap-aicore/anthropic--claude-4.7-opus` — Claude Opus 4.7 (orchestration)
+- `sap-aicore/gpt-5.5` — GPT-5.5 via orchestration
+- `sap-aicore-foundation/gpt-5.5` — GPT-5.5 via its direct foundation deployment
+Run `pi -e ./index.ts` to launch pi with the local extension loaded; this
+overrides any globally-installed version for the session, which is the fastest
+iteration loop while developing.
+### Alternative: install from git
+For an unpublished fork or a branch you want to track directly:
+```bash
+pi install git:github.com/ttiimmaahh/pi-sap-aicore@main
+```
+pi clones to `~/.pi/agent/git/…`, runs `npm install`, and auto-loads on startup.
+Note: an `@main` git install is **not** moved to newer commits by `pi update` (it
+only reconciles to the pinned ref) — prefer the npm install above for hands-off
+updates.
+## Orchestration vs. Foundation
+The extension registers **two providers**, both backed by the same service key:
+| | `sap-aicore` (orchestration) | `sap-aicore-foundation` (direct) |
+|---|---|---|
+| SAP deployment | one orchestration deployment fronts **every** model | one foundation deployment **per model** |
+| Models | Claude, GPT-5*, Gemini | OpenAI (`gpt-*`) only |
+| Streaming | subject to orchestration's per-model allow-list — new models can 400 `Streaming is not supported` (we fall back to non-streaming) | **native** — streams straight from the Azure OpenAI endpoint |
+| Reasoning effort | tunable (`reasoning_effort` / `thinking`) | model **default** only (SDK pins Azure API `2024-10-21`, which has no `reasoning_effort`) |
+| Content filter / grounding / templating | yes | no — raw model access |
+| SDK | `@sap-ai-sdk/orchestration` | `@sap-ai-sdk/foundation-models` (`AzureOpenAiChatClient`) |
+Both routes appear in the model list simultaneously, so you choose per model. The
+foundation route exists mainly to get **native streaming** for new OpenAI models
+that orchestration hasn't enabled streaming for yet (e.g. `gpt-5.5`).
+**Adding a foundation model:** it needs its own foundation-models deployment in
+SAP AI Core — one per (model, version, resource group); the SDK resolves it by
+model name, so no deployment IDs to wire in. Then add its `id` to
+`FOUNDATION_MODEL_IDS` in [`src/models-config.ts`](./src/models-config.ts)
+(definitions are reused from the shared snapshot). An id with no matching
+deployment 404s at call time. Run `node scripts/list-sap-models.mjs` to see what
+your tenant actually deploys.
+## Models
+The model list is composed of two sources, merged at startup:
+1. **`src/models-snapshot.json`** — auto-generated from
+   [models.dev](https://models.dev)'s SAP AI Core catalog. Refresh with:
+   ```bash
+   npm run update-models
+   ```
+   This re-fetches the live catalog, applies our family-specific filters
+   (currently anthropic claude-4.x, gpt-5*, gemini-2.5*), and writes the
+   snapshot to disk. Commit the result.
+2. **`TENANT_EXTRAS` in [`src/models-config.ts`](./src/models-config.ts)** —
+   hand-maintained list of models that exist in your SAP tenant but
+   aren't (yet) in the models.dev catalog. Same `SapModel` shape. Extras
+   win over snapshot on duplicate `id`.
+To add a model that everyone on your team should see, add it to
+`TENANT_EXTRAS` and commit. To add a per-machine custom model (your own
+tenant only), use pi's built-in custom-models mechanism by editing
+`~/.pi/agent/models.json`; no extension changes are required.
+The `cost` fields are vendor list prices (USD per million tokens) from
+models.dev. They are used **only** for pi's in-UI cost display; your actual
+SAP BTP invoice is contract-based and will differ.
+## Thinking levels
+Models with `reasoning: true` honor pi's thinking-level cycle (default
+keybind `Shift+Tab`): `off`, `minimal`, `low`, `medium`, `high`, `xhigh`.
+- **Anthropic 4.6+ models** (`anthropic--claude-4.6-*`, `4.7-*`) use
+  *adaptive* thinking — `thinking: {type: "adaptive"}` + `output_config:
+  {effort}`. The model decides the budget; the level only nudges depth.
+- **Older Anthropic models** (`anthropic--claude-4-*`, `4.5-*`) use
+  *budget-token* thinking — `thinking: {type: "enabled", budget_tokens: N}`.
+  Each pi level maps to a token count (1k / 4k / 8k / 16k / 32k for
+  minimal/low/medium/high/xhigh), clamped down so `max_tokens` always
+  has at least 1024 tokens of headroom for the response. SAP rejects
+  the adaptive shape on these models ("adaptive thinking is not
+  supported on this model"), which is why we split.
+- **OpenAI models** (`gpt-*`) use `reasoning_effort: "minimal" | "low"
+  | "medium" | "high"`. `xhigh` is omitted because OpenAI has no equivalent
+  tier; pi will skip it when cycling.
+- **Gemini models** (`gemini-2.5-*`) ship with `reasoning: false`.
+  SAP's gemini reasoning passthrough is undocumented, so we keep
+  `Shift+Tab` off the cycle for these models rather than send a request
+  shape SAP may reject. Wire-up (likely `thinking_config.thinking_budget`)
+  is a future TODO in `src/stream.ts:reasoningParams`.
+**Note on reasoning visibility:** SAP orchestration does NOT pass
+structured reasoning/thinking content through to streaming clients.
+The model genuinely reasons (you'll see step-by-step structure leak
+into the visible answer text, and the tokens are billed via
+`completion_tokens_details.reasoning_tokens`), but pi's dedicated
+"thinking" panel will stay empty for SAP-routed models; there's no
+client-side fix. If SAP exposes a server-side flag for this in the
+future, our `pickReasoning` probe is wired and ready in `stream.ts`.
+**Foundation route caveat:** on `sap-aicore-foundation/*` the direct Azure
+OpenAI SDK pins API version `2024-10-21`, which has no `reasoning_effort`
+field, so gpt-5\* models reason at their **default** effort and pi's thinking-level
+cycle is a no-op there. The models still reason (reasoning tokens are billed
+and show in `output`); the depth just isn't tunable. Use the orchestration
+route (`sap-aicore/*`) when you need to set the effort level.
+To override budgets per model, edit `thinkingLevelMap` on the relevant
+entry in `TENANT_EXTRAS`, or override per-user via pi's `models.json`.
+## AI Resource Group
+Resolved in this order:
+1. **`AICORE_RESOURCE_GROUP` env var** — per-shell override. Example:
+   ```bash
+   export AICORE_RESOURCE_GROUP=my-team-rg
+   ```
+2. **`resourceGroup` field on the service-key JSON** — convenient for teams
+   who manage multiple groups and want to bake the default into the key.
+   Non-standard, so add it yourself before pasting into pi:
+   ```json
+   { "clientid": "...", "clientsecret": "...", "resourceGroup": "my-team-rg", ... }
+   ```
+3. **SAP's server-side default** (`default`) — if neither of the above is set.
+The value is passed via SAP's `OrchestrationClient(..., {resourceGroup})`
+constructor arg, which is the only supported channel — `AI-Resource-Group`
+as a request header is explicitly rejected by SAP's typings. The foundation
+provider applies the same resolved group via
+`AzureOpenAiChatClient({ modelName, resourceGroup })`; both a model's foundation
+deployment and the orchestration deployment must live in the resolved group for
+name-based resolution to find them.
+## Prompt caching & cost reporting
+**Cache read/write tokens always report 0** on SAP-routed turns. SAP
+orchestration strips all detail fields from the TokenUsage response
+— we only get `prompt_tokens`, `completion_tokens`, and `total_tokens`
+across every route. There's no `prompt_tokens_details.cached_tokens`
+(OpenAI) and no top-level `cache_read_input_tokens` (Anthropic) for
+the client to read.
+Whether the backend actually caches is invisible to pi. SAP's
+contract billing may give you a discount on cached tokens that this
+extension can't surface — check your BTP invoice if cache savings
+matter.
+**Experimental:** `PI_SAP_AICORE_CACHE_CONTROL=1` tags the system
+prompt and last user message with Anthropic's `cache_control:
+{type:"ephemeral"}`. SAP may forward it (saving SAP money on the
+backend, possibly passed through via your contract) or may 400 the
+request. Either way, you won't see cacheRead become non-zero in pi's
+diagnostics — that requires SAP to expose detail fields, which they
+currently don't.
+OpenAI/Gemini routes ignore the flag — they have their own automatic
+caching with no breakpoint API.
+**Foundation route:** because it talks to the Azure OpenAI endpoint directly
+(not through orchestration's usage-stripping), `prompt_tokens_details.cached_tokens`
+*may* come back populated — `mapUsage` reads it, so `cacheRead` could be non-zero
+on `sap-aicore-foundation/*` turns where orchestration always reports 0. Unverified
+against SAP's proxy; treat as best-effort.
+## Repo layout
+```
+.
+├── package.json              # pi-package manifest + deps + scripts
+├── tsconfig.json             # editor support; pi runs the .ts directly
+├── index.ts                  # ExtensionAPI factory + registerProvider calls (both providers)
+├── scripts/
+│   ├── update-models.mjs     # fetches models.dev, writes models-snapshot.json
+│   ├── list-sap-models.mjs   # lists models your tenant actually deploys (diff vs snapshot)
+│   └── diagnose-streaming.mjs # probes orchestration streaming support per model
+└── src/
+    ├── auth.ts                  # service-key validation + pi oauth registration
+    ├── models-config.ts         # loads snapshot, merges TENANT_EXTRAS, exposes FOUNDATION_MODELS
+    ├── models-snapshot.json     # auto-generated from models.dev (committed)
+    ├── to-pi-model.ts           # SapModel → pi's ProviderModelConfig mapper
+    ├── stream.ts                # orchestration streamSimple adapter + shared helpers (auth, usage, errors)
+    ├── translate.ts             # pi Context ↔ orchestration message shape
+    ├── foundation-params.ts     # Azure OpenAI request params (max_completion_tokens, temperature gating)
+    ├── stream-foundation.ts     # foundation streamSimple adapter (AzureOpenAiChatClient, native streaming)
+    └── translate-foundation.ts  # pi Context ↔ Azure OpenAI message shape
+```
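
Reviewer note on the README's "Thinking levels" section: the budget-token clamp it describes for older Anthropic models can be sketched as below. This is a minimal illustration, not the code in `src/stream.ts`. The names `LEVEL_BUDGETS`, `budgetTokensFor`, and `thinkingParams` are hypothetical, the README's "1k / 4k / ..." figures are assumed to mean multiples of 1024, and skipping thinking when fewer than 1024 tokens can be budgeted is an assumption as well.

```typescript
// Hypothetical sketch of the budget-token mapping described in the README.
type BudgetLevel = "minimal" | "low" | "medium" | "high" | "xhigh";

// Assumption: "1k / 4k / 8k / 16k / 32k" are multiples of 1024.
const LEVEL_BUDGETS: Record<BudgetLevel, number> = {
  minimal: 1024,
  low: 4096,
  medium: 8192,
  high: 16384,
  xhigh: 32768,
};

// The README states that max_tokens keeps at least 1024 tokens for the answer.
const RESPONSE_HEADROOM = 1024;
// Assumption: budgets below 1024 tokens are not worth sending.
const MIN_BUDGET = 1024;

function budgetTokensFor(level: BudgetLevel, maxTokens: number): number | undefined {
  const ceiling = maxTokens - RESPONSE_HEADROOM;
  if (ceiling < MIN_BUDGET) return undefined; // no room left for thinking
  return Math.min(LEVEL_BUDGETS[level], ceiling); // clamp down, never up
}

function thinkingParams(
  level: BudgetLevel,
  maxTokens: number,
): { thinking?: { type: "enabled"; budget_tokens: number } } {
  const budget = budgetTokensFor(level, maxTokens);
  return budget === undefined
    ? {}
    : { thinking: { type: "enabled", budget_tokens: budget } };
}
```

Under these assumptions, `high` with `max_tokens` of 8192 is clamped from 16384 down to 7168, which leaves exactly 1024 tokens for the visible response.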

package/index.ts ADDED

@@ -0,0 +1,68 @@
+import type { Api } from "@earendil-works/pi-ai";
+import type { ExtensionAPI } from "@earendil-works/pi-coding-agent";
+import { sapAiCoreOAuth } from "./src/auth.ts";
+import { FOUNDATION_MODELS, MODELS } from "./src/models-config.ts";
+import { streamSapAiCore } from "./src/stream.ts";
+import { streamSapFoundation } from "./src/stream-foundation.ts";
+import { toPiModel } from "./src/to-pi-model.ts";
+const PROVIDER_NAME = "sap-aicore";
+const PROVIDER_API = "sap-aicore-orchestration" as Api;
+// Second provider: direct foundation (Azure OpenAI) deployments, registered
+// alongside orchestration so both routes are independently selectable
+// (e.g. `sap-aicore/gpt-5.5` vs `sap-aicore-foundation/gpt-5.5`).
+const FOUNDATION_PROVIDER_NAME = "sap-aicore-foundation";
+const FOUNDATION_PROVIDER_API = "sap-aicore-foundation" as Api;
+// pi requires a non-empty `apiKey` for any custom provider that defines models
+// (model-registry `validateConfig`), even when credentials come from `oauth`.
+// This value is never used: the real key is supplied by `sapAiCoreOAuth`
+// (after `/login`) or by AICORE_SERVICE_KEY (both handled in stream.ts). It is a
+// plain lowercase literal so pi's config-value resolver returns it as-is — no
+// `$` interpolation, no shell exec, and not mistaken for a legacy env-var name.
+const PLACEHOLDER_API_KEY = "managed-by-extension-oauth";
+export default function (pi: ExtensionAPI) {
+	pi.registerProvider(PROVIDER_NAME, {
+		name: "SAP AI Core",
+		baseUrl: "https://sap-aicore-handled-by-sdk.invalid",
+		apiKey: PLACEHOLDER_API_KEY,
+		api: PROVIDER_API,
+		// Credentials flow through pi's `oauth` path — its escape hatch from the
+		// $-interpolating config-value resolver that corrupts service keys
+		// containing `$` (SAP keys have one in `clientsecret`). `/login → Use a
+		// subscription → SAP AI Core` captures the service-key JSON; `getApiKey`
+		// returns it verbatim as `options.apiKey` to `streamSimple`.
+		oauth: sapAiCoreOAuth,
+		// Resource-group selection lives in stream.ts (passed to
+		// OrchestrationClient's deploymentConfig); SAP's typings reject
+		// it as a header (`'AI-Resource-Group'?: never`). A `headers`
+		// entry here would also be a no-op anyway — pi only forwards
+		// `headers` when it makes the HTTP request itself, but we use
+		// `streamSimple` and the SAP SDK handles transport.
+		models: MODELS.map((m) => toPiModel(m, PROVIDER_API)),
+		// Synchronous, as pi's provider contract requires. The SAP SDK is still
+		// deferred to first use — `stream.ts` only `import type`s it at module
+		// load and dynamically imports the OrchestrationClient inside the stream
+		// producer, surfacing a missing-dependency error through the stream.
+		streamSimple: streamSapAiCore,
+	});
+	// Foundation provider — shares the exact same credential. Both providers
+	// reference the same `sapAiCoreOAuth` (oauth name "SAP AI Core"), so a single
+	// `/login` serves both and the service key is never entered twice. Models
+	// appear under `sap-aicore-foundation/…`; streaming runs natively here (no
+	// orchestration streaming-unsupported fallback). The foundation SDK is
+	// dynamically imported inside `streamSapFoundation`, same deferral as above.
+	pi.registerProvider(FOUNDATION_PROVIDER_NAME, {
+		name: "SAP AI Core (Foundation)",
+		baseUrl: "https://sap-aicore-handled-by-sdk.invalid",
+		apiKey: PLACEHOLDER_API_KEY,
+		api: FOUNDATION_PROVIDER_API,
+		oauth: sapAiCoreOAuth,
+		models: FOUNDATION_MODELS.map((m) => toPiModel(m, FOUNDATION_PROVIDER_API)),
+		streamSimple: streamSapFoundation,
+	});
+}
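
Reviewer note: the comments in `index.ts` and the README's "Credentials" section describe a lookup order (stored login first, then `AICORE_SERVICE_KEY`, else a "no service key configured" error) followed by field-level validation. The sketch below illustrates that order under stated assumptions; it is not the code in `src/auth.ts` or `src/stream.ts`. The function names are hypothetical, the required fields are taken from the README's `AICORE_SERVICE_KEY` example, and treating the placeholder `apiKey` as "nothing stored" is an assumption about how pi passes `options.apiKey`.

```typescript
// Hypothetical sketch of service-key resolution and validation.
interface ServiceKey {
  clientid: string;
  clientsecret: string;
  url: string;
  serviceurls: { AI_API_URL: string };
}

// Same literal as PLACEHOLDER_API_KEY in index.ts.
const PLACEHOLDER_API_KEY = "managed-by-extension-oauth";

function parseServiceKey(raw: string): ServiceKey {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("service key is not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("service key must be a JSON object");
  }
  const key = parsed as Record<string, unknown>;
  // Name the first missing field so the user knows what to fix.
  for (const field of ["clientid", "clientsecret", "url"]) {
    if (typeof key[field] !== "string" || key[field] === "") {
      throw new Error(`service key is missing "${field}"`);
    }
  }
  const serviceurls = key.serviceurls as Record<string, unknown> | null | undefined;
  if (typeof serviceurls?.AI_API_URL !== "string") {
    throw new Error('service key is missing "serviceurls.AI_API_URL"');
  }
  return parsed as ServiceKey;
}

function resolveServiceKey(
  storedKey: string | undefined,
  env: Record<string, string | undefined>,
): ServiceKey {
  // Stored login wins; the placeholder counts as "nothing stored" (assumption).
  const candidate =
    storedKey && storedKey !== PLACEHOLDER_API_KEY ? storedKey : env.AICORE_SERVICE_KEY;
  if (!candidate) throw new Error("no service key configured");
  return parseServiceKey(candidate);
}
```

Passing the environment in as a parameter keeps the function testable and makes the precedence explicit. Because the raw JSON is parsed as-is, a `$` inside `clientsecret` survives unchanged.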

package/package.json ADDED

@@ -0,0 +1,40 @@
+{
+  "name": "pi-sap-aicore",
+  "version": "0.1.0",
+  "description": "SAP AI Core (orchestration + foundation) provider for the pi coding agent",
+  "license": "MIT",
+  "author": "Tim Pearson (https://github.com/ttiimmaahh)",
+  "homepage": "https://github.com/ttiimmaahh/pi-sap-aicore#readme",
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/ttiimmaahh/pi-sap-aicore.git"
+  },
+  "bugs": {
+    "url": "https://github.com/ttiimmaahh/pi-sap-aicore/issues"
+  },
+  "type": "module",
+  "keywords": ["pi-package"],
+  "pi": {
+    "extensions": ["./index.ts"]
+  },
+  "scripts": {
+    "update-models": "node scripts/update-models.mjs",
+    "prepublishOnly": "tsc --noEmit"
+  },
+  "dependencies": {
+    "@sap-ai-sdk/foundation-models": "^2.10.0",
+    "@sap-ai-sdk/orchestration": "^2.10.0"
+  },
+  "peerDependencies": {
+    "@earendil-works/pi-ai": "*",
+    "@earendil-works/pi-coding-agent": "*"
+  },
+  "devDependencies": {
+    "@earendil-works/pi-ai": "^0.78.0",
+    "@earendil-works/pi-coding-agent": "^0.78.0",
+    "typescript": "^5.6.0"
+  },
+  "engines": {
+    "node": ">=20"
+  }
+}

package/scripts/diagnose-streaming.mjs ADDED

@@ -0,0 +1,99 @@
+// THROWAWAY DIAGNOSTIC — safe to delete. Not wired into the extension.
+//
+// Proves the SAP-side hypothesis behind the gpt-5.5 fallback: that SAP AI Core
+// *orchestration* refuses to STREAM the model (400 "Streaming is not supported
+// for this model") even though the same model answers fine NON-streaming via
+// chatCompletion(). If the blocking call below succeeds, the auto-detect
+// fallback in src/stream.ts is the correct fix.
+//
+// Usage (makes ONE real, billed call per leg):
+//   AICORE_SERVICE_KEY='<your service-key JSON>' node scripts/diagnose-streaming.mjs
+//   # optional: AICORE_RESOURCE_GROUP=<group>   MODEL=gpt-5.5
+//
+// Run it from the repo root so Node resolves @sap-ai-sdk/orchestration from
+// this project's node_modules.
+import { OrchestrationClient } from "@sap-ai-sdk/orchestration";
+const MODEL = process.env.MODEL ?? "gpt-5.5";
+const raw = process.env.AICORE_SERVICE_KEY;
+if (!raw) {
+	console.error(
+		"Set AICORE_SERVICE_KEY to your SAP BTP service-key JSON, e.g.\n" +
+			"  AICORE_SERVICE_KEY='{...}' node scripts/diagnose-streaming.mjs",
+	);
+	process.exit(2);
+}
+// Mirror the extension's resource-group precedence: env override, then a
+// non-standard `resourceGroup` baked into the key, else SAP's "default".
+let resourceGroup = process.env.AICORE_RESOURCE_GROUP?.trim() || undefined;
+if (!resourceGroup) {
+	try {
+		const parsed = JSON.parse(raw);
+		if (typeof parsed?.resourceGroup === "string") {
+			resourceGroup = parsed.resourceGroup;
+		}
+	} catch {
+		// The SDK validates the key shape itself; ignore parse noise here.
+	}
+}
+function makeClient() {
+	return new OrchestrationClient(
+		{
+			promptTemplating: {
+				model: { name: MODEL, params: { max_tokens: 64 } },
+				prompt: { template: [] },
+			},
+		},
+		resourceGroup ? { resourceGroup } : undefined,
+	);
+}
+const messages = [{ role: "user", content: "Reply with exactly: pong" }];
+console.log(`Model: ${MODEL}  resourceGroup: ${resourceGroup ?? "(default)"}\n`);
+// Leg 1: streaming — expected to FAIL for a streaming-gated model like gpt-5.5.
+console.log("[1/2] client.stream() ...");
+try {
+	const response = await makeClient().stream({ messages }, undefined, {
+		promptTemplating: { include_usage: true },
+	});
+	let text = "";
+	for await (const chunk of response.stream) {
+		text += chunk.getDeltaContent() ?? "";
+	}
+	console.log(`  STREAMING OK — got: ${JSON.stringify(text)}`);
+	console.log("  → This model already streams via orchestration; no fallback needed.\n");
+} catch (error) {
+	const msg = error?.response?.data
+		? JSON.stringify(error.response.data)
+		: (error?.message ?? String(error));
+	const isStreamGate = /streaming is not supported/i.test(
+		`${error?.message ?? ""} ${msg}`,
+	);
+	console.log(`  STREAMING FAILED — ${msg}`);
+	console.log(
+		isStreamGate
+			? "  → Confirms the streaming-gate. Checking non-streaming next.\n"
+			: "  → Different failure (not the streaming gate). Read the message above.\n",
+	);
+}
+// Leg 2: non-streaming — expected to SUCCEED, proving the fallback is valid.
+console.log("[2/2] client.chatCompletion() ...");
+try {
+	const response = await makeClient().chatCompletion({ messages });
+	console.log(`  NON-STREAMING OK — got: ${JSON.stringify(response.getContent())}`);
+	console.log("  → Fallback is valid: the extension's auto-detect path will work.\n");
+} catch (error) {
+	const msg = error?.response?.data
+		? JSON.stringify(error.response.data)
+		: (error?.message ?? String(error));
+	console.log(`  NON-STREAMING FAILED — ${msg}`);
+	console.log("  → This model is broken via orchestration entirely (not just streaming).\n");
+	process.exit(1);
+}
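
Reviewer note: the script above tests the hypothesis behind the auto-detect fallback that its header comment attributes to `src/stream.ts`. A minimal sketch of such a fallback follows, assuming hypothetical `StreamFn` and `CompleteFn` callbacks in place of the SAP SDK client. It only inspects `error.message`, whereas the script also checks `error.response.data`. It illustrates the technique and is not the extension's implementation.

```typescript
// Hypothetical callbacks standing in for client.stream() / client.chatCompletion().
type StreamFn = (prompt: string) => AsyncIterable<string>;
type CompleteFn = (prompt: string) => Promise<string>;

// Same pattern the diagnostic script uses to recognise the streaming gate.
function isStreamingGate(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return /streaming is not supported/i.test(message);
}

async function* streamWithFallback(
  prompt: string,
  stream: StreamFn,
  complete: CompleteFn,
): AsyncGenerator<string> {
  let emitted = false;
  try {
    for await (const delta of stream(prompt)) {
      emitted = true;
      yield delta;
    }
    return; // streaming worked, nothing else to do
  } catch (error) {
    // Only fall back when nothing was emitted yet and the failure is the gate;
    // any other error (auth, quota, mid-stream failure) is rethrown untouched.
    if (emitted || !isStreamingGate(error)) throw error;
  }
  // Non-streaming fallback: deliver the whole answer as a single chunk.
  yield await complete(prompt);
}
```

Refusing to fall back once a delta has been emitted avoids sending the consumer a partial streamed answer followed by a full duplicate.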

package/scripts/list-sap-models.mjs ADDED

@@ -0,0 +1,92 @@
+// DIAGNOSTIC — lists the models your SAP AI Core tenant *actually* deploys.
+//
+// Hits the authoritative endpoint GET /v2/lm/scenarios/foundation-models/models
+// (SDK: ScenarioApi.scenarioQueryModels) and diffs it against
+// src/models-snapshot.json. This is the ground truth that models.dev's catalog
+// only approximates — use it to spot phantom models (in the snapshot but not in
+// the tenant, e.g. gpt-5.5) and missing ones (deployed but absent from our
+// snapshot, e.g. gpt-5.2 / gpt-5.4-nano).
+//
+// Usage (one read-only, unbilled API call):
+//   AICORE_SERVICE_KEY='<your service-key JSON>' node scripts/list-sap-models.mjs
+//   # optional: AICORE_RESOURCE_GROUP=<group>
+//
+// Run from the repo root so Node resolves @sap-ai-sdk/ai-api from node_modules.
+import { readFileSync } from "node:fs";
+import { fileURLToPath } from "node:url";
+import { dirname, join } from "node:path";
+import { ScenarioApi } from "@sap-ai-sdk/ai-api";
+const raw = process.env.AICORE_SERVICE_KEY;
+if (!raw) {
+	console.error(
+		"Set AICORE_SERVICE_KEY to your SAP BTP service-key JSON, e.g.\n" +
+			"  AICORE_SERVICE_KEY='{...}' node scripts/list-sap-models.mjs",
+	);
+	process.exit(2);
+}
+// Resource-group precedence mirrors the extension and diagnose-streaming.mjs:
+// env override, then a non-standard `resourceGroup` baked into the key, else
+// SAP's "default".
+let resourceGroup = process.env.AICORE_RESOURCE_GROUP?.trim() || undefined;
+if (!resourceGroup) {
+	try {
+		const parsed = JSON.parse(raw);
+		if (typeof parsed?.resourceGroup === "string") {
+			resourceGroup = parsed.resourceGroup;
+		}
+	} catch {
+		// The SDK validates the key shape itself; ignore parse noise here.
+	}
+}
+resourceGroup ??= "default";
+function snapshotIds() {
+	const path = join(
+		dirname(fileURLToPath(import.meta.url)),
+		"..",
+		"src",
+		"models-snapshot.json",
+	);
+	const parsed = JSON.parse(readFileSync(path, "utf8"));
+	return new Set((parsed.models ?? []).map((m) => m.id));
+}
+console.log(
+	`Querying foundation-models scenario  resourceGroup: ${resourceGroup}\n`,
+);
+const response = await ScenarioApi.scenarioQueryModels("foundation-models", {
+	"AI-Resource-Group": resourceGroup,
+}).execute();
+const resources = response?.resources ?? [];
+const tenant = new Set(resources.map((r) => r.model));
+const tenantSorted = [...tenant].sort();
+console.log(`Tenant reports ${response?.count ?? resources.length} models:\n`);
+for (const r of [...resources].sort((a, b) => a.model.localeCompare(b.model))) {
+	const extras = [r.provider, r.accessType].filter(Boolean).join(", ");
+	console.log(`  ${r.model}${extras ? `  (${extras})` : ""}`);
+}
+console.log("\n--- gpt-5.5 specifically ---");
+console.log(
+	tenant.has("gpt-5.5")
+		? "  PRESENT — SAP does deploy gpt-5.5 after all."
+		: "  ABSENT — gpt-5.5 is not in the tenant's model list (matches the 400).",
+);
+const snap = snapshotIds();
+const phantom = [...snap].filter((id) => !tenant.has(id)).sort();
+const missing = tenantSorted.filter((id) => !snap.has(id));
+console.log("\n--- snapshot vs. tenant ---");
+console.log(
+	`  PHANTOM (in snapshot, NOT in tenant → will 400): ${phantom.length ? phantom.join(", ") : "none"}`,
+);
+console.log(
+	`  MISSING (in tenant, NOT in snapshot → unselectable): ${missing.length ? missing.join(", ") : "none"}`,
+);
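
Reviewer note: both diagnostic scripts repeat the same resource-group precedence inline (env override, then a `resourceGroup` field on the key, else `default`). A sketch of that precedence as one pure function is shown below. `resolveResourceGroup` is a hypothetical name and is not part of the package. Unlike the scripts, this version also treats an empty or whitespace-only `resourceGroup` in the key as unset, which is an assumption about the intended behaviour.

```typescript
// Hypothetical helper mirroring the precedence used by the two scripts.
function resolveResourceGroup(
  envValue: string | undefined,
  rawServiceKey: string | undefined,
): string {
  // 1. Per-shell override (AICORE_RESOURCE_GROUP), ignoring blank values.
  const fromEnv = envValue?.trim();
  if (fromEnv) return fromEnv;

  // 2. Non-standard `resourceGroup` field baked into the service-key JSON.
  if (rawServiceKey) {
    try {
      const parsed: unknown = JSON.parse(rawServiceKey);
      const candidate = (parsed as { resourceGroup?: unknown } | null)?.resourceGroup;
      if (typeof candidate === "string" && candidate.trim() !== "") {
        return candidate.trim();
      }
    } catch {
      // Malformed JSON is reported elsewhere; fall through to the default.
    }
  }

  // 3. SAP's default resource group.
  return "default";
}
```

A caller would pass `process.env.AICORE_RESOURCE_GROUP` and the raw key string. Keeping the function free of global state lets the three branches be checked without touching the environment.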