pi-sap-aicore 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Tim Pearson
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,296 @@
1
+ # pi-sap-aicore
2
+
3
+ A custom provider extension for the [pi coding agent](https://pi.dev) that routes
4
+ inference through **SAP AI Core** — via the **orchestration** service (every model
5
+ from a single deployment) and/or **direct foundation deployments** (per-model
6
+ Azure OpenAI endpoints with native streaming). Both register at once and share one
7
+ login, so you pick the route per model. See
8
+ [Orchestration vs. Foundation](#orchestration-vs-foundation).
9
+
10
+ ## Prerequisites
11
+
12
+ - pi **0.78.0 or newer** installed (`npm install -g @earendil-works/pi-coding-agent`)
13
+ - An SAP BTP account with AI Core entitlement and an **orchestration deployment**
14
+ - *(optional, for the foundation provider)* one or more **foundation-models
15
+ deployments** — one per OpenAI model you want to route directly
16
+ - The service key JSON for your AI Core service binding
17
+
18
+ ## Credentials
19
+
20
+ The extension looks for the SAP BTP service-key JSON in this order:
21
+
22
+ 1. **Pi's auth store** — `~/.pi/agent/auth.json`, populated by `/login` (see below).
23
+ Persisted across sessions, file-permission-locked by pi.
24
+ 2. **`AICORE_SERVICE_KEY` environment variable** — per-shell override. Useful for
25
+ testing against a different tenant for one session without re-running `/login`.
26
+
27
+ If neither is present, inference fails with a clear "no service key configured" error.
28
+
29
+ Both providers — `sap-aicore` (orchestration) and `sap-aicore-foundation` — use the
30
+ **same** service key, so a single `/login` (or one `AICORE_SERVICE_KEY`) covers
31
+ both. pi keys stored credentials per provider, so the foundation provider reads the
32
+ shared login from pi's auth store directly; you never log in twice.
33
+
34
+ ### Recommended: `/login`
35
+
36
+ From inside pi:
37
+
38
+ ```
39
+ /login
40
+ ```
41
+
42
+ Then:
43
+ 1. Pick **Use a subscription**.
44
+ 2. Pick **SAP AI Core**.
45
+ 3. At the prompt, paste your BTP service-key JSON as a single line and press Enter.
46
+ It's validated immediately — if anything is missing or malformed, you'll get a
47
+ specific error pointing at the field, so you can re-run `/login` and fix it.
48
+
49
+ Pi stores the JSON in `~/.pi/agent/auth.json` for future sessions.
50
+
51
+ To get the JSON: BTP cockpit → your AI Core service instance → Service Keys
52
+ → View. Copy the entire JSON object.
53
+
54
+ > **Why "Use a subscription" and not "Use an API key"?** SAP service keys contain
55
+ > a `$` in their `clientsecret`. Since pi 0.77, keys stored via "Use an API key"
56
+ > are run through a `$`-interpolating template resolver that mangles them. The
57
+ > extension registers credentials through pi's `oauth` mechanism instead, which
58
+ > stores and returns the key verbatim. It's not real OAuth — it's just the path
59
+ > that keeps your key intact. (See [pi issue #5095](https://github.com/earendil-works/pi/issues/5095).)
60
+
61
+ > **Upgrading from an older install?** If you previously logged in via
62
+ > "Use an API key" (stored as `{"type":"api_key"}` in `auth.json`), re-run
63
+ > `/login` **once** via **Use a subscription** to convert the stored credential.
64
+ > A single re-login is all that's needed.
65
+
66
+ ### Alternative: `AICORE_SERVICE_KEY` env var
67
+
68
+ ```bash
69
+ export AICORE_SERVICE_KEY='{"clientid":"...","clientsecret":"...","url":"https://...authentication.sap.hana.ondemand.com","serviceurls":{"AI_API_URL":"https://api.ai.<region>.ml.hana.ondemand.com"}}'
70
+ ```
71
+
72
+ The `@sap-ai-sdk/orchestration` SDK reads this directly for XSUAA auth, token
73
+ caching, and deployment resolution — no manual token plumbing needed.
74
+
75
+ ## Install
76
+
77
+ ### From npm (recommended)
78
+
79
+ ```bash
80
+ pi install npm:pi-sap-aicore
81
+ ```
82
+
83
+ pi downloads the package under `~/.pi/agent/npm/`, runs `npm install` to pull the
84
+ SAP AI SDKs, and auto-loads the extension on every startup. Run the one command on
85
+ each machine; `pi update` keeps it current. Pin a version with
86
+ `pi install npm:pi-sap-aicore@<version>` (pinned specs are skipped by `pi update`).
87
+
88
+ Then configure credentials with `/login` (see [Credentials](#credentials)) and
89
+ confirm the models are visible:
90
+
91
+ ```bash
92
+ pi --list-models | grep sap-aicore
93
+ ```
94
+
95
+ ### Local development (this repo)
96
+
97
+ ```bash
98
+ npm install
99
+ pi -e ./index.ts --list-models
100
+ ```
101
+
102
+ You'll see the orchestration models under `sap-aicore/` (Claude, GPT-5*, Gemini),
103
+ plus any direct **foundation** models under `sap-aicore-foundation/`:
104
+ - `sap-aicore/anthropic--claude-4.7-opus` — Claude Opus 4.7 (orchestration)
105
+ - `sap-aicore/gpt-5.5` — GPT-5.5 via orchestration
106
+ - `sap-aicore-foundation/gpt-5.5` — GPT-5.5 via its direct foundation deployment
107
+
108
+ Run `pi -e ./index.ts` to launch pi with the local extension loaded; this
109
+ overrides any globally-installed version for the session, which is the fastest
110
+ iteration loop while developing.
111
+
112
+ ### Alternative: install from git
113
+
114
+ For an unpublished fork or a branch you want to track directly:
115
+
116
+ ```bash
117
+ pi install git:github.com/ttiimmaahh/pi-sap-aicore@main
118
+ ```
119
+
120
+ pi clones to `~/.pi/agent/git/…`, runs `npm install`, and auto-loads on startup.
121
+ Note: an `@main` git install is **not** moved to newer commits by `pi update` (it
122
+ only reconciles to the pinned ref) — prefer the npm install above for hands-off
123
+ updates.
124
+
125
+ ## Orchestration vs. Foundation
126
+
127
+ The extension registers **two providers**, both backed by the same service key:
128
+
129
+ | | `sap-aicore` (orchestration) | `sap-aicore-foundation` (direct) |
130
+ |---|---|---|
131
+ | SAP deployment | one orchestration deployment fronts **every** model | one foundation deployment **per model** |
132
+ | Models | Claude, GPT-5*, Gemini | OpenAI (`gpt-*`) only |
133
+ | Streaming | subject to orchestration's per-model allow-list — new models can 400 `Streaming is not supported` (we fall back to non-streaming) | **native** — streams straight from the Azure OpenAI endpoint |
134
+ | Reasoning effort | tunable (`reasoning_effort` / `thinking`) | model **default** only (SDK pins Azure API `2024-10-21`, which has no `reasoning_effort`) |
135
+ | Content filter / grounding / templating | yes | no — raw model access |
136
+ | SDK | `@sap-ai-sdk/orchestration` | `@sap-ai-sdk/foundation-models` (`AzureOpenAiChatClient`) |
137
+
138
+ Both routes appear in the model list simultaneously, so you choose per model. The
139
+ foundation route exists mainly to get **native streaming** for new OpenAI models
140
+ that orchestration hasn't enabled streaming for yet (e.g. `gpt-5.5`).
141
+
142
+ **Adding a foundation model:** it needs its own foundation-models deployment in
143
+ SAP AI Core — one per (model, version, resource group); the SDK resolves it by
144
+ model name, so no deployment IDs to wire in. Then add its `id` to
145
+ `FOUNDATION_MODEL_IDS` in [`src/models-config.ts`](./src/models-config.ts)
146
+ (definitions are reused from the shared snapshot). An id with no matching
147
+ deployment 404s at call time. Run `node scripts/list-sap-models.mjs` to see what
148
+ your tenant actually deploys.
149
+
150
+ ## Models
151
+
152
+ The model list is composed of two sources, merged at startup:
153
+
154
+ 1. **`src/models-snapshot.json`** — auto-generated from
155
+ [models.dev](https://models.dev)'s SAP AI Core catalog. Refresh with:
156
+ ```bash
157
+ npm run update-models
158
+ ```
159
+ This re-fetches the live catalog, applies our family-specific filters
160
+ (currently anthropic claude-4.x, gpt-5*, gemini-2.5*), and writes the
161
+ snapshot to disk. Commit the result.
162
+
163
+ 2. **`TENANT_EXTRAS` in [`src/models-config.ts`](./src/models-config.ts)** —
164
+ hand-maintained list of models that exist in your SAP tenant but
165
+ aren't (yet) in the models.dev catalog. Same `SapModel` shape. Extras
166
+ win over snapshot on duplicate `id`.
167
+
168
+ To add a model that everyone on your team should see, add it to
169
+ `TENANT_EXTRAS` and commit. To add a per-machine custom model (your own tenant
170
+ only), use pi's built-in custom-models mechanism by editing
171
+ `~/.pi/agent/models.json` — no extension changes required.
172
+
173
+ The `cost` fields are vendor list prices (USD per million tokens) from
174
+ models.dev. They are used **only** for pi's in-UI cost display; your actual SAP
175
+ BTP invoice is contract-based and will differ.
176
+
177
+ ## Thinking levels
178
+
179
+ Models with `reasoning: true` honor pi's thinking-level cycle (default
180
+ keybind `Shift+Tab`): `off`, `minimal`, `low`, `medium`, `high`, `xhigh`.
181
+
182
+ - **Anthropic 4.6+ models** (`anthropic--claude-4.6-*`, `4.7-*`) use
183
+ *adaptive* thinking — `thinking: {type: "adaptive"}` + `output_config:
184
+ {effort}`. The model decides the budget; the level only nudges depth.
185
+ - **Older Anthropic models** (`anthropic--claude-4-*`, `4.5-*`) use
186
+ *budget-token* thinking — `thinking: {type: "enabled", budget_tokens: N}`.
187
+ Each pi level maps to a token count (1k / 4k / 8k / 16k / 32k for
188
+ minimal/low/medium/high/xhigh), clamped down so `max_tokens` always
189
+ has at least 1024 tokens of headroom for the response. SAP rejects
190
+ the adaptive shape on these models ("adaptive thinking is not
191
+ supported on this model"), which is why we split.
192
+
193
+ **Note on reasoning visibility:** SAP orchestration does NOT pass
194
+ structured reasoning/thinking content through to streaming clients.
195
+ The model genuinely reasons (you'll see step-by-step structure leak
196
+ into the visible answer text, and the tokens are billed via
197
+ `completion_tokens_details.reasoning_tokens`), but pi's dedicated
198
+ "thinking" panel will stay empty for SAP-routed models — there's no
199
+ client-side fix. If SAP exposes a server-side flag for this in the
200
+ future, our `pickReasoning` probe is wired and ready in `stream.ts`.
201
+ - **OpenAI models** (`gpt-*`) use `reasoning_effort: "minimal" | "low"
202
+ | "medium" | "high"`. `xhigh` is omitted — OpenAI has no equivalent
203
+ tier; pi will skip it when cycling.
204
+ - **Gemini models** (`gemini-2.5-*`) ship with `reasoning: false` —
205
+ SAP's gemini reasoning passthrough is undocumented, so we keep
206
+ `Shift+Tab` off the cycle for these models rather than send a request
207
+ shape SAP may reject. Wire-up (likely `thinking_config.thinking_budget`)
208
+ is a future TODO in `src/stream.ts:reasoningParams`.
209
+
210
+ **Foundation route caveat:** on `sap-aicore-foundation/*` the direct Azure
211
+ OpenAI SDK pins API version `2024-10-21`, which has no `reasoning_effort`
212
+ field, so gpt-5\* models reason at their **default** effort and pi's thinking-level
213
+ cycle is a no-op there. The models still reason (reasoning tokens are billed
214
+ and show in `output`); the depth just isn't tunable. Use the orchestration
215
+ route (`sap-aicore/*`) when you need to set the effort level.
216
+
217
+ To override budgets per model, edit `thinkingLevelMap` on the relevant
218
+ entry in `TENANT_EXTRAS`, or override per-user via pi's `models.json`.
219
+
220
+ ## AI Resource Group
221
+
222
+ Resolved in this order:
223
+
224
+ 1. **`AICORE_RESOURCE_GROUP` env var** — per-shell override. Example:
225
+ ```bash
226
+ export AICORE_RESOURCE_GROUP=my-team-rg
227
+ ```
228
+ 2. **`resourceGroup` field on the service-key JSON** — convenient for teams
229
+ who manage multiple groups and want to bake the default into the key.
230
+ Non-standard, so add it yourself before pasting into pi:
231
+ ```json
232
+ { "clientid": "...", "clientsecret": "...", "resourceGroup": "my-team-rg", ... }
233
+ ```
234
+ 3. **SAP's server-side default** (`default`) — if neither of the above is set.
235
+
236
+ The value is passed via SAP's `OrchestrationClient(..., {resourceGroup})`
237
+ constructor arg, which is the only supported channel — `AI-Resource-Group`
238
+ as a request header is explicitly rejected by SAP's typings. The foundation
239
+ provider applies the same resolved group via
240
+ `AzureOpenAiChatClient({ modelName, resourceGroup })`; both a model's foundation
241
+ deployment and the orchestration deployment must live in the resolved group for
242
+ name-based resolution to find them.
243
+
244
+ ## Prompt caching & cost reporting
245
+
246
+ **Cache read/write tokens always report 0** on orchestration-routed turns. SAP
247
+ orchestration strips all detail fields from the TokenUsage response,
248
+ so we only get `prompt_tokens`, `completion_tokens`, and `total_tokens`
249
+ for every orchestration model. There's no `prompt_tokens_details.cached_tokens`
250
+ (OpenAI) and no top-level `cache_read_input_tokens` (Anthropic) for
251
+ the client to read.
252
+
253
+ Whether the backend actually caches is invisible to pi. SAP's
254
+ contract billing may give you a discount on cached tokens that this
255
+ extension can't surface — check your BTP invoice if cache savings
256
+ matter.
257
+
258
+ **Experimental:** `PI_SAP_AICORE_CACHE_CONTROL=1` tags the system
259
+ prompt and last user message with Anthropic's `cache_control:
260
+ {type:"ephemeral"}`. SAP may forward it (saving SAP money on the
261
+ backend, possibly passed through via your contract) or may 400 the
262
+ request. Either way, you won't see cacheRead become non-zero in pi's
263
+ diagnostics — that requires SAP to expose detail fields, which they
264
+ currently don't.
265
+
266
+ OpenAI/Gemini routes ignore the flag — they have their own automatic
267
+ caching with no breakpoint API.
268
+
269
+ **Foundation route:** because it talks to the Azure OpenAI endpoint directly
270
+ (not through orchestration's usage-stripping), `prompt_tokens_details.cached_tokens`
271
+ *may* come back populated — `mapUsage` reads it, so `cacheRead` could be non-zero
272
+ on `sap-aicore-foundation/*` turns where orchestration always reports 0. Unverified
273
+ against SAP's proxy; treat as best-effort.
274
+
275
+ ## Repo layout
276
+
277
+ ```
278
+ .
279
+ ├── package.json # pi-package manifest + deps + scripts
280
+ ├── tsconfig.json # editor support; pi runs the .ts directly
281
+ ├── index.ts # ExtensionAPI factory + registerProvider calls (both providers)
282
+ ├── scripts/
283
+ │ ├── update-models.mjs # fetches models.dev, writes models-snapshot.json
284
+ │ ├── list-sap-models.mjs # lists models your tenant actually deploys (diff vs snapshot)
285
+ │ └── diagnose-streaming.mjs # probes orchestration streaming support per model
286
+ └── src/
287
+ ├── auth.ts # service-key validation + pi oauth registration
288
+ ├── models-config.ts # loads snapshot, merges TENANT_EXTRAS, exposes FOUNDATION_MODELS
289
+ ├── models-snapshot.json # auto-generated from models.dev (committed)
290
+ ├── to-pi-model.ts # SapModel → pi's ProviderModelConfig mapper
291
+ ├── stream.ts # orchestration streamSimple adapter + shared helpers (auth, usage, errors)
292
+ ├── translate.ts # pi Context ↔ orchestration message shape
293
+ ├── foundation-params.ts # Azure OpenAI request params (max_completion_tokens, temperature gating)
294
+ ├── stream-foundation.ts # foundation streamSimple adapter (AzureOpenAiChatClient, native streaming)
295
+ └── translate-foundation.ts # pi Context ↔ Azure OpenAI message shape
296
+ ```
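Illustrative sketch, not part of the published package: the README's budget-token rule for older Anthropic models maps each pi thinking level to a token count and clamps it so that `max_tokens` keeps at least 1024 tokens for the response. The TypeScript below shows one way that rule could look. The names `LEVEL_BUDGETS`, `RESPONSE_HEADROOM`, and `clampThinkingBudget` are hypothetical, the README's "1k" is assumed to mean 1024, and returning `undefined` when no room remains is an assumed policy. The real logic lives in `src/stream.ts`, which this excerpt does not show.

```typescript
type ThinkingLevel = "off" | "minimal" | "low" | "medium" | "high" | "xhigh";

// Assumed values: the README lists "1k / 4k / 8k / 16k / 32k";
// each "k" is taken as 1024 tokens here.
const LEVEL_BUDGETS: Record<Exclude<ThinkingLevel, "off">, number> = {
  minimal: 1024,
  low: 4096,
  medium: 8192,
  high: 16384,
  xhigh: 32768,
};

// Tokens reserved for the visible response, per the README.
const RESPONSE_HEADROOM = 1024;

/**
 * Returns the thinking budget for a level, clamped so that
 * maxTokens - budget >= RESPONSE_HEADROOM. Returns undefined when
 * thinking is off or when no budget fits (assumed policy).
 */
function clampThinkingBudget(
  level: ThinkingLevel,
  maxTokens: number,
): number | undefined {
  if (level === "off") return undefined;
  const ceiling = maxTokens - RESPONSE_HEADROOM;
  if (ceiling <= 0) return undefined;
  return Math.min(LEVEL_BUDGETS[level], ceiling);
}
```

Under these assumptions, a model with `max_tokens` of 8192 would have `high` clamped from 16384 down to 7168, while `low` (4096) would pass through unchanged.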
package/index.ts ADDED
@@ -0,0 +1,68 @@
1
+ import type { Api } from "@earendil-works/pi-ai";
2
+ import type { ExtensionAPI } from "@earendil-works/pi-coding-agent";
3
+
4
+ import { sapAiCoreOAuth } from "./src/auth.ts";
5
+ import { FOUNDATION_MODELS, MODELS } from "./src/models-config.ts";
6
+ import { streamSapAiCore } from "./src/stream.ts";
7
+ import { streamSapFoundation } from "./src/stream-foundation.ts";
8
+ import { toPiModel } from "./src/to-pi-model.ts";
9
+
10
+ const PROVIDER_NAME = "sap-aicore";
11
+ const PROVIDER_API = "sap-aicore-orchestration" as Api;
12
+
13
+ // Second provider: direct foundation (Azure OpenAI) deployments, registered
14
+ // alongside orchestration so both routes are independently selectable
15
+ // (e.g. `sap-aicore/gpt-5.5` vs `sap-aicore-foundation/gpt-5.5`).
16
+ const FOUNDATION_PROVIDER_NAME = "sap-aicore-foundation";
17
+ const FOUNDATION_PROVIDER_API = "sap-aicore-foundation" as Api;
18
+
19
+ // pi requires a non-empty `apiKey` for any custom provider that defines models
20
+ // (model-registry `validateConfig`), even when credentials come from `oauth`.
21
+ // This value is never used: the real key is supplied by `sapAiCoreOAuth`
22
+ // (after `/login`) or by AICORE_SERVICE_KEY (both handled in stream.ts). It is a
23
+ // plain lowercase literal so pi's config-value resolver returns it as-is — no
24
+ // `$` interpolation, no shell exec, and not mistaken for a legacy env-var name.
25
+ const PLACEHOLDER_API_KEY = "managed-by-extension-oauth";
26
+
27
+ export default function (pi: ExtensionAPI) {
28
+ pi.registerProvider(PROVIDER_NAME, {
29
+ name: "SAP AI Core",
30
+ baseUrl: "https://sap-aicore-handled-by-sdk.invalid",
31
+ apiKey: PLACEHOLDER_API_KEY,
32
+ api: PROVIDER_API,
33
+ // Credentials flow through pi's `oauth` path — its escape hatch from the
34
+ // $-interpolating config-value resolver that corrupts service keys
35
+ // containing `$` (SAP keys have one in `clientsecret`). `/login → Use a
36
+ // subscription → SAP AI Core` captures the service-key JSON; `getApiKey`
37
+ // returns it verbatim as `options.apiKey` to `streamSimple`.
38
+ oauth: sapAiCoreOAuth,
39
+ // Resource-group selection lives in stream.ts (passed to
40
+ // OrchestrationClient's deploymentConfig); SAP's typings reject
41
+ // it as a header (`'AI-Resource-Group'?: never`). A `headers`
42
+ // entry here would also be a no-op anyway — pi only forwards
43
+ // `headers` when it makes the HTTP request itself, but we use
44
+ // `streamSimple` and the SAP SDK handles transport.
45
+ models: MODELS.map((m) => toPiModel(m, PROVIDER_API)),
46
+ // Synchronous, as pi's provider contract requires. The SAP SDK is still
47
+ // deferred to first use — `stream.ts` only `import type`s it at module
48
+ // load and dynamically imports the OrchestrationClient inside the stream
49
+ // producer, surfacing a missing-dependency error through the stream.
50
+ streamSimple: streamSapAiCore,
51
+ });
52
+
53
+ // Foundation provider — shares the exact same credential. Both providers
54
+ // reference the same `sapAiCoreOAuth` (oauth name "SAP AI Core"), so a single
55
+ // `/login` serves both and the service key is never entered twice. Models
56
+ // appear under `sap-aicore-foundation/…`; streaming runs natively here (no
57
+ // orchestration streaming-unsupported fallback). The foundation SDK is
58
+ // dynamically imported inside `streamSapFoundation`, same deferral as above.
59
+ pi.registerProvider(FOUNDATION_PROVIDER_NAME, {
60
+ name: "SAP AI Core (Foundation)",
61
+ baseUrl: "https://sap-aicore-handled-by-sdk.invalid",
62
+ apiKey: PLACEHOLDER_API_KEY,
63
+ api: FOUNDATION_PROVIDER_API,
64
+ oauth: sapAiCoreOAuth,
65
+ models: FOUNDATION_MODELS.map((m) => toPiModel(m, FOUNDATION_PROVIDER_API)),
66
+ streamSimple: streamSapFoundation,
67
+ });
68
+ }
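Illustrative sketch, not part of the published package: `index.ts` consumes a ready-made `MODELS` list, and the README says that list is the snapshot merged with `TENANT_EXTRAS`, with extras winning on a duplicate `id`. The merge could be written roughly as below. `mergeModels` and the minimal `{ id }` shape are hypothetical stand-ins; the actual `src/models-config.ts` is not shown in this excerpt and may differ.

```typescript
/**
 * Merges two model lists by `id`. Entries from `extras` replace
 * snapshot entries with the same id; new ids are appended.
 */
function mergeModels<T extends { id: string }>(
  snapshot: readonly T[],
  extras: readonly T[],
): T[] {
  const byId = new Map<string, T>();
  for (const model of snapshot) byId.set(model.id, model);
  // A later set() on an existing key replaces the value but keeps
  // the key's original insertion position in the Map.
  for (const model of extras) byId.set(model.id, model);
  return Array.from(byId.values());
}

const merged = mergeModels(
  [
    { id: "a", name: "snapshot A" },
    { id: "b", name: "snapshot B" },
  ],
  [
    { id: "b", name: "extra B" },
    { id: "c", name: "extra C" },
  ],
);
```

Using a `Map` keeps the snapshot's ordering stable while letting a hand-maintained entry override a catalog entry in place.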
package/package.json ADDED
@@ -0,0 +1,40 @@
1
+ {
2
+ "name": "pi-sap-aicore",
3
+ "version": "0.1.0",
4
+ "description": "SAP AI Core (orchestration + foundation) provider for the pi coding agent",
5
+ "license": "MIT",
6
+ "author": "Tim Pearson (https://github.com/ttiimmaahh)",
7
+ "homepage": "https://github.com/ttiimmaahh/pi-sap-aicore#readme",
8
+ "repository": {
9
+ "type": "git",
10
+ "url": "git+https://github.com/ttiimmaahh/pi-sap-aicore.git"
11
+ },
12
+ "bugs": {
13
+ "url": "https://github.com/ttiimmaahh/pi-sap-aicore/issues"
14
+ },
15
+ "type": "module",
16
+ "keywords": ["pi-package"],
17
+ "pi": {
18
+ "extensions": ["./index.ts"]
19
+ },
20
+ "scripts": {
21
+ "update-models": "node scripts/update-models.mjs",
22
+ "prepublishOnly": "tsc --noEmit"
23
+ },
24
+ "dependencies": {
25
+ "@sap-ai-sdk/foundation-models": "^2.10.0",
26
+ "@sap-ai-sdk/orchestration": "^2.10.0"
27
+ },
28
+ "peerDependencies": {
29
+ "@earendil-works/pi-ai": "*",
30
+ "@earendil-works/pi-coding-agent": "*"
31
+ },
32
+ "devDependencies": {
33
+ "@earendil-works/pi-ai": "^0.78.0",
34
+ "@earendil-works/pi-coding-agent": "^0.78.0",
35
+ "typescript": "^5.6.0"
36
+ },
37
+ "engines": {
38
+ "node": ">=20"
39
+ }
40
+ }
package/scripts/diagnose-streaming.mjs ADDED
@@ -0,0 +1,99 @@
1
+ // THROWAWAY DIAGNOSTIC — safe to delete. Not wired into the extension.
2
+ //
3
+ // Proves the SAP-side hypothesis behind the gpt-5.5 fallback: that SAP AI Core
4
+ // *orchestration* refuses to STREAM the model (400 "Streaming is not supported
5
+ // for this model") even though the same model answers fine NON-streaming via
6
+ // chatCompletion(). If the blocking call below succeeds, the auto-detect
7
+ // fallback in src/stream.ts is the correct fix.
8
+ //
9
+ // Usage (makes ONE real, billed call per leg):
10
+ // AICORE_SERVICE_KEY='<your service-key JSON>' node scripts/diagnose-streaming.mjs
11
+ // # optional: AICORE_RESOURCE_GROUP=<group> MODEL=gpt-5.5
12
+ //
13
+ // Run it from the repo root so Node resolves @sap-ai-sdk/orchestration from
14
+ // this project's node_modules.
15
+
16
+ import { OrchestrationClient } from "@sap-ai-sdk/orchestration";
17
+
18
+ const MODEL = process.env.MODEL ?? "gpt-5.5";
19
+
20
+ const raw = process.env.AICORE_SERVICE_KEY;
21
+ if (!raw) {
22
+ console.error(
23
+ "Set AICORE_SERVICE_KEY to your SAP BTP service-key JSON, e.g.\n" +
24
+ " AICORE_SERVICE_KEY='{...}' node scripts/diagnose-streaming.mjs",
25
+ );
26
+ process.exit(2);
27
+ }
28
+
29
+ // Mirror the extension's resource-group precedence: env override, then a
30
+ // non-standard `resourceGroup` baked into the key, else SAP's "default".
31
+ let resourceGroup = process.env.AICORE_RESOURCE_GROUP?.trim() || undefined;
32
+ if (!resourceGroup) {
33
+ try {
34
+ const parsed = JSON.parse(raw);
35
+ if (typeof parsed?.resourceGroup === "string") {
36
+ resourceGroup = parsed.resourceGroup;
37
+ }
38
+ } catch {
39
+ // The SDK validates the key shape itself; ignore parse noise here.
40
+ }
41
+ }
42
+
43
+ function makeClient() {
44
+ return new OrchestrationClient(
45
+ {
46
+ promptTemplating: {
47
+ model: { name: MODEL, params: { max_tokens: 64 } },
48
+ prompt: { template: [] },
49
+ },
50
+ },
51
+ resourceGroup ? { resourceGroup } : undefined,
52
+ );
53
+ }
54
+
55
+ const messages = [{ role: "user", content: "Reply with exactly: pong" }];
56
+
57
+ console.log(`Model: ${MODEL} resourceGroup: ${resourceGroup ?? "(default)"}\n`);
58
+
59
+ // Leg 1: streaming — expected to FAIL for a streaming-gated model like gpt-5.5.
60
+ console.log("[1/2] client.stream() ...");
61
+ try {
62
+ const response = await makeClient().stream({ messages }, undefined, {
63
+ promptTemplating: { include_usage: true },
64
+ });
65
+ let text = "";
66
+ for await (const chunk of response.stream) {
67
+ text += chunk.getDeltaContent() ?? "";
68
+ }
69
+ console.log(` STREAMING OK — got: ${JSON.stringify(text)}`);
70
+ console.log(" → This model already streams via orchestration; no fallback needed.\n");
71
+ } catch (error) {
72
+ const msg = error?.response?.data
73
+ ? JSON.stringify(error.response.data)
74
+ : (error?.message ?? String(error));
75
+ const isStreamGate = /streaming is not supported/i.test(
76
+ `${error?.message ?? ""} ${msg}`,
77
+ );
78
+ console.log(` STREAMING FAILED — ${msg}`);
79
+ console.log(
80
+ isStreamGate
81
+ ? " → Confirms the streaming-gate. Checking non-streaming next.\n"
82
+ : " → Different failure (not the streaming gate). Read the message above.\n",
83
+ );
84
+ }
85
+
86
+ // Leg 2: non-streaming — expected to SUCCEED, proving the fallback is valid.
87
+ console.log("[2/2] client.chatCompletion() ...");
88
+ try {
89
+ const response = await makeClient().chatCompletion({ messages });
90
+ console.log(` NON-STREAMING OK — got: ${JSON.stringify(response.getContent())}`);
91
+ console.log(" → Fallback is valid: the extension's auto-detect path will work.\n");
92
+ } catch (error) {
93
+ const msg = error?.response?.data
94
+ ? JSON.stringify(error.response.data)
95
+ : (error?.message ?? String(error));
96
+ console.log(` NON-STREAMING FAILED — ${msg}`);
97
+ console.log(" → This model is broken via orchestration entirely (not just streaming).\n");
98
+ process.exit(1);
99
+ }
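Illustrative sketch, not part of the published package: the diagnostic above classifies a failure as the "streaming gate" by matching `streaming is not supported` against the error message and any response body, and its comments describe an auto-detect fallback in `src/stream.ts`. The TypeScript below shows how that classification and a fallback wrapper might be factored. `describeError`, `isStreamGate`, and `withStreamingFallback` are hypothetical names, and the wrapper is an assumption about the fallback's shape, since `src/stream.ts` is not shown here.

```typescript
const STREAM_GATE = /streaming is not supported/i;

/** Flattens an unknown error into searchable text (message plus response body). */
function describeError(error: unknown): string {
  const e = error as
    | { message?: unknown; response?: { data?: unknown } }
    | null
    | undefined;
  const message = typeof e?.message === "string" ? e.message : String(error);
  const data = e?.response?.data;
  return data ? `${message} ${JSON.stringify(data)}` : message;
}

/** True when the error looks like orchestration refusing to stream the model. */
function isStreamGate(error: unknown): boolean {
  return STREAM_GATE.test(describeError(error));
}

/**
 * Tries the streaming call first; on a streaming-gate error only,
 * retries with the blocking call. Other errors are rethrown.
 */
async function withStreamingFallback<T>(
  streamCall: () => Promise<T>,
  blockingCall: () => Promise<T>,
): Promise<T> {
  try {
    return await streamCall();
  } catch (error) {
    if (!isStreamGate(error)) throw error;
    return blockingCall();
  }
}
```

Rethrowing anything that is not the streaming gate keeps authentication or quota failures visible instead of masking them behind a second request.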
package/scripts/list-sap-models.mjs ADDED
@@ -0,0 +1,92 @@
1
+ // DIAGNOSTIC — lists the models your SAP AI Core tenant *actually* deploys.
2
+ //
3
+ // Hits the authoritative endpoint GET /v2/lm/scenarios/foundation-models/models
4
+ // (SDK: ScenarioApi.scenarioQueryModels) and diffs it against
5
+ // src/models-snapshot.json. This is the ground truth that models.dev's catalog
6
+ // only approximates — use it to spot phantom models (in the snapshot but not in
7
+ // the tenant, e.g. gpt-5.5) and missing ones (deployed but absent from our
8
+ // snapshot, e.g. gpt-5.2 / gpt-5.4-nano).
9
+ //
10
+ // Usage (one read-only, unbilled API call):
11
+ // AICORE_SERVICE_KEY='<your service-key JSON>' node scripts/list-sap-models.mjs
12
+ // # optional: AICORE_RESOURCE_GROUP=<group>
13
+ //
14
+ // Run from the repo root so Node resolves @sap-ai-sdk/ai-api from node_modules.
15
+
16
+ import { readFileSync } from "node:fs";
17
+ import { fileURLToPath } from "node:url";
18
+ import { dirname, join } from "node:path";
19
+ import { ScenarioApi } from "@sap-ai-sdk/ai-api";
20
+
21
+ const raw = process.env.AICORE_SERVICE_KEY;
22
+ if (!raw) {
23
+ console.error(
24
+ "Set AICORE_SERVICE_KEY to your SAP BTP service-key JSON, e.g.\n" +
25
+ " AICORE_SERVICE_KEY='{...}' node scripts/list-sap-models.mjs",
26
+ );
27
+ process.exit(2);
28
+ }
29
+
30
+ // Resource-group precedence mirrors the extension and diagnose-streaming.mjs:
31
+ // env override, then a non-standard `resourceGroup` baked into the key, else
32
+ // SAP's "default".
33
+ let resourceGroup = process.env.AICORE_RESOURCE_GROUP?.trim() || undefined;
34
+ if (!resourceGroup) {
35
+ try {
36
+ const parsed = JSON.parse(raw);
37
+ if (typeof parsed?.resourceGroup === "string") {
38
+ resourceGroup = parsed.resourceGroup;
39
+ }
40
+ } catch {
41
+ // The SDK validates the key shape itself; ignore parse noise here.
42
+ }
43
+ }
44
+ resourceGroup ??= "default";
45
+
46
+ function snapshotIds() {
47
+ const path = join(
48
+ dirname(fileURLToPath(import.meta.url)),
49
+ "..",
50
+ "src",
51
+ "models-snapshot.json",
52
+ );
53
+ const parsed = JSON.parse(readFileSync(path, "utf8"));
54
+ return new Set((parsed.models ?? []).map((m) => m.id));
55
+ }
56
+
57
+ console.log(
58
+ `Querying foundation-models scenario resourceGroup: ${resourceGroup}\n`,
59
+ );
60
+
61
+ const response = await ScenarioApi.scenarioQueryModels("foundation-models", {
62
+ "AI-Resource-Group": resourceGroup,
63
+ }).execute();
64
+
65
+ const resources = response?.resources ?? [];
66
+ const tenant = new Set(resources.map((r) => r.model));
67
+ const tenantSorted = [...tenant].sort();
68
+
69
+ console.log(`Tenant reports ${response?.count ?? resources.length} models:\n`);
70
+ for (const r of [...resources].sort((a, b) => a.model.localeCompare(b.model))) {
71
+ const extras = [r.provider, r.accessType].filter(Boolean).join(", ");
72
+ console.log(` ${r.model}${extras ? ` (${extras})` : ""}`);
73
+ }
74
+
75
+ console.log("\n--- gpt-5.5 specifically ---");
76
+ console.log(
77
+ tenant.has("gpt-5.5")
78
+ ? " PRESENT — SAP does deploy gpt-5.5 after all."
79
+ : " ABSENT — gpt-5.5 is not in the tenant's model list (matches the 400).",
80
+ );
81
+
82
+ const snap = snapshotIds();
83
+ const phantom = [...snap].filter((id) => !tenant.has(id)).sort();
84
+ const missing = tenantSorted.filter((id) => !snap.has(id));
85
+
86
+ console.log("\n--- snapshot vs. tenant ---");
87
+ console.log(
88
+ ` PHANTOM (in snapshot, NOT in tenant → will 400): ${phantom.length ? phantom.join(", ") : "none"}`,
89
+ );
90
+ console.log(
91
+ ` MISSING (in tenant, NOT in snapshot → unselectable): ${missing.length ? missing.join(", ") : "none"}`,
92
+ );
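Illustrative sketch, not part of the published package: the phantom/missing comparison at the end of the script is a pair of set differences, and it can be isolated as a pure function for reuse or testing. `CatalogDiff` and `diffCatalog` are hypothetical names, and the sample ids are taken from the script's own comments rather than from any real tenant response.

```typescript
interface CatalogDiff {
  /** In the snapshot but not deployed in the tenant. */
  phantom: string[];
  /** Deployed in the tenant but absent from the snapshot. */
  missing: string[];
}

/** Computes both set differences between snapshot ids and tenant ids, sorted. */
function diffCatalog(
  snapshotIds: Iterable<string>,
  tenantIds: Iterable<string>,
): CatalogDiff {
  const snap = new Set(snapshotIds);
  const tenant = new Set(tenantIds);
  const phantom = Array.from(snap).filter((id) => !tenant.has(id)).sort();
  const missing = Array.from(tenant).filter((id) => !snap.has(id)).sort();
  return { phantom, missing };
}

const diff = diffCatalog(
  ["gpt-5", "gpt-5.5"],
  ["gpt-5", "gpt-5.2", "gpt-5.4-nano"],
);
```

With these sample inputs, `gpt-5.5` lands in `phantom` and `gpt-5.2` and `gpt-5.4-nano` land in `missing`, mirroring the examples named in the script's header comment.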