@oh-my-pi/pi-ai 15.13.0 → 15.13.1
- package/CHANGELOG.md +294 -55
- package/dist/types/auth-broker/wire-schemas.d.ts +19 -17
- package/dist/types/auth-storage.d.ts +1 -1
- package/dist/types/provider-details.d.ts +1 -1
- package/package.json +3 -3
- package/src/auth-broker/wire-schemas.ts +14 -4
- package/src/auth-storage.ts +1 -1
- package/src/provider-details.ts +1 -1
package/CHANGELOG.md
CHANGED
|
@@ -2,11 +2,17 @@
|
|
|
2
2
|
|
|
3
3
|
## [Unreleased]
|
|
4
4
|
|
|
5
|
+
## [15.13.1] - 2026-06-15
|
|
6
|
+
|
|
7
|
+
### Fixed
|
|
8
|
+
|
|
9
|
+
- Fixed the auth-broker (`OMP_AUTH_BROKER_URL`) rejecting OAuth credentials that carry provider-specific extension fields (e.g. an MCP server's `tokenUrl`/`clientId`/`clientSecret`/`resource` embedded for self-contained token refresh): the OAuth credential wire schema was `.strict()`, so `POST /v1/credential` failed with `400 unrecognized_keys` and a broker-backed MCP reauth reported success while the reloaded credential lacked its refresh material and could no longer refresh. The OAuth wire schema now uses `.loose()` to preserve unknown fields — matching the field-preserving local SQLite store — so extra OAuth fields round-trip through broker set->get (envelope and API-key schemas stay strict).
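The difference between a strict and a loose object schema described in this entry can be sketched without any schema library. This is a minimal illustration only: `parseCredential`, `KNOWN_KEYS`, and the `access`/`expires` field names are hypothetical stand-ins, not the package's actual wire schema.

```typescript
// Hypothetical stand-in for an OAuth credential; the field names are assumptions.
type OAuthCredential = {
  access: string;
  refresh: string;
  expires: number;
} & Record<string, unknown>;

type ParseResult =
  | { ok: true; value: OAuthCredential }
  | { ok: false; error: string };

const KNOWN_KEYS = ["access", "refresh", "expires"];

// "strict" rejects unknown keys; "loose" keeps them so they survive a set->get round trip.
function parseCredential(
  input: Record<string, unknown>,
  mode: "strict" | "loose",
): ParseResult {
  const { access, refresh, expires } = input;
  if (
    typeof access !== "string" ||
    typeof refresh !== "string" ||
    typeof expires !== "number"
  ) {
    return { ok: false, error: "invalid_type" };
  }
  const extras = Object.keys(input).filter((key) => !KNOWN_KEYS.includes(key));
  if (mode === "strict" && extras.length > 0) {
    return { ok: false, error: `unrecognized_keys: ${extras.join(", ")}` };
  }
  // Unknown fields are copied through unchanged in loose mode.
  return { ok: true, value: { ...input, access, refresh, expires } };
}
```

Under these assumptions, a credential carrying `tokenUrl` or `clientId` fails in strict mode and keeps both fields in loose mode, which is the round-trip behavior the entry describes.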
|
|
10
|
+
|
|
5
11
|
## [15.13.0] - 2026-06-14
|
|
6
12
|
|
|
7
13
|
### Fixed
|
|
8
|
-
- Fixed OpenAI Responses/Realtime SSE stream handler crashing with "Error Code undefined: undefined" when parsing error events with nested error details by falling back to the nested error object fields.
|
|
9
14
|
|
|
15
|
+
- Fixed OpenAI Responses/Realtime SSE stream handler crashing with "Error Code undefined: undefined" when parsing error events with nested error details by falling back to the nested error object fields.
|
|
10
16
|
- Fixed OpenAI-compatible providers that reject forced `tool_choice` on thinking-required models by downgrading unsupported forced choices to `auto` while keeping tools available ([#2546](https://github.com/can1357/oh-my-pi/issues/2546)).
|
|
11
17
|
- Fixed GitHub Copilot Anthropic transport (`api.githubcopilot.com/v1/messages`) returning `400 tools.0.custom.eager_input_streaming: Extra inputs are not permitted` on every tool-bearing turn by stopping the emission of the per-tool `eager_input_streaming` flag and the `fine-grained-tool-streaming-2025-05-14` beta header on the Copilot transport — the proxy whitelists neither ([#2558](https://github.com/can1357/oh-my-pi/issues/2558)).
|
|
12
18
|
- Disabled Bun's native ~300s pre-response `fetch` timeout in every streaming provider (OpenAI completions/responses, Azure responses, Anthropic, Codex SSE, Bedrock, Gemini CLI, Ollama). The configurable first-event/idle/SDK watchdogs (`PI_STREAM_FIRST_EVENT_TIMEOUT_MS`, `PI_OPENAI_STREAM_IDLE_TIMEOUT_MS`, `compat.streamIdleTimeoutMs`) were silently capped by Bun's hidden ceiling, so cold large-context streams (e.g. self-hosted vLLM at multi-hundred-K prompts) died at exactly 300s with `TimeoutError: The operation timed out.` Direct callers of `./providers/{amazon-bedrock,google-gemini-cli,ollama,openai-codex-responses}` (which bypass `register-builtins`' iterator-level watchdog) now install a pre-response `AbortSignal.timeout(firstEventTimeoutMs)` alongside the disable, so a stalled upstream still fails within the configured budget instead of hanging forever ([#2422](https://github.com/can1357/oh-my-pi/issues/2422)).
|
|
@@ -301,6 +307,63 @@
|
|
|
301
307
|
|
|
302
308
|
- Removed the dead `iterateUntilAbort` helper (superseded by `iterateWithIdleTimeout`); it leaked the upstream iterator when the consumer abandoned mid-yield and had no production call sites.
|
|
303
309
|
|
|
310
|
+
## [15.10.10] - 2026-06-09
|
|
311
|
+
|
|
312
|
+
### Added
|
|
313
|
+
|
|
314
|
+
- Exported `wrapFetchForCch` so non-streaming OAuth callers (e.g. the web-search provider) can patch the Claude Code billing-header `cch` attestation into their request bodies instead of shipping the `cch=00000` placeholder.
|
|
315
|
+
|
|
316
|
+
### Fixed
|
|
317
|
+
|
|
318
|
+
- Fixed an unbounded, zero-backoff Codex WebSocket reconnect loop on `websocket_connection_limit_reached`: the no-content reconnect path never consulted the retry budget and never waited, hammering the endpoint forever when the limit is account-scoped. Reconnects are now budgeted and delayed like every other WS retry path, falling back to a single SSE replay when exhausted.
|
|
319
|
+
- Fixed the Codex whitespace-loop breaker not observing degenerate frames that arrive after their item closed (or before it opened) — those frames count as stream progress, so the idle watchdogs never fired and the turn hung forever, which is exactly the failure mode the breaker exists for. Whitespace-loop recovery now also refuses to replay the turn once a `toolcall_end` was delivered, surfacing the error instead of re-emitting the same tool calls.
|
|
320
|
+
- Fixed the two remaining Codex retry paths (WS mid-stream reconnect and the empty-content SSE fallback) leaking blockless native output items (e.g. `web_search_call`) from the failed attempt into the replayed turn's `providerPayload` and append baseline.
|
|
321
|
+
- Fixed Codex WebSocket failure handling closing whatever connection currently occupies the session slot — including a concurrent caller's in-flight CONNECTING handshake, whose rejection (`websocket closed before open`) is classified fatal and disabled WebSockets for the whole session. Failure cleanup now skips CONNECTING sockets and the pool re-joins replacement handshakes (bounded).
|
|
322
|
+
- Fixed the Codex request transformer not repairing orphan `custom_tool_call_output` items (only `function_call_output` was folded into an assistant note) — a compaction splice that dropped an `apply_patch` call while keeping its result produced a hard 400 on the default GPT-5 Codex toolset.
|
|
323
|
+
- Fixed `processResponsesStream` finalizing reasoning items via a bare `itemId` content scan instead of the routed entry: with id-less reasoning items (local hosts), every `output_item.done` matched the FIRST thinking block — the second item's text clobbered it and the second block was never finalized or signed.
|
|
324
|
+
- Fixed `processResponsesStream` dropping tool calls and message text whose `output_item.added` event was lost (lossy proxies): `toolcall_end` was emitted with a dangling contentIndex while the call never entered `message.content`, so the agent loop silently never executed it. The done handler now synthesizes the missing block; still-open tool-call blocks are also final-parsed at `response.completed` so the `toolUse` override cannot hand the agent stale `{}` arguments.
|
|
325
|
+
- Fixed `response.incomplete` with `incomplete_details.reason: "content_filter"` being reported as a token-cap truncation (`stopReason: "length"`) — the agent loop's length recovery then asked the model to "shorten" a filtered prompt. Content-filtered turns now surface as errors; usage is also populated from `response.failed` events, and an unknown terminal status degrades to `"stop"` with a logged anomaly instead of throwing away a fully-streamed response.
|
|
326
|
+
- Fixed Copilot `premiumRequests` accounting being dropped from failed/cancelled responses: `populateResponsesUsageFromResponse` replaced `usage` wholesale and the error path threw before the success-path re-apply. The populate now preserves the field.
|
|
327
|
+
- Fixed `deduplicateToolCallIds` suffixing the whole composite Responses id (`callId|itemId`) — `normalizeResponsesToolCallId` extracts the first segment as the wire `call_id` at encode time, so both copies collapsed back onto one `call_id` and the request carried duplicate call/output pairs. The suffix and length budget now apply per segment.
|
|
328
|
+
- Gated native history payload replay on api + model id in both Responses providers: after a mid-session model switch, reasoning items carrying encrypted content minted by the previous model were replayed verbatim under the new model. Replay now falls back to block re-encode (which already strips foreign signatures), matching `transformMessages`' same-model trust rule.
|
|
329
|
+
- Fixed Azure OpenAI Responses requests omitting `store: false` while requesting `reasoning.encrypted_content` (stateless-only per OpenAI), replaying custom tool calls paired with mismatched `function_call_output` items (customCallIds was never threaded through), letting the SDK's internal retries (maxRetries 5) silently re-POST inside the explicit first-event deadline, and sending a `prompt_cache_key` when the caller opted out via `cacheRetention: "none"`.
|
|
330
|
+
- Fixed strict-pairing Responses backends (Azure, Copilot) silently discarding tool results whose call is absent from history — the result is now folded into an assistant note (same shape as orphan-output repair) so the model keeps the information.
|
|
331
|
+
- Fixed the OpenAI Responses first-event watchdog staying armed across the `onResponse` notification callback (a slow callback aborted an already-connected stream), Copilot transient-model retries re-attempting on an already-aborted signal (instant dead retry surfacing the scheduler's AbortError), Codex `reasoningSummary: null` being coerced to `"auto"` (the documented omit-summary contract was unreachable), nested Codex error codes (`response.error.code`) being invisible to the connection-limit/previous-response recovery matchers, and the session id leaking unredacted into `PI_CODEX_DEBUG` logs via the `x-client-request-id` header.
|
|
332
|
+
- Fixed `processResponsesStream` (shared by `openai-responses` and `azure-openai-responses`) ignoring the terminal `response.incomplete` event: a max-output-tokens-truncated response ended with `stopReason: "stop"`, zero usage, and no cost instead of `"length"` with the reported token counts. `response.incomplete` is now handled alongside `response.completed` and counts as stream progress for the idle watchdogs.
|
|
333
|
+
- Fixed custom tool-call content blocks keeping the transient `partialJson` accumulation buffer (and a potentially stale `arguments.input`) after `response.output_item.done` in the shared Responses stream processor — the function_call branch already cleaned these up.
|
|
334
|
+
- Fixed two OpenAI Codex stream-retry paths (whitespace-loop recovery and retryable provider errors) leaking native output items from the abandoned attempt into the replayed turn's `providerPayload` — stale reasoning items completed before the failure were re-sent as history input on subsequent requests alongside the retry's own items.
|
|
335
|
+
- Fixed the Codex WebSocket queue wiping already-received frames when a transport error arrived: a `response.completed` queued just before an eager server close was discarded, turning a finished response into a spurious `websocket closed` failure and a full request replay. Errors now append behind pending data frames.
|
|
336
|
+
- Fixed concurrent `getOrCreateCodexWebSocketConnection` callers (prewarm racing the first request) tearing down each other's in-flight handshake — closing a CONNECTING socket rejected the other caller with a fatal `websocket closed before open`, disabling WebSockets for the entire session. Callers now join the pending handshake.
|
|
337
|
+
- Stopped the Codex connection-limit recovery from replaying a turn over SSE after a `toolcall_end` had already been delivered to the consumer (`canSafelyReplayWebsocketOverSse` guard was bypassed, re-emitting the same tool calls); the error now surfaces instead.
|
|
338
|
+
- Extended the Codex whitespace-only argument-delta circuit breaker to `custom_tool_call_input.delta` frames, which counted as stream progress and could keep a degenerate response alive forever with no cap on buffer growth.
|
|
339
|
+
- Fixed Codex stream failures during transport open reporting a synthetic request dump (empty URL/body) instead of the real request, and a `response.created` event resetting the recorded time-to-first-token.
|
|
340
|
+
- Fixed the Codex WebSocket connect watchdog timer leaking (pinning the event loop for up to 10s) when the request signal aborted before or during the handshake.
|
|
341
|
+
- Fixed OpenRouter-hosted Anthropic adaptive reasoning models (Claude Fable/Mythos 5 and Opus 4.6+) so the catalog exposes `xhigh`; Fable/Mythos and Opus 4.7+ requests now map user `high`/`xhigh` onto OpenRouter's Anthropic `xhigh`/`max` effort scale.
|
|
342
|
+
- Fixed an unknown Anthropic `stop_reason` failing the whole turn after the response had fully streamed. `mapStopReason` threw on unrecognized values, and since the reason arrives on the trailing `message_delta` the error was unretryable — the live `model_context_window_exceeded` stop reason (default on Sonnet 4.5+) hit this path. It now maps to `length`, and any future unknown reason degrades to a logged anomaly plus a normal `stop` instead of an error.
|
|
343
|
+
- Stopped clamping API-key Anthropic requests to Claude Code's 64k output cap. The `CLAUDE_CODE_MAX_OUTPUT_TOKENS` clamp exists to match the OAuth wire fingerprint, but `buildParams` applied it unconditionally, silently halving the output budget of 128k-output models (e.g. Opus 4.8) for API-key callers. OAuth requests keep the clamp.
|
|
344
|
+
- Stopped a successful strict-tools fallback from shipping `errorMessage` on a `stopReason: "stop"` assistant message. After a grammar-too-large 400 triggered the non-strict retry, the original 400 text was kept on the final message even when the retry succeeded — consumers that treat `errorMessage` presence as failure (e.g. balance probes) misclassified the turn, and the stale text suppressed later refusal explanations. The fallback is now logged instead.
|
|
345
|
+
- Fixed model-supplied `User-Agent` headers being silently dropped on non-OAuth Anthropic requests. `enforcedHeaderKeys` filtered the header out of `modelHeaders` in every branch but only the OAuth branch set one back; the Cloudflare-gateway, bearer-gateway, and `X-Api-Key` branches now forward the caller's value verbatim.
|
|
346
|
+
- Stopped sending the `fast-mode-2026-02-01` beta header once a session has learned the endpoint+model rejects fast mode (`fastModeDisabled` provider state), matching the already-dropped `speed` param.
|
|
347
|
+
- Stopped `buildAnthropicHeaders` defaulting API-key requests onto the full Claude Code OAuth beta list (`oauth-2025-04-20`, `claude-code-20250219`, …). The `claudeCodeBetas` default is now OAuth-gated, matching the streaming path — the web-search header builder was the only caller hitting the default, so API-key search requests now carry just their own betas (e.g. `web-search-2025-03-05`). An empty `anthropic-beta` header is omitted entirely instead of being sent as an empty string.
|
|
348
|
+
- Fixed image-bearing `developer` messages being upgraded to mid-conversation `system` turns on Opus 4.8+/Fable/Mythos 5. System content is text-only on the wire, so a developer turn carrying image blocks in an upgrade-eligible position produced a 400; it now stays a `user` message.
|
|
349
|
+
- Fixed a spliced reconnect's second envelope overwriting the completed Anthropic message: `message_delta` was not gated by the terminal-stop flag (content events and duplicate `message_start` were), so the splice's `stop_reason`/usage replaced the finished turn's — a `tool_use` turn could be relabeled `stop`, and the harness then never executed the streamed tool calls. Post-terminal deltas are now logged as envelope anomalies and skipped.
|
|
350
|
+
- Fixed a `ping` arriving before `message_start` consuming the Anthropic first-event watchdog: the stall was then classified as a terminal mid-stream idle timeout instead of a retryable first-event timeout. Pings no longer count as the first item but still refresh the idle deadline once content is flowing.
|
|
351
|
+
- Fixed Anthropic-compatible proxies that omit `usage`/`delta` objects from `message_start`/`message_delta`/`content_block_*` envelopes crashing the turn with an unretryable `TypeError`; the missing payloads now degrade to logged envelope anomalies like every other malformed-frame case.
|
|
352
|
+
- Fixed `applyPromptCaching` placing `cache_control` on `thinking`/`redacted_thinking` blocks — Anthropic rejects that with a 400. A thinking-only assistant turn inside the trailing cache window (e.g. followed by the synthetic `Continue.` pad) no longer receives a breakpoint.
|
|
353
|
+
- Fixed consecutive `assistant` params reaching the wire when an empty user/developer turn between two assistant turns was dropped by the converter (e.g. an empty "nudge" submission after a length-truncated reply); Anthropic 400s on non-alternating assistant turns, and the broken triple replayed on every subsequent request. A `user: "Continue."` separator is now inserted, mirroring the trailing-prefill fallback.
|
|
354
|
+
- Fixed `supportsAdaptiveThinkingDisplay` misparsing bare dated Opus ids: `claude-opus-4-20250514` (Opus 4.0) parsed as minor `20250514` ≥ 4.7, which silently dropped the `interleaved-thinking-2025-05-14` beta for API-key Opus 4.0 requests.
|
|
355
|
+
- Fixed `output_config.effort` shipping without the `effort-2025-11-24` beta on thinking-off requests against adaptive-only Claude models (the effort:"low" pin), and the mid-conversation `system` role shipping without `mid-conversation-system-2026-04-07` on API-key and OAuth-utility requests; both betas are now added whenever the request can carry the corresponding field.
|
|
356
|
+
- Fixed GitHub Copilot anthropic-messages requests going out with no `Content-Type` and no `anthropic-version` header — the copilot branch builds its headers from scratch and Bun's fetch does not default `Content-Type` for string bodies. Both headers are now pinned to match every other branch.
|
|
357
|
+
- Fixed Anthropic client/provider retry multiplication: with the first-event watchdog disabled (`PI_STREAM_FIRST_EVENT_TIMEOUT_MS=0`), the client's internal `maxRetries: 5` reactivated and stacked with the provider loop's 3 retries — up to 24 wire attempts with double backoff. The provider now pins per-request `maxRetries: 0` unconditionally.
|
|
358
|
+
- Fixed `AnthropicMessagesClient` spreading `fetchOptions` after the core request fields, letting a caller-supplied `signal`/`method`/`body` silently disconnect the timeout controller or corrupt the request. Transport extras (TLS) still pass through; core fields now always win.
|
|
359
|
+
- Fixed Foundry mTLS/CA material being cached for the process lifetime when the env vars point at files: the cache key now folds in the file mtime so on-disk certificate rotation takes effect.
|
|
360
|
+
- Fixed the Claude Code fingerprint version drifting across surfaces: the usage endpoint (`claude-cli/2.1.160`) and OAuth bootstrap (`claude-code/2.1.160`) pinned a stale version while `/v1/messages` reported 2.1.165; both now derive from `claudeCodeVersion`.
|
|
361
|
+
- Fixed a system prompt that merely *mentions* `x-anthropic-billing-header:` mid-text suppressing the entire Claude Code system-block injection (billing header, instruction, and cch attestation); the resumed-session guard now anchors with `startsWith`.
|
|
362
|
+
- Fixed lone surrogates in cross-API tool-call arguments reaching Anthropic's strict UTF-8 validation: replayed OpenAI/Google-origin `tool_use.input` string leaves are now deep-sanitized with `toWellFormed()`, while same-API Anthropic arguments stay byte-identical to keep prompt-cache prefixes stable.
|
|
363
|
+
- Bounded the many-image resize fan-out to 4 concurrent decodes (it previously decoded every oversized image at once, two encode pipelines each — multi-GB transient memory at the 20+-image threshold that activates the feature).
|
|
364
|
+
- Fixed `mergeHeaders` merging case-sensitively on the Copilot/client-options path, where a miscased user-configured header (e.g. `authorization` next to the synthesized `Authorization`) survived as two keys that the `Headers` constructor joins comma-separated on the wire.
|
|
365
|
+
- Hardened the Anthropic stream lifecycle: prologue failures (e.g. a malformed Copilot credential in `buildCopilotDynamicHeaders`) and error-finalization failures now surface as an `error` event instead of an unhandled rejection that left `stream.result()` hanging forever; the spurious "cch billing placeholder not patched" warning no longer fires when the placeholder only appears in user content.
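The `mergeHeaders` entry in the list above concerns header names that differ only by case. A minimal sketch of a case-insensitive merge follows; `mergeHeadersCaseInsensitive` is a hypothetical helper, and the choice that later sources win (keeping the later source's casing) is an assumption, not something the changelog states.

```typescript
// Hypothetical helper: merges header maps so that names differing only by case
// collapse into one entry. Assumption: later sources override earlier ones.
function mergeHeadersCaseInsensitive(
  ...sources: Array<Record<string, string> | undefined>
): Record<string, string> {
  const byLowerName = new Map<string, { name: string; value: string }>();
  for (const source of sources) {
    if (!source) continue;
    for (const [name, value] of Object.entries(source)) {
      // Keyed by the lowercased name, so "authorization" replaces "Authorization".
      byLowerName.set(name.toLowerCase(), { name, value });
    }
  }
  const merged: Record<string, string> = {};
  byLowerName.forEach(({ name, value }) => {
    merged[name] = value;
  });
  return merged;
}
```

With a merge like this, a synthesized `Authorization` and a user-configured `authorization` yield a single key, so nothing is left for the `Headers` constructor to join into a comma-separated value.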
|
|
366
|
+
|
|
304
367
|
## [15.10.9] - 2026-06-09
|
|
305
368
|
|
|
306
369
|
### Added
|
|
@@ -450,7 +513,6 @@
|
|
|
450
513
|
- Fixed Antigravity usage provider emitting one bar per model instead of deduplicating by tier — a single account's 15+ model entries now collapse to one bar per tier, matching the shared-quota reality of the upstream API.
|
|
451
514
|
- Fixed Antigravity usage reports missing `email` and `accountId` in metadata, so the `/usage` display and the deduplicator can associate reports with their credentials.
|
|
452
515
|
- Fixed usage-report dedup ignoring `projectId` for Google Cloud providers, preventing duplicate credential entries from being recognized as the same account.
|
|
453
|
-
|
|
454
516
|
- Fixed Cloud Code Assist (Antigravity / Gemini CLI) rejecting the `github` tool with HTTP 400 when the `pr` parameter schema contained `anyOf: [string, array]`. The CCA mixed-type combiner collapse picked the first non-null type (`string`) but indiscriminately copied type-specific keys from variant branches — `items` from the array variant leaked onto the string-typed result, producing `{type: "string", items: {...}}` which Google's API rejects as invalid. The collapse now filters merged variant fields against the winning type's allowed key set. ([#2002](https://github.com/can1357/oh-my-pi/pull/2002))
|
|
455
517
|
- Fixed OpenAI Responses-family providers (Codex, OpenAI Responses, Azure Responses) rejecting requests with `400 No tool output found for function call …` after the user branched/navigated the session tree to a node that ends on a tool call (the tool-result child is dropped from the reconstructed history) or after a turn was aborted/crashed between the call streaming and its result persisting. The converters now synthesize a placeholder `function_call_output`/`custom_tool_call_output` immediately after any unpaired `function_call`/`custom_tool_call`, symmetric to the existing orphan-output repair, so the model still sees the call and can recover instead of the whole request 400ing.
|
|
456
518
|
- Fixed Anthropic-compatible reasoning endpoints losing prior-turn reasoning on continuation requests when they emit unsigned `thinking` blocks. `convertAnthropicMessages` treated unknown endpoints as signature-enforcing and demoted unsigned reasoning to `type: "text"`, which destabilized tool-call argument serialization on the next turn — the upstream symptom behind the `args?.ops?.map is not a function` crash reported against the `todo` tool. Official `api.anthropic.com` keeps the conservative text fallback; non-official `anthropic-messages` reasoning models now replay unsigned reasoning as native `type: "thinking"` ([#2005](https://github.com/can1357/oh-my-pi/issues/2005)).
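The Cloud Code Assist `anyOf` collapse fix in the list above filters merged variant fields against the winning type's allowed keys. The sketch below shows that idea under stated assumptions: `collapseAnyOf`, `SHARED_KEYS`, and `TYPE_SPECIFIC_KEYS` are hypothetical names, and the key lists are illustrative rather than the package's real allow-list.

```typescript
type JsonSchema = Record<string, unknown>;

// Illustrative allow-lists (assumptions), keyed by JSON Schema type.
const SHARED_KEYS = ["type", "description", "title", "default"];
const TYPE_SPECIFIC_KEYS: Record<string, string[]> = {
  string: ["minLength", "maxLength", "pattern", "format", "enum"],
  array: ["items", "minItems", "maxItems"],
  object: ["properties", "required", "additionalProperties"],
  number: ["minimum", "maximum"],
};

// Collapses mixed-type variants onto the first non-null type and copies only
// keys that are valid for that type.
function collapseAnyOf(variants: JsonSchema[]): JsonSchema {
  const winner = variants.find(
    (variant) => typeof variant.type === "string" && variant.type !== "null",
  );
  if (!winner) return {};
  const winningType = winner.type as string;
  const allowed = new Set([
    ...SHARED_KEYS,
    ...(TYPE_SPECIFIC_KEYS[winningType] ?? []),
  ]);
  const result: JsonSchema = {};
  // The winner is merged first so its own fields take priority.
  const ordered = [winner, ...variants.filter((variant) => variant !== winner)];
  for (const variant of ordered) {
    for (const key of Object.keys(variant)) {
      if (allowed.has(key) && !(key in result)) result[key] = variant[key];
    }
  }
  result.type = winningType;
  return result;
}
```

For `anyOf: [string, array]`, the array variant's `items` is dropped because it is not in the string allow-list, so the invalid `{type: "string", items: {...}}` shape never reaches the wire.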
|
|
@@ -649,7 +711,6 @@
|
|
|
649
711
|
### Added
|
|
650
712
|
|
|
651
713
|
- `SimpleStreamOptions.openrouterVariant` (`"nitro"`, `"floor"`, `"online"`, `"exacto"`, …) — when set, appends `:<variant>` to OpenRouter model IDs at request time, leaving ids that already carry an explicit `:suffix` untouched. Plumbed through `openai-completions` and the pi-native gateway forwarder.
|
|
652
|
-
|
|
653
714
|
- xAI Grok OAuth (SuperGrok Subscription) provider in `/login`. Loopback PKCE flow on `127.0.0.1:56121`; the token unlocks Grok-4.x chat. Ported from NousResearch/hermes-agent (MIT).
|
|
654
715
|
- OpenRouter provider in `/login`. API-key paste flow validated against `https://openrouter.ai/api/v1/auth/key` (the `/models` endpoint is public and cannot validate auth). The pasted key is stored under the existing `openrouter` provider id used by `OPENROUTER_API_KEY`.
|
|
655
716
|
- `XAI_OAUTH_TOKEN` environment variable accepted as a headless fallback for the xAI Grok OAuth provider.
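The `openrouterVariant` behavior described in the list above (append `:<variant>` unless the id already carries an explicit `:suffix`) can be sketched as a small pure function. `applyOpenRouterVariant` is a hypothetical name, and checking for the colon only in the last path segment is an assumption about how a suffix is detected.

```typescript
// Hypothetical helper illustrating the described rule; not the package's code.
function applyOpenRouterVariant(modelId: string, variant?: string): string {
  if (!variant) return modelId;
  // Assumption: a suffix is a ":" that appears after the last "/" of the id.
  const lastSegment = modelId.slice(modelId.lastIndexOf("/") + 1);
  if (lastSegment.includes(":")) return modelId;
  return `${modelId}:${variant}`;
}
```

Called with the placeholder id `vendor/model` and the variant `nitro`, this returns `vendor/model:nitro`, while an id such as `vendor/model:free` comes back unchanged.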
|
|
@@ -682,14 +743,11 @@
|
|
|
682
743
|
- Added `PI_CODEX_WEBSOCKET_PING_INTERVAL_MS` to configure the interval for Codex WebSocket protocol ping heartbeats
|
|
683
744
|
- Added `PI_CODEX_WEBSOCKET_PONG_TIMEOUT_MS` to configure the Codex WebSocket pong timeout used to detect unresponsive connections
|
|
684
745
|
- Added `PI_CODEX_WEBSOCKET_MESSAGE_QUEUE_CAPACITY` to configure the maximum buffered Codex WebSocket inbound queue size before transport fallback
|
|
685
|
-
- Added `parseStreamingJsonThrottled` to `@oh-my-pi/pi-ai/utils/json-parse` — a per-delta wrapper around `parseStreamingJson` that skips re-parses until the buffer has grown by `minGrowthBytes` (default 256). Wired into the streaming hot path of every provider's tool-call argument accumulator (`anthropic`, `amazon-bedrock`, `openai-completions`, `openai-codex-responses`, `openai-responses-shared`) so per-delta cost is O(N) in total buffer length instead of O(N²). Each provider's `toolcall_end` still runs a final unthrottled parse, so the published `block.arguments` is unchanged.
|
|
686
|
-
- Added named-tool routing support to Google providers: `GoogleSharedStreamOptions.toolChoice` and `GoogleGeminiCliOptions.toolChoice` now accept `{ mode: "ANY"; allowedFunctionNames: [string, ...string[]] }` in addition to the string forms. `mapGoogleToolChoice` converts `ToolChoice` objects of shape `{ type: "tool" | "function", name }` to the wire form. Mirrors the equivalent Anthropic mapper.
|
|
687
746
|
|
|
688
747
|
### Changed
|
|
689
748
|
|
|
690
749
|
- Improved Codex WebSocket timeout diagnostics to include last event type and time since last progress event
|
|
691
750
|
- Enhanced Codex WebSocket error classification to recognize ping, pong, send, and queue-overflow failures as retryable
|
|
692
|
-
- Changed `mapGoogleToolChoice` to be exported from `@oh-my-pi/pi-ai/stream` so callers can build the wire-shape allow-list directly without re-deriving it.
|
|
693
751
|
|
|
694
752
|
### Fixed
|
|
695
753
|
|
|
@@ -698,7 +756,6 @@
|
|
|
698
756
|
- Fixed Codex WebSocket pong timeout detection by tracking pong events and failing the connection when no pong is received within the configured timeout
|
|
699
757
|
- Fixed Anthropic streaming to suppress hallucinated meta-prompt thinking blocks (the recent "I don't see any current rewritten thinking..." regression). When the marker phrase `rewritten thinking` appears in a streamed thinking summary the block is collapsed to a plain `Thinking...` placeholder and its signature is dropped so subsequent turns can't re-anchor on the garbled chain.
|
|
700
758
|
- Fixed Codex WebSocket silent stalls by adding protocol pings, inbound queue bounding, clearer idle-timeout diagnostics, and SDK retry clamping for first-event timeouts.
|
|
701
|
-
- Fixed Synthetic model discovery to treat the provider `/models` response as authoritative so deprecated bundled IDs are pruned from the runtime cache, and changed Synthetic login validation to avoid probing a specific model ([#1417](https://github.com/can1357/oh-my-pi/issues/1417)).
|
|
702
759
|
|
|
703
760
|
## [15.5.0] - 2026-05-26
|
|
704
761
|
|
|
@@ -822,10 +879,6 @@
|
|
|
822
879
|
|
|
823
880
|
- Added DeepSeek to the built-in API-key login provider catalog so `omp login deepseek` stores a reusable `DEEPSEEK_API_KEY` credential for the bundled DeepSeek models.
|
|
824
881
|
|
|
825
|
-
### Fixed
|
|
826
|
-
|
|
827
|
-
- Fixed `openai-responses` requests intermittently 400ing with `No tool call found for function call output with call_id …` after an aborted turn or a locally-rejected tool call (e.g. argument-validation failure). `convertConversationMessages` now folds orphan `function_call_output` / `custom_tool_call_output` items — those whose matching `function_call` was wiped by an earlier `dt: false` snapshot splice or never landed in any persisted provider payload — into assistant text notes, preserving the payload while keeping the request grammatically valid ([#1351](https://github.com/can1357/oh-my-pi/issues/1351)).
|
|
828
|
-
|
|
829
882
|
## [15.2.4] - 2026-05-22
|
|
830
883
|
|
|
831
884
|
### Fixed
|
|
@@ -878,7 +931,6 @@
|
|
|
878
931
|
### Fixed
|
|
879
932
|
|
|
880
933
|
- Fixed Anthropic fast mode (`serviceTier: "priority"`) looping on 429 `rate_limit_error: "Extra usage is required for fast mode."` for accounts without the extra-usage entitlement. `isAnthropicFastModeUnsupportedError` now matches the 429 phrasing in addition to the 400 `invalid_request_error` "does not support the `speed` parameter" case, so the provider drops `speed: "fast"` on the in-turn retry, sets `providerSessionState.fastModeDisabled` for the remainder of the session, and surfaces `disabledFeatures: ["priority"]` to the caller instead of retrying with the same payload until `PROVIDER_MAX_RETRIES` is exhausted.
|
|
881
|
-
- Fixed MiniMax Coding Plan CN streaming `<think>...</think>` reasoning as visible assistant text. The OpenAI-compatible stream parser now enables the existing MiniMax tag parser for both `minimax-code` and `minimax-code-cn`, so CN responses become structured `thinking` blocks instead of raw text. ([#1203](https://github.com/can1357/oh-my-pi/issues/1203))
|
|
882
934
|
|
|
883
935
|
## [15.1.6] - 2026-05-19
|
|
884
936
|
|
|
@@ -895,7 +947,6 @@
|
|
|
895
947
|
|
|
896
948
|
### Fixed
|
|
897
949
|
|
|
898
|
-
- Fixed OpenCode-Go and OpenCode-Zen chat-completions replay to omit stored reasoning fields on Kimi assistant tool-call messages, avoiding provider 400s for rejected `messages[].reasoning` payloads. ([#1157](https://github.com/can1357/oh-my-pi/issues/1157))
|
|
899
950
|
- Fixed OpenAI Responses and Codex tool schema normalization to emit `properties: {}` for no-argument object schemas without rewriting literal payloads. ([#1147](https://github.com/can1357/oh-my-pi/issues/1147))
|
|
900
951
|
- Fixed Anthropic 400 (`unexpected tool_use_id found in tool_result blocks ... Each tool_result block must have a corresponding tool_use block in the previous message`) when handoff/compaction folds an assistant `tool_use` into the handoff summary string but leaves the matching user-side `tool_result` message in the history. `transformMessages` now indexes every `tool_use` id surviving the first pass and drops orphan `tool_result` messages whose originator was compacted away, preserving the text payload as a user-level `<stale-tool-result>` note so the model still sees what the tool returned. The note is emitted with `role: "user"` rather than `role: "developer"` so providers that elevate developer-role messages (Ollama: `developer` → `system`; OpenAI chat-completions reasoning models: `developer` → `developer`) cannot lift stale tool output to an instruction-priority tier above the surrounding user/developer messages.
|
|
901
952
|
- Fixed streaming authentication retry to trigger when a provider emits a 401 `error` event after a `start` event but before any replay-unsafe content is emitted
|
|
@@ -1084,11 +1135,6 @@
|
|
|
1084
1135
|
|
|
1085
1136
|
- Fixed OAuth credentials being silently disabled when two omp processes (or any two `AuthStorage` instances sharing an `agent.db`) race on token refresh. Anthropic rotates refresh tokens on every use, so the loser's `invalid_grant` response previously soft-deleted the row that the winner just rotated, forcing the user to `/login` again. `#tryOAuthCredential` now re-reads the row from disk before declaring a definitive failure: if the persisted `refresh` differs from the snapshot it tried, the peer-rotated credential is reloaded and the request retries against the fresh token instead of disabling the live row.
|
|
1086
1137
|
- Closed a remaining race window in OAuth refresh-failure handling: between re-reading the credential row to check for peer rotation and the subsequent soft-delete, another process could still complete a refresh and rotate the row, leaving us to disable the freshly-rotated credential by `id`. The disable now runs as a single CAS update conditioned on the row's `data` still matching the snapshot we tried to refresh, and on `disabled_cause IS NULL`. If the CAS reports 0 rows changed (peer rotation, or row already disabled by a concurrent failure on the same snapshot), we reload from disk and retry instead of mutating the wrong row or emitting a spurious `credential_disabled` event.
|
|
1087
|
-
- Lazy built-in provider streams now enforce the shared idle watchdog and abort stalled provider requests, so session auto-retry can continue after transient network drops instead of remaining stuck. Caller aborts still terminate as aborted.
|
|
1088
|
-
|
|
1089
|
-
### Changed
|
|
1090
|
-
|
|
1091
|
-
- Lowered the default steady-state stream idle timeout from 120s to 30s while preserving the existing environment overrides.
|
|
1092
1138
|
|
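The compare-and-swap disable from the OAuth refresh-failure entries above can be modeled in memory as below. This is a sketch under assumptions: the real store is SQLite, and the row shape, function names, and the SQL shown in the comment are illustrative rather than taken from the package.

```typescript
// Hypothetical in-memory stand-in for a credential row in agent.db.
interface CredentialRow {
  id: number;
  data: string; // serialized credential, including the refresh token
  disabledCause: string | null;
}

// Models a single conditional update of roughly this shape (assumed, not the real query):
//   UPDATE credentials SET disabled_cause = ?
//   WHERE id = ? AND data = ? AND disabled_cause IS NULL
// Returns the number of rows changed (0 or 1).
function casDisable(
  rows: Map<number, CredentialRow>,
  id: number,
  snapshotData: string,
  cause: string,
): number {
  const row = rows.get(id);
  if (!row || row.data !== snapshotData || row.disabledCause !== null) return 0;
  row.disabledCause = cause;
  return 1;
}

type RefreshFailureOutcome = "disabled" | "reload_and_retry";

// 0 rows changed means a peer rotated the row or it was already disabled,
// so the caller reloads from disk and retries instead of disabling.
function onRefreshFailure(
  rows: Map<number, CredentialRow>,
  id: number,
  snapshotData: string,
): RefreshFailureOutcome {
  return casDisable(rows, id, snapshotData, "invalid_grant") === 1
    ? "disabled"
    : "reload_and_retry";
}
```

Conditioning on the snapshot closes the window between the re-read and the disable, because the check and the write happen in one statement.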
|
1093
1139
|
## [14.9.3] - 2026-05-10
|
|
1094
1140
|
|
|
@@ -1101,7 +1147,6 @@
|
|
|
1101
1147
|
### Fixed
|
|
1102
1148
|
|
|
1103
1149
|
- Fixed silent forwarding of image content (for example Python plot output rendered in the terminal) to models without vision support, which produced opaque 404 errors from upstream. Image blocks are now stripped and replaced with a `[image omitted: model does not support vision]` placeholder for non-vision models, including tool-result payloads ([#967](https://github.com/can1357/oh-my-pi/issues/967), [#968](https://github.com/can1357/oh-my-pi/issues/968)).
|
|
1104
|
-
|
|
1105
1150
|
- Added `AuthStorage` `onCredentialDisabled` callback (sync or async) so embedders can react when a credential is automatically disabled (e.g. OAuth refresh fails with `invalid_grant`) — useful for surfacing a banner or auto-launching a re-login flow instead of letting the credential silently disappear. Sync throws and async rejections are both caught and logged so a misbehaving subscriber cannot break the disable path.
|
|
1106
1151
|
- Added Anthropic OAuth `account.uuid` and `account.email_address` extraction from the `/v1/oauth/token` exchange and refresh responses; both `AnthropicOAuthFlow.exchangeToken()` and `refreshAnthropicToken()` now populate `OAuthCredentials.{accountId, email}` so downstream consumers can attribute requests to the authenticated account without a separate `/api/oauth/profile` round-trip.
|
|
1107
1152
|
- Added `onSseEvent` stream diagnostics so HTTP SSE providers can expose raw SSE frames without changing parsed model output.
|
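The error isolation described for `onCredentialDisabled` above (sync throws and async rejections both caught and logged) might look like the following sketch. The event shape, the `notifyDisabled` helper, and the `log` parameter are hypothetical; only the callback's sync-or-async nature comes from the changelog entry.

```typescript
// Hypothetical event payload; the real callback arguments may differ.
type DisabledEvent = { provider: string; cause: string };
type OnCredentialDisabled = (event: DisabledEvent) => void | Promise<void>;

function notifyDisabled(
  callback: OnCredentialDisabled | undefined,
  event: DisabledEvent,
  log: (err: unknown) => void,
): void {
  if (!callback) return;
  try {
    // Promise.resolve wraps both plain return values and promises, so a
    // rejection from an async subscriber is routed to the logger.
    Promise.resolve(callback(event)).catch(log);
  } catch (err) {
    // A synchronous throw lands here and never reaches the disable path.
    log(err);
  }
}
```

An embedder could pass a callback that shows a banner or starts a re-login flow; a bug in that callback is then only logged.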
|
@@ -1119,7 +1164,6 @@
|
|
|
1119
1164
|
|
|
1120
1165
|
- Fixed Gemini 3 Pro thinking metadata so `medium` effort is rejected with the expected error instead of being silently accepted: `ThinkingConfig` now carries an optional explicit `levels` list that survives `expandEffortRange`, letting non-contiguous supported sets (e.g. `[low, high]`) round-trip through enrichment.
|
|
1121
1166
|
- Fixed Kimi Code OAuth expiry handling to refresh access tokens 5 minutes before server expiry, avoiding daily 401s from using tokens right up to the cutoff.
|
|
1122
|
-
- Fixed OpenAI Responses custom tool replay to preserve custom tool call item IDs with the `ctc_` prefix instead of rewriting them as `fc_` function-call IDs ([#977](https://github.com/can1357/oh-my-pi/issues/977)).
|
|
1123
1167
|
|
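The early-refresh rule from the Kimi Code OAuth entry above reduces to a skewed expiry check. A minimal sketch, assuming millisecond timestamps; the function and constant names are made up for illustration.

```typescript
// Refresh this long before the server-side expiry (5 minutes, per the entry).
const REFRESH_SKEW_MS = 5 * 60 * 1000;

// True once the current time is inside the skew window before expiry.
function needsRefresh(expiresAtMs: number, nowMs: number, skewMs: number = REFRESH_SKEW_MS): boolean {
  return nowMs >= expiresAtMs - skewMs;
}
```

A caller would typically pass `Date.now()` as `nowMs` and refresh before issuing the request when the check returns true.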
|
1124
1168
|
## [14.7.6] - 2026-05-07
|
|
1125
1169
|
|
|
@@ -1383,7 +1427,6 @@
|
|
|
1383
1427
|
- Fixed shell execution failure responses to preserve all result fields when sanitizing, preventing truncated metadata in stream results
|
|
1384
1428
|
- Fixed context overflow detection to recognize `model_context_window_exceeded` from z.ai / GLM providers, preventing infinite retry loops when the context window is exceeded ([#638](https://github.com/can1357/oh-my-pi/issues/638))
|
|
1385
1429
|
- Fixed strict tool schema enforcement to preserve `additionalProperties: false` and required keys for reused nested object schemas, preventing invalid `todo_write` function schemas in Codex/OpenAI requests
|
|
1386
|
-
- Fixed GitHub Copilot reasoning regressions by preserving GPT-5.x / Claude 4.x reasoning controls instead of stripping them from requests ([#773](https://github.com/can1357/oh-my-pi/issues/773))
|
|
1387
1430
|
|
|
1388
1431
|
## [14.1.0] - 2026-04-11
|
|
1389
1432
|
|
|
@@ -1446,7 +1489,6 @@
|
|
|
1446
1489
|
- Fixed Gemini 2.5 Pro context window detection in GitHub Copilot model limits test
|
|
1447
1490
|
- Fixed Claude Opus 4.6 context window detection in GitHub Copilot model limits test
|
|
1448
1491
|
- Fixed Anthropic streaming to suppress transient SDK console errors for malformed SSE keep-alive frames so the TUI only shows surfaced provider errors
|
|
1449
|
-
|
|
1450
1492
|
- Added environment-based credential fallback for the OpenAI Codex provider.
|
|
1451
1493
|
|
|
1452
1494
|
## [13.17.6] - 2026-04-01
|
|
@@ -1824,8 +1866,6 @@
|
|
|
1824
1866
|
- Fixed OpenAI Codex streaming to properly include service_tier in SSE payloads
|
|
1825
1867
|
- Fixed type safety in OpenAI responses by removing unsafe type casts on image content blocks
|
|
1826
1868
|
- Fixed credential purging to respect disabled credentials when deduplicating by email
|
|
1827
|
-
- Fixed API-key provider re-login to replace the active stored key instead of appending stale credentials that were still selected first
|
|
1828
|
-
- Fixed Kagi login guidance to use the correct `KG_...` key format and mention Search API beta access requirements
|
|
1829
1869
|
|
|
1830
1870
|
## [13.9.2] - 2026-03-05
|
|
1831
1871
|
|
|
@@ -1850,7 +1890,7 @@
|
|
|
1850
1890
|
- Removed `THINKING_LEVELS`, `ALL_THINKING_LEVELS`, `ALL_THINKING_MODES`, `THINKING_MODE_DESCRIPTIONS`, and `THINKING_MODE_LABELS` exports
|
|
1851
1891
|
- Renamed `formatThinking()` to `getThinkingMetadata()` with changed return type from string to `ThinkingMetadata` object
|
|
1852
1892
|
- Renamed `getAvailableThinkingLevel()` to `getAvailableThinkingLevels()` and added default parameter
|
|
1853
|
-
- Renamed `
|
|
1893
|
+
- Renamed `getAvailableThinkingEffort()` to `getAvailableThinkingEfforts()` and added default parameter
|
|
1854
1894
|
|
|
1855
1895
|
### Added
|
|
1856
1896
|
|
|
@@ -1860,17 +1900,17 @@
|
|
|
1860
1900
|
|
|
1861
1901
|
### Added
|
|
1862
1902
|
|
|
1863
|
-
- Exported new thinking module with `
|
|
1864
|
-
- Added `
|
|
1865
|
-
- Added `
|
|
1903
|
+
- Exported new thinking module with `ThinkingEffort`, `ThinkingLevel`, and `ThinkingMode` types for managing reasoning effort levels
|
|
1904
|
+
- Added `getAvailableThinkingEffort()` function to determine supported thinking effort levels based on model capabilities
|
|
1905
|
+
- Added `parseThinkingEffort()`, `parseThinkingLevel()`, and `parseThinkingMode()` functions for parsing thinking configuration strings
|
|
1866
1906
|
- Added `THINKING_LEVELS`, `ALL_THINKING_LEVELS`, and `ALL_THINKING_MODES` constants for iterating over available thinking options
|
|
1867
1907
|
- Added `THINKING_MODE_DESCRIPTIONS` and `THINKING_MODE_LABELS` for displaying thinking modes in user interfaces
|
|
1868
1908
|
- Added `formatThinking()` function to format thinking modes as compact display labels
|
|
1869
1909
|
|
|
1870
1910
|
### Changed
|
|
1871
1911
|
|
|
1872
|
-
- Refactored thinking level handling to distinguish between `
|
|
1873
|
-
- Updated `ThinkingBudgets` type to use `
|
|
1912
|
+
- Refactored thinking level handling to distinguish between `ThinkingEffort` (provider-level, no "off") and `ThinkingLevel` (user-facing, includes "off")
|
|
1913
|
+
- Updated `ThinkingBudgets` type to use `ThinkingEffort` instead of `ThinkingLevel` for more precise token budget configuration
|
|
1874
1914
|
- Improved reasoning option handling to explicitly support "off" value for disabling reasoning across all providers
|
|
1875
1915
|
- Simplified thinking effort mapping logic by centralizing provider-specific clamping behavior
|
|
1876
1916
|
|
|
@@ -2661,7 +2701,7 @@
|
|
|
2661
2701
|
|
|
2662
2702
|
### Changed
|
|
2663
2703
|
|
|
2664
|
-
- Replaced direct `
|
|
2704
|
+
- Replaced direct `process.env` access with `getEnv()` utility from `@oh-my-pi/pi-utils` for consistent environment variable handling across all providers
|
|
2665
2705
|
- Updated environment variable names from `OMP_*` prefix to `PI_*` prefix for consistency (e.g., `OMP_CODING_AGENT_DIR` → `PI_CODING_AGENT_DIR`)
|
|
2666
2706
|
|
|
2667
2707
|
### Removed
|
|
@@ -2688,13 +2728,13 @@
|
|
|
2688
2728
|
|
|
2689
2729
|
### Added
|
|
2690
2730
|
|
|
2691
|
-
- Added `getEnv()` function to retrieve environment variables from
|
|
2731
|
+
- Added `getEnv()` function to retrieve environment variables from process.env, cwd/.env, or ~/.env
|
|
2692
2732
|
- Added support for reading .env files from home directory and current working directory
|
|
2693
2733
|
- Added support for `exa` and `perplexity` as known providers in `getEnvApiKey()`
|
|
2694
2734
|
|
|
2695
2735
|
### Changed
|
|
2696
2736
|
|
|
2697
|
-
- Changed `getEnvApiKey()` to check
|
|
2737
|
+
- Changed `getEnvApiKey()` to check process.env, cwd/.env, and ~/.env files in order of precedence
|
|
2698
2738
|
- Refactored provider API key resolution to use a declarative service provider map
|
|
2699
2739
|
|
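The lookup order named in the `getEnv()` and `getEnvApiKey()` entries above (process.env, then cwd/.env, then ~/.env) can be illustrated with a layered lookup. This is not the package's implementation: `parseDotEnv` and `lookupEnv` are hypothetical helpers, the parser handles only simple `KEY=value` lines, and file reading is left out so the example stays self-contained.

```typescript
// Minimal .env parser: KEY=value lines, '#' comments, optional matching quotes.
function parseDotEnv(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // skip lines without a key
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const first = value.charAt(0);
    if (value.length >= 2 && (first === '"' || first === "'") && value.endsWith(first)) {
      value = value.slice(1, -1);
    }
    out[key] = value;
  }
  return out;
}

// Earlier layers win: pass process.env first, then cwd/.env, then ~/.env.
function lookupEnv(
  key: string,
  layers: Array<Record<string, string | undefined>>,
): string | undefined {
  for (const layer of layers) {
    const value = layer[key];
    if (value !== undefined) return value;
  }
  return undefined;
}
```

With this ordering, a variable exported in the shell overrides the same key in either `.env` file, and the project-local file overrides the one in the home directory.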
|
2700
2740
|
## [9.2.2] - 2026-01-31
|
|
@@ -2900,7 +2940,7 @@
|
|
|
2900
2940
|
- Replaced custom sleep implementations with Bun.sleep and abortableSleep
|
|
2901
2941
|
- Simplified SSE stream parsing using readLines utility
|
|
2902
2942
|
- Updated test framework from vitest to bun:test
|
|
2903
|
-
- Replaced temp directory creation with
|
|
2943
|
+
- Replaced temp directory creation with createTempDirSync utility
|
|
2904
2944
|
- Changed credential storage from auth.json to ~/.omp/agent/agent.db
|
|
2905
2945
|
- Changed CLI command examples from npx to bunx
|
|
2906
2946
|
- Refactored OAuth flows to use common callback server base class
|
|
@@ -2943,8 +2983,8 @@
|
|
|
2943
2983
|
|
|
2944
2984
|
### Changed
|
|
2945
2985
|
|
|
2946
|
-
- Updated environment variable prefix from
|
|
2947
|
-
- Added automatic migration for legacy
|
|
2986
|
+
- Updated environment variable prefix from PI_ to OMP_ for better consistency
|
|
2987
|
+
- Added automatic migration for legacy PI_ environment variables to OMP_ equivalents
|
|
2948
2988
|
- Adjusted Bedrock Claude thinking budgets to reserve output tokens when maxTokens is too low
|
|
2949
2989
|
|
|
2950
2990
|
### Fixed
|
|
@@ -3061,7 +3101,7 @@
|
|
|
3061
3101
|
|
|
3062
3102
|
- Changed Cursor debug logging to use structured JSONL format with automatic MCP argument decoding
|
|
3063
3103
|
- Changed MCP tool argument decoding to use protobuf Value schema for improved type handling
|
|
3064
|
-
- Changed tool advertisement to filter Cursor native tools (bash, read, write, delete, ls, grep, lsp) instead of only exposing
|
|
3104
|
+
- Changed tool advertisement to filter Cursor native tools (bash, read, write, delete, ls, grep, lsp) instead of only exposing mcp_ prefixed tools
|
|
3065
3105
|
|
|
3066
3106
|
### Fixed
|
|
3067
3107
|
|
|
@@ -3144,6 +3184,222 @@
|
|
|
3144
3184
|
|
|
3145
3185
|
- Enhanced error messages to include retry-after timing information from API rate limit headers
|
|
3146
3186
|
|
|
3187
|
+
## [3.20.0] - 2026-01-06
|
|
3188
|
+
|
|
3189
|
+
### Added
|
|
3190
|
+
|
|
3191
|
+
- Added support for kwaipilot/kat-coder-pro model via OpenRouter
|
|
3192
|
+
- Added OpenAI Codex responses provider with OAuth login support for ChatGPT Plus/Pro accounts
|
|
3193
|
+
- Added Google Vertex AI provider (Gemini via Vertex) with Application Default Credentials support
|
|
3194
|
+
|
|
3195
|
+
### Changed
|
|
3196
|
+
|
|
3197
|
+
- Updated model specifications including context windows, max tokens, and pricing for multiple OpenRouter models
|
|
3198
|
+
|
|
3199
|
+
### Removed
|
|
3200
|
+
|
|
3201
|
+
- Removed alibaba/tongyi-deepresearch-30b-a3b:free model from OpenRouter
|
|
3202
|
+
- Removed nousresearch/hermes-4-405b model from OpenRouter
|
|
3203
|
+
- Removed tngtech/tng-r1t-chimera:free model from OpenRouter
|
|
3204
|
+
|
|
3205
|
+
## [3.15.0] - 2026-01-05
|
|
3206
|
+
|
|
3207
|
+
### Changed
|
|
3208
|
+
|
|
3209
|
+
- Made `isError` field optional in `ToolResultMessage` interface, defaulting to non-error state
|
|
3210
|
+
|
|
3211
|
+
## [3.5.1337] - 2026-01-03
|
|
3212
|
+
|
|
3213
|
+
### Added
|
|
3214
|
+
|
|
3215
|
+
- Added localhost URL detection for OpenAI-compatible provider auto-configuration
|
|
3216
|
+
|
|
3217
|
+
## [1.337.1] - 2026-01-02
|
|
3218
|
+
|
|
3219
|
+
### Changed
|
|
3220
|
+
|
|
3221
|
+
- Forked to @oh-my-pi scope with unified versioning across all packages
|
|
3222
|
+
|
|
3223
|
+
### Fixed
|
|
3224
|
+
|
|
3225
|
+
- **Gemini CLI rate limit handling**: Added automatic retry with server-provided delay for 429 errors
|
|
3226
|
+
|
|
3227
|
+
## [1.337.0] - 2026-01-02
|
|
3228
|
+
|
|
3229
|
+
Initial release under @oh-my-pi scope. See previous releases at [badlogic/pi-mono](https://github.com/badlogic/pi-mono).
|
|
3230
|
+
|
|
3231
|
+
## [0.50.1] - 2026-01-26
|
|
3232
|
+
|
|
3233
|
+
### Fixed
|
|
3234
|
+
|
|
3235
|
+
- Fixed OpenCode Zen model generation to exclude deprecated models ([#970](https://github.com/badlogic/pi-mono/pull/970) by [@DanielTatarkin](https://github.com/DanielTatarkin))
|
|
3236
|
+
|
|
3237
|
+
## [0.50.0] - 2026-01-26
|
|
3238
|
+
|
|
3239
|
+
### Added
|
|
3240
|
+
|
|
3241
|
+
- Added OpenRouter provider routing support for custom models via `openRouterRouting` compat field ([#859](https://github.com/badlogic/pi-mono/pull/859) by [@v01dpr1mr0s3](https://github.com/v01dpr1mr0s3))
|
|
3242
|
+
- Added `azure-openai-responses` provider support for Azure OpenAI Responses API. ([#890](https://github.com/badlogic/pi-mono/pull/890) by [@markusylisiurunen](https://github.com/markusylisiurunen))
|
|
3243
|
+
- Added HTTP proxy environment variable support for API requests ([#942](https://github.com/badlogic/pi-mono/pull/942) by [@haoqixu](https://github.com/haoqixu))
|
|
3244
|
+
- Added `createAssistantMessageEventStream()` factory function for use in extensions.
|
|
3245
|
+
- Added `resetApiProviders()` to clear and re-register built-in API providers.
|
|
3246
|
+
|
|
3247
|
+
### Changed
|
|
3248
|
+
|
|
3249
|
+
- Refactored API streaming dispatch to use an API registry with provider-owned `streamSimple` mapping.
|
|
3250
|
+
- Moved environment API key resolution to `env-api-keys.ts` and re-exported it from the package entrypoint.
|
|
3251
|
+
- Azure OpenAI Responses provider now uses base URL configuration with deployment-aware model mapping and no longer includes service tier handling.
|
|
3252
|
+
|
|
3253
|
+
### Fixed
|
|
3254
|
+
|
|
3255
|
+
- Fixed Bun runtime detection for dynamic imports in browser-compatible modules (stream.ts, openai-codex-responses.ts, openai-codex.ts) ([#922](https://github.com/badlogic/pi-mono/pull/922) by [@dannote](https://github.com/dannote))
|
|
3256
|
+
- Fixed streaming functions to use `model.api` instead of hardcoded API types
|
|
3257
|
+
- Fixed Google providers to default tool call arguments to an empty object when omitted
|
|
3258
|
+
- Fixed OpenAI Responses streaming to handle `arguments.done` events on OpenAI-compatible endpoints ([#917](https://github.com/badlogic/pi-mono/pull/917) by [@williballenthin](https://github.com/williballenthin))
|
|
3259
|
+
- Fixed OpenAI Codex Responses tool strictness handling after the shared responses refactor
|
|
3260
|
+
- Fixed Azure OpenAI Responses streaming to guard deltas before content parts and correct metadata and handoff gating
|
|
3261
|
+
- Fixed OpenAI completions tool-result image batching after consecutive tool results ([#902](https://github.com/badlogic/pi-mono/pull/902) by [@terrorobe](https://github.com/terrorobe))
|
|
3262
|
+
|
|
3263
|
+
## [0.49.3] - 2026-01-22
|
|
3264
|
+
|
|
3265
|
+
### Added
|
|
3266
|
+
|
|
3267
|
+
- Added `headers` option to `StreamOptions` for custom HTTP headers in API requests. Supported by all providers except Amazon Bedrock (which uses AWS SDK auth). Headers are merged with provider defaults and `model.headers`, with `options.headers` taking precedence.
|
|
3268
|
+
- Added `originator` option to `loginOpenAICodex()` for custom OAuth client identification
|
|
3269
|
+
- Browser compatibility for pi-ai: replaced top-level Node.js imports with dynamic imports for browser environments ([#873](https://github.com/badlogic/pi-mono/issues/873))
|
|
3270
|
+
|
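The header precedence from the `headers` entry above (provider defaults, then `model.headers`, then `options.headers` winning) can be sketched as a layered merge. The `mergeHeaders` helper and the lowercasing of header names are assumptions for the example, not documented behavior of the package.

```typescript
type HeaderMap = Record<string, string>;

// Later layers override earlier ones. Names are lowercased (an assumption)
// so that "X-Trace" and "x-trace" collapse into a single entry.
function mergeHeaders(...layers: Array<HeaderMap | undefined>): HeaderMap {
  const merged: HeaderMap = {};
  for (const layer of layers) {
    if (!layer) continue;
    for (const [name, value] of Object.entries(layer)) {
      merged[name.toLowerCase()] = value;
    }
  }
  return merged;
}
```

A call in the order `mergeHeaders(providerDefaults, modelHeaders, optionHeaders)` reproduces the precedence the entry describes.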
|
3271
|
+
### Fixed
|
|
3272
|
+
|
|
3273
|
+
- Fixed OpenAI Responses API 400 error "function_call without required reasoning item" when switching between models (same provider, different model). The fix omits the `id` field for function_calls from different models to avoid triggering OpenAI's reasoning/function_call pairing validation ([#886](https://github.com/badlogic/pi-mono/issues/886))
|
|
3274
|
+
|
|
3275
|
+
## [0.49.2] - 2026-01-19
|
|
3276
|
+
|
|
3277
|
+
### Added
|
|
3278
|
+
|
|
3279
|
+
- Added AWS credential detection for ECS/Kubernetes environments: `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI`, `AWS_CONTAINER_CREDENTIALS_FULL_URI`, `AWS_WEB_IDENTITY_TOKEN_FILE` ([#848](https://github.com/badlogic/pi-mono/issues/848))
|
|
3280
|
+
|
|
3281
|
+
### Fixed
|
|
3282
|
+
|
|
3283
|
+
- Fixed OpenAI Responses 400 error "reasoning without following item" by skipping errored/aborted assistant messages entirely in transform-messages.ts ([#838](https://github.com/badlogic/pi-mono/pull/838))
|
|
3284
|
+
|
|
3285
|
+
### Removed
|
|
3286
|
+
|
|
3287
|
+
- Removed `strictResponsesPairing` compat option (no longer needed after the transform-messages fix)
|
|
3288
|
+
|
|
3289
|
+
## [0.49.1] - 2026-01-18
|
|
3290
|
+
|
|
3291
|
+
### Added
|
|
3292
|
+
|
|
3293
|
+
- Added `OpenAIResponsesCompat` interface with `strictResponsesPairing` option for Azure OpenAI Responses API, which requires strict reasoning/message pairing in history replay ([#768](https://github.com/badlogic/pi-mono/pull/768) by [@nicobako](https://github.com/nicobako))
|
|
3294
|
+
|
|
3295
|
+
### Changed
|
|
3296
|
+
|
|
3297
|
+
- Split `OpenAICompat` into `OpenAICompletionsCompat` and `OpenAIResponsesCompat` for type-safe API-specific compat settings
|
|
3298
|
+
|
|
3299
|
+
### Fixed
|
|
3300
|
+
|
|
3301
|
+
- Fixed tool call ID normalization for cross-provider handoffs (e.g., Codex to Antigravity Claude) ([#821](https://github.com/badlogic/pi-mono/issues/821))
|
|
3302
|
+
|
|
3303
|
+
## [0.49.0] - 2026-01-17
|
|
3304
|
+
|
|
3305
|
+
### Changed
|
|
3306
|
+
|
|
3307
|
+
- OpenAI Codex responses now use the context system prompt directly in the instructions field.
|
|
3308
|
+
|
|
3309
|
+
### Fixed
|
|
3310
|
+
|
|
3311
|
+
- Fixed orphaned tool results after errored assistant messages causing Codex API errors. When an assistant message has `stopReason: "error"`, its tool calls are now excluded from pending tool tracking, preventing synthetic tool results from being generated for calls that will be dropped by provider-specific converters. ([#812](https://github.com/badlogic/pi-mono/issues/812))
|
|
3312
|
+
- Fixed Bedrock Claude max_tokens handling to always exceed thinking budget tokens, preventing compaction failures. ([#797](https://github.com/badlogic/pi-mono/pull/797) by [@pjtf93](https://github.com/pjtf93))
|
|
3313
|
+
- Fixed Claude Code tool name normalization to match the Claude Code tool list case-insensitively and remove invalid mappings.
|
|
3314
|
+
|
|
3315
|
+
## [0.48.0] - 2026-01-16
|
|
3316
|
+
|
|
3317
|
+
### Fixed
|
|
3318
|
+
|
|
3319
|
+
- Fixed OpenAI-compatible provider feature detection to use `model.provider` in addition to URL, allowing custom base URLs (e.g., proxies) to work correctly with provider-specific settings ([#774](https://github.com/badlogic/pi-mono/issues/774))
|
|
3320
|
+
- Fixed Gemini 3 context loss when switching from providers without thought signatures: unsigned tool calls are now converted to text with anti-mimicry notes instead of being skipped
|
|
3321
|
+
- Fixed string numbers in tool arguments not being coerced to numbers during validation ([#786](https://github.com/badlogic/pi-mono/pull/786) by [@dannote](https://github.com/dannote))
|
|
3322
|
+
- Fixed Bedrock tool call IDs to use only alphanumeric characters, avoiding API errors from invalid characters ([#781](https://github.com/badlogic/pi-mono/pull/781) by [@pjtf93](https://github.com/pjtf93))
|
|
3323
|
+
- Fixed empty error assistant messages (from 429/500 errors) breaking the tool_use to tool_result chain by filtering them in `transformMessages`
|
|
3324
|
+
|
|
3325
|
+
## [0.47.0] - 2026-01-16
|
|
3326
|
+
|
|
3327
|
+
### Fixed
|
|
3328
|
+
|
|
3329
|
+
- Fixed OpenCode provider's `/v1` endpoint to use `system` role instead of `developer` role, fixing `400 Incorrect role information` error for models using `openai-completions` API ([#755](https://github.com/badlogic/pi-mono/pull/755) by [@melihmucuk](https://github.com/melihmucuk))
|
|
3330
|
+
- Added retry logic to OpenAI Codex provider for transient errors (429, 5xx, connection failures). Uses exponential backoff with up to 3 retries. ([#733](https://github.com/badlogic/pi-mono/issues/733))
|
|
3331
|
+
|
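The retry policy in the OpenAI Codex entry above (transient 429/5xx errors, exponential backoff, up to 3 retries) can be sketched as below. The 1000 ms base delay, the doubling factor, and all function names are assumptions; the changelog states only the retry count and the error classes.

```typescript
// 429 and any 5xx status are treated as transient, per the entry.
function isTransientStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Delay before each retry: base, 2x base, 4x base, ...
function backoffDelays(maxRetries: number, baseDelayMs: number): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => baseDelayMs * 2 ** attempt);
}

// Runs `attempt`, retrying while `shouldRetry` accepts the error and retries remain.
async function withRetry<T>(
  attempt: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean,
  maxRetries: number = 3,
  baseDelayMs: number = 1000, // assumed value, not from the changelog
): Promise<T> {
  const delays = backoffDelays(maxRetries, baseDelayMs);
  for (let i = 0; ; i++) {
    try {
      return await attempt();
    } catch (err) {
      if (i >= delays.length || !shouldRetry(err)) throw err;
      await new Promise<void>((resolve) => setTimeout(resolve, delays[i]));
    }
  }
}
```

With three retries the request is attempted at most four times in total; non-transient errors such as a 401 are rethrown immediately when `shouldRetry` rejects them.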
|
3332
|
+
## [0.46.0] - 2026-01-15
|
|
3333
|
+
|
|
3334
|
+
### Added
|
|
3335
|
+
|
|
3336
|
+
- Added MiniMax China (`minimax-cn`) provider support ([#725](https://github.com/badlogic/pi-mono/pull/725) by [@tallshort](https://github.com/tallshort))
|
|
3337
|
+
- Added `gpt-5.2-codex` models for GitHub Copilot and OpenCode Zen providers ([#734](https://github.com/badlogic/pi-mono/pull/734) by [@aadishv](https://github.com/aadishv))
|
|
3338
|
+
|
|
3339
|
+
### Fixed
|
|
3340
|
+
|
|
3341
|
+
- Avoid unsigned Gemini 3 tool calls ([#741](https://github.com/badlogic/pi-mono/pull/741) by [@roshanasingh4](https://github.com/roshanasingh4))
|
|
3342
|
+
- Fixed signature support for non-Anthropic models in Amazon Bedrock provider ([#727](https://github.com/badlogic/pi-mono/pull/727) by [@unexge](https://github.com/unexge))
|
|
3343
|
+
|
|
3344
|
+
## [0.45.7] - 2026-01-13
|
|
3345
|
+
|
|
3346
|
+
### Fixed
|
|
3347
|
+
|
|
3348
|
+
- Fixed OpenAI Responses timeout option handling ([#706](https://github.com/badlogic/pi-mono/pull/706) by [@markusylisiurunen](https://github.com/markusylisiurunen))
|
|
3349
|
+
- Fixed Bedrock tool call conversion to apply message transforms ([#707](https://github.com/badlogic/pi-mono/pull/707) by [@pjtf93](https://github.com/pjtf93))
|
|
3350
|
+
|
|
3351
|
+
## [0.45.6] - 2026-01-13
|
|
3352
|
+
|
|
3353
|
+
### Fixed
|
|
3354
|
+
|
|
3355
|
+
- Export `parseStreamingJson` from main package for tsx dev mode compatibility
|
|
3356
|
+
|
|
3357
|
+
## [0.45.4] - 2026-01-13
|
|
3358
|
+
|
|
3359
|
+
### Added
|
|
3360
|
+
|
|
3361
|
+
- Added Vercel AI Gateway provider with model discovery and `AI_GATEWAY_API_KEY` env support ([#689](https://github.com/badlogic/pi-mono/pull/689) by [@timolins](https://github.com/timolins))
|
|
3362
|
+
|
|
3363
|
+
### Fixed
|
|
3364
|
+
|
|
3365
|
+
- Fixed z.ai thinking/reasoning: z.ai uses `thinking: { type: "enabled" }` instead of OpenAI's `reasoning_effort`. Added `thinkingFormat` compat flag to handle this. ([#688](https://github.com/badlogic/pi-mono/issues/688))
|
|
3366
|
+
|
|
3367
|
+
## [0.45.0] - 2026-01-13
|
|
3368
|
+
|
|
3369
|
+
### Added
|
|
3370
|
+
|
|
3371
|
+
- MiniMax provider support with M2 and M2.1 models via Anthropic-compatible API ([#656](https://github.com/badlogic/pi-mono/pull/656) by [@dannote](https://github.com/dannote))
|
|
3372
|
+
- Add Amazon Bedrock provider with prompt caching for Claude models (experimental, tested with Anthropic Claude models only) ([#494](https://github.com/badlogic/pi-mono/pull/494) by [@unexge](https://github.com/unexge))
|
|
3373
|
+
- Added `serviceTier` option for OpenAI Responses requests ([#672](https://github.com/badlogic/pi-mono/pull/672) by [@markusylisiurunen](https://github.com/markusylisiurunen))
|
|
3374
|
+
- **Anthropic caching on OpenRouter**: Interactions with Anthropic models via OpenRouter now set a 5-minute cache point using Anthropic-style `cache_control` breakpoints on the last assistant or user message. ([#584](https://github.com/badlogic/pi-mono/pull/584) by [@nathyong](https://github.com/nathyong))
|
|
3375
|
+
- **Google Gemini CLI provider improvements**: Added Antigravity endpoint fallback (tries daily sandbox then prod when `baseUrl` is unset), header-based retry delay parsing (`Retry-After`, `x-ratelimit-reset`, `x-ratelimit-reset-after`), stable `sessionId` derivation from first user message for cache affinity, empty SSE stream retry with backoff, and `anthropic-beta` header for Claude thinking models ([#670](https://github.com/badlogic/pi-mono/pull/670) by [@kim0](https://github.com/kim0))
|
|
3376
|
+
|
|
3377
|
+
## [0.43.0] - 2026-01-11
|
|
3378
|
+
|
|
3379
|
+
### Fixed
|
|
3380
|
+
|
|
3381
|
+
- Fixed Google provider thinking detection: `isThinkingPart()` now only checks `thought === true`, not `thoughtSignature`. Per Google docs, `thoughtSignature` is for context replay and can appear on any part type. Also removed `id` field from `functionCall`/`functionResponse` (rejected by Vertex AI and Cloud Code Assist), and added `textSignature` round-trip for multi-turn reasoning context. ([#631](https://github.com/badlogic/pi-mono/pull/631) by [@theBucky](https://github.com/theBucky))
|
|
3382
|
+
|
|
3383
|
+
## [0.42.3] - 2026-01-10
|
|
3384
|
+
|
|
3385
|
+
### Changed
|
|
3386
|
+
|
|
3387
|
+
- OpenAI Codex: switched to bundled system prompt matching opencode, changed originator to "pi", simplified prompt handling
|
|
3388
|
+
|
|
3389
|
+
## [0.42.2] - 2026-01-10
|
|
3390
|
+
|
|
3391
|
+
### Added
|
|
3392
|
+
|
|
3393
|
+
- Added `GOOGLE_APPLICATION_CREDENTIALS` env var support for Vertex AI credential detection (standard for CI/production).
|
|
3394
|
+
- Added `supportsUsageInStreaming` compatibility flag for OpenAI-compatible providers that reject `stream_options: { include_usage: true }`. Defaults to `true`. Set to `false` in model config for providers like gatewayz.ai. ([#596](https://github.com/badlogic/pi-mono/pull/596) by [@XesGaDeus](https://github.com/XesGaDeus))
|
|
3395
|
+
- Improved Google model pricing info ([#588](https://github.com/badlogic/pi-mono/pull/588) by [@aadishv](https://github.com/aadishv))
|
|
3396
|
+
|
|
3397
|
+
### Fixed
|
|
3398
|
+
|
|
3399
|
+
- Fixed `os.homedir()` calls at module load time; now resolved lazily when needed.
|
|
3400
|
+
- Fixed OpenAI Responses tool strict flag to use a boolean for LM Studio compatibility ([#598](https://github.com/badlogic/pi-mono/pull/598) by [@gnattu](https://github.com/gnattu))
|
|
3401
|
+
- Fixed Google Cloud Code Assist OAuth for paid subscriptions: properly handles long-running operations for project provisioning, supports `GOOGLE_CLOUD_PROJECT` / `GOOGLE_CLOUD_PROJECT_ID` env vars for paid tiers, and handles VPC-SC affected users ([#582](https://github.com/badlogic/pi-mono/pull/582) by [@cmf](https://github.com/cmf))
|
|
3402
|
+
|
|
3147
3403
|
## [0.42.0] - 2026-01-09
|
|
3148
3404
|
|
|
3149
3405
|
### Added
|
|
@@ -3237,7 +3493,7 @@
|
|
|
3237
3493
|
|
|
3238
3494
|
### Breaking Changes
|
|
3239
3495
|
|
|
3240
|
-
- **Agent API moved**: All agent functionality (`agentLoop`, `agentLoopContinue`, `AgentContext`, `AgentEvent`, `AgentTool`, `AgentToolResult`, etc.) has moved to `@
|
|
3496
|
+
- **Agent API moved**: All agent functionality (`agentLoop`, `agentLoopContinue`, `AgentContext`, `AgentEvent`, `AgentTool`, `AgentToolResult`, etc.) has moved to `@oh-my-pi/pi-agent-core`. Import from that package instead of `@oh-my-pi/pi-ai`.
|
|
3241
3497
|
|
|
3242
3498
|
### Added
|
|
3243
3499
|
|
|
@@ -3253,7 +3509,6 @@
|
|
|
3253
3509
|
### Fixed
|
|
3254
3510
|
|
|
3255
3511
|
- **OpenAI completions empty content blocks**: Empty text or thinking blocks in assistant messages are now filtered out before sending to the OpenAI completions API, preventing validation errors. ([#344](https://github.com/badlogic/pi-mono/pull/344) by [@default-anton](https://github.com/default-anton))
|
|
3256
|
-
- **Thinking token duplication**: Fixed thinking content duplication with chutes.ai provider. The provider was returning thinking content in both `reasoning_content` and `reasoning` fields, causing each chunk to be processed twice. Now only the first non-empty reasoning field is used.
|
|
3257
3512
|
- **zAi provider API mapping**: Fixed zAi models to use `openai-completions` API with correct base URL (`https://api.z.ai/api/coding/paas/v4`) instead of incorrect Anthropic API mapping. ([#344](https://github.com/badlogic/pi-mono/pull/344), [#358](https://github.com/badlogic/pi-mono/pull/358) by [@default-anton](https://github.com/default-anton))
|
|
3258
3513
|
|
|
3259
3514
|
## [0.28.0] - 2025-12-25
|
|
@@ -3283,11 +3538,8 @@
|
|
|
3283
3538
|
### Fixed
|
|
3284
3539
|
|
|
3285
3540
|
- **Gemini multimodal tool results**: Fixed images in tool results causing flaky/broken responses with Gemini models. For Gemini 3, images are now nested inside `functionResponse.parts` per the [docs](https://ai.google.dev/gemini-api/docs/function-calling#multimodal). For older models (which don't support multimodal function responses), images are sent in a separate user message.
|
|
3286
|
-
|
|
3287
3541
|
- **Queued message steering**: When `getQueuedMessages` is provided, the agent loop now checks for queued user messages after each tool call and skips remaining tool calls in the current assistant message when a queued message arrives (emitting error tool results).
|
|
3288
|
-
|
|
3289
3542
|
- **Double API version path in Google provider URL**: Fixed Gemini API calls returning 404 after baseUrl support was added. The SDK was appending its default apiVersion to baseUrl which already included the version path. ([#251](https://github.com/badlogic/pi-mono/pull/251) by [@shellfyred](https://github.com/shellfyred))
|
|
3290
|
-
|
|
3291
3543
|
- **Anthropic SDK retries disabled**: Re-enabled SDK-level retries (default 2) for transient HTTP failures. ([#252](https://github.com/badlogic/pi-mono/issues/252))
|
|
3292
3544
|
|
|
3293
3545
|
## [0.23.5] - 2025-12-19
|
|
@@ -3295,17 +3547,13 @@
|
|
|
3295
3547
|
### Added
|
|
3296
3548
|
|
|
3297
3549
|
- **Gemini 3 Flash thinking support**: Extended thinking level support for Gemini 3 Flash models (MINIMAL, LOW, MEDIUM, HIGH) to match Pro models' capabilities. ([#212](https://github.com/badlogic/pi-mono/pull/212) by [@markusylisiurunen](https://github.com/markusylisiurunen))
|
|
3298
|
-
|
|
3299
3550
|
- **GitHub Copilot thinking models**: Added thinking support for additional Copilot models (o3-mini, o1-mini, o1-preview). ([#234](https://github.com/badlogic/pi-mono/pull/234) by [@aadishv](https://github.com/aadishv))
|
|
3300
3551
|
|
|
3301
3552
|
### Fixed
|
|
3302
3553
|
|
|
3303
3554
|
- **Gemini tool result format**: Fixed tool result format for Gemini 3 Flash Preview which strictly requires `{ output: value }` for success and `{ error: value }` for errors. Previous format using `{ result, isError }` was rejected by newer Gemini models. Also improved type safety by removing `as any` casts. ([#213](https://github.com/badlogic/pi-mono/issues/213), [#220](https://github.com/badlogic/pi-mono/pull/220))
|
|
3304
|
-
|
|
3305
3555
|
- **Google baseUrl configuration**: Google provider now respects `baseUrl` configuration for custom endpoints or API proxies. ([#216](https://github.com/badlogic/pi-mono/issues/216), [#221](https://github.com/badlogic/pi-mono/pull/221) by [@theBucky](https://github.com/theBucky))
|
|
3306
|
-
|
|
3307
3556
|
- **GitHub Copilot vision requests**: Added `Copilot-Vision-Request` header when sending images to GitHub Copilot models. ([#222](https://github.com/badlogic/pi-mono/issues/222))
|
|
3308
|
-
|
|
3309
3557
|
- **GitHub Copilot X-Initiator header**: Fixed X-Initiator logic to check last message role instead of any message in history. This ensures proper billing when users send follow-up messages. ([#209](https://github.com/badlogic/pi-mono/issues/209))
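The Gemini tool result entry in this list describes a change of payload shape. A minimal sketch of that conversion, assuming a hypothetical `LegacyToolResult` type modeled on the `{ result, isError }` format named in the entry and a hypothetical helper `toGeminiToolResponse`; neither name comes from the package:

```typescript
// Hypothetical legacy shape, modeled on the `{ result, isError }` format
// that the changelog entry says newer Gemini models reject.
interface LegacyToolResult {
  result: unknown;
  isError: boolean;
}

// Shape the changelog entry says Gemini 3 Flash Preview requires:
// `{ output: value }` on success, `{ error: value }` on failure.
type GeminiToolResponse = { output: unknown } | { error: unknown };

// Illustrative converter only; not the package's actual implementation.
function toGeminiToolResponse(legacy: LegacyToolResult): GeminiToolResponse {
  return legacy.isError ? { error: legacy.result } : { output: legacy.result };
}
```

Modeling the response as a union of two object types lets the compiler check that exactly one of `output` or `error` is produced, which is in the spirit of the entry's note about removing `as any` casts.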
|
|
3310
3558
|
|
|
3311
3559
|
## [0.22.3] - 2025-12-16
|
|
@@ -3313,9 +3561,7 @@
|
|
|
3313
3561
|
### Added
|
|
3314
3562
|
|
|
3315
3563
|
- **Image limits test suite**: Added comprehensive tests for provider-specific image limitations (max images, max size, max dimensions). Discovered actual limits: Anthropic (100 images, 5MB, 8000px), OpenAI (500 images, ≥25MB), Gemini (~2500 images, ≥40MB), Mistral (8 images, ~15MB), OpenRouter (~40 images context-limited, ~15MB). ([#120](https://github.com/badlogic/pi-mono/pull/120))
|
|
3316
|
-
|
|
3317
3564
|
- **Tool result streaming**: Added `tool_execution_update` event and optional `onUpdate` callback to `AgentTool.execute()` for streaming tool output during execution. Tools can now emit partial results (e.g., bash stdout) that are forwarded to subscribers. ([#44](https://github.com/badlogic/pi-mono/issues/44))
|
|
3318
|
-
|
|
3319
3565
|
- **X-Initiator header for GitHub Copilot**: Added X-Initiator header handling for GitHub Copilot provider to ensure correct call accounting (agent calls are not deducted from quota). Sets initiator based on last message role. ([#200](https://github.com/badlogic/pi-mono/pull/200) by [@kim0](https://github.com/kim0))
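The X-Initiator entry says the initiator is derived from the role of the last message. A rough sketch of that rule, assuming the header takes the values `"user"` and `"agent"`, a hypothetical `Message` type, and a hypothetical `xInitiator` helper; the empty-history default is also an assumption:

```typescript
// Hypothetical minimal message shape; only the role matters here.
type Role = "user" | "assistant" | "toolResult";
interface Message {
  role: Role;
}

// Decide the initiator from the LAST message only, not from any message
// in the history. Assumption: an empty history counts as user-initiated.
function xInitiator(messages: Message[]): "user" | "agent" {
  if (messages.length === 0) return "user";
  const last = messages[messages.length - 1];
  return last.role === "user" ? "user" : "agent";
}
```

Checking only the final element is what separates a user follow-up from an agent continuation after a tool result, even when earlier messages in the history came from the other party.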
|
|
3320
3566
|
|
|
3321
3567
|
### Changed
|
|
@@ -3349,9 +3595,7 @@ _Dedicated to Peter's shoulder ([@steipete](https://twitter.com/steipete))_
|
|
|
3349
3595
|
### Fixed
|
|
3350
3596
|
|
|
3351
3597
|
- **GitHub Copilot gpt-5 models**: Fixed API selection for gpt-5 models to use `openai-responses` instead of `openai-completions` (gpt-5 models are not accessible via completions endpoint)
|
|
3352
|
-
|
|
3353
3598
|
- **GitHub Copilot cross-model context handoff**: Fixed context handoff failing when switching between GitHub Copilot models using different APIs (e.g., gpt-5 to claude-sonnet-4). Tool call IDs from OpenAI Responses API were incompatible with other models. ([#198](https://github.com/badlogic/pi-mono/issues/198))
|
|
3354
|
-
|
|
3355
3599
|
- **Gemini 3 Pro thinking levels**: Thinking level configuration now works correctly for Gemini 3 Pro models. Previously all levels mapped to -1 (minimal thinking). Now LOW/MEDIUM/HIGH properly control test-time computation. ([#176](https://github.com/badlogic/pi-mono/pull/176) by [@markusylisiurunen](https://github.com/markusylisiurunen))
|
|
3356
3600
|
|
|
3357
3601
|
## [0.18.2] - 2025-12-11
|
|
@@ -3369,9 +3613,7 @@ _Dedicated to Peter's shoulder ([@steipete](https://twitter.com/steipete))_
|
|
|
3369
3613
|
### Fixed
|
|
3370
3614
|
|
|
3371
3615
|
- Fixed Mistral 400 errors after aborted assistant messages by skipping empty assistant messages (no content, no tool calls) ([#165](https://github.com/badlogic/pi-mono/issues/165))
|
|
3372
|
-
|
|
3373
3616
|
- Removed synthetic assistant bridge message after tool results for Mistral (no longer required as of Dec 2025) ([#165](https://github.com/badlogic/pi-mono/issues/165))
|
|
3374
|
-
|
|
3375
3617
|
- Fixed bug where `ANTHROPIC_API_KEY` environment variable was deleted globally after first OAuth token usage, causing subsequent prompts to fail ([#164](https://github.com/badlogic/pi-mono/pull/164))
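The first Mistral entry in this list skips assistant messages that have neither content nor tool calls. A minimal sketch of such a filter, assuming hypothetical message types and a hypothetical `dropEmptyAssistantMessages` helper that are not taken from the package:

```typescript
// Hypothetical message shapes for illustration only.
interface AssistantMessage {
  role: "assistant";
  content: string;
  toolCalls: unknown[];
}
interface OtherMessage {
  role: "user" | "toolResult";
  content: string;
}
type ChatMessage = AssistantMessage | OtherMessage;

// Keep every non-assistant message, and keep assistant messages only when
// they carry text or at least one tool call.
function dropEmptyAssistantMessages(messages: ChatMessage[]): ChatMessage[] {
  return messages.filter(
    (m) => m.role !== "assistant" || m.content.length > 0 || m.toolCalls.length > 0,
  );
}
```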
|
|
3376
3618
|
|
|
3377
3619
|
## [0.17.0] - 2025-12-09
|
|
@@ -3380,9 +3622,7 @@ _Dedicated to Peter's shoulder ([@steipete](https://twitter.com/steipete))_
|
|
|
3380
3622
|
|
|
3381
3623
|
- **`agentLoopContinue` function**: Continue an agent loop from existing context without adding a new user message. Validates that the last message is `user` or `toolResult`. Useful for retry after context overflow or resuming from manually-added tool results.
|
|
3382
3624
|
- Added `validateToolCall(tools, toolCall)` helper that finds the tool by name and validates arguments.
|
|
3383
|
-
|
|
3384
3625
|
- **OpenAI compatibility overrides**: Added `compat` field to `Model` for `openai-completions` API, allowing explicit configuration of provider quirks (`supportsStore`, `supportsDeveloperRole`, `supportsReasoningEffort`, `maxTokensField`). Falls back to URL-based detection if not set. Useful for LiteLLM, custom proxies, and other non-standard endpoints. ([#133](https://github.com/badlogic/pi-mono/issues/133), thanks @fink-andreas for the initial idea and PR)
|
|
3385
|
-
|
|
3386
3626
|
- **xhigh reasoning level**: Added `xhigh` to `ReasoningEffort` type for OpenAI codex-max models. For non-OpenAI providers (Anthropic, Google), `xhigh` is automatically mapped to `high`. ([#143](https://github.com/badlogic/pi-mono/issues/143))
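The `xhigh` entry describes a fallback for providers that do not support the new level. A minimal sketch, assuming the other members of `ReasoningEffort`, the provider identifiers, and the helper name `effectiveEffort` (all hypothetical apart from `xhigh`, `high`, and the type name):

```typescript
// Only `xhigh` and `high` are named in the changelog; the remaining
// members are assumptions for illustration.
type ReasoningEffort = "minimal" | "low" | "medium" | "high" | "xhigh";

// Hypothetical provider identifiers.
type Provider = "openai" | "anthropic" | "google";

// Downgrade `xhigh` to `high` for non-OpenAI providers; pass every other
// level through unchanged.
function effectiveEffort(provider: Provider, effort: ReasoningEffort): ReasoningEffort {
  if (effort === "xhigh" && provider !== "openai") return "high";
  return effort;
}
```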
|
|
3387
3627
|
|
|
3388
3628
|
### Breaking Changes
|
|
@@ -3408,7 +3648,6 @@ _Dedicated to Peter's shoulder ([@steipete](https://twitter.com/steipete))_
|
|
|
3408
3648
|
### Fixed
|
|
3409
3649
|
|
|
3410
3650
|
- **OpenAI Token Counting**: Fixed `usage.input` to exclude cached tokens for OpenAI providers. Previously, `input` included cached tokens, causing double-counting when calculating total context size via `input + cacheRead`. Now `input` represents non-cached input tokens across all providers, making `input + output + cacheRead + cacheWrite` the correct formula for total context size.
|
|
3411
|
-
|
|
3412
3651
|
- **Fixed Claude Opus 4.5 cache pricing** (was 3x too expensive)
|
|
3413
3652
|
- Corrected cache_read: $1.50 → $0.50 per MTok
|
|
3414
3653
|
- Corrected cache_write: $18.75 → $6.25 per MTok
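The token counting entry gives `input + output + cacheRead + cacheWrite` as the formula for total context size. A small sketch of that arithmetic, assuming a hypothetical `Usage` interface with the four field names from the entry and a hypothetical `totalContextTokens` helper:

```typescript
// Field names follow the changelog entry; the interface itself is an
// assumption for illustration.
interface Usage {
  input: number; // non-cached input tokens
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Total context size per the changelog formula. Because `input` excludes
// cached tokens, adding `cacheRead` does not count them twice.
function totalContextTokens(usage: Usage): number {
  return usage.input + usage.output + usage.cacheRead + usage.cacheWrite;
}
```

For example, a request with 1,200 non-cached input tokens, 300 output tokens, 4,000 tokens read from cache, and 500 written to cache totals 6,000 tokens. Under the old behavior described in the entry, `input` would have been reported as 5,200 and the same sum would have overcounted by 4,000.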
|
package/dist/types/auth-broker/wire-schemas.d.ts
CHANGED
|
@@ -6,9 +6,11 @@
|
|
|
6
6
|
* in `./types.ts` 1:1; the types remain the source of truth for static typing,
|
|
7
7
|
* and `z.infer<typeof Schema>` is asserted-compatible with them where possible.
|
|
8
8
|
*
|
|
9
|
-
*
|
|
10
|
-
*
|
|
11
|
-
*
|
|
9
|
+
* Envelope and fixed-shape schemas use `.strict()` so unknown keys are
|
|
10
|
+
* rejected — the previous implementation used a hand-rolled `hasOnlyFields`
|
|
11
|
+
* allowlist for the same effect. The OAuth credential schema is the deliberate
|
|
12
|
+
* exception (`.loose()`): it preserves provider-specific extension fields so
|
|
13
|
+
* they round-trip through the broker instead of being dropped (see below).
|
|
12
14
|
*/
|
|
13
15
|
import { z } from "zod/v4";
|
|
14
16
|
/** Real OAuth credential (broker-side) — refresh token is the actual upstream value. */
|
|
@@ -21,7 +23,7 @@ export declare const oauthCredentialSchema: z.ZodObject<{
|
|
|
21
23
|
projectId: z.ZodOptional<z.ZodString>;
|
|
22
24
|
email: z.ZodOptional<z.ZodString>;
|
|
23
25
|
accountId: z.ZodOptional<z.ZodString>;
|
|
24
|
-
}, z.core.$
|
|
26
|
+
}, z.core.$loose>;
|
|
25
27
|
/** OAuth credential as it appears in broker snapshots — refresh replaced with sentinel. */
|
|
26
28
|
export declare const remoteOauthCredentialSchema: z.ZodObject<{
|
|
27
29
|
type: z.ZodLiteral<"oauth">;
|
|
@@ -32,7 +34,7 @@ export declare const remoteOauthCredentialSchema: z.ZodObject<{
|
|
|
32
34
|
email: z.ZodOptional<z.ZodString>;
|
|
33
35
|
accountId: z.ZodOptional<z.ZodString>;
|
|
34
36
|
refresh: z.ZodLiteral<"__remote__">;
|
|
35
|
-
}, z.core.$
|
|
37
|
+
}, z.core.$loose>;
|
|
36
38
|
export declare const apiKeyCredentialSchema: z.ZodObject<{
|
|
37
39
|
type: z.ZodLiteral<"api_key">;
|
|
38
40
|
key: z.ZodString;
|
|
@@ -47,7 +49,7 @@ export declare const writableAuthCredentialSchema: z.ZodDiscriminatedUnion<[z.Zo
|
|
|
47
49
|
projectId: z.ZodOptional<z.ZodString>;
|
|
48
50
|
email: z.ZodOptional<z.ZodString>;
|
|
49
51
|
accountId: z.ZodOptional<z.ZodString>;
|
|
50
|
-
}, z.core.$
|
|
52
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
51
53
|
type: z.ZodLiteral<"api_key">;
|
|
52
54
|
key: z.ZodString;
|
|
53
55
|
}, z.core.$strict>], "type">;
|
|
@@ -61,7 +63,7 @@ export declare const snapshotCredentialSchema: z.ZodDiscriminatedUnion<[z.ZodObj
|
|
|
61
63
|
email: z.ZodOptional<z.ZodString>;
|
|
62
64
|
accountId: z.ZodOptional<z.ZodString>;
|
|
63
65
|
refresh: z.ZodLiteral<"__remote__">;
|
|
64
|
-
}, z.core.$
|
|
66
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
65
67
|
type: z.ZodLiteral<"api_key">;
|
|
66
68
|
key: z.ZodString;
|
|
67
69
|
}, z.core.$strict>], "type">;
|
|
@@ -77,7 +79,7 @@ export declare const credentialSnapshotEntrySchema: z.ZodObject<{
|
|
|
77
79
|
email: z.ZodOptional<z.ZodString>;
|
|
78
80
|
accountId: z.ZodOptional<z.ZodString>;
|
|
79
81
|
refresh: z.ZodLiteral<"__remote__">;
|
|
80
|
-
}, z.core.$
|
|
82
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
81
83
|
type: z.ZodLiteral<"api_key">;
|
|
82
84
|
key: z.ZodString;
|
|
83
85
|
}, z.core.$strict>], "type">;
|
|
@@ -95,7 +97,7 @@ export declare const snapshotEntrySchema: z.ZodObject<{
|
|
|
95
97
|
email: z.ZodOptional<z.ZodString>;
|
|
96
98
|
accountId: z.ZodOptional<z.ZodString>;
|
|
97
99
|
refresh: z.ZodLiteral<"__remote__">;
|
|
98
|
-
}, z.core.$
|
|
100
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
99
101
|
type: z.ZodLiteral<"api_key">;
|
|
100
102
|
key: z.ZodString;
|
|
101
103
|
}, z.core.$strict>], "type">;
|
|
@@ -130,7 +132,7 @@ export declare const snapshotResponseSchema: z.ZodObject<{
|
|
|
130
132
|
email: z.ZodOptional<z.ZodString>;
|
|
131
133
|
accountId: z.ZodOptional<z.ZodString>;
|
|
132
134
|
refresh: z.ZodLiteral<"__remote__">;
|
|
133
|
-
}, z.core.$
|
|
135
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
134
136
|
type: z.ZodLiteral<"api_key">;
|
|
135
137
|
key: z.ZodString;
|
|
136
138
|
}, z.core.$strict>], "type">;
|
|
@@ -161,7 +163,7 @@ export declare const snapshotStreamSnapshotEventSchema: z.ZodObject<{
|
|
|
161
163
|
email: z.ZodOptional<z.ZodString>;
|
|
162
164
|
accountId: z.ZodOptional<z.ZodString>;
|
|
163
165
|
refresh: z.ZodLiteral<"__remote__">;
|
|
164
|
-
}, z.core.$
|
|
166
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
165
167
|
type: z.ZodLiteral<"api_key">;
|
|
166
168
|
key: z.ZodString;
|
|
167
169
|
}, z.core.$strict>], "type">;
|
|
@@ -193,7 +195,7 @@ export declare const snapshotStreamEntryEventSchema: z.ZodObject<{
|
|
|
193
195
|
email: z.ZodOptional<z.ZodString>;
|
|
194
196
|
accountId: z.ZodOptional<z.ZodString>;
|
|
195
197
|
refresh: z.ZodLiteral<"__remote__">;
|
|
196
|
-
}, z.core.$
|
|
198
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
197
199
|
type: z.ZodLiteral<"api_key">;
|
|
198
200
|
key: z.ZodString;
|
|
199
201
|
}, z.core.$strict>], "type">;
|
|
@@ -237,7 +239,7 @@ export declare const snapshotStreamEventSchema: z.ZodDiscriminatedUnion<[z.ZodOb
|
|
|
237
239
|
email: z.ZodOptional<z.ZodString>;
|
|
238
240
|
accountId: z.ZodOptional<z.ZodString>;
|
|
239
241
|
refresh: z.ZodLiteral<"__remote__">;
|
|
240
|
-
}, z.core.$
|
|
242
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
241
243
|
type: z.ZodLiteral<"api_key">;
|
|
242
244
|
key: z.ZodString;
|
|
243
245
|
}, z.core.$strict>], "type">;
|
|
@@ -267,7 +269,7 @@ export declare const snapshotStreamEventSchema: z.ZodDiscriminatedUnion<[z.ZodOb
|
|
|
267
269
|
email: z.ZodOptional<z.ZodString>;
|
|
268
270
|
accountId: z.ZodOptional<z.ZodString>;
|
|
269
271
|
refresh: z.ZodLiteral<"__remote__">;
|
|
270
|
-
}, z.core.$
|
|
272
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
271
273
|
type: z.ZodLiteral<"api_key">;
|
|
272
274
|
key: z.ZodString;
|
|
273
275
|
}, z.core.$strict>], "type">;
|
|
@@ -364,7 +366,7 @@ export declare const credentialRefreshResponseSchema: z.ZodObject<{
|
|
|
364
366
|
email: z.ZodOptional<z.ZodString>;
|
|
365
367
|
accountId: z.ZodOptional<z.ZodString>;
|
|
366
368
|
refresh: z.ZodLiteral<"__remote__">;
|
|
367
|
-
}, z.core.$
|
|
369
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
368
370
|
type: z.ZodLiteral<"api_key">;
|
|
369
371
|
key: z.ZodString;
|
|
370
372
|
}, z.core.$strict>], "type">;
|
|
@@ -388,7 +390,7 @@ export declare const credentialUploadRequestSchema: z.ZodObject<{
|
|
|
388
390
|
projectId: z.ZodOptional<z.ZodString>;
|
|
389
391
|
email: z.ZodOptional<z.ZodString>;
|
|
390
392
|
accountId: z.ZodOptional<z.ZodString>;
|
|
391
|
-
}, z.core.$
|
|
393
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
392
394
|
type: z.ZodLiteral<"api_key">;
|
|
393
395
|
key: z.ZodString;
|
|
394
396
|
}, z.core.$strict>], "type">;
|
|
@@ -406,7 +408,7 @@ export declare const credentialUploadResponseSchema: z.ZodObject<{
|
|
|
406
408
|
email: z.ZodOptional<z.ZodString>;
|
|
407
409
|
accountId: z.ZodOptional<z.ZodString>;
|
|
408
410
|
refresh: z.ZodLiteral<"__remote__">;
|
|
409
|
-
}, z.core.$
|
|
411
|
+
}, z.core.$loose>, z.ZodObject<{
|
|
410
412
|
type: z.ZodLiteral<"api_key">;
|
|
411
413
|
key: z.ZodString;
|
|
412
414
|
}, z.core.$strict>], "type">;
|
package/dist/types/auth-storage.d.ts
CHANGED
package/dist/types/provider-details.d.ts
CHANGED
|
@@ -14,7 +14,7 @@ export interface ProviderDetailsContext {
|
|
|
14
14
|
authMode?: string;
|
|
15
15
|
/**
|
|
16
16
|
* Human-readable description of the active credential, e.g.
|
|
17
|
-
* `"broker http://
|
|
17
|
+
* `"broker http://omp.internal:8765 · oauth #5 (foo@bar.com)"`.
|
|
18
18
|
* Rendered as a `Source` field; omitted when undefined.
|
|
19
19
|
*/
|
|
20
20
|
credentialSource?: string;
|
package/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"type": "module",
|
|
3
3
|
"name": "@oh-my-pi/pi-ai",
|
|
4
|
-
"version": "15.13.
|
|
4
|
+
"version": "15.13.1",
|
|
5
5
|
"description": "Unified LLM API with automatic model discovery and provider configuration",
|
|
6
6
|
"homepage": "https://omp.sh",
|
|
7
7
|
"author": "Can Boluk",
|
|
@@ -38,8 +38,8 @@
|
|
|
38
38
|
},
|
|
39
39
|
"dependencies": {
|
|
40
40
|
"@bufbuild/protobuf": "^2.12.0",
|
|
41
|
-
"@oh-my-pi/pi-catalog": "15.13.
|
|
42
|
-
"@oh-my-pi/pi-utils": "15.13.
|
|
41
|
+
"@oh-my-pi/pi-catalog": "15.13.1",
|
|
42
|
+
"@oh-my-pi/pi-utils": "15.13.1",
|
|
43
43
|
"partial-json": "^0.1.7",
|
|
44
44
|
"zod": "^4"
|
|
45
45
|
},
|
package/src/auth-broker/wire-schemas.ts
CHANGED
|
@@ -6,9 +6,11 @@
|
|
|
6
6
|
* in `./types.ts` 1:1; the types remain the source of truth for static typing,
|
|
7
7
|
* and `z.infer<typeof Schema>` is asserted-compatible with them where possible.
|
|
8
8
|
*
|
|
9
|
-
*
|
|
10
|
-
*
|
|
11
|
-
*
|
|
9
|
+
* Envelope and fixed-shape schemas use `.strict()` so unknown keys are
|
|
10
|
+
* rejected — the previous implementation used a hand-rolled `hasOnlyFields`
|
|
11
|
+
* allowlist for the same effect. The OAuth credential schema is the deliberate
|
|
12
|
+
* exception (`.loose()`): it preserves provider-specific extension fields so
|
|
13
|
+
* they round-trip through the broker instead of being dropped (see below).
|
|
12
14
|
*/
|
|
13
15
|
import { z } from "zod/v4";
|
|
14
16
|
import { REMOTE_REFRESH_SENTINEL } from "../auth-storage";
|
|
@@ -38,7 +40,15 @@ export const oauthCredentialSchema = z
|
|
|
38
40
|
email: z.string().optional(),
|
|
39
41
|
accountId: z.string().optional(),
|
|
40
42
|
})
|
|
41
|
-
|
|
43
|
+
// `.loose()`, not `.strict()`: OAuth credentials carry an open set of
|
|
44
|
+
// provider-specific extension fields beyond the base shape above — e.g. an
|
|
45
|
+
// MCP server's tokenUrl/clientId/clientSecret/resource embedded so token
|
|
46
|
+
// refresh works without an `auth` block in config. The storage layer
|
|
47
|
+
// (`serializeCredential`/`deserializeCredential`/`exportSnapshot`) already
|
|
48
|
+
// preserves unknown OAuth fields generically; the wire schema must match or
|
|
49
|
+
// the broker set->get round-trip silently strips them and the credential
|
|
50
|
+
// can no longer refresh after reload. Envelope schemas stay `.strict()`.
|
|
51
|
+
.loose();
|
|
42
52
|
|
|
43
53
|
/** OAuth credential as it appears in broker snapshots — refresh replaced with sentinel. */
|
|
44
54
|
export const remoteOauthCredentialSchema = oauthCredentialSchema.extend({
|
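The comment added in this hunk explains why the OAuth schema moved from `.strict()` to `.loose()`. The following dependency-free sketch illustrates the behavioral difference only; it does not use zod and is not the package's schema. `KNOWN_KEYS`, `parseStrict`, and `parseLoose` are hypothetical names, the key list contains only the fields visible in this diff, and neither function validates field types:

```typescript
type Credential = Record<string, unknown>;

// Base-shape keys visible in this diff; the real schema may define more.
const KNOWN_KEYS = new Set<string>(["type", "refresh", "projectId", "email", "accountId"]);

// Strict behavior: any key outside the allowlist is an error, similar in
// effect to the `unrecognized_keys` rejection described in the changelog.
function parseStrict(input: Credential): Credential {
  const extra = Object.keys(input).filter((key) => !KNOWN_KEYS.has(key));
  if (extra.length > 0) {
    throw new Error(`unrecognized_keys: ${extra.join(", ")}`);
  }
  return { ...input };
}

// Loose behavior: unknown keys are accepted and preserved, so extension
// fields such as `tokenUrl` survive a set->get round-trip.
function parseLoose(input: Credential): Credential {
  return { ...input };
}
```

With an MCP-style credential that embeds `tokenUrl` and `clientId`, `parseStrict` throws while `parseLoose` returns the object with both extension fields intact, which is the round-trip property the changelog entry for 15.13.1 describes.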
package/src/auth-storage.ts
CHANGED
package/src/provider-details.ts
CHANGED
|
@@ -18,7 +18,7 @@ export interface ProviderDetailsContext {
|
|
|
18
18
|
authMode?: string;
|
|
19
19
|
/**
|
|
20
20
|
* Human-readable description of the active credential, e.g.
|
|
21
|
-
* `"broker http://
|
|
21
|
+
* `"broker http://omp.internal:8765 · oauth #5 (foo@bar.com)"`.
|
|
22
22
|
* Rendered as a `Source` field; omitted when undefined.
|
|
23
23
|
*/
|
|
24
24
|
credentialSource?: string;
|