@oh-my-pi/pi-ai 15.12.4 → 15.13.1
- package/CHANGELOG.md +319 -55
- package/dist/types/auth-broker/wire-schemas.d.ts +19 -17
- package/dist/types/auth-storage.d.ts +1 -1
- package/dist/types/provider-details.d.ts +1 -1
- package/dist/types/providers/anthropic-client.d.ts +2 -0
- package/dist/types/providers/google-gemini-cli.d.ts +1 -1
- package/package.json +3 -3
- package/src/auth-broker/wire-schemas.ts +14 -4
- package/src/auth-storage.ts +1 -1
- package/src/provider-details.ts +1 -1
- package/src/providers/amazon-bedrock.ts +19 -1
- package/src/providers/anthropic-client.ts +2 -0
- package/src/providers/anthropic.ts +14 -7
- package/src/providers/azure-openai-responses.ts +9 -1
- package/src/providers/google-gemini-cli.ts +35 -3
- package/src/providers/google-shared.ts +14 -2
- package/src/providers/ollama.ts +19 -1
- package/src/providers/openai-codex-responses.ts +27 -4
- package/src/providers/openai-completions.ts +40 -7
- package/src/providers/openai-responses-shared.ts +4 -1
- package/src/providers/openai-responses.ts +9 -4
- package/src/registry/oauth/gitlab-duo.ts +8 -3
- package/src/registry/zai.ts +1 -1
- package/src/utils/openai-http.ts +4 -0
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,37 @@
 
 ## [Unreleased]
 
+## [15.13.1] - 2026-06-15
+
+### Fixed
+
+- Fixed the auth-broker (`OMP_AUTH_BROKER_URL`) rejecting OAuth credentials that carry provider-specific extension fields (e.g. an MCP server's `tokenUrl`/`clientId`/`clientSecret`/`resource` embedded for self-contained token refresh): the OAuth credential wire schema was `.strict()`, so `POST /v1/credential` failed with `400 unrecognized_keys` and a broker-backed MCP reauth reported success while the reloaded credential lacked its refresh material and could no longer refresh. The OAuth wire schema now uses `.loose()` to preserve unknown fields — matching the field-preserving local SQLite store — so extra OAuth fields round-trip through broker set->get (envelope and API-key schemas stay strict).
+
+## [15.13.0] - 2026-06-14
+
+### Fixed
+
+- Fixed OpenAI Responses/Realtime SSE stream handler crashing with "Error Code undefined: undefined" when parsing error events with nested error details by falling back to the nested error object fields.
+- Fixed OpenAI-compatible providers that reject forced `tool_choice` on thinking-required models by downgrading unsupported forced choices to `auto` while keeping tools available ([#2546](https://github.com/can1357/oh-my-pi/issues/2546)).
+- Fixed GitHub Copilot Anthropic transport (`api.githubcopilot.com/v1/messages`) returning `400 tools.0.custom.eager_input_streaming: Extra inputs are not permitted` on every tool-bearing turn by stopping the emission of the per-tool `eager_input_streaming` flag and the `fine-grained-tool-streaming-2025-05-14` beta header on the Copilot transport — the proxy whitelists neither ([#2558](https://github.com/can1357/oh-my-pi/issues/2558)).
+- Disabled Bun's native ~300s pre-response `fetch` timeout in every streaming provider (OpenAI completions/responses, Azure responses, Anthropic, Codex SSE, Bedrock, Gemini CLI, Ollama). The configurable first-event/idle/SDK watchdogs (`PI_STREAM_FIRST_EVENT_TIMEOUT_MS`, `PI_OPENAI_STREAM_IDLE_TIMEOUT_MS`, `compat.streamIdleTimeoutMs`) were silently capped by Bun's hidden ceiling, so cold large-context streams (e.g. self-hosted vLLM at multi-hundred-K prompts) died at exactly 300s with `TimeoutError: The operation timed out.` Direct callers of `./providers/{amazon-bedrock,google-gemini-cli,ollama,openai-codex-responses}` (which bypass `register-builtins`' iterator-level watchdog) now install a pre-response `AbortSignal.timeout(firstEventTimeoutMs)` alongside the disable, so a stalled upstream still fails within the configured budget instead of hanging forever ([#2422](https://github.com/can1357/oh-my-pi/issues/2422))
+- Fixed Gemini / Antigravity streams (Google Cloud Code Assist API) creating a trailing empty text block and emitting redundant `text_start`/`text_delta`/`text_end` events at the end of the turn when the final SSE chunk contains an empty text part (`text: ""`). The parser now ignores empty text parts, preserving the active transcript block state and ensuring proper nesting and rendering of subsequent background jobs or new turns.
+- Preserved terminal Google `thoughtSignature`s by still extracting and applying the signature on the active block even when the text part is empty or undefined.
+- Stopped Gemini Antigravity sessions (`gemini-3*` / Claude under Cloud Code Assist) from leaking system rule reminders and personality preambles into the final response, by appending an explicit 'do not output rule checks' instruction to the injected system parts.
+- Fixed Gemini / Antigravity streams (Google Cloud Code Assist API) letting a `functionCall` part's own `thoughtSignature` clobber the preceding text or thinking block's signature on `think → tool` and `text → tool` turns. A signed function-call part has `text: undefined`, so it fell into the terminal-signature branch while the prior block was still active; that branch now skips function-call parts, leaving the tool call's signature on the tool call where it belongs and preventing corrupted signatures on same-model replay.
+- Fixed MiniMax-M3 OpenAI-compatible streams rendering reasoning twice when the same chunk carried both `<think>…</think>` content and structured `reasoning_content`; structured reasoning now wins and cumulative MiniMax reasoning snapshots are collapsed to deltas using a per-signature snapshot tracker that survives the `</think>`-to-text block transition (so post-answer cumulative snapshots don't reinstate a duplicate thinking block). ([#2433](https://github.com/can1357/oh-my-pi/issues/2433))
+
+## [15.12.6] - 2026-06-14
+
+### Changed
+
+- Bumped Z.AI (GLM Coding Plan) API key validation probe to glm-5.2.
+
+### Fixed
+
+- Fixed tool schema conversion for non-Cloud Code Assist Google Gemini models by normalizing parameters with `normalizeSchemaForGoogle` to prevent un-normalized schema properties (such as `additionalProperties: false` or type arrays) from causing Gemini API errors.
+- Fixed OpenAI-family request builders dropping forced named `tool_choice` directives when the named tool is absent from the serialized `tools` array, preventing spec-strict providers from rejecting self-inconsistent requests. ([#1701](https://github.com/can1357/oh-my-pi/issues/1701))
+
 ## [15.12.4] - 2026-06-13
 
 ### Added
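The 15.13.1 entry in the hunk above hinges on the difference between a strict and a loose object schema. The following is a minimal, dependency-free sketch of that difference. It does not use the package's real wire schemas: `KNOWN_OAUTH_KEYS`, `parseStrict`, `parseLoose`, and every field value are hypothetical names chosen for illustration.

```typescript
// Hypothetical field names for illustration; the real wire schema may differ.
const KNOWN_OAUTH_KEYS: string[] = ["type", "access", "refresh", "expires"];

type Credential = { [key: string]: unknown };

// Strict: any key outside the known list is an error (the pre-fix behavior).
function parseStrict(input: Credential): Credential {
  const unknown = Object.keys(input).filter((k) => KNOWN_OAUTH_KEYS.indexOf(k) === -1);
  if (unknown.length > 0) {
    throw new Error(`unrecognized_keys: ${unknown.join(", ")}`);
  }
  return { ...input };
}

// Loose: known keys are still required, unknown keys pass through untouched.
function parseLoose(input: Credential): Credential {
  for (const key of KNOWN_OAUTH_KEYS) {
    if (!(key in input)) throw new Error(`missing key: ${key}`);
  }
  return { ...input };
}

const mcpCredential: Credential = {
  type: "oauth",
  access: "a-token",
  refresh: "r-token",
  expires: 1750000000000,
  tokenUrl: "https://example.invalid/token", // extension field
  clientId: "example-client", // extension field
};

let strictError = "";
try {
  parseStrict(mcpCredential);
} catch (err) {
  strictError = (err as Error).message;
}

const roundTripped = parseLoose(mcpCredential);
console.log(strictError); // unrecognized_keys: tokenUrl, clientId
console.log(roundTripped.tokenUrl); // https://example.invalid/token
```

Under these assumptions the strict parser rejects the credential outright, while the loose parser hands back the extension fields unchanged, which is the round-trip property the entry describes.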
@@ -276,6 +307,63 @@
 
 - Removed the dead `iterateUntilAbort` helper (superseded by `iterateWithIdleTimeout`); it leaked the upstream iterator when the consumer abandoned mid-yield and had no production call sites.
 
+## [15.10.10] - 2026-06-09
+
+### Added
+
+- Exported `wrapFetchForCch` so non-streaming OAuth callers (e.g. the web-search provider) can patch the Claude Code billing-header `cch` attestation into their request bodies instead of shipping the `cch=00000` placeholder.
+
+### Fixed
+
+- Fixed an unbounded, zero-backoff Codex WebSocket reconnect loop on `websocket_connection_limit_reached`: the no-content reconnect path never consulted the retry budget and never waited, hammering the endpoint forever when the limit is account-scoped. Reconnects are now budgeted and delayed like every other WS retry path, falling back to a single SSE replay when exhausted.
+- Fixed the Codex whitespace-loop breaker not observing degenerate frames that arrive after their item closed (or before it opened) — those frames count as stream progress, so the idle watchdogs never fired and the turn hung forever, which is exactly the failure mode the breaker exists for. Whitespace-loop recovery now also refuses to replay the turn once a `toolcall_end` was delivered, surfacing the error instead of re-emitting the same tool calls.
+- Fixed the two remaining Codex retry paths (WS mid-stream reconnect and the empty-content SSE fallback) leaking blockless native output items (e.g. `web_search_call`) from the failed attempt into the replayed turn's `providerPayload` and append baseline.
+- Fixed Codex WebSocket failure handling closing whatever connection currently occupies the session slot — including a concurrent caller's in-flight CONNECTING handshake, whose rejection (`websocket closed before open`) is classified fatal and disabled WebSockets for the whole session. Failure cleanup now skips CONNECTING sockets and the pool re-joins replacement handshakes (bounded).
+- Fixed the Codex request transformer not repairing orphan `custom_tool_call_output` items (only `function_call_output` was folded into an assistant note) — a compaction splice that dropped an `apply_patch` call while keeping its result produced a hard 400 on the default GPT-5 Codex toolset.
+- Fixed `processResponsesStream` finalizing reasoning items via a bare `itemId` content scan instead of the routed entry: with id-less reasoning items (local hosts), every `output_item.done` matched the FIRST thinking block — the second item's text clobbered it and the second block was never finalized or signed.
+- Fixed `processResponsesStream` dropping tool calls and message text whose `output_item.added` event was lost (lossy proxies): `toolcall_end` was emitted with a dangling contentIndex while the call never entered `message.content`, so the agent loop silently never executed it. The done handler now synthesizes the missing block; still-open tool-call blocks are also final-parsed at `response.completed` so the `toolUse` override cannot hand the agent stale `{}` arguments.
+- Fixed `response.incomplete` with `incomplete_details.reason: "content_filter"` being reported as a token-cap truncation (`stopReason: "length"`) — the agent loop's length recovery then asked the model to "shorten" a filtered prompt. Content-filtered turns now surface as errors; usage is also populated from `response.failed` events, and an unknown terminal status degrades to `"stop"` with a logged anomaly instead of throwing away a fully-streamed response.
+- Fixed Copilot `premiumRequests` accounting being dropped from failed/cancelled responses: `populateResponsesUsageFromResponse` replaced `usage` wholesale and the error path threw before the success-path re-apply. The populate now preserves the field.
+- Fixed `deduplicateToolCallIds` suffixing the whole composite Responses id (`callId|itemId`) — `normalizeResponsesToolCallId` extracts the first segment as the wire `call_id` at encode time, so both copies collapsed back onto one `call_id` and the request carried duplicate call/output pairs. The suffix and length budget now apply per segment.
+- Gated native history payload replay on api + model id in both Responses providers: after a mid-session model switch, reasoning items carrying encrypted content minted by the previous model were replayed verbatim under the new model. Replay now falls back to block re-encode (which already strips foreign signatures), matching `transformMessages`' same-model trust rule.
+- Fixed Azure OpenAI Responses requests omitting `store: false` while requesting `reasoning.encrypted_content` (stateless-only per OpenAI), replaying custom tool calls paired with mismatched `function_call_output` items (customCallIds was never threaded through), letting the SDK's internal retries (maxRetries 5) silently re-POST inside the explicit first-event deadline, and sending a `prompt_cache_key` when the caller opted out via `cacheRetention: "none"`.
+- Fixed strict-pairing Responses backends (Azure, Copilot) silently discarding tool results whose call is absent from history — the result is now folded into an assistant note (same shape as orphan-output repair) so the model keeps the information.
+- Fixed the OpenAI Responses first-event watchdog staying armed across the `onResponse` notification callback (a slow callback aborted an already-connected stream), Copilot transient-model retries re-attempting on an already-aborted signal (instant dead retry surfacing the scheduler's AbortError), Codex `reasoningSummary: null` being coerced to `"auto"` (the documented omit-summary contract was unreachable), nested Codex error codes (`response.error.code`) being invisible to the connection-limit/previous-response recovery matchers, and the session id leaking unredacted into `PI_CODEX_DEBUG` logs via the `x-client-request-id` header.
+- Fixed `processResponsesStream` (shared by `openai-responses` and `azure-openai-responses`) ignoring the terminal `response.incomplete` event: a max-output-tokens-truncated response ended with `stopReason: "stop"`, zero usage, and no cost instead of `"length"` with the reported token counts. `response.incomplete` is now handled alongside `response.completed` and counts as stream progress for the idle watchdogs.
+- Fixed custom tool-call content blocks keeping the transient `partialJson` accumulation buffer (and a potentially stale `arguments.input`) after `response.output_item.done` in the shared Responses stream processor — the function_call branch already cleaned these up.
+- Fixed two OpenAI Codex stream-retry paths (whitespace-loop recovery and retryable provider errors) leaking native output items from the abandoned attempt into the replayed turn's `providerPayload` — stale reasoning items completed before the failure were re-sent as history input on subsequent requests alongside the retry's own items.
+- Fixed the Codex WebSocket queue wiping already-received frames when a transport error arrived: a `response.completed` queued just before an eager server close was discarded, turning a finished response into a spurious `websocket closed` failure and a full request replay. Errors now append behind pending data frames.
+- Fixed concurrent `getOrCreateCodexWebSocketConnection` callers (prewarm racing the first request) tearing down each other's in-flight handshake — closing a CONNECTING socket rejected the other caller with a fatal `websocket closed before open`, disabling WebSockets for the entire session. Callers now join the pending handshake.
+- Stopped the Codex connection-limit recovery from replaying a turn over SSE after a `toolcall_end` had already been delivered to the consumer (`canSafelyReplayWebsocketOverSse` guard was bypassed, re-emitting the same tool calls); the error now surfaces instead.
+- Extended the Codex whitespace-only argument-delta circuit breaker to `custom_tool_call_input.delta` frames, which counted as stream progress and could keep a degenerate response alive forever with no cap on buffer growth.
+- Fixed Codex stream failures during transport open reporting a synthetic request dump (empty URL/body) instead of the real request, and a `response.created` event resetting the recorded time-to-first-token.
+- Fixed the Codex WebSocket connect watchdog timer leaking (pinning the event loop for up to 10s) when the request signal aborted before or during the handshake.
+- Fixed OpenRouter-hosted Anthropic adaptive reasoning models (Claude Fable/Mythos 5 and Opus 4.6+) so the catalog exposes `xhigh`; Fable/Mythos and Opus 4.7+ requests now map user `high`/`xhigh` onto OpenRouter's Anthropic `xhigh`/`max` effort scale.
+- Fixed an unknown Anthropic `stop_reason` failing the whole turn after the response had fully streamed. `mapStopReason` threw on unrecognized values, and since the reason arrives on the trailing `message_delta` the error was unretryable — the live `model_context_window_exceeded` stop reason (default on Sonnet 4.5+) hit this path. It now maps to `length`, and any future unknown reason degrades to a logged anomaly plus a normal `stop` instead of an error.
+- Stopped clamping API-key Anthropic requests to Claude Code's 64k output cap. The `CLAUDE_CODE_MAX_OUTPUT_TOKENS` clamp exists to match the OAuth wire fingerprint, but `buildParams` applied it unconditionally, silently halving the output budget of 128k-output models (e.g. Opus 4.8) for API-key callers. OAuth requests keep the clamp.
+- Stopped a successful strict-tools fallback from shipping `errorMessage` on a `stopReason: "stop"` assistant message. After a grammar-too-large 400 triggered the non-strict retry, the original 400 text was kept on the final message even when the retry succeeded — consumers that treat `errorMessage` presence as failure (e.g. balance probes) misclassified the turn, and the stale text suppressed later refusal explanations. The fallback is now logged instead.
+- Fixed model-supplied `User-Agent` headers being silently dropped on non-OAuth Anthropic requests. `enforcedHeaderKeys` filtered the header out of `modelHeaders` in every branch but only the OAuth branch set one back; the Cloudflare-gateway, bearer-gateway, and `X-Api-Key` branches now forward the caller's value verbatim.
+- Stopped sending the `fast-mode-2026-02-01` beta header once a session has learned the endpoint+model rejects fast mode (`fastModeDisabled` provider state), matching the already-dropped `speed` param.
+- Stopped `buildAnthropicHeaders` defaulting API-key requests onto the full Claude Code OAuth beta list (`oauth-2025-04-20`, `claude-code-20250219`, …). The `claudeCodeBetas` default is now OAuth-gated, matching the streaming path — the web-search header builder was the only caller hitting the default, so API-key search requests now carry just their own betas (e.g. `web-search-2025-03-05`). An empty `anthropic-beta` header is omitted entirely instead of being sent as an empty string.
+- Fixed image-bearing `developer` messages being upgraded to mid-conversation `system` turns on Opus 4.8+/Fable/Mythos 5. System content is text-only on the wire, so a developer turn carrying image blocks in an upgrade-eligible position produced a 400; it now stays a `user` message.
+- Fixed a spliced reconnect's second envelope overwriting the completed Anthropic message: `message_delta` was not gated by the terminal-stop flag (content events and duplicate `message_start` were), so the splice's `stop_reason`/usage replaced the finished turn's — a `tool_use` turn could be relabeled `stop`, and the harness then never executed the streamed tool calls. Post-terminal deltas are now logged as envelope anomalies and skipped.
+- Fixed a `ping` arriving before `message_start` consuming the Anthropic first-event watchdog: the stall was then classified as a terminal mid-stream idle timeout instead of a retryable first-event timeout. Pings no longer count as the first item but still refresh the idle deadline once content is flowing.
+- Fixed Anthropic-compatible proxies that omit `usage`/`delta` objects from `message_start`/`message_delta`/`content_block_*` envelopes crashing the turn with an unretryable `TypeError`; the missing payloads now degrade to logged envelope anomalies like every other malformed-frame case.
+- Fixed `applyPromptCaching` placing `cache_control` on `thinking`/`redacted_thinking` blocks — Anthropic rejects that with a 400. A thinking-only assistant turn inside the trailing cache window (e.g. followed by the synthetic `Continue.` pad) no longer receives a breakpoint.
+- Fixed consecutive `assistant` params reaching the wire when an empty user/developer turn between two assistant turns was dropped by the converter (e.g. an empty "nudge" submission after a length-truncated reply); Anthropic 400s on non-alternating assistant turns, and the broken triple replayed on every subsequent request. A `user: "Continue."` separator is now inserted, mirroring the trailing-prefill fallback.
+- Fixed `supportsAdaptiveThinkingDisplay` misparsing bare dated Opus ids: `claude-opus-4-20250514` (Opus 4.0) parsed as minor `20250514` ≥ 4.7, which silently dropped the `interleaved-thinking-2025-05-14` beta for API-key Opus 4.0 requests.
+- Fixed `output_config.effort` shipping without the `effort-2025-11-24` beta on thinking-off requests against adaptive-only Claude models (the effort:"low" pin), and the mid-conversation `system` role shipping without `mid-conversation-system-2026-04-07` on API-key and OAuth-utility requests; both betas are now added whenever the request can carry the corresponding field.
+- Fixed GitHub Copilot anthropic-messages requests going out with no `Content-Type` and no `anthropic-version` header — the copilot branch builds its headers from scratch and Bun's fetch does not default `Content-Type` for string bodies. Both headers are now pinned to match every other branch.
+- Fixed Anthropic client/provider retry multiplication: with the first-event watchdog disabled (`PI_STREAM_FIRST_EVENT_TIMEOUT_MS=0`), the client's internal `maxRetries: 5` reactivated and stacked with the provider loop's 3 retries — up to 24 wire attempts with double backoff. The provider now pins per-request `maxRetries: 0` unconditionally.
+- Fixed `AnthropicMessagesClient` spreading `fetchOptions` after the core request fields, letting a caller-supplied `signal`/`method`/`body` silently disconnect the timeout controller or corrupt the request. Transport extras (TLS) still pass through; core fields now always win.
+- Fixed Foundry mTLS/CA material being cached for the process lifetime when the env vars point at files: the cache key now folds in the file mtime so on-disk certificate rotation takes effect.
+- Fixed the Claude Code fingerprint version drifting across surfaces: the usage endpoint (`claude-cli/2.1.160`) and OAuth bootstrap (`claude-code/2.1.160`) pinned a stale version while `/v1/messages` reported 2.1.165; both now derive from `claudeCodeVersion`.
+- Fixed a system prompt that merely *mentions* `x-anthropic-billing-header:` mid-text suppressing the entire Claude Code system-block injection (billing header, instruction, and cch attestation); the resumed-session guard now anchors with `startsWith`.
+- Fixed lone surrogates in cross-API tool-call arguments reaching Anthropic's strict UTF-8 validation: replayed OpenAI/Google-origin `tool_use.input` string leaves are now deep-sanitized with `toWellFormed()`, while same-API Anthropic arguments stay byte-identical to keep prompt-cache prefixes stable.
+- Bounded the many-image resize fan-out to 4 concurrent decodes (it previously decoded every oversized image at once, two encode pipelines each — multi-GB transient memory at the 20+-image threshold that activates the feature).
+- Fixed `mergeHeaders` merging case-sensitively on the Copilot/client-options path, where a miscased user-configured header (e.g. `authorization` next to the synthesized `Authorization`) survived as two keys that the `Headers` constructor joins comma-separated on the wire.
+- Hardened the Anthropic stream lifecycle: prologue failures (e.g. a malformed Copilot credential in `buildCopilotDynamicHeaders`) and error-finalization failures now surface as an `error` event instead of an unhandled rejection that left `stream.result()` hanging forever; the spurious "cch billing placeholder not patched" warning no longer fires when the placeholder only appears in user content.
+
 ## [15.10.9] - 2026-06-09
 
 ### Added
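The "up to 24 wire attempts" figure in the retry-multiplication entry of the hunk above follows from nesting two retry layers. A minimal sketch of the arithmetic, assuming each layer makes one initial attempt plus its retries and the inner (client) layer runs in full for every attempt of the outer (provider) loop; `totalWireAttempts` is a hypothetical helper, not part of the package:

```typescript
// Total wire attempts when two retry layers are nested: each layer makes
// one initial attempt plus its retries, and the inner layer runs in full
// for every attempt of the outer layer.
function totalWireAttempts(clientMaxRetries: number, providerRetries: number): number {
  const attemptsPerProviderTry = 1 + clientMaxRetries;
  const providerTries = 1 + providerRetries;
  return attemptsPerProviderTry * providerTries;
}

const stacked = totalWireAttempts(5, 3); // client retries re-enabled: (1 + 5) * (1 + 3)
const pinned = totalWireAttempts(0, 3); // provider pins maxRetries: 0
console.log(stacked, pinned); // 24 4
```

With the client pinned to `maxRetries: 0`, only the provider loop's own attempts remain.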
@@ -425,7 +513,6 @@
 - Fixed Antigravity usage provider emitting one bar per model instead of deduplicating by tier — a single account's 15+ model entries now collapse to one bar per tier, matching the shared-quota reality of the upstream API.
 - Fixed Antigravity usage reports missing `email` and `accountId` in metadata, so the `/usage` display and the deduplicator can associate reports with their credentials.
 - Fixed usage-report dedup ignoring `projectId` for Google Cloud providers, preventing duplicate credential entries from being recognized as the same account.
-
 - Fixed Cloud Code Assist (Antigravity / Gemini CLI) rejecting the `github` tool with HTTP 400 when the `pr` parameter schema contained `anyOf: [string, array]`. The CCA mixed-type combiner collapse picked the first non-null type (`string`) but indiscriminately copied type-specific keys from variant branches — `items` from the array variant leaked onto the string-typed result, producing `{type: "string", items: {...}}` which Google's API rejects as invalid. The collapse now filters merged variant fields against the winning type's allowed key set. ([#2002](https://github.com/can1357/oh-my-pi/pull/2002))
 - Fixed OpenAI Responses-family providers (Codex, OpenAI Responses, Azure Responses) rejecting requests with `400 No tool output found for function call …` after the user branched/navigated the session tree to a node that ends on a tool call (the tool-result child is dropped from the reconstructed history) or after a turn was aborted/crashed between the call streaming and its result persisting. The converters now synthesize a placeholder `function_call_output`/`custom_tool_call_output` immediately after any unpaired `function_call`/`custom_tool_call`, symmetric to the existing orphan-output repair, so the model still sees the call and can recover instead of the whole request 400ing.
 - Fixed Anthropic-compatible reasoning endpoints losing prior-turn reasoning on continuation requests when they emit unsigned `thinking` blocks. `convertAnthropicMessages` treated unknown endpoints as signature-enforcing and demoted unsigned reasoning to `type: "text"`, which destabilized tool-call argument serialization on the next turn — the upstream symptom behind the `args?.ops?.map is not a function` crash reported against the `todo` tool. Official `api.anthropic.com` keeps the conservative text fallback; non-official `anthropic-messages` reasoning models now replay unsigned reasoning as native `type: "thinking"` ([#2005](https://github.com/can1357/oh-my-pi/issues/2005)).
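The Cloud Code Assist `anyOf` entry in the hunk above describes filtering merged variant fields against the winning type's allowed keys. A minimal sketch of that idea follows. The key lists in `TYPE_SPECIFIC_KEYS` and the function `collapseAnyOf` are hypothetical, since the changelog does not show the package's actual key sets or combiner.

```typescript
type Schema = { [key: string]: unknown };

// Hypothetical allow-list of type-specific keywords; the real key sets in the
// package are not shown in the changelog.
const TYPE_SPECIFIC_KEYS: { [type: string]: string[] } = {
  string: ["minLength", "maxLength", "pattern", "format"],
  array: ["items", "minItems", "maxItems", "uniqueItems"],
};
const ALL_TYPE_SPECIFIC: string[] = ([] as string[]).concat(
  TYPE_SPECIFIC_KEYS.string,
  TYPE_SPECIFIC_KEYS.array,
);

// Collapse `anyOf` variants onto the first non-null type, keeping a merged
// keyword only if it is generic or belongs to the winning type.
function collapseAnyOf(variants: Schema[]): Schema {
  let winningType: string | undefined;
  for (const v of variants) {
    if (typeof v.type === "string" && v.type !== "null") {
      winningType = v.type;
      break;
    }
  }
  if (winningType === undefined) return {};
  const allowed = TYPE_SPECIFIC_KEYS[winningType] || [];
  const result: Schema = { type: winningType };
  for (const v of variants) {
    for (const key of Object.keys(v)) {
      if (key === "type") continue;
      const typeSpecific = ALL_TYPE_SPECIFIC.indexOf(key) !== -1;
      if (typeSpecific && allowed.indexOf(key) === -1) continue; // foreign keyword, drop it
      if (!(key in result)) result[key] = v[key];
    }
  }
  return result;
}

const collapsed = collapseAnyOf([
  { type: "string", description: "PR number or URL" },
  { type: "array", items: { type: "string" } },
]);
console.log(JSON.stringify(collapsed)); // {"type":"string","description":"PR number or URL"}
```

In this sketch `items` never reaches the string-typed result, so the `{type: "string", items: {...}}` shape from the entry cannot be produced.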
@@ -624,7 +711,6 @@
 ### Added
 
 - `SimpleStreamOptions.openrouterVariant` (`"nitro"`, `"floor"`, `"online"`, `"exacto"`, …) — when set, appends `:<variant>` to OpenRouter model IDs at request time, leaving ids that already carry an explicit `:suffix` untouched. Plumbed through `openai-completions` and the pi-native gateway forwarder.
-
 - xAI Grok OAuth (SuperGrok Subscription) provider in `/login`. Loopback PKCE flow on `127.0.0.1:56121`; the token unlocks Grok-4.x chat. Ported from NousResearch/hermes-agent (MIT).
 - OpenRouter provider in `/login`. API-key paste flow validated against `https://openrouter.ai/api/v1/auth/key` (the `/models` endpoint is public and cannot validate auth). The pasted key is stored under the existing `openrouter` provider id used by `OPENROUTER_API_KEY`.
 - `XAI_OAUTH_TOKEN` environment variable accepted as a headless fallback for the xAI Grok OAuth provider.
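The `openrouterVariant` behavior described in the hunk above (append `:<variant>` unless the id already has an explicit `:suffix`) can be sketched in a few lines. `applyOpenRouterVariant` and the `example/model` ids are hypothetical; the package's own implementation is not shown in this diff.

```typescript
// Append `:<variant>` to a model id unless no variant is set or the id
// already carries an explicit `:suffix`.
function applyOpenRouterVariant(modelId: string, variant?: string): string {
  if (!variant) return modelId;
  if (modelId.indexOf(":") !== -1) return modelId;
  return `${modelId}:${variant}`;
}

console.log(applyOpenRouterVariant("example/model", "nitro")); // example/model:nitro
console.log(applyOpenRouterVariant("example/model:free", "nitro")); // example/model:free
console.log(applyOpenRouterVariant("example/model")); // example/model
```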
@@ -657,14 +743,11 @@
 - Added `PI_CODEX_WEBSOCKET_PING_INTERVAL_MS` to configure the interval for Codex WebSocket protocol ping heartbeats
 - Added `PI_CODEX_WEBSOCKET_PONG_TIMEOUT_MS` to configure the Codex WebSocket pong timeout used to detect unresponsive connections
 - Added `PI_CODEX_WEBSOCKET_MESSAGE_QUEUE_CAPACITY` to configure the maximum buffered Codex WebSocket inbound queue size before transport fallback
-- Added `parseStreamingJsonThrottled` to `@oh-my-pi/pi-ai/utils/json-parse` — a per-delta wrapper around `parseStreamingJson` that skips re-parses until the buffer has grown by `minGrowthBytes` (default 256). Wired into the streaming hot path of every provider's tool-call argument accumulator (`anthropic`, `amazon-bedrock`, `openai-completions`, `openai-codex-responses`, `openai-responses-shared`) so per-delta cost is O(N) in total buffer length instead of O(N²). Each provider's `toolcall_end` still runs a final unthrottled parse, so the published `block.arguments` is unchanged.
-- Added named-tool routing support to Google providers: `GoogleSharedStreamOptions.toolChoice` and `GoogleGeminiCliOptions.toolChoice` now accept `{ mode: "ANY"; allowedFunctionNames: [string, ...string[]] }` in addition to the string forms. `mapGoogleToolChoice` converts `ToolChoice` objects of shape `{ type: "tool" | "function", name }` to the wire form. Mirrors the equivalent Anthropic mapper.
 
 ### Changed
 
 - Improved Codex WebSocket timeout diagnostics to include last event type and time since last progress event
 - Enhanced Codex WebSocket error classification to recognize ping, pong, send, and queue-overflow failures as retryable
-- Changed `mapGoogleToolChoice` to be exported from `@oh-my-pi/pi-ai/stream` so callers can build the wire-shape allow-list directly without re-deriving it.
 
 ### Fixed
 
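The `parseStreamingJsonThrottled` entry that this hunk drops from the changelog describes a wrapper that skips re-parses until the buffer has grown by `minGrowthBytes`. A minimal sketch of that throttling idea, under stated assumptions: `makeThrottledParser` is a hypothetical name, string length stands in for byte growth, and a counting stub replaces the real partial-JSON parser.

```typescript
type Parser<T> = (buffer: string) => T;

// Wrap a parser so it only re-runs once the buffer has grown by at least
// `minGrowthBytes` since the last parse; otherwise return the cached result.
// Assumption: string length is used as a proxy for byte growth.
function makeThrottledParser<T>(parse: Parser<T>, minGrowthBytes = 256): Parser<T> {
  let lastParsedLength = -1;
  let cached: T | undefined;
  return (buffer: string): T => {
    if (lastParsedLength >= 0 && buffer.length - lastParsedLength < minGrowthBytes) {
      return cached as T;
    }
    cached = parse(buffer);
    lastParsedLength = buffer.length;
    return cached;
  };
}

// Count how often the underlying parser runs across 1024 one-character deltas.
let parseCalls = 0;
const countingParse: Parser<number> = (buffer) => {
  parseCalls += 1;
  return buffer.length; // stand-in for a real partial-JSON parse
};
const throttled = makeThrottledParser(countingParse, 256);

let buffer = "";
for (let i = 0; i < 1024; i++) {
  buffer += "x";
  throttled(buffer);
}
console.log(parseCalls); // 4
```

The stub runs at buffer lengths 1, 257, 513, and 769 instead of on all 1024 deltas. A caller following the entry's description would still run one final unthrottled parse when the tool call ends, so the published arguments reflect the complete buffer.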
@@ -673,7 +756,6 @@
 - Fixed Codex WebSocket pong timeout detection by tracking pong events and failing the connection when no pong is received within the configured timeout
 - Fixed Anthropic streaming to suppress hallucinated meta-prompt thinking blocks (the recent "I don't see any current rewritten thinking..." regression). When the marker phrase `rewritten thinking` appears in a streamed thinking summary the block is collapsed to a plain `Thinking...` placeholder and its signature is dropped so subsequent turns can't re-anchor on the garbled chain.
 - Fixed Codex WebSocket silent stalls by adding protocol pings, inbound queue bounding, clearer idle-timeout diagnostics, and SDK retry clamping for first-event timeouts.
-- Fixed Synthetic model discovery to treat the provider `/models` response as authoritative so deprecated bundled IDs are pruned from the runtime cache, and changed Synthetic login validation to avoid probing a specific model ([#1417](https://github.com/can1357/oh-my-pi/issues/1417)).
 
 ## [15.5.0] - 2026-05-26
 
@@ -797,10 +879,6 @@
 
 - Added DeepSeek to the built-in API-key login provider catalog so `omp login deepseek` stores a reusable `DEEPSEEK_API_KEY` credential for the bundled DeepSeek models.
 
-### Fixed
-
-- Fixed `openai-responses` requests intermittently 400ing with `No tool call found for function call output with call_id …` after an aborted turn or a locally-rejected tool call (e.g. argument-validation failure). `convertConversationMessages` now folds orphan `function_call_output` / `custom_tool_call_output` items — those whose matching `function_call` was wiped by an earlier `dt: false` snapshot splice or never landed in any persisted provider payload — into assistant text notes, preserving the payload while keeping the request grammatically valid ([#1351](https://github.com/can1357/oh-my-pi/issues/1351)).
-
 ## [15.2.4] - 2026-05-22
 
 ### Fixed
@@ -853,7 +931,6 @@
|
|
|
853
931
|
### Fixed
|
|
854
932
|
|
|
855
933
|
- Fixed Anthropic fast mode (`serviceTier: "priority"`) looping on 429 `rate_limit_error: "Extra usage is required for fast mode."` for accounts without the extra-usage entitlement. `isAnthropicFastModeUnsupportedError` now matches the 429 phrasing in addition to the 400 `invalid_request_error` "does not support the `speed` parameter" case, so the provider drops `speed: "fast"` on the in-turn retry, sets `providerSessionState.fastModeDisabled` for the remainder of the session, and surfaces `disabledFeatures: ["priority"]` to the caller instead of retrying with the same payload until `PROVIDER_MAX_RETRIES` is exhausted.
|
|
856
|
-
- Fixed MiniMax Coding Plan CN streaming `<think>...</think>` reasoning as visible assistant text. The OpenAI-compatible stream parser now enables the existing MiniMax tag parser for both `minimax-code` and `minimax-code-cn`, so CN responses become structured `thinking` blocks instead of raw text. ([#1203](https://github.com/can1357/oh-my-pi/issues/1203))
|
|
857
934
|
|
|
858
935
|
## [15.1.6] - 2026-05-19
|
|
859
936
|
|
|
@@ -870,7 +947,6 @@
|
|
|
870
947
|
|
|
871
948
|
### Fixed
|
|
872
949
|
|
|
873
|
-
- Fixed OpenCode-Go and OpenCode-Zen chat-completions replay to omit stored reasoning fields on Kimi assistant tool-call messages, avoiding provider 400s for rejected `messages[].reasoning` payloads. ([#1157](https://github.com/can1357/oh-my-pi/issues/1157))
|
|
874
950
|
- Fixed OpenAI Responses and Codex tool schema normalization to emit `properties: {}` for no-argument object schemas without rewriting literal payloads. ([#1147](https://github.com/can1357/oh-my-pi/issues/1147))
|
|
875
951
|
- Fixed Anthropic 400 (`unexpected tool_use_id found in tool_result blocks ... Each tool_result block must have a corresponding tool_use block in the previous message`) when handoff/compaction folds an assistant `tool_use` into the handoff summary string but leaves the matching user-side `tool_result` message in the history. `transformMessages` now indexes every `tool_use` id surviving the first pass and drops orphan `tool_result` messages whose originator was compacted away, preserving the text payload as a user-level `<stale-tool-result>` note so the model still sees what the tool returned. The note is emitted with `role: "user"` rather than `role: "developer"` so providers that elevate developer-role messages (Ollama: `developer` → `system`; OpenAI chat-completions reasoning models: `developer` → `developer`) cannot lift stale tool output to an instruction-priority tier above the surrounding user/developer messages.
|
|
876
952
|
- Fixed streaming authentication retry to trigger when a provider emits a 401 `error` event after a `start` event but before any replay-unsafe content is emitted
|
|
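The orphan `tool_result` handling described in the Anthropic 400 entry above can be sketched roughly as follows. This is a minimal illustration, not the actual `transformMessages` code: the `Message` shape, the `dropOrphanToolResults` name, and the closing `</stale-tool-result>` tag are all assumptions made for the example.

```typescript
// Hypothetical, simplified message shapes; the real pi-ai types are richer.
type Message =
  | { role: "assistant"; toolUseIds: string[]; text: string }
  | { role: "toolResult"; toolUseId: string; text: string }
  | { role: "user"; text: string };

// Hypothetical helper illustrating the two-pass idea from the changelog entry.
function dropOrphanToolResults(messages: Message[]): Message[] {
  // Pass 1: index every tool_use id that is still present in the history.
  const liveIds = new Set<string>();
  for (const m of messages) {
    if (m.role === "assistant") {
      for (const id of m.toolUseIds) liveIds.add(id);
    }
  }
  // Pass 2: keep matched results; rewrite orphans as user-level notes so the
  // text payload survives without claiming a tool_use that no longer exists.
  return messages.map((m): Message => {
    if (m.role !== "toolResult" || liveIds.has(m.toolUseId)) return m;
    return { role: "user", text: `<stale-tool-result>${m.text}</stale-tool-result>` };
  });
}

// Usage: the tool_use "t1" was folded into a summary string, "t2" survived.
const repaired = dropOrphanToolResults([
  { role: "user", text: "handoff summary (tool_use t1 folded in here)" },
  { role: "toolResult", toolUseId: "t1", text: "42 files changed" },
  { role: "assistant", toolUseIds: ["t2"], text: "running tests" },
  { role: "toolResult", toolUseId: "t2", text: "all tests passed" },
]);
```

The note is given `role: "user"` in the sketch for the reason the entry states: a `developer` role could be elevated by some providers.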
@@ -1059,11 +1135,6 @@
|
|
|
1059
1135
|
|
|
1060
1136
|
- Fixed OAuth credentials being silently disabled when two omp processes (or any two `AuthStorage` instances sharing an `agent.db`) race on token refresh. Anthropic rotates refresh tokens on every use, so the loser's `invalid_grant` response previously soft-deleted the row that the winner just rotated, forcing the user to `/login` again. `#tryOAuthCredential` now re-reads the row from disk before declaring a definitive failure: if the persisted `refresh` differs from the snapshot it tried, the peer-rotated credential is reloaded and the request retries against the fresh token instead of disabling the live row.
|
|
1061
1137
|
- Closed a remaining race window in OAuth refresh-failure handling: between re-reading the credential row to check for peer rotation and the subsequent soft-delete, another process could still complete a refresh and rotate the row, leaving us to disable the freshly-rotated credential by `id`. The disable now runs as a single CAS update conditioned on the row's `data` still matching the snapshot we tried to refresh, and on `disabled_cause IS NULL`. If the CAS reports 0 rows changed (peer rotation, or row already disabled by a concurrent failure on the same snapshot), we reload from disk and retry instead of mutating the wrong row or emitting a spurious `credential_disabled` event.
|
|
1062
|
-
- Lazy built-in provider streams now enforce the shared idle watchdog and abort stalled provider requests, so session auto-retry can continue after transient network drops instead of remaining stuck. Caller aborts still terminate as aborted.
|
|
1063
|
-
|
|
1064
|
-
### Changed
|
|
1065
|
-
|
|
1066
|
-
- Lowered the default steady-state stream idle timeout from 120s to 30s while preserving the existing environment overrides.
|
|
1067
1138
|
|
|
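The compare-and-swap disable described in the OAuth refresh-failure entry above can be illustrated with an in-memory stand-in for the credential table. This is only a sketch under stated assumptions: `CredentialRow`, `casDisable`, and the `Map`-based store are hypothetical, and the real implementation runs against the SQLite store rather than a `Map`.

```typescript
// Hypothetical row shape; `data` and `disabledCause` mirror the `data` and
// `disabled_cause` columns named in the changelog entry.
interface CredentialRow {
  id: number;
  data: string; // serialized credential, including the refresh token
  disabledCause: string | null;
}

// Disable the row only if it still holds the snapshot we tried to refresh and
// nobody has disabled it yet. Returns the number of rows changed (0 or 1),
// the same signal a conditional SQL UPDATE would report.
function casDisable(
  rows: Map<number, CredentialRow>,
  id: number,
  snapshot: string,
  cause: string,
): number {
  const row = rows.get(id);
  if (!row || row.data !== snapshot || row.disabledCause !== null) return 0;
  row.disabledCause = cause;
  return 1;
}

// Usage: a peer rotates the row between our refresh attempt and our disable.
const table = new Map<number, CredentialRow>([
  [1, { id: 1, data: "refresh-A", disabledCause: null }],
]);
table.get(1)!.data = "refresh-B"; // peer process rotated the credential
const staleAttempt = casDisable(table, 1, "refresh-A", "invalid_grant"); // stale snapshot
const freshAttempt = casDisable(table, 1, "refresh-B", "invalid_grant"); // current snapshot
const repeatAttempt = casDisable(table, 1, "refresh-B", "invalid_grant"); // already disabled
```

A result of 0 is the caller's cue to reload from disk and retry rather than emit a `credential_disabled` event.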
1068
1139
|
## [14.9.3] - 2026-05-10
|
|
1069
1140
|
|
|
@@ -1076,7 +1147,6 @@
|
|
|
1076
1147
|
### Fixed
|
|
1077
1148
|
|
|
1078
1149
|
- Fixed silent forwarding of image content (for example Python plot output rendered in the terminal) to models without vision support, which produced opaque 404 errors from upstream. Image blocks are now stripped and replaced with a `[image omitted: model does not support vision]` placeholder for non-vision models, including tool-result payloads ([#967](https://github.com/can1357/oh-my-pi/issues/967), [#968](https://github.com/can1357/oh-my-pi/issues/968)).
|
|
1079
|
-
|
|
1080
1150
|
- Added `AuthStorage` `onCredentialDisabled` callback (sync or async) so embedders can react when a credential is automatically disabled (e.g. OAuth refresh fails with `invalid_grant`) — useful for surfacing a banner or auto-launching a re-login flow instead of letting the credential silently disappear. Sync throws and async rejections are both caught and logged so a misbehaving subscriber cannot break the disable path.
|
|
1081
1151
|
- Added Anthropic OAuth `account.uuid` and `account.email_address` extraction from the `/v1/oauth/token` exchange and refresh responses; both `AnthropicOAuthFlow.exchangeToken()` and `refreshAnthropicToken()` now populate `OAuthCredentials.{accountId, email}` so downstream consumers can attribute requests to the authenticated account without a separate `/api/oauth/profile` round-trip.
|
|
1082
1152
|
- Added `onSseEvent` stream diagnostics so HTTP SSE providers can expose raw SSE frames without changing parsed model output.
|
|
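The image-stripping behavior from the non-vision fix above can be sketched as a simple block rewrite. The `ContentBlock` shape and the `stripImagesForModel` name are assumptions for illustration; only the placeholder string is taken from the changelog entry.

```typescript
// Hypothetical, simplified content block shape.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; mimeType: string; data: string };

// Placeholder text quoted from the changelog entry.
const IMAGE_PLACEHOLDER = "[image omitted: model does not support vision]";

// Hypothetical helper: replace image blocks with the placeholder when the
// target model cannot accept images; leave everything else untouched.
function stripImagesForModel(blocks: ContentBlock[], supportsVision: boolean): ContentBlock[] {
  if (supportsVision) return blocks;
  return blocks.map((block): ContentBlock =>
    block.type === "image" ? { type: "text", text: IMAGE_PLACEHOLDER } : block,
  );
}

// Usage: a tool result that mixes text with a rendered plot.
const toolResult: ContentBlock[] = [
  { type: "text", text: "plot saved" },
  { type: "image", mimeType: "image/png", data: "iVBORw0KGgo=" },
];
const forTextOnlyModel = stripImagesForModel(toolResult, false);
const forVisionModel = stripImagesForModel(toolResult, true);
```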
@@ -1094,7 +1164,6 @@
|
|
|
1094
1164
|
|
|
1095
1165
|
- Fixed Gemini 3 Pro thinking metadata so `medium` effort is rejected with the expected error instead of being silently accepted: `ThinkingConfig` now carries an optional explicit `levels` list that survives `expandEffortRange`, letting non-contiguous supported sets (e.g. `[low, high]`) round-trip through enrichment.
|
|
1096
1166
|
- Fixed Kimi Code OAuth expiry handling to refresh access tokens 5 minutes before server expiry, avoiding daily 401s from using tokens right up to the cutoff.
|
|
1097
|
-
- Fixed OpenAI Responses custom tool replay to preserve custom tool call item IDs with the `ctc_` prefix instead of rewriting them as `fc_` function-call IDs ([#977](https://github.com/can1357/oh-my-pi/issues/977)).
|
|
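The early-refresh rule from the Kimi Code entry above amounts to subtracting a safety margin from the server expiry. A minimal sketch, assuming millisecond timestamps; `needsRefresh` and `REFRESH_SKEW_MS` are hypothetical names, not the package's API:

```typescript
// Five-minute margin, as stated in the changelog entry.
const REFRESH_SKEW_MS = 5 * 60 * 1000;

// Hypothetical helper: true once the current time is inside the margin
// before server expiry, so the token is refreshed before it can be rejected.
function needsRefresh(
  expiresAtMs: number,
  nowMs: number,
  skewMs: number = REFRESH_SKEW_MS,
): boolean {
  return nowMs >= expiresAtMs - skewMs;
}

// Usage with a fixed expiry so the results do not depend on the wall clock.
const expiresAt = 1_800_000_000_000;
const sixMinutesBefore = needsRefresh(expiresAt, expiresAt - 6 * 60 * 1000);
const fourMinutesBefore = needsRefresh(expiresAt, expiresAt - 4 * 60 * 1000);
```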
1098
1167
|
|
|
1099
1168
|
## [14.7.6] - 2026-05-07
|
|
1100
1169
|
|
|
@@ -1358,7 +1427,6 @@
|
|
|
1358
1427
|
- Fixed shell execution failure responses to preserve all result fields when sanitizing, preventing truncated metadata in stream results
|
|
1359
1428
|
- Fixed context overflow detection to recognize `model_context_window_exceeded` from z.ai / GLM providers, preventing infinite retry loops when context window is exceeded ([#638](https://github.com/can1357/oh-my-pi/issues/638))
|
|
1360
1429
|
- Fixed strict tool schema enforcement to preserve `additionalProperties: false` and required keys for reused nested object schemas, preventing invalid `todo_write` function schemas in Codex/OpenAI requests
|
|
1361
|
-
- Fixed GitHub Copilot reasoning regressions by preserving GPT-5.x / Claude 4.x reasoning controls instead of stripping them from requests ([#773](https://github.com/can1357/oh-my-pi/issues/773))
|
|
1362
1430
|
|
|
1363
1431
|
## [14.1.0] - 2026-04-11
|
|
1364
1432
|
|
|
@@ -1421,7 +1489,6 @@
|
|
|
1421
1489
|
- Fixed Gemini 2.5 Pro context window detection in GitHub Copilot model limits test
|
|
1422
1490
|
- Fixed Claude Opus 4.6 context window detection in GitHub Copilot model limits test
|
|
1423
1491
|
- Fixed Anthropic streaming to suppress transient SDK console errors for malformed SSE keep-alive frames so the TUI only shows surfaced provider errors
|
|
1424
|
-
|
|
1425
1492
|
- Added environment-based credential fallback for the OpenAI Codex provider.
|
|
1426
1493
|
|
|
1427
1494
|
## [13.17.6] - 2026-04-01
|
|
@@ -1799,8 +1866,6 @@
|
|
|
1799
1866
|
- Fixed OpenAI Codex streaming to properly include service_tier in SSE payloads
|
|
1800
1867
|
- Fixed type safety in OpenAI responses by removing unsafe type casts on image content blocks
|
|
1801
1868
|
- Fixed credential purging to respect disabled credentials when deduplicating by email
|
|
1802
|
-
- Fixed API-key provider re-login to replace the active stored key instead of appending stale credentials that were still selected first
|
|
1803
|
-
- Fixed Kagi login guidance to use the correct `KG_...` key format and mention Search API beta access requirements
|
|
1804
1869
|
|
|
1805
1870
|
## [13.9.2] - 2026-03-05
|
|
1806
1871
|
|
|
@@ -1825,7 +1890,7 @@
|
|
|
1825
1890
|
- Removed `THINKING_LEVELS`, `ALL_THINKING_LEVELS`, `ALL_THINKING_MODES`, `THINKING_MODE_DESCRIPTIONS`, and `THINKING_MODE_LABELS` exports
|
|
1826
1891
|
- Renamed `formatThinking()` to `getThinkingMetadata()` with changed return type from string to `ThinkingMetadata` object
|
|
1827
1892
|
- Renamed `getAvailableThinkingLevel()` to `getAvailableThinkingLevels()` and added default parameter
|
|
1828
|
-
- Renamed `
|
|
1893
|
+
- Renamed `getAvailableThinkingEffort()` to `getAvailableThinkingEfforts()` and added default parameter
|
|
1829
1894
|
|
|
1830
1895
|
### Added
|
|
1831
1896
|
|
|
@@ -1835,17 +1900,17 @@
|
|
|
1835
1900
|
|
|
1836
1901
|
### Added
|
|
1837
1902
|
|
|
1838
|
-
- Exported new thinking module with `
|
|
1839
|
-
- Added `
|
|
1840
|
-
- Added `
|
|
1903
|
+
- Exported new thinking module with `ThinkingEffort`, `ThinkingLevel`, and `ThinkingMode` types for managing reasoning effort levels
|
|
1904
|
+
- Added `getAvailableThinkingEffort()` function to determine supported thinking effort levels based on model capabilities
|
|
1905
|
+
- Added `parseThinkingEffort()`, `parseThinkingLevel()`, and `parseThinkingMode()` functions for parsing thinking configuration strings
|
|
1841
1906
|
- Added `THINKING_LEVELS`, `ALL_THINKING_LEVELS`, and `ALL_THINKING_MODES` constants for iterating over available thinking options
|
|
1842
1907
|
- Added `THINKING_MODE_DESCRIPTIONS` and `THINKING_MODE_LABELS` for displaying thinking modes in user interfaces
|
|
1843
1908
|
- Added `formatThinking()` function to format thinking modes as compact display labels
|
|
1844
1909
|
|
|
1845
1910
|
### Changed
|
|
1846
1911
|
|
|
1847
|
-
- Refactored thinking level handling to distinguish between `
|
|
1848
|
-
- Updated `ThinkingBudgets` type to use `
|
|
1912
|
+
- Refactored thinking level handling to distinguish between `ThinkingEffort` (provider-level, no "off") and `ThinkingLevel` (user-facing, includes "off")
|
|
1913
|
+
- Updated `ThinkingBudgets` type to use `ThinkingEffort` instead of `ThinkingLevel` for more precise token budget configuration
|
|
1849
1914
|
- Improved reasoning option handling to explicitly support "off" value for disabling reasoning across all providers
|
|
1850
1915
|
- Simplified thinking effort mapping logic by centralizing provider-specific clamping behavior
|
|
1851
1916
|
|
|
@@ -2636,7 +2701,7 @@
|
|
|
2636
2701
|
|
|
2637
2702
|
### Changed
|
|
2638
2703
|
|
|
2639
|
-
- Replaced direct `
|
|
2704
|
+
- Replaced direct `process.env` access with `getEnv()` utility from `@oh-my-pi/pi-utils` for consistent environment variable handling across all providers
|
|
2640
2705
|
- Updated environment variable names from `OMP_*` prefix to `PI_*` prefix for consistency (e.g., `OMP_CODING_AGENT_DIR` → `PI_CODING_AGENT_DIR`)
|
|
2641
2706
|
|
|
2642
2707
|
### Removed
|
|
@@ -2663,13 +2728,13 @@
|
|
|
2663
2728
|
|
|
2664
2729
|
### Added
|
|
2665
2730
|
|
|
2666
|
-
- Added `getEnv()` function to retrieve environment variables from
|
|
2731
|
+
- Added `getEnv()` function to retrieve environment variables from process.env, cwd/.env, or ~/.env
|
|
2667
2732
|
- Added support for reading .env files from home directory and current working directory
|
|
2668
2733
|
- Added support for `exa` and `perplexity` as known providers in `getEnvApiKey()`
|
|
2669
2734
|
|
|
2670
2735
|
### Changed
|
|
2671
2736
|
|
|
2672
|
-
- Changed `getEnvApiKey()` to check
|
|
2737
|
+
- Changed `getEnvApiKey()` to check process.env, cwd/.env, and ~/.env files in order of precedence
|
|
2673
2738
|
- Refactored provider API key resolution to use a declarative service provider map
|
|
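The lookup order named in the `getEnvApiKey()` entry above (process.env, then cwd/.env, then ~/.env) can be sketched as a first-match scan over ordered sources. This is an illustration only: `resolveEnv`, the `EnvSource` type, and the variable names are hypothetical, and the sketch assumes the .env files have already been parsed into plain objects.

```typescript
// Hypothetical stand-in for one parsed environment source.
type EnvSource = Record<string, string | undefined>;

// Return the first defined value, scanning from highest to lowest precedence.
function resolveEnv(key: string, sources: EnvSource[]): string | undefined {
  for (const source of sources) {
    const value = source[key];
    if (value !== undefined) return value;
  }
  return undefined;
}

// Usage: illustrative keys, listed in the order the entry gives.
const processEnv: EnvSource = { EXAMPLE_KEY_A: "from-process" };
const cwdDotEnv: EnvSource = { EXAMPLE_KEY_A: "from-cwd", EXAMPLE_KEY_B: "from-cwd" };
const homeDotEnv: EnvSource = { EXAMPLE_KEY_B: "from-home", EXAMPLE_KEY_C: "from-home" };
const ordered = [processEnv, cwdDotEnv, homeDotEnv];

const keyA = resolveEnv("EXAMPLE_KEY_A", ordered);
const keyB = resolveEnv("EXAMPLE_KEY_B", ordered);
const keyC = resolveEnv("EXAMPLE_KEY_C", ordered);
```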
2674
2739
|
|
|
2675
2740
|
## [9.2.2] - 2026-01-31
|
|
@@ -2875,7 +2940,7 @@
|
|
|
2875
2940
|
- Replaced custom sleep implementations with Bun.sleep and abortableSleep
|
|
2876
2941
|
- Simplified SSE stream parsing using readLines utility
|
|
2877
2942
|
- Updated test framework from vitest to bun:test
|
|
2878
|
-
- Replaced temp directory creation with
|
|
2943
|
+
- Replaced temp directory creation with createTempDirSync utility
|
|
2879
2944
|
- Changed credential storage from auth.json to ~/.omp/agent/agent.db
|
|
2880
2945
|
- Changed CLI command examples from npx to bunx
|
|
2881
2946
|
- Refactored OAuth flows to use common callback server base class
|
|
@@ -2918,8 +2983,8 @@
|
|
|
2918
2983
|
|
|
2919
2984
|
### Changed
|
|
2920
2985
|
|
|
2921
|
-
- Updated environment variable prefix from
|
|
2922
|
-
- Added automatic migration for legacy
|
|
2986
|
+
- Updated environment variable prefix from PI_ to OMP_ for better consistency
|
|
2987
|
+
- Added automatic migration for legacy PI_ environment variables to OMP_ equivalents
|
|
2923
2988
|
- Adjusted Bedrock Claude thinking budgets to reserve output tokens when maxTokens is too low
|
|
2924
2989
|
|
|
2925
2990
|
### Fixed
|
|
@@ -3036,7 +3101,7 @@
|
|
|
3036
3101
|
|
|
3037
3102
|
- Changed Cursor debug logging to use structured JSONL format with automatic MCP argument decoding
|
|
3038
3103
|
- Changed MCP tool argument decoding to use protobuf Value schema for improved type handling
|
|
3039
|
-
- Changed tool advertisement to filter Cursor native tools (bash, read, write, delete, ls, grep, lsp) instead of only exposing
|
|
3104
|
+
- Changed tool advertisement to filter Cursor native tools (bash, read, write, delete, ls, grep, lsp) instead of only exposing mcp_ prefixed tools
|
|
3040
3105
|
|
|
3041
3106
|
### Fixed
|
|
3042
3107
|
|
|
@@ -3119,6 +3184,222 @@
|
|
|
3119
3184
|
|
|
3120
3185
|
- Enhanced error messages to include retry-after timing information from API rate limit headers
|
|
3121
3186
|
|
|
3187
|
+
## [3.20.0] - 2026-01-06
|
|
3188
|
+
|
|
3189
|
+
### Added
|
|
3190
|
+
|
|
3191
|
+
- Added support for kwaipilot/kat-coder-pro model via OpenRouter
|
|
3192
|
+
- Added OpenAI Codex responses provider with OAuth login support for ChatGPT Plus/Pro accounts
|
|
3193
|
+
- Added Google Vertex AI provider (Gemini via Vertex) with Application Default Credentials support
|
|
3194
|
+
|
|
3195
|
+
### Changed
|
|
3196
|
+
|
|
3197
|
+
- Updated model specifications including context windows, max tokens, and pricing for multiple OpenRouter models
|
|
3198
|
+
|
|
3199
|
+
### Removed
|
|
3200
|
+
|
|
3201
|
+
- Removed alibaba/tongyi-deepresearch-30b-a3b:free model from OpenRouter
|
|
3202
|
+
- Removed nousresearch/hermes-4-405b model from OpenRouter
|
|
3203
|
+
- Removed tngtech/tng-r1t-chimera:free model from OpenRouter
|
|
3204
|
+
|
|
3205
|
+
## [3.15.0] - 2026-01-05
|
|
3206
|
+
|
|
3207
|
+
### Changed
|
|
3208
|
+
|
|
3209
|
+
- Made `isError` field optional in `ToolResultMessage` interface, defaulting to non-error state
|
|
3210
|
+
|
|
3211
|
+
## [3.5.1337] - 2026-01-03
|
|
3212
|
+
|
|
3213
|
+
### Added
|
|
3214
|
+
|
|
3215
|
+
- Added localhost URL detection for OpenAI-compatible provider auto-configuration
|
|
3216
|
+
|
|
3217
|
+
## [1.337.1] - 2026-01-02
|
|
3218
|
+
|
|
3219
|
+
### Changed
|
|
3220
|
+
|
|
3221
|
+
- Forked to @oh-my-pi scope with unified versioning across all packages
|
|
3222
|
+
|
|
3223
|
+
### Fixed
|
|
3224
|
+
|
|
3225
|
+
- **Gemini CLI rate limit handling**: Added automatic retry with server-provided delay for 429 errors
|
|
3226
|
+
|
|
3227
|
+
## [1.337.0] - 2026-01-02
|
|
3228
|
+
|
|
3229
|
+
Initial release under @oh-my-pi scope. See previous releases at [badlogic/pi-mono](https://github.com/badlogic/pi-mono).
|
|
3230
|
+
|
|
3231
|
+
## [0.50.1] - 2026-01-26
|
|
3232
|
+
|
|
3233
|
+
### Fixed
|
|
3234
|
+
|
|
3235
|
+
- Fixed OpenCode Zen model generation to exclude deprecated models ([#970](https://github.com/badlogic/pi-mono/pull/970) by [@DanielTatarkin](https://github.com/DanielTatarkin))
|
|
3236
|
+
|
|
3237
|
+
## [0.50.0] - 2026-01-26
|
|
3238
|
+
|
|
3239
|
+
### Added
|
|
3240
|
+
|
|
3241
|
+
- Added OpenRouter provider routing support for custom models via `openRouterRouting` compat field ([#859](https://github.com/badlogic/pi-mono/pull/859) by [@v01dpr1mr0s3](https://github.com/v01dpr1mr0s3))
|
|
3242
|
+
- Added `azure-openai-responses` provider support for Azure OpenAI Responses API. ([#890](https://github.com/badlogic/pi-mono/pull/890) by [@markusylisiurunen](https://github.com/markusylisiurunen))
|
|
3243
|
+
- Added HTTP proxy environment variable support for API requests ([#942](https://github.com/badlogic/pi-mono/pull/942) by [@haoqixu](https://github.com/haoqixu))
|
|
3244
|
+
- Added `createAssistantMessageEventStream()` factory function for use in extensions.
|
|
3245
|
+
- Added `resetApiProviders()` to clear and re-register built-in API providers.
|
|
3246
|
+
|
|
3247
|
+
### Changed
|
|
3248
|
+
|
|
3249
|
+
- Refactored API streaming dispatch to use an API registry with provider-owned `streamSimple` mapping.
|
|
3250
|
+
- Moved environment API key resolution to `env-api-keys.ts` and re-exported it from the package entrypoint.
|
|
3251
|
+
- Azure OpenAI Responses provider now uses base URL configuration with deployment-aware model mapping and no longer includes service tier handling.
|
|
3252
|
+
|
|
3253
|
+
### Fixed
|
|
3254
|
+
|
|
3255
|
+
- Fixed Bun runtime detection for dynamic imports in browser-compatible modules (stream.ts, openai-codex-responses.ts, openai-codex.ts) ([#922](https://github.com/badlogic/pi-mono/pull/922) by [@dannote](https://github.com/dannote))
|
|
3256
|
+
- Fixed streaming functions to use `model.api` instead of hardcoded API types
|
|
3257
|
+
- Fixed Google providers to default tool call arguments to an empty object when omitted
|
|
3258
|
+
- Fixed OpenAI Responses streaming to handle `arguments.done` events on OpenAI-compatible endpoints ([#917](https://github.com/badlogic/pi-mono/pull/917) by [@williballenthin](https://github.com/williballenthin))
|
|
3259
|
+
- Fixed OpenAI Codex Responses tool strictness handling after the shared responses refactor
|
|
3260
|
+
- Fixed Azure OpenAI Responses streaming to guard deltas before content parts and correct metadata and handoff gating
|
|
3261
|
+
- Fixed OpenAI completions tool-result image batching after consecutive tool results ([#902](https://github.com/badlogic/pi-mono/pull/902) by [@terrorobe](https://github.com/terrorobe))
|
|
3262
|
+
|
|
3263
|
+
## [0.49.3] - 2026-01-22
|
|
3264
|
+
|
|
3265
|
+
### Added
|
|
3266
|
+
|
|
3267
|
+
- Added `headers` option to `StreamOptions` for custom HTTP headers in API requests. Supported by all providers except Amazon Bedrock (which uses AWS SDK auth). Headers are merged with provider defaults and `model.headers`, with `options.headers` taking precedence.
|
|
3268
|
+
- Added `originator` option to `loginOpenAICodex()` for custom OAuth client identification
|
|
3269
|
+
- Browser compatibility for pi-ai: replaced top-level Node.js imports with dynamic imports for browser environments ([#873](https://github.com/badlogic/pi-mono/issues/873))
|
|
3270
|
+
|
|
3271
|
+
### Fixed
|
|
3272
|
+
|
|
3273
|
+
- Fixed OpenAI Responses API 400 error "function_call without required reasoning item" when switching between models (same provider, different model). The fix omits the `id` field for function_calls from different models to avoid triggering OpenAI's reasoning/function_call pairing validation ([#886](https://github.com/badlogic/pi-mono/issues/886))
|
|
3274
|
+
|
|
3275
|
+
## [0.49.2] - 2026-01-19
|
|
3276
|
+
|
|
3277
|
+
### Added
|
|
3278
|
+
|
|
3279
|
+
- Added AWS credential detection for ECS/Kubernetes environments: `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI`, `AWS_CONTAINER_CREDENTIALS_FULL_URI`, `AWS_WEB_IDENTITY_TOKEN_FILE` ([#848](https://github.com/badlogic/pi-mono/issues/848))
|
|
3280
|
+
|
|
3281
|
+
### Fixed
|
|
3282
|
+
|
|
3283
|
+
- Fixed OpenAI Responses 400 error "reasoning without following item" by skipping errored/aborted assistant messages entirely in transform-messages.ts ([#838](https://github.com/badlogic/pi-mono/pull/838))
|
|
3284
|
+
|
|
3285
|
+
### Removed
|
|
3286
|
+
|
|
3287
|
+
- Removed `strictResponsesPairing` compat option (no longer needed after the transform-messages fix)
|
|
3288
|
+
|
|
3289
|
+
## [0.49.1] - 2026-01-18
|
|
3290
|
+
|
|
3291
|
+
### Added
|
|
3292
|
+
|
|
3293
|
+
- Added `OpenAIResponsesCompat` interface with `strictResponsesPairing` option for Azure OpenAI Responses API, which requires strict reasoning/message pairing in history replay ([#768](https://github.com/badlogic/pi-mono/pull/768) by [@nicobako](https://github.com/nicobako))
|
|
3294
|
+
|
|
3295
|
+
### Changed
|
|
3296
|
+
|
|
3297
|
+
- Split `OpenAICompat` into `OpenAICompletionsCompat` and `OpenAIResponsesCompat` for type-safe API-specific compat settings
|
|
3298
|
+
|
|
3299
|
+
### Fixed
|
|
3300
|
+
|
|
3301
|
+
- Fixed tool call ID normalization for cross-provider handoffs (e.g., Codex to Antigravity Claude) ([#821](https://github.com/badlogic/pi-mono/issues/821))
|
|
3302
|
+
|
|
3303
|
+
## [0.49.0] - 2026-01-17
|
|
3304
|
+
|
|
3305
|
+
### Changed
|
|
3306
|
+
|
|
3307
|
+
- OpenAI Codex responses now use the context system prompt directly in the instructions field.
|
|
3308
|
+
|
|
3309
|
+
### Fixed
|
|
3310
|
+
|
|
3311
|
+
- Fixed orphaned tool results after errored assistant messages causing Codex API errors. When an assistant message has `stopReason: "error"`, its tool calls are now excluded from pending tool tracking, preventing synthetic tool results from being generated for calls that will be dropped by provider-specific converters. ([#812](https://github.com/badlogic/pi-mono/issues/812))
|
|
3312
|
+
- Fixed Bedrock Claude max_tokens handling to always exceed thinking budget tokens, preventing compaction failures. ([#797](https://github.com/badlogic/pi-mono/pull/797) by [@pjtf93](https://github.com/pjtf93))
|
|
3313
|
+
- Fixed Claude Code tool name normalization to match the Claude Code tool list case-insensitively and remove invalid mappings.
|
|
3314
|
+
|
|
3315
|
+
## [0.48.0] - 2026-01-16
|
|
3316
|
+
|
|
3317
|
+
### Fixed
|
|
3318
|
+
|
|
3319
|
+
- Fixed OpenAI-compatible provider feature detection to use `model.provider` in addition to URL, allowing custom base URLs (e.g., proxies) to work correctly with provider-specific settings ([#774](https://github.com/badlogic/pi-mono/issues/774))
|
|
3320
|
+
- Fixed Gemini 3 context loss when switching from providers without thought signatures: unsigned tool calls are now converted to text with anti-mimicry notes instead of being skipped
|
|
3321
|
+
- Fixed string numbers in tool arguments not being coerced to numbers during validation ([#786](https://github.com/badlogic/pi-mono/pull/786) by [@dannote](https://github.com/dannote))
|
|
3322
|
+
- Fixed Bedrock tool call IDs to use only alphanumeric characters, avoiding API errors from invalid characters ([#781](https://github.com/badlogic/pi-mono/pull/781) by [@pjtf93](https://github.com/pjtf93))
|
|
3323
|
+
- Fixed empty error assistant messages (from 429/500 errors) breaking the tool_use to tool_result chain by filtering them in `transformMessages`
|
|
3324
|
+
|
|
3325
|
+
## [0.47.0] - 2026-01-16
|
|
3326
|
+
|
|
3327
|
+
### Fixed
|
|
3328
|
+
|
|
3329
|
+
- Fixed OpenCode provider's `/v1` endpoint to use `system` role instead of `developer` role, fixing `400 Incorrect role information` error for models using `openai-completions` API ([#755](https://github.com/badlogic/pi-mono/pull/755) by [@melihmucuk](https://github.com/melihmucuk))
|
|
3330
|
+
- Added retry logic to OpenAI Codex provider for transient errors (429, 5xx, connection failures). Uses exponential backoff with up to 3 retries. ([#733](https://github.com/badlogic/pi-mono/issues/733))
|
|
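The Codex retry entry above names the retryable classes (429, 5xx, connection failures) and a cap of 3 retries with exponential backoff, but not the delays. A minimal sketch of the status check and schedule, where the 1000 ms base delay and all function names are assumptions, and connection failures are left out because they carry no status code:

```typescript
const MAX_RETRIES = 3; // from the changelog entry
const BASE_DELAY_MS = 1000; // assumption; the source does not state a base delay

// Hypothetical helper: transient HTTP statuses worth retrying.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Hypothetical helper: attempt is 0-based, so the delay doubles each retry.
function backoffDelayMs(attempt: number, baseMs: number = BASE_DELAY_MS): number {
  return baseMs * Math.pow(2, attempt);
}

// Hypothetical helper: the full list of waits before giving up.
function retrySchedule(
  maxRetries: number = MAX_RETRIES,
  baseMs: number = BASE_DELAY_MS,
): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => backoffDelayMs(attempt, baseMs));
}

const schedule = retrySchedule();
```

A production version would typically add jitter and honor any server-provided delay; neither is claimed by the entry, so both are omitted here.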
3331
|
+
|
|
3332
|
+
## [0.46.0] - 2026-01-15
|
|
3333
|
+
|
|
3334
|
+
### Added
|
|
3335
|
+
|
|
3336
|
+
- Added MiniMax China (`minimax-cn`) provider support ([#725](https://github.com/badlogic/pi-mono/pull/725) by [@tallshort](https://github.com/tallshort))
|
|
3337
|
+
- Added `gpt-5.2-codex` models for GitHub Copilot and OpenCode Zen providers ([#734](https://github.com/badlogic/pi-mono/pull/734) by [@aadishv](https://github.com/aadishv))
|
|
3338
|
+
|
|
3339
|
+
### Fixed
|
|
3340
|
+
|
|
3341
|
+
- Avoid unsigned Gemini 3 tool calls ([#741](https://github.com/badlogic/pi-mono/pull/741) by [@roshanasingh4](https://github.com/roshanasingh4))
|
|
3342
|
+
- Fixed signature support for non-Anthropic models in Amazon Bedrock provider ([#727](https://github.com/badlogic/pi-mono/pull/727) by [@unexge](https://github.com/unexge))
|
|
3343
|
+
|
|
3344
|
+
## [0.45.7] - 2026-01-13
|
|
3345
|
+
|
|
3346
|
+
### Fixed
|
|
3347
|
+
|
|
3348
|
+
- Fixed OpenAI Responses timeout option handling ([#706](https://github.com/badlogic/pi-mono/pull/706) by [@markusylisiurunen](https://github.com/markusylisiurunen))
|
|
3349
|
+
- Fixed Bedrock tool call conversion to apply message transforms ([#707](https://github.com/badlogic/pi-mono/pull/707) by [@pjtf93](https://github.com/pjtf93))
|
|
3350
|
+
|
|
3351
|
+
## [0.45.6] - 2026-01-13
|
|
3352
|
+
|
|
3353
|
+
### Fixed
|
|
3354
|
+
|
|
3355
|
+
- Export `parseStreamingJson` from main package for tsx dev mode compatibility
|
|
3356
|
+
|
|
3357
|
+
## [0.45.4] - 2026-01-13
|
|
3358
|
+
|
|
3359
|
+
### Added
|
|
3360
|
+
|
|
3361
|
+
- Added Vercel AI Gateway provider with model discovery and `AI_GATEWAY_API_KEY` env support ([#689](https://github.com/badlogic/pi-mono/pull/689) by [@timolins](https://github.com/timolins))
|
|
3362
|
+
|
|
3363
|
+
### Fixed
|
|
3364
|
+
|
|
3365
|
+
- Fixed z.ai thinking/reasoning: z.ai uses `thinking: { type: "enabled" }` instead of OpenAI's `reasoning_effort`. Added `thinkingFormat` compat flag to handle this. ([#688](https://github.com/badlogic/pi-mono/issues/688))
|
|
3366
|
+
|
|
3367
|
+
## [0.45.0] - 2026-01-13
|
|
3368
|
+
|
|
3369
|
+
### Added
|
|
3370
|
+
|
|
3371
|
+
- MiniMax provider support with M2 and M2.1 models via Anthropic-compatible API ([#656](https://github.com/badlogic/pi-mono/pull/656) by [@dannote](https://github.com/dannote))
|
|
3372
|
+
- Add Amazon Bedrock provider with prompt caching for Claude models (experimental, tested with Anthropic Claude models only) ([#494](https://github.com/badlogic/pi-mono/pull/494) by [@unexge](https://github.com/unexge))
|
|
3373
|
+
- Added `serviceTier` option for OpenAI Responses requests ([#672](https://github.com/badlogic/pi-mono/pull/672) by [@markusylisiurunen](https://github.com/markusylisiurunen))
|
|
3374
|
+
- **Anthropic caching on OpenRouter**: Interactions with Anthropic models via OpenRouter now set a 5-minute cache point using Anthropic-style `cache_control` breakpoints on the last assistant or user message. ([#584](https://github.com/badlogic/pi-mono/pull/584) by [@nathyong](https://github.com/nathyong))
|
|
3375
|
+
- **Google Gemini CLI provider improvements**: Added Antigravity endpoint fallback (tries daily sandbox then prod when `baseUrl` is unset), header-based retry delay parsing (`Retry-After`, `x-ratelimit-reset`, `x-ratelimit-reset-after`), stable `sessionId` derivation from first user message for cache affinity, empty SSE stream retry with backoff, and `anthropic-beta` header for Claude thinking models ([#670](https://github.com/badlogic/pi-mono/pull/670) by [@kim0](https://github.com/kim0))
|
|
3376
|
+
|
|
3377
|
+
## [0.43.0] - 2026-01-11
|
|
3378
|
+
|
|
3379
|
+
### Fixed
|
|
3380
|
+
|
|
3381
|
+
- Fixed Google provider thinking detection: `isThinkingPart()` now only checks `thought === true`, not `thoughtSignature`. Per Google docs, `thoughtSignature` is for context replay and can appear on any part type. Also removed `id` field from `functionCall`/`functionResponse` (rejected by Vertex AI and Cloud Code Assist), and added `textSignature` round-trip for multi-turn reasoning context. ([#631](https://github.com/badlogic/pi-mono/pull/631) by [@theBucky](https://github.com/theBucky))
|
|
3382
|
+
|
|
3383
|
+
## [0.42.3] - 2026-01-10
|
|
3384
|
+
|
|
3385
|
+
### Changed
|
|
3386
|
+
|
|
3387
|
+
- OpenAI Codex: switched to bundled system prompt matching opencode, changed originator to "pi", simplified prompt handling
|
|
3388
|
+
|
|
3389
|
+
## [0.42.2] - 2026-01-10
|
|
3390
|
+
|
|
3391
|
+
### Added
|
|
3392
|
+
|
|
3393
|
+
- Added `GOOGLE_APPLICATION_CREDENTIALS` env var support for Vertex AI credential detection (standard for CI/production).
|
|
3394
|
+
- Added `supportsUsageInStreaming` compatibility flag for OpenAI-compatible providers that reject `stream_options: { include_usage: true }`. Defaults to `true`. Set to `false` in model config for providers like gatewayz.ai. ([#596](https://github.com/badlogic/pi-mono/pull/596) by [@XesGaDeus](https://github.com/XesGaDeus))
|
|
3395
|
+
- Improved Google model pricing info ([#588](https://github.com/badlogic/pi-mono/pull/588) by [@aadishv](https://github.com/aadishv))
|
|
3396
|
+
|
|
3397
|
+
### Fixed
|
|
3398
|
+
|
|
3399
|
+
- Fixed `os.homedir()` calls at module load time; now resolved lazily when needed.
|
|
3400
|
+
- Fixed OpenAI Responses tool strict flag to use a boolean for LM Studio compatibility ([#598](https://github.com/badlogic/pi-mono/pull/598) by [@gnattu](https://github.com/gnattu))
|
|
3401
|
+
- Fixed Google Cloud Code Assist OAuth for paid subscriptions: properly handles long-running operations for project provisioning, supports `GOOGLE_CLOUD_PROJECT` / `GOOGLE_CLOUD_PROJECT_ID` env vars for paid tiers, and handles VPC-SC affected users ([#582](https://github.com/badlogic/pi-mono/pull/582) by [@cmf](https://github.com/cmf))
|
|
3402
|
+
|
|
3122
3403
|
## [0.42.0] - 2026-01-09
|
|
3123
3404
|
|
|
3124
3405
|
### Added
|
|
@@ -3212,7 +3493,7 @@
|
|
|
3212
3493
|
|
|
3213
3494
|
### Breaking Changes
|
|
3214
3495
|
|
|
3215
|
-
- **Agent API moved**: All agent functionality (`agentLoop`, `agentLoopContinue`, `AgentContext`, `AgentEvent`, `AgentTool`, `AgentToolResult`, etc.) has moved to `@
|
|
3496
|
+
- **Agent API moved**: All agent functionality (`agentLoop`, `agentLoopContinue`, `AgentContext`, `AgentEvent`, `AgentTool`, `AgentToolResult`, etc.) has moved to `@oh-my-pi/pi-agent-core`. Import from that package instead of `@oh-my-pi/pi-ai`.
|
|
3216
3497
|
|
|
3217
3498
|
### Added
|
|
3218
3499
|
|
|
@@ -3228,7 +3509,6 @@
|
|
|
3228
3509
|
### Fixed
|
|
3229
3510
|
|
|
3230
3511
|
- **OpenAI completions empty content blocks**: Empty text or thinking blocks in assistant messages are now filtered out before sending to the OpenAI completions API, preventing validation errors. ([#344](https://github.com/badlogic/pi-mono/pull/344) by [@default-anton](https://github.com/default-anton))
|
|
3231
|
-
- **Thinking token duplication**: Fixed thinking content duplication with chutes.ai provider. The provider was returning thinking content in both `reasoning_content` and `reasoning` fields, causing each chunk to be processed twice. Now only the first non-empty reasoning field is used.
|
|
3232
3512
|
- **zAi provider API mapping**: Fixed zAi models to use `openai-completions` API with correct base URL (`https://api.z.ai/api/coding/paas/v4`) instead of incorrect Anthropic API mapping. ([#344](https://github.com/badlogic/pi-mono/pull/344), [#358](https://github.com/badlogic/pi-mono/pull/358) by [@default-anton](https://github.com/default-anton))
|
|
3233
3513
|
|
|
3234
3514
|
## [0.28.0] - 2025-12-25
|
|
@@ -3258,11 +3538,8 @@
|
|
|
3258
3538
|
### Fixed
|
|
3259
3539
|
|
|
3260
3540
|
- **Gemini multimodal tool results**: Fixed images in tool results causing flaky/broken responses with Gemini models. For Gemini 3, images are now nested inside `functionResponse.parts` per the [docs](https://ai.google.dev/gemini-api/docs/function-calling#multimodal). For older models (which don't support multimodal function responses), images are sent in a separate user message.
|
|
3261
|
-
|
|
3262
3541
|
- **Queued message steering**: When `getQueuedMessages` is provided, the agent loop now checks for queued user messages after each tool call and skips remaining tool calls in the current assistant message when a queued message arrives (emitting error tool results).
|
|
3263
|
-
|
|
3264
3542
|
- **Double API version path in Google provider URL**: Fixed Gemini API calls returning 404 after baseUrl support was added. The SDK was appending its default apiVersion to baseUrl which already included the version path. ([#251](https://github.com/badlogic/pi-mono/pull/251) by [@shellfyred](https://github.com/shellfyred))
|
|
3265
|
-
|
|
3266
3543
|
- **Anthropic SDK retries disabled**: Re-enabled SDK-level retries (default 2) for transient HTTP failures. ([#252](https://github.com/badlogic/pi-mono/issues/252))
|
|
3267
3544
|
|
|
3268
3545
|
## [0.23.5] - 2025-12-19
|
|
@@ -3270,17 +3547,13 @@
|
|
|
3270
3547
|
### Added
|
|
3271
3548
|
|
|
3272
3549
|
- **Gemini 3 Flash thinking support**: Extended thinking level support for Gemini 3 Flash models (MINIMAL, LOW, MEDIUM, HIGH) to match Pro models' capabilities. ([#212](https://github.com/badlogic/pi-mono/pull/212) by [@markusylisiurunen](https://github.com/markusylisiurunen))
|
|
3273
|
-
|
|
3274
3550
|
- **GitHub Copilot thinking models**: Added thinking support for additional Copilot models (o3-mini, o1-mini, o1-preview). ([#234](https://github.com/badlogic/pi-mono/pull/234) by [@aadishv](https://github.com/aadishv))
|
|
3275
3551
|
|
|
3276
3552
|
### Fixed
|
|
3277
3553
|
|
|
3278
3554
|
- **Gemini tool result format**: Fixed tool result format for Gemini 3 Flash Preview which strictly requires `{ output: value }` for success and `{ error: value }` for errors. Previous format using `{ result, isError }` was rejected by newer Gemini models. Also improved type safety by removing `as any` casts. ([#213](https://github.com/badlogic/pi-mono/issues/213), [#220](https://github.com/badlogic/pi-mono/pull/220))
|
|
3279
|
-
|
|
3280
3555
|
- **Google baseUrl configuration**: Google provider now respects `baseUrl` configuration for custom endpoints or API proxies. ([#216](https://github.com/badlogic/pi-mono/issues/216), [#221](https://github.com/badlogic/pi-mono/pull/221) by [@theBucky](https://github.com/theBucky))
|
|
3281
|
-
|
|
3282
3556
|
- **GitHub Copilot vision requests**: Added `Copilot-Vision-Request` header when sending images to GitHub Copilot models. ([#222](https://github.com/badlogic/pi-mono/issues/222))
|
|
3283
|
-
|
|
3284
3557
|
- **GitHub Copilot X-Initiator header**: Fixed X-Initiator logic to check last message role instead of any message in history. This ensures proper billing when users send follow-up messages. ([#209](https://github.com/badlogic/pi-mono/issues/209))
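
The Gemini tool result entry above describes a change of payload shape. A small sketch of that mapping, assuming a hypothetical helper named `toFunctionResponse`; only the `{ output: value }` and `{ error: value }` shapes are taken from the changelog.

```typescript
// Hypothetical helper; the two payload shapes are the ones named in the entry.
type FunctionResponsePayload = { output: unknown } | { error: unknown };

function toFunctionResponse(value: unknown, isError: boolean): FunctionResponsePayload {
  // Success and failure use different keys instead of a shared `{ result, isError }` object.
  return isError ? { error: value } : { output: value };
}
```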
|
|
3285
3558
|
|
|
3286
3559
|
## [0.22.3] - 2025-12-16
|
|
@@ -3288,9 +3561,7 @@
|
|
|
3288
3561
|
### Added
|
|
3289
3562
|
|
|
3290
3563
|
- **Image limits test suite**: Added comprehensive tests for provider-specific image limitations (max images, max size, max dimensions). Discovered actual limits: Anthropic (100 images, 5MB, 8000px), OpenAI (500 images, ≥25MB), Gemini (~2500 images, ≥40MB), Mistral (8 images, ~15MB), OpenRouter (~40 images context-limited, ~15MB). ([#120](https://github.com/badlogic/pi-mono/pull/120))
|
|
3291
|
-
|
|
3292
3564
|
- **Tool result streaming**: Added `tool_execution_update` event and optional `onUpdate` callback to `AgentTool.execute()` for streaming tool output during execution. Tools can now emit partial results (e.g., bash stdout) that are forwarded to subscribers. ([#44](https://github.com/badlogic/pi-mono/issues/44))
|
|
3293
|
-
|
|
3294
3565
|
- **X-Initiator header for GitHub Copilot**: Added X-Initiator header handling for GitHub Copilot provider to ensure correct call accounting (agent calls are not deducted from quota). Sets initiator based on last message role. ([#200](https://github.com/badlogic/pi-mono/pull/200) by [@kim0](https://github.com/kim0))
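
A rough sketch of choosing the initiator from the last message role, as the entry above describes. The header values `"user"` and `"agent"`, the `Message` shape, the function name, and the choice of `"user"` for an empty history are all assumptions made for illustration.

```typescript
// Hypothetical message shape; only the "last message role" rule comes from the entry.
type Role = "user" | "assistant" | "toolResult";

interface Message {
  role: Role;
}

// Assumed header values: "user" for a human turn, "agent" for anything else.
function resolveInitiator(messages: Message[]): "user" | "agent" {
  const last = messages[messages.length - 1];
  // Look only at the final message, not at the whole history.
  return last !== undefined && last.role !== "user" ? "agent" : "user";
}
```

Checking only the final message means a follow-up prompt from the user is still attributed to the user, even when earlier turns contain tool results.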
|
|
3295
3566
|
|
|
3296
3567
|
### Changed
|
|
@@ -3324,9 +3595,7 @@ _Dedicated to Peter's shoulder ([@steipete](https://twitter.com/steipete))_
|
|
|
3324
3595
|
### Fixed
|
|
3325
3596
|
|
|
3326
3597
|
- **GitHub Copilot gpt-5 models**: Fixed API selection for gpt-5 models to use `openai-responses` instead of `openai-completions` (gpt-5 models are not accessible via completions endpoint)
|
|
3327
|
-
|
|
3328
3598
|
- **GitHub Copilot cross-model context handoff**: Fixed context handoff failing when switching between GitHub Copilot models using different APIs (e.g., gpt-5 to claude-sonnet-4). Tool call IDs from OpenAI Responses API were incompatible with other models. ([#198](https://github.com/badlogic/pi-mono/issues/198))
|
|
3329
|
-
|
|
3330
3599
|
- **Gemini 3 Pro thinking levels**: Thinking level configuration now works correctly for Gemini 3 Pro models. Previously all levels mapped to -1 (minimal thinking). Now LOW/MEDIUM/HIGH properly control test-time computation. ([#176](https://github.com/badlogic/pi-mono/pull/176) by [@markusylisiurunen](https://github.com/markusylisiurunen))
|
|
3331
3600
|
|
|
3332
3601
|
## [0.18.2] - 2025-12-11
|
|
@@ -3344,9 +3613,7 @@ _Dedicated to Peter's shoulder ([@steipete](https://twitter.com/steipete))_
|
|
|
3344
3613
|
### Fixed
|
|
3345
3614
|
|
|
3346
3615
|
- Fixed Mistral 400 errors after aborted assistant messages by skipping empty assistant messages (no content, no tool calls) ([#165](https://github.com/badlogic/pi-mono/issues/165))
|
|
3347
|
-
|
|
3348
3616
|
- Removed synthetic assistant bridge message after tool results for Mistral (no longer required as of Dec 2025) ([#165](https://github.com/badlogic/pi-mono/issues/165))
|
|
3349
|
-
|
|
3350
3617
|
- Fixed bug where `ANTHROPIC_API_KEY` environment variable was deleted globally after first OAuth token usage, causing subsequent prompts to fail ([#164](https://github.com/badlogic/pi-mono/pull/164))
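
The Mistral fix at the top of this list skips assistant messages that carry neither content nor tool calls. One way to sketch that filter, using hypothetical message types and a hypothetical function name:

```typescript
// Hypothetical message types for illustration only.
interface AssistantMessage {
  role: "assistant";
  content: string;
  toolCalls: unknown[];
}

interface OtherMessage {
  role: "user" | "toolResult";
  content: string;
}

type ChatMessage = AssistantMessage | OtherMessage;

function dropEmptyAssistantMessages(messages: ChatMessage[]): ChatMessage[] {
  // Keep every non-assistant message; keep assistant messages only if they carry something.
  return messages.filter(
    (m) => m.role !== "assistant" || m.content.trim().length > 0 || m.toolCalls.length > 0,
  );
}
```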
|
|
3351
3618
|
|
|
3352
3619
|
## [0.17.0] - 2025-12-09
|
|
@@ -3355,9 +3622,7 @@ _Dedicated to Peter's shoulder ([@steipete](https://twitter.com/steipete))_
|
|
|
3355
3622
|
|
|
3356
3623
|
- **`agentLoopContinue` function**: Continue an agent loop from existing context without adding a new user message. Validates that the last message is `user` or `toolResult`. Useful for retry after context overflow or resuming from manually-added tool results.
|
|
3357
3624
|
- Added `validateToolCall(tools, toolCall)` helper that finds the tool by name and validates arguments.
|
|
3358
|
-
|
|
3359
3625
|
- **OpenAI compatibility overrides**: Added `compat` field to `Model` for `openai-completions` API, allowing explicit configuration of provider quirks (`supportsStore`, `supportsDeveloperRole`, `supportsReasoningEffort`, `maxTokensField`). Falls back to URL-based detection if not set. Useful for LiteLLM, custom proxies, and other non-standard endpoints. ([#133](https://github.com/badlogic/pi-mono/issues/133), thanks @fink-andreas for the initial idea and PR)
|
|
3360
|
-
|
|
3361
3626
|
- **xhigh reasoning level**: Added `xhigh` to `ReasoningEffort` type for OpenAI codex-max models. For non-OpenAI providers (Anthropic, Google), `xhigh` is automatically mapped to `high`. ([#143](https://github.com/badlogic/pi-mono/issues/143))
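
The `xhigh` fallback described above can be sketched as a small mapping. The levels other than `xhigh` and `high`, the `supportsXhigh` parameter, and the function name are assumptions for illustration; the source only states that `xhigh` maps to `high` for non-OpenAI providers.

```typescript
// Levels besides "high" and "xhigh" are assumed here for completeness.
type ReasoningEffort = "minimal" | "low" | "medium" | "high" | "xhigh";

function effectiveEffort(
  effort: ReasoningEffort,
  supportsXhigh: boolean,
): Exclude<ReasoningEffort, "xhigh"> | "xhigh" {
  // Providers without an xhigh tier get the closest supported level.
  if (effort === "xhigh" && !supportsXhigh) {
    return "high";
  }
  return effort;
}
```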
|
|
3362
3627
|
|
|
3363
3628
|
### Breaking Changes
|
|
@@ -3383,7 +3648,6 @@ _Dedicated to Peter's shoulder ([@steipete](https://twitter.com/steipete))_
|
|
|
3383
3648
|
### Fixed
|
|
3384
3649
|
|
|
3385
3650
|
- **OpenAI Token Counting**: Fixed `usage.input` to exclude cached tokens for OpenAI providers. Previously, `input` included cached tokens, causing double-counting when calculating total context size via `input + cacheRead`. Now `input` represents non-cached input tokens across all providers, making `input + output + cacheRead + cacheWrite` the correct formula for total context size.
|
|
3386
|
-
|
|
3387
3651
|
- **Fixed Claude Opus 4.5 cache pricing** (was 3x too expensive)
|
|
3388
3652
|
- Corrected cache_read: $1.50 → $0.50 per MTok
|
|
3389
3653
|
- Corrected cache_write: $18.75 → $6.25 per MTok
|
|
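
The token-counting fix in this hunk can be illustrated with a short sketch. The `Usage` field names follow the formula quoted in the entry (`input + output + cacheRead + cacheWrite`); the helper names and the idea of subtracting a provider-reported cached count from the prompt count are assumptions about how such a normalization might look.

```typescript
// Field names follow the formula in the changelog entry.
interface Usage {
  input: number; // non-cached input tokens
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Assumed normalization: the provider reports a prompt count that includes cached tokens.
function nonCachedInput(promptTokens: number, cachedTokens: number): number {
  return Math.max(0, promptTokens - cachedTokens);
}

function totalContextTokens(usage: Usage): number {
  // With `input` excluding cached tokens, no category is counted twice.
  return usage.input + usage.output + usage.cacheRead + usage.cacheWrite;
}
```

For example, a prompt count of 1000 with 400 cached tokens gives `input = 600`, so a response of 50 output tokens totals 600 + 50 + 400 + 0 = 1050 rather than the double-counted 1450.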
@@ -3392,4 +3656,4 @@ _Dedicated to Peter's shoulder ([@steipete](https://twitter.com/steipete))_
|
|
|
3392
3656
|
|
|
3393
3657
|
## [0.9.4] - 2025-11-26
|
|
3394
3658
|
|
|
3395
|
-
Initial release with multi-provider LLM support.
|
|
3659
|
+
Initial release with multi-provider LLM support.
|