npm - @oh-my-pi/pi-ai - Versions diffs - 16.0.4 → 16.0.6 - Mend

@oh-my-pi/pi-ai 16.0.4 → 16.0.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (86)

package/CHANGELOG.md +66 -0
package/dist/types/auth-broker/wire-schemas.d.ts +391 -403
package/dist/types/auth-retry.d.ts +9 -0
package/dist/types/auth-storage.d.ts +3 -0
package/dist/types/dialect/anthropic.d.ts +7 -1
package/dist/types/dialect/minimax.d.ts +3 -0
package/dist/types/index.d.ts +3 -0
package/dist/types/providers/__tests__/openai-codex-error.test.d.ts +1 -0
package/dist/types/providers/anthropic-messages-server-schema.d.ts +450 -449
package/dist/types/providers/azure-openai-responses.d.ts +1 -1
package/dist/types/providers/google-gemini-cli.d.ts +23 -2
package/dist/types/providers/openai-chat-server-schema.d.ts +656 -781
package/dist/types/providers/openai-codex/request-transformer.d.ts +1 -0
package/dist/types/providers/openai-codex-responses.d.ts +11 -1
package/dist/types/providers/openai-completions.d.ts +5 -12
package/dist/types/providers/openai-responses-server-schema.d.ts +314 -363
package/dist/types/providers/openai-responses.d.ts +25 -20
package/dist/types/providers/openai-shared.d.ts +496 -0
package/dist/types/registry/alibaba-coding-plan.d.ts +4 -3
package/dist/types/registry/oauth/github-copilot.d.ts +1 -1
package/dist/types/registry/oauth/google-oauth-shared.d.ts +0 -6
package/dist/types/registry/oauth/types.d.ts +1 -0
package/dist/types/registry/registry.d.ts +2 -1
package/dist/types/types.d.ts +23 -6
package/dist/types/usage.d.ts +101 -152
package/dist/types/utils/google-validation.d.ts +2 -0
package/dist/types/utils/idle-iterator.d.ts +24 -0
package/dist/types/utils/openrouter-headers.d.ts +1 -0
package/dist/types/utils/schema/normalize.d.ts +31 -0
package/dist/types/utils/schema/wire.d.ts +29 -7
package/dist/types/utils/thinking-loop.d.ts +49 -0
package/package.json +4 -3
package/src/auth-broker/client.ts +16 -16
package/src/auth-broker/server.ts +9 -8
package/src/auth-broker/snapshot-cache.ts +4 -3
package/src/auth-broker/wire-schemas.ts +183 -152
package/src/auth-retry.ts +19 -0
package/src/auth-storage.ts +16 -12
package/src/dialect/anthropic.ts +19 -8
package/src/dialect/factory.ts +2 -0
package/src/dialect/minimax.md +31 -0
package/src/dialect/minimax.ts +95 -0
package/src/dialect/owned-stream.ts +1 -0
package/src/index.ts +3 -0
package/src/providers/__tests__/openai-codex-error.test.ts +84 -0
package/src/providers/amazon-bedrock.ts +18 -18
package/src/providers/anthropic-messages-server-schema.ts +143 -149
package/src/providers/anthropic-messages-server.ts +4 -4
package/src/providers/anthropic.ts +13 -8
package/src/providers/azure-openai-responses.ts +31 -104
package/src/providers/github-copilot-headers.ts +2 -1
package/src/providers/google-gemini-cli.ts +334 -140
package/src/providers/ollama.ts +42 -28
package/src/providers/openai-anthropic-shim.ts +7 -2
package/src/providers/openai-chat-server-schema.ts +124 -140
package/src/providers/openai-chat-server.ts +5 -4
package/src/providers/openai-codex/request-transformer.ts +1 -0
package/src/providers/openai-codex-responses.ts +498 -581
package/src/providers/openai-completions.ts +335 -550
package/src/providers/openai-responses-server-schema.ts +169 -186
package/src/providers/openai-responses-server.ts +5 -5
package/src/providers/openai-responses.ts +276 -307
package/src/providers/openai-shared.ts +2415 -0
package/src/registry/alibaba-coding-plan.ts +55 -10
package/src/registry/github-copilot.ts +1 -1
package/src/registry/litellm.ts +2 -1
package/src/registry/oauth/github-copilot.ts +37 -5
package/src/registry/oauth/google-oauth-shared.ts +9 -1
package/src/registry/oauth/index.ts +5 -1
package/src/registry/oauth/types.ts +1 -0
package/src/stream.ts +60 -14
package/src/types.ts +30 -6
package/src/usage/gemini.ts +10 -2
package/src/usage/google-antigravity.ts +38 -17
package/src/usage.ts +39 -38
package/src/utils/google-validation.ts +25 -0
package/src/utils/idle-iterator.ts +34 -0
package/src/utils/openrouter-headers.ts +12 -0
package/src/utils/schema/normalize.ts +162 -8
package/src/utils/schema/wire.ts +336 -8
package/src/utils/thinking-loop.ts +379 -0
package/src/utils/validation.ts +36 -12
package/dist/types/providers/openai-responses-shared.d.ts +0 -145
package/dist/types/providers/xai-responses.d.ts +0 -23
package/src/providers/openai-responses-shared.ts +0 -1254
package/src/providers/xai-responses.ts +0 -82

package/CHANGELOG.md CHANGED

@@ -2,6 +2,72 @@
 ## [Unreleased]
+## [16.0.6] - 2026-06-18
+### Added
+- Added support for ArkType schemas as tool parameters alongside existing Zod schemas
+- Added `getOpenRouterHeaders` utility to export standard OpenRouter integration headers
+### Changed
+- Expanded thinking loop detection guard to also cover DeepSeek models (family, provider, or id matches).
+- Extended loop guard to monitor assistant response prose (via `text_delta` events) in addition to thinking logs, customizable via request options.
+- Modified loop guard error reporting to emit a non-retryable partial content block containing the accumulated streamed text if a loop is detected after response prose has started streaming, preventing unsafe agent session rollbacks.
+- Migrated internal wire-schema validation (auth-broker, Anthropic Messages request, OpenAI Chat/Responses requests, and /v1/usage shapes) from Zod to ArkType
+- Replaced the dedicated `xai-responses` provider with a unified `openai-responses` path that handles xAI-specific reasoning effort stripping dynamically
+- Updated OpenAI Responses stream handling to throw a clearer error message when a stream closes without a terminal response event
+- Consolidated shared OpenAI-compatible routing and strict-tool fallback helpers across Chat Completions and Responses providers.
+- Consolidated the OpenAI-family provider stack: merged `openai-responses-shared` into `openai-shared` and removed the now-dead `openai-responses-shared` re-export shim; folded the three duplicated `service_tier` request blocks and the per-provider wire model-id transform into shared `applyOpenAIServiceTier`/`applyWireModelIdTransform` helpers; moved residual provider-name wire-quirk checks (DeepSeek special-token strip, cumulative reasoning deltas, Ollama empty-length context error, OpenAI tool-call-id cap, Fireworks thinking drop, OpenRouter/OpenAI Responses request fields) into resolved compat fields; shared the Responses stream per-block accumulation helpers plus the terminal pending-tool-call finalization (`finalizePendingResponsesToolCalls`) and toolUse/pause stop-reason promotion (`promoteResponsesToolUseStopReason`) between `processResponsesStream` and the Codex stream handler; and removed the redundant `getOpenAIResponsesCacheSessionId` alias in favor of `getOpenAIResponsesPromptCacheKey`.
+- Centralized OpenAI-family request-param policy into shared `resolveOpenAIOutputTokenParam` (output-token field selection, OpenRouter default-cap omission, `alwaysSendMaxTokens` defaulting, model/provider clamp), `applyOpenAIGatewayRouting` (OpenRouter `provider` + Vercel AI Gateway `providerOptions`), and `applyOpenAIExtraBody` (extra-body merge + Fireworks thinking drop) helpers used by both Chat Completions and Responses `buildParams`, and moved the Chat Completions reasoning/thinking dialect dispatch (`applyChatCompletionsReasoningParams` + `disableChatCompletionsReasoningForDialect`) plus the `OpenAICompletionsParams` request type into `openai-shared` alongside `applyResponsesReasoningParams`. As a consistency consequence, direct `streamOpenAIResponses` calls (bypassing `streamSimple`) now emit `max_output_tokens` for `alwaysSendMaxTokens` (Kimi-family) models even without a caller cap — matching Chat Completions and the value `streamSimple` already supplied.
+- Centralized OpenAI-family reasoning compat resolution behind a shared `resolveOpenAICompatPolicy` consumed by both Chat Completions and Responses request builders. Shared policy now drives tool-choice reasoning suppression, dialect-specific disable encoding, reasoning-history replay filters, encrypted-reasoning inclusion, Mistral/OpenAI tool-call-id modes, stream healing/DeepSeek token stripping, and xAI/OpenRouter cache-affinity wiring instead of endpoint-local provider/model checks.
+### Fixed
+- Fixed OpenAI Responses cost accounting to apply standard service-tier pricing multipliers (flex 0.5×, priority 2×) to the calculated cost based on the served (or requested) service tier for provider `"openai"` models.
+- Fixed OpenAI Chat Completions to consume the dedicated `requiresReasoningContentForAllAssistantTurns` compatibility flag, preventing unnecessary reasoning replay on non-tool-call turns for OpenRouter DeepSeek and OpenCode models.
+- Fixed the Kimi Code and Synthetic dual-surface shim (`streamOpenAIAnthropicShim`) to correctly forward caller-supplied `toolChoice`, `serviceTier`, and `disableReasoning` options.
+- Fixed the OpenAI Responses tool-choice compatibility helper to drop `tool_choice` when `supportsToolChoice` is false, and downgrade forced choices to `"auto"` when `supportsForcedToolChoice` is false.
+- Fixed Azure Responses to avoid emitting `tool_choice: "none"` when `context.tools` is empty.
+- Fixed Kimi via OpenRouter forced-tool requests to omit the OpenRouter `reasoning` object instead of sending `reasoning: { enabled: false }`, preserving the generic OpenRouter explicit-disable behavior while avoiding Kimi's forced-tool reasoning conflict.
+- Fixed Google Gemini CLI credential parsing schema to gracefully handle empty or unexpected non-string shapes without throwing unhandled exceptions
+- Fixed Google Gemini CLI credential parsing to correctly prioritize `projectId` over `project_id` even when empty, and drop non-string values gracefully
+- Fixed OpenRouter Responses requests to omit default max token fields unless an explicit caller cap is provided, preventing upstream filtering issues
+- Fixed Chat Completions reasoning suppression (`disableReasoningOnToolChoice` / `disableReasoningOnForcedToolChoice`) to turn thinking off symmetrically across every dialect via a shared `disableChatCompletionsReasoningForDialect` helper. Previously the conflict path only deleted `reasoning_effort`/`reasoning` (and set Z.AI `thinking: { type: "disabled" }` on the forced branch alone), leaving Qwen `enable_thinking`, Qwen chat-template `chat_template_kwargs.enable_thinking`, and OpenRouter nested `reasoning` enabled — so those hosts could keep thinking on under forced/required tool choice and re-trip the incompatibility the policy guards against. OpenRouter is now set to `{ reasoning: { enabled: false } }` (not deleted, which OpenRouter treats as default-on).
+- Fixed OpenRouter Responses requests to send `session_id` from `sessionId` in the request body for sticky provider routing and observability grouping.
+- Fixed OpenRouter Responses request shaping to preserve provider routing, variant suffixes, caller header overrides, and strict-tool fallback behavior while omitting only unsafe default max-token caps.
+- Fixed OpenAI Responses stateful chaining so a non-ZDR stale `previous_response_id` retry keeps `store: true`: the full-context retry stays chainable on the next turn and the consecutive stale-failure circuit breaker trips after the configured limit instead of alternating cold turns. Zero Data Retention rejections still disable chaining on the first strike.
+- Fixed Anthropic Messages tool schema normalization demoting root `anyOf`/`allOf` and all `oneOf` constraints into descriptions instead of forwarding provider-rejected keywords in MCP tool `input_schema`.
+- Fixed Ollama Cloud GLM-5.2 reasoning efforts to map `xhigh` to native think `"max"` ([#2911](https://github.com/can1357/oh-my-pi/pull/2911) by [@serverinspector](https://github.com/serverinspector))
+- Fixed OpenRouter Responses requests tagging the streamed assistant message with a hardcoded `openai-responses` API instead of the runtime `model.api`, which silently disabled native-history replay (`buildResponsesInput`) and cross-model tool-call item-id stripping on subsequent OpenRouter turns. The message now carries `model.api` (matching the Chat Completions path).
+- Fixed OpenAI-family streaming leaking a pre-retry `errorMessage` onto a successful turn: the OpenRouter Anthropic compiled-grammar strict-tool fallback set `errorMessage` before retrying with strict tools disabled and never cleared it on success, and the Chat Completions success path could carry an `errorMessage` from an internally-retried attempt — both made a successful turn read as errored in agent state and telemetry. The Responses fallback no longer assigns `errorMessage`, and the Completions success path clears it before emitting the terminal `done` event.
+- Fixed Codex stream-error `.code` resolution to use the same nested-first precedence (`error.code` → `error.type` → top-level `code`) as `isRetryableCodexFailureEvent` and the formatted message. Previously the error factory resolved top-level-first, so a failure event carrying both a top-level and a differing nested error code surfaced a `.code` that could disagree with its own `retryable` flag and message text.
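
The nested-first precedence named in the last entry above (`error.code` → `error.type` → top-level `code`) can be sketched as follows. This is a minimal illustration, not the package's code: the `FailureEvent` shape and the `resolveFailureCode` name are assumptions, since the diff summary does not show the real event type or error factory.

```typescript
// Hypothetical event shape, for illustration only. The real Codex failure
// event type is not shown in this changelog.
interface FailureEvent {
  code?: string;
  error?: { code?: string; type?: string };
}

// Nested-first precedence: error.code, then error.type, then top-level code.
// Resolving in this order keeps `.code` in agreement with any retryable check
// and message text that read the same nested fields first.
function resolveFailureCode(event: FailureEvent): string | undefined {
  return event.error?.code ?? event.error?.type ?? event.code;
}

// A top-level-first resolver, shown only to contrast with the behavior the
// changelog describes as the previous bug.
function resolveTopLevelFirst(event: FailureEvent): string | undefined {
  return event.code ?? event.error?.code ?? event.error?.type;
}
```

For an event carrying both a top-level and a differing nested code, the two resolvers disagree, which is the mismatch the entry describes. Note that `??` only falls through on `null` or `undefined`, so an empty-string code would still win under this sketch.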
+## [16.0.5] - 2026-06-17
+### Added
+- Added `antigravityEndpointMode` stream option with `auto`, `production`, and `sandbox` values to control Antigravity endpoint routing
+- Added `seedApiKeyResolver` for reusing a pre-resolved request key while preserving resolver-driven auth retry and credential rotation
+- Added optional `contextSnapshot` property to `AssistantMessage` with token usage metadata via new `ContextSnapshot` interface (`promptTokens`, `nonMessageTokens`, and optional `lastMessageTimestamp`)
+- Added `LITELLM_BASE_URL` guidance to the LiteLLM login prompt so non-default proxy endpoints are discoverable. ([#2726](https://github.com/can1357/oh-my-pi/issues/2726))
+- Added a Gemini thinking-loop guard that watches streamed `thinking` deltas for degenerate reasoning loops — verbatim tail repetition and near-duplicate paragraph cycling — and terminates the stream with a retryable, empty-content `error` message (worded as a transient stream stall) so the turn is discarded and re-sampled instead of committing a runaway transcript. Gated to Gemini models across every transport (OpenRouter, direct Google, Vertex) and disarmed once visible answer text or a tool call starts; disable with `PI_NO_THINKING_LOOP_GUARD=1`.
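
To make the "verbatim tail repetition" signal in the entry above concrete, here is a minimal sketch of one way such a check could work. It is an assumption-laden illustration, not the contents of `thinking-loop.ts`: the function name, parameters, and thresholds are all hypothetical, and it does not attempt the near-duplicate paragraph cycling case.

```typescript
// Hypothetical detector: returns true when the accumulated text ends with the
// same chunk repeated back to back at least `minRepeats` times.
// All names and default thresholds here are illustrative assumptions.
function hasRepeatedTail(
  text: string,
  minUnit = 20, // shortest repeating chunk to consider, in characters
  minRepeats = 3, // consecutive copies required to call it a loop
  maxUnit = 200, // longest repeating chunk to consider
): boolean {
  for (let unit = minUnit; unit <= maxUnit; unit++) {
    // Stop once the text is too short to hold `minRepeats` copies.
    if (unit * minRepeats > text.length) break;
    const tail = text.slice(text.length - unit);
    let repeated = true;
    // Compare each earlier window of the same width against the tail.
    for (let r = 2; r <= minRepeats; r++) {
      const start = text.length - unit * r;
      if (text.slice(start, start + unit) !== tail) {
        repeated = false;
        break;
      }
    }
    if (repeated) return true;
  }
  return false;
}
```

A guard built on a check like this would presumably run it over the accumulated `thinking` deltas as they stream in and stop once it fires. The thresholds trade false positives (legitimately repetitive text) against detection latency.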
+### Changed
+- Changed the Antigravity (`google-antigravity`) request builder to mirror the captured `antigravity/hub` client: gemini-3.x send `thinkingConfig.thinkingBudget` per tier, a fixed per-model `maxOutputTokens`, a default `functionCallingConfig.mode: "VALIDATED"` tool mode (auto/unset tool choice only), a `role: "user"` system instruction, a structured `requestId` (`agent/<id>/<ts>/<trajectoryId>/<step>`), and `labels` (`model_enum`, `trajectory_id`, `last_step_index`, `last_execution_id`, `used_claude*`) tracked across the conversation via provider session state.
+### Fixed
+- Fixed Gemini usage-tier mapping so `gemini-3.5-flash` is treated as `Flash` and `gemini-3.1-pro` plus `gemini-pro-agent` are treated as `Pro` in usage accounting
+- Fixed Antigravity stream state handling so a request’s `last_execution_id` is committed only after a successful completion and cleared between retry attempts
+- Fixed `streamSimple()` Gemini streams to run through the thinking-loop guard for custom API and pi-native transports, so degenerate `thinking` loops now abort with the same retryable empty-content error path as other Gemini stream paths
+- Fixed Antigravity model streaming and usage fetch paths to retry on transient `429`/`5xx` errors by failing over to the alternate endpoint before surfacing an error
+- Fixed Antigravity endpoint tracking to prefer a previously successful endpoint in `auto` mode for subsequent requests
+- Fixed Antigravity and Gemini CLI model requests failing with an opaque error when Google requires account verification. Cloud Code Assist `403 VALIDATION_REQUIRED` responses now surface the `validation_url` and the signed-in account email when available, so users see an actionable account-verification message instead of the raw API error body.
+- Fixed MiniMax M3 in-band tool calls by adding a MiniMax dialect that parses `<minimax:tool_call>` wrappers instead of falling back to generic XML. ([#2759](https://github.com/can1357/oh-my-pi/issues/2759))
+- Fixed GitHub Copilot OAuth for Business seats by storing the login-discovered API endpoint and routing model enablement plus chat requests to that endpoint. ([#2876](https://github.com/can1357/oh-my-pi/issues/2876))
 ## [16.0.4] - 2026-06-17
 ### Fixed