npm - @oh-my-pi/pi-ai - Versions diffs - 16.0.5 → 16.0.7 - Mend

@oh-my-pi/pi-ai 16.0.5 → 16.0.7

Files changed (64)

package/CHANGELOG.md +52 -0
package/dist/types/auth-broker/wire-schemas.d.ts +391 -417
package/dist/types/auth-storage.d.ts +13 -1
package/dist/types/index.d.ts +3 -0
package/dist/types/providers/__tests__/openai-codex-error.test.d.ts +1 -0
package/dist/types/providers/anthropic-messages-server-schema.d.ts +450 -449
package/dist/types/providers/anthropic.d.ts +5 -2
package/dist/types/providers/azure-openai-responses.d.ts +1 -1
package/dist/types/providers/openai-chat-server-schema.d.ts +656 -781
package/dist/types/providers/openai-codex/request-transformer.d.ts +1 -0
package/dist/types/providers/openai-codex-responses.d.ts +11 -1
package/dist/types/providers/openai-completions.d.ts +3 -12
package/dist/types/providers/openai-responses-server-schema.d.ts +314 -363
package/dist/types/providers/openai-responses.d.ts +25 -20
package/dist/types/providers/openai-shared.d.ts +496 -0
package/dist/types/types.d.ts +15 -6
package/dist/types/usage/opencode-go.d.ts +2 -0
package/dist/types/usage.d.ts +123 -152
package/dist/types/utils/idle-iterator.d.ts +24 -0
package/dist/types/utils/openrouter-headers.d.ts +1 -0
package/dist/types/utils/schema/normalize.d.ts +31 -0
package/dist/types/utils/schema/wire.d.ts +29 -7
package/dist/types/utils/thinking-loop.d.ts +14 -10
package/package.json +4 -3
package/src/auth-broker/client.ts +16 -16
package/src/auth-broker/server.ts +9 -8
package/src/auth-broker/snapshot-cache.ts +4 -3
package/src/auth-broker/wire-schemas.ts +183 -153
package/src/auth-storage.ts +144 -5
package/src/index.ts +3 -0
package/src/providers/__tests__/openai-codex-error.test.ts +84 -0
package/src/providers/amazon-bedrock.ts +18 -18
package/src/providers/anthropic-messages-server-schema.ts +143 -149
package/src/providers/anthropic-messages-server.ts +4 -4
package/src/providers/anthropic.ts +43 -17
package/src/providers/azure-openai-responses.ts +31 -104
package/src/providers/google-gemini-cli.ts +53 -48
package/src/providers/ollama.ts +42 -28
package/src/providers/openai-anthropic-shim.ts +7 -2
package/src/providers/openai-chat-server-schema.ts +124 -140
package/src/providers/openai-chat-server.ts +5 -4
package/src/providers/openai-codex/request-transformer.ts +1 -0
package/src/providers/openai-codex-responses.ts +498 -581
package/src/providers/openai-completions.ts +333 -564
package/src/providers/openai-responses-server-schema.ts +169 -186
package/src/providers/openai-responses-server.ts +5 -5
package/src/providers/openai-responses.ts +276 -307
package/src/providers/openai-shared.ts +2415 -0
package/src/registry/oauth/google-oauth-shared.ts +5 -1
package/src/registry/oauth/kimi.ts +9 -4
package/src/stream.ts +41 -11
package/src/types.ts +21 -6
package/src/usage/opencode-go.ts +89 -0
package/src/usage.ts +62 -38
package/src/utils/idle-iterator.ts +34 -0
package/src/utils/openrouter-headers.ts +12 -0
package/src/utils/schema/normalize.ts +162 -8
package/src/utils/schema/wire.ts +336 -8
package/src/utils/thinking-loop.ts +68 -43
package/src/utils/validation.ts +36 -12
package/dist/types/providers/openai-responses-shared.d.ts +0 -145
package/dist/types/providers/xai-responses.d.ts +0 -23
package/src/providers/openai-responses-shared.ts +0 -1254
package/src/providers/xai-responses.ts +0 -82

package/CHANGELOG.md CHANGED

@@ -2,6 +2,58 @@
 ## [Unreleased]
+## [16.0.7] - 2026-06-18
+### Changed
+- Switched Google OAuth callback hostname from `localhost` to `127.0.0.1` to prevent IPv6 loopback fallback delays and proxy routing interception.
+### Fixed
+- Fixed OpenCode Go usage reporting to synthesize `/usage` limits from OMP-observed request costs for the 5h, weekly, and monthly provider caps. ([#2942](https://github.com/can1357/oh-my-pi/issues/2942))
+- Fixed MiniMax Anthropic-compatible requests to serialize adaptive thinking without an invalid Anthropic `output_config.effort` tier ([#2928](https://github.com/can1357/oh-my-pi/issues/2928)).
+## [16.0.6] - 2026-06-18
+### Added
+- Added support for ArkType schemas as tool parameters alongside existing Zod schemas
+- Added `getOpenRouterHeaders` utility to export standard OpenRouter integration headers
+### Changed
+- Expanded thinking loop detection guard to also cover DeepSeek models (family, provider, or id matches).
+- Extended loop guard to monitor assistant response prose (via `text_delta` events) in addition to thinking logs, customizable via request options.
+- Modified loop guard error reporting to emit a non-retryable partial content block containing the accumulated streamed text if a loop is detected after response prose has started streaming, preventing unsafe agent session rollbacks.
+- Migrated internal wire-schema validation (auth-broker, Anthropic Messages request, OpenAI Chat/Responses requests, and /v1/usage shapes) from Zod to ArkType
+- Replaced the dedicated `xai-responses` provider with a unified `openai-responses` path that handles xAI-specific reasoning effort stripping dynamically
+- Updated OpenAI Responses stream handling to throw a clearer error message when a stream closes without a terminal response event
+- Consolidated shared OpenAI-compatible routing and strict-tool fallback helpers across Chat Completions and Responses providers.
+- Consolidated the OpenAI-family provider stack: merged `openai-responses-shared` into `openai-shared` and removed the now-dead `openai-responses-shared` re-export shim; folded the three duplicated `service_tier` request blocks and the per-provider wire model-id transform into shared `applyOpenAIServiceTier`/`applyWireModelIdTransform` helpers; moved residual provider-name wire-quirk checks (DeepSeek special-token strip, cumulative reasoning deltas, Ollama empty-length context error, OpenAI tool-call-id cap, Fireworks thinking drop, OpenRouter/OpenAI Responses request fields) into resolved compat fields; shared the Responses stream per-block accumulation helpers plus the terminal pending-tool-call finalization (`finalizePendingResponsesToolCalls`) and toolUse/pause stop-reason promotion (`promoteResponsesToolUseStopReason`) between `processResponsesStream` and the Codex stream handler; and removed the redundant `getOpenAIResponsesCacheSessionId` alias in favor of `getOpenAIResponsesPromptCacheKey`.
+- Centralized OpenAI-family request-param policy into shared `resolveOpenAIOutputTokenParam` (output-token field selection, OpenRouter default-cap omission, `alwaysSendMaxTokens` defaulting, model/provider clamp), `applyOpenAIGatewayRouting` (OpenRouter `provider` + Vercel AI Gateway `providerOptions`), and `applyOpenAIExtraBody` (extra-body merge + Fireworks thinking drop) helpers used by both Chat Completions and Responses `buildParams`, and moved the Chat Completions reasoning/thinking dialect dispatch (`applyChatCompletionsReasoningParams` + `disableChatCompletionsReasoningForDialect`) plus the `OpenAICompletionsParams` request type into `openai-shared` alongside `applyResponsesReasoningParams`. As a consistency consequence, direct `streamOpenAIResponses` calls (bypassing `streamSimple`) now emit `max_output_tokens` for `alwaysSendMaxTokens` (Kimi-family) models even without a caller cap — matching Chat Completions and the value `streamSimple` already supplied.
+- Centralized OpenAI-family reasoning compat resolution behind a shared `resolveOpenAICompatPolicy` consumed by both Chat Completions and Responses request builders. Shared policy now drives tool-choice reasoning suppression, dialect-specific disable encoding, reasoning-history replay filters, encrypted-reasoning inclusion, Mistral/OpenAI tool-call-id modes, stream healing/DeepSeek token stripping, and xAI/OpenRouter cache-affinity wiring instead of endpoint-local provider/model checks.
+### Fixed
+- Fixed OpenAI Responses cost accounting to apply standard service-tier pricing multipliers (flex 0.5×, priority 2×) to the calculated cost based on the served (or requested) service tier for provider `"openai"` models.
+- Fixed OpenAI Chat Completions to consume the dedicated `requiresReasoningContentForAllAssistantTurns` compatibility flag, preventing unnecessary reasoning replay on non-tool-call turns for OpenRouter DeepSeek and OpenCode models.
+- Fixed the Kimi Code and Synthetic dual-surface shim (`streamOpenAIAnthropicShim`) to correctly forward caller-supplied `toolChoice`, `serviceTier`, and `disableReasoning` options.
+- Fixed the OpenAI Responses tool-choice compatibility helper to drop `tool_choice` when `supportsToolChoice` is false, and downgrade forced choices to `"auto"` when `supportsForcedToolChoice` is false.
+- Fixed Azure Responses to avoid emitting `tool_choice: "none"` when `context.tools` is empty.
+- Fixed Kimi via OpenRouter forced-tool requests to omit the OpenRouter `reasoning` object instead of sending `reasoning: { enabled: false }`, preserving the generic OpenRouter explicit-disable behavior while avoiding Kimi's forced-tool reasoning conflict.
+- Fixed Google Gemini CLI credential parsing schema to gracefully handle empty or unexpected non-string shapes without throwing unhandled exceptions
+- Fixed Google Gemini CLI credential parsing to correctly prioritize `projectId` over `project_id` even when empty, and drop non-string values gracefully
+- Fixed OpenRouter Responses requests to omit default max token fields unless an explicit caller cap is provided, preventing upstream filtering issues
+- Fixed Chat Completions reasoning suppression (`disableReasoningOnToolChoice` / `disableReasoningOnForcedToolChoice`) to turn thinking off symmetrically across every dialect via a shared `disableChatCompletionsReasoningForDialect` helper. Previously the conflict path only deleted `reasoning_effort`/`reasoning` (and set Z.AI `thinking: { type: "disabled" }` on the forced branch alone), leaving Qwen `enable_thinking`, Qwen chat-template `chat_template_kwargs.enable_thinking`, and OpenRouter nested `reasoning` enabled — so those hosts could keep thinking on under forced/required tool choice and re-trip the incompatibility the policy guards against. OpenRouter is now set to `{ reasoning: { enabled: false } }` (not deleted, which OpenRouter treats as default-on).
+- Fixed OpenRouter Responses requests to send `session_id` from `sessionId` in the request body for sticky provider routing and observability grouping.
+- Fixed OpenRouter Responses request shaping to preserve provider routing, variant suffixes, caller header overrides, and strict-tool fallback behavior while omitting only unsafe default max-token caps.
+- Fixed OpenAI Responses stateful chaining so a non-ZDR stale `previous_response_id` retry keeps `store: true`: the full-context retry stays chainable on the next turn and the consecutive stale-failure circuit breaker trips after the configured limit instead of alternating cold turns. Zero Data Retention rejections still disable chaining on the first strike.
+- Fixed Anthropic Messages tool schema normalization demoting root `anyOf`/`allOf` and all `oneOf` constraints into descriptions instead of forwarding provider-rejected keywords in MCP tool `input_schema`.
+- Fixed Ollama Cloud GLM-5.2 reasoning efforts to map `xhigh` to native think `"max"` ([#2911](https://github.com/can1357/oh-my-pi/pull/2911) by [@serverinspector](https://github.com/serverinspector))
+- Fixed OpenRouter Responses requests tagging the streamed assistant message with a hardcoded `openai-responses` API instead of the runtime `model.api`, which silently disabled native-history replay (`buildResponsesInput`) and cross-model tool-call item-id stripping on subsequent OpenRouter turns. The message now carries `model.api` (matching the Chat Completions path).
+- Fixed OpenAI-family streaming leaking a pre-retry `errorMessage` onto a successful turn: the OpenRouter Anthropic compiled-grammar strict-tool fallback set `errorMessage` before retrying with strict tools disabled and never cleared it on success, and the Chat Completions success path could carry an `errorMessage` from an internally-retried attempt — both made a successful turn read as errored in agent state and telemetry. The Responses fallback no longer assigns `errorMessage`, and the Completions success path clears it before emitting the terminal `done` event.
+- Fixed Codex stream-error `.code` resolution to use the same nested-first precedence (`error.code` → `error.type` → top-level `code`) as `isRetryableCodexFailureEvent` and the formatted message. Previously the error factory resolved top-level-first, so a failure event carrying both a top-level and a differing nested error code surfaced a `.code` that could disagree with its own `retryable` flag and message text.
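
Reviewer note (not part of the diff): the last 16.0.6 entry above describes a nested-first lookup order for the Codex error code (`error.code`, then `error.type`, then top-level `code`). Below is a minimal sketch of that precedence, not the package's actual code. The `CodexFailureEvent` shape, `resolveCodexErrorCode`, `isRetryableSketch`, and the retryable code set are all hypothetical names and values invented for illustration.

```typescript
// Hypothetical event shape for illustration; the package's real types may differ.
interface CodexFailureEvent {
  code?: string;
  error?: { code?: string; type?: string };
}

// Nested-first precedence as stated in the changelog entry:
// error.code, then error.type, then the top-level code.
// Note: `??` only skips null/undefined, so an empty string would still win.
function resolveCodexErrorCode(event: CodexFailureEvent): string | undefined {
  return event.error?.code ?? event.error?.type ?? event.code;
}

// Assumed retryable set, purely for demonstration.
const RETRYABLE_CODES = new Set(["rate_limit_exceeded", "server_error"]);

// Derives the retryable flag from the same resolved code, so the flag and
// the surfaced code cannot disagree with each other.
function isRetryableSketch(event: CodexFailureEvent): boolean {
  const code = resolveCodexErrorCode(event);
  return code !== undefined && RETRYABLE_CODES.has(code);
}

// A failure event carrying differing top-level and nested codes.
const sampleEvent: CodexFailureEvent = {
  code: "invalid_request",
  error: { code: "rate_limit_exceeded" },
};

console.log(resolveCodexErrorCode(sampleEvent)); // "rate_limit_exceeded"
console.log(isRetryableSketch(sampleEvent)); // true
```

Routing both the surfaced code and the retry decision through one resolver is what keeps them consistent. A top-level-first resolver would have reported `invalid_request` for `sampleEvent`, even though the retry decision was made from the nested code.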
 ## [16.0.5] - 2026-06-17
 ### Added