copilot-api-plus 1.2.11 → 1.2.13

package/README.en.md CHANGED
@@ -46,7 +46,7 @@ English | [简体中文](README.md)
  | 👥 **Multi-Account** | Multiple GitHub accounts with automatic failover on quota exhaustion/rate limiting/bans |
  | 🔀 **Model Routing** | Flexible model name mapping and per-model concurrency control |
  | 📱 **Visual Management** | Web dashboard for account management, model config, and runtime stats |
- | 🛡️ **Network Resilience** | 120s timeout + smart retry + proxy tunnel keepalive (45s heartbeat) |
+ | 🛡️ **Network Resilience** | 60s timeout + smart retry + instant stream recovery + proxy tunnel keepalive (45s heartbeat) |
  | ✂️ **Context Passthrough** | Full context passthrough to upstream API; clients (e.g. Claude Code) manage compression |
  | 🔍 **Smart Model Matching** | Handles model name format differences (date suffixes, dash/dot versions, etc.) |
  | 🧠 **Thinking Chain** | Automatically enables deep thinking (thinking/reasoning) for supported models, improving code quality |
@@ -582,8 +582,9 @@ Each API request outputs a log line with model name, status code, and duration:

  Built-in connection timeout and smart retry for upstream API requests, minimizing Copilot request credit consumption:

- - **Connection timeout**: 120 seconds for the first attempt, 30 seconds for retries (headers typically arrive in 3–5s)
+ - **Connection timeout**: 60 seconds for the first attempt, 30 seconds for retries (headers typically arrive in 3–5s)
  - **Retry strategy**: Up to 2 retries (3 total attempts), 2-3 second delays
+ - **Instant stream recovery**: On SSE stream interruption, immediately destroys the connection pool so the next request uses fresh sockets — recovery drops from ~135s to seconds
  - **Connection pool reset**: Automatically destroys all pooled connections on the first network error and creates fresh instances, preventing retries from hitting stale sockets
  - **Proxy tunnel keepalive**: Sends lightweight heartbeat requests every 45s while SSE streams are active, preventing proxy nodes from killing CONNECT tunnels due to inactivity
  - **HTTP/2 support**: Enables HTTP/2 protocol for better multiplexing performance
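The timeout and retry bullets above can be illustrated with a minimal sketch. It is not the package's `fetchWithRetry`; the names `fetchWithRetrySketch`, `attemptOnce`, `doFetch`, and `resetPool` are hypothetical, and only the numbers (60s first attempt, 30s retries, two retries with 2-3s delays, pool reset on the first failure) come from the README text.

```javascript
// Hypothetical sketch of the README's timeout/retry policy; not the package's code.
const FIRST_TIMEOUT_MS = 60_000; // first attempt
const RETRY_TIMEOUT_MS = 30_000; // each retry
const RETRY_DELAYS_MS = [2_000, 3_000]; // two retries, so three attempts in total

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run one attempt and abort it if the response headers do not arrive in time.
async function attemptOnce(doFetch, url, timeoutMs) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await doFetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer); // the timeout covers connection + headers only
  }
}

async function fetchWithRetrySketch(url, options = {}) {
  // doFetch and resetPool are injectable so the policy can be exercised offline.
  const { doFetch = fetch, resetPool = () => {}, delays = RETRY_DELAYS_MS } = options;
  let lastError;
  for (let attempt = 0; attempt <= delays.length; attempt++) {
    const timeoutMs = attempt === 0 ? FIRST_TIMEOUT_MS : RETRY_TIMEOUT_MS;
    try {
      return await attemptOnce(doFetch, url, timeoutMs);
    } catch (error) {
      lastError = error;
      // Drop pooled sockets once, on the first failure, so retries start clean.
      if (attempt === 0) resetPool();
      if (attempt < delays.length) await sleep(delays[attempt]);
    }
  }
  throw lastError;
}
```

Passing a stub as `doFetch` makes it easy to check that a request failing twice is attempted three times and that `resetPool` runs only once.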
package/README.md CHANGED
@@ -47,7 +47,7 @@
  | 👥 **多账号管理** | 支持添加多个 GitHub 账号,额度耗尽/限流/封禁时自动切换下一个 |
  | 🔀 **模型路由** | 灵活的模型名映射和每模型并发控制 |
  | 📱 **可视化管理** | Web 仪表盘支持账号管理、模型管理、运行统计 |
- | 🛡️ **网络弹性** | 120s 连接超时 + 智能重试 + 代理隧道保活(45s 心跳防断连) |
+ | 🛡️ **网络弹性** | 60s 连接超时 + 智能重试 + 流中断即时恢复 + 代理隧道保活(45s 心跳) |
  | ✂️ **上下文透传** | 全量透传上下文至上游 API,由客户端(如 Claude Code)自行管理压缩 |
  | 🔍 **智能模型匹配** | 自动处理模型名格式差异(日期后缀、dash/dot 版本号等) |
  | 🧠 **Thinking 思维链** | 自动为支持的模型启用深度思考(thinking/reasoning),提升代码质量 |
@@ -745,8 +745,9 @@ Anthropic 格式的模型名(如 `claude-opus-4-6`)和 Copilot 的模型列

  对上游 API 的请求内置了连接超时和智能重试,以最小化 Copilot 请求次数消耗:

- - **连接超时**:首次请求 120 秒,重试请求 30 秒(响应头通常 3~5 秒到达)
+ - **连接超时**:首次请求 60 秒,重试请求 30 秒(响应头通常 3~5 秒到达)
  - **重试策略**:最多重试 2 次(共 3 次尝试),间隔 2-3 秒
+ - **流中断即时恢复**:SSE 流中断时立刻销毁连接池,下一个请求使用全新连接,恢复时间从 ~135 秒降至几秒
  - **连接池重置**:首次网络错误后自动销毁所有连接并创建新实例,避免后续请求复用坏连接
  - **代理隧道保活**:SSE 流传输期间每 45 秒发送一次轻量心跳请求,防止代理节点因空闲而杀断 CONNECT 隧道
  - **HTTP/2 支持**:启用 HTTP/2 协议,提升多路复用性能
package/dist/main.js CHANGED
@@ -1763,9 +1763,9 @@ async function checkRateLimit(state) {
  /**
   * Timeout for the initial HTTP connection + headers (not the body/stream).
   * Copilot's slow models (e.g. claude-opus) can take up to ~60s to start
-  * streaming, so we give 120s for the connection phase.
+  * streaming, so we give 60s for the connection phase.
   */
- const FETCH_TIMEOUT_MS = 12e4;
+ const FETCH_TIMEOUT_MS = 6e4;
  /**
   * Retry delays in ms. After the first failure the connection pool is reset
   * (see `resetConnections`), so retries use fresh sockets. We allow up to
  * (see `resetConnections`), so retries use fresh sockets. We allow up to
@@ -1836,11 +1836,16 @@ async function fetchWithRetry(url, buildInit) {
   */
  async function* wrapGeneratorWithRelease(gen, releaseSlot) {
    notifyStreamStart();
+   let streamError = false;
    try {
      yield* gen;
+   } catch (error) {
+     streamError = true;
+     throw error;
    } finally {
      notifyStreamEnd();
      releaseSlot();
+     if (streamError) resetConnections();
    }
  }
  /**
  /**
@@ -1865,18 +1870,29 @@ function getThinkingBudget(model) {
    return Math.max(upperBound, lowerBound);
  }
  /**
+  * Check whether tool_choice forces tool use (not "auto" or "none").
+  * Thinking/reasoning cannot be enabled when tool_choice forces a tool.
+  */
+ function isToolChoiceForced(toolChoice) {
+   if (!toolChoice) return false;
+   if (toolChoice === "auto" || toolChoice === "none") return false;
+   return true;
+ }
+ /**
   * Inject thinking parameters into the payload based on model capabilities.
   *
   * Strategy (in priority order):
   * 1. If the client already set reasoning_effort or thinking_budget → keep as-is
-  * 2. If model capabilities declare max_thinking_budget → inject thinking_budget
-  * 3. Otherwise → inject reasoning_effort="high" (works on claude-*-4.6)
+  * 2. If tool_choice forces tool use → skip (API rejects the combination)
+  * 3. If model capabilities declare max_thinking_budget → inject thinking_budget
+  * 4. Otherwise → inject reasoning_effort="high" (works on claude-*-4.6)
   *
   * The fallback to reasoning_effort ensures thinking works even when the
   * /models endpoint doesn't expose thinking budget fields.
   */
  function injectThinking(payload, resolvedModel) {
    if (payload.reasoning_effort || payload.thinking_budget) return payload;
+   if (isToolChoiceForced(payload.tool_choice)) return payload;
    const budget = getThinkingBudget(findModel(resolvedModel));
    if (budget) return {
      ...payload,
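The four-step priority order in the comment above can be traced with a small sketch. `isToolChoiceForced` is copied from the diff; `injectThinkingSketch` is a hypothetical simplification that takes the budget as an argument instead of calling `getThinkingBudget(findModel(...))`. Because the hunk is cut off after `...payload,`, the exact fields the real function injects are an assumption based on the doc comment.

```javascript
// Copied from the diff: anything other than a missing value, "auto", or "none" is forced.
function isToolChoiceForced(toolChoice) {
  if (!toolChoice) return false;
  if (toolChoice === "auto" || toolChoice === "none") return false;
  return true;
}

// Hypothetical simplification: `budget` replaces the model-capability lookup.
function injectThinkingSketch(payload, budget) {
  // 1. The client already chose thinking settings: keep the payload as-is.
  if (payload.reasoning_effort || payload.thinking_budget) return payload;
  // 2. A forced tool call cannot be combined with thinking: skip injection.
  if (isToolChoiceForced(payload.tool_choice)) return payload;
  // 3. The model declares a budget: inject it (field name assumed from the comment).
  if (budget) return { ...payload, thinking_budget: budget };
  // 4. Otherwise fall back to reasoning_effort.
  return { ...payload, reasoning_effort: "high" };
}
```

For example, a payload with `tool_choice: "required"` comes back unchanged, while one with `tool_choice: "auto"` and no budget gains `reasoning_effort: "high"`.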
@@ -2159,6 +2175,7 @@ async function handleCompletion$1(c) {
      } catch (error) {
        const message = error.message || String(error);
        consola.warn(`SSE stream interrupted: ${message}`);
+       resetConnections();
      }
    });
  }
@@ -2433,6 +2450,8 @@ function mapOpenAIStopReasonToAnthropic(finishReason) {
  //#endregion
  //#region src/routes/messages/non-stream-translation.ts
  function translateToOpenAI(payload) {
+   const toolChoice = translateAnthropicToolChoiceToOpenAI(payload.tool_choice);
+   const isForced = Boolean(toolChoice) && toolChoice !== "auto" && toolChoice !== "none";
    return {
      model: translateModelName(payload.model),
      messages: translateAnthropicMessagesToOpenAI(payload.messages, payload.system),
@@ -2443,10 +2462,10 @@ function translateToOpenAI(payload) {
      top_p: payload.top_p,
      user: payload.metadata?.user_id,
      tools: translateAnthropicToolsToOpenAI(payload.tools),
-     tool_choice: translateAnthropicToolChoiceToOpenAI(payload.tool_choice),
-     ...payload.thinking && { thinking: payload.thinking },
-     ...payload.thinking?.type === "enabled" && { reasoning_effort: "high" },
-     ...payload.thinking?.budget_tokens && { thinking_budget: payload.thinking.budget_tokens }
+     tool_choice: toolChoice,
+     ...!isForced && payload.thinking && { thinking: payload.thinking },
+     ...!isForced && payload.thinking?.type === "enabled" && { reasoning_effort: "high" },
+     ...!isForced && payload.thinking?.budget_tokens && { thinking_budget: payload.thinking.budget_tokens }
    };
  }
  function translateModelName(model) {
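The `...!isForced && cond && { ... }` lines rely on the fact that spreading `false`, `undefined`, or `0` into an object literal adds no properties. The sketch below isolates that idiom; `thinkingFieldsSketch` is a hypothetical helper, and it takes an already-translated OpenAI-style `toolChoice` rather than calling `translateAnthropicToolChoiceToOpenAI`.

```javascript
// Hypothetical helper isolating the conditional-spread idiom from translateToOpenAI.
function thinkingFieldsSketch(thinking, toolChoice) {
  // Same test as in the diff: a tool choice other than "auto"/"none" is forced.
  const isForced = Boolean(toolChoice) && toolChoice !== "auto" && toolChoice !== "none";
  return {
    // Each spread contributes its object only when every condition before it is truthy;
    // a falsy left-hand side spreads nothing.
    ...(!isForced && thinking && { thinking }),
    ...(!isForced && thinking?.type === "enabled" && { reasoning_effort: "high" }),
    ...(!isForced && thinking?.budget_tokens && { thinking_budget: thinking.budget_tokens }),
  };
}
```

With `thinking = { type: "enabled", budget_tokens: 1024 }`, a `toolChoice` of `"auto"` yields all three fields, while `"required"` yields an empty object, so a forced tool call never carries thinking parameters upstream.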
@@ -2914,6 +2933,7 @@ async function handleCompletion(c) {
      } catch (error) {
        const message = error.message || String(error);
        consola.warn(`SSE stream interrupted: ${message}`);
+       resetConnections();
        try {
          const errorEvent = translateErrorToAnthropicErrorEvent();
          await stream.writeSSE({