copilot-api-plus 1.0.41 → 1.0.43

package/README.md CHANGED
@@ -1,7 +1,5 @@
  # Copilot API Plus
 
- > **Fork of [ericc-ch/copilot-api](https://github.com/ericc-ch/copilot-api)** with bug fixes and improvements.
-
  Converts GitHub Copilot, OpenCode Zen, Google Antigravity, and other AI services into **OpenAI**- and **Anthropic**-compatible APIs, integrating seamlessly with tools such as [Claude Code](https://docs.anthropic.com/en/docs/claude-code/overview) and [opencode](https://github.com/sst/opencode).
 
  ---
@@ -18,7 +16,9 @@
  - [Claude Code Integration](#-claude-code-集成)
  - [opencode Integration](#-opencode-集成)
  - [API Endpoints](#-api-端点)
- - [CLI Reference](#-命令行参考)
+ - [API Key Authentication](#-api-key-认证)
+ - [Technical Details](#-技术细节)
+ - [CLI Reference](#️-命令行参考)
  - [Docker Deployment](#-docker-部署)
  - [FAQ](#-常见问题)
 
@@ -36,6 +36,10 @@
  | ⚡ **Rate Limiting** | Built-in request throttling to avoid tripping upstream limits |
  | 🌐 **Proxy Support** | HTTP/HTTPS proxy support with persisted configuration |
  | 🐳 **Docker Support** | Complete Docker deployment setup |
+ | 🔑 **API Key Authentication** | Optional API key auth to protect publicly exposed deployments |
+ | ✂️ **Smart Context Compression** | Automatically truncates prompts that exceed the model's token limit, preserving system messages and the most recent conversation |
+ | 🔍 **Smart Model Matching** | Automatically reconciles model-name format differences (date suffixes, dash/dot version numbers, etc.) |
+ | 🔁 **Antigravity Endpoint Failover** | Automatic switching between two endpoints, per-model-family rate-limit tracking, exponential-backoff retries |
 
  ---
 
@@ -558,6 +562,7 @@ curl http://localhost:4141/v1/messages \
  | `--github-token` | `-g` | - | Provide a GitHub token directly |
  | `--show-token` | - | false | Show token information |
  | `--proxy-env` | - | false | Read proxy settings from environment variables |
+ | `--api-key` | - | - | API key authentication (may be specified multiple times) |
 
  ### proxy command options
 
@@ -604,6 +609,72 @@ npx copilot-api-plus@latest antigravity clear # Clear all accounts
 
  ---
 
+ ## 🔑 API Key Authentication
+
+ If you expose the service to the public internet, you can enable API key authentication to protect the API:
+
+ ```bash
+ # Single key
+ npx copilot-api-plus@latest start --api-key my-secret-key
+
+ # Multiple keys
+ npx copilot-api-plus@latest start --api-key key1 --api-key key2
+ ```
+
+ Once enabled, every request must carry an API key:
+
+ ```bash
+ # OpenAI format - via the Authorization header
+ curl http://localhost:4141/v1/chat/completions \
+ -H "Authorization: Bearer my-secret-key" \
+ -H "Content-Type: application/json" \
+ -d '{"model": "claude-sonnet-4", "messages": [{"role": "user", "content": "Hello"}]}'
+
+ # Anthropic format - via the x-api-key header
+ curl http://localhost:4141/v1/messages \
+ -H "x-api-key: my-secret-key" \
+ -H "Content-Type: application/json" \
+ -d '{"model": "claude-sonnet-4", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hello"}]}'
+ ```
+
+ When using Claude Code, set `ANTHROPIC_AUTH_TOKEN` to your API key.
+
+ ---
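The dual-header scheme above — OpenAI-style `Authorization: Bearer` plus Anthropic-style `x-api-key` — can be sketched as a small check function. This is an illustrative sketch, not the package's actual middleware; `makeApiKeyCheck` is a hypothetical name:

```javascript
// Hypothetical sketch of dual-header API key checking, as described above.
// Accepts either "Authorization: Bearer <key>" (OpenAI style) or
// "x-api-key: <key>" (Anthropic style), validated against a configured key set.
function makeApiKeyCheck(validKeys) {
  const keys = new Set(validKeys);
  return (headers) => {
    const auth = headers["authorization"];
    if (auth && auth.startsWith("Bearer ") && keys.has(auth.slice(7))) return true;
    const xApiKey = headers["x-api-key"];
    if (xApiKey && keys.has(xApiKey)) return true;
    return false;
  };
}

const check = makeApiKeyCheck(["my-secret-key"]);
console.log(check({ authorization: "Bearer my-secret-key" })); // true
console.log(check({ "x-api-key": "wrong" }));                  // false
```

Passing `--api-key` multiple times simply grows the key set, which is why either header may carry any of the configured keys.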
+
+ ## 🔧 Technical Details
+
+ ### Smart Context Compression
+
+ When the prompt's token count exceeds the model's context-window limit, the proxy automatically truncates messages to avoid a 400 error from the upstream API:
+
+ - **System/developer messages are preserved**: messages with the system and developer roles are always kept
+ - **Recent conversation is preserved**: the oldest messages are dropped first, keeping the most recent context
+ - **Tool calls stay grouped**: an assistant message's tool_calls and the corresponding tool result messages form one unit and are never split apart
+ - **5% safety margin**: the effective limit is 95% of the model's context window, to avoid edge cases
+
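The rules above can be sketched roughly as follows. This is an illustrative approximation, not the package's internals: `compressContext` and the message shapes are assumptions, and the real grouping ties tool results to the specific assistant `tool_calls` message rather than simply to the preceding group:

```javascript
// Rough sketch of the truncation policy described above (illustrative names).
// Keeps system/developer messages, drops oldest turns first, keeps tool-call
// groups intact, and targets 95% of the context window as a safety margin.
function compressContext(messages, contextWindow, countTokens) {
  const budget = Math.floor(contextWindow * 0.95); // 5% safety margin
  const pinned = messages.filter((m) => m.role === "system" || m.role === "developer");
  const rest = messages.filter((m) => m.role !== "system" && m.role !== "developer");

  // Approximate grouping: attach tool result messages to the preceding group
  // so a tool_calls message and its results are dropped or kept together.
  const groups = [];
  for (const m of rest) {
    if (m.role === "tool" && groups.length > 0) groups[groups.length - 1].push(m);
    else groups.push([m]);
  }

  // Walk backwards from the newest group, keeping whatever fits in the budget.
  let used = pinned.reduce((n, m) => n + countTokens(m), 0);
  const kept = [];
  for (let i = groups.length - 1; i >= 0; i--) {
    const cost = groups[i].reduce((n, m) => n + countTokens(m), 0);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift(...groups[i]);
  }
  return [...pinned, ...kept];
}
```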
+ ### Smart Model Name Matching
+
+ Anthropic-style model names (e.g. `claude-opus-4-6`) and the model IDs in Copilot's model list may differ in format. The proxy tries several exact-match strategies:
+
+ | Strategy | Example |
+ |------|------|
+ | Exact match | `claude-opus-4-6` → `claude-opus-4-6` |
+ | Strip date suffix | `claude-opus-4-6-20251101` → `claude-opus-4-6` |
+ | Dash → Dot | `claude-opus-4-5` → `claude-opus-4.5` |
+ | Dot → Dash | `claude-opus-4.5` → `claude-opus-4-5` |
+
+ For the Anthropic endpoint (`/v1/messages`), the name is first converted via `translateModelName` (including the legacy-format mapping `claude-3-5-sonnet` → `claude-sonnet-4.5`) and then matched using the strategies above.
+
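The strategies in the table can be thought of as generating candidate IDs to try against the upstream model list. A hypothetical sketch, not the package's actual code:

```javascript
// Illustrative candidate generation for the matching table above.
// Each candidate is looked up against the upstream model list in order.
function modelNameCandidates(name) {
  const candidates = [name];                          // exact match first
  const noDate = name.replace(/-\d{8}$/, "");         // strip -YYYYMMDD date suffix
  if (noDate !== name) candidates.push(noDate);
  for (const base of [...candidates]) {
    const dashToDot = base.replace(/-(\d+)$/, ".$1"); // trailing dash version -> dot
    if (dashToDot !== base) candidates.push(dashToDot);
    const dotToDash = base.replace(/\.(\d+)$/, "-$1"); // trailing dot version -> dash
    if (dotToDash !== base) candidates.push(dotToDash);
  }
  return candidates;
}

console.log(modelNameCandidates("claude-opus-4-6-20251101"));
```

Trying the exact name first means a model whose upstream ID already matches never pays for the fallback rewrites.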
668
+ ### Antigravity 端点容错
669
+
670
+ Google Antigravity 模式内置了可靠性保障:
671
+
672
+ - **双端点自动切换**:daily sandbox 和 production 两个端点,一个失败自动切换到另一个
673
+ - **按模型族速率追踪**:分别追踪 Gemini 和 Claude 模型族的速率限制状态
674
+ - **指数退避重试**:429/503 等限流错误自动退避重试,短间隔走同端点,长间隔切换端点
675
+
676
+ ---
677
+
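A minimal sketch of the retry/failover policy described above. The endpoint labels, the threshold, and the `planRetries` helper are illustrative assumptions, not the package's actual logic:

```javascript
// Illustrative sketch: exponential backoff where short waits retry the same
// endpoint and waits past a threshold switch to the other endpoint.
function planRetries(maxAttempts, baseDelayMs, switchThresholdMs) {
  const endpoints = ["daily-sandbox", "production"]; // assumed labels
  let current = 0;
  const plan = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const delay = baseDelayMs * 2 ** attempt;             // exponential backoff
    if (delay >= switchThresholdMs) current = 1 - current; // long wait: switch endpoint
    plan.push({ attempt, delay, endpoint: endpoints[current] });
  }
  return plan;
}
```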
  ## 🐳 Docker Deployment
 
  ### Quick Start
package/dist/main.js CHANGED
@@ -1194,6 +1194,15 @@ const apiKeyAuthMiddleware = async (c, next) => {
  //#endregion
  //#region src/lib/model-logger.ts
  /**
+ * Global token usage store for passing usage info from handlers to logger.
+ * Handlers call setTokenUsage() when usage is available,
+ * logger reads and clears it after await next().
+ */
+ let pendingTokenUsage;
+ function setTokenUsage(usage) {
+ pendingTokenUsage = usage;
+ }
+ /**
  * Get timestamp string in format HH:mm:ss
  */
  function getTime() {
@@ -1207,6 +1216,15 @@ function formatDuration(ms) {
  return `${(ms / 1e3).toFixed(1)}s`;
  }
  /**
+ * Format token usage for log output
+ */
+ function formatTokenUsage(usage) {
+ const parts = [`in:${usage.inputTokens}`, `out:${usage.outputTokens}`];
+ if (usage.cacheReadTokens) parts.push(`cache_read:${usage.cacheReadTokens}`);
+ if (usage.cacheCreationTokens) parts.push(`cache_create:${usage.cacheCreationTokens}`);
+ return parts.join(" ");
+ }
+ /**
  * Extract model name from request body
  */
  async function extractModel(c) {
@@ -1221,7 +1239,7 @@ async function extractModel(c) {
  *
  * Output format:
  * [model] HH:mm:ss <-- METHOD /path
- * [model] HH:mm:ss --> METHOD /path STATUS DURATION
+ * [model] HH:mm:ss --> METHOD /path STATUS DURATION [in:N out:N]
  */
  function modelLogger() {
  return async (c, next) => {
@@ -1233,12 +1251,16 @@ function modelLogger() {
  if (method === "POST" && c.req.header("content-type")?.includes("json")) model = await extractModel(c);
  const modelPrefix = model ? `[${model}] ` : "";
  const startTime = getTime();
+ pendingTokenUsage = void 0;
  console.log(`${modelPrefix}${startTime} <-- ${method} ${fullPath}`);
  const start$1 = Date.now();
  await next();
  const duration = Date.now() - start$1;
  const endTime = getTime();
- console.log(`${modelPrefix}${endTime} --> ${method} ${fullPath} ${c.res.status} ${formatDuration(duration)}`);
+ const usage = pendingTokenUsage;
+ pendingTokenUsage = void 0;
+ const usageSuffix = usage ? ` [${formatTokenUsage(usage)}]` : "";
+ console.log(`${modelPrefix}${endTime} --> ${method} ${fullPath} ${c.res.status} ${formatDuration(duration)}${usageSuffix}`);
  };
  }
 
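The handoff pattern in this hunk — a handler stashes usage in a module-level slot via `setTokenUsage()`, and the logging middleware reads and clears the slot after `await next()` — can be demonstrated standalone. This is a simplified sketch (`withUsageLogging` is a hypothetical stand-in for the Hono middleware), and like the real code it relies on the slot being cleared before each request:

```javascript
// Standalone demo of the set-then-read-after-next() handoff shown above.
let pendingTokenUsage;
function setTokenUsage(usage) {
  pendingTokenUsage = usage;
}
function formatTokenUsage(usage) {
  const parts = [`in:${usage.inputTokens}`, `out:${usage.outputTokens}`];
  if (usage.cacheReadTokens) parts.push(`cache_read:${usage.cacheReadTokens}`);
  return parts.join(" ");
}
async function withUsageLogging(handler) {
  pendingTokenUsage = undefined;   // clear stale usage from a prior request
  await handler();                 // handler may call setTokenUsage()
  const usage = pendingTokenUsage;
  pendingTokenUsage = undefined;   // consume the slot
  return usage ? `[${formatTokenUsage(usage)}]` : "";
}
```

A single shared slot assumes the set and the read happen within one request's handling; concurrent requests that interleave between them could attribute usage to the wrong log line, which is an inherent trade-off of this lightweight design.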
@@ -2935,11 +2957,27 @@ const createChatCompletions = async (payload) => {
  model: payload.model,
  endpoint: `${copilotBaseUrl(state)}/chat/completions`
  });
- const response = await fetch(`${copilotBaseUrl(state)}/chat/completions`, {
+ const url = `${copilotBaseUrl(state)}/chat/completions`;
+ const fetchOptions = {
  method: "POST",
  headers,
  body: JSON.stringify(payload)
- });
+ };
+ const maxRetries = 2;
+ let lastError;
+ let response;
+ for (let attempt = 0; attempt <= maxRetries; attempt++) try {
+ response = await fetch(url, fetchOptions);
+ break;
+ } catch (error) {
+ lastError = error;
+ if (attempt < maxRetries) {
+ const delay = 1e3 * (attempt + 1);
+ consola.warn(`Network error on attempt ${attempt + 1}/${maxRetries + 1}, retrying in ${delay}ms:`, error instanceof Error ? error.message : error);
+ await new Promise((r) => setTimeout(r, delay));
+ }
+ }
+ if (!response) throw lastError;
  if (!response.ok) {
  const errorBody = await response.text();
  consola.error("Failed to create chat completions", {
@@ -2994,12 +3032,28 @@ async function handleCompletion$1(c) {
  const response = await createChatCompletions(payload);
  if (isNonStreaming$1(response)) {
  consola.debug("Non-streaming response:", JSON.stringify(response));
+ if (response.usage) setTokenUsage({
+ inputTokens: response.usage.prompt_tokens,
+ outputTokens: response.usage.completion_tokens,
+ cacheReadTokens: response.usage.prompt_tokens_details?.cached_tokens
+ });
  return c.json(response);
  }
  consola.debug("Streaming response");
  return streamSSE(c, async (stream) => {
  for await (const chunk of response) {
  consola.debug("Streaming chunk:", JSON.stringify(chunk));
+ try {
+ const sseChunk = chunk;
+ if (sseChunk.data && sseChunk.data !== "[DONE]") {
+ const parsed = JSON.parse(sseChunk.data);
+ if (parsed.usage) setTokenUsage({
+ inputTokens: parsed.usage.prompt_tokens ?? 0,
+ outputTokens: parsed.usage.completion_tokens ?? 0,
+ cacheReadTokens: parsed.usage.prompt_tokens_details?.cached_tokens
+ });
+ }
+ } catch {}
  await stream.writeSSE(chunk);
  }
  });
@@ -3446,6 +3500,12 @@ async function handleCompletion(c) {
  const response = await createChatCompletions(openAIPayload);
  if (isNonStreaming(response)) {
  const anthropicResponse = translateToAnthropic(response);
+ setTokenUsage({
+ inputTokens: anthropicResponse.usage.input_tokens,
+ outputTokens: anthropicResponse.usage.output_tokens,
+ cacheReadTokens: anthropicResponse.usage.cache_read_input_tokens,
+ cacheCreationTokens: anthropicResponse.usage.cache_creation_input_tokens
+ });
  return c.json(anthropicResponse);
  }
  return streamSSE(c, async (stream) => {
@@ -3460,6 +3520,11 @@ async function handleCompletion(c) {
  if (!rawEvent.data) continue;
  const chunk = JSON.parse(rawEvent.data);
  const events$1 = translateChunkToAnthropicEvents(chunk, streamState);
+ if (chunk.usage) setTokenUsage({
+ inputTokens: chunk.usage.prompt_tokens - (chunk.usage.prompt_tokens_details?.cached_tokens ?? 0),
+ outputTokens: chunk.usage.completion_tokens,
+ cacheReadTokens: chunk.usage.prompt_tokens_details?.cached_tokens
+ });
  for (const event of events$1) await stream.writeSSE({
  event: event.type,
  data: JSON.stringify(event)