@jeffreycao/copilot-api 1.10.10 → 1.10.12
- package/README.md +44 -207
- package/README.zh-CN.md +38 -203
- package/dist/{auth-CeZtnQE_.js → auth-9cNGmKTS.js} +2 -2
- package/dist/{auth-CeZtnQE_.js.map → auth-9cNGmKTS.js.map} +1 -1
- package/dist/{check-usage-CTShKCvR.js → check-usage-TtF-fhRm.js} +2 -2
- package/dist/{check-usage-CTShKCvR.js.map → check-usage-TtF-fhRm.js.map} +1 -1
- package/dist/main.js +3 -3
- package/dist/{server-D5O9IzAY.js → server-C3iJWAUP.js} +156 -122
- package/dist/server-C3iJWAUP.js.map +1 -0
- package/dist/{start-BhPxHgu8.js → start-DSzP59pP.js} +3 -3
- package/dist/{start-BhPxHgu8.js.map → start-DSzP59pP.js.map} +1 -1
- package/dist/{token-1SfgxCRm.js → token-CQHif3s_.js} +34 -13
- package/dist/token-CQHif3s_.js.map +1 -0
- package/package.json +1 -1
- package/dist/server-D5O9IzAY.js.map +0 -1
- package/dist/token-1SfgxCRm.js.map +0 -1
package/README.md
CHANGED
@@ -2,25 +2,6 @@
 
 English | [简体中文](./README.zh-CN.md)
 
-> [!WARNING]
-> This is a reverse-engineered proxy of GitHub Copilot API. It is not supported by GitHub, and may break unexpectedly. Use at your own risk. In the current version, if not using opencode OAuth, the device ID and machine ID will be sent to GitHub Copilot. It is not recommended to use a large number of accounts on a single device; if necessary, it is advised to run them in Docker containers.
-
-> [!WARNING]
-> **GitHub Security Notice:**
-> Excessive automated or scripted use of Copilot (including rapid or bulk requests, such as via automated tools) may trigger GitHub's abuse-detection systems.
-> You may receive a warning from GitHub Security, and further anomalous activity could result in temporary suspension of your Copilot access.
->
-> GitHub prohibits use of their servers for excessive automated bulk activity or any activity that places undue burden on their infrastructure.
->
-> Please review:
->
-> - [GitHub Acceptable Use Policies](https://docs.github.com/site-policy/acceptable-use-policies/github-acceptable-use-policies#4-spam-and-inauthentic-activity-on-github)
-> - [GitHub Copilot Terms](https://docs.github.com/site-policy/github-terms/github-terms-for-additional-products-and-features#github-copilot)
->
-> Use this proxy responsibly to avoid account restrictions.
-
----
-
 ## Important Notes
 
 > [!IMPORTANT]
@@ -28,98 +9,35 @@ English | [简体中文](./README.zh-CN.md)
 >
 > 1. **Claude Code configuration:** When using with Claude Code, please configure the model ID as `claude-opus-4-6` or `claude-opus-4.6` (without the `[1m]` suffix, exceeding GitHub Copilot's context window limit too much may lead to being banned). Example claude `settings.json` see [Manual Configuration with `settings.json`](#manual-configuration-with-settingsjson).
 >
-> 2. **Recommend for Opencode:**
+> 2. **Recommend for Opencode:** For opencode, prefer the opencode OAuth app. It matches opencode's built-in GitHub Copilot provider and avoids Terms of Service risk:
 > ```sh
 > npx @jeffreycao/copilot-api@latest --oauth-app=opencode start
 > ```
 >
-> 3. **
+> 3. **Built-in `codex` provider:** Run `npx @jeffreycao/copilot-api@latest auth login --provider codex` once and the gateway will persist and refresh Codex OAuth credentials automatically.
+>
+> 4. **Disable multi agent when using codex:** If you're using codex via GitHub Copilot, disable multi agent. Copilot currently charges codex traffic based on whether the last message is a user role, and that billing logic has not been adjusted.
+>
+> 5. **Note:** See [GitHub Copilot Security Notice](./NOTICE.md#github-copilot-security-notice) for the warning removed from the README header.
 
 ---
 
 ## Project Overview
 
-A reverse-engineered
+A reverse-engineered GitHub Copilot integration that also works as a small AI gateway. Besides Copilot, it can route the built-in `codex` provider and configured third-party providers such as DashScope behind OpenAI- and Anthropic-compatible APIs, so tools like [Claude Code](https://docs.anthropic.com/en/docs/claude-code/overview) can use one local endpoint.
 
-
+On the GitHub Copilot path, the gateway prefers Copilot's native Anthropic-style Messages API when available, preserving more Claude-native behavior for tool-heavy workflows.
 
 ## Features
 
-- **OpenAI
-- **
-- **
-- **
-- **
-- **
-- **
-- **
-- **Usage Dashboard**: A web-based dashboard to monitor your Copilot API usage, view quotas, and see detailed statistics.
-- **Rate Limit Control**: Manage API usage with rate-limiting options (`--rate-limit`) and a waiting mechanism (`--wait`) to prevent errors from rapid requests.
-- **Manual Request Approval**: Manually approve or deny each API request for fine-grained control over usage (`--manual`).
-- **Token Visibility**: Option to display GitHub and Copilot tokens during authentication and refresh for debugging (`--show-token`).
-- **Flexible Authentication**: Authenticate interactively or provide a GitHub token directly, suitable for CI/CD environments.
-- **Support for Different Account Types**: Works with individual, business, and enterprise GitHub Copilot plans.
-- **Opencode OAuth Support**: Use opencode GitHub Copilot authentication by setting `COPILOT_API_OAUTH_APP=opencode` environment variable or using `--oauth-app=opencode` command line option.
-- **GitHub Enterprise Support**: Connect to GHE.com by setting `COPILOT_API_ENTERPRISE_URL` environment variable (e.g., `company.ghe.com`) or using `--enterprise-url=company.ghe.com` command line option.
-- **Custom Data Directory**: Change the default data directory (where tokens and config are stored) by setting `COPILOT_API_HOME` environment variable or using `--api-home=/path/to/dir` command line option.
-- **Multi-Provider Messages Proxy Routes**: Add global provider configs and call external Anthropic-compatible or OpenAI-compatible APIs via `/:provider/v1/messages` and `/:provider/v1/models`, or send `model: "provider/model"` to the top-level `/v1/messages` API.
-- **Accurate Claude Token Counting**: Optionally forward `/v1/messages/count_tokens` requests for Claude models to Anthropic's free token counting endpoint for exact counts instead of GPT tokenizer estimation.
-- **GPT Context Management**: Configurable context compaction for long-running GPT conversations via `responsesApiContextManagementModels`, reducing unnecessary premium requests when approaching token limits. See [Configuration](#configuration-configjson) for details.
-
-## Better Agent Semantics
-
-### Native Anthropic Messages API when available
-
-For models that advertise Copilot support for `/v1/messages`, this project sends the request to the native Messages API first and only falls back to `/responses` or `/chat/completions` when needed.
-
-Compared with using Claude-family models only through Chat Completions compatibility, the Messages API path keeps more Anthropic-native behavior, including support for:
-
-- `interleaved-thinking-2025-05-14`
-- `advanced-tool-use-2025-11-20`
-- `context-management-2025-06-27`
-
-Supported `anthropic-beta` values are filtered and forwarded on the native Messages path, and `interleaved-thinking` is added automatically when a thinking budget is requested for non-adaptive extended thinking.
-
-### Fewer unnecessary Premium requests
-
-The proxy includes request-accounting safeguards designed for tool-heavy coding workflows:
-
-- tool-less warmup or probe requests can be forced onto `smallModel` so background checks do not spend premium usage;
-- mixed `tool_result` + reminder text blocks are merged back into the `tool_result` flow instead of being counted like fresh user turns;
-- `x-initiator` is derived from the latest message or item, not stale assistant history.
-
-This helps resumed tool turns continue the existing workflow instead of consuming an extra Premium request as a brand-new interaction.
-
-### Phase-aware `gpt-5.4` and `gpt-5.3-codex`
-
-By default, the built-in `extraPrompts` for `gpt-5.4` and `gpt-5.3-codex` enable intermediary-update behavior, and the proxy translates assistant turns into `phase: "commentary"` before tool calls and `phase: "final_answer"` for the final response.
-
-That gives clients a short, user-friendly explanation of what the model is about to do before deeper reasoning or tool execution begins.
-
-### Subagent marker integration
-
-For subagent-based clients, this project can preserve root session context and correctly classify subagent-originated traffic.
-
-The marker flow uses `__SUBAGENT_MARKER__...` inside a `<system-reminder>` block together with root `x-session-id` propagation. When a marker is detected, the proxy can keep the parent session identity, infer `x-initiator: agent`, and tag the interaction as subagent traffic instead of a fresh top-level request.
-
-Plugin integrations are included for both Claude Code and opencode; see [Plugin Integrations](#plugin-integrations) below for setup details.
-
-### Accurate Claude token counting
-
-By default, `/v1/messages/count_tokens` estimates Claude token counts using the GPT `o200k_base` tokenizer with a 1.15x multiplier. This consistently underestimates actual Claude token usage, which can cause tools like Claude Code to compact too late and hit "prompt token count exceeds limit" errors.
-
-When an Anthropic API key is configured, the proxy forwards Claude model token counting requests to [Anthropic's real `/v1/messages/count_tokens` endpoint](https://docs.anthropic.com/en/docs/build-with-claude/token-counting) instead. This returns exact counts and eliminates the estimation mismatch. Non-Claude models and failures fall back to the GPT tokenizer estimation automatically.
-
-**Setup:**
-
-1. Create an Anthropic API account at [console.anthropic.com](https://console.anthropic.com) and add a minimum $5 credit balance (required to activate the API key, but the token counting endpoint itself is free)
-2. Create an API key from Settings > API Keys
-3. Configure the key via **one** of:
-- `config.json`: set `"anthropicApiKey": "sk-ant-..."`
-- Environment variable: `ANTHROPIC_API_KEY=sk-ant-...`
-
-> [!NOTE]
-> Anthropic's `/v1/messages/count_tokens` endpoint is **free** (no per-token cost). It is rate-limited to 100 RPM at Tier 1. The $5 credit purchase is only needed to activate API access — the token counting calls themselves cost nothing.
+- **OpenAI and Anthropic compatibility**: Serve `/v1/responses`, `/v1/chat/completions`, `/v1/models`, `/v1/embeddings`, and `/v1/messages` from one local gateway.
+- **One gateway for Copilot, `codex`, and external providers**: Route GitHub Copilot, the built-in `codex` provider, and configured third-party providers behind the same endpoint.
+- **Agent-friendly Claude handling on Copilot**: Prefer native `/v1/messages` when available, preserve Claude-style tool flows, support Anthropic beta features, and keep subagent/session markers intact.
+- **Claude Code and OpenCode integration**: Works with Claude Code and OpenCode, including direct Anthropic-compatible usage through `@ai-sdk/anthropic`.
+- **Flexible auth and deployment options**: Supports interactive login or direct tokens, individual/business/enterprise plans, GitHub Enterprise, opencode OAuth, and custom data directories.
+- **Local control and visibility**: Includes a usage dashboard, rate limiting, manual approval, and optional token visibility for debugging.
+- **Multi-provider routing**: Expose provider-specific `/:provider/...` routes or use `model: "provider/model"` on the top-level API.
+- **Better token and context management**: Supports exact Claude token counting and configurable GPT context compaction for long-running conversations.
 
 ## Prerequisites
 
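The token-counting fallback described in the hunk above (a GPT tokenizer estimate scaled by 1.15, replaced by the exact Anthropic count when one is available) can be sketched as follows. This is a minimal illustration, not the package's code: the function names are hypothetical, the GPT token count is passed in rather than computed, and rounding up is an assumption.

```typescript
// Hypothetical helper names; the README only documents the 1.15 default multiplier.
const DEFAULT_CLAUDE_TOKEN_MULTIPLIER = 1.15;

// Scale a local GPT-tokenizer count into a Claude estimate.
// Rounding up is an assumption made here to keep the estimate conservative.
function estimateClaudeTokens(
  gptTokenCount: number,
  multiplier: number = DEFAULT_CLAUDE_TOKEN_MULTIPLIER,
): number {
  return Math.ceil(gptTokenCount * multiplier);
}

// Prefer an exact upstream count when one was obtained; otherwise fall back
// to the scaled local estimate (for example when no key is set or the call failed).
function resolveTokenCount(
  exactCount: number | undefined,
  gptTokenCount: number,
  multiplier: number = DEFAULT_CLAUDE_TOKEN_MULTIPLIER,
): number {
  return exactCount ?? estimateClaudeTokens(gptTokenCount, multiplier);
}
```

Raising the multiplier makes a client compact earlier, which is the knob the README suggests when compaction still happens too late.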
@@ -189,63 +107,27 @@ Main dashboard, token usage breakdown in the bundled Electron app:
 
 ## Using with Docker
 
-Build image
+Build the image:
 
 ```sh
 docker build -t copilot-api .
 ```
 
-Run the container
+Run the container with a bind mount so auth data survives restarts:
 
 ```sh
-# Create a directory on your host to persist the GitHub token and related data
 mkdir -p ./copilot-data
-
-# Run the container with a bind mount to persist the token
-# This ensures your authentication survives container restarts
-
 docker run -p 4141:4141 -v $(pwd)/copilot-data:/root/.local/share/copilot-api copilot-api
 ```
 
-
-> The GitHub token and related data will be stored in `copilot-data` on your host. This is mapped to `/root/.local/share/copilot-api` inside the container, ensuring persistence across restarts.
+This stores GitHub auth data in `./copilot-data` on the host, mapped to `/root/.local/share/copilot-api` in the container.
 
-
-
-You can pass the GitHub token directly to the container using environment variables:
+Or pass a GitHub token directly:
 
 ```sh
-# Build with GitHub token
-docker build --build-arg GH_TOKEN=your_github_token_here -t copilot-api .
-
-# Run with GitHub token
 docker run -p 4141:4141 -e GH_TOKEN=your_github_token_here copilot-api
-
-# Run with additional options
-docker run -p 4141:4141 -e GH_TOKEN=your_token copilot-api start --verbose --port 4141
-```
-
-### Docker Compose Example
-
-```yaml
-version: "3.8"
-services:
-  copilot-api:
-    build: .
-    ports:
-      - "4141:4141"
-    environment:
-      - GH_TOKEN=your_github_token_here
-    restart: unless-stopped
 ```
 
-The Docker image includes:
-
-- Multi-stage build for optimized image size
-- Non-root user for enhanced security
-- Health check for container monitoring
-- Pinned base image version for reproducible builds
-
 ## Command Structure
 
 Copilot API now uses a subcommand structure with these main commands:
@@ -372,7 +254,7 @@ The following command line options are available for the `start` command:
 - **auth.adminApiKey:** Single admin key used only for `/admin/*` routes. If missing, the server generates a random key at startup and writes it back to `config.json`. Requests use the same `x-api-key` or `Authorization: Bearer` headers, but regular `auth.apiKeys` never grant access to `/admin/*`.
 - **modelMappings:** Exact `sourceModel -> targetModel` rewrites for top-level `POST /v1/messages` and `POST /v1/messages/count_tokens` requests. Omit it or leave it as `{}` to disable rewrites. Both the source and target must be non-empty strings. Targets can be regular model IDs or `provider/model` aliases such as `dashscope/qwen3.6-plus`, and the rewrite happens before provider alias parsing. The admin endpoints `GET/POST /admin/config/model-mappings` read and update only this field.
 - **extraPrompts:** Map of `model -> prompt` appended to the first system prompt when translating Anthropic-style requests to Copilot. Use this to inject guardrails or guidance per model. Missing default entries are auto-added without overwriting your custom prompts. The built-in prompts for `gpt-5.3-codex` and `gpt-5.4` enable phase-aware commentary, which lets the model emit a short user-facing progress update before tools or deeper reasoning.
-- **providers:** Global upstream provider map. Each provider key (for example `
+- **providers:** Global upstream provider map. Each provider key (for example `dashscope`) becomes a route prefix (`/dashscope/v1/messages`). Supports `type: "anthropic"`, `type: "openai-compatible"`, and `type: "openai-responses"`. Top-level clients can also use `model: "dashscope/model-id"` with `/v1/messages`, `/v1/messages/count_tokens`, and `/v1/responses`; the gateway strips the `dashscope/` prefix before forwarding upstream. `GET /v1/models` does not aggregate provider models; use `GET /dashscope/v1/models` for provider model lists.
 - `enabled` defaults to `true` if omitted.
 - `baseUrl` should be provider API base URL without the final endpoint. For Anthropic providers, omit `/v1/messages`; for OpenAI-compatible providers, omit `/v1/chat/completions`; for OpenAI Responses providers, omit `/v1/responses`.
 - `apiKey` is used as the upstream credential value and is required for regular providers.
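The ordering described in the hunk above, where a `modelMappings` rewrite is applied first and the `provider/model` alias is parsed afterwards, might look like the sketch below. The type and function names are hypothetical, and treating an unknown prefix as part of a plain model ID is an assumption rather than documented behavior.

```typescript
// Hypothetical shapes for illustration; not taken from the package source.
type ModelMappings = Partial<Record<string, string>>;

interface ResolvedModel {
  provider?: string; // undefined means the request stays on the main Copilot flow
  model: string;     // model ID forwarded upstream, with any provider prefix stripped
}

function resolveModel(
  requested: string,
  mappings: ModelMappings,
  providers: ReadonlySet<string>,
): ResolvedModel {
  // Step 1: exact sourceModel -> targetModel rewrite (no pattern matching).
  const mapped = mappings[requested] ?? requested;

  // Step 2: provider alias parsing on the rewritten value.
  const slash = mapped.indexOf("/");
  if (slash > 0) {
    const provider = mapped.slice(0, slash);
    // Assumption: only configured provider keys are treated as aliases.
    if (providers.has(provider)) {
      return { provider, model: mapped.slice(slash + 1) };
    }
  }
  return { model: mapped };
}
```

Because the rewrite runs first, a mapping target such as `dashscope/qwen3.6-plus` is itself routed to the `dashscope` provider with `qwen3.6-plus` as the upstream model.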
@@ -386,14 +268,14 @@ The following command line options are available for the `start` command:
 - `contextCache` (optional): Defaults to `true` for OpenAI-compatible providers. This enables Alibaba Cloud Model Studio/DashScope explicit context cache by injecting `cache_control: { "type": "ephemeral" }` on up to 4 content blocks using the Context Cache format. The cache breakpoint strategy matches opencode's main provider flow: the first 2 system messages plus the last 2 non-system messages. Marked string content is converted to text content part arrays for `system` / `user` / `assistant` / `tool` messages; existing array content is marked on the last part. Set this to `false` when the model already supports implicit caching, or when the upstream does not accept this explicit-cache extension field.
 - `supportPdf` (optional): Controls whether the model supports PDF/document content. Defaults to `false`; unsupported PDFs are converted to a text notice. Set it to `true` to send PDF/document blocks as OpenAI Chat Completions file parts.
 - `toolContentSupportType` (optional): Tool result content capabilities for that model, as an array of `array`, `image`, and `pdf`. Provider routes default to string-only tool content when omitted. If `supportPdf` is `true` but this list does not include `pdf`, file parts in tool results are moved to user role messages. This provider default does not change the Copilot main flow, which continues to support array + image and not PDF.
-- **smallModel:** Fallback model used for tool-less warmup messages (e.g., Claude Code probe requests)
-- **responsesApiContextManagementModels:** List of GPT model IDs that should receive Responses API `context_management` compaction instructions. This defaults to `[]`, so you need to opt in explicitly. A good starting point is `["gpt-5-mini", "gpt-5.3-codex", "gpt-5.4-mini", "gpt-5.4"]`. When enabled, the request includes `context_management` in the body and keeps only the latest compaction carrier on follow-up turns. The actual compaction is handled server-side and appears to begin when usage approaches roughly 90% of the model's `maxPromptTokens`, which makes it especially useful for long-running tasks
+- **smallModel:** Fallback model used for tool-less warmup messages (e.g., Claude Code probe requests); defaults to gpt-5-mini.
+- **responsesApiContextManagementModels:** List of GPT model IDs that should receive Responses API `context_management` compaction instructions. This defaults to `[]`, so you need to opt in explicitly. A good starting point is `["gpt-5-mini", "gpt-5.3-codex", "gpt-5.4-mini", "gpt-5.4"]`. When enabled, the request includes `context_management` in the body and keeps only the latest compaction carrier on follow-up turns. The actual compaction is handled server-side and appears to begin when usage approaches roughly 90% of the model's `maxPromptTokens`, which makes it especially useful for long-running tasks. In practice, the effective `compact_threshold` also appears to be fixed on the server side, so changing it in this project does not currently alter compaction behavior. At the moment, this optimization is intended for GPT-family models only.
 - **modelReasoningEfforts:** Per-model `reasoning.effort` sent to the Copilot Responses API. Allowed values are `none`, `minimal`, `low`, `medium`, `high`, and `xhigh`. If a model isn’t listed, `high` is used by default.
 - **useMessagesApi:** When `true`, Claude-family models that support Copilot's native `/v1/messages` endpoint will use the Messages API; otherwise they fall back to `/chat/completions`. Set to `false` to disable Messages API routing and always use `/chat/completions`. Defaults to `true`.
 - **useResponsesApiWebSocket:** When `true`, Responses API requests use Copilot's websocket transport for models that advertise `ws:/responses`; models that only advertise `/responses` continue to use HTTP. Set to `false` to disable websocket routing and use HTTP `/responses` whenever the selected model supports it. Defaults to `true`.
 - **useResponsesApiWebSearch:** When `true`, the server keeps Responses API tools with `type: "web_search"` and forwards them upstream. Set to `false` to strip those tools from `/responses` payloads. Defaults to `true`.
 - **claudeTokenMultiplier:** Multiplier applied to the fallback GPT-tokenizer estimate for Claude `/v1/messages/count_tokens` requests. Defaults to `1.15`. Increase it if your client is still compacting too late. This setting is only used when the proxy is estimating Claude tokens locally; if `anthropicApiKey` is configured and Anthropic token counting succeeds, the exact Anthropic count is returned instead.
-- **anthropicApiKey:** Anthropic API key used
+- **anthropicApiKey:** Anthropic API key used to forward Claude `/v1/messages/count_tokens` requests to Anthropic's real token counting endpoint, which returns exact counts instead of GPT tokenizer estimates. Can also be set via the `ANTHROPIC_API_KEY` environment variable. If not set, or if the upstream call fails, token counting falls back to local GPT tokenizer estimation controlled by `claudeTokenMultiplier`.
 
 Edit this file to customize prompts or swap in your own fast model. Restart the server (or rerun the command) after changes so the cached config is refreshed.
 
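The `contextCache` breakpoint strategy in the hunk above (first 2 system messages plus last 2 non-system messages, string content converted to a text part array, array content marked on its last part) can be sketched roughly as follows. The message and part types are simplified, hypothetical stand-ins for the OpenAI-compatible chat shapes, not the package's own types.

```typescript
// Simplified, hypothetical message shapes for illustration only.
type Role = "system" | "user" | "assistant" | "tool";

interface ContentPart {
  type: string;
  text?: string;
  cache_control?: { type: "ephemeral" };
}

interface ChatMessage {
  role: Role;
  content: string | ContentPart[];
}

// Pick up to 4 message indices: the first 2 system messages and the last 2 others.
function selectCacheBreakpoints(messages: readonly ChatMessage[]): number[] {
  const system: number[] = [];
  const other: number[] = [];
  for (let i = 0; i < messages.length; i++) {
    if (messages[i].role === "system") system.push(i);
    else other.push(i);
  }
  return system.slice(0, 2).concat(other.slice(-2)).sort((a, b) => a - b);
}

// Return a copy of the conversation with cache_control set on the chosen messages.
function markForCache(messages: readonly ChatMessage[]): ChatMessage[] {
  const chosen = selectCacheBreakpoints(messages);
  return messages.map((message, index) => {
    if (chosen.indexOf(index) === -1) return message;
    // String content becomes a one-element text part array; arrays are copied.
    const parts: ContentPart[] =
      typeof message.content === "string"
        ? [{ type: "text", text: message.content }]
        : message.content.map((part) => ({ ...part }));
    const last = parts[parts.length - 1];
    if (last !== undefined) last.cache_control = { type: "ephemeral" };
    return { ...message, content: parts };
  });
}
```

Unmarked messages are returned untouched, so an upstream that ignores the extension field sees the same conversation apart from the converted parts.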
@@ -468,78 +350,33 @@ These endpoints are reserved for local administrative actions and only accept `a
 
 ## Example Usage
 
-
+Common `npx` commands:
 
 ```sh
-#
+# Start the gateway
 npx @jeffreycao/copilot-api@latest start
 
-#
+# Start on a custom port with verbose logging
 npx @jeffreycao/copilot-api@latest start --port 8080 --verbose
 
-#
-npx @jeffreycao/copilot-api@latest start --account-type business
-
-# Use with an enterprise plan GitHub account
-npx @jeffreycao/copilot-api@latest start --account-type enterprise
-
-# Enable manual approval for each request
-npx @jeffreycao/copilot-api@latest start --manual
-
-# Set rate limit to 30 seconds between requests
-npx @jeffreycao/copilot-api@latest start --rate-limit 30
-
-# Wait instead of error when rate limit is hit
-npx @jeffreycao/copilot-api@latest start --rate-limit 30 --wait
-
-# Provide GitHub token directly
-npx @jeffreycao/copilot-api@latest start --github-token ghp_YOUR_TOKEN_HERE
-
-# Run only the auth flow and choose a builtin provider interactively
+# Run the auth flow
 npx @jeffreycao/copilot-api@latest auth login
 
-#
-npx @jeffreycao/copilot-api@latest auth login --provider codex
-
-# Authenticate Copilot explicitly with verbose logging
-npx @jeffreycao/copilot-api@latest auth login --provider copilot --verbose
-
-# Show your Copilot usage/quota in the terminal (no server needed)
+# Check Copilot usage without starting the server
 npx @jeffreycao/copilot-api@latest check-usage
 
-#
-npx @jeffreycao/copilot-api@latest debug
-
-# Display debug information in JSON format
+# Print debug information as JSON
 npx @jeffreycao/copilot-api@latest debug --json
 
-# Initialize proxy from environment variables (HTTP_PROXY, HTTPS_PROXY, etc.)
-npx @jeffreycao/copilot-api@latest start --proxy-env
-
-# Use opencode GitHub Copilot authentication
-COPILOT_API_OAUTH_APP=opencode npx @jeffreycao/copilot-api@latest start
-
-# Set custom API home directory via command line
-npx @jeffreycao/copilot-api@latest --api-home=/path/to/custom/dir start
-
-# Use GitHub Enterprise via command line
-npx @jeffreycao/copilot-api@latest --enterprise-url=company.ghe.com start
-
-# Use opencode OAuth via command line
-npx @jeffreycao/copilot-api@latest --oauth-app=opencode start
-
-# Combine multiple global options
-npx @jeffreycao/copilot-api@latest --api-home=/custom/path --oauth-app=opencode --enterprise-url=company.ghe.com start
-
 # Run the published CLI with Bun instead of Node.js
 bunx --bun @jeffreycao/copilot-api@latest start
 ```
 
 ## Using with Claude Code
 
-This
+This AI gateway can be used to power [Claude Code](https://docs.anthropic.com/en/claude-code), an experimental conversational AI assistant for developers from Anthropic.
 
-There are two ways to configure Claude Code to use this
+There are two ways to configure Claude Code to use this AI gateway:
 
 ### Interactive Setup with `--claude-code` flag
 
@@ -549,7 +386,7 @@ To get started, run the `start` command with the `--claude-code` flag:
 npx @jeffreycao/copilot-api@latest start --claude-code
 ```
 
-You will be prompted to select a primary model and a "small, fast" model for background tasks. After selecting the models, a command will be copied to your clipboard. This command sets the necessary environment variables for Claude Code to use the
+You will be prompted to select a primary model and a "small, fast" model for background tasks. After selecting the models, a command will be copied to your clipboard. This command sets the necessary environment variables for Claude Code to use the gateway.
 
 Paste and run this command in a new terminal to launch Claude Code.
 
@@ -596,9 +433,9 @@ You can also read more about IDE integration here: [Add Claude Code to your IDE]
 
 ## GPT Tool Search
 
-For GPT Responses models such as `gpt-5.4+`, this
+For GPT Responses models such as `gpt-5.4+`, this AI gateway can expose Responses `tool_search` through a small MCP bridge. The same bridge can be used by Claude Code and opencode, as long as the client loads MCP servers and sends Anthropic Messages traffic through this gateway.
 
-Do not set Claude Code's native `ENABLE_TOOL_SEARCH` for GPT models. That flag enables Claude Code's own client-side tool search mode, and it may stop forwarding deferred tool definitions. This
+Do not set Claude Code's native `ENABLE_TOOL_SEARCH` for GPT models. That flag enables Claude Code's own client-side tool search mode, and it may stop forwarding deferred tool definitions. This gateway needs the full tool definitions so it can keep the small always-loaded tool set eager and translate every other tool into Responses deferred namespaces.
 
 If you install `tool-search@copilot-api-marketplace`, Claude Code receives this MCP bridge automatically and you can skip the manual Claude Code MCP setup below.
 
@@ -631,23 +468,23 @@ Add the tool search bridge to the MCP config used by opencode:
 
 For local development, use `bun` as the command and `["run", "./src/main.ts", "mcp"]` as the args.
 
-Internally, the
+Internally, the gateway now configures OpenAI Responses `tool_search` in client-executed mode. Deferred tools are still exposed as searchable namespaces, but the model is explicitly asked to return the exact deferred tool names it wants to load next.
 
 The bridge uses direct tool selection, not query search. Its tool input is `names`, a comma-separated list of exact deferred tool names, for example `TaskList,TaskGet,mcp__fetch__fetch`.
 
 ## Using with OpenCode
 
-OpenCode already has a direct GitHub Copilot provider. Use this section when you want OpenCode to point at this
+OpenCode already has a direct GitHub Copilot provider. Use this section when you want OpenCode to point at this AI gateway through `@ai-sdk/anthropic` and reuse the agent behaviors described earlier in this README.
 
 ### Minimal setup
 
-Start the
+Start the AI gateway with the OpenCode OAuth app:
 
 ```sh
 npx @jeffreycao/copilot-api@latest --oauth-app=opencode start
 ```
 
-Then point OpenCode at the
+Then point OpenCode at the gateway with `@ai-sdk/anthropic`.
 
 Example `~/.config/opencode/opencode.json`:
 
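The `names` input of the tool search bridge in the hunk above is described as a comma-separated list of exact deferred tool names. A minimal sketch of how such an input could be parsed and matched is shown below; the function names are hypothetical, and trimming whitespace, dropping empty entries, and de-duplicating are assumptions rather than documented behavior.

```typescript
// Split a comma-separated `names` value into exact tool names.
// Trimming, skipping empty entries, and de-duplication are assumptions.
function parseToolNames(names: string): string[] {
  const result: string[] = [];
  for (const raw of names.split(",")) {
    const name = raw.trim();
    if (name !== "" && result.indexOf(name) === -1) result.push(name);
  }
  return result;
}

// Partition requested names into those present in a deferred tool list
// and those that are unknown (exact match only, no query search).
function selectDeferredTools(
  names: string,
  deferredToolNames: readonly string[],
): { found: string[]; unknown: string[] } {
  const found: string[] = [];
  const unknown: string[] = [];
  for (const name of parseToolNames(names)) {
    if (deferredToolNames.indexOf(name) !== -1) found.push(name);
    else unknown.push(name);
  }
  return { found, unknown };
}
```

Matching is deliberately exact here, mirroring the "direct tool selection, not query search" wording.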
@@ -720,10 +557,10 @@ Example `~/.config/opencode/opencode.json`:
 
 Why these fields matter:
 
-- `npm: "@ai-sdk/anthropic"` is the important part. OpenCode will speak Anthropic Messages semantics to this
+- `npm: "@ai-sdk/anthropic"` is the important part. OpenCode will speak Anthropic Messages semantics to this AI gateway instead of flattening everything into OpenAI Chat Completions.
 - `options.baseURL` should be `http://localhost:4141/v1`; the Anthropic SDK will append `/messages`, `/models`, and `/messages/count_tokens` automatically.
 - `model`, `small_model`, and `agent.*.model` let you keep `gpt-5.4` for build/plan work while routing exploration and background work to `gpt-5-mini`.
-- If you enable `auth.apiKeys` in this
+- If you enable `auth.apiKeys` in this AI gateway, replace `dummy` with a real key. Otherwise any placeholder value is fine.
 
 ## Plugin Integrations
 
@@ -733,7 +570,7 @@ Plugin integrations are available for Claude Code and opencode.
 
 The Claude Code integration is packaged as two plugins:
 
-- `agent-inject` injects `__SUBAGENT_MARKER__...` on `SubagentStart`, so
+- `agent-inject` injects `__SUBAGENT_MARKER__...` on `SubagentStart`, so the gateway can infer `x-initiator: agent`.
 - `tool-search` registers the `tool_search` MCP bridge used for GPT Responses deferred tool loading.
 
 - Marketplace catalog in this repository: `.claude-plugin/marketplace.json`
@@ -752,7 +589,7 @@ Install the plugins from the marketplace:
 /plugin install tool-search@copilot-api-marketplace
 ```
 
-After installation, `agent-inject` injects `__SUBAGENT_MARKER__...` on `SubagentStart`, and
+After installation, `agent-inject` injects `__SUBAGENT_MARKER__...` on `SubagentStart`, and the gateway uses it to infer `x-initiator: agent`.
 
 The `agent-inject` plugin also registers a `UserPromptSubmit` hook that returns `{"continue": true}`, and it can inject `SessionStart` reminder rules through environment variables:
 
@@ -781,7 +618,7 @@ Or manually create the file at `~/.config/opencode/plugins/subagent-marker.js` w
 - Tracks sub-sessions created by subagents
 - Automatically prepends a marker system reminder (`__SUBAGENT_MARKER__...`) to subagent chat messages
 - Sets `x-session-id` header for session tracking
-- Enables
+- Enables the gateway to infer `x-initiator: agent` for subagent-originated requests
 
 The plugin hooks into `session.created`, `session.deleted`, `chat.message`, and `chat.headers` events to provide seamless subagent marker functionality.
 
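The marker flow referenced in the hunks above (a `__SUBAGENT_MARKER__...` string inside a `<system-reminder>` block leading to `x-initiator: agent`) could be approximated as in the sketch below. The marker payload is elided in the README, so only the prefix is checked. The message shape, the function names, the `"user"` value, and the rule that a trailing user message otherwise means a user-initiated request are assumptions for illustration.

```typescript
// Hypothetical, simplified message shape; real requests carry richer content blocks.
type Initiator = "user" | "agent";

interface SimpleMessage {
  role: "system" | "user" | "assistant" | "tool";
  text: string;
}

// Only the prefix is known from the README; the rest of the marker is elided there.
const SUBAGENT_MARKER_PREFIX = "__SUBAGENT_MARKER__";

// True when any <system-reminder> block in the text contains the marker prefix.
function hasSubagentMarker(text: string): boolean {
  const reminderBlock = /<system-reminder>([\s\S]*?)<\/system-reminder>/g;
  let match: RegExpExecArray | null = reminderBlock.exec(text);
  while (match !== null) {
    if ((match[1] ?? "").indexOf(SUBAGENT_MARKER_PREFIX) !== -1) return true;
    match = reminderBlock.exec(text);
  }
  return false;
}

// Marker present -> agent. Otherwise decide from the latest message only,
// so older assistant history does not influence the result.
function inferInitiator(messages: readonly SimpleMessage[]): Initiator {
  if (messages.some((message) => hasSubagentMarker(message.text))) return "agent";
  const latest = messages[messages.length - 1];
  return latest !== undefined && latest.role === "user" ? "user" : "agent";
}
```

A marker that appears outside a `<system-reminder>` block is ignored in this sketch, which keeps ordinary user text from being classified as subagent traffic.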