ghc-proxy 0.2.0 → 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +95 -4
- package/dist/main.mjs +4056 -877
- package/dist/main.mjs.map +1 -1
- package/package.json +6 -5
package/README.md
CHANGED
@@ -4,7 +4,7 @@
 [](https://github.com/wxxb789/ghc-proxy/actions/workflows/ci.yml)
 [](https://github.com/wxxb789/ghc-proxy/blob/master/LICENSE)
 
-A proxy that turns your GitHub Copilot subscription into an OpenAI and Anthropic compatible API. Use it to power [Claude Code](https://docs.anthropic.com/en/docs/claude-code/overview), [Cursor](https://www.cursor.com/), or any tool that speaks the OpenAI Chat Completions or Anthropic Messages protocol.
+A proxy that turns your GitHub Copilot subscription into an OpenAI and Anthropic compatible API. Use it to power [Claude Code](https://docs.anthropic.com/en/docs/claude-code/overview), [Cursor](https://www.cursor.com/), or any tool that speaks the OpenAI Chat Completions, OpenAI Responses, or Anthropic Messages protocol.
 
 > [!WARNING]
 > Reverse-engineered, unofficial, may break at any time. Excessive use can trigger GitHub abuse detection. **Use at your own risk.**
@@ -110,7 +110,26 @@ ghc-proxy sits between your tools and the GitHub Copilot API:
 
 The proxy authenticates with GitHub using the [device code OAuth flow](https://docs.github.com/en/apps/oauth-apps/building-oauth-apps/authorizing-oauth-apps#device-flow) (the same flow VS Code uses), then exchanges the GitHub token for a short-lived Copilot token that auto-refreshes.
 
-
+When the Copilot token response includes `endpoints.api`, `ghc-proxy` now prefers that runtime API base automatically instead of relying only on the configured account type. This keeps enterprise/business routing aligned with the endpoint GitHub actually returned for the current token.
+
+Incoming requests hit a [Hono](https://hono.dev/) server. `chat/completions` requests are validated, normalized into the shared planning pipeline, and then forwarded to Copilot. `responses` requests use a native Responses path with explicit compatibility policies. `messages` requests are routed per-model and can use native Anthropic passthrough, the Responses translation path, or the existing chat-completions fallback. The translator tracks exact vs lossy vs unsupported behavior explicitly; see the [Messages Routing and Translation Guide](./docs/messages-routing-and-translation.md) and the [Anthropic Translation Matrix](./docs/anthropic-translation-matrix.md) for the current support surface.
+
+### Request Routing
+
+`ghc-proxy` does not force every request through one protocol. The current routing rules are:
+
+- `POST /v1/chat/completions`: OpenAI Chat Completions -> shared planning pipeline -> Copilot `/chat/completions`
+- `POST /v1/responses`: OpenAI Responses create -> native Responses handler -> Copilot `/responses`
+- `POST /v1/responses/input_tokens`: Responses input-token counting passthrough when the upstream supports it
+- `GET /v1/responses/:responseId`: Responses retrieve passthrough when the upstream supports it
+- `GET /v1/responses/:responseId/input_items`: Responses input-items passthrough when the upstream supports it
+- `DELETE /v1/responses/:responseId`: Responses delete passthrough when the upstream supports it
+- `POST /v1/messages`: Anthropic Messages -> choose the best available upstream path for the selected model:
+  - native Copilot `/v1/messages` when supported
+  - Anthropic -> Responses -> Anthropic translation when the model only supports `/responses`
+  - Anthropic -> Chat Completions -> Anthropic fallback otherwise
+
+This keeps the existing chat pipeline stable while allowing newer Copilot models to use the endpoint they actually expose.
 
 ### Endpoints
 
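The per-model `/v1/messages` decision order described in the hunk above can be sketched as follows. This is a simplified illustration only; the capability-record shape and all names (`ModelCapabilities`, `pickMessagesPath`, field names) are hypothetical and are not ghc-proxy's actual internals.

```typescript
// Hypothetical capability record for a Copilot model; field names are
// illustrative, not ghc-proxy's real data model.
interface ModelCapabilities {
  supportsNativeMessages: boolean; // Copilot exposes /v1/messages for this model
  supportsResponses: boolean;      // Copilot exposes /responses for this model
}

type MessagesPath =
  | "native-messages"
  | "responses-translation"
  | "chat-completions-fallback";

// Mirrors the documented decision order: native Messages first, then the
// Anthropic -> Responses -> Anthropic translation, then the
// chat-completions fallback.
function pickMessagesPath(caps: ModelCapabilities): MessagesPath {
  if (caps.supportsNativeMessages) return "native-messages";
  if (caps.supportsResponses) return "responses-translation";
  return "chat-completions-fallback";
}

// A model that only advertises /responses takes the translation path.
console.log(pickMessagesPath({ supportsNativeMessages: false, supportsResponses: true }));
// -> "responses-translation"
```

The point of the sketch is only the ordering: the fallback is unconditional, so every model gets some `/v1/messages` behavior even when neither newer endpoint is available.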
@@ -119,6 +138,11 @@ Incoming requests hit a [Hono](https://hono.dev/) server. OpenAI-format requests
 | Method | Path | Description |
 |--------|------|-------------|
 | `POST` | `/v1/chat/completions` | Chat completions (streaming and non-streaming) |
+| `POST` | `/v1/responses` | Create a Responses API response |
+| `POST` | `/v1/responses/input_tokens` | Count Responses input tokens when supported by Copilot upstream |
+| `GET` | `/v1/responses/:responseId` | Retrieve one response when supported by Copilot upstream |
+| `GET` | `/v1/responses/:responseId/input_items` | Retrieve response input items when supported by Copilot upstream |
+| `DELETE` | `/v1/responses/:responseId` | Delete one response when supported by Copilot upstream |
 | `GET` | `/v1/models` | List available models |
 | `POST` | `/v1/embeddings` | Generate embeddings |
 
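As a usage sketch for the new `POST /v1/responses` row above, the snippet below builds a Responses-style create request against the proxy, using the `json_schema` `text.format` option that this release models explicitly. The base URL, port, and model id are assumptions for illustration only, not defaults documented in this diff; the actual `fetch` call is left commented out because it needs a running ghc-proxy.

```typescript
// Assumed proxy address; adjust to wherever your ghc-proxy instance listens.
const baseUrl = "http://localhost:8080";

// Minimal OpenAI Responses-style create payload using an explicitly
// modeled text.format of json_schema (schema contents are illustrative).
const body = {
  model: "gpt-5-mini", // assumed model id for illustration
  input: "Summarize the release notes in one sentence.",
  text: {
    format: {
      type: "json_schema",
      name: "summary",
      schema: {
        type: "object",
        properties: { summary: { type: "string" } },
        required: ["summary"],
      },
    },
  },
};

// Assemble the request targeting the proxy's Responses route.
const request = {
  url: `${baseUrl}/v1/responses`,
  method: "POST",
  headers: { "content-type": "application/json" },
  payload: JSON.stringify(body),
};

// To actually send it (requires a running ghc-proxy):
// const res = await fetch(request.url, {
//   method: request.method,
//   headers: request.headers,
//   body: request.payload,
// });

console.log(request.url); // -> "http://localhost:8080/v1/responses"
```

Note that the `/v1/` prefix shown here is optional per the note later in this diff; `POST /responses` reaches the same handler.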
@@ -126,7 +150,7 @@ Incoming requests hit a [Hono](https://hono.dev/) server. OpenAI-format requests
 
 | Method | Path | Description |
 |--------|------|-------------|
-| `POST` | `/v1/messages` | Messages API with
+| `POST` | `/v1/messages` | Messages API with per-model routing across native Messages, Responses translation, or chat-completions fallback |
 | `POST` | `/v1/messages/count_tokens` | Token counting |
 
 **Utility:**
@@ -136,7 +160,7 @@ Incoming requests hit a [Hono](https://hono.dev/) server. OpenAI-format requests
 | `GET` | `/usage` | Copilot quota / usage monitoring |
 | `GET` | `/token` | Inspect the current Copilot token |
 
-> **Note:** The `/v1/` prefix is optional. `/chat/completions`, `/models`, and `/embeddings` also work.
+> **Note:** The `/v1/` prefix is optional. `/chat/completions`, `/responses`, `/models`, and `/embeddings` also work.
 
 ## CLI Reference
 
@@ -231,6 +255,61 @@ Or in the proxy's **config file** (`~/.local/share/ghc-proxy/config.json`):
 
 **Priority order:** environment variable > config.json > built-in default.
 
+### Small-Model Routing
+
+`/v1/messages` can optionally reroute specific low-value requests to a cheaper model:
+
+- `smallModel`: the model to reroute to
+- `compactUseSmallModel`: reroute recognized compact/summarization requests
+- `warmupUseSmallModel`: reroute explicitly marked warmup/probe requests
+
+Both switches default to `false`. Routing is conservative:
+
+- the target `smallModel` must exist in Copilot's model list
+- it must preserve the original model's declared endpoint support
+- tool, thinking, and vision requests are not rerouted to a model that lacks the required capabilities
+
+Warmup routing is intentionally narrow. Requests must look like explicit warmup/probe traffic; ordinary tool-free chat requests are not rerouted just because they include `anthropic-beta`.
+
+### Responses Compatibility
+
+`/v1/responses` is designed to stay close to the OpenAI wire format while making Copilot limitations explicit:
+
+- requests are validated before any mutation
+- common official request fields such as `conversation`, `previous_response_id`, `max_tool_calls`, `truncation`, `user`, `prompt`, and `text` are now modeled explicitly instead of relying on loose passthrough alone
+- official `text.format` options are modeled explicitly, including `text`, `json_object`, and `json_schema`
+- `custom` `apply_patch` can be rewritten as a function tool when `useFunctionApplyPatch` is enabled
+- per-model Responses context compaction can be enabled with `responsesApiContextManagementModels`
+- reasoning defaults for Anthropic -> Responses translation can be tuned with `modelReasoningEfforts`
+- known unsupported builtin tools, such as `web_search`, fail explicitly with `400` instead of being silently removed
+- external image URLs on the Responses path fail explicitly with `400`; use `file_id` or data URL image input instead
+- official `input_file` and `item_reference` input items are modeled explicitly and validated before forwarding
+
+Live upstream verification matters here. On March 11, 2026, a full local scan across every Copilot model that advertised `/responses` support still showed two stable vision gaps:
+
+- external image URLs were rejected uniformly enough that the proxy now rejects them locally with a clearer capability error
+- the current 1x1 PNG data URL probe was rejected upstream as invalid image data even though the fixture itself decodes as a valid PNG locally
+
+The proxy does not currently disable Responses vision wholesale because the same models still advertise vision capability in Copilot model metadata. Treat Responses vision as upstream-contract-sensitive and verify it with `matrix:live` before relying on it.
+
+Additional real-upstream note: on March 11, 2026, `POST /responses` succeeded against the current enterprise Copilot endpoint, but `POST /responses/input_tokens`, `GET /responses/{id}`, `GET /responses/{id}/input_items`, and `DELETE /responses/{id}` all returned upstream `404`. The proxy exposes those routes because they are part of the official Responses surface, but current Copilot upstream support is not there yet. The same live matrix also showed `previous_response_id` returning upstream `400 previous_response_id is not supported` on the tested model.
+
+Example `config.json`:
+
+```json
+{
+  "smallModel": "gpt-4.1-mini",
+  "compactUseSmallModel": true,
+  "warmupUseSmallModel": false,
+  "useFunctionApplyPatch": true,
+  "responsesApiContextManagementModels": ["gpt-5", "gpt-5-mini"],
+  "modelReasoningEfforts": {
+    "gpt-5": "high",
+    "gpt-5-mini": "medium"
+  }
+}
+```
+
 ## Docker
 
 Build and run:
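The conservative small-model routing guard documented in the hunk above can be sketched like this. All type and function names here are hypothetical illustrations of the three documented conditions, not ghc-proxy's real code.

```typescript
// Illustrative shapes only; ghc-proxy's actual model metadata differs.
interface ModelInfo {
  id: string;
  endpoints: string[]; // e.g. ["/chat/completions", "/responses"]
  capabilities: { tools: boolean; thinking: boolean; vision: boolean };
}

interface RequestNeeds {
  tools: boolean;
  thinking: boolean;
  vision: boolean;
}

// Mirrors the documented guard: the small model must exist in the model
// list, preserve the original model's declared endpoint support, and
// cover every capability the request actually uses. Otherwise the
// request keeps its original model.
function canReroute(
  models: Map<string, ModelInfo>,
  originalId: string,
  smallId: string,
  needs: RequestNeeds,
): boolean {
  const original = models.get(originalId);
  const small = models.get(smallId);
  if (!original || !small) return false; // target must exist in Copilot's model list
  const keepsEndpoints = original.endpoints.every((e) => small.endpoints.includes(e));
  const keepsCapabilities =
    (!needs.tools || small.capabilities.tools) &&
    (!needs.thinking || small.capabilities.thinking) &&
    (!needs.vision || small.capabilities.vision);
  return keepsEndpoints && keepsCapabilities;
}

// Example: a thinking request is NOT rerouted to a small model that
// lacks thinking support, even though both models exist.
const models = new Map<string, ModelInfo>([
  ["big", { id: "big", endpoints: ["/chat/completions", "/responses"], capabilities: { tools: true, thinking: true, vision: true } }],
  ["small", { id: "small", endpoints: ["/chat/completions", "/responses"], capabilities: { tools: true, thinking: false, vision: false } }],
]);
console.log(canReroute(models, "big", "small", { tools: false, thinking: true, vision: false }));
// -> false
```

The guard is deliberately all-or-nothing: any missing endpoint or capability keeps the original model, which matches the "routing is conservative" framing above.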
@@ -280,4 +359,16 @@ bun run build # Build with tsdown
 bun run lint # ESLint
 bun run typecheck # tsc --noEmit
 bun test # Run tests
+bun run matrix:live # Real Copilot upstream compatibility matrix
+bun run matrix:live --vision-only --all-responses-models --json
+bun run matrix:live --stateful-only --json --model=gpt-5.2-codex
 ```
+
+> **Note:** `bun run matrix:live` uses your configured GitHub/Copilot credentials and spends real upstream requests. Use it when you want end-to-end verification against the current Copilot service, not for every local edit.
+>
+> Useful flags:
+> - `--json`: emit machine-readable JSON only
+> - `--vision-only`: run just the Responses image probes
+> - `--stateful-only`: run follow-up/resource probes such as `previous_response_id`, `input_tokens`, and `input_items`
+> - `--all-responses-models`: scan every model that advertises `/responses`
+> - `--model=<id>`: pin the Responses scan to one specific model