ghc-proxy 0.5.8 → 0.6.1
- package/README.md +38 -12
- package/dist/main.mjs +4039 -3010
- package/dist/main.mjs.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -189,7 +189,8 @@ All fields are optional. The full schema:
 | `modelFallback.claudeHaiku` | `string` | `claude-haiku-4.5` | Fallback for `claude-haiku-*` models |
 | `smallModel` | `string` | -- | Target model for compact request routing (see [Small-Model Routing](#small-model-routing)) |
 | `compactUseSmallModel` | `boolean` | `false` | Route compact/summarization requests to `smallModel` |
-| `contextUpgrade` | `boolean` | `true` |
+| `contextUpgrade` | `boolean` | `true` | Enable configured extended-context upgrade rules (see [Context-1M Auto-Upgrade](#context-1m-auto-upgrade)) |
+| `contextUpgradeRules` | `{ from, to }[]` | `[]` | Glob-pattern context upgrade rules used for proactive, reactive, and beta-header upgrades |
 | `contextUpgradeTokenThreshold` | `number` | `160000` | Token threshold for proactive context upgrade |
 | `useFunctionApplyPatch` | `boolean` | `true` | Rewrite `apply_patch` custom tool as function tool on Responses path |
 | `responsesApiAutoCompactInput` | `boolean` | `false` | Automatically trim Responses `input` to the latest `compaction` item |
@@ -213,6 +214,9 @@ Example:
   "smallModel": "gpt-4.1-mini",
   "compactUseSmallModel": true,
   "contextUpgrade": true,
+  "contextUpgradeRules": [
+    { "from": "claude-opus-4.6", "to": "claude-opus-4.6-1m" }
+  ],
   "contextUpgradeTokenThreshold": 160000,
   "useFunctionApplyPatch": true,
   "responsesApiAutoCompactInput": false,
@@ -287,25 +291,45 @@ Rewrites run **before** any other model policy — context upgrades, small-model
 
 ### Context-1M Auto-Upgrade
 
-The proxy can automatically upgrade models to
-
-**Proactive upgrade:** Before sending the request, the proxy estimates the input token count. If it exceeds the configured threshold (default: 160,000 tokens), the model is upgraded to its 1M variant before the request is sent.
-
-**Reactive upgrade:** If the upstream returns a context-length error (e.g. "context length exceeded"), the proxy retries the request with the upgraded model automatically.
+The proxy can automatically upgrade models to extended-context variants when the request is large. Upgrade targets are config-driven so users only route to models their Copilot account can access.
 
-**
+**Proactive upgrade:** Before sending the request, the proxy estimates the input token count. If it exceeds the configured threshold (default: 160,000 tokens), the first matching `contextUpgradeRules` entry is applied before the request is sent.
 
-
+**Reactive upgrade:** If the upstream returns a context-length error (e.g. "context length exceeded"), the proxy retries the request with the configured upgraded model automatically.
 
-
-|-------------|----------------|
-| `claude-opus-4.6` | `claude-opus-4.6-1m` |
+**Beta header support:** When a client sends an `anthropic-beta: context-*` header (e.g. `context-1m-2025-04-14`), the proxy strips the header (Copilot does not understand it) and applies the configured context upgrade rule instead.
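
The beta-header handling described in the added README text can be pictured roughly as below. This is a minimal TypeScript sketch, not ghc-proxy's actual code: the name `stripContextBeta`, its return shape, and the decision to keep unrelated beta values in the header are all assumptions. The README only states that the `context-*` header is stripped and that the configured upgrade rule is applied instead.

```typescript
// Hypothetical helper; the name and return shape are illustrative assumptions.
interface BetaHeaderResult {
  headers: Record<string, string>; // headers to forward upstream
  contextUpgradeRequested: boolean; // true if a context-* beta value was seen
}

function stripContextBeta(incoming: Record<string, string>): BetaHeaderResult {
  const headers: Record<string, string> = {};
  let contextUpgradeRequested = false;

  for (const [name, value] of Object.entries(incoming)) {
    // HTTP header names are case-insensitive.
    if (name.toLowerCase() !== "anthropic-beta") {
      headers[name] = value;
      continue;
    }
    // The header can carry several comma-separated beta values.
    const kept: string[] = [];
    const betas = value.split(",").map((v) => v.trim()).filter((v) => v !== "");
    for (const beta of betas) {
      if (beta.startsWith("context-")) {
        contextUpgradeRequested = true; // e.g. "context-1m-2025-04-14"
      } else {
        kept.push(beta); // assumption: unrelated beta values are preserved
      }
    }
    // Drop the header entirely when nothing else remains.
    if (kept.length > 0) headers[name] = kept.join(",");
  }
  return { headers, contextUpgradeRequested };
}
```

When `contextUpgradeRequested` is true, a caller would go on to apply the first matching `contextUpgradeRules` entry to the requested model.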
 
 Configuration:
 
-- `contextUpgrade` (boolean, default `true`) — enable or disable auto-upgrade
+- `contextUpgrade` (boolean, default `true`) — enable or disable configured auto-upgrade rules
+- `contextUpgradeRules` (`{ from, to }[]`, default `[]`) — glob-pattern model upgrade rules; first match wins
 - `contextUpgradeTokenThreshold` (number, default `160000`) — token count threshold for proactive upgrade
 
+Example for the public Opus 4.6 1M model:
+
+```json
+{
+  "contextUpgradeRules": [
+    { "from": "claude-opus-4.6", "to": "claude-opus-4.6-1m" }
+  ]
+}
+```
+
+Example for an enterprise account with access to the Opus 4.7 internal 1M model:
+
+```json
+{
+  "modelRewrites": [
+    { "from": "claude-opus-*", "to": "claude-opus-4.7" }
+  ],
+  "contextUpgrade": true,
+  "contextUpgradeRules": [
+    { "from": "claude-opus-4.7", "to": "claude-opus-4.7-1m-internal" }
+  ],
+  "contextUpgradeTokenThreshold": 160000
+}
+```
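
The rule resolution this section describes (glob patterns, first match wins, proactive upgrade gated by the token threshold) might look roughly like the sketch below. It is an illustration under stated assumptions, not the code shipped in `dist/main.mjs`: the type and function names are hypothetical, and the glob dialect is assumed to support only the `*` wildcard, since the README does not say which matcher the proxy uses.

```typescript
// Hypothetical types mirroring the README config fields.
interface ContextUpgradeRule {
  from: string; // glob pattern, e.g. "claude-opus-*"
  to: string; // target model id
}

interface ContextUpgradeConfig {
  contextUpgrade: boolean;
  contextUpgradeRules: ContextUpgradeRule[];
  contextUpgradeTokenThreshold: number;
}

// Assumed glob dialect: "*" matches any run of characters, everything else is literal.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// First matching rule wins; returns undefined when no rule applies.
function resolveUpgrade(model: string, rules: ContextUpgradeRule[]): string | undefined {
  for (const rule of rules) {
    if (globToRegExp(rule.from).test(model)) return rule.to;
  }
  return undefined;
}

// Proactive path: upgrade only when enabled and the estimate exceeds the threshold.
function proactiveUpgrade(
  model: string,
  estimatedInputTokens: number,
  config: ContextUpgradeConfig,
): string {
  if (!config.contextUpgrade) return model;
  if (estimatedInputTokens <= config.contextUpgradeTokenThreshold) return model;
  return resolveUpgrade(model, config.contextUpgradeRules) ?? model;
}

const exampleConfig: ContextUpgradeConfig = {
  contextUpgrade: true,
  contextUpgradeRules: [{ from: "claude-opus-4.6", to: "claude-opus-4.6-1m" }],
  contextUpgradeTokenThreshold: 160000,
};

console.log(proactiveUpgrade("claude-opus-4.6", 200000, exampleConfig)); // claude-opus-4.6-1m
console.log(proactiveUpgrade("claude-opus-4.6", 50000, exampleConfig)); // claude-opus-4.6
```

With the default empty `contextUpgradeRules`, this sketch never changes the model, which matches the README's point that upgrade targets are config-driven.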
+
 ### Small-Model Routing
 
 `/v1/messages` can optionally reroute specific low-value requests to a cheaper model:
@@ -340,6 +364,8 @@ When the Copilot token response includes `endpoints.api`, `ghc-proxy` now prefer
 
 Incoming requests hit an [Elysia](https://elysiajs.com/) server. `chat/completions` requests are validated, normalized into the shared planning pipeline, and then forwarded to Copilot. `responses` requests use a native Responses path with explicit compatibility policies. `messages` requests are routed per-model and can use native Anthropic passthrough, the Responses translation path, or the existing chat-completions fallback. The translator tracks exact vs lossy vs unsupported behavior explicitly; see the [Messages Routing and Translation Guide](./docs/messages-routing-and-translation.md) and the [Anthropic Translation Matrix](./docs/anthropic-translation-matrix.md) for the current support surface.
 
+For Anthropic `search_result` blocks, current live probes show Copilot native `/v1/messages` accepts top-level search results and pure search-result tool outputs, but rejects top-level `citations` and mixed text/search-result tool output arrays. The native path sanitizes those known rejection cases, while translated paths flatten search results to text.
+
 ### Request Routing
 
 `ghc-proxy` does not force every request through one protocol. The current routing rules are: