copilot-custom-endpoint 1.2.1 → 1.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2) hide show
  1. package/README.md +80 -818
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -1,873 +1,135 @@
1
- # Github Copilot Custom Endpoints
1
+ # GitHub Copilot Custom Endpoints
2
2
 
3
- > **TL;DR** — As of **June 1, 2026**, GitHub Copilot switched to usage-based billing (AI Credits), making every chat and agent session burn credits fast. This repo documents a practical workaround: use **cheaper, non-GitHub models** (DeepSeek, Kimi, Qwen, MiMo) inside VS Code's Copilot chat — often at **5–55× lower cost** while retaining agent mode, tool calling, and streaming. We keep validated, copy-paste-ready configs and a small local proxy that smooths out provider quirks.
3
+ > **TL;DR** — GitHub Copilot switched to usage-based billing on **June 1, 2026**. Every chat and agent session now burns AI credits fast. This repo shows you how to plug **cheaper non-GitHub models** (DeepSeek, Kimi, Qwen, MiMo, MiniMax) into VS Code's Copilot chat — often **5–55× cheaper** than the built-ins — while keeping agent mode, tools, streaming, and vision.
4
4
 
5
5
  ## What is this?
6
6
 
7
- VS Code lets you add your own language-model endpoint ("Bring Your Own Key"). In practice, many providers claim "OpenAI-compatible" APIs but reject the exact request shapes that VS Code sends. This repo is a growing collection of **real, tested setups** — not just hopeful `curl` snippets.
8
-
9
- Each provider/model gets one durable record under `docs/models/` plus any local proxy code it needs under `proxy/`.
10
-
11
- ### Why custom endpoints instead of OpenRouter?
12
-
13
- [OpenRouter](https://openrouter.ai) is a popular unified gateway, but it is **not always an option**:
14
-
15
- - **Corporate firewalls often block OpenRouter** (and many other cloud AI gateways) by default. If your employer's network blocks OpenRouter, you cannot use it — full stop. A custom endpoint lets you talk directly to a provider that _is_ allowed, or run a small local proxy on `localhost` that forwards through an approved egress path.
16
- - **Provider-specific features** (Kimi's thinking mode, vision quirks, etc.) often need request rewriting that a generic aggregator does not support.
17
- - **Cost or contract reasons** may mean your organisation already has a direct relationship with a specific provider and does not want traffic routed through a third party.
18
-
19
- This repo is for those situations: validated, copy-paste-ready configs when OpenRouter is blocked, too expensive, or simply the wrong tool for the job.
20
-
21
- ## Quick start
22
-
23
- | Provider | Model | Needs proxy? | Plain chat | Streaming | Tool calling | Vision |
24
- | ----------------------------- | --------------- | ---------------------------------- | ---------- | --------- | ------------ | ------ |
25
- | **Moonshot (Kimi)** | `kimi-k2.6` | Yes — `proxy/kimi-proxy.mjs` | ✅ | ✅ | ✅ | ✅ |
26
- | **Alibaba Cloud (DashScope)** | `qwen3.6-plus` | Optional — `proxy/qwen-proxy.mjs`¹ | ✅² | ✅ | ✅ | ✅ |
27
- | **Alibaba Cloud (DashScope)** | `qwen3.7-max` | Optional — `proxy/qwen-proxy.mjs`¹ | ✅² | ✅ | ✅ | ❌ |
28
- | **DeepSeek** | `deepseek-v4` | No — uses a VS Code extension | ✅ | ✅ | ✅ | ✅³ |
29
- | **Xiaomi MiMo** | `mimo-v2.5` | No | ✅ | ✅ | ✅ | ✅⁴ |
30
- | **Xiaomi MiMo** | `mimo-v2.5-pro` | No | ✅ | ✅ | ✅ | ❌ |
31
- | **Xiaomi MiMo** | `mimo-v2-flash` | No | ✅ | ✅ | ✅ | ❌ |
32
- | **MiniMax** | `MiniMax-M3` | No | ✅ | ✅ | ✅ | ✅ |
33
-
34
- ¹ Proxy is optional: direct path works with static `enable_thinking: false`. Proxy adds dynamic thinking suppression (thinking ON in plain chat, OFF in tool loops).
35
- ² With proxy: reasoning visible in plain chat. Without proxy: always suppressed.
36
- ³ Vision is supported through a proxy model (Claude, GPT-4o) that describes the image before sending to DeepSeek.
37
- ⁴ Native vision via dedicated ViT encoder. Tested via VS Code image attachment in agent mode.
38
-
39
- Pick the model you want and follow the corresponding section below.
40
-
41
- ### Config setup: two-step workflow
42
-
43
- VS Code separates **model configuration** from **API key storage** for security. You set up each provider in two steps:
44
-
45
- 1. **Create/update `chatLanguageModels.json`** — this file defines the models, URLs, and settings. API keys are **not** stored here (leave `apiKey` out entirely, or use an empty string).
46
-
47
- | OS | Path |
48
- | ------- | ----------------------------------------------------------------- |
49
- | Windows | `%APPDATA%\Code\User\chatLanguageModels.json` |
50
- | macOS | `~/Library/Application Support/Code/User/chatLanguageModels.json` |
51
- | Linux | `~/.config/Code/User/chatLanguageModels.json` |
52
-
53
- 2. **Set each API key through the Language Models UI:**
54
- - Open the Command Palette (`Ctrl+Shift+P`).
55
- - Run **Chat: Manage Language Models**.
56
- - Find your provider group in the list.
57
- - Right-click the group name → **Update API Key**.
58
- - Paste your key. It is stored securely (not in the JSON file).
59
-
60
- > **Why this way?** The JSON config file is often tracked in dotfile repos or shared across machines. API keys don't belong there. The VS Code UI stores them in your OS keychain instead.
61
-
62
- ### Full example config
63
-
64
- Here's a complete, real-world example of `chatLanguageModels.json` combining all the providers documented in this repo. Note the `apiKey` fields are left as empty strings — you'll set them via the Language Models UI instead. After you set a key via the UI, VS Code replaces the empty string with a `${input:chat.lm.secret.<id>}` secret reference.
65
-
66
- ```json
67
- [
68
- {
69
- "name": "Qwen",
70
- "vendor": "customendpoint",
71
- "apiKey": "",
72
- "apiType": "chat-completions",
73
- "models": [
74
- {
75
- "id": "qwen3.7-max",
76
- "name": "Qwen 3.7 Max",
77
- "url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
78
- "toolCalling": true,
79
- "vision": false,
80
- "streaming": true,
81
- "requestBody": {
82
- "enable_thinking": false
83
- }
84
- },
85
- {
86
- "id": "qwen3.6-plus",
87
- "name": "Qwen 3.6 Plus",
88
- "url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
89
- "toolCalling": true,
90
- "vision": true,
91
- "streaming": true,
92
- "requestBody": {
93
- "enable_thinking": false
94
- }
95
- }
96
- ]
97
- },
98
- {
99
- "name": "Kimi",
100
- "vendor": "customendpoint",
101
- "apiKey": "",
102
- "apiType": "chat-completions",
103
- "models": [
104
- {
105
- "id": "kimi-k2.6",
106
- "name": "Kimi K2.6",
107
- "url": "http://127.0.0.1:3457/v1/chat/completions",
108
- "requestBody": {
109
- "temperature": 1
110
- },
111
- "toolCalling": true,
112
- "vision": true,
113
- "streaming": true,
114
- "maxInputTokens": 262144,
115
- "maxOutputTokens": 32768
116
- }
117
- ]
118
- },
119
- {
120
- "name": "MiMo",
121
- "vendor": "customendpoint",
122
- "apiKey": "",
123
- "apiType": "chat-completions",
124
- "models": [
125
- {
126
- "id": "mimo-v2.5-pro",
127
- "name": "MiMo V2.5 Pro",
128
- "url": "https://api.xiaomimimo.com/v1/chat/completions",
129
- "toolCalling": true,
130
- "vision": false,
131
- "streaming": true,
132
- "maxInputTokens": 1048576,
133
- "maxOutputTokens": 131072,
134
- "requestBody": {
135
- "thinking": { "type": "disabled" },
136
- "temperature": 1,
137
- "top_p": 0.95
138
- }
139
- },
140
- {
141
- "id": "mimo-v2.5",
142
- "name": "MiMo V2.5",
143
- "url": "https://api.xiaomimimo.com/v1/chat/completions",
144
- "toolCalling": true,
145
- "vision": true,
146
- "streaming": true,
147
- "maxInputTokens": 1048576,
148
- "maxOutputTokens": 32768,
149
- "requestBody": {
150
- "thinking": { "type": "disabled" },
151
- "temperature": 1,
152
- "top_p": 0.95
153
- }
154
- },
155
- {
156
- "id": "mimo-v2-flash",
157
- "name": "MiMo V2 Flash",
158
- "url": "https://api.xiaomimimo.com/v1/chat/completions",
159
- "toolCalling": true,
160
- "vision": false,
161
- "streaming": true,
162
- "maxInputTokens": 262144,
163
- "maxOutputTokens": 65536,
164
- "requestBody": {
165
- "thinking": { "type": "disabled" },
166
- "temperature": 0.3,
167
- "top_p": 0.95
168
- }
169
- }
170
- ]
171
- },
172
- {
173
- "name": "MiniMax",
174
- "vendor": "customendpoint",
175
- "apiKey": "",
176
- "apiType": "chat-completions",
177
- "models": [
178
- {
179
- "id": "MiniMax-M3",
180
- "name": "MiniMax M3",
181
- "url": "https://api.minimax.io/v1/chat/completions",
182
- "toolCalling": true,
183
- "vision": true,
184
- "streaming": true,
185
- "maxInputTokens": 1048576,
186
- "maxOutputTokens": 131072,
187
- "requestBody": {
188
- "thinking": { "type": "adaptive" },
189
- "reasoning_split": true,
190
- "temperature": 1,
191
- "top_p": 0.95
192
- }
193
- }
194
- ]
195
- }
196
- ]
197
- ```
7
+ VS Code lets you add your own language-model endpoint via a small JSON config file. Many providers advertise "OpenAI-compatible" APIs but reject the exact request shapes VS Code sends. This repo collects **real, tested setups** — one per provider plus a tiny local proxy that smooths over the rough edges when needed.
198
8
 
199
- <details>
200
- <summary>Kimi K2.6 (Moonshot)</summary>
9
+ If [OpenRouter](https://openrouter.ai) is blocked by your network, too expensive, or too generic for your model's quirks, this is the workaround.
201
10
 
202
- ### Kimi K2.6 (Moonshot)
11
+ ## How it works (4 steps)
203
12
 
204
- #### 1. Grab a Moonshot API key
13
+ 1. **Pick a model** from the table below.
14
+ 2. **Add it to your VS Code config** — copy the snippet from the model's doc.
15
+ 3. **Set the API key** through VS Code's UI (it goes to your OS keychain, not the file).
16
+ 4. **Open chat** and pick the model from the model picker.
205
17
 
206
- Sign up at [platform.moonshot.ai](https://platform.moonshot.ai) and create an API key.
18
+ That's it. No code, no servers to manage (unless the model specifically needs the local proxy — the table tells you).
207
19
 
208
- #### 2. Start the local proxy
20
+ ## Pick a model
209
21
 
210
- The proxy rewrites VS Code's requests into shapes Kimi actually accepts (fixed `temperature`, `top_p`, and disabling "thinking" during tool calls).
22
+ | Model | Provider | Needs proxy? | Vision | Setup guide |
23
+ | --------------------------- | --------- | ---------------------- | ------------ | -------------------------------------------------------------------------------------------------- |
24
+ | **MiMo V2 Flash** | Xiaomi | No | ❌ | [Setup](docs/models/mimo.md) |
25
+ | **MiMo V2.5** | Xiaomi | No | ✅ | [Setup](docs/models/mimo.md) |
26
+ | **MiMo V2.5 Pro** | Xiaomi | No | ❌ | [Setup](docs/models/mimo.md) |
27
+ | **Kimi K2.6** | Moonshot | **Yes** | ✅ | [Setup](docs/models/kimi-k2.6.md) |
28
+ | **Qwen 3.6 Plus** | DashScope | Optional | ✅ | [Setup](docs/models/qwen.md) |
29
+ | **Qwen 3.7 Max** | DashScope | Optional | ❌ | [Setup](docs/models/qwen.md) |
30
+ | **MiniMax M3** | MiniMax | No | ✅ | [Setup](docs/models/minimax.md) |
31
+ | **DeepSeek V4 Pro / Flash** | DeepSeek | No (uses an extension) | ✅ via proxy | [Marketplace](https://marketplace.visualstudio.com/items?itemName=Vizards.deepseek-v4-for-copilot) |
211
32
 
212
- > **Local config:** Create a `.env` file in this repo root to set environment variables like `KIMI_PROXY_PORT`, `KIMI_UPSTREAM_URL`, etc. It's loaded automatically via `dotenv` — no need to prefix commands.
33
+ ## Setup
213
34
 
214
- Run Kimi proxy
35
+ ### 1. Find (or create) your config file
215
36
 
216
- ```bash
217
- npm run proxy:kimi
218
- ```
37
+ | OS | Path |
38
+ | ------- | ----------------------------------------------------------------- |
39
+ | Windows | `%APPDATA%\Code\User\chatLanguageModels.json` |
40
+ | macOS | `~/Library/Application Support/Code/User/chatLanguageModels.json` |
41
+ | Linux | `~/.config/Code/User/chatLanguageModels.json` |
219
42
 
220
- Run all proxies
43
+ If the file doesn't exist yet, create it with `[]` inside.
221
44
 
222
- ```bash
223
- npm run proxy
224
- ```
45
+ ### 2. Add a model entry
225
46
 
226
- Run globally (from any directory)
47
+ Open the setup guide for the model you picked (links in the table above) and copy its JSON snippet into the file. Each snippet is a single provider object inside the array.
227
48
 
228
- ```bash
229
- # Kimi only
230
- npx copilot-custom-endpoint kimi
231
- # All proxies
232
- npx copilot-custom-endpoint
233
- ```
49
+ > **⚠️ Leave `apiKey` as `""`** — never paste the key into the JSON file.
234
50
 
235
- Clean up debug logs
236
-
237
- ```bash
238
- npm run clean:logs
239
- # or with npx
240
- npx copilot-custom-endpoint clean
241
- ```
242
-
243
- You should see:
244
-
245
- ```
246
- [kimi-proxy] listening on http://127.0.0.1:3457/v1/chat/completions
247
- [kimi-proxy] forwarding to https://api.moonshot.ai/v1/chat/completions
248
- [kimi-proxy] forcing temperature=1, non-thinking temperature=0.6, and top_p=0.95
249
- [kimi-proxy] disable thinking with tools=true
250
- [kimi-proxy] writing redacted request summaries to debug_log/kimi-proxy.ndjson
251
- ```
252
-
253
- Check it's alive:
254
-
255
- ```bash
256
- curl http://127.0.0.1:3457/healthz
257
- ```
258
-
259
- Expected response:
260
-
261
- ```json
262
- {
263
- "ok": true,
264
- "upstreamUrl": "https://api.moonshot.ai/v1/chat/completions",
265
- "port": 3457,
266
- "forcedTemperature": 1,
267
- "forcedTopP": 0.95
268
- }
269
- ```
270
-
271
- > **Keep this terminal open** while you use Kimi in VS Code.
272
-
273
- #### 3. Register the model in VS Code
274
-
275
- First, open (or create) your user config file (see [Config file location](#config-file-location) above) and paste this entry (leave `apiKey` as empty string — you'll set it via the UI):
276
-
277
- ```json
278
- {
279
- "name": "Kimi",
280
- "vendor": "customendpoint",
281
- "apiKey": "",
282
- "apiType": "chat-completions",
283
- "models": [
284
- {
285
- "id": "kimi-k2.6",
286
- "name": "Kimi K2.6",
287
- "url": "http://127.0.0.1:3457/v1/chat/completions",
288
- "requestBody": {
289
- "temperature": 1
290
- },
291
- "toolCalling": true,
292
- "vision": true,
293
- "streaming": true,
294
- "maxInputTokens": 262144,
295
- "maxOutputTokens": 32768
296
- }
297
- ]
298
- }
299
- ```
300
-
301
- > **Note:** The `requestBody.temperature` here is a hint to VS Code, but the proxy will enforce the exact values Kimi requires regardless.
302
-
303
- Then set your Moonshot API key via the Language Models UI:
304
-
305
- - Open the Command Palette (`Ctrl+Shift+P`).
306
- - Run **Chat: Manage Language Models**.
307
- - Find the **Kimi** group, right-click it → **Update API Key**.
308
- - Paste your Moonshot API key.
309
-
310
- #### 4. Chat!
311
-
312
- - Open the Copilot chat panel (`Ctrl+Alt+I` / `Cmd+Ctrl+I`).
313
- - Click the model picker (top-right of the chat input).
314
- - Choose **Kimi K2.6**.
315
- - Ask something. Streaming, tool use, and vision all work.
316
-
317
- #### Troubleshooting (Kimi)
318
-
319
- | Symptom | Fix |
320
- | --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
321
- | "Connection refused" or no response | Make sure `node proxy/kimi-proxy.mjs` is still running. |
322
- | `invalid temperature` / `invalid top_p` | You're talking directly to Moonshot instead of through the proxy. Double-check the `url` in `chatLanguageModels.json`. |
323
- | Tool calls fail after first turn | This happens if "thinking" stays enabled during tool loops. The proxy normally disables it automatically; ensure you're on the latest `proxy/kimi-proxy.mjs`. |
324
-
325
- </details>
326
-
327
- ---
328
-
329
- <details>
330
- <summary>Qwen 3.6 Plus / Qwen 3.7 Max (DashScope)</summary>
331
-
332
- ### Qwen 3.6 Plus or Qwen 3.7 Max (DashScope)
333
-
334
- Qwen models work **directly** with DashScope — no proxy needed. Just add `enable_thinking: false` to `requestBody` for tool-calling stability. An optional `proxy/qwen-proxy.mjs` is also available for dynamic thinking suppression (see [below](#optional-local-proxy-for-dynamic-thinking)).
335
-
336
- #### 1. Grab a DashScope API key
337
-
338
- Create an API key [here](https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=dashboard#/api-key).
339
-
340
- > **Regional endpoints:** DashScope offers endpoints for several regions. API keys are region-specific.
341
- >
342
- > - **China (Beijing):** `https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions`
343
- > - **US (Virginia):** `https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completions`
344
- > - **Singapore (default):** `https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions`
345
-
346
- #### 2. Register the models in VS Code
347
-
348
- First, open (or create) your user config file (see [Config file location](#config-file-location) above) and paste this entry (leave `apiKey` as empty string — you'll set it via the UI):
349
-
350
- ```json
351
- {
352
- "name": "Qwen",
353
- "vendor": "customendpoint",
354
- "apiKey": "",
355
- "apiType": "chat-completions",
356
- "models": [
357
- {
358
- "id": "qwen3.7-max",
359
- "name": "Qwen 3.7 Max",
360
- "url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
361
- "toolCalling": true,
362
- "vision": false,
363
- "streaming": true,
364
- "requestBody": {
365
- "enable_thinking": false
366
- }
367
- },
368
- {
369
- "id": "qwen3.6-plus",
370
- "name": "Qwen 3.6 Plus",
371
- "url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
372
- "toolCalling": true,
373
- "vision": true,
374
- "streaming": true,
375
- "requestBody": {
376
- "enable_thinking": false
377
- }
378
- }
379
- ]
380
- }
381
- ```
51
+ ### 3. Set the API key
382
52
 
383
- Then set your DashScope API key via the Language Models UI:
53
+ 1. Open the Command Palette (`Ctrl+Shift+P` / `Cmd+Shift+P`).
54
+ 2. Run **Chat: Manage Language Models**.
55
+ 3. Find your provider in the list, right-click the group name → **Update API Key**.
56
+ 4. Paste your key. It's stored in your OS keychain.
384
57
 
385
- - Open the Command Palette (`Ctrl+Shift+P`).
386
- - Run **Chat: Manage Language Models**.
387
- - Find the **Qwen** group, right-click it → **Update API Key**.
388
- - Paste your DashScope API key.
58
+ ### 4. Chat
389
59
 
390
- > **Trade-off:** `enable_thinking: false` suppresses reasoning in all requests (both plain chat and tool loops). Tool loops stay stable, but you never see the model's thought process. The [optional proxy](#optional-local-proxy-for-dynamic-thinking) below avoids this trade-off.
60
+ - Open Copilot chat (`Ctrl+Alt+I` / `Cmd+Ctrl+I`).
61
+ - Click the model picker (top-right).
62
+ - Pick your model and ask something.
391
63
 
392
- #### 3. Chat!
64
+ If a model needs a proxy, the setup guide will tell you to run a command first. Keep that terminal open while you chat.
393
65
 
394
- - Open the Copilot chat panel (`Ctrl+Alt+I` / `Cmd+Ctrl+I`).
395
- - Click the model picker (top-right of the chat input).
396
- - Choose **Qwen 3.6 Plus** (with vision) or **Qwen 3.7 Max** (text only).
397
- - Ask something. Streaming, tool use, and vision (3.6 Plus) all work.
66
+ ## Common commands
398
67
 
399
- ---
400
-
401
- #### Optional: Local proxy for dynamic thinking
402
-
403
- If you want reasoning visible in plain chat but automatically suppressed during tool loops, run the optional `proxy/qwen-proxy.mjs` instead.
404
-
405
- Start the proxy:
406
-
407
- ```bash
408
- npm run proxy:qwen
409
- ```
410
-
411
- Or with all proxies:
412
-
413
- ```bash
414
- npm run proxy
415
- ```
416
-
417
- Or globally (from any directory):
68
+ Run from the repo root:
418
69
 
419
70
  ```bash
420
- # Qwen only
421
- npx copilot-custom-endpoint qwen
422
- # All proxies
423
- npx copilot-custom-endpoint
424
- ```
425
-
426
- You should see:
427
-
428
- ```
429
- [qwen-proxy] listening on http://127.0.0.1:3458/v1/chat/completions
71
+ npm run proxy # Start both proxies (Kimi + Qwen)
72
+ npm run proxy:kimi # Start only the Kimi proxy
73
+ npm run proxy:qwen # Start only the Qwen proxy
74
+ npm run clean:logs # Remove debug_log/
75
+ npm test # Run the test suite
430
76
  ```
431
77
 
432
- Check it's alive:
78
+ Or globally via npx (no clone needed):
433
79
 
434
80
  ```bash
435
- curl http://127.0.0.1:3458/healthz
436
- ```
437
-
438
- Expected response:
439
-
440
- ```json
441
- {
442
- "ok": true,
443
- "upstreamUrl": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
444
- "port": 3458,
445
- "disableThinkingWithTools": true
446
- }
81
+ npx copilot-custom-endpoint # Start both proxies
82
+ npx copilot-custom-endpoint kimi # Kimi only
83
+ npx copilot-custom-endpoint qwen # Qwen only
84
+ npx copilot-custom-endpoint clean # Remove debug_log/
447
85
  ```
448
86
 
449
- Then update your VS Code config to point URLs at the proxy and remove `requestBody` — the proxy handles thinking dynamically (remember, `apiKey` stays empty — set it via the UI):
450
-
451
- ```json
452
- {
453
- "name": "Qwen",
454
- "vendor": "customendpoint",
455
- "apiKey": "",
456
- "apiType": "chat-completions",
457
- "models": [
458
- {
459
- "id": "qwen3.7-max",
460
- "name": "Qwen 3.7 Max",
461
- "url": "http://127.0.0.1:3458/v1/chat/completions",
462
- "toolCalling": true,
463
- "vision": false,
464
- "streaming": true
465
- },
466
- {
467
- "id": "qwen3.6-plus",
468
- "name": "Qwen 3.6 Plus",
469
- "url": "http://127.0.0.1:3458/v1/chat/completions",
470
- "toolCalling": true,
471
- "vision": true,
472
- "streaming": true
473
- }
474
- ]
475
- }
476
- ```
477
-
478
- > **Keep the proxy terminal open** while using these models.
479
-
480
- The proxy URL is configurable via the `QWEN_UPSTREAM_URL` environment variable (defaults to the Singapore endpoint shown in [step 1](#1-grab-a-dashscope-api-key)).
481
-
482
- #### Troubleshooting (Qwen)
483
-
484
- | Symptom | Fix |
485
- | -------------------------------------------- | --------------------------------------------------------------------------------------- |
486
- | `reasoning_content` errors during tool loops | Ensure `enable_thinking: false` is present in `requestBody` for every Qwen model. |
487
- | Vision images fail to upload | Use base64-encoded images; external image URLs may fail if DashScope cannot reach them. |
488
-
489
- </details>
87
+ ## Pricing snapshot
490
88
 
491
- ---
89
+ All prices are **USD per 1M tokens** (cache miss). 1 AI credit = $0.01.
492
90
 
493
- <details>
494
- <summary>DeepSeek V4 (VS Code Extension)</summary>
91
+ | Model | Input | Output | Context |
92
+ | ---------------------------- | ----- | ------ | ------- |
93
+ | **MiMo V2 Flash** 🏆 | $0.10 | $0.30 | 256K |
94
+ | **DeepSeek V4 Flash** 🏆 | $0.14 | $0.28 | 1M |
95
+ | **Kimi K2.6** (non-thinking) | $0.16 | $0.95 | 256K |
96
+ | **MiMo V2.5** | $0.40 | $2.00 | 1M |
97
+ | **Qwen 3.6 Plus** | $0.50 | $3.00 | 1M |
98
+ | **MiniMax M3** | $0.60 | $2.40 | 1M |
99
+ | **MiMo V2.5 Pro** | $1.00 | $3.00 | 1M |
100
+ | **Qwen 3.7 Max** | $2.50 | $7.50 | 1M |
495
101
 
496
- ### DeepSeek V4 (VS Code Extension)
102
+ For the full pricing comparison (cached rates, full Copilot roster, footnotes, sources) see [docs/pricing.md](docs/pricing.md). For a copy-paste config containing **all providers at once**, see [docs/example-config.md](docs/example-config.md).
497
103
 
498
- DeepSeek V4 Pro & Flash are available via a **dedicated VS Code extension** rather than a raw custom endpoint. The extension plugs DeepSeek directly into Copilot Chat's model picker while preserving agent mode, tool calling, skills, and MCP support.
104
+ ## Need help?
499
105
 
500
- > **How this differs:** Unlike Kimi and Qwen (which use VS Code's built-in `chatLanguageModels.json` custom endpoint mechanism), DeepSeek uses a VS Code extension that registers itself with Copilot. The experience is the same pick the model in chat — but the setup path goes through the extension.
501
-
502
- #### 1. Install the Extension
503
-
504
- - VS Code 1.116 or later.
505
- - A [GitHub Copilot subscription](https://github.com/features/copilot) (Free / Pro / Enterprise all work).
506
- - Install **[DeepSeek V4 for Copilot Chat](https://marketplace.visualstudio.com/items?itemName=Vizards.deepseek-v4-for-copilot)** from the VS Code Marketplace ([source](https://github.com/Vizards/deepseek-v4-for-copilot)).
507
-
508
- #### 2. Get a DeepSeek API Key
509
-
510
- Go to [platform.deepseek.com/api_keys](https://platform.deepseek.com/api_keys) and create an API key (starts with `sk-`).
511
-
512
- #### 3. Configure the API Key
513
-
514
- Open the Command Palette (`Ctrl+Shift+P`) and run **DeepSeek: Set API Key**, then paste your key. The key is stored in your OS keychain.
515
-
516
- #### 4. Select the Model and Start Chatting
517
-
518
- - Open Copilot Chat (`Ctrl+Shift+I`).
519
- - Click the model picker (top-right of the chat panel).
520
- - Choose **DeepSeek V4 Pro** or **DeepSeek V4 Flash**.
521
- - Agent mode, tool calling, skills, and MCP all work out of the box.
522
-
523
- #### Optional: Configure Thinking Effort
524
-
525
- In the model picker, click the gear icon next to a DeepSeek model to choose:
526
-
527
- - **None** — fastest, no reasoning.
528
- - **High** — balanced (default).
529
- - **Max** — deep reasoning for complex tasks.
530
-
531
- #### Optional: Vision Support
532
-
533
- DeepSeek V4 is text-only, but the extension handles images automatically — drop a screenshot into chat and it proxies through another installed Copilot model (Claude, GPT-4o) to describe the image first. Run **DeepSeek: Set Vision Proxy Model** to pick which model handles image descriptions.
534
-
535
- > For the full official guide, see: [github.com/deepseek-ai/awesome-deepseek-agent/blob/main/docs/github_copilot.md](https://github.com/deepseek-ai/awesome-deepseek-agent/blob/main/docs/github_copilot.md)
536
-
537
- </details>
538
-
539
- ---
540
-
541
- <details>
542
- <summary>Xiaomi MiMo</summary>
543
-
544
- ### Xiaomi MiMo
545
-
546
- MiMo works **directly** — no proxy needed. Just add the provider entry to your VS Code config and select the model in the chat picker.
547
-
548
- No proxy means lower latency, fewer moving parts, and nothing extra to keep running.
549
-
550
- #### 1. Get a MiMo API key
551
-
552
- Sign up at [platform.xiaomimimo.com](https://platform.xiaomimimo.com) and create an API key from the [Console](https://platform.xiaomimimo.com/console/api-keys).
553
-
554
- #### 2. Register the models in VS Code
555
-
556
- First, open your user config file (see [Config file location](#config-file-location) above) and paste this entry (leave `apiKey` as empty string — you'll set it via the UI):
557
-
558
- ```json
559
- {
560
- "name": "MiMo",
561
- "vendor": "customendpoint",
562
- "apiKey": "",
563
- "apiType": "chat-completions",
564
- "models": [
565
- {
566
- "id": "mimo-v2.5-pro",
567
- "name": "MiMo V2.5 Pro",
568
- "url": "https://api.xiaomimimo.com/v1/chat/completions",
569
- "toolCalling": true,
570
- "vision": false,
571
- "streaming": true,
572
- "maxInputTokens": 1048576,
573
- "maxOutputTokens": 131072,
574
- "requestBody": {
575
- "thinking": { "type": "disabled" },
576
- "temperature": 1,
577
- "top_p": 0.95
578
- }
579
- },
580
- {
581
- "id": "mimo-v2.5",
582
- "name": "MiMo V2.5",
583
- "url": "https://api.xiaomimimo.com/v1/chat/completions",
584
- "toolCalling": true,
585
- "vision": true,
586
- "streaming": true,
587
- "maxInputTokens": 1048576,
588
- "maxOutputTokens": 32768,
589
- "requestBody": {
590
- "thinking": { "type": "disabled" },
591
- "temperature": 1,
592
- "top_p": 0.95
593
- }
594
- },
595
- {
596
- "id": "mimo-v2-flash",
597
- "name": "MiMo V2 Flash",
598
- "url": "https://api.xiaomimimo.com/v1/chat/completions",
599
- "toolCalling": true,
600
- "vision": false,
601
- "streaming": true,
602
- "maxInputTokens": 262144,
603
- "maxOutputTokens": 65536,
604
- "requestBody": {
605
- "thinking": { "type": "disabled" },
606
- "temperature": 0.3,
607
- "top_p": 0.95
608
- }
609
- }
610
- ]
611
- }
612
- ```
613
-
614
- Then set your MiMo API key via the Language Models UI:
615
-
616
- - Open the Command Palette (`Ctrl+Shift+P`).
617
- - Run **Chat: Manage Language Models**.
618
- - Find the **MiMo** group, right-click it → **Update API Key**.
619
- - Paste your MiMo API key.
620
-
621
- > **Note:** `thinking: { "type": "disabled" }` is required for tool-calling stability. Without it, MiMo returns a 400 error when conversation history contains tool calls with missing `reasoning_content`.
622
-
623
- #### 3. Chat!
624
-
625
- - Open the Copilot chat panel (`Ctrl+Alt+I` / `Cmd+Ctrl+I`).
626
- - Click the model picker (top-right of the chat input).
627
- - Choose **MiMo V2 Flash** (fastest/cheapest), **MiMo V2.5** (omnimodal with vision), or **MiMo V2.5 Pro** (most capable for agentic work).
628
- - Ask something. Streaming, tool use, and vision (V2.5) all work.
629
-
630
- #### Troubleshooting (MiMo)
631
-
632
- | Symptom | Fix |
633
- | ----------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- |
634
- | 400 error `reasoning_content` during tool loops | Ensure `thinking: { "type": "disabled" }` is present in `requestBody` for every MiMo model. |
635
- | Vision images fail to upload | Use `mimo-v2.5` (the only model with native vision). Text-only models (`pro`, `flash`) don't support image input. |
636
-
637
- </details>
638
-
639
- ---
640
-
641
- <details>
642
- <summary>MiniMax M3 (MiniMax)</summary>
643
-
644
- ### MiniMax M3 (MiniMax)
645
-
646
- MiniMax works **directly** with the OpenAI-compatible Chat Completions endpoint — no proxy needed. The recommended config enables MiniMax's native reasoning via `thinking: { "type": "adaptive" }` + `reasoning_split: true`.
647
-
648
- #### 1. Grab a MiniMax API key
649
-
650
- Create an API key at the [MiniMax Developer Platform](https://platform.minimax.io/user-center/basic-information/interface-key).
651
-
652
- > **Regional endpoints:** MiniMax offers endpoints for different regions. API keys are region-specific.
653
- >
654
- > - **International (default):** `https://api.minimax.io/v1/chat/completions`
655
- > - **China:** `https://api.minimaxi.com/v1/chat/completions`
656
-
657
- #### 2. Register the model in VS Code
658
-
659
- First, open (or create) your user config file (see [Config file location](#config-file-location) above) and paste this entry (leave `apiKey` as empty string — you'll set it via the UI):
660
-
661
- ```json
662
- {
663
- "name": "MiniMax",
664
- "vendor": "customendpoint",
665
- "apiKey": "",
666
- "apiType": "chat-completions",
667
- "models": [
668
- {
669
- "id": "MiniMax-M3",
670
- "name": "MiniMax M3",
671
- "url": "https://api.minimax.io/v1/chat/completions",
672
- "toolCalling": true,
673
- "vision": true,
674
- "streaming": true,
675
- "maxInputTokens": 1048576,
676
- "maxOutputTokens": 131072,
677
- "requestBody": {
678
- "thinking": { "type": "adaptive" },
679
- "reasoning_split": true,
680
- "temperature": 1,
681
- "top_p": 0.95
682
- }
683
- }
684
- ]
685
- }
686
- ```
687
-
688
- Then set your MiniMax API key via the Language Models UI:
689
-
690
- - Open the Command Palette (`Ctrl+Shift+P`).
691
- - Run **Chat: Manage Language Models**.
692
- - Find the **MiniMax** group, right-click it → **Update API Key**.
693
- - Paste your MiniMax API key.
694
-
695
- **Why this config?**
696
-
697
- - `thinking: { "type": "adaptive" }` — MiniMax's documented default. The model decides when to reason.
698
- - `reasoning_split: true` — the server returns reasoning in a structured `reasoning_details` field instead of mixing `<think>` tags into `content`. VS Code sees a clean OpenAI-format message.
699
-
700
- > **Note:** `thinking: { "type": "disabled" }` is **not** a hard override — Phase 1 testing confirmed MiniMax-M3 still reasons internally regardless of this setting, and emits `<think>` tags in `content` either way. Setting it to `disabled` only changes the response field layout, not actual model behavior. We recommend `adaptive` for clarity.
701
-
702
- #### 3. Chat!
703
-
704
- - Open the Copilot chat panel (`Ctrl+Alt+I` / `Cmd+Ctrl+I`).
705
- - Click the model picker and select **MiniMax M3**.
706
- - Ask something. Plain chat, streaming, tool use, and vision all work.
707
-
708
- #### Troubleshooting (MiniMax)
709
-
710
- | Symptom | Fix |
711
- | ------------------------------------ | ------------------------------------------------------------------------------------------------------------- |
712
- | Model not appearing in picker | Check your `chatLanguageModels.json` syntax. Reload the VS Code window. |
713
- | 400 on tool calls | Confirm the model ID is `MiniMax-M3` (capital M's, lowercase i, hyphen). Check the API key region. |
714
- | Responses show leaked `<think>` tags | Make sure `"reasoning_split": true` is set in `requestBody` so reasoning goes to `reasoning_details` instead. |
715
-
716
- </details>
717
-
718
- ---
719
-
720
- For the full research notes, tested values, and known limitations, see:
721
-
722
- - [`docs/models/kimi-k2.6.md`](docs/models/kimi-k2.6.md)
723
- - [`docs/models/qwen.md`](docs/models/qwen.md)
724
- - [`docs/models/mimo.md`](docs/models/mimo.md)
725
- - [`docs/models/minimax.md`](docs/models/minimax.md)
726
-
727
- ## Pricing comparison
728
-
729
- > **⏰ June 1, 2026 — GitHub Copilot switched to usage-based billing (AI Credits) today.**
730
- >
731
- > Before this change, Copilot used **premium request-based billing** — each model had its own multiplier (e.g., GPT-5.5 = 7.5×, Claude Sonnet 4.6 = 1×, Haiku 4.5 = 0.33×), and every request consumed `multiplier × 1` from your monthly premium-request allowance. Now **every interaction burns AI credits** based on actual token consumption. Agent mode and complex multi-file tasks consume significantly more tokens than simple Q&A, which means your 7,000 Pro+ credits can disappear fast if you're using frontier models.
732
- >
733
- > **The practical workaround:** use cheaper alternative models (DeepSeek V4 Flash, Kimi K2.6, Qwen) that are still powerful enough for coding — often at **5–55× less cost** than the Copilot defaults. The tables below show the exact comparison.
734
- >
735
- > 1 AI credit = $0.01 USD. All paid plans include a monthly credit allowance:
736
- >
737
- > | Plan | Price/mo | Base credits | Flex allotment | Total monthly |
738
- > | ---- | -------- | ------------ | -------------- | ------------- |
739
- > | Pro | $10 | 1,000 | 500 | **1,500** |
740
- > | Pro+ | $39 | 3,900 | 3,100 | **7,000** |
741
- > | Max | $100 | 10,000 | 10,000 | **20,000** |
742
- >
743
- > Code completions remain unlimited and **not** billed. Auto model selection gets a 10% discount.
744
-
745
- All prices below are in **USD per 1M tokens** (non-cached). To convert to AI credits, multiply by 100 (e.g., $5.00/1M = 500 credits/1M).
746
-
747
- ### Default GitHub Copilot models
748
-
749
- These are the models available through GitHub Copilot's model roster as of June 1, 2026.
750
-
751
- | Model | Provider | Tier | Input (per 1M) | Cached input | Output (per 1M) | Context |
752
- | --------------------- | --------- | ----------- | -------------- | ------------ | --------------- | ------- |
753
- | **GPT-5.5** | OpenAI | Powerful | $5.00 | $0.50 | $30.00 | — |
754
- | **Claude Opus 4.8** | Anthropic | Powerful | $5.00 | $0.50 | $25.00 | 1M |
755
- | **Claude Opus 4.7** | Anthropic | Powerful | $5.00 | $0.50 | $25.00 | 1M |
756
- | **GPT-5.4** | OpenAI | Versatile | $2.50 | $0.25 | $15.00 | — |
757
- | **GPT-5.3-Codex** | OpenAI | Powerful | $1.75 | $0.175 | $14.00 | — |
758
- | **Claude Sonnet 4.6** | Anthropic | Versatile | $3.00 | $0.30 | $15.00 | 1M |
759
- | **Gemini 3.1 Pro** | Google | Powerful | $2.00¹ | $0.20 | $12.00¹ | 1M |
760
- | **Claude Haiku 4.5** | Anthropic | Versatile | $1.00 | $0.10 | $5.00 | 1M |
761
- | **Gemini 3.5 Flash** | Google | Lightweight | $1.50 | $0.15 | $9.00 | 1M |
762
- | **Gemini 2.5 Pro** | Google | Powerful | $1.25¹ | $0.125 | $10.00¹ | 1M |
763
- | **GPT-5.4 mini** | OpenAI | Lightweight | $0.75 | $0.075 | $4.50 | — |
764
- | **Gemini 3 Flash** | Google | Lightweight | $0.50 | $0.05 | $3.00 | 1M |
765
- | **Raptor mini** | GitHub | Versatile | $0.25 | $0.025 | $2.00 | — |
766
-
767
- ¹ Gemini 3.1 Pro and 2.5 Pro pricing applies to prompts ≤200K tokens.
768
-
769
- ### Custom-endpoint alternatives
770
-
771
- | Model | Provider | Input (per 1M) | Output (per 1M) | Context window |
772
- | --------------------- | --------- | ----------------------------- | --------------------------------------- | -------------- |
773
- | **DeepSeek V4 Flash** | DeepSeek | $0.14 | $0.28 | 1M |
774
- | **MiMo V2 Flash** 🏆 | Xiaomi | $0.10 | $0.30 | 256K |
775
- | **Kimi K2.6** | Moonshot | $0.16 | $0.95 (non-thinking) / $4.00 (thinking) | 256K |
776
- | **DeepSeek V4 Pro** | DeepSeek | $1.74 | $3.48 | 1M |
777
- | **MiMo V2.5** | Xiaomi | $0.40 | $2.00 | 1M |
778
- | **MiMo V2.5 Pro** | Xiaomi | $1.00 | $3.00 | 1M |
779
- | **Qwen 3.6 Plus** | DashScope | $0.50 (≤256K) / $2.00 (>256K) | $3.00 (≤256K) / $6.00 (>256K) | 1M |
780
- | **Qwen 3.7 Max** | DashScope | $2.50 (≤1M) | $7.50 (≤1M) | 1M |
781
- | **MiniMax M3** | MiniMax | $0.60 (≤512K) / $1.20 (>512K) | $2.40 (≤512K) / $4.80 (>512K) | 1M |
782
-
783
- > **Notes:**
784
- >
785
- > - **DeepSeek V4** input pricing shown is the **cache miss** price. Cache hits are significantly cheaper ($0.0028/M for Flash, $0.0145/M for Pro).
786
- > - **MiMo** input pricing shown is the **cache miss** price. Cache hits are 5× cheaper for V2.5 Pro ($0.20/M) and V2.5 ($0.08/M), and 10× cheaper for V2 Flash ($0.01/M).
787
- > - **Gemini 3 Flash** is priced at $0.50/MTok input (text/image/video) and $1.00/MTok input for audio.
788
- > - **Anthropic (Claude)** models also have a cache write cost ($6.25/MTok for Opus, $3.75/MTok for Sonnet, $1.25/MTok for Haiku). Opus 4.7+ use a new tokenizer that may use up to 35% more tokens for the same text.
789
- > - **OpenAI** models support cached input at 0.1× base input rate.
790
- > - **Qwen** models use **tiered pricing** — determined by total input tokens per request. Prices above are for non-thinking mode.
791
- > - **Kimi K2.6** pricing is from the **Moonshot platform** (direct). Via DashScope: $0.89 input / $3.71 output.
792
- > - **DashScope** offers a **free quota** of 1M input + 1M output tokens per model, valid for 90 days.
793
- > - **MiniMax M3** uses **tiered pricing** — input price doubles above 512K input tokens. A 7-day 50% off promotion is available for new accounts.
794
- > - **MiMo** offers a **Token Plan** subscription model with discounted rates and a free cache-writing promotion.
795
- > - For typical Copilot chat usage (short-to-medium prompts), you'll almost always fall in the lowest pricing tier.
796
-
797
- **Quick cost comparison for a typical coding session** (~10K input + ~2K output tokens per turn, 50 turns):
798
-
799
- | Model | Estimated session cost | Copilot Pro+ credits |
800
- | ------------------------ | ---------------------- | -------------------- |
801
- | MiMo V2 Flash 🏆 | ~$0.08 | — |
802
- | DeepSeek V4 Flash 🏆 | ~$0.10 | — |
803
- | Kimi K2.6 (non-thinking) | ~$0.18 | — |
804
- | MiMo V2.5 | ~$0.40 | — |
805
- | Kimi K2.6 (thinking) | ~$0.48 | — |
806
- | Gemini 3 Flash | ~$0.55 | ~55 |
807
- | Qwen 3.6 Plus | ~$0.55 | — |
808
- | MiniMax M3 | ~$0.54 | — |
809
- | MiMo V2.5 Pro | ~$0.80 | — |
810
- | GPT-5.4 mini | ~$0.83 | ~83 |
811
- | Claude Haiku 4.5 | ~$1.00 | ~100 |
812
- | DeepSeek V4 Pro | ~$1.22 | — |
813
- | Qwen 3.7 Max | ~$1.33 | — |
814
- | Gemini 2.5 Pro | ~$1.63 | ~163 |
815
- | Gemini 3.5 Flash | ~$1.65 | ~165 |
816
- | Gemini 3.1 Pro | ~$2.20 | ~220 |
817
- | GPT-5.3-Codex | ~$2.28 | ~228 |
818
- | GPT-5.4 | ~$2.75 | ~275 |
819
- | Claude Sonnet 4.6 | ~$3.00 | ~300 |
820
- | Claude Opus 4.8 / 4.7 | ~$5.00 | ~500 |
821
- | GPT-5.5 | ~$5.50 | ~550 |
822
-
823
- > **How long does 7,000 credits last?** A Pro+ subscriber running 50-turn sessions could afford roughly **13 GPT-5.5 sessions**, **23 Opus sessions**, or **212 Raptor mini sessions** per month — or mix and match.
824
-
825
- > Prices last verified: June 1, 2026. Always check the official pages for the latest rates:
826
- >
827
- > - [GitHub Copilot models & pricing](https://docs.github.com/en/copilot/reference/copilot-billing/models-and-pricing)
828
- > - [OpenAI pricing](https://openai.com/api/pricing/)
829
- > - [Anthropic (Claude) pricing](https://platform.claude.com/docs/en/about-claude/pricing)
830
- > - [Google Gemini pricing](https://ai.google.dev/pricing)
831
- > - [DashScope pricing](https://www.alibabacloud.com/help/en/model-studio/billing-for-model-studio)
832
- > - [DeepSeek pricing](https://api-docs.deepseek.com/quick_start/pricing)
833
- > - [MiMo pricing](https://platform.xiaomimimo.com/docs/en-US/pricing)
834
- > - [MiniMax pricing](https://platform.minimax.io/docs/pricing/overview)
106
+ - **Per-model issues:** check the troubleshooting section at the bottom of each model's doc.
107
+ - **Repo questions / bugs:** open an issue on GitHub.
835
108
 
836
109
  ## Repo layout
837
110
 
838
111
  ```
839
112
  .
840
- ├── docs/models/<provider>-<model>.md # One merged record per model
841
- ├── proxy/ # Local compatibility shims (Kimi only)
842
- ├── tests/ # Test assets (images, etc.)
113
+ ├── docs/models/<provider>-<model>.md # Per-model setup guides (the real docs)
114
+ ├── proxy/ # Local compatibility shims
115
+ ├── tests/ # Test assets
843
116
  └── debug_log/ # Runtime logs (git-ignored)
844
117
  ```
845
118
 
846
- ## Adding a new model
847
-
848
- Want to validate Qwen, GLM, Mimo, or something else?
119
+ ## Want to add a new model?
849
120
 
850
- 1. Create `docs/models/<provider>-<model>.md`.
851
- 2. If the provider needs request rewriting, add a proxy script under `proxy/`.
852
- 3. Recommended sections for the record:
853
- 1. Summary
854
- 2. Compatibility assessment
855
- 3. Final working configuration
856
- 4. Validation summary
857
- 5. Known limitations
858
- 6. Final verdict
859
- 7. Sources
121
+ 1. Create `docs/models/<provider>-<model>.md` with a clear walkthrough.
122
+ 2. If the provider needs request rewriting, add a proxy under `proxy/`.
123
+ 3. Submit a PR.
860
124
 
861
125
  ## Limitations
862
126
 
863
- - This repo covers **chat only**. GitHub Copilot features like inline completions, semantic search, and next-edit suggestions still require a GitHub-hosted model.
864
- - Each proxy is tuned for a specific provider family. Don't point the Kimi proxy at an arbitrary OpenAI-compatible endpoint and expect it to work.
865
-
866
- ---
127
+ - **Chat only.** Inline completions, semantic search, and next-edit suggestions still need a GitHub-hosted model.
128
+ - Each proxy is tuned for a specific provider family. Don't point the Kimi proxy at an arbitrary OpenAI-compatible endpoint.
867
129
 
868
130
  ## Support
869
131
 
870
- If you find this project helpful, please consider supporting its development:
132
+ If this helped, consider sponsoring or donating:
871
133
 
872
134
  [![GitHub Sponsors](https://img.shields.io/badge/Sponsor-GitHub-pink?logo=github)](https://github.com/sponsors/tugudush)
873
135
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "copilot-custom-endpoint",
3
- "version": "1.2.1",
3
+ "version": "1.2.3",
4
4
  "description": "Local proxies for VS Code Copilot custom endpoints — Kimi K2 & Qwen 3.x",
5
5
  "license": "MIT",
6
6
  "type": "module",