copilot-custom-endpoint 1.1.1 → 1.2.1
- package/README.md +164 -19
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -29,6 +29,7 @@ This repo is for those situations: validated, copy-paste-ready configs when Open
|
|
|
29
29
|
| **Xiaomi MiMo** | `mimo-v2.5` | No | ✅ | ✅ | ✅ | ✅⁴ |
|
|
30
30
|
| **Xiaomi MiMo** | `mimo-v2.5-pro` | No | ✅ | ✅ | ✅ | ❌ |
|
|
31
31
|
| **Xiaomi MiMo** | `mimo-v2-flash` | No | ✅ | ✅ | ✅ | ❌ |
|
|
32
|
+
| **MiniMax** | `MiniMax-M3` | No | ✅ | ✅ | ✅ | ✅ |
|
|
32
33
|
|
|
33
34
|
¹ Proxy is optional: direct path works with static `enable_thinking: false`. Proxy adds dynamic thinking suppression (thinking ON in plain chat, OFF in tool loops).
|
|
34
35
|
² With proxy: reasoning visible in plain chat. Without proxy: always suppressed.
|
|
@@ -37,26 +38,37 @@ This repo is for those situations: validated, copy-paste-ready configs when Open
|
|
|
37
38
|
|
|
38
39
|
Pick the model you want and follow the corresponding section below.
|
|
39
40
|
|
|
40
|
-
### Config
|
|
41
|
+
### Config setup: two-step workflow
|
|
41
42
|
|
|
42
|
-
|
|
43
|
+
VS Code separates **model configuration** from **API key storage** for security. You set up each provider in two steps:
|
|
43
44
|
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
|
47
|
-
|
|
|
48
|
-
|
|
|
45
|
+
1. **Create/update `chatLanguageModels.json`** — this file defines the models, URLs, and settings. API keys are **not** stored here (leave `apiKey` out entirely, or use an empty string).
|
|
46
|
+
|
|
47
|
+
| OS | Path |
|
|
48
|
+
| ------- | ----------------------------------------------------------------- |
|
|
49
|
+
| Windows | `%APPDATA%\Code\User\chatLanguageModels.json` |
|
|
50
|
+
| macOS | `~/Library/Application Support/Code/User/chatLanguageModels.json` |
|
|
51
|
+
| Linux | `~/.config/Code/User/chatLanguageModels.json` |
|
|
52
|
+
|
|
53
|
+
2. **Set each API key through the Language Models UI:**
|
|
54
|
+
- Open the Command Palette (`Ctrl+Shift+P`).
|
|
55
|
+
- Run **Chat: Manage Language Models**.
|
|
56
|
+
- Find your provider group in the list.
|
|
57
|
+
- Right-click the group name → **Update API Key**.
|
|
58
|
+
- Paste your key. It is stored securely (not in the JSON file).
|
|
59
|
+
|
|
60
|
+
> **Why this way?** The JSON config file is often tracked in dotfile repos or shared across machines. API keys don't belong there. The VS Code UI stores them in your OS keychain instead.
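To make the path table in step 1 concrete, here is a minimal sketch that picks the config path for the current OS. The function name `chatModelsConfigPath` is hypothetical and not part of this package. It assumes a stable VS Code install (the `Code` directory) and that `APPDATA` or `HOME` is set.

```javascript
// Hypothetical helper, not shipped with this package.
// `platform` mirrors Node's process.platform ("win32", "darwin", "linux");
// `env` is a plain object standing in for process.env.
function chatModelsConfigPath(platform, env) {
  const file = "chatLanguageModels.json";
  if (platform === "win32") {
    // %APPDATA%\Code\User\chatLanguageModels.json
    return `${env.APPDATA}\\Code\\User\\${file}`;
  }
  if (platform === "darwin") {
    // ~/Library/Application Support/Code/User/chatLanguageModels.json
    return `${env.HOME}/Library/Application Support/Code/User/${file}`;
  }
  // Assumption: any other platform follows the Linux layout.
  return `${env.HOME}/.config/Code/User/${file}`;
}

console.log(chatModelsConfigPath(process.platform, process.env));
```

Other VS Code builds (for example Insiders) may use a different directory name, so treat the printed path as a starting point and confirm it on disk.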
|
|
49
61
|
|
|
50
62
|
### Full example config
|
|
51
63
|
|
|
52
|
-
Here's a complete, real-world example of `chatLanguageModels.json` combining all the providers documented in this repo.
|
|
64
|
+
Here's a complete, real-world example of `chatLanguageModels.json` combining all the providers documented in this repo. Note the `apiKey` fields are left as empty strings — you'll set them via the Language Models UI instead. After you set a key via the UI, VS Code replaces the empty string with a `${input:chat.lm.secret.<id>}` secret reference.
|
|
53
65
|
|
|
54
66
|
```json
|
|
55
67
|
[
|
|
56
68
|
{
|
|
57
69
|
"name": "Qwen",
|
|
58
70
|
"vendor": "customendpoint",
|
|
59
|
-
"apiKey": "
|
|
71
|
+
"apiKey": "",
|
|
60
72
|
"apiType": "chat-completions",
|
|
61
73
|
"models": [
|
|
62
74
|
{
|
|
@@ -86,7 +98,7 @@ Here's a complete, real-world example of `chatLanguageModels.json` combining all
|
|
|
86
98
|
{
|
|
87
99
|
"name": "Kimi",
|
|
88
100
|
"vendor": "customendpoint",
|
|
89
|
-
"apiKey": "
|
|
101
|
+
"apiKey": "",
|
|
90
102
|
"apiType": "chat-completions",
|
|
91
103
|
"models": [
|
|
92
104
|
{
|
|
@@ -107,7 +119,7 @@ Here's a complete, real-world example of `chatLanguageModels.json` combining all
|
|
|
107
119
|
{
|
|
108
120
|
"name": "MiMo",
|
|
109
121
|
"vendor": "customendpoint",
|
|
110
|
-
"apiKey": "
|
|
122
|
+
"apiKey": "",
|
|
111
123
|
"apiType": "chat-completions",
|
|
112
124
|
"models": [
|
|
113
125
|
{
|
|
@@ -156,6 +168,30 @@ Here's a complete, real-world example of `chatLanguageModels.json` combining all
|
|
|
156
168
|
}
|
|
157
169
|
}
|
|
158
170
|
]
|
|
171
|
+
},
|
|
172
|
+
{
|
|
173
|
+
"name": "MiniMax",
|
|
174
|
+
"vendor": "customendpoint",
|
|
175
|
+
"apiKey": "",
|
|
176
|
+
"apiType": "chat-completions",
|
|
177
|
+
"models": [
|
|
178
|
+
{
|
|
179
|
+
"id": "MiniMax-M3",
|
|
180
|
+
"name": "MiniMax M3",
|
|
181
|
+
"url": "https://api.minimax.io/v1/chat/completions",
|
|
182
|
+
"toolCalling": true,
|
|
183
|
+
"vision": true,
|
|
184
|
+
"streaming": true,
|
|
185
|
+
"maxInputTokens": 1048576,
|
|
186
|
+
"maxOutputTokens": 131072,
|
|
187
|
+
"requestBody": {
|
|
188
|
+
"thinking": { "type": "adaptive" },
|
|
189
|
+
"reasoning_split": true,
|
|
190
|
+
"temperature": 1,
|
|
191
|
+
"top_p": 0.95
|
|
192
|
+
}
|
|
193
|
+
}
|
|
194
|
+
]
|
|
159
195
|
}
|
|
160
196
|
]
|
|
161
197
|
```
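Because the config file is meant to stay free of secrets, a quick check before committing it to a dotfile repo can help. The sketch below is illustrative only: `findInlineApiKeys` is a hypothetical name, and it assumes the top-level array shape shown in the example above, where each provider group has a `name` and an optional `apiKey`.

```javascript
// Hypothetical helper: list provider groups whose apiKey looks like a literal secret.
// An empty string, a missing field, or a "${input:...}" reference counts as safe.
function findInlineApiKeys(groups) {
  return groups
    .filter(
      (g) =>
        typeof g.apiKey === "string" &&
        g.apiKey !== "" &&
        !g.apiKey.startsWith("${input:")
    )
    .map((g) => g.name);
}

// Usage with made-up sample data (the key below is a placeholder, not a real key).
const sample = [
  { name: "Qwen", apiKey: "" },
  { name: "Kimi", apiKey: "sk-placeholder" },
  { name: "MiMo", apiKey: "${input:chat.lm.secret.example}" },
  { name: "MiniMax" },
];
console.log(findInlineApiKeys(sample));
```

With the sample data, only the `Kimi` group is reported, since it is the only one with a literal string in `apiKey`.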
|
|
@@ -208,6 +244,10 @@ You should see:
|
|
|
208
244
|
|
|
209
245
|
```
|
|
210
246
|
[kimi-proxy] listening on http://127.0.0.1:3457/v1/chat/completions
|
|
247
|
+
[kimi-proxy] forwarding to https://api.moonshot.ai/v1/chat/completions
|
|
248
|
+
[kimi-proxy] forcing temperature=1, non-thinking temperature=0.6, and top_p=0.95
|
|
249
|
+
[kimi-proxy] disable thinking with tools=true
|
|
250
|
+
[kimi-proxy] writing redacted request summaries to debug_log/kimi-proxy.ndjson
|
|
211
251
|
```
|
|
212
252
|
|
|
213
253
|
Check it's alive:
|
|
@@ -232,13 +272,13 @@ Expected response:
|
|
|
232
272
|
|
|
233
273
|
#### 3. Register the model in VS Code
|
|
234
274
|
|
|
235
|
-
|
|
275
|
+
First, open (or create) your user config file (see [Config file location](#config-file-location) above) and paste this entry (leave `apiKey` as empty string — you'll set it via the UI):
|
|
236
276
|
|
|
237
277
|
```json
|
|
238
278
|
{
|
|
239
279
|
"name": "Kimi",
|
|
240
280
|
"vendor": "customendpoint",
|
|
241
|
-
"apiKey": "
|
|
281
|
+
"apiKey": "",
|
|
242
282
|
"apiType": "chat-completions",
|
|
243
283
|
"models": [
|
|
244
284
|
{
|
|
@@ -260,6 +300,13 @@ Open (or create) your user config file (see [Config file location](#config-file-
|
|
|
260
300
|
|
|
261
301
|
> **Note:** The `requestBody.temperature` here is a hint to VS Code, but the proxy will enforce the exact values Kimi requires regardless.
|
|
262
302
|
|
|
303
|
+
Then set your Moonshot API key via the Language Models UI:
|
|
304
|
+
|
|
305
|
+
- Open the Command Palette (`Ctrl+Shift+P`).
|
|
306
|
+
- Run **Chat: Manage Language Models**.
|
|
307
|
+
- Find the **Kimi** group, right-click it → **Update API Key**.
|
|
308
|
+
- Paste your Moonshot API key.
|
|
309
|
+
|
|
263
310
|
#### 4. Chat!
|
|
264
311
|
|
|
265
312
|
- Open the Copilot chat panel (`Ctrl+Alt+I` / `Cmd+Ctrl+I`).
|
|
@@ -298,13 +345,13 @@ Create an API key [here](https://modelstudio.console.alibabacloud.com/ap-southea
|
|
|
298
345
|
|
|
299
346
|
#### 2. Register the models in VS Code
|
|
300
347
|
|
|
301
|
-
|
|
348
|
+
First, open (or create) your user config file (see [Config file location](#config-file-location) above) and paste this entry (leave `apiKey` as empty string — you'll set it via the UI):
|
|
302
349
|
|
|
303
350
|
```json
|
|
304
351
|
{
|
|
305
352
|
"name": "Qwen",
|
|
306
353
|
"vendor": "customendpoint",
|
|
307
|
-
"apiKey": "
|
|
354
|
+
"apiKey": "",
|
|
308
355
|
"apiType": "chat-completions",
|
|
309
356
|
"models": [
|
|
310
357
|
{
|
|
@@ -333,6 +380,13 @@ Open (or create) your user config file (see [Config file location](#config-file-
|
|
|
333
380
|
}
|
|
334
381
|
```
|
|
335
382
|
|
|
383
|
+
Then set your DashScope API key via the Language Models UI:
|
|
384
|
+
|
|
385
|
+
- Open the Command Palette (`Ctrl+Shift+P`).
|
|
386
|
+
- Run **Chat: Manage Language Models**.
|
|
387
|
+
- Find the **Qwen** group, right-click it → **Update API Key**.
|
|
388
|
+
- Paste your DashScope API key.
|
|
389
|
+
|
|
336
390
|
> **Trade-off:** `enable_thinking: false` suppresses reasoning in all requests (both plain chat and tool loops). Tool loops stay stable, but you never see the model's thought process. The [optional proxy](#optional-local-proxy-for-dynamic-thinking) below avoids this trade-off.
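As a rough illustration of what "dynamic" means here (thinking on in plain chat, off in tool loops), the sketch below toggles `enable_thinking` per request. It is not the proxy's actual code. The function names are hypothetical, and the rule for detecting a tool loop (a `tool` message or an assistant `tool_calls` entry in the history) is an assumption; the real proxy may use a different signal.

```javascript
// Assumption: a request is part of a tool loop when its history already
// contains a tool result or an assistant message with tool calls.
function isToolLoop(body) {
  const messages = Array.isArray(body.messages) ? body.messages : [];
  return messages.some(
    (m) =>
      m.role === "tool" ||
      (Array.isArray(m.tool_calls) && m.tool_calls.length > 0)
  );
}

// Return a copy of the request body with enable_thinking set per request,
// instead of the static `enable_thinking: false` used on the direct path.
function applyThinkingPolicy(body) {
  return { ...body, enable_thinking: !isToolLoop(body) };
}

const plainChat = { messages: [{ role: "user", content: "Explain closures" }] };
console.log(applyThinkingPolicy(plainChat).enable_thinking);
```

The static config trades visibility for stability on every request; a per-request switch like this keeps reasoning visible only where it does not interfere with tool calls.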
|
|
337
391
|
|
|
338
392
|
#### 3. Chat!
|
|
@@ -392,13 +446,13 @@ Expected response:
|
|
|
392
446
|
}
|
|
393
447
|
```
|
|
394
448
|
|
|
395
|
-
Then update your VS Code config to point URLs at the proxy and remove `requestBody` — the proxy handles thinking dynamically:
|
|
449
|
+
Then update your VS Code config to point URLs at the proxy and remove `requestBody` — the proxy handles thinking dynamically (remember, `apiKey` stays empty — set it via the UI):
|
|
396
450
|
|
|
397
451
|
```json
|
|
398
452
|
{
|
|
399
453
|
"name": "Qwen",
|
|
400
454
|
"vendor": "customendpoint",
|
|
401
|
-
"apiKey": "
|
|
455
|
+
"apiKey": "",
|
|
402
456
|
"apiType": "chat-completions",
|
|
403
457
|
"models": [
|
|
404
458
|
{
|
|
@@ -499,13 +553,13 @@ Sign up at [platform.xiaomimimo.com](https://platform.xiaomimimo.com) and create
|
|
|
499
553
|
|
|
500
554
|
#### 2. Register the models in VS Code
|
|
501
555
|
|
|
502
|
-
|
|
556
|
+
First, open your user config file (see [Config file location](#config-file-location) above) and paste this entry (leave `apiKey` as empty string — you'll set it via the UI):
|
|
503
557
|
|
|
504
558
|
```json
|
|
505
559
|
{
|
|
506
560
|
"name": "MiMo",
|
|
507
561
|
"vendor": "customendpoint",
|
|
508
|
-
"apiKey": "
|
|
562
|
+
"apiKey": "",
|
|
509
563
|
"apiType": "chat-completions",
|
|
510
564
|
"models": [
|
|
511
565
|
{
|
|
@@ -557,6 +611,13 @@ Open your user config file (see [Config file location](#config-file-location) ab
|
|
|
557
611
|
}
|
|
558
612
|
```
|
|
559
613
|
|
|
614
|
+
Then set your MiMo API key via the Language Models UI:
|
|
615
|
+
|
|
616
|
+
- Open the Command Palette (`Ctrl+Shift+P`).
|
|
617
|
+
- Run **Chat: Manage Language Models**.
|
|
618
|
+
- Find the **MiMo** group, right-click it → **Update API Key**.
|
|
619
|
+
- Paste your MiMo API key.
|
|
620
|
+
|
|
560
621
|
> **Note:** `thinking: { "type": "disabled" }` is required for tool-calling stability. Without it, MiMo returns a 400 error when conversation history contains tool calls with missing `reasoning_content`.
|
|
561
622
|
|
|
562
623
|
#### 3. Chat!
|
|
@@ -577,11 +638,91 @@ Open your user config file (see [Config file location](#config-file-location) ab
|
|
|
577
638
|
|
|
578
639
|
---
|
|
579
640
|
|
|
641
|
+
<details>
|
|
642
|
+
<summary>MiniMax M3 (MiniMax)</summary>
|
|
643
|
+
|
|
644
|
+
### MiniMax M3 (MiniMax)
|
|
645
|
+
|
|
646
|
+
MiniMax works **directly** with the OpenAI-compatible Chat Completions endpoint — no proxy needed. The recommended config enables MiniMax's native reasoning via `thinking: { "type": "adaptive" }` + `reasoning_split: true`.
|
|
647
|
+
|
|
648
|
+
#### 1. Grab a MiniMax API key
|
|
649
|
+
|
|
650
|
+
Create an API key at the [MiniMax Developer Platform](https://platform.minimax.io/user-center/basic-information/interface-key).
|
|
651
|
+
|
|
652
|
+
> **Regional endpoints:** MiniMax offers endpoints for different regions. API keys are region-specific.
|
|
653
|
+
>
|
|
654
|
+
> - **International (default):** `https://api.minimax.io/v1/chat/completions`
|
|
655
|
+
> - **China:** `https://api.minimaxi.com/v1/chat/completions`
|
|
656
|
+
|
|
657
|
+
#### 2. Register the model in VS Code
|
|
658
|
+
|
|
659
|
+
First, open (or create) your user config file (see [Config file location](#config-file-location) above) and paste this entry (leave `apiKey` as empty string — you'll set it via the UI):
|
|
660
|
+
|
|
661
|
+
```json
|
|
662
|
+
{
|
|
663
|
+
"name": "MiniMax",
|
|
664
|
+
"vendor": "customendpoint",
|
|
665
|
+
"apiKey": "",
|
|
666
|
+
"apiType": "chat-completions",
|
|
667
|
+
"models": [
|
|
668
|
+
{
|
|
669
|
+
"id": "MiniMax-M3",
|
|
670
|
+
"name": "MiniMax M3",
|
|
671
|
+
"url": "https://api.minimax.io/v1/chat/completions",
|
|
672
|
+
"toolCalling": true,
|
|
673
|
+
"vision": true,
|
|
674
|
+
"streaming": true,
|
|
675
|
+
"maxInputTokens": 1048576,
|
|
676
|
+
"maxOutputTokens": 131072,
|
|
677
|
+
"requestBody": {
|
|
678
|
+
"thinking": { "type": "adaptive" },
|
|
679
|
+
"reasoning_split": true,
|
|
680
|
+
"temperature": 1,
|
|
681
|
+
"top_p": 0.95
|
|
682
|
+
}
|
|
683
|
+
}
|
|
684
|
+
]
|
|
685
|
+
}
|
|
686
|
+
```
|
|
687
|
+
|
|
688
|
+
Then set your MiniMax API key via the Language Models UI:
|
|
689
|
+
|
|
690
|
+
- Open the Command Palette (`Ctrl+Shift+P`).
|
|
691
|
+
- Run **Chat: Manage Language Models**.
|
|
692
|
+
- Find the **MiniMax** group, right-click it → **Update API Key**.
|
|
693
|
+
- Paste your MiniMax API key.
|
|
694
|
+
|
|
695
|
+
**Why this config?**
|
|
696
|
+
|
|
697
|
+
- `thinking: { "type": "adaptive" }` — MiniMax's documented default. The model decides when to reason.
|
|
698
|
+
- `reasoning_split: true` — the server returns reasoning in a structured `reasoning_details` field instead of mixing `<think>` tags into `content`. VS Code sees a clean OpenAI-format message.
|
|
699
|
+
|
|
700
|
+
> **Note:** `thinking: { "type": "disabled" }` is **not** a hard override — Phase 1 testing confirmed MiniMax-M3 still reasons internally regardless of this setting, and emits `<think>` tags in `content` either way. Setting it to `disabled` only changes the response field layout, not actual model behavior. We recommend `adaptive` for clarity.
|
|
701
|
+
|
|
702
|
+
#### 3. Chat!
|
|
703
|
+
|
|
704
|
+
- Open the Copilot chat panel (`Ctrl+Alt+I` / `Cmd+Ctrl+I`).
|
|
705
|
+
- Click the model picker and select **MiniMax M3**.
|
|
706
|
+
- Ask something. Plain chat, streaming, tool use, and vision all work.
|
|
707
|
+
|
|
708
|
+
#### Troubleshooting (MiniMax)
|
|
709
|
+
|
|
710
|
+
| Symptom | Fix |
|
|
711
|
+
| ------------------------------------ | ------------------------------------------------------------------------------------------------------------- |
|
|
712
|
+
| Model not appearing in picker | Check your `chatLanguageModels.json` syntax. Reload the VS Code window. |
|
|
713
|
+
| 400 on tool calls | Confirm the model ID is `MiniMax-M3` (capital M's, lowercase i, hyphen). Check the API key region. |
|
|
714
|
+
| Responses show leaked `<think>` tags | Make sure `"reasoning_split": true` is set in `requestBody` so reasoning goes to `reasoning_details` instead. |
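If `<think>` tags still reach you in `content` (for example while debugging raw API responses outside VS Code), a small client-side fallback can separate them from the answer. This is a sketch under assumptions, not something this repo ships: `splitThinkTags` is a hypothetical name, and it only handles well-formed, closed `<think>...</think>` pairs.

```javascript
// Hypothetical fallback: pull closed <think>...</think> blocks out of a
// message's content and return the reasoning and the visible answer separately.
function splitThinkTags(content) {
  const reasoning = [];
  const answer = content.replace(/<think>([\s\S]*?)<\/think>/g, (_, inner) => {
    reasoning.push(inner.trim());
    return "";
  });
  return { reasoning: reasoning.join("\n"), answer: answer.trim() };
}

console.log(splitThinkTags("<think>Check the docs first.</think>Use reasoning_split."));
```

Setting `"reasoning_split": true` remains the preferred fix; the helper above is only useful when you cannot change the request body.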
|
|
715
|
+
|
|
716
|
+
</details>
|
|
717
|
+
|
|
718
|
+
---
|
|
719
|
+
|
|
580
720
|
For the full research notes, tested values, and known limitations, see:
|
|
581
721
|
|
|
582
722
|
- [`docs/models/kimi-k2.6.md`](docs/models/kimi-k2.6.md)
|
|
583
723
|
- [`docs/models/qwen.md`](docs/models/qwen.md)
|
|
584
724
|
- [`docs/models/mimo.md`](docs/models/mimo.md)
|
|
725
|
+
- [`docs/models/minimax.md`](docs/models/minimax.md)
|
|
585
726
|
|
|
586
727
|
## Pricing comparison
|
|
587
728
|
|
|
@@ -637,6 +778,7 @@ These are the models available through GitHub Copilot's model roster as of June
|
|
|
637
778
|
| **MiMo V2.5 Pro** | Xiaomi | $1.00 | $3.00 | 1M |
|
|
638
779
|
| **Qwen 3.6 Plus** | DashScope | $0.50 (≤256K) / $2.00 (>256K) | $3.00 (≤256K) / $6.00 (>256K) | 1M |
|
|
639
780
|
| **Qwen 3.7 Max** | DashScope | $2.50 (≤1M) | $7.50 (≤1M) | 1M |
|
|
781
|
+
| **MiniMax M3** | MiniMax | $0.60 (≤512K) / $1.20 (>512K) | $2.40 (≤512K) / $4.80 (>512K) | 1M |
|
|
640
782
|
|
|
641
783
|
> **Notes:**
|
|
642
784
|
>
|
|
@@ -648,6 +790,7 @@ These are the models available through GitHub Copilot's model roster as of June
|
|
|
648
790
|
> - **Qwen** models use **tiered pricing** — determined by total input tokens per request. Prices above are for non-thinking mode.
|
|
649
791
|
> - **Kimi K2.6** pricing is from the **Moonshot platform** (direct). Via DashScope: $0.89 input / $3.71 output.
|
|
650
792
|
> - **DashScope** offers a **free quota** of 1M input + 1M output tokens per model, valid for 90 days.
|
|
793
|
+
> - **MiniMax M3** uses **tiered pricing** — input price doubles above 512K input tokens. A 7-day 50% off promotion is available for new accounts.
|
|
651
794
|
> - **MiMo** offers a **Token Plan** subscription model with discounted rates and a free cache-writing promotion.
|
|
652
795
|
> - For typical Copilot chat usage (short-to-medium prompts), you'll almost always fall in the lowest pricing tier.
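To show how tiered pricing plays out for a single request, here is a minimal sketch using the MiniMax M3 row from the table. Several things are assumptions: prices are taken as USD per 1M tokens, "512K" is read as 512,000 tokens, and the tier is chosen by the request's input token count and applied to both input and output. The names `MINIMAX_M3` and `estimateCost` are hypothetical.

```javascript
// Prices copied from the MiniMax M3 table row; units assumed to be USD per 1M tokens.
const MINIMAX_M3 = {
  tierLimit: 512000, // assumption: "512K" means 512,000 input tokens
  low: { input: 0.6, output: 2.4 },
  high: { input: 1.2, output: 4.8 },
};

// Estimate the cost of one request in USD.
function estimateCost(inputTokens, outputTokens, pricing = MINIMAX_M3) {
  const tier = inputTokens <= pricing.tierLimit ? pricing.low : pricing.high;
  return (inputTokens / 1e6) * tier.input + (outputTokens / 1e6) * tier.output;
}

// A 100K-token prompt with a 10K-token reply stays in the lower tier.
console.log(estimateCost(100000, 10000).toFixed(3)); // prints "0.084"
```

Under these assumptions, a typical chat request costs well under a dollar; the higher tier only applies once a single prompt exceeds the 512K threshold.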
|
|
653
796
|
|
|
@@ -662,6 +805,7 @@ These are the models available through GitHub Copilot's model roster as of June
|
|
|
662
805
|
| Kimi K2.6 (thinking) | ~$0.48 | — |
|
|
663
806
|
| Gemini 3 Flash | ~$0.55 | ~55 |
|
|
664
807
|
| Qwen 3.6 Plus | ~$0.55 | — |
|
|
808
|
+
| MiniMax M3 | ~$0.54 | — |
|
|
665
809
|
| MiMo V2.5 Pro | ~$0.80 | — |
|
|
666
810
|
| GPT-5.4 mini | ~$0.83 | ~83 |
|
|
667
811
|
| Claude Haiku 4.5 | ~$1.00 | ~100 |
|
|
@@ -687,6 +831,7 @@ These are the models available through GitHub Copilot's model roster as of June
|
|
|
687
831
|
> - [DashScope pricing](https://www.alibabacloud.com/help/en/model-studio/billing-for-model-studio)
|
|
688
832
|
> - [DeepSeek pricing](https://api-docs.deepseek.com/quick_start/pricing)
|
|
689
833
|
> - [MiMo pricing](https://platform.xiaomimimo.com/docs/en-US/pricing)
|
|
834
|
+
> - [MiniMax pricing](https://platform.minimax.io/docs/pricing/overview)
|
|
690
835
|
|
|
691
836
|
## Repo layout
|
|
692
837
|
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "copilot-custom-endpoint",
|
|
3
|
-
"version": "1.
|
|
3
|
+
"version": "1.2.1",
|
|
4
4
|
"description": "Local proxies for VS Code Copilot custom endpoints — Kimi K2 & Qwen 3.x",
|
|
5
5
|
"license": "MIT",
|
|
6
6
|
"type": "module",
|
|
@@ -51,4 +51,4 @@
|
|
|
51
51
|
"dependencies": {
|
|
52
52
|
"dotenv": "^17.4.2"
|
|
53
53
|
}
|
|
54
|
-
}
|
|
54
|
+
}
|