copilot-custom-endpoint 1.0.5 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2) hide show
  1. package/README.md +165 -9
  2. package/package.json +2 -2
package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Github Copilot Custom Endpoints
2
2
 
3
- > **TL;DR** — As of **June 1, 2026**, GitHub Copilot switched to usage-based billing (AI Credits), making every chat and agent session burn credits fast. This repo documents a practical workaround: use **cheaper, non-GitHub models** (DeepSeek, Kimi, Qwen) inside VS Code's Copilot chat — often at **5–55× lower cost** while retaining agent mode, tool calling, and streaming. We keep validated, copy-paste-ready configs and a small local proxy that smooths out provider quirks.
3
+ > **TL;DR** — As of **June 1, 2026**, GitHub Copilot switched to usage-based billing (AI Credits), making every chat and agent session burn credits fast. This repo documents a practical workaround: use **cheaper, non-GitHub models** (DeepSeek, Kimi, Qwen, MiMo) inside VS Code's Copilot chat — often at **5–55× lower cost** while retaining agent mode, tool calling, and streaming. We keep validated, copy-paste-ready configs and a small local proxy that smooths out provider quirks.
4
4
 
5
5
  ## What is this?
6
6
 
@@ -20,16 +20,20 @@ This repo is for those situations: validated, copy-paste-ready configs when Open
20
20
 
21
21
  ## Quick start
22
22
 
23
- | Provider | Model | Needs proxy? | Plain chat | Streaming | Tool calling | Vision |
24
- | ----------------------------- | -------------- | ---------------------------------- | ---------- | --------- | ------------ | ------ |
25
- | **Moonshot (Kimi)** | `kimi-k2.6` | Yes — `proxy/kimi-proxy.mjs` | ✅ | ✅ | ✅ | ✅ |
26
- | **Alibaba Cloud (DashScope)** | `qwen3.6-plus` | Optional — `proxy/qwen-proxy.mjs`¹ | ✅² | ✅ | ✅ | ✅ |
27
- | **Alibaba Cloud (DashScope)** | `qwen3.7-max` | Optional — `proxy/qwen-proxy.mjs`¹ | ✅² | ✅ | ✅ | ❌ |
28
- | **DeepSeek** | `deepseek-v4` | No — uses a VS Code extension | ✅ | ✅ | ✅ | ✅³ |
23
+ | Provider | Model | Needs proxy? | Plain chat | Streaming | Tool calling | Vision |
24
+ | ----------------------------- | --------------- | ---------------------------------- | ---------- | --------- | ------------ | ------ |
25
+ | **Moonshot (Kimi)** | `kimi-k2.6` | Yes — `proxy/kimi-proxy.mjs` | ✅ | ✅ | ✅ | ✅ |
26
+ | **Alibaba Cloud (DashScope)** | `qwen3.6-plus` | Optional — `proxy/qwen-proxy.mjs`¹ | ✅² | ✅ | ✅ | ✅ |
27
+ | **Alibaba Cloud (DashScope)** | `qwen3.7-max` | Optional — `proxy/qwen-proxy.mjs`¹ | ✅² | ✅ | ✅ | ❌ |
28
+ | **DeepSeek** | `deepseek-v4` | No — uses a VS Code extension | ✅ | ✅ | ✅ | ✅³ |
29
+ | **Xiaomi MiMo** | `mimo-v2.5` | No | ✅ | ✅ | ✅ | ✅⁴ |
30
+ | **Xiaomi MiMo** | `mimo-v2.5-pro` | No | ✅ | ✅ | ✅ | ❌ |
31
+ | **Xiaomi MiMo** | `mimo-v2-flash` | No | ✅ | ✅ | ✅ | ❌ |
29
32
 
30
33
  ¹ Proxy is optional: direct path works with static `enable_thinking: false`. Proxy adds dynamic thinking suppression (thinking ON in plain chat, OFF in tool loops).
31
34
  ² With proxy: reasoning visible in plain chat. Without proxy: always suppressed.
32
- ³ Vision is supported through a proxy model (Claude, GPT-4o) that describes the image before sending to DeepSeek.
35
+ ³ Vision is supported through a proxy model (Claude, GPT-4o) that describes the image before sending to DeepSeek.
36
+ ⁴ Native vision via dedicated ViT encoder. Tested via VS Code image attachment in agent mode.
33
37
 
34
38
  Pick the model you want and follow the corresponding section below.
35
39
 
@@ -99,6 +103,59 @@ Here's a complete, real-world example of `chatLanguageModels.json` combining all
99
103
  "maxOutputTokens": 32768
100
104
  }
101
105
  ]
106
+ },
107
+ {
108
+ "name": "MiMo",
109
+ "vendor": "customendpoint",
110
+ "apiKey": "<your-mimo-api-key>",
111
+ "apiType": "chat-completions",
112
+ "models": [
113
+ {
114
+ "id": "mimo-v2.5-pro",
115
+ "name": "MiMo V2.5 Pro",
116
+ "url": "https://api.xiaomimimo.com/v1/chat/completions",
117
+ "toolCalling": true,
118
+ "vision": false,
119
+ "streaming": true,
120
+ "maxInputTokens": 1048576,
121
+ "maxOutputTokens": 131072,
122
+ "requestBody": {
123
+ "thinking": { "type": "disabled" },
124
+ "temperature": 1,
125
+ "top_p": 0.95
126
+ }
127
+ },
128
+ {
129
+ "id": "mimo-v2.5",
130
+ "name": "MiMo V2.5",
131
+ "url": "https://api.xiaomimimo.com/v1/chat/completions",
132
+ "toolCalling": true,
133
+ "vision": true,
134
+ "streaming": true,
135
+ "maxInputTokens": 1048576,
136
+ "maxOutputTokens": 32768,
137
+ "requestBody": {
138
+ "thinking": { "type": "disabled" },
139
+ "temperature": 1,
140
+ "top_p": 0.95
141
+ }
142
+ },
143
+ {
144
+ "id": "mimo-v2-flash",
145
+ "name": "MiMo V2 Flash",
146
+ "url": "https://api.xiaomimimo.com/v1/chat/completions",
147
+ "toolCalling": true,
148
+ "vision": false,
149
+ "streaming": true,
150
+ "maxInputTokens": 262144,
151
+ "maxOutputTokens": 65536,
152
+ "requestBody": {
153
+ "thinking": { "type": "disabled" },
154
+ "temperature": 0.3,
155
+ "top_p": 0.95
156
+ }
157
+ }
158
+ ]
102
159
  }
103
160
  ]
104
161
  ```
@@ -418,10 +475,101 @@ DeepSeek V4 is text-only, but the extension handles images automatically — dro
418
475
 
419
476
  > For the full official guide, see: [github.com/deepseek-ai/awesome-deepseek-agent/blob/main/docs/github_copilot.md](https://github.com/deepseek-ai/awesome-deepseek-agent/blob/main/docs/github_copilot.md)
420
477
 
478
+ ---
479
+
480
+ ### Xiaomi MiMo
481
+
482
+ MiMo works **directly** — no proxy needed. Just add the provider entry to your VS Code config and select the model in the chat picker.
483
+
484
+ No proxy means lower latency, fewer moving parts, and nothing extra to keep running.
485
+
486
+ #### 1. Get a MiMo API key
487
+
488
+ Sign up at [platform.xiaomimimo.com](https://platform.xiaomimimo.com) and create an API key from the [Console](https://platform.xiaomimimo.com/console/api-keys).
489
+
490
+ #### 2. Register the models in VS Code
491
+
492
+ Open your user config file (see [Config file location](#config-file-location) above) and paste this entry (replace `<your-mimo-api-key>`):
493
+
494
+ ```json
495
+ {
496
+ "name": "MiMo",
497
+ "vendor": "customendpoint",
498
+ "apiKey": "<your-mimo-api-key>",
499
+ "apiType": "chat-completions",
500
+ "models": [
501
+ {
502
+ "id": "mimo-v2.5-pro",
503
+ "name": "MiMo V2.5 Pro",
504
+ "url": "https://api.xiaomimimo.com/v1/chat/completions",
505
+ "toolCalling": true,
506
+ "vision": false,
507
+ "streaming": true,
508
+ "maxInputTokens": 1048576,
509
+ "maxOutputTokens": 131072,
510
+ "requestBody": {
511
+ "thinking": { "type": "disabled" },
512
+ "temperature": 1,
513
+ "top_p": 0.95
514
+ }
515
+ },
516
+ {
517
+ "id": "mimo-v2.5",
518
+ "name": "MiMo V2.5",
519
+ "url": "https://api.xiaomimimo.com/v1/chat/completions",
520
+ "toolCalling": true,
521
+ "vision": true,
522
+ "streaming": true,
523
+ "maxInputTokens": 1048576,
524
+ "maxOutputTokens": 32768,
525
+ "requestBody": {
526
+ "thinking": { "type": "disabled" },
527
+ "temperature": 1,
528
+ "top_p": 0.95
529
+ }
530
+ },
531
+ {
532
+ "id": "mimo-v2-flash",
533
+ "name": "MiMo V2 Flash",
534
+ "url": "https://api.xiaomimimo.com/v1/chat/completions",
535
+ "toolCalling": true,
536
+ "vision": false,
537
+ "streaming": true,
538
+ "maxInputTokens": 262144,
539
+ "maxOutputTokens": 65536,
540
+ "requestBody": {
541
+ "thinking": { "type": "disabled" },
542
+ "temperature": 0.3,
543
+ "top_p": 0.95
544
+ }
545
+ }
546
+ ]
547
+ }
548
+ ```
549
+
550
+ > **Note:** `thinking: { "type": "disabled" }` is required for tool-calling stability. Without it, MiMo returns a 400 error when conversation history contains tool calls with missing `reasoning_content`.
551
+
552
+ #### 3. Chat!
553
+
554
+ - Open the Copilot chat panel (`Ctrl+Alt+I` / `Cmd+Ctrl+I`).
555
+ - Click the model picker (top-right of the chat input).
556
+ - Choose **MiMo V2 Flash** (fastest/cheapest), **MiMo V2.5** (omnimodal with vision), or **MiMo V2.5 Pro** (most capable for agentic work).
557
+ - Ask something. Streaming, tool use, and vision (V2.5) all work.
558
+
559
+ #### Troubleshooting (MiMo)
560
+
561
+ | Symptom | Fix |
562
+ | ----------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- |
563
+ | 400 error `reasoning_content` during tool loops | Ensure `thinking: { "type": "disabled" }` is present in `requestBody` for every MiMo model. |
564
+ | Vision images fail to upload | Use `mimo-v2.5` (the only model with native vision). Text-only models (`pro`, `flash`) don't support image input. |
565
+
566
+ ---
567
+
421
568
  For the full research notes, tested values, and known limitations, see:
422
569
 
423
570
  - [`docs/models/kimi-k2.6.md`](docs/models/kimi-k2.6.md)
424
571
  - [`docs/models/qwen.md`](docs/models/qwen.md)
572
+ - [`docs/models/mimo.md`](docs/models/mimo.md)
425
573
 
426
574
  ## Pricing comparison
427
575
 
@@ -470,32 +618,39 @@ These are the models available through GitHub Copilot's model roster as of June
470
618
  | Model | Provider | Input (per 1M) | Output (per 1M) | Context window |
471
619
  | --------------------- | --------- | ----------------------------- | --------------------------------------- | -------------- |
472
620
  | **DeepSeek V4 Flash** | DeepSeek | $0.14 | $0.28 | 1M |
621
+ | **MiMo V2 Flash** 🏆 | Xiaomi | $0.10 | $0.30 | 256K |
473
622
  | **Kimi K2.6** | Moonshot | $0.16 | $0.95 (non-thinking) / $4.00 (thinking) | 256K |
474
623
  | **DeepSeek V4 Pro** | DeepSeek | $1.74 | $3.48 | 1M |
624
+ | **MiMo V2.5** | Xiaomi | $0.40 | $2.00 | 1M |
625
+ | **MiMo V2.5 Pro** | Xiaomi | $1.00 | $3.00 | 1M |
475
626
  | **Qwen 3.6 Plus** | DashScope | $0.50 (≤256K) / $2.00 (>256K) | $3.00 (≤256K) / $6.00 (>256K) | 1M |
476
627
  | **Qwen 3.7 Max** | DashScope | $2.50 (≤1M) | $7.50 (≤1M) | 1M |
477
628
 
478
629
  > **Notes:**
479
630
  >
480
631
  > - **DeepSeek V4** input pricing shown is the **cache miss** price. Cache hits are significantly cheaper ($0.0028/M for Flash, $0.0145/M for Pro).
632
+ > - **MiMo** input pricing shown is the **cache miss** price. Cache hits are 5× cheaper for V2.5 Pro ($0.20/M) and V2.5 ($0.08/M), and 10× cheaper for V2 Flash ($0.01/M).
481
633
  > - **Gemini 3 Flash** is priced at $0.50/MTok input (text/image/video) and $1.00/MTok input for audio.
482
634
  > - **Anthropic (Claude)** models also have a cache write cost ($6.25/MTok for Opus, $3.75/MTok for Sonnet, $1.25/MTok for Haiku). Opus 4.7+ use a new tokenizer that may use up to 35% more tokens for the same text.
483
635
  > - **OpenAI** models support cached input at 0.1× base input rate.
484
636
  > - **Qwen** models use **tiered pricing** — determined by total input tokens per request. Prices above are for non-thinking mode.
485
637
  > - **Kimi K2.6** pricing is from the **Moonshot platform** (direct). Via DashScope: $0.89 input / $3.71 output.
486
638
  > - **DashScope** offers a **free quota** of 1M input + 1M output tokens per model, valid for 90 days.
639
+ > - **MiMo** offers a **Token Plan** subscription model with discounted rates and a free cache-writing promotion.
487
640
  > - For typical Copilot chat usage (short-to-medium prompts), you'll almost always fall in the lowest pricing tier.
488
641
 
489
642
  **Quick cost comparison for a typical coding session** (~10K input + ~2K output tokens per turn, 50 turns):
490
643
 
491
644
  | Model | Estimated session cost | Copilot Pro+ credits |
492
645
  | ------------------------ | ---------------------- | -------------------- |
646
+ | MiMo V2 Flash 🏆 | ~$0.08 | — |
493
647
  | DeepSeek V4 Flash 🏆 | ~$0.10 | — |
494
648
  | Kimi K2.6 (non-thinking) | ~$0.18 | — |
495
- | Raptor mini | ~$0.33 | ~33 |
649
+ | MiMo V2.5 | ~$0.40 | |
496
650
  | Kimi K2.6 (thinking) | ~$0.48 | — |
497
651
  | Gemini 3 Flash | ~$0.55 | ~55 |
498
652
  | Qwen 3.6 Plus | ~$0.55 | — |
653
+ | MiMo V2.5 Pro | ~$0.80 | — |
499
654
  | GPT-5.4 mini | ~$0.83 | ~83 |
500
655
  | Claude Haiku 4.5 | ~$1.00 | ~100 |
501
656
  | DeepSeek V4 Pro | ~$1.22 | — |
@@ -519,6 +674,7 @@ These are the models available through GitHub Copilot's model roster as of June
519
674
  > - [Google Gemini pricing](https://ai.google.dev/pricing)
520
675
  > - [DashScope pricing](https://www.alibabacloud.com/help/en/model-studio/billing-for-model-studio)
521
676
  > - [DeepSeek pricing](https://api-docs.deepseek.com/quick_start/pricing)
677
+ > - [MiMo pricing](https://platform.xiaomimimo.com/docs/en-US/pricing)
522
678
 
523
679
  ## Repo layout
524
680
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "copilot-custom-endpoint",
3
- "version": "1.0.5",
3
+ "version": "1.1.0",
4
4
  "description": "Local proxies for VS Code Copilot custom endpoints — Kimi K2 & Qwen 3.x",
5
5
  "license": "MIT",
6
6
  "type": "module",
@@ -51,4 +51,4 @@
51
51
  "dependencies": {
52
52
  "dotenv": "^17.4.2"
53
53
  }
54
- }
54
+ }