copilot-custom-endpoint 1.4.0 → 1.4.2

package/README.md CHANGED
@@ -88,21 +88,23 @@ npx copilot-custom-endpoint clean # Remove debug_log/
 
  ## Pricing snapshot
 
- All prices are **USD per 1M tokens** (cache miss). 1 AI credit = $0.01. **MiniMax M3** figures reflect a permanent 50% off list price — see the model doc for the full rate card.
-
- | Model | Input | Output | Context |
- | ---------------------------- | ----- | ------ | ------- |
- | **MiMo V2 Flash** 🏆 | $0.10 | $0.30 | 256K |
- | **DeepSeek V4 Flash** 🏆 | $0.14 | $0.28 | 1M |
- | **Kimi K2.6** (non-thinking) | $0.16 | $0.95 | 262K |
- | **Kimi K2.7 Code** | $0.19 | $4.00 | 262K |
- | **MiniMax M3** | $0.30 | $1.20 | 1M |
- | **MiMo V2.5** | $0.40 | $2.00 | 1M |
- | **Qwen 3.7 Plus** | $0.40 | $1.60 | 1M |
- | **MiMo V2.5 Pro** | $1.00 | $3.00 | 1M |
- | **GLM 5V Turbo** | $1.20 | $4.00 | 200K |
- | **GLM 5.1** | $1.40 | $4.40 | 200K |
- | **Qwen 3.7 Max** | $2.50 | $7.50 | 1M |
+ All prices are **USD per 1M tokens** (cache miss). 1 AI credit = $0.01. **MiniMax M3** figures reflect a permanent 50% off list price — see the model doc for the full rate card. Context window ¹ covers input + output combined.
+
+ | Model | Input | Output | Context ¹ |
+ | ---------------------------- | ----- | ------ | --------- |
+ | **MiMo V2 Flash** 🏆 | $0.10 | $0.30 | 256K |
+ | **DeepSeek V4 Flash** 🏆 | $0.14 | $0.28 | 1M |
+ | **Kimi K2.6** (non-thinking) | $0.16 | $0.95 | 262K |
+ | **Kimi K2.6** (thinking) | $0.16 | $4.00 | 262K |
+ | **Kimi K2.7 Code** | $0.19 | $4.00 | 262K |
+ | **MiniMax M3** | $0.30 | $1.20 | 1M |
+ | **MiMo V2.5** | $0.40 | $2.00 | 1M |
+ | **Qwen 3.7 Plus** | $0.40 | $1.60 | 1M |
+ | **MAI-Code-1-Flash** | $0.75 | $4.50 | |
+ | **MiMo V2.5 Pro** | $1.00 | $3.00 | 1M |
+ | **GLM 5V Turbo** | $1.20 | $4.00 | 200K |
+ | **GLM 5.1** | $1.40 | $4.40 | 200K |
+ | **Qwen 3.7 Max** | $2.50 | $7.50 | 1M |
 
  For the full pricing comparison (cached rates, full Copilot roster, footnotes, sources) see [docs/pricing.md](docs/pricing.md). For a copy-paste config containing **all providers at once**, see [docs/example-config.md](docs/example-config.md).
 
@@ -159,11 +159,11 @@ All can be set in a `.env` file at the repo root (both proxies `import 'dotenv/c
 
  ### Thinking mode
 
- | Model | Turn type | Behavior |
- | ----------- | ------------ | ----------------------------------------------------------- |
- | K2.5 / K2.6 | Plain chat | Thinking enabled, `temperature: 1`, `top_p: 0.95` |
+ | Model | Turn type | Behavior |
+ | ----------- | ------------ | -------------------------------------------------------------------------- |
+ | K2.5 / K2.6 | Plain chat | Thinking enabled, `temperature: 1`, `top_p: 0.95` |
  | K2.5 / K2.6 | Tool-enabled | `thinking: { type: "disabled" }` forced, `temperature: 0.6`, `top_p: 0.95` |
- | K2.7 Code | All turns | Always-thinking, `temperature: 1`, `top_p: 0.95` |
+ | K2.7 Code | All turns | Always-thinking, `temperature: 1`, `top_p: 0.95` |
 
  ### Capabilities
 
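The thinking-mode table in this hunk amounts to a small decision rule. A minimal TypeScript sketch of that rule follows. The function name `kimiOverrides`, the family labels, and the choice to omit the `thinking` field whenever thinking stays enabled are assumptions made for illustration, not the proxy's actual code.

```typescript
// Hypothetical labels for the model families named in the table.
type KimiFamily = "k2.5" | "k2.6" | "k2.7-code";

interface RequestOverrides {
  temperature: number;
  top_p: number;
  // Present only when thinking is forced off; omitted otherwise (assumption).
  thinking?: { type: "disabled" };
}

function kimiOverrides(family: KimiFamily, hasTools: boolean): RequestOverrides {
  // K2.5 / K2.6 tool-enabled turns: thinking forced off, lower temperature.
  if (family !== "k2.7-code" && hasTools) {
    return { thinking: { type: "disabled" }, temperature: 0.6, top_p: 0.95 };
  }
  // K2.5 / K2.6 plain chat, and every K2.7 Code turn: thinking stays on.
  return { temperature: 1, top_p: 0.95 };
}

console.log(kimiOverrides("k2.6", true));
console.log(kimiOverrides("k2.7-code", true));
```

Keeping the rule in one pure function makes the three table rows easy to check against each other; how the real proxy detects a tool-enabled turn is not shown in this diff.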
@@ -4,15 +4,15 @@
 
  ## At a Glance
 
- | Field | Value |
- | ---------------------- | ------------------------------------------------ |
- | Mode | **Direct** (no proxy) |
- | Vision | ✅ Yes (`mimo-v2.5` only) |
- | Tool calling | ✅ Yes (with `thinking: disabled`) |
- | Context | 1M (V2.5 Pro / V2.5) / 256K (V2 Flash) |
+ | Field | Value |
+ | ---------------------- | --------------------------------------------------- |
+ | Mode | **Direct** (no proxy) |
+ | Vision | ✅ Yes (`mimo-v2.5` only) |
+ | Tool calling | ✅ Yes (with `thinking: disabled`) |
+ | Context | 1M (V2.5 Pro / V2.5) / 256K (V2 Flash) |
  | Max output | 131072 (V2.5 Pro) / 32768 (V2.5) / 65536 (V2 Flash) |
- | Required `requestBody` | `thinking: { type: "disabled" }` |
- | Endpoint | `https://api.xiaomimimo.com/v1/chat/completions` |
+ | Required `requestBody` | `thinking: { type: "disabled" }` |
+ | Endpoint | `https://api.xiaomimimo.com/v1/chat/completions` |
 
  ### Models at a glance
 
@@ -4,8 +4,8 @@
 
  ## At a Glance
 
- | Field | Value |
- | ------------------------------- | ------------------------------------------------------------------------- |
+ | Field | Value |
+ | ------------------------------- | -------------------------------------------------------------------------------- |
  | Mode | **Proxy** (local on `:3458`) **or** **Direct** (static `enable_thinking: false`) |
  | Vision | ✅ Yes (`qwen3.7-plus`) |
  | Tool calling | ✅ Yes |
package/docs/pricing.md CHANGED
@@ -22,23 +22,25 @@ All prices below are in **USD per 1M tokens** (non-cached). To convert to AI cre
 
  These are the models available through GitHub Copilot's model roster as of June 1, 2026.
 
- | Model | Provider | Tier | Input (per 1M) | Cached input | Output (per 1M) | Context window |
- | --------------------- | --------- | ----------- | -------------- | ------------ | --------------- | -------------- |
- | **Raptor mini** | GitHub | Versatile | $0.25 | $0.025 | $2.00 | 264K |
- | **Gemini 3 Flash** | Google | Lightweight | $0.50 | $0.05 | $3.00 | 173K |
- | **GPT-5.4 mini** | OpenAI | Lightweight | $0.75 | $0.075 | $4.50 | 400K |
- | **Claude Haiku 4.5** | Anthropic | Versatile | $1.00 | $0.10 | $5.00 | 160K |
- | **Gemini 2.5 Pro** | Google | Powerful | $1.25¹ | $0.125 | $10.00¹ | 173K |
- | **Gemini 3.5 Flash** | Google | Lightweight | $1.50 | $0.15 | $9.00 | 1M |
- | **GPT-5.3-Codex** | OpenAI | Powerful | $1.75 | $0.175 | $14.00 | 400K |
- | **Gemini 3.1 Pro** | Google | Powerful | $2.00¹ | $0.20 | $12.00¹ | 1M |
- | **GPT-5.4** | OpenAI | Versatile | $2.50 | $0.25 | $15.00 | 1M |
- | **Claude Sonnet 4.6** | Anthropic | Versatile | $3.00 | $0.30 | $15.00 | 1M |
- | **Claude Opus 4.8** | Anthropic | Powerful | $5.00 | $0.50 | $25.00 | 1M |
- | **Claude Opus 4.7** | Anthropic | Powerful | $5.00 | $0.50 | $25.00 | 1M |
- | **GPT-5.5** | OpenAI | Powerful | $5.00 | $0.50 | $30.00 | 1M |
+ | Model | Provider | Tier | Input (per 1M) | Cached input | Output (per 1M) | Context window |
+ | ---------------------- | --------- | ----------- | -------------- | ------------ | --------------- | -------------- |
+ | **Raptor mini** | GitHub | Versatile | $0.25 | $0.025 | $2.00 | 264K |
+ | **Gemini 3 Flash** | Google | Lightweight | $0.50 | $0.05 | $3.00 | 173K |
+ | **GPT-5.4 mini** | OpenAI | Lightweight | $0.75 | $0.075 | $4.50 | 400K |
+ | **MAI-Code-1-Flash** ² | Microsoft | Lightweight | $0.75 | $0.075 | $4.50 | |
+ | **Claude Haiku 4.5** | Anthropic | Versatile | $1.00 | $0.10 | $5.00 | 160K |
+ | **Gemini 2.5 Pro** | Google | Powerful | $1.25¹ | $0.125 | $10.00¹ | 173K |
+ | **Gemini 3.5 Flash** | Google | Lightweight | $1.50 | $0.15 | $9.00 | 1M |
+ | **GPT-5.3-Codex** | OpenAI | Powerful | $1.75 | $0.175 | $14.00 | 400K |
+ | **Gemini 3.1 Pro** | Google | Powerful | $2.00¹ | $0.20 | $12.00¹ | 1M |
+ | **GPT-5.4** | OpenAI | Versatile | $2.50 | $0.25 | $15.00 | 1M |
+ | **Claude Sonnet 4.6** | Anthropic | Versatile | $3.00 | $0.30 | $15.00 | 1M |
+ | **Claude Opus 4.8** | Anthropic | Powerful | $5.00 | $0.50 | $25.00 | 1M |
+ | **Claude Opus 4.7** | Anthropic | Powerful | $5.00 | $0.50 | $25.00 | 1M |
+ | **GPT-5.5** | OpenAI | Powerful | $5.00 | $0.50 | $30.00 | 1M |
 
  ¹ Gemini 3.1 Pro and 2.5 Pro pricing applies to prompts ≤200K tokens.
+ ² MAI-Code-1-Flash is a continuously improving model — performance and behavior may evolve over time as new checkpoints are released.
 
  ## Custom-endpoint alternatives
 
@@ -59,6 +61,7 @@ These are the models available through GitHub Copilot's model roster as of June
 
  > **Notes:**
  >
+ > - **MAI-Code-1-Flash** is a continuously improving model — performance and behavior may evolve over time as new checkpoints are released.
  > - **DeepSeek V4** input pricing shown is the **cache miss** price. Cache hits are significantly cheaper ($0.0028/M for Flash, $0.003625/M for Pro).
  > - **MiMo** input pricing shown is the **cache miss** price. Cache hits are 5× cheaper for V2.5 Pro ($0.20/M) and V2.5 ($0.08/M), and 10× cheaper for V2 Flash ($0.01/M).
  > - **Gemini 3 Flash** is priced at $0.50/MTok input (text/image/video) and $1.00/MTok input for audio.
@@ -90,6 +93,7 @@ For a typical coding session (~10K input + ~2K output tokens per turn, 50 turns)
  | Kimi K2.7 Code | ~$0.50 |
  | Gemini 3 Flash | ~$0.55 |
  | MiMo V2.5 Pro | ~$0.80 |
+ | MAI-Code-1-Flash | ~$0.83 |
  | GPT-5.4 mini | ~$0.83 |
  | Claude Haiku 4.5 | ~$1.00 |
  | Qwen 3.7 Max | ~$1.33 |
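The new MAI-Code-1-Flash row can be checked against the workload stated in the hunk header (~10K input + ~2K output tokens per turn, 50 turns). The sketch below assumes no cache hits and uses the list prices from the roster table ($0.75 input, $4.50 output per 1M tokens); `sessionCostUsd` and `usdToCredits` are hypothetical helper names, not part of the package.

```typescript
interface Price {
  inputPerM: number;  // USD per 1M input tokens (cache miss)
  outputPerM: number; // USD per 1M output tokens
}

// Cost of one session: per-turn token counts times the number of turns,
// priced per 1M tokens. Defaults mirror the workload in the hunk header.
function sessionCostUsd(
  price: Price,
  inputPerTurn: number = 10000,
  outputPerTurn: number = 2000,
  turns: number = 50,
): number {
  const inputTokens = inputPerTurn * turns;   // 500,000 with the defaults
  const outputTokens = outputPerTurn * turns; // 100,000 with the defaults
  return (inputTokens * price.inputPerM + outputTokens * price.outputPerM) / 1000000;
}

// 1 AI credit = $0.01, so multiply USD by 100.
function usdToCredits(usd: number): number {
  return usd * 100;
}

const maiCodeFlash: Price = { inputPerM: 0.75, outputPerM: 4.5 };
const cost = sessionCostUsd(maiCodeFlash); // $0.375 input + $0.45 output
console.log(cost, usdToCredits(cost));
```

With these inputs the total is $0.825, which the table lists as ~$0.83. GPT-5.4 mini has the same list prices, which is why the two rows show the same estimate.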
@@ -104,9 +108,10 @@ For a typical coding session (~10K input + ~2K output tokens per turn, 50 turns)
 
  > **How long does 7,000 credits last?** A Pro+ subscriber running 50-turn sessions could afford roughly **13 GPT-5.5 sessions**, **23 Opus sessions**, or **212 Raptor mini sessions** per month — or mix and match. (Multiply session cost by 100 to convert to AI credits.)
 
- > Prices last verified: June 9, 2026. Always check the official pages for the latest rates:
+ > Prices last verified: June 14, 2026. Always check the official pages for the latest rates:
  >
  > - [GitHub Copilot models & pricing](https://docs.github.com/en/copilot/reference/copilot-billing/models-and-pricing)
+ > - [Microsoft MAI-Code-1-Flash model card](https://docs.github.com/en/copilot/reference/ai-models/model-comparison#task-general-purpose-coding-and-writing)
  > - [OpenAI pricing](https://openai.com/api/pricing/)
  > - [Anthropic (Claude) pricing](https://platform.claude.com/docs/en/about-claude/pricing)
  > - [Google Gemini pricing](https://ai.google.dev/pricing)
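The credit conversion quoted in this hunk (multiply session cost by 100, since 1 AI credit = $0.01) can be turned into a rough sessions-per-month figure. In this sketch the helper name `sessionsPerBudget` is hypothetical, and the $5.50 GPT-5.5 session cost is an assumption derived from the roster prices ($5.00 input, $30.00 output per 1M tokens) for 50 turns of ~10K input + ~2K output tokens with no cache hits.

```typescript
// How many sessions a monthly credit allowance covers.
// creditsPerUsd defaults to 100 because 1 AI credit = $0.01.
function sessionsPerBudget(
  budgetCredits: number,
  sessionCostUsd: number,
  creditsPerUsd: number = 100,
): number {
  const sessionCredits = sessionCostUsd * creditsPerUsd;
  return budgetCredits / sessionCredits;
}

// Assumed GPT-5.5 session cost: 0.5M input * $5.00 + 0.1M output * $30.00 = $5.50.
const gpt55Sessions = sessionsPerBudget(7000, 5.5);
console.log(Math.round(gpt55Sessions)); // 13
```

The result, about 12.7, is consistent with the "roughly 13 GPT-5.5 sessions" in the note. The other figures in that note may rest on different assumptions (for example cached input rates), so they are not reproduced here.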
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "copilot-custom-endpoint",
- "version": "1.4.0",
+ "version": "1.4.2",
  "description": "Local proxies for VS Code Copilot custom endpoints — Kimi K2 & Qwen 3.x",
  "license": "MIT",
  "type": "module",
@@ -55,4 +55,4 @@
  "dependencies": {
  "dotenv": "^17.4.2"
  }
- }
+ }