npm - copilot-custom-endpoint - Versions diffs - 1.3.2 → 1.3.3 - Mend

copilot-custom-endpoint 1.3.2 → 1.3.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/README.md CHANGED

@@ -25,7 +25,7 @@ That's it. No code, no servers to manage (unless the model specifically needs th
 | **MiMo V2.5**               | Xiaomi    | No                     | ✅           | [Setup](docs/models/mimo.md)                                                                       |
 | **MiMo V2.5 Pro**           | Xiaomi    | No                     | ❌           | [Setup](docs/models/mimo.md)                                                                       |
 | **Kimi K2.6**               | Moonshot  | **Yes**                | ✅           | [Setup](docs/models/kimi.md)                                                                       |
-| **Qwen 3.6 Plus**           | DashScope | Optional               | ✅           | [Setup](docs/models/qwen.md)                                                                       |
+| **Qwen 3.7 Plus**           | DashScope | Optional               | ✅           | [Setup](docs/models/qwen.md)                                                                       |
 | **Qwen 3.7 Max**            | DashScope | Optional               | ❌           | [Setup](docs/models/qwen.md)                                                                       |
 | **MiniMax M3**              | MiniMax   | No                     | ✅           | [Setup](docs/models/minimax.md)                                                                    |
 | **GLM 5.1**                 | Z.ai      | No                     | ❌           | [Setup](docs/models/glm.md)                                                                        |
@@ -97,7 +97,7 @@ All prices are **USD per 1M tokens** (cache miss). 1 AI credit = $0.01.
 | **DeepSeek V4 Flash** 🏆     | $0.14 | $0.28  | 1M      |
 | **Kimi K2.6** (non-thinking) | $0.16 | $0.95  | 256K    |
 | **MiMo V2.5**                | $0.40 | $2.00  | 1M      |
-| **Qwen 3.6 Plus**            | $0.50 | $3.00  | 1M      |
+| **Qwen 3.7 Plus**            | $0.40 | $1.60  | 1M      |
 | **MiniMax M3**               | $0.60 | $2.40  | 1M      |
 | **MiMo V2.5 Pro**            | $1.00 | $3.00  | 1M      |
 | **GLM 5V Turbo**             | $1.20 | $4.00  | 200K    |
@@ -117,7 +117,7 @@ VS Code's built-in `view_image` tool only accepts **static images** (PNG, JPG, G
 **Video Context MCP** is a small MCP server that bridges that gap. It works with **GitHub Copilot, Cursor, and Claude Code** out of the box, and:
 - **Extracts frames** from local files or remote URLs (no `ffmpeg` gymnastics required).
-- **Routes them through a multi-provider fallback chain** — `Gemini → GLM-4.6V → Qwen3.6 → Kimi K2.6 → MiMo-V2.5` — so a single `GLM 5V Turbo` rate-limit hiccup doesn't kill your session.
+- **Routes them through a multi-provider fallback chain** — `Gemini → GLM-4.6V-flash → Qwen3.6-plus → Kimi K2.6 → MiMo-V2.5` — so a single `GLM 5V Turbo` rate-limit hiccup doesn't kill your session.
 - **Answers natural-language questions** about the video grounded in actual frames: "what does the speaker click in the last 30 seconds?", "summarize the demo", "find the frame where the error appears".
 - **Extras:** timestamp search, audio transcription with speaker diarization, and video metadata (resolution, duration, codec).

package/docs/example-config.md CHANGED

@@ -24,8 +24,8 @@ Here's a complete, real-world `chatLanguageModels.json` that combines **all the
         }
       },
       {
-        "id": "qwen3.6-plus",
-        "name": "Qwen 3.6 Plus",
+        "id": "qwen3.7-plus",
+        "name": "Qwen 3.7 Plus",
         "url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
         "toolCalling": true,
         "vision": true,
@@ -195,7 +195,7 @@ Here's a complete, real-world `chatLanguageModels.json` that combines **all the
 If you only need one provider, jump straight to its setup guide:
 - [Kimi K2.6](kimi.md)
-- [Qwen 3.6 Plus / 3.7 Max](qwen.md)
+- [Qwen 3.7 Plus / 3.7 Max](qwen.md)
 - [Xiaomi MiMo (V2.5 / V2.5 Pro / V2 Flash)](mimo.md)
 - [MiniMax M3](minimax.md)
 - [GLM (5.1 / 4.7 Flash / 5V Turbo)](glm.md)
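For anyone carrying a local `chatLanguageModels.json` over from 1.3.2, the rename in this file comes down to a two-field change per entry. A minimal sketch, assuming the model entries are plain objects with `id` and `name` fields as shown in the hunk above; the `renameModel` helper and the array-of-entries shape are illustrative and not part of the package:

```javascript
// Hypothetical helper: returns a copy of the entries with one model renamed.
// Assumes each entry is an object carrying `id` and `name`, as in the hunk above.
function renameModel(entries, fromId, toId, toName) {
  return entries.map((entry) =>
    entry.id === fromId ? { ...entry, id: toId, name: toName } : entry
  );
}

const oldEntries = [
  {
    id: "qwen3.6-plus",
    name: "Qwen 3.6 Plus",
    url: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
    toolCalling: true,
    vision: true,
  },
];

const newEntries = renameModel(oldEntries, "qwen3.6-plus", "qwen3.7-plus", "Qwen 3.7 Plus");
console.log(newEntries[0].id, "/", newEntries[0].name);
```

The helper copies rather than mutates, so the original entries stay available for comparison; all other fields (`url`, `toolCalling`, `vision`) pass through unchanged, matching what the diff shows.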

package/docs/models/glm.md CHANGED

@@ -177,7 +177,7 @@ Config file location:
 - **Max 128 functions** per request.
 - **Tool stream** (`tool_stream: true`) is supported on the `glm-4.6v` family and above for streaming tool-call deltas.
 - **Vision** on `glm-4.6v`, `glm-4.6v-flash`, and `glm-5v-turbo` using the OpenAI `image_url` content-part format. External URLs and base64 data URIs both work.
-- **Video input** on `glm-5v-turbo` — the model natively accepts video (Input Modality: **Video / Image / Text / File**). Use a public video URL in an `image_url` content part via direct API call; VS Code's chat UI does not currently forward video attachments to the model. For a turnkey VS Code integration that bridges the gap (extracts frames, routes them to GLM or a fallback provider, and answers natural-language questions about the video), see [**Video Context MCP**](https://www.videocontextmcp.com/) — an MCP server that gives Copilot/Cursor/Claude Code video understanding via the `glm-4.6v` provider and a multi-provider fallback chain (Gemini → GLM-4.6V → Qwen3.6 → Kimi K2.6 → MiMo-V2.5).
+- **Video input** on `glm-5v-turbo` — the model natively accepts video (Input Modality: **Video / Image / Text / File**). Use a public video URL in an `image_url` content part via direct API call; VS Code's chat UI does not currently forward video attachments to the model. For a turnkey VS Code integration that bridges the gap (extracts frames, routes them to GLM or a fallback provider, and answers natural-language questions about the video), see [**Video Context MCP**](https://www.videocontextmcp.com/) — an MCP server that gives Copilot/Cursor/Claude Code video understanding via the `glm-4.6v` provider and a multi-provider fallback chain (Gemini → GLM-4.6V → Qwen 3.7 Plus → Kimi K2.6 → MiMo-V2.5).
 - **Native multimodal tool calling** on `glm-4.6v` (and inherited by `glm-5v-turbo`) — images, screenshots, and document pages can be passed directly as tool parameters and tool results can be consumed visually.
 - **Built-in web search** is exposed as a tool type `web_search` (different from `function`).
 - **Context caching** is automatic — the API returns `usage.prompt_tokens_details.cached_tokens` on cache hits; cache writes are currently free of charge.
@@ -343,7 +343,7 @@ This file is the **research record and the user-facing setup guide**. The implem
 ## Companion tools
-- [**Video Context MCP**](https://www.videocontextmcp.com/) — an MCP server that gives AI coding assistants (GitHub Copilot, Cursor, Claude Code) the ability to **understand video content** via natural language. Extracts frames from local or remote videos, routes them through a multi-provider fallback chain (**Gemini → GLM-4.6V → Qwen3.6 → Kimi K2.6 → MiMo-V2.5**), and returns answers grounded in actual video frames. Also handles summarization, timestamp search, audio transcription with speaker diarization, and video metadata. Works around the limitation that VS Code's built-in `view_image` tool only accepts static images — so it lets `glm-5v-turbo`'s native video support actually be exercised end-to-end from inside VS Code.
+- [**Video Context MCP**](https://www.videocontextmcp.com/) — an MCP server that gives AI coding assistants (GitHub Copilot, Cursor, Claude Code) the ability to **understand video content** via natural language. Extracts frames from local or remote videos, routes them through a multi-provider fallback chain (**Gemini → GLM-4.6V-flash → Qwen 3.6 Plus → Kimi K2.6 → MiMo-V2.5**), and returns answers grounded in actual video frames. Also handles summarization, timestamp search, audio transcription with speaker diarization, and video metadata. Works around the limitation that VS Code's built-in `view_image` tool only accepts static images — so it lets `glm-5v-turbo`'s native video support actually be exercised end-to-end from inside VS Code.
 ## References

package/docs/models/qwen.md CHANGED

@@ -1,13 +1,13 @@
 # Qwen (DashScope) — VS Code Custom Endpoint Setup Guide
-> **TL;DR:** Direct path works for both `qwen3.6-plus` (vision) and `qwen3.7-max` (text-only) without a proxy. The optional `proxy/qwen-proxy.mjs` adds dynamic thinking suppression: reasoning stays ON in plain chat but turns OFF automatically when tools are invoked. Pick the mode that matches your tradeoff.
+> **TL;DR:** Direct path works for `qwen3.7-plus` (vision) and `qwen3.7-max` (text-only) without a proxy. The optional `proxy/qwen-proxy.mjs` adds dynamic thinking suppression: reasoning stays ON in plain chat but turns OFF automatically when tools are invoked. Pick the mode that matches your tradeoff.
 ## At a Glance
 | Field                           | Value                                                                     |
 | ------------------------------- | ------------------------------------------------------------------------- |
 | Mode                            | **Direct** (no proxy) **or** **Proxy** (optional, for dynamic thinking)   |
-| Vision                          | ✅ Yes (`qwen3.6-plus` only)                                              |
+| Vision                          | ✅ Yes (`qwen3.7-plus`)                                                   |
 | Tool calling                    | ✅ Yes                                                                    |
 | Context                         | 1M                                                                        |
 | Required `requestBody` (direct) | `enable_thinking: false`                                                  |
@@ -19,16 +19,16 @@
 | Model          | Vision | Role                                   |
 | -------------- | ------ | -------------------------------------- |
-| `qwen3.6-plus` | ✅ Yes | Primary model with image understanding |
+| `qwen3.7-plus` | ✅ Yes | Primary model with image understanding |
 | `qwen3.7-max`  | ❌ No  | Larger text-only model                 |
-> The snapshot `qwen3.6-plus-2026-04-02` is also available; the floating `qwen3.6-plus` alias is preferred.
+> The snapshot `qwen3.7-plus-2026-05-26` is also available; the floating `qwen3.7-plus` alias is preferred.
 ## Quick Start — Direct Path (Recommended for Simplicity)
 1. **Edit `chatLanguageModels.json`** — add the Qwen block from [Setup § Direct](#direct-path) below.
 2. **Set your `DASHSCOPE_API_KEY`** via Command Palette → **Chat: Manage Language Models**.
-3. **Restart VS Code** and pick "Qwen 3.6 Plus" or "Qwen 3.7 Max".
+3. **Restart VS Code** and pick "Qwen 3.7 Plus" or "Qwen 3.7 Max".
 ## Quick Start — With Proxy (Dynamic Thinking)
@@ -70,8 +70,8 @@ DashScope is region-specific — your API key only works on the endpoint it was
       }
     },
     {
-      "id": "qwen3.6-plus",
-      "name": "Qwen 3.6 Plus",
+      "id": "qwen3.7-plus",
+      "name": "Qwen 3.7 Plus",
       "url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
       "toolCalling": true,
       "vision": true,
@@ -136,8 +136,8 @@ Expected response:
       "streaming": true
     },
     {
-      "id": "qwen3.6-plus",
-      "name": "Qwen 3.6 Plus",
+      "id": "qwen3.7-plus",
+      "name": "Qwen 3.7 Plus",
       "url": "http://127.0.0.1:3458/v1/chat/completions",
       "toolCalling": true,
       "vision": true,
@@ -190,16 +190,18 @@ The Qwen3 hybrid-thinking models default to `enable_thinking: true`, producing `
 | Proxy path          | Thinking ON (default preserved) | Thinking OFF (auto-injected)  |
 | No config (default) | Thinking ON                     | Risk: history may be rejected |
-### Vision (`qwen3.6-plus` only)
+### Vision (`qwen3.7-plus`)
 - Image input via OpenAI-compatible `content` array format (base64 data URIs).
 - **External image URLs may fail** if DashScope's servers cannot reach them — base64-encoded images work reliably.
+- **Image attachment behavior**: Unlike some other models, Qwen may fail to read images that are directly dragged and dropped into the Copilot Chat. If this happens, provide the absolute file path to the image (e.g., `c:\path\to\image.png`) in your prompt as a reliable workaround.
+- **Pricing**: **$0.40 / $1.60 per 1M input/output (≤ 256K)** and **$1.20 / $4.80 per 1M (> 256K)**.
 ### Capabilities
 - Streaming (SSE, `data: [DONE]` terminator).
 - Tool calling with `tools` array and `tool_calls` response.
-- Vision (image input) on `qwen3.6-plus` only.
+- Vision (image input) on `qwen3.7-plus`.
 - Non-OpenAI extras: `enable_thinking`, `thinking_budget`, `enable_search` (via `extra_body`).
 ## Troubleshooting
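The vision notes in the hunk above recommend base64 data URIs over external URLs. A minimal sketch of building such a message, assuming the OpenAI-style `image_url` content part on DashScope's compatible-mode endpoint; `buildImageMessage` is a hypothetical helper, and the bytes below are a placeholder (the PNG signature only), not a real image:

```javascript
// Hypothetical helper: wraps image bytes and a prompt into one user message
// using the OpenAI-compatible `content` array with a base64 data URI.
function buildImageMessage(imageBytes, mimeType, prompt) {
  const base64 = Buffer.from(imageBytes).toString("base64");
  return {
    role: "user",
    content: [
      { type: "image_url", image_url: { url: `data:${mimeType};base64,${base64}` } },
      { type: "text", text: prompt },
    ],
  };
}

// Placeholder bytes; in practice the bytes would come from reading the image file.
const pngSignature = Uint8Array.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const visionMessage = buildImageMessage(pngSignature, "image/png", "Describe this image.");
console.log(visionMessage.content[0].image_url.url);
```

Embedding the bytes keeps the request independent of whether DashScope's servers can reach the image host, which is the failure mode the diff describes for external URLs.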
@@ -220,7 +222,7 @@ For the cross-provider comparison, see [docs/pricing.md](../pricing.md). DashSco
 | Model          | Input (≤ 256K tokens) | Input (> 256K tokens) | Output (≤ 256K tokens) | Output (> 256K tokens) |
 | -------------- | --------------------- | --------------------- | ---------------------- | ---------------------- |
-| `qwen3.6-plus` | $0.50 / 1M            | $2.00 / 1M            | $3.00 / 1M             | $6.00 / 1M             |
+| `qwen3.7-plus` | $0.40 / 1M            | $1.20 / 1M            | $1.60 / 1M             | $4.80 / 1M             |
 | `qwen3.7-max`  | $2.50 / 1M (≤ 1M)     | —                     | $7.50 / 1M (≤ 1M)      | —                      |
 > **Free quota:** DashScope offers 1M input + 1M output tokens per model, valid for 90 days after activating Model Studio.
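The tiered rates in the table above can be turned into a per-request estimate. A minimal sketch, assuming the tier is selected by the request's input token count (the diff does not say how DashScope picks the tier) and that "256K" means 256,000 tokens; `requestCostUsd` is a hypothetical helper:

```javascript
// Rates for qwen3.7-plus from the table above, in USD per 1M tokens.
const QWEN37_PLUS_RATES = {
  thresholdTokens: 256_000, // assumption: "256K" read as 256,000 tokens
  low: { input: 0.4, output: 1.6 }, // input <= threshold
  high: { input: 1.2, output: 4.8 }, // input > threshold
};

// Picks the tier from the input size, then prices input and output at that tier.
function requestCostUsd(rates, inputTokens, outputTokens) {
  const tier = inputTokens <= rates.thresholdTokens ? rates.low : rates.high;
  return (inputTokens * tier.input + outputTokens * tier.output) / 1_000_000;
}

console.log(requestCostUsd(QWEN37_PLUS_RATES, 100_000, 2_000).toFixed(4)); // 0.0432
console.log(requestCostUsd(QWEN37_PLUS_RATES, 300_000, 2_000).toFixed(4)); // 0.3696
```

Under these assumptions, crossing the 256K boundary triples the rate on both sides, so a 300K-token request costs far more than three 100K-token requests.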
@@ -266,7 +268,7 @@ Both work — pick based on your preference:
 | Streaming in VS Code                         | ✅     | Token-by-token streaming confirmed                                     |
 | Tool / agent use in VS Code                  | ✅     | Browser tool invoked successfully                                      |
-#### Direct-path validation — `qwen3.6-plus`
+#### Direct-path validation — `qwen3.7-plus`
 | Capability                                   | Result | Notes                                                                        |
 | -------------------------------------------- | ------ | ---------------------------------------------------------------------------- |
@@ -275,7 +277,7 @@ Both work — pick based on your preference:
 | Tool-enabled chat (`enable_thinking: false`) | ✅     | Clean `tool_calls`, no `reasoning_content`, 25 tokens                        |
 | Vision: image + text (curl, base64)          | ✅     | Model correctly identified a 10×10 test pattern; `image_tokens: 66`          |
 | Vision: image + text (curl, external URL)    | ❌     | `Failed to download multimodal content` — DashScope couldn't reach Wikipedia |
-| Model appears in VS Code picker              | ✅     | "Agent \| Qwen 3.6 Plus" confirmed                                           |
+| Model appears in VS Code picker              | ✅     | "Agent \| Qwen 3.7 Plus" confirmed                                           |
 | Plain chat in VS Code                        | ✅     | Streaming output confirmed                                                   |
 | Streaming in VS Code                         | ✅     | Token-by-token streaming confirmed                                           |
 | Tool / agent use in VS Code                  | ✅     | Browser tool invoked to open Qwen docs and Google                            |
@@ -283,17 +285,17 @@ Both work — pick based on your preference:
 #### Intermittent `ERR_CONNECTION_RESET` investigation
-A `net::ERR_CONNECTION_RESET` was observed once during `qwen3.6-plus` validation, but did not reproduce on the same machine outside VS Code:
+A `net::ERR_CONNECTION_RESET` was observed once during `qwen3.7-plus` validation, but did not reproduce on the same machine outside VS Code:
 - Direct `curl` POST to DashScope Singapore → HTTP 200.
 - Direct Node.js HTTPS POST → HTTP 200.
-- Direct Node.js HTTPS **streaming** POST with full `qwen3.6-plus.md` content embedded → HTTP 200.
+- Direct Node.js HTTPS **streaming** POST with full `qwen3.7-plus.md` content embedded → HTTP 200.
 Conclusion: not a DashScope or Qwen model incompatibility. Evidence points to an intermittent VS Code / Electron transport issue or transient network interruption local to the editor process.
 ### Final verdict
-| Criterion              | `qwen3.7-max`  | `qwen3.6-plus` |
+| Criterion              | `qwen3.7-max`  | `qwen3.7-plus` |
 | ---------------------- | -------------- | -------------- |
 | Plain chat             | ✅             | ✅             |
 | Streaming chat         | ✅             | ✅             |
@@ -305,7 +307,7 @@ Conclusion: not a DashScope or Qwen model incompatibility. Evidence points to an
 - GitHub Copilot inline completions and semantic-search features remain outside scope.
 - One intermittent VS Code-side `net::ERR_CONNECTION_RESET` was observed — not reproducible externally, treated as transient transport issue.
-- External image URLs may fail if DashScope's servers cannot reach them; base64-encoded images work reliably (`qwen3.6-plus`).
+- External image URLs may fail if DashScope's servers cannot reach them; base64-encoded images work reliably.
 - Vision is not supported on `qwen3.7-max` (text-generation model).
 - `maxInputTokens` / `maxOutputTokens` not yet confirmed from official DashScope documentation.
 - API keys are region-specific — a key created for one regional endpoint will not work with another.

package/docs/pricing.md CHANGED

@@ -50,7 +50,7 @@ These are the models available through GitHub Copilot's model roster as of June
 | **DeepSeek V4 Pro**   | DeepSeek  | $1.74                         | $3.48                                   | 1M             |
 | **MiMo V2.5**         | Xiaomi    | $0.40                         | $2.00                                   | 1M             |
 | **MiMo V2.5 Pro**     | Xiaomi    | $1.00                         | $3.00                                   | 1M             |
-| **Qwen 3.6 Plus**     | DashScope | $0.50 (≤256K) / $2.00 (>256K) | $3.00 (≤256K) / $6.00 (>256K)           | 1M             |
+| **Qwen 3.7 Plus**     | DashScope | $0.40 (≤256K) / $1.20 (>256K) | $1.60 (≤256K) / $4.80 (>256K)           | 1M             |
 | **Qwen 3.7 Max**      | DashScope | $2.50 (≤1M)                   | $7.50 (≤1M)                             | 1M             |
 | **MiniMax M3**        | MiniMax   | $0.60 (≤512K) / $1.20 (>512K) | $2.40 (≤512K) / $4.80 (>512K)           | 1M             |
 | **GLM 4.7 Flash**     | Z.ai      | Free (rate-limited ¹)         | Free (rate-limited ¹)                   | 200K           |
@@ -83,8 +83,8 @@ For a typical coding session (~10K input + ~2K output tokens per turn, 50 turns)
 | Kimi K2.6 (non-thinking) | ~$0.18                 | —                    |
 | MiMo V2.5                | ~$0.40                 | —                    |
 | Kimi K2.6 (thinking)     | ~$0.48                 | —                    |
+| Qwen 3.7 Plus            | ~$0.36                 | —                    |
 | Gemini 3 Flash           | ~$0.55                 | ~55                  |
-| Qwen 3.6 Plus            | ~$0.55                 | —                    |
 | MiniMax M3               | ~$0.54                 | —                    |
 | MiMo V2.5 Pro            | ~$0.80                 | —                    |
 | GLM 4.7 Flash (free)     | ~$0.00 ¹               | —                    |
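The Qwen 3.7 Plus figure added in this hunk follows from the session profile stated in the hunk header (~10K input + ~2K output tokens per turn, 50 turns) at the ≤256K rates. A short check of that arithmetic, with `sessionCostUsd` as a hypothetical helper name:

```javascript
// Estimated session cost in USD; prices are USD per 1M tokens.
function sessionCostUsd(inputPrice, outputPrice, turns = 50, inputPerTurn = 10_000, outputPerTurn = 2_000) {
  const inputTokens = turns * inputPerTurn; // 500,000 with the defaults
  const outputTokens = turns * outputPerTurn; // 100,000 with the defaults
  return (inputTokens * inputPrice + outputTokens * outputPrice) / 1_000_000;
}

console.log(sessionCostUsd(0.4, 1.6).toFixed(2)); // Qwen 3.7 Plus: 0.36
console.log(sessionCostUsd(0.5, 3.0).toFixed(2)); // Qwen 3.6 Plus (removed row): 0.55
```

Both results match the table: the removed row's ~$0.55 and the added row's ~$0.36.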

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "copilot-custom-endpoint",
-  "version": "1.3.2",
+  "version": "1.3.3",
   "description": "Local proxies for VS Code Copilot custom endpoints — Kimi K2 & Qwen 3.x",
   "license": "MIT",
   "type": "module",

package/proxy/qwen-proxy.mjs CHANGED

@@ -5,7 +5,7 @@ import { createProxy } from '../lib/create-proxy.mjs'
 /**
  * Supported model scope for this proxy:
- * - Validated with `qwen3.6-plus` and `qwen3.7-max`.
+ * - Validated with `qwen3.7-plus` and `qwen3.7-max`.
  * - Expected to work for any Qwen3 hybrid-thinking model (qwen3-* series)
  *   that supports the `enable_thinking` top-level field on DashScope's
  *   OpenAI-compatible surface.
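The behavior this proxy is documented to provide (thinking stays on in plain chat and is switched off when tools are invoked) can be illustrated with a small request-body transform. This is a sketch of the idea only, not the code in `qwen-proxy.mjs`; `suppressThinkingForTools` is a hypothetical name, and treating a non-empty `tools` array as the trigger is an assumption:

```javascript
// Hypothetical transform: returns the request body to forward upstream.
// If the request carries tools, force the top-level `enable_thinking: false`;
// otherwise return the body untouched so the model default applies.
function suppressThinkingForTools(body) {
  const hasTools = Array.isArray(body.tools) && body.tools.length > 0;
  return hasTools ? { ...body, enable_thinking: false } : body;
}

const plainChat = { model: "qwen3.7-plus", messages: [{ role: "user", content: "hi" }] };
const toolChat = {
  ...plainChat,
  tools: [{ type: "function", function: { name: "open_page", parameters: { type: "object" } } }],
};

console.log(suppressThinkingForTools(plainChat).enable_thinking); // undefined
console.log(suppressThinkingForTools(toolChat).enable_thinking); // false
```

Leaving plain-chat requests untouched mirrors the "default preserved" row in the qwen.md table, while tool-enabled requests get the same `enable_thinking: false` that the direct path sets statically.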