video-context-mcp-server (npm) 1.2.2 → 1.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/README.md +7 -7
package/dist/generated/version.d.ts +1 -1
package/dist/generated/version.js +1 -1
package/dist/services/providerRouter.js +1 -1
package/dist/services/qwenClient.d.ts +1 -1
package/dist/services/qwenClient.js +3 -3
package/dist/tools/schemas.js +3 -3
package/package.json +1 -1

package/README.md

@@ -15,7 +15,7 @@ An MCP server that gives coding assistants (GitHub Copilot, Cursor, Claude Code)
 - 🎙️ **Audio Transcription** — Transcribe speech with paragraph-level timestamps (`[MM:SS]`) or export as SRT/VTT subtitles and JSON using Deepgram, AssemblyAI, Groq/Whisper, or Gemini
 - 🔊 **Speaker Diarization** — Identify who said what (Deepgram and AssemblyAI)
 - 🔊 **Audio-Enhanced Analysis** — Auto-transcribes audio and injects transcripts into AI prompts for richer results (GLM/Kimi/Qwen)
-- 🔄 **Multi-Provider Support** — Choose between GLM-4.6V, Qwen3.6, Kimi K2.6, Gemini, or MiMo-V2.5
+- 🔄 **Multi-Provider Support** — Choose between GLM-4.6V, Qwen3.7, Kimi K2.6, Gemini, or MiMo-V2.5
 - 🎯 **Smart Video Handling** — Extracts keyframes from long videos to reduce token usage
 - 🗣️ **Text-to-Speech** ⚠️ _Experimental_ — Convert text to natural speech audio (MiniMax TTS)
 - 🖼️ **Image Generation** ⚠️ _Experimental_ — Generate images from text prompts (MiniMax image-01)
@@ -417,7 +417,7 @@ Set all keys to get the full fallback chain. The server will try Gemini first, t
 | ----------------------------------------- | ------------------- | --------------------------------------------------------------------------------------------- |
 | **Gemini 3.5 Flash** (default, free-tier) | `GEMINI_API_KEY`    | [Get key](https://aistudio.google.com/app/apikey)                                             |
 | **GLM-4.6V** (free-tier)                  | `Z_AI_API_KEY`      | [Get key](https://z.ai/manage-apikey/apikey-list)                                             |
-| **Qwen3.6** (paid)                        | `DASHSCOPE_API_KEY` | [Get key](https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=dashboard#/api-key) |
+| **Qwen3.7** (paid)                        | `DASHSCOPE_API_KEY` | [Get key](https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=dashboard#/api-key) |
 | **Kimi K2.6** (paid)                      | `MOONSHOT_API_KEY`  | [Get key](https://platform.kimi.ai)                                                           |
 | **MiMo-V2.5** (paid)                      | `MIMO_API_KEY`      | [Get key](https://platform.xiaomimimo.com/#/console/api-keys)                                 |
@@ -448,7 +448,7 @@ When an audio key is missing or an audio API call fails at runtime, tools automa
 ### Video Providers
-| Feature        | Gemini 3.5 Flash (default)                     | GLM-4.6V                                               | Qwen3.6                                                | Kimi K2.6                                      | MiMo-V2.5                                              |
+| Feature        | Gemini 3.5 Flash (default)                     | GLM-4.6V                                               | Qwen3.7                                                | Kimi K2.6                                      | MiMo-V2.5                                              |
 | -------------- | ---------------------------------------------- | ------------------------------------------------------ | ------------------------------------------------------ | ---------------------------------------------- | ------------------------------------------------------ |
 | Price          | Free tier available                            | Free tier available (GLM-4.6V-Flash)                   | $0.50 input / $3.00 output per 1M tokens               | $0.60 input / $3.00 output per 1M tokens       | $0.40 input / $2.00 output per 1M tokens               |
 | Video formats  | mp4, mpeg, mov, avi, flv, mpg, webm, wmv, 3gpp | mp4, avi, mov, wmv, webm, m4v                          | mp4, avi, mov, wmv, webm, m4v                          | mp4, mpeg, mov, avi, flv, mpg, webm, wmv, 3gpp | mp4, mov, avi, wmv                                     |
@@ -456,7 +456,7 @@ When an audio key is missing or an audio API call fails at runtime, tools automa
 | Max file size  | 2 GB                                           | ~12 MB base64 / frames fallback / **unlimited w/ S3↓** | ~10 MB base64 / frames fallback / **unlimited w/ S3↓** | 100 MB                                         | ~10 MB base64 / frames fallback / **unlimited w/ S3↓** |
 | Best for       | **Default** — free, no card required           | Free, no card required                                 | SOTA agentic coding                                    | Paid — broadest format support                 | Paid — thinking mode; multimodal                       |
-**Gemini 3.5 Flash** is the default — it offers a free tier with no credit card required, 1M context window, and 2 GB file support. **GLM-4.6V** is the second fallback — also free with no card required. **Qwen3.6** is a paid provider at $0.50 input / $3.00 output per 1M tokens with SOTA agentic coding performance. **Kimi K2.6** is a paid provider with the broadest video format support. **MiMo-V2.5** is Xiaomi's multimodal model with thinking mode support ($0.40 input / $2.00 output per 1M tokens).
+**Gemini 3.5 Flash** is the default — it offers a free tier with no credit card required, 1M context window, and 2 GB file support. **GLM-4.6V** is the second fallback — also free with no card required. **Qwen3.7** is a paid provider at $0.50 input / $3.00 output per 1M tokens with SOTA agentic coding performance. **Kimi K2.6** is a paid provider with the broadest video format support. **MiMo-V2.5** is Xiaomi's multimodal model with thinking mode support ($0.40 input / $2.00 output per 1M tokens).
 Set `VIDEO_MCP_DEFAULT_PROVIDER=gemini`, `glm`, `qwen`, `kimi`, or `mimo` to change the default provider used when a tool call does not pass `provider`. If a tool call includes `provider`, that per-call value takes precedence.
@@ -465,7 +465,7 @@ Set `VIDEO_MCP_DEFAULT_PROVIDER=gemini`, `glm`, `qwen`, `kimi`, or `mimo` to cha
 <details open>
 <summary><strong>Automatic S3 relay: bypass the 10 MB local file limit with GLM, Qwen, and MiMo</strong></summary>
-**GLM-4.6V**, **Qwen3.6**, and **MiMo-V2.5** all accept direct video URLs, but base64-encoding a local file caps out at **10–12 MB**. Above that limit, the server first tries to fall back to an upload-capable provider (Gemini or Kimi) if one is available, then falls back to **frame-based analysis** as a last resort. For the best results on large local videos, set `AWS_S3_BUCKET` — the server uploads the full video to S3 and passes a presigned URL to GLM, Qwen, and MiMo, bypassing the base64 limit entirely and taking priority over both fallbacks. No manual upload step needed.
+**GLM-4.6V**, **Qwen3.7**, and **MiMo-V2.5** all accept direct video URLs, but base64-encoding a local file caps out at **10–12 MB**. Above that limit, the server first tries to fall back to an upload-capable provider (Gemini or Kimi) if one is available, then falls back to **frame-based analysis** as a last resort. For the best results on large local videos, set `AWS_S3_BUCKET` — the server uploads the full video to S3 and passes a presigned URL to GLM, Qwen, and MiMo, bypassing the base64 limit entirely and taking priority over both fallbacks. No manual upload step needed.
 #### Why S3 works
@@ -691,7 +691,7 @@ Set `AUDIO_MCP_DEFAULT_PROVIDER` to change the default.
 | Variable                     | Description                                                                                                           | Default  |
 | ---------------------------- | --------------------------------------------------------------------------------------------------------------------- | -------- |
 | `Z_AI_API_KEY`               | Z.AI API key for GLM-4.6V                                                                                             | —        |
-| `DASHSCOPE_API_KEY`          | Alibaba Cloud API key for Qwen3.6                                                                                     | —        |
+| `DASHSCOPE_API_KEY`          | Alibaba Cloud API key for Qwen3.7                                                                                     | —        |
 | `MOONSHOT_API_KEY`           | Moonshot AI API key for Kimi K2.6                                                                                     | —        |
 | `GEMINI_API_KEY`             | Google API key for Gemini                                                                                             | —        |
 | `MIMO_API_KEY`               | Xiaomi MiMo API key for MiMo-V2.5                                                                                     | —        |
@@ -1067,7 +1067,7 @@ Proprietary — All Rights Reserved. No part of this software may be copied, mod
 - [MCP SDK](https://github.com/modelcontextprotocol/typescript-sdk) by Anthropic
 - [Kimi K2.6](https://github.com/MoonshotAI/Kimi-K2.6) by Moonshot AI
 - [GLM-4.6V](https://docs.z.ai/guides/vlm/glm-4.6v) by Z.AI
-- [Qwen3.6](https://bailian.console.alibabacloud.com/ap-southeast-1/) by Alibaba Cloud
+- [Qwen3.7](https://bailian.console.alibabacloud.com/ap-southeast-1/) by Alibaba Cloud
 - [MiMo-V2.5](https://platform.xiaomimimo.com/) by Xiaomi
 - [Deepgram](https://www.deepgram.com/) for audio transcription
 - [AssemblyAI](https://www.assemblyai.com/) for audio transcription
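
The README hunk above describes how large local files are routed for GLM, Qwen, and MiMo: S3 relay first when `AWS_S3_BUCKET` is set, then an upload-capable provider (Gemini or Kimi), then frame-based analysis. A minimal sketch of that ordering follows. It is not the package's actual code: `routeLocalVideo`, `RouteInput`, and the `Route` shape are hypothetical names, and the byte thresholds are only the approximate figures from the README table.

```typescript
// Hypothetical sketch of the fallback order described in the README.
type Provider = "gemini" | "glm" | "qwen" | "kimi" | "mimo";
type RouteKind = "direct" | "s3-relay" | "fallback-provider" | "frames";

interface Route {
  kind: RouteKind;
  provider: Provider;
}

interface RouteInput {
  provider: Provider;                    // provider requested for the call
  fileSizeBytes: number;                 // size of the local video file
  hasS3Bucket: boolean;                  // true when AWS_S3_BUCKET is configured
  availableUploadProviders: Provider[];  // Gemini and/or Kimi, if their keys are set
}

const MB = 1024 * 1024;

// Approximate base64 limits from the README table (assumed exact here).
const BASE64_LIMIT_BYTES: Partial<Record<Provider, number>> = {
  glm: 12 * MB,
  qwen: 10 * MB,
  mimo: 10 * MB,
};

function routeLocalVideo(input: RouteInput): Route {
  const limit = BASE64_LIMIT_BYTES[input.provider];

  // Providers without a base64 limit in the table, or files under the limit,
  // are sent to the requested provider unchanged.
  if (limit === undefined || input.fileSizeBytes <= limit) {
    return { kind: "direct", provider: input.provider };
  }

  // 1. S3 relay takes priority over both fallbacks.
  if (input.hasS3Bucket) {
    return { kind: "s3-relay", provider: input.provider };
  }

  // 2. Otherwise switch to an upload-capable provider if one is available.
  const fallback = input.availableUploadProviders[0];
  if (fallback !== undefined) {
    return { kind: "fallback-provider", provider: fallback };
  }

  // 3. Last resort: frame-based analysis with the requested provider.
  return { kind: "frames", provider: input.provider };
}
```

The sketch ignores the 2 GB (Gemini) and 100 MB (Kimi) upload limits from the same table; a fuller version would check those before choosing a fallback provider.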

package/dist/generated/version.d.ts

@@ -1,2 +1,2 @@
-export declare const VERSION = "1.2.2";
+export declare const VERSION = "1.2.3";
 //# sourceMappingURL=version.d.ts.map

package/dist/generated/version.js

@@ -1,3 +1,3 @@
 // Auto-generated by scripts/sync-version.ts — do not edit
-export const VERSION = '1.2.2';
+export const VERSION = '1.2.3';
 //# sourceMappingURL=version.js.map
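
The header comment says this file is generated by `scripts/sync-version.ts`, which is not part of the published package. As a rough illustration of how such a step can keep `VERSION` in line with `package.json`, here is a sketch under stated assumptions: `renderVersionModule` is a hypothetical helper, and the real script's behavior (header comment, source map, file I/O) is not shown in this diff.

```typescript
// Hypothetical helper: turns the text of package.json into the body of a
// generated version module. The actual sync script may differ.
function renderVersionModule(packageJsonText: string): string {
  const pkg: { version?: unknown } = JSON.parse(packageJsonText);

  // Fail loudly rather than emit a module with a missing or malformed version.
  if (typeof pkg.version !== "string" || !/^\d+\.\d+\.\d+/.test(pkg.version)) {
    throw new Error("package.json does not contain a usable version field");
  }

  return `export const VERSION = '${pkg.version}';\n`;
}
```

A build step would typically read `package.json`, pass its text to a helper like this, and write the result to the generated file, so that a version bump only has to be made in one place.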

package/dist/services/providerRouter.js

@@ -25,7 +25,7 @@ export function selectProvider(provider, hasKimiKey, hasGLMKey, hasGeminiKey = f
         glm: 'GLM-4.6V',
         kimi: 'Kimi K2.6',
         gemini: 'Gemini',
-        qwen: 'Qwen3.6',
+        qwen: 'Qwen3.7',
         mimo: 'MiMo-V2.5',
     };
     const envVar = {
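
The README in this release states that a per-call `provider` takes precedence over `VIDEO_MCP_DEFAULT_PROVIDER`, and that Gemini is the default when neither is given. The sketch below shows that precedence only. It is not the body of `selectProvider`, whose full signature and key-availability checks are truncated in this diff; `resolveProvider` and `isVideoProvider` are hypothetical names, and silently ignoring unrecognized values is an assumption.

```typescript
// Provider identifiers as listed in the README and the schema descriptions.
const VIDEO_PROVIDERS = ["gemini", "glm", "qwen", "kimi", "mimo"] as const;
type VideoProvider = (typeof VIDEO_PROVIDERS)[number];

// Type guard: true only for one of the five known provider identifiers.
function isVideoProvider(value: string | undefined): value is VideoProvider {
  return value !== undefined && (VIDEO_PROVIDERS as readonly string[]).includes(value);
}

// Hypothetical helper: per-call value first, then the environment default,
// then Gemini. Unknown values fall through to the next source (an assumption).
function resolveProvider(
  perCall: string | undefined,
  envDefault: string | undefined,
): VideoProvider {
  if (isVideoProvider(perCall)) return perCall;
  if (isVideoProvider(envDefault)) return envDefault;
  return "gemini";
}
```

In a Node.js process the second argument would come from `process.env.VIDEO_MCP_DEFAULT_PROVIDER`; it is passed in explicitly here so the function stays pure and easy to test.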

package/dist/services/qwenClient.d.ts

@@ -1,5 +1,5 @@
 /**
- * Qwen3.6 Client
+ * Qwen3.7 Client
  * Handles video analysis using Alibaba Cloud's DashScope API (OpenAI-compatible)
  */
 type QwenContentPart = {

package/dist/services/qwenClient.js

@@ -1,10 +1,10 @@
 /**
- * Qwen3.6 Client
+ * Qwen3.7 Client
  * Handles video analysis using Alibaba Cloud's DashScope API (OpenAI-compatible)
  */
 const QWEN_BASE_URL = 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions';
-const QWEN_MODEL = 'qwen3.6-plus';
-const QWEN_MODEL_FLASH = 'qwen3.6-plus';
+const QWEN_MODEL = 'qwen3.7-plus';
+const QWEN_MODEL_FLASH = 'qwen3.7-plus';
 /** Default fps for server-side frame extraction on video URL inputs. */
 const QWEN_DEFAULT_FPS = 2.0;
 /** Client-side request timeout in ms. Default: 4 min (just under DashScope's ~5 min server cap).

package/dist/tools/schemas.js

@@ -24,13 +24,13 @@ export const analyzeVideoSchema = z.object({
     question: z.string().describe('Question to ask about the video content'),
     provider: videoProviderEnum
         .optional()
-        .describe("AI backend to use: 'gemini' (Gemini 3.5 Flash, default), 'glm' (GLM-4.6V), 'qwen' (Qwen3.6), 'kimi' (Kimi K2.6), or 'mimo' (MiMo-V2.5)"),
+        .describe("AI backend to use: 'gemini' (Gemini 3.5 Flash, default), 'glm' (GLM-4.6V), 'qwen' (Qwen3.7), 'kimi' (Kimi K2.6), or 'mimo' (MiMo-V2.5)"),
 });
 export const summarizeVideoSchema = z.object({
     videoPath: z.string().describe('Path to the video file (local path or URL)'),
     provider: videoProviderEnum
         .optional()
-        .describe("AI backend to use: 'gemini' (Gemini 3.5 Flash, default), 'glm' (GLM-4.6V), 'qwen' (Qwen3.6), 'kimi' (Kimi K2.6), or 'mimo' (MiMo-V2.5)"),
+        .describe("AI backend to use: 'gemini' (Gemini 3.5 Flash, default), 'glm' (GLM-4.6V), 'qwen' (Qwen3.7), 'kimi' (Kimi K2.6), or 'mimo' (MiMo-V2.5)"),
 });
 export const extractFramesSchema = z.object({
     videoPath: z
@@ -84,7 +84,7 @@ export const searchTimestampSchema = z.object({
         .describe("What to search for, e.g., 'person waves', 'dog runs', 'car crash'"),
     provider: videoProviderEnum
         .optional()
-        .describe("AI backend to use: 'gemini' (Gemini 3.5 Flash, default), 'glm' (GLM-4.6V), 'qwen' (Qwen3.6), 'kimi' (Kimi K2.6), or 'mimo' (MiMo-V2.5)"),
+        .describe("AI backend to use: 'gemini' (Gemini 3.5 Flash, default), 'glm' (GLM-4.6V), 'qwen' (Qwen3.7), 'kimi' (Kimi K2.6), or 'mimo' (MiMo-V2.5)"),
 });
 export const getVideoInfoSchema = z.object({
     videoPath: z

package/package.json

@@ -1,6 +1,6 @@
 {
   "name": "video-context-mcp-server",
-  "version": "1.2.2",
+  "version": "1.2.3",
   "description": "A Model Context Protocol server that gives GitHub Copilot the ability to understand and analyze video content",
   "type": "module",
   "main": "dist/index.js",