npm: @jochenyang/opencode-vision 1.0.1 → 1.1.0

Files changed (5)

package/README.md CHANGED

@@ -62,7 +62,7 @@ vision tool calls the vision API and returns an image description
 ## Prerequisites
 - [OpenCode](https://github.com/opencode-ai/opencode) installed
-- An OpenAI-compatible vision API (e.g., Alibaba Cloud DashScope Qwen)
+- A vision-capable API (OpenAI Chat Completions compatible, or the MiniMax VLM endpoint)
 - Environment variables (best set system-wide so they need not be re-entered on every launch)
 ## Environment Variables
@@ -70,29 +70,54 @@ vision tool calls the vision API and returns an image description
 | Variable           | Description                            | Example                                                       |
 | ------------------ | -------------------------------------- | ------------------------------------------------------------- |
 | `VISION_API_KEY`   | Vision API key                         | `sk-your-api-key`                                              |
-| `VISION_API_URL`   | Vision API base URL<br>(the tool auto-appends `/chat/completions`) | `https://your-api-endpoint/v1`                |
-| `VISION_MODEL`     | Vision model name                      | `your-vision-model`                                            |
+| `VISION_API_URL`   | Vision API base URL                    | `https://your-api-endpoint/v1`                                 |
+| `VISION_MODEL`     | Vision model name<br>(not needed for MiniMax) | `your-vision-model`                                     |
+| `VISION_API_TYPE`  | Optional; forces the API type<br>`openai` / `minimax` | `minimax`                                       |
-### Windows system-wide configuration
+> `VISION_API_URL`: for OpenAI-compatible APIs the tool auto-appends `/chat/completions`; for MiniMax it automatically uses the `/v1/coding_plan/vlm` endpoint.
+>
+> `VISION_API_TYPE`: auto-detected by default (a URL containing `minimax` switches to MiniMax); set this variable to specify the type explicitly.
+### Example 1: OpenAI-compatible API (e.g., Alibaba Cloud DashScope Qwen)
+**Windows system-wide configuration:**
 ```powershell
 [System.Environment]::SetEnvironmentVariable('VISION_API_KEY', 'sk-your-api-key', 'User')
 [System.Environment]::SetEnvironmentVariable('VISION_API_URL', 'https://your-api-endpoint/v1', 'User')
 [System.Environment]::SetEnvironmentVariable('VISION_MODEL', 'your-vision-model', 'User')
 ```
-**Restart your terminal** for the settings to take effect.
-### macOS / Linux
-Add to `~/.zshrc` or `~/.bashrc`:
+**macOS / Linux:**
 ```bash
 export VISION_API_KEY="sk-your-api-key"
 export VISION_API_URL="https://your-api-endpoint/v1"
 export VISION_MODEL="your-vision-model"
 ```
+### Example 2: MiniMax VLM
+MiniMax's VLM endpoint belongs to the **Token Plan** service and requires an API key with Token Plan access (a Group API Key), not a regular Chat API Key.
+> How to get one: log in to the [MiniMax platform](https://platform.minimaxi.com) → Token Plan → create/view a Group API Key.
+**Windows system-wide configuration:**
+```powershell
+[System.Environment]::SetEnvironmentVariable('VISION_API_KEY', 'your-minimax-group-api-key', 'User')
+[System.Environment]::SetEnvironmentVariable('VISION_API_URL', 'https://api.minimaxi.com', 'User')
+# VISION_MODEL does not need to be set; MiniMax is auto-detected
+```
+**macOS / Linux:**
+```bash
+export VISION_API_KEY="your-minimax-group-api-key"
+export VISION_API_URL="https://api.minimaxi.com"
+# VISION_MODEL does not need to be set; MiniMax is auto-detected
+```
+> Tip: use `https://api.minimaxi.com` for the China site and `https://api.minimax.io` for the international site.
+**Restart your terminal** for the settings to take effect.
 ## Installation
 ### Manual installation
@@ -157,7 +182,7 @@ opencode-vision/
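 - Reads local image files and identifies their content via the vision API
 - Supports two parameters: `path` (single image) and `paths` (array of images)
-- Compatible with APIs in the OpenAI Chat Completions format
+- Supports two backends: OpenAI Chat Completions format / MiniMax VLM (auto-detected)
 ### Plugin: `plugins/vision-helper.ts`
@@ -176,7 +201,11 @@ opencode-vision/
 ## Custom Vision API
-This tool is compatible with any vision API in the OpenAI Chat Completions format. Just change the environment variables:
+This tool supports two backends, selected by auto-detection or by an explicit setting.
+### OpenAI Chat Completions format
+Compatible with any vision API in the OpenAI Chat Completions format: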
 ```powershell
 $env:VISION_API_KEY = 'sk-your-api-key'
@@ -184,6 +213,18 @@ $env:VISION_API_URL = 'https://your-api-endpoint/v1'
 $env:VISION_MODEL = 'your-vision-model'
 ```
+### MiniMax VLM
+The tool checks whether the URL contains `minimax`/`minimaxi` and switches to the MiniMax VLM endpoint automatically. It can also be forced with `VISION_API_TYPE=minimax`.
+> ⚠️ Requires a Group API Key with **Token Plan** access. Regular Chat API Keys cannot be used.
+```powershell
+$env:VISION_API_KEY = 'your-minimax-group-api-key'
+$env:VISION_API_URL = 'https://api.minimaxi.com'
+# VISION_MODEL is not needed
+```
 ## License
 [MIT](LICENSE)

package/README_en.md CHANGED

@@ -78,37 +78,62 @@ vision tool calls the vision API → returns image description
 ## Prerequisites
 - [OpenCode](https://github.com/opencode-ai/opencode) installed
-- An OpenAI-compatible vision API (e.g., Aliyun DashScope, OpenAI, etc.)
+- A vision-capable API (OpenAI Chat Completions format or MiniMax VLM)
 - Environment variables configured (recommended system-wide)
 ## Environment Variables
-| Variable          | Description                                            | Example                         |
-| ----------------- | ------------------------------------------------------ | ------------------------------- |
-| `VISION_API_KEY`  | Vision API key                                         | `sk-your-api-key`               |
-| `VISION_API_URL`  | Vision API base URL<br>(tool auto-appends `/chat/completions`) | `https://your-api-endpoint/v1`  |
-| `VISION_MODEL`    | Vision model name                                      | `your-vision-model`             |
+| Variable          | Description                                                        | Example                         |
+| ----------------- | ------------------------------------------------------------------ | ------------------------------- |
+| `VISION_API_KEY`  | Vision API key                                                     | `sk-your-api-key`               |
+| `VISION_API_URL`  | Vision API base URL                                                | `https://your-api-endpoint/v1`  |
+| `VISION_MODEL`    | Vision model name<br>(not needed for MiniMax)                      | `your-vision-model`             |
+| `VISION_API_TYPE` | Optional, force API type<br>`openai` / `minimax`                   | `minimax`                       |
-### Windows (System-wide)
+> `VISION_API_URL`: OpenAI-compatible backends auto-append `/chat/completions`; MiniMax auto-detects and uses `/v1/coding_plan/vlm`.
+>
+> `VISION_API_TYPE`: Auto-detected by default (URL containing `minimax` triggers MiniMax mode). Can be explicitly set.
+### Example 1: OpenAI-compatible (e.g., Aliyun DashScope)
+**Windows:**
 ```powershell
 [System.Environment]::SetEnvironmentVariable('VISION_API_KEY', 'sk-your-api-key', 'User')
 [System.Environment]::SetEnvironmentVariable('VISION_API_URL', 'https://your-api-endpoint/v1', 'User')
 [System.Environment]::SetEnvironmentVariable('VISION_MODEL', 'your-vision-model', 'User')
 ```
-**Restart your terminal** after setting.
-### macOS / Linux
-Add to `~/.zshrc` or `~/.bashrc`:
+**macOS / Linux:**
 ```bash
 export VISION_API_KEY="sk-your-api-key"
 export VISION_API_URL="https://your-api-endpoint/v1"
 export VISION_MODEL="your-vision-model"
 ```
+### Example 2: MiniMax VLM
+MiniMax's VLM endpoint is part of the **Token Plan** service and requires a Group API Key with Token Plan access — a regular Chat API Key won't work.
+> How to get one: log in to the [MiniMax platform](https://platform.minimaxi.com) → Token Plan → create/view a Group API Key.
+**Windows:**
+```powershell
+[System.Environment]::SetEnvironmentVariable('VISION_API_KEY', 'your-minimax-group-api-key', 'User')
+[System.Environment]::SetEnvironmentVariable('VISION_API_URL', 'https://api.minimaxi.com', 'User')
+# VISION_MODEL is not needed; MiniMax is auto-detected
+```
+**macOS / Linux:**
+```bash
+export VISION_API_KEY="your-minimax-group-api-key"
+export VISION_API_URL="https://api.minimaxi.com"
+# VISION_MODEL is not needed — MiniMax auto-detected
+```
+> Note: Use `https://api.minimaxi.com` for China region, `https://api.minimax.io` for global.
+**Restart your terminal** after setting.
 ## Installation
 ### Manual
@@ -178,7 +203,7 @@ opencode-vision/
 - Reads local image files and describes them via a vision API
 - Supports `path` (single) and `paths` (multiple) parameters
-- Compatible with any OpenAI Chat Completions API
+- Supports two backends: OpenAI Chat Completions / MiniMax VLM (auto-detected)
 ### Plugin: `plugins/vision-helper.ts`
@@ -197,7 +222,11 @@ opencode-vision/
 ## Custom Vision API
-Compatible with any OpenAI Chat Completions vision API. Just change the environment variables:
+The tool supports two backends with auto-detection or explicit override.
+### OpenAI Chat Completions Format
+Works with any OpenAI Chat Completions vision API:
 ```bash
 export VISION_API_KEY="sk-your-api-key"
@@ -205,6 +234,18 @@ export VISION_API_URL="https://your-api-endpoint/v1"
 export VISION_MODEL="your-vision-model"
 ```
+### MiniMax VLM
+Auto-detected when the URL contains `minimax`/`minimaxi`. Can also be forced with `VISION_API_TYPE=minimax`.
+> ⚠️ Requires a **Group API Key** with Token Plan access. Regular Chat API Keys won't work.
+```bash
+export VISION_API_KEY="your-minimax-group-api-key"
+export VISION_API_URL="https://api.minimaxi.com"
+# VISION_MODEL is not needed
+```
 ## License
 [MIT](LICENSE)
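
The backend selection and endpoint rules described in the README changes above can be sketched as a pair of pure functions. This is a minimal illustration rather than the package's code: `VisionEnv`, `resolveBackend`, and `resolveEndpoint` are hypothetical names, and the assumption that any explicit non-`minimax` value of `VISION_API_TYPE` selects the OpenAI path is inferred from the `tools/vision.ts` hunk in this diff.

```typescript
// Hypothetical helper types and names, introduced only for illustration.
type Backend = "openai" | "minimax"

interface VisionEnv {
  VISION_API_URL: string
  VISION_API_TYPE?: string
}

// An explicit VISION_API_TYPE wins; otherwise a URL containing "minimax"
// (which also covers "minimaxi") selects the MiniMax backend.
function resolveBackend(env: VisionEnv): Backend {
  const explicit = (env.VISION_API_TYPE ?? "").toLowerCase()
  if (explicit) return explicit === "minimax" ? "minimax" : "openai"
  return /minimax/i.test(env.VISION_API_URL) ? "minimax" : "openai"
}

// Strip trailing slashes from the base URL, then append the backend's path.
function resolveEndpoint(env: VisionEnv): string {
  const base = env.VISION_API_URL.replace(/\/+$/, "")
  return resolveBackend(env) === "minimax"
    ? `${base}/v1/coding_plan/vlm`
    : `${base}/chat/completions`
}
```

Under these assumptions, an OpenAI-compatible base URL is expected to include the `/v1` prefix already, while a MiniMax base URL is the bare host, since the `/v1/coding_plan/vlm` path is appended in full.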

package/bin/install.js CHANGED

@@ -5,12 +5,19 @@ const os = require("os")
 const SRC = path.join(__dirname, "..")
 const DST = path.join(os.homedir(), ".config", "opencode")
+const isWin = process.platform === "win32"
 const FILES = [
   ["tools/vision.ts", "tools/vision.ts"],
   ["plugins/vision-helper.ts", "plugins/vision-helper.ts"],
 ]
+const ENV_VARS = [
+  { name: "VISION_API_KEY", desc: "Vision API key", example: "sk-your-api-key" },
+  { name: "VISION_API_URL", desc: "Vision API base URL (also works with MiniMax)", example: "https://api.minimax.chat" },
+  { name: "VISION_MODEL", desc: "Vision model name (not needed for MiniMax)", example: "your-vision-model" },
+]
 function log(msg, ok = true) {
   const prefix = ok ? "\x1b[32m ✓\x1b[0m" : "\x1b[31m ✗\x1b[0m"
   console.log(`${prefix} ${msg}`)
@@ -20,45 +27,88 @@ function title(msg) {
   console.log(`\n\x1b[36m═══ ${msg} \x1b[0m\n`)
 }
+function printEnvGuide() {
+  console.log("\n  Set the following environment variables to use vision recognition:")
+  console.log()
+  for (const v of ENV_VARS) {
+    console.log(`    \x1b[33m${v.name}\x1b[0m`)
+    console.log(`    → ${v.desc}`)
+    console.log(`    → Example: ${v.example}`)
+    console.log()
+  }
+  console.log("  \x1b[36mNote for MiniMax users:\x1b[0m")
+  console.log("  Set VISION_API_URL to your MiniMax API base URL.")
+  console.log("  The tool auto-detects MiniMax and uses the VLM endpoint; VISION_MODEL is not needed.")
+  console.log("  You can also set VISION_API_TYPE=minimax explicitly.")
+  console.log()
+  if (isWin) {
+    console.log("  \x1b[36mWindows system-wide configuration (administrator PowerShell):\x1b[0m")
+    console.log()
+    for (const v of ENV_VARS) {
+      console.log(`    [System.Environment]::SetEnvironmentVariable('${v.name}', '${v.example}', 'User')`)
+    }
+    console.log()
+    console.log("  Restart your terminal for the settings to take effect.")
+  } else {
+    console.log("  \x1b[36mmacOS / Linux configuration (add to ~/.zshrc or ~/.bashrc):\x1b[0m")
+    console.log()
+    for (const v of ENV_VARS) {
+      console.log(`    export ${v.name}="${v.example}"`)
+    }
+    console.log()
+    console.log("  Then run source ~/.zshrc or restart your terminal.")
+  }
+}
+async function checkVars() {
+  let missing = 0
+  for (const v of ENV_VARS) {
+    const val = process.env[v.name]
+    if (val) {
+      const masked = v.name === "VISION_API_KEY" ? val.slice(0, 6) + "****" : val
+      log(`${v.name} = ${masked}`)
+    } else {
+      log(`${v.name} is not set`, false)
+      missing++
+    }
+  }
+  return missing
+}
 async function doInstall() {
   title("opencode-vision install")
+  // ── File copy ──
   for (const [, rel] of FILES) {
     const dir = path.join(DST, path.dirname(rel))
     if (!fs.existsSync(dir)) {
       fs.mkdirSync(dir, { recursive: true })
     }
   }
   for (const [srcRel, dstRel] of FILES) {
     const src = path.join(SRC, srcRel)
     const dst = path.join(DST, dstRel)
     if (!fs.existsSync(src)) {
       log(`Source file not found: ${srcRel}`, false)
       continue
     }
     fs.copyFileSync(src, dst)
     log(`Installed ${dstRel}`)
   }
+  // ── Environment variable check ──
   title("Environment variable check")
-  const vars = {
-    VISION_API_KEY: process.env.VISION_API_KEY,
-    VISION_API_URL: process.env.VISION_API_URL,
-    VISION_MODEL: process.env.VISION_MODEL,
-  }
+  const missing = await checkVars()
-  for (const [name, val] of Object.entries(vars)) {
-    if (val) {
-      const masked = name === "VISION_API_KEY" ? val.slice(0, 6) + "****" : val
-      log(`${name} = ${masked}`)
-    } else {
-      log(`${name} is not set; configure it before use`, false)
-    }
+  if (missing > 0) {
+    console.log(`\n  \x1b[33m⚠ ${missing} environment variable(s) not set.\x1b[0m`)
+    printEnvGuide()
   }
+  // ── OpenCode detection ──
   title("OpenCode detection")
   try {
     const { execSync } = require("child_process")
@@ -73,9 +123,13 @@ async function doInstall() {
   }
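   title("Installation complete")
-  console.log("  Restart OpenCode to start using it.")
-  console.log("  Try pasting an image:")
-  console.log('    [image] "What is this?"')
+  console.log("  ✅ Files are in place; restart OpenCode to start using it.")
+  if (missing > 0) {
+    console.log("  ⚠ Environment variables are incomplete; vision recognition will not work properly.")
+    console.log("     Set them as described above, then restart OpenCode.")
+  }
+  console.log("  📝 Usage: paste an image and ask a question")
+  console.log('     "[image] What is this?"')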
 }
 async function doUninstall() {
@@ -92,7 +146,6 @@ async function doUninstall() {
     log(`Removed ${rel}`)
     removed++
-    // Also clean up the directory if it is now empty
     const dir = path.dirname(dst)
     if (fs.existsSync(dir) && fs.readdirSync(dir).length === 0) {
       fs.rmdirSync(dir)
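
One observation on the installer hunk: `checkVars` counts every entry in `ENV_VARS`, so a MiniMax setup without `VISION_MODEL` would still be reported as having one missing variable, even though the README says the model is not needed there. A backend-aware check could look roughly like the sketch below. It is not part of the package; `EnvSpec`, `SPECS`, `maskValue`, and `missingVars` are hypothetical names, and the detection rule is assumed to match the one in `tools/vision.ts`.

```typescript
// Hypothetical sketch of a backend-aware environment check.
type Backend = "openai" | "minimax"

interface EnvSpec {
  name: string
  requiredFor: Backend[] // backends that cannot work without this variable
}

const SPECS: EnvSpec[] = [
  { name: "VISION_API_KEY", requiredFor: ["openai", "minimax"] },
  { name: "VISION_API_URL", requiredFor: ["openai", "minimax"] },
  { name: "VISION_MODEL", requiredFor: ["openai"] },
]

type Env = Record<string, string | undefined>

// Assumed to mirror the tool's rule: explicit type first, then URL sniffing.
function detectBackend(env: Env): Backend {
  const explicit = (env["VISION_API_TYPE"] ?? "").toLowerCase()
  if (explicit) return explicit === "minimax" ? "minimax" : "openai"
  return /minimax/i.test(env["VISION_API_URL"] ?? "") ? "minimax" : "openai"
}

// Same masking as the installer: show only the first six characters of the key.
function maskValue(name: string, value: string): string {
  return name === "VISION_API_KEY" ? value.slice(0, 6) + "****" : value
}

// Names of variables that are unset but required for the detected backend.
function missingVars(env: Env): string[] {
  const backend = detectBackend(env)
  return SPECS.filter((s) => s.requiredFor.includes(backend) && !env[s.name]).map((s) => s.name)
}
```

With this shape, a MiniMax configuration that omits `VISION_MODEL` reports nothing missing, while an OpenAI-compatible configuration without it still reports `VISION_MODEL`.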

package/package.json CHANGED

@@ -1,13 +1,21 @@
 {
   "name": "@jochenyang/opencode-vision",
-  "version": "1.0.1",
+  "version": "1.1.0",
   "description": "Vision plugin + tool for OpenCode — automatically handles pasted images for non-vision models",
-  "keywords": ["opencode", "vision", "image", "ai", "plugin", "tool"],
+  "keywords": [
+    "opencode",
+    "vision",
+    "image",
+    "ai",
+    "plugin",
+    "tool",
+    "minimax"
+  ],
   "homepage": "https://github.com/jochenyang/opencode-vision",
   "license": "MIT",
   "author": "Jochen Yang",
   "bin": {
-    "opencode-vision": "./bin/install.js"
+    "opencode-vision": "bin/install.js"
   },
   "files": [
     "bin/",

package/tools/vision.ts CHANGED

@@ -11,7 +11,10 @@ Use this when the user pastes images but the current model cannot view images di
 The image(s) will have been auto-saved with a path hint like "[Image auto-saved to ...]" in the conversation.
 For multiple images, use the "paths" parameter.
-Requires VISION_API_KEY, VISION_API_URL and VISION_MODEL environment variables.`,
+Requires VISION_API_KEY and VISION_API_URL.
+VISION_MODEL is required for OpenAI-compatible backends.
+MiniMax is auto-detected — set VISION_API_URL to your MiniMax base URL and VISION_MODEL is optional.
+Override with VISION_API_TYPE=openai|minimax.`,
   args: {
     paths: tool.schema
       .array(tool.schema.string())
@@ -56,46 +59,113 @@ Requires VISION_API_KEY, VISION_API_URL and VISION_MODEL environment variables.`
     const apiKey = process.env["VISION_API_KEY"]
     const baseUrl = process.env["VISION_API_URL"]
-    const model = process.env["VISION_MODEL"]
     if (!apiKey) return "Error: VISION_API_KEY not set"
     if (!baseUrl) return "Error: VISION_API_URL not set"
-    if (!model) return "Error: VISION_MODEL not set"
-    const apiUrl = `${baseUrl.replace(/\/+$/, "")}/chat/completions`
+    // Determine API type: explicit override or auto-detect from URL
+    const apiType = (process.env["VISION_API_TYPE"] || "").toLowerCase()
+    const isMiniMax = apiType === "minimax" || (!apiType && /minimax/i.test(baseUrl))
-    const content: Record<string, unknown>[] = []
-    if (args.question) {
-      content.push({ type: "text", text: args.question })
-    } else if (resolved.length > 1) {
-      content.push({ type: "text", text: `Describe each of these ${resolved.length} images in detail, labeling which description corresponds to which file.` })
-    } else {
-      content.push({ type: "text", text: "Please describe this image in detail" })
+    if (isMiniMax) {
+      return await callMiniMax(apiKey, baseUrl, resolved, args.question)
     }
+    return await callOpenAI(apiKey, baseUrl, resolved, args.question)
+  },
+})
-    for (const filePath of resolved) {
-      const file = Bun.file(filePath)
-      const mime = file.type || "image/png"
-      const buffer = await file.arrayBuffer()
-      const base64 = Buffer.from(buffer).toString("base64")
-      content.push({ type: "image_url", image_url: { url: `data:${mime};base64,${base64}` } })
-    }
+// ── OpenAI-compatible backend ──
+async function callOpenAI(apiKey: string, baseUrl: string, resolved: string[], question?: string) {
+  const model = process.env["VISION_MODEL"]
+  if (!model) return "Error: VISION_MODEL not set (required for OpenAI-compatible backends)"
+  const apiUrl = `${baseUrl.replace(/\/+$/, "")}/chat/completions`
+  const content: Record<string, unknown>[] = []
+  if (question) {
+    content.push({ type: "text", text: question })
+  } else if (resolved.length > 1) {
+    content.push({
+      type: "text",
+      text: `Describe each of these ${resolved.length} images in detail, labeling which description corresponds to which file.`,
+    })
+  } else {
+    content.push({ type: "text", text: "Please describe this image in detail" })
+  }
+  for (const filePath of resolved) {
+    const file = Bun.file(filePath)
+    const mime = file.type || "image/png"
+    const buffer = await file.arrayBuffer()
+    const base64 = Buffer.from(buffer).toString("base64")
+    content.push({ type: "image_url", image_url: { url: `data:${mime};base64,${base64}` } })
+  }
+  const response = await fetch(apiUrl, {
+    method: "POST",
+    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
+    body: JSON.stringify({
+      model,
+      messages: [{ role: "user", content }],
+      max_tokens: 4096,
+    }),
+  })
+  if (!response.ok) {
+    const text = await response.text()
+    return `Vision API error (${response.status}): ${text}`
+  }
+  const data = (await response.json()) as { choices: { message: { content: string } }[] }
+  return data.choices?.[0]?.message?.content ?? "No description returned."
+}
+// ── MiniMax VLM backend ──
+interface MiniMaxBaseResp {
+  status_code?: number
+  status_msg?: string
+}
+interface MiniMaxVlmResponse {
+  base_resp?: MiniMaxBaseResp
+  content?: string
+}
+async function callMiniMax(apiKey: string, baseUrl: string, resolved: string[], question?: string) {
+  const apiUrl = `${baseUrl.replace(/\/+$/, "")}/v1/coding_plan/vlm`
+  const descriptions: string[] = []
+  for (const filePath of resolved) {
+    const file = Bun.file(filePath)
+    const mime = file.type || "image/png"
+    const buffer = await file.arrayBuffer()
+    const base64 = Buffer.from(buffer).toString("base64")
+    const imageUrl = `data:${mime};base64,${base64}`
+    const prompt = question || "Please describe this image in detail"
     const response = await fetch(apiUrl, {
       method: "POST",
       headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
-      body: JSON.stringify({
-        model,
-        messages: [{ role: "user", content }],
-        max_tokens: 4096,
-      }),
+      body: JSON.stringify({ prompt, image_url: imageUrl }),
     })
     if (!response.ok) {
       const text = await response.text()
-      return `Vision API error (${response.status}): ${text}`
+      return `MiniMax Vision API error (${response.status}): ${text}`
     }
-    const data = (await response.json()) as { choices: { message: { content: string } }[] }
-    return data.choices?.[0]?.message?.content ?? "No description returned."
-  },
-})
+    const data = (await response.json()) as MiniMaxVlmResponse
+    // MiniMax wraps errors in base_resp even on HTTP 200
+    if (data.base_resp?.status_code && data.base_resp.status_code !== 0) {
+      return `MiniMax Vision API error: ${data.base_resp.status_msg || `status_code ${data.base_resp.status_code}`}`
+    }
+    descriptions.push(data.content || "No description returned.")
+  }
+  if (descriptions.length === 1) return descriptions[0]
+  return descriptions.map((d, i) => `--- Image ${i + 1} ---\n${d}`).join("\n\n")
+}
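
The response handling in `callMiniMax` can be exercised without a network call by pulling it into pure functions. The sketch below assumes the response shape declared in the hunk above (`base_resp` and `content`). The names `interpretVlmResponse` and `formatDescriptions` are hypothetical, and the status code in the usage example is made up for illustration.

```typescript
// Response shape copied from the interfaces in the diff above.
interface MiniMaxVlmResponse {
  base_resp?: { status_code?: number; status_msg?: string }
  content?: string
}

type VlmResult = { ok: true; text: string } | { ok: false; error: string }

// A non-zero base_resp.status_code is treated as an error even on HTTP 200.
function interpretVlmResponse(data: MiniMaxVlmResponse): VlmResult {
  const code = data.base_resp?.status_code
  if (code !== undefined && code !== 0) {
    return { ok: false, error: data.base_resp?.status_msg || `status_code ${code}` }
  }
  return { ok: true, text: data.content || "No description returned." }
}

// A single description is returned as-is; several get numbered separators.
function formatDescriptions(descriptions: string[]): string {
  if (descriptions.length === 1) return descriptions[0]
  return descriptions.map((d, i) => `--- Image ${i + 1} ---\n${d}`).join("\n\n")
}

// Usage with a made-up error payload (1004 is an illustrative code only).
const failed = interpretVlmResponse({ base_resp: { status_code: 1004, status_msg: "auth failed" } })
console.log(failed.ok ? failed.text : `MiniMax Vision API error: ${failed.error}`)
```

Because the backend takes one image per request, the tool sends the requests sequentially and joins the results, which is why multi-image output is labeled by position rather than by file name.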