@jochenyang/opencode-vision 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6)

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Jochen Yang
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,148 @@
+# opencode-vision
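+A plugin plus tool that adds image recognition to [OpenCode](https://github.com/opencode-ai/opencode).
+When the active model does not support multimodal input, the plugin automatically saves images pasted by the user to a temporary directory and steers the model toward calling the vision tool to read them. Single and multiple images are both supported.
+## How it works
+```
+User pastes an image + "What is this?"
+  ↓
+vision-helper plugin (experimental.chat.messages.transform)
+  ├─ Decode base64 → save to the temporary directory
+  ├─ Replace the original image part with a short placeholder (suppresses ERROR noise from models that do not support images)
+  └─ Inject a path hint before the user's text
+  ↓
+Model sees the path hint → calls the vision tool automatically
+  ↓
+vision tool calls the vision API and returns an image description
+```
+- **Single image** → the model calls `vision(path)` to read one image
+- **Multiple images** → the model calls `vision(paths=[...])` to process all images in a single API call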
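+## Prerequisites
+- [OpenCode](https://github.com/opencode-ai/opencode) is installed
+- A vision API compatible with the OpenAI format (for example Alibaba Cloud DashScope with Tongyi Qianwen)
+- Environment variables (preferably set persistently, so they do not have to be re-entered at every launch)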
+## Environment variables
+| Variable         | Description                                                          | Example                        |
+| ---------------- | -------------------------------------------------------------------- | ------------------------------ |
+| `VISION_API_KEY` | API key for the vision API                                           | `sk-your-api-key`              |
+| `VISION_API_URL` | Base URL of the vision API<br>(the tool appends `/chat/completions`) | `https://your-api-endpoint/v1` |
+| `VISION_MODEL`   | Name of the vision model                                             | `your-vision-model`            |
+### Windows (persistent, per-user configuration)
+```powershell
+[System.Environment]::SetEnvironmentVariable('VISION_API_KEY', 'sk-your-api-key', 'User')
+[System.Environment]::SetEnvironmentVariable('VISION_API_URL', 'https://your-api-endpoint/v1', 'User')
+[System.Environment]::SetEnvironmentVariable('VISION_MODEL', 'your-vision-model', 'User')
+```
+After setting these, **restart the terminal** for the changes to take effect.
+### macOS / Linux
+Add the following to `~/.zshrc` or `~/.bashrc`:
+```bash
+export VISION_API_KEY="sk-your-api-key"
+export VISION_API_URL="https://your-api-endpoint/v1"
+export VISION_MODEL="your-vision-model"
+```
+## Installation
+### Manual installation
+Copy the two files into OpenCode's global configuration directory:
+```powershell
+# Tool file
+copy tools\vision.ts $env:USERPROFILE\.config\opencode\tools\
+# Plugin file
+copy plugins\vision-helper.ts $env:USERPROFILE\.config\opencode\plugins\
+```
+OpenCode automatically discovers files under `~/.config/opencode/tools/` and `~/.config/opencode/plugins/`, so **there is no need to edit `opencode.json`**.
+> If either directory does not exist, create it manually.
+### Via npx (coming soon)
+```bash
+npx opencode-vision install
+```
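+## Verification
+Start OpenCode:
+```powershell
+opencode
+```
+Paste an image and ask a question:
+```
+[image] What is this?
+```
+Expected behavior:
+1. The model cannot read the image directly (the current model does not support multimodal input)
+2. The plugin saves the image to the temporary directory and injects a path hint
+3. The model calls the `vision` tool to read the image
+4. The model returns a description of the image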
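+## Project structure
+```
+opencode-vision/
+├── tools/
+│   └── vision.ts          # vision tool definition; calls the vision API
+├── plugins/
+│   └── vision-helper.ts   # plugin: saves images, injects hints, suppresses ERROR noise
+└── README.md
+```
+### Tool: `tools/vision.ts`
+- Reads local image files and identifies their contents through the vision API
+- Accepts two parameters: `path` (single image) and `paths` (array of images)
+- Works with any API that follows the OpenAI Chat Completions format
+### Plugin: `plugins/vision-helper.ts`
+- Hook: `experimental.chat.messages.transform`
+- Runs at the last moment before messages are sent to the model
+- Saves images to `os.tmpdir()/opencode-vision/`
+- Injects the path hint before the user's text (the hint is not persisted to the chat history)
+- Replaces the original image part with a short placeholder, suppressing the ERROR noise produced by `unsupportedParts`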
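+## Notes
+- Images are saved to the system temporary directory `os.tmpdir()/opencode-vision/`, which is cleaned up automatically after a system restart
+- Temporary files are named `pasted-{timestamp}-{random}.{ext}`
+- Pasting the same image several times in one session produces several temporary files
+- Vision API calls use `max_tokens: 4096`, which is enough to return detailed descriptions even for multiple images
+## Custom vision API
+This tool works with any vision API that follows the OpenAI Chat Completions format. Switching providers only requires changing the environment variables: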
+```powershell
+$env:VISION_API_KEY = 'sk-your-api-key'
+$env:VISION_API_URL = 'https://your-api-endpoint/v1'
+$env:VISION_MODEL = 'your-vision-model'
+```
+## License
+MIT
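
The three environment variables described in the README can be validated together before any request is made. The following is a minimal sketch and not part of the package: `resolveVisionConfig`, `VisionConfig`, and `Env` are hypothetical names. Only the trailing-slash trimming and the `/chat/completions` suffix mirror what `tools/vision.ts` does.

```typescript
// Hypothetical helper (not part of the package): validates the three
// environment variables from the README and derives the request endpoint.
type VisionConfig = { apiKey: string; model: string; endpoint: string }
type Env = Record<string, string | undefined>

function resolveVisionConfig(env: Env): VisionConfig | { missing: string[] } {
  const names = ["VISION_API_KEY", "VISION_API_URL", "VISION_MODEL"]
  // Collect every unset (or empty) variable so all of them can be reported at once.
  const missing = names.filter((name) => !env[name])
  if (missing.length > 0) return { missing }
  // Trim trailing slashes, then append the Chat Completions path,
  // as tools/vision.ts does.
  const base = (env["VISION_API_URL"] as string).replace(/\/+$/, "")
  return {
    apiKey: env["VISION_API_KEY"] as string,
    model: env["VISION_MODEL"] as string,
    endpoint: `${base}/chat/completions`,
  }
}

const config = resolveVisionConfig({
  VISION_API_KEY: "sk-your-api-key",
  VISION_API_URL: "https://your-api-endpoint/v1/",
  VISION_MODEL: "your-vision-model",
})
console.log(config)
```

Unlike the tool itself, which returns on the first unset variable, this sketch reports all missing names in one pass.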

package/bin/install.js ADDED

@@ -0,0 +1,125 @@
+#!/usr/bin/env node
+const fs = require("fs")
+const path = require("path")
+const os = require("os")
+const SRC = path.join(__dirname, "..")
+const DST = path.join(os.homedir(), ".config", "opencode")
+const FILES = [
+  ["tools/vision.ts", "tools/vision.ts"],
+  ["plugins/vision-helper.ts", "plugins/vision-helper.ts"],
+]
+function log(msg, ok = true) {
+  const prefix = ok ? "\x1b[32m ✓\x1b[0m" : "\x1b[31m ✗\x1b[0m"
+  console.log(`${prefix} ${msg}`)
+}
+function title(msg) {
+  console.log(`\n\x1b[36m═══ ${msg} \x1b[0m\n`)
+}
+async function doInstall() {
+  title("opencode-vision install")
+  for (const [, rel] of FILES) {
+    const dir = path.join(DST, path.dirname(rel))
+    if (!fs.existsSync(dir)) {
+      fs.mkdirSync(dir, { recursive: true })
+    }
+  }
+  for (const [srcRel, dstRel] of FILES) {
+    const src = path.join(SRC, srcRel)
+    const dst = path.join(DST, dstRel)
+    if (!fs.existsSync(src)) {
+      log(`Source file not found: ${srcRel}`, false)
+      continue
+    }
+    fs.copyFileSync(src, dst)
+    log(`Installed ${dstRel}`)
+  }
+  title("Environment variable check")
+  const vars = {
+    VISION_API_KEY: process.env.VISION_API_KEY,
+    VISION_API_URL: process.env.VISION_API_URL,
+    VISION_MODEL: process.env.VISION_MODEL,
+  }
+  for (const [name, val] of Object.entries(vars)) {
+    if (val) {
+      const masked = name === "VISION_API_KEY" ? val.slice(0, 6) + "****" : val
+      log(`${name} = ${masked}`)
+    } else {
+      log(`${name} is not set. Configure it before use`, false)
+    }
+  }
+  title("OpenCode detection")
+  try {
+    const { execSync } = require("child_process")
+    const ver = execSync("opencode --version || opencode version", {
+      encoding: "utf8",
+      stdio: ["ignore", "pipe", "ignore"],
+      timeout: 5000,
+    }).trim()
+    log(`OpenCode ${ver || "installed"}`)
+  } catch {
+    log("OpenCode not detected. Install it first: https://github.com/opencode-ai/opencode", false)
+  }
+  title("Installation complete")
+  console.log("  Restart OpenCode to start using it.")
+  console.log("  Try pasting an image:")
+  console.log('    [image] "What is this?"')
+}
+async function doUninstall() {
+  title("opencode-vision uninstall")
+  let removed = 0
+  for (const [, rel] of FILES) {
+    const dst = path.join(DST, rel)
+    if (!fs.existsSync(dst)) {
+      log(`Not installed: ${rel}`)
+      continue
+    }
+    fs.unlinkSync(dst)
+    log(`Removed ${rel}`)
+    removed++
+    // Remove the directory as well if it is now empty
+    const dir = path.dirname(dst)
+    if (fs.existsSync(dir) && fs.readdirSync(dir).length === 0) {
+      fs.rmdirSync(dir)
+      log(`Removed empty directory ${path.relative(os.homedir(), dir)}`)
+    }
+  }
+  title("Uninstall complete")
+  if (removed > 0) {
+    console.log("  Removed the opencode-vision files.")
+    console.log("  Restart OpenCode for the change to take effect.")
+  } else {
+    console.log("  No installed files were found.")
+  }
+}
+async function main() {
+  const isUninstall = process.argv.includes("--uninstall") || process.argv.includes("uninstall")
+  if (isUninstall) {
+    await doUninstall()
+  } else {
+    await doInstall()
+  }
+  console.log()
+}
+main().catch((err) => {
+  console.error("\x1b[31mOperation failed:\x1b[0m", err.message)
+  process.exit(1)
+})
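
The installer prints the first six characters of `VISION_API_KEY` followed by `****`, so a key of six characters or fewer would be shown in full. A stricter mask could look like the sketch below. `maskSecret` and its `visible` parameter are hypothetical and not part of the package.

```typescript
// Hypothetical helper (not in the package): masks a secret for logging.
// Shows at most `visible` leading characters, and never more than half of the
// value, so that short secrets are not printed in full.
function maskSecret(value: string, visible = 6): string {
  const shown = Math.min(visible, Math.floor(value.length / 2))
  return value.slice(0, shown) + "****"
}

console.log(maskSecret("sk-your-api-key")) // sk-you****
console.log(maskSecret("abcd"))            // ab****
```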

package/package.json ADDED

@@ -0,0 +1,24 @@
+{
+  "name": "@jochenyang/opencode-vision",
+  "version": "1.0.0",
+  "description": "Vision plugin + tool for OpenCode — automatically handles pasted images for non-vision models",
+  "keywords": ["opencode", "vision", "image", "ai", "plugin", "tool"],
+  "homepage": "https://github.com/jochenyang/opencode-vision",
+  "license": "MIT",
+  "author": "Jochen Yang",
+  "bin": {
+    "opencode-vision": "./bin/install.js"
+  },
+  "files": [
+    "bin/",
+    "tools/",
+    "plugins/",
+    "README.md"
+  ],
+  "engines": {
+    "node": ">=18"
+  },
+  "publishConfig": {
+    "access": "public"
+  }
+}

package/plugins/vision-helper.ts ADDED

@@ -0,0 +1,65 @@
+import type { Plugin } from "@opencode-ai/plugin"
+import { tmpdir } from "os"
+import path from "path"
+const TMP_DIR = path.join(tmpdir(), "opencode-vision")
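+/**
+ * Just before messages are sent to the model, detect image attachments in user messages:
+ * 1. Save each image to the temporary directory
+ * 2. Inject a path hint before the user's text so that models without multimodal support call the vision tool
+ * 3. Replace the original image part so that unsupportedParts does not emit noisy ERROR text
+ */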
+export default (async () => {
+  await Bun.write(path.join(TMP_DIR, ".check"), "").catch(() => {})
+  return {
+    "experimental.chat.messages.transform": async (_input, output) => {
+      for (const msg of output.messages) {
+        if (msg.info.role !== "user") continue
+        // Find every image part and save it to disk
+        const saved: { index: number; filePath: string }[] = []
+        for (let i = 0; i < msg.parts.length; i++) {
+          const part = msg.parts[i]
+          if (part.type !== "file" || typeof part.mime !== "string" || !part.mime.startsWith("image/")) continue
+          const colon = part.url.indexOf(";base64,")
+          if (colon === -1) continue
+          const base64 = part.url.slice(colon + ";base64,".length)
+          if (!base64) continue
+          const ext = part.mime.split("/")[1] || "png"
+          const name = `pasted-${Date.now()}-${Math.random().toString(36).slice(2, 8)}.${ext}`
+          const filePath = path.join(TMP_DIR, name)
+          await Bun.write(filePath, Buffer.from(base64, "base64"))
+          saved.push({ index: i, filePath })
+        }
+        if (saved.length === 0) continue
+        // Replace each original image part with a short text placeholder so unsupportedParts does not emit noisy ERROR text
+        // Iterate in reverse to avoid index shifts
+        for (const { index, filePath } of saved.toReversed()) {
+          msg.parts.splice(index, 1, {
+            type: "text",
+            text: `[vision: ${path.basename(filePath)}]`,
+          } as never)
+        }
+        // Build the path hint
+        const hintText = saved.length === 1
+          ? `[Image auto-saved to ${saved[0].filePath} — use the vision tool to read it]`
+          : `[Images auto-saved to:\n${saved.map((s) => `  ${s.filePath}`).join("\n")}\n— use the vision tool with paths=[...] to read them all at once]`
+        // Inject it before the user's text
+        const firstText = msg.parts.find((p) => p.type === "text" && !p.synthetic)
+        if (firstText && typeof firstText.text === "string") {
+          firstText.text = hintText + "\n" + firstText.text
+        }
+      }
+    },
+  }
+}) satisfies Plugin
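
Two core steps of the transform hook, pulling the base64 payload out of a data URL and swapping image parts for text placeholders, can be illustrated on plain arrays. This is a simplified sketch: the `Part` type is a stand-in for OpenCode's real message part types, and `extractBase64` and `replaceWithPlaceholders` are hypothetical names that do not appear in the plugin.

```typescript
// Simplified stand-in for OpenCode's message parts (an assumption, not the real type).
type Part =
  | { type: "file"; mime: string; url: string }
  | { type: "text"; text: string }

// Returns the payload after ";base64,", or null when the URL is not a
// base64 data URL or the payload is empty.
function extractBase64(url: string): string | null {
  const marker = ";base64,"
  const at = url.indexOf(marker)
  if (at === -1) return null
  const payload = url.slice(at + marker.length)
  return payload.length > 0 ? payload : null
}

// Replaces each image part with a "[vision: <name>]" text placeholder.
// Returns a new array; names are assigned to images in order of appearance.
function replaceWithPlaceholders(parts: Part[], names: string[]): Part[] {
  let next = 0
  return parts.map((part): Part => {
    const isImage = part.type === "file" && part.mime.startsWith("image/")
    if (!isImage || next >= names.length) return part
    return { type: "text", text: `[vision: ${names[next++]}]` }
  })
}

const parts: Part[] = [
  { type: "file", mime: "image/png", url: "data:image/png;base64,AAAA" },
  { type: "text", text: "What is this?" },
]
console.log(extractBase64("data:image/png;base64,AAAA")) // AAAA
console.log(replaceWithPlaceholders(parts, ["pasted-1.png"]))
```

Because each replacement swaps exactly one element for one, indices never shift, so the traversal order does not affect the result in either the sketch or a one-for-one `splice`.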

package/tools/vision.ts ADDED

@@ -0,0 +1,101 @@
+/// <reference path="../env.d.ts" />
+import { tool } from "@opencode-ai/plugin"
+import { tmpdir } from "os"
+import path from "path"
+const TMP_DIR = path.join(tmpdir(), "opencode-vision")
+export default tool({
+  description: `Reads one or more image files and returns a description of their contents.
+Use this when the user pastes images but the current model cannot view images directly.
+The image(s) will have been auto-saved with a path hint like "[Image auto-saved to ...]" in the conversation.
+For multiple images, use the "paths" parameter.
+Requires VISION_API_KEY, VISION_API_URL and VISION_MODEL environment variables.`,
+  args: {
+    paths: tool.schema
+      .array(tool.schema.string())
+      .describe("Absolute path(s) to the image file(s). Use this for one or multiple images.")
+      .optional(),
+    path: tool.schema
+      .string()
+      .describe("Deprecated: use 'paths' instead. Absolute path to a single image file.")
+      .optional(),
+    question: tool.schema
+      .string()
+      .describe("Optional specific question about the image(s)")
+      .optional(),
+  },
+  async execute(args) {
+    const allPaths: string[] = []
+    if (args.paths && args.paths.length > 0) {
+      allPaths.push(...args.paths)
+    } else if (args.path) {
+      allPaths.push(args.path)
+    }
+    if (allPaths.length === 0) return "Error: no image path provided"
+    // Resolve each path (try absolute first, then fallback to TMP_DIR)
+    const resolved: string[] = []
+    for (const p of allPaths) {
+      let file = Bun.file(p)
+      if (await file.exists()) {
+        resolved.push(p)
+        continue
+      }
+      const fallback = path.join(TMP_DIR, path.basename(p))
+      file = Bun.file(fallback)
+      if (await file.exists()) {
+        resolved.push(fallback)
+      }
+    }
+    if (resolved.length === 0) {
+      return `Error: none of the specified images were found (looked in: ${allPaths.join(", ")})`
+    }
+    const apiKey = process.env["VISION_API_KEY"]
+    const baseUrl = process.env["VISION_API_URL"]
+    const model = process.env["VISION_MODEL"]
+    if (!apiKey) return "Error: VISION_API_KEY not set"
+    if (!baseUrl) return "Error: VISION_API_URL not set"
+    if (!model) return "Error: VISION_MODEL not set"
+    const apiUrl = `${baseUrl.replace(/\/+$/, "")}/chat/completions`
+    const content: Record<string, unknown>[] = []
+    if (args.question) {
+      content.push({ type: "text", text: args.question })
+    } else if (resolved.length > 1) {
+      content.push({ type: "text", text: `Describe each of these ${resolved.length} images in detail, labeling which description corresponds to which file.` })
+    } else {
+      content.push({ type: "text", text: "Please describe this image in detail" })
+    }
+    for (const filePath of resolved) {
+      const file = Bun.file(filePath)
+      const mime = file.type || "image/png"
+      const buffer = await file.arrayBuffer()
+      const base64 = Buffer.from(buffer).toString("base64")
+      content.push({ type: "image_url", image_url: { url: `data:${mime};base64,${base64}` } })
+    }
+    const response = await fetch(apiUrl, {
+      method: "POST",
+      headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
+      body: JSON.stringify({
+        model,
+        messages: [{ role: "user", content }],
+        max_tokens: 4096,
+      }),
+    })
+    if (!response.ok) {
+      const text = await response.text()
+      return `Vision API error (${response.status}): ${text}`
+    }
+    const data = (await response.json()) as { choices: { message: { content: string } }[] }
+    return data.choices?.[0]?.message?.content ?? "No description returned."
+  },
+})
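
The request that the tool sends follows the OpenAI Chat Completions shape: one user message whose content is a text prompt followed by one `image_url` item per image. The sketch below isolates that payload construction and the response fallback as pure functions. `buildVisionBody`, `firstChoiceText`, `ContentItem`, and `EncodedImage` are hypothetical names introduced for illustration; the prompts, `max_tokens` value, and fallback string are taken from the tool.

```typescript
// Hypothetical helpers (not in the package) mirroring the request that
// tools/vision.ts sends to an OpenAI-compatible Chat Completions endpoint.
type ContentItem =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } }

type EncodedImage = { mime: string; base64: string }

function buildVisionBody(model: string, images: EncodedImage[], question?: string) {
  // Use the caller's question if given; otherwise pick the default prompt
  // for one or for several images.
  const prompt =
    question ||
    (images.length > 1
      ? `Describe each of these ${images.length} images in detail, labeling which description corresponds to which file.`
      : "Please describe this image in detail")
  const content: ContentItem[] = [{ type: "text", text: prompt }]
  for (const { mime, base64 } of images) {
    // Each image travels inline as a base64 data URL.
    content.push({ type: "image_url", image_url: { url: `data:${mime};base64,${base64}` } })
  }
  return { model, messages: [{ role: "user", content }], max_tokens: 4096 }
}

// Extracts the first choice's text, with the same fallback string as the tool.
function firstChoiceText(data: { choices?: { message?: { content?: string } }[] }): string {
  return data.choices?.[0]?.message?.content ?? "No description returned."
}

const body = buildVisionBody("your-vision-model", [{ mime: "image/png", base64: "AAAA" }])
console.log(JSON.stringify(body, null, 2))
console.log(firstChoiceText({ choices: [] }))
```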