opencode-see-image 0.1.0
- package/LICENSE +21 -0
- package/README.md +107 -0
- package/bun.lock +73 -0
- package/index.ts +240 -0
- package/package.json +27 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,107 @@
+# opencode-see-image
+
+Give non-vision opencode models the ability to **see images and screenshots** by routing them to a vision-capable model.
+
+When a user attaches a screenshot to a text-only model (e.g. GLM-5.2, DeepSeek, Kimi), opencode rejects it with an error. This plugin intercepts that flow: it registers a `see_image` tool that sends the image to a vision model (MiniMax M3 by default) and returns a textual description the primary model can reason about.
+
+## Install
+
+Add the plugin to your opencode config:
+
+```jsonc
+// ~/.config/opencode/opencode.jsonc
+{
+  "$schema": "https://opencode.ai/config.json",
+  "plugin": ["opencode-see-image"]
+}
+```
+
+Then **restart opencode**.
+
+That's it — the plugin self-contains both the tool and the triggering instructions (injected into the system prompt). No separate skill file needed.
+
+## Prerequisites
+
+You need a connected vision-capable provider. The defaults assume **opencode-go** with **MiniMax M3**:
+
+1. Run `/connect` in opencode
+2. Select **opencode-go**
+3. Paste your API key from [opencode.ai/auth](https://opencode.ai/auth)
+
+If you already have opencode-go connected, you're done.
+
+## How it works
+
+```
+user attaches screenshot
+│
+▼
+opencode rejects it: 'this model does not support image input'
+│ (the model only sees the filename, no pixels)
+▼
+plugin's system-prompt instructions tell the model to call see_image
+│
+▼
+see_image tool:
+  1. locates the file (macOS screenshot temp dirs, ~/Desktop, ~/Downloads, cwd)
+  2. base64-encodes it
+  3. sends it to the vision model via the Anthropic Messages API
+  4. returns the textual description
+│
+▼
+primary model answers using the description
+```
+
+## Configuration
+
+All settings are env-var overrides. Defaults work out-of-the-box for opencode-go + MiniMax M3.
+
+| Env var | Default | Description |
+|---|---|---|
+| `SEE_IMAGE_MODEL` | `minimax-m3` | Vision model ID to call |
+| `SEE_IMAGE_PROVIDER` | `opencode-go` | Provider key in opencode's `auth.json` |
+| `SEE_IMAGE_ENDPOINT` | `https://opencode.ai/zen/go/v1/messages` | Anthropic-Messages-compatible endpoint |
+| `SEE_IMAGE_API_KEY` | _(reads auth.json)_ | Bypass auth.json with an explicit key |
+| `SEE_IMAGE_API_VERSION` | `2023-06-01` | `anthropic-version` header value |
+| `SEE_IMAGE_USER_AGENT` | _(Chrome UA)_ | Override the User-Agent header |
+
+### Using a different vision model
+
+Any Anthropic-Messages-compatible endpoint works. For example, to use a direct MiniMax key:
+
+```bash
+export SEE_IMAGE_ENDPOINT="https://api.minimax.io/v1/messages"
+export SEE_IMAGE_MODEL="minimax-m3"
+export SEE_IMAGE_API_KEY="your-minimax-key"
+```
+
+To use a different opencode-go model (e.g. Kimi K2.7):
+
+```bash
+export SEE_IMAGE_MODEL="kimi-k2.7-code"
+```
+
+### Verified vision-capable models on opencode-go
+
+| Model | Speed | Notes |
+|---|---|---|
+| `minimax-m3` | ~3s | Default. Fast, clean text output. |
+| `kimi-k2.7-code` | ~7s | Clean output, accurate. |
+| `kimi-k2.6` | ~20s | Accurate but slow. |
+| `qwen3.7-plus` | ~20s | Emits thinking blocks (handled). |
+
+## File search locations
+
+When opencode rejects an image attachment, the model only receives a bare filename. `see_image` searches these locations in order:
+
+1. `$TMPDIR/TemporaryItems/NSIRD_screencaptureui_*/` — where macOS stashes dragged screenshots
+2. `$TMPDIR/TemporaryItems/`
+3. `~/Desktop` — default screenshot save location
+4. `~/Downloads`
+5. Current working directory
+
+Pass an absolute `filePath` to skip the search.
+
+## License
+
+MIT
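Reviewer note: the lookup order listed under "File search locations" can be sketched as a pure function that builds the ordered candidate list without touching the filesystem. This is a minimal sketch and not the package's code: `candidatePaths` and its parameters are hypothetical names, and the `NSIRD_screencaptureui_*` directories are passed in rather than discovered. It assumes POSIX-style paths. Note that `resolveFilePath` in `package/index.ts` also tries the name resolved against the working directory before the listed locations, so the sketch puts that step first.

```typescript
import { posix as path } from "node:path"

// Hypothetical helper (not part of the package): returns the paths that a
// lookup would try, in order. `nsirdDirs` stands in for the
// NSIRD_screencaptureui_* directories found under <tmpdir>/TemporaryItems.
function candidatePaths(
  name: string,
  cwd: string,
  home: string,
  tmpdir: string,
  nsirdDirs: string[] = [],
): string[] {
  // Simplification for this sketch: an absolute path is the only candidate.
  if (path.isAbsolute(name)) return [name]

  const tempItems = path.join(tmpdir, "TemporaryItems")
  const dirs = [
    cwd, // the name resolved against the working directory is tried first
    ...nsirdDirs, // macOS screenshot temp dirs
    tempItems,
    path.join(home, "Desktop"),
    path.join(home, "Downloads"),
  ]
  return dirs.map((dir) => path.join(dir, name))
}

const tried = candidatePaths("shot.png", "/work", "/home/u", "/tmp", [
  "/tmp/TemporaryItems/NSIRD_screencaptureui_abc",
])
console.log(tried.join("\n"))
```

Keeping the ordering separate from the existence checks makes the precedence easy to inspect, and the caller can probe each path in turn.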
package/bun.lock
ADDED
@@ -0,0 +1,73 @@
+{
+  "lockfileVersion": 1,
+  "configVersion": 1,
+  "workspaces": {
+    "": {
+      "name": "opencode-see-image",
+      "dependencies": {
+        "@opencode-ai/plugin": "^1.15.0",
+      },
+    },
+  },
+  "packages": {
+    "@msgpackr-extract/msgpackr-extract-darwin-arm64": ["@msgpackr-extract/msgpackr-extract-darwin-arm64@3.0.4", "", { "os": "darwin", "cpu": "arm64" }, "sha512-LCkGo6JDfaBhgST7UpPWgNgLINpcpabaHfyz5OBx75nUYxBsaEPxjnyNjWpeb/xBup/682QnBfRBy2/LvPutZQ=="],
+
+    "@msgpackr-extract/msgpackr-extract-darwin-x64": ["@msgpackr-extract/msgpackr-extract-darwin-x64@3.0.4", "", { "os": "darwin", "cpu": "x64" }, "sha512-zExlW9zUJKZH/tOtVMttwjKa4Xm/3KcNjnE3dPN92uCktwavMxpgCA3MoJK/DOnTWsQgo224OaST27/mPNAf+w=="],
+
+    "@msgpackr-extract/msgpackr-extract-linux-arm": ["@msgpackr-extract/msgpackr-extract-linux-arm@3.0.4", "", { "os": "linux", "cpu": "arm" }, "sha512-Tg3yX65f5GbtXLkrYEHE5oibZG9epyYWas7FogTTEJeDEF9JlXJzKgXaNhT3UXlTOeA+AfZpYZYZ0uPj7Cfquw=="],
+
+    "@msgpackr-extract/msgpackr-extract-linux-arm64": ["@msgpackr-extract/msgpackr-extract-linux-arm64@3.0.4", "", { "os": "linux", "cpu": "arm64" }, "sha512-dgX0P/9wGPJeHFBG+ZmhgE6bmtMt7NP5CRBGyyktpopdk/mW4POnrpQsSLtKI1dwpc+pPLuXHDh6vvskyQE/sw=="],
+
+    "@msgpackr-extract/msgpackr-extract-linux-x64": ["@msgpackr-extract/msgpackr-extract-linux-x64@3.0.4", "", { "os": "linux", "cpu": "x64" }, "sha512-8TNXMEjJc3QEy7R/x1INhgiU+XakDAFUzBhaz7+Rbrs8NH5UQeHQxxmzsSBJGyV6I1jW79undiQm8tOI+D+8FQ=="],
+
+    "@msgpackr-extract/msgpackr-extract-win32-x64": ["@msgpackr-extract/msgpackr-extract-win32-x64@3.0.4", "", { "os": "win32", "cpu": "x64" }, "sha512-CmCXPQrkbwExx3j946/PtHWHbYJiCRBRDl4BlkRQcJB/YOwQxJRTpoo7aTsortjgoJ1x7opzTSxn7C+ASSLVjQ=="],
+
+    "@opencode-ai/plugin": ["@opencode-ai/plugin@1.17.8", "", { "dependencies": { "@opencode-ai/sdk": "1.17.8", "effect": "4.0.0-beta.74", "zod": "4.1.8" }, "peerDependencies": { "@opentui/core": ">=0.3.4", "@opentui/keymap": ">=0.3.4", "@opentui/solid": ">=0.3.4" }, "optionalPeers": ["@opentui/core", "@opentui/keymap", "@opentui/solid"] }, "sha512-pkmnYQz5d+xf0h6fAjgplSSJKLqgYKOXr+x6y40GRPdW+/IfndFkMGq7CDsG2SieGD84qv4zYDMyolGo06IMpw=="],
+
+    "@opencode-ai/sdk": ["@opencode-ai/sdk@1.17.8", "", { "dependencies": { "cross-spawn": "7.0.6" } }, "sha512-6MKmsj2ujZyL44jy+12dpwWYDYKPS9fUr+0wVQxaIlPYQ/eAt8T8T3QrybplJ5ZtHfZUX+esXZ02x2UYYm7oEw=="],
+
+    "@standard-schema/spec": ["@standard-schema/spec@1.1.0", "", {}, "sha512-l2aFy5jALhniG5HgqrD6jXLi/rUWrKvqN/qJx6yoJsgKhblVd+iqqU4RCXavm/jPityDo5TCvKMnpjKnOriy0w=="],
+
+    "cross-spawn": ["cross-spawn@7.0.6", "", { "dependencies": { "path-key": "^3.1.0", "shebang-command": "^2.0.0", "which": "^2.0.1" } }, "sha512-uV2QOWP2nWzsy2aMp8aRibhi9dlzF5Hgh5SHaB9OiTGEyDTiJJyx0uy51QXdyWbtAHNua4XJzUKca3OzKUd3vA=="],
+
+    "detect-libc": ["detect-libc@2.1.2", "", {}, "sha512-Btj2BOOO83o3WyH59e8MgXsxEQVcarkUOpEYrubB0urwnN10yQ364rsiByU11nZlqWYZm05i/of7io4mzihBtQ=="],
+
+    "effect": ["effect@4.0.0-beta.74", "", { "dependencies": { "@standard-schema/spec": "^1.1.0", "fast-check": "^4.8.0", "find-my-way-ts": "^0.1.6", "ini": "^7.0.0", "kubernetes-types": "^1.30.0", "msgpackr": "^2.0.1", "multipasta": "^0.2.7", "toml": "^4.1.1", "uuid": "^14.0.0", "yaml": "^2.9.0" } }, "sha512-Yx+Kh12U+i2FmjwEfKs+ePFmpMd43RPD1oGqc/VraSS9bYzvF0Ff3PojwEFEVEewp8xc92Uxu28gTspU4qyvHA=="],
+
+    "fast-check": ["fast-check@4.8.0", "", { "dependencies": { "pure-rand": "^8.0.0" } }, "sha512-GOJ158CUMnN6cSahsv4+ExARvIDuzzinFjkp0E9WtiBa5zcVeLozVkWaE4IzFcc+Y48Wp1EDlUZsXRyAztQcSg=="],
+
+    "find-my-way-ts": ["find-my-way-ts@0.1.6", "", {}, "sha512-a85L9ZoXtNAey3Y6Z+eBWW658kO/MwR7zIafkIUPUMf3isZG0NCs2pjW2wtjxAKuJPxMAsHUIP4ZPGv0o5gyTA=="],
+
+    "ini": ["ini@7.0.0", "", {}, "sha512-ifK0CgjALofS5bkrcTy4RaQ9Vx2Knf/eLeIO+NaswQEpH1UblrtTSCIvN71qQDMq0PeQ/SSPojvEJp9vvvfr+w=="],
+
+    "isexe": ["isexe@2.0.0", "", {}, "sha512-RHxMLp9lnKHGHRng9QFhRCMbYAcVpn69smSGcq3f36xjgVVWThj4qqLbTLlq7Ssj8B+fIQ1EuCEGI2lKsyQeIw=="],
+
+    "kubernetes-types": ["kubernetes-types@1.30.0", "", {}, "sha512-Dew1okvhM/SQcIa2rcgujNndZwU8VnSapDgdxlYoB84ZlpAD43U6KLAFqYo17ykSFGHNPrg0qry0bP+GJd9v7Q=="],
+
+    "msgpackr": ["msgpackr@2.0.4", "", { "optionalDependencies": { "msgpackr-extract": "^3.0.4" } }, "sha512-o1C5KRmuRt+apqMr1HuGSqWStZoRBUpEsCsl15uM9VdAF1qHLtvMOU2En747EnTyEl6c4pzPewRMFF31s1CNbA=="],
+
+    "msgpackr-extract": ["msgpackr-extract@3.0.4", "", { "dependencies": { "node-gyp-build-optional-packages": "5.2.2" }, "optionalDependencies": { "@msgpackr-extract/msgpackr-extract-darwin-arm64": "3.0.4", "@msgpackr-extract/msgpackr-extract-darwin-x64": "3.0.4", "@msgpackr-extract/msgpackr-extract-linux-arm": "3.0.4", "@msgpackr-extract/msgpackr-extract-linux-arm64": "3.0.4", "@msgpackr-extract/msgpackr-extract-linux-x64": "3.0.4", "@msgpackr-extract/msgpackr-extract-win32-x64": "3.0.4" }, "bin": { "download-msgpackr-prebuilds": "bin/download-prebuilds.js" } }, "sha512-4kmO/MdyUIkLIvTPr8VHLil4AtoKIoniWPIEk5+CDy0xnWC84azhSFmuJ7PxZdsYtiP5kEeQsORAVIeMgxT+Hw=="],
+
+    "multipasta": ["multipasta@0.2.7", "", {}, "sha512-KPA58d68KgGil15oDqXjkUBEBYc00XvbPj5/X+dyzeo/lWm9Nc25pQRlf1D+gv4OpK7NM0J1odrbu9JNNGvynA=="],
+
+    "node-gyp-build-optional-packages": ["node-gyp-build-optional-packages@5.2.2", "", { "dependencies": { "detect-libc": "^2.0.1" }, "bin": { "node-gyp-build-optional-packages": "bin.js", "node-gyp-build-optional-packages-optional": "optional.js", "node-gyp-build-optional-packages-test": "build-test.js" } }, "sha512-s+w+rBWnpTMwSFbaE0UXsRlg7hU4FjekKU4eyAih5T8nJuNZT1nNsskXpxmeqSK9UzkBl6UgRlnKc8hz8IEqOw=="],
+
+    "path-key": ["path-key@3.1.1", "", {}, "sha512-ojmeN0qd+y0jszEtoY48r0Peq5dwMEkIlCOu6Q5f41lfkswXuKtYrhgoTpLnyIcHm24Uhqx+5Tqm2InSwLhE6Q=="],
+
+    "pure-rand": ["pure-rand@8.4.0", "", {}, "sha512-IoM8YF/jY0hiugFo/wOWqfmarlE6J0wc6fDK1PhftMk7MGhVZl88sZimmqBBFomLOCSmcCCpsfj7wXASCpvK9A=="],
+
+    "shebang-command": ["shebang-command@2.0.0", "", { "dependencies": { "shebang-regex": "^3.0.0" } }, "sha512-kHxr2zZpYtdmrN1qDjrrX/Z1rR1kG8Dx+gkpK1G4eXmvXswmcE1hTWBWYUzlraYw1/yZp6YuDY77YtvbN0dmDA=="],
+
+    "shebang-regex": ["shebang-regex@3.0.0", "", {}, "sha512-7++dFhtcx3353uBaq8DDR4NuxBetBzC7ZQOhmTQInHEd6bSrXdiEyzCvG07Z44UYdLShWUyXt5M/yhz8ekcb1A=="],
+
+    "toml": ["toml@4.1.1", "", {}, "sha512-EBJnVBr3dTXdA89WVFoAIPUqkBjxPMwRqsfuo1r240tKFHXv3zgca4+NJib/h6TyvGF7vOawz0jGuryJCdNHrw=="],
+
+    "uuid": ["uuid@14.0.0", "", { "bin": { "uuid": "dist-node/bin/uuid" } }, "sha512-Qo+uWgilfSmAhXCMav1uYFynlQO7fMFiMVZsQqZRMIXp0O7rR7qjkj+cPvBHLgBqi960QCoo/PH2/6ZtVqKvrg=="],
+
+    "which": ["which@2.0.2", "", { "dependencies": { "isexe": "^2.0.0" }, "bin": { "node-which": "./bin/node-which" } }, "sha512-BLI3Tl1TW3Pvl70l3yq3Y64i+awpwXqsGBYWkkqMtnbXgrMD+yj7rhW0kuEDxzJaYXGjEW5ogapKNMEKNMjibA=="],
+
+    "yaml": ["yaml@2.9.0", "", { "bin": { "yaml": "bin.mjs" } }, "sha512-2AvhNX3mb8zd6Zy7INTtSpl1F15HW6Wnqj0srWlkKLcpYl/gMIMJiyuGq2KeI2YFxUPjdlB+3Lc10seMLtL4cA=="],
+
+    "zod": ["zod@4.1.8", "", {}, "sha512-5R1P+WwQqmmMIEACyzSvo4JXHY5WiAFHRMg+zBZKgKS+Q1viRa0C1hmUKtHltoIFKtIdki3pRxkmpP74jnNYHQ=="],
+  }
+}
package/index.ts
ADDED
@@ -0,0 +1,240 @@
+import { tool } from "@opencode-ai/plugin"
+import path from "path"
+import os from "os"
+import fs from "fs"
+import type { Plugin } from "@opencode-ai/plugin"
+
+// ─── Configuration (env-overridable) ────────────────────────────────────────
+// Defaults target opencode-go's MiniMax M3. Users on other providers can
+// override via environment variables without editing this file.
+
+const ENDPOINT =
+  process.env.SEE_IMAGE_ENDPOINT ||
+  "https://opencode.ai/zen/go/v1/messages"
+const MODEL = process.env.SEE_IMAGE_MODEL || "minimax-m3"
+const PROVIDER_ID = process.env.SEE_IMAGE_PROVIDER || "opencode-go"
+const API_VERSION = process.env.SEE_IMAGE_API_VERSION || "2023-06-01"
+const USER_AGENT =
+  process.env.SEE_IMAGE_USER_AGENT ||
+  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0 Safari/537.36"
+
+const EXT_MEDIA: Record<string, string> = {
+  png: "image/png",
+  jpg: "image/jpeg",
+  jpeg: "image/jpeg",
+  gif: "image/gif",
+  webp: "image/webp",
+  bmp: "image/bmp",
+}
+
+// ─── Auth ───────────────────────────────────────────────────────────────────
+function readApiKey(): string {
+  // 1. Explicit env var wins.
+  if (process.env.SEE_IMAGE_API_KEY) return process.env.SEE_IMAGE_API_KEY
+
+  // 2. Read from opencode's auth store (~/.local/share/opencode/auth.json).
+  const authPath = path.join(os.homedir(), ".local/share/opencode/auth.json")
+  try {
+    const auth = JSON.parse(fs.readFileSync(authPath, "utf8"))
+    const entry = auth[PROVIDER_ID]
+    if (entry && (entry.key || entry.access)) {
+      return entry.key || entry.access
+    }
+  } catch {
+    // fall through to error
+  }
+
+  throw new Error(
+    `see_image: no API key. Either run /connect for "${PROVIDER_ID}" in opencode, ` +
+      `or set SEE_IMAGE_API_KEY, or set SEE_IMAGE_PROVIDER to a connected provider ID. ` +
+      `(Looked in ${authPath} for key "${PROVIDER_ID}".)`,
+  )
+}
+
+// ─── File resolution ────────────────────────────────────────────────────────
+// When opencode rejects an image attachment, the model only sees a bare
+// filename (no path). This resolves bare filenames by searching the places
+// macOS / opencode tend to stash screenshots.
+function resolveFilePath(name: string, cwd: string): string {
+  if (path.isAbsolute(name) && fs.existsSync(name)) return name
+
+  const resolved = path.resolve(cwd, name)
+  if (fs.existsSync(resolved)) return resolved
+
+  const tmpdir = process.env.TMPDIR || "/tmp"
+  const searchDirs: string[] = []
+
+  // macOS screenshot tool temp dirs (NSIRD_screencaptureui_<rand>) — this is
+  // where dragged screenshots actually land, not ~/Desktop.
+  const tempItems = path.join(tmpdir, "TemporaryItems")
+  if (fs.existsSync(tempItems)) {
+    try {
+      for (const sub of fs.readdirSync(tempItems, { withFileTypes: true })) {
+        if (sub.isDirectory() && sub.name.startsWith("NSIRD_screencaptureui")) {
+          searchDirs.push(path.join(tempItems, sub.name))
+        }
+      }
+    } catch {}
+  }
+  searchDirs.push(tempItems)
+  searchDirs.push(path.join(os.homedir(), "Desktop"))
+  searchDirs.push(path.join(os.homedir(), "Downloads"))
+  searchDirs.push(cwd)
+
+  for (const dir of searchDirs) {
+    if (!dir) continue
+    try {
+      const full = path.join(dir, name)
+      if (fs.existsSync(full)) return full
+    } catch {}
+  }
+
+  // Shallow recursive search in the top-level search dirs.
+  for (const dir of searchDirs) {
+    if (!dir || !fs.existsSync(dir)) continue
+    try {
+      for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
+        if (entry.name === name) return path.join(dir, name)
+      }
+    } catch {}
+  }
+
+  const searched = searchDirs.filter(Boolean).join(", ")
+  throw new Error(
+    `see_image: could not find "${name}". Searched: ${searched}. ` +
+      `Pass an absolute filePath instead.`,
+  )
+}
+
+// ─── Tool definition ────────────────────────────────────────────────────────
+const seeImageTool = tool({
+  description:
+    'See an image/screenshot that the current model cannot view. Use when the user attaches an image and you get a "this model does not support image input" / "Cannot read" error, or when a screenshot/image is referenced ("see this", "can you see", .png/.jpg). Routes the image to a vision-capable model and returns a detailed textual description you can reason about as if you saw it. Pass filePath as an absolute path OR a bare filename (auto-located in macOS screenshot temp dirs, ~/Desktop, ~/Downloads, cwd).',
+  args: {
+    filePath: tool.schema
+      .string()
+      .describe(
+        'Path to the image. Absolute path, or a bare filename like "Screenshot 2026-06-18 at 17.32.24.png" to auto-locate.',
+      ),
+    question: tool.schema
+      .string()
+      .optional()
+      .describe(
+        "Optional specific question about the image. Defaults to a general detailed description.",
+      ),
+  },
+  async execute(args, context) {
+    const fullPath = resolveFilePath(args.filePath, context.directory)
+    const ext = path.extname(fullPath).slice(1).toLowerCase()
+    const mediaType = EXT_MEDIA[ext] || "image/png"
+
+    const buf = fs.readFileSync(fullPath)
+    const b64 = Buffer.from(buf).toString("base64")
+
+    const prompt =
+      args.question && args.question.trim().length > 0
+        ? args.question
+        : "Describe this image in detail. If it is a screenshot, describe the UI, text content, and layout precisely. This description will be used by another model to answer the user, so be thorough and accurate."
+
+    const body = {
+      model: MODEL,
+      max_tokens: 2048,
+      messages: [
+        {
+          role: "user",
+          content: [
+            {
+              type: "image",
+              source: { type: "base64", media_type: mediaType, data: b64 },
+            },
+            { type: "text", text: prompt },
+          ],
+        },
+      ],
+    }
+
+    const key = readApiKey()
+    const res = await fetch(ENDPOINT, {
+      method: "POST",
+      headers: {
+        "x-api-key": key,
+        "anthropic-version": API_VERSION,
+        "content-type": "application/json",
+        "user-agent": USER_AGENT,
+      },
+      body: JSON.stringify(body),
+    })
+
+    if (!res.ok) {
+      const errText = await res.text()
+      throw new Error(
+        `see_image: vision call to "${MODEL}" failed: HTTP ${res.status} — ${errText.slice(0, 300)}`,
+      )
+    }
+
+    const data: any = await res.json()
+    // Join all text blocks, skipping thinking/signature blocks (some models
+    // like qwen/minimax-m2.7 emit reasoning before the answer).
+    const text = data?.content
+      ?.map((c: any) => c.text)
+      .filter((t: any) => typeof t === "string" && t.length > 0)
+      .join("\n")
+      .trim()
+
+    if (!text) {
+      throw new Error(
+        `see_image: model "${MODEL}" returned no text. Response: ${JSON.stringify(data).slice(0, 300)}`,
+      )
+    }
+
+    context.metadata({
+      title: `see_image: ${path.basename(fullPath)}`,
+      metadata: { model: MODEL, provider: PROVIDER_ID, file: fullPath },
+    })
+
+    return text
+  },
+})
+
+// ─── System prompt injection (the "skill") ──────────────────────────────────
+// Injected via experimental.chat.system.transform so the triggering logic
+// ships with the plugin — no separate SKILL.md install needed.
+const SYSTEM_INSTRUCTIONS = `# See Image (vision bridge) — opencode-see-image plugin
+
+You have access to a \`see_image\` tool. The current model may not support image input directly. When a user attaches a screenshot or image, opencode rejects it and you only receive an error string containing the **filename** — no path, no pixels. Use \`see_image\` to actually view it.
+
+## When to use \`see_image\`
+
+Use ONLY when one of these is true:
+1. You receive an error like: \`Cannot read "Screenshot ....png" (this model does not support image input)\`
+2. The user references an image/screenshot they expect you to see ("see this", "look at this", "can you see this", ".png"/".jpg")
+3. The user pastes an image path they want you to inspect
+
+Do NOT use \`see_image\` for reading text files — use the \`read\` tool for those.
+
+## How to use it
+
+1. **Extract the filename** from the error string (the quoted name), or use the path the user gave.
+2. **Call \`see_image\`** with \`filePath\` set to the bare filename (it auto-locates) or an absolute path. Pass an optional \`question\` if the user asked something specific.
+3. **Answer using the returned description** as if you saw the image. Be natural — don't mention that you used another model unless asked.
+
+## Important
+
+- Never guess or confabulate image contents from the filename or surrounding text. If you have not called \`see_image\`, you have NOT seen the image.
+- If the tool cannot find the file, tell the user the filename and ask for a full path or to drag the file into the project directory.
+- To inspect a specific detail, pass a targeted \`question\` (e.g. "What error is shown in the terminal?").`
+
+// ─── Plugin export ──────────────────────────────────────────────────────────
+const SeeImagePlugin: Plugin = async (ctx) => {
+  return {
+    tool: {
+      see_image: seeImageTool,
+    },
+    "experimental.chat.system.transform": async (_input, output) => {
+      output.system.push(SYSTEM_INSTRUCTIONS)
+    },
+  }
+}
+
+export default SeeImagePlugin
+export { SeeImagePlugin }
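Reviewer note: the response handling in `execute` joins every content block that carries a non-empty string `text` and drops the rest, which is how reasoning blocks fall out of the result. The snippet below is a minimal standalone sketch of that step, not the package's code: `extractText`, `ContentBlock` and the sample payload are hypothetical, and the block shapes are assumed to follow the Anthropic Messages response format.

```typescript
// Assumed minimal shape of one content block in a Messages-style response.
// Only the optional `text` field matters here; other fields are ignored.
type ContentBlock = { type: string; text?: unknown; [key: string]: unknown }

// Hypothetical helper: joins the text of all blocks that have a non-empty
// string `text`. Blocks without one (for example thinking blocks) drop out.
function extractText(content: ContentBlock[] | undefined): string {
  if (!Array.isArray(content)) return ""
  return content
    .map((block) => block.text)
    .filter((t): t is string => typeof t === "string" && t.length > 0)
    .join("\n")
    .trim()
}

// Illustrative payload only; it is not captured from a real API response.
const sample: ContentBlock[] = [
  { type: "thinking", thinking: "work out what the image shows" }, // no `text`, skipped
  { type: "text", text: "A terminal window." },
  { type: "text", text: "" }, // empty text, skipped
  { type: "text", text: "It shows a stack trace." },
]
console.log(extractText(sample))
```

An empty result is the case where the tool throws its "returned no text" error, so a caller of this sketch would check for `""` before using the value.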
package/package.json
ADDED
@@ -0,0 +1,27 @@
+{
+  "name": "opencode-see-image",
+  "version": "0.1.0",
+  "description": "Give non-vision opencode models the ability to see images/screenshots by routing them to a vision-capable model (MiniMax M3 via opencode-go by default).",
+  "type": "module",
+  "main": "index.ts",
+  "exports": {
+    ".": "./index.ts"
+  },
+  "scripts": {
+    "build": "bun build index.ts --target=bun --outdir dist",
+    "typecheck": "bun build index.ts --target=bun --outdir /tmp/see-image-check"
+  },
+  "keywords": [
+    "opencode",
+    "opencode-plugin",
+    "vision",
+    "image",
+    "screenshot",
+    "minimax",
+    "glm"
+  ],
+  "license": "MIT",
+  "dependencies": {
+    "@opencode-ai/plugin": "^1.15.0"
+  }
+}