opencode-see-image 0.3.0 → 0.4.0
- package/README.md +18 -16
- package/index.ts +181 -191
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -32,25 +32,27 @@ Install the opencode-see-image plugin so I can send you screenshots. Do this:
|
|
|
32
32
|
After I restart and attach a screenshot, you should call the see_image tool to view it.
|
|
33
33
|
```
|
|
34
34
|
|
|
35
|
-
Then restart opencode.
|
|
36
|
-
|
|
37
35
|
## Prerequisites
|
|
38
36
|
|
|
39
|
-
You need a connected vision-capable provider. The plugin auto-detects whichever you have connected
|
|
37
|
+
You need a connected vision-capable provider. The plugin auto-detects whichever you have connected, **either of these works**:
|
|
40
38
|
|
|
41
|
-
###
|
|
39
|
+
### Free (OpenCode Zen)
|
|
42
40
|
1. Run `/connect` in opencode
|
|
43
41
|
2. Select **opencode** (OpenCode Zen)
|
|
44
42
|
3. Paste your API key from [opencode.ai/auth](https://opencode.ai/auth)
|
|
45
43
|
|
|
46
|
-
The plugin falls back to **big-pickle** (
|
|
44
|
+
The plugin falls back to **big-pickle** (~12000ms). No subscription needed.
|
|
47
45
|
|
|
48
|
-
###
|
|
46
|
+
### Paid, w/ OpenCode Go
|
|
49
47
|
1. Run `/connect` in opencode
|
|
50
48
|
2. Select **opencode-go**
|
|
51
49
|
3. Paste your API key from [opencode.ai/auth](https://opencode.ai/auth)
|
|
52
50
|
|
|
53
|
-
The plugin prefers **minimax-m3** via opencode-go (~
|
|
51
|
+
The plugin prefers **minimax-m3** via opencode-go (~3000ms) when available.
|
|
52
|
+
|
|
53
|
+
### Paid, w/ another provider
|
|
54
|
+
|
|
55
|
+
Set the `SEE_IMAGE_*` env vars to point at any Anthropic-Messages-compatible endpoint. See [Configuration](#configuration) below.
|
|
54
56
|
|
|
55
57
|
**Resolution order:** explicit `SEE_IMAGE_API_KEY` env → configured `SEE_IMAGE_PROVIDER` → `opencode-go` (MiniMax M3) → `opencode` (big-pickle, free).
|
|
56
58
|
|
|
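The resolution order stated in the README hunk above can be sketched as a pure function. This is an illustrative sketch only: `resolveSteps`, `Step`, `Env`, and the `connected` list are hypothetical names that do not appear in the package, and the requirement that a configured provider comes with a model is an assumption made here.

```typescript
// Hypothetical helper, not part of opencode-see-image.
type Env = Record<string, string | undefined>

// One step in the chain. "explicit-key" means the SEE_IMAGE_* env vars are used directly.
type Step =
  | { kind: "explicit-key" }
  | { kind: "provider"; providerID: string; modelID: string }

function resolveSteps(env: Env, connected: ReadonlyArray<string>): Step[] {
  // 1. An explicit SEE_IMAGE_API_KEY wins outright.
  if (env.SEE_IMAGE_API_KEY) return [{ kind: "explicit-key" }]

  const steps: Step[] = []

  // 2. A configured provider is tried next (assumption: it needs a model alongside it).
  if (env.SEE_IMAGE_PROVIDER && env.SEE_IMAGE_MODEL) {
    steps.push({
      kind: "provider",
      providerID: env.SEE_IMAGE_PROVIDER,
      modelID: env.SEE_IMAGE_MODEL,
    })
  }

  // 3. Paid default, then 4. free fallback, each kept only if that provider is connected.
  const defaults: Array<[string, string]> = [
    ["opencode-go", "minimax-m3"],
    ["opencode", "big-pickle"],
  ]
  for (const [providerID, modelID] of defaults) {
    if (connected.indexOf(providerID) !== -1) {
      steps.push({ kind: "provider", providerID, modelID })
    }
  }
  return steps
}
```

With only OpenCode Zen connected and no env vars set, the sketch yields a single step for `opencode` with `big-pickle`, which matches the free path described in the README.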
@@ -85,7 +87,7 @@ The plugin registers a `see_image` tool with two arguments:
|
|
|
85
87
|
| `filePath` | string | yes | Path to the image. Absolute path, or a bare filename like `"Screenshot 2026-06-18 at 17.32.24.png"` to auto-locate. |
|
|
86
88
|
| `question` | string | no | A specific question about the image. Defaults to a general detailed description. Use this to focus on a particular detail (e.g. `"What error is shown in the terminal?"`). |
|
|
87
89
|
|
|
88
|
-
Your model calls this tool automatically when you attach a screenshot
|
|
90
|
+
Your model calls this tool automatically when you attach a screenshot, you don't need to do anything special. The `question` arg is optional; the model uses it when you ask something specific about the image.
|
|
89
91
|
|
|
90
92
|
## Configuration
|
|
91
93
|
|
|
@@ -122,20 +124,20 @@ export SEE_IMAGE_MODEL="kimi-k2.7-code"
|
|
|
122
124
|
|
|
123
125
|
| Model | Speed | Notes |
|
|
124
126
|
|---|---|---|
|
|
125
|
-
| `big-pickle` | ~
|
|
127
|
+
| `big-pickle` | ~12000ms | Free. Accurate. Default fallback when only Zen is connected. |
|
|
126
128
|
|
|
127
129
|
**Paid (OpenCode Go):**
|
|
128
130
|
|
|
129
131
|
| Model | Speed | Notes |
|
|
130
132
|
|---|---|---|
|
|
131
|
-
| `minimax-m3` | ~
|
|
132
|
-
| `kimi-k2.7-code` | ~
|
|
133
|
-
| `kimi-k2.6` | ~
|
|
134
|
-
| `qwen3.7-plus` | ~
|
|
133
|
+
| `minimax-m3` | ~3000ms | Default. Fast, clean text output. |
|
|
134
|
+
| `kimi-k2.7-code` | ~7000ms | Clean output, accurate. |
|
|
135
|
+
| `kimi-k2.6` | ~20000ms | Accurate but slow. |
|
|
136
|
+
| `qwen3.7-plus` | ~20000ms | Emits thinking blocks (handled). |
|
|
135
137
|
|
|
136
138
|
## Updating
|
|
137
139
|
|
|
138
|
-
**Auto-update (built in):** the plugin checks npm for a newer version on every opencode startup. If one exists, it runs `bun update` automatically and shows a toast: *"opencode-see-image updated to X.Y.Z
|
|
140
|
+
**Auto-update (built in):** the plugin checks npm for a newer version on every opencode startup. If one exists, it runs `bun update` automatically and shows a toast: *"opencode-see-image updated to X.Y.Z, restart opencode to apply"*. You just need to restart opencode to load the new version. Nothing to configure.
|
|
139
141
|
|
|
140
142
|
**Manual update** (if you want to force it now):
|
|
141
143
|
```bash
|
|
@@ -152,9 +154,9 @@ Then restart opencode.
|
|
|
152
154
|
|
|
153
155
|
When opencode rejects an image attachment, the model only receives a bare filename. `see_image` searches these locations in order:
|
|
154
156
|
|
|
155
|
-
1. `$TMPDIR/TemporaryItems/NSIRD_screencaptureui_*/`
|
|
157
|
+
1. `$TMPDIR/TemporaryItems/NSIRD_screencaptureui_*/` (where macOS stashes dragged screenshots)
|
|
156
158
|
2. `$TMPDIR/TemporaryItems/`
|
|
157
|
-
3. `~/Desktop`
|
|
159
|
+
3. `~/Desktop` (default screenshot save location)
|
|
158
160
|
4. `~/Downloads`
|
|
159
161
|
5. Current working directory
|
|
160
162
|
|
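The five search locations listed in the README hunk above can be written as an ordered list builder. This is a minimal sketch, not the package's `resolveFilePath`: `searchDirs` and its parameters are hypothetical, and `tempItemNames` stands in for a directory listing of `$TMPDIR/TemporaryItems` so the function stays free of filesystem calls.

```typescript
// Hypothetical helper, not part of opencode-see-image.
// Returns the directories to search for a bare filename, in README order.
function searchDirs(
  tmpdir: string,
  home: string,
  cwd: string,
  tempItemNames: string[],
): string[] {
  // macOS usually sets TMPDIR with a trailing slash; strip it before joining.
  const tempItems = `${tmpdir.replace(/\/+$/, "")}/TemporaryItems`
  const dirs: string[] = []

  // 1. Per-capture screenshot directories (NSIRD_screencaptureui_*).
  for (const name of tempItemNames) {
    if (name.indexOf("NSIRD_screencaptureui_") === 0) {
      dirs.push(`${tempItems}/${name}`)
    }
  }

  // 2. TemporaryItems itself, 3. Desktop, 4. Downloads, 5. current working directory.
  dirs.push(tempItems, `${home}/Desktop`, `${home}/Downloads`, cwd)
  return dirs
}
```

A caller would then check each directory in turn for the filename and stop at the first hit, so a freshly dragged screenshot is found before an older file with the same name on the Desktop.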
package/index.ts
CHANGED
|
@@ -4,10 +4,6 @@ import os from "os"
|
|
|
4
4
|
import fs from "fs"
|
|
5
5
|
import type { Plugin } from "@opencode-ai/plugin"
|
|
6
6
|
|
|
7
|
-
// ─── Configuration (env-overridable) ────────────────────────────────────────
|
|
8
|
-
// Defaults target opencode-go's MiniMax M3. Users on other providers can
|
|
9
|
-
// override via environment variables without editing this file.
|
|
10
|
-
|
|
11
7
|
const ENDPOINT =
|
|
12
8
|
process.env.SEE_IMAGE_ENDPOINT ||
|
|
13
9
|
"https://opencode.ai/zen/go/v1/messages"
|
|
@@ -27,61 +23,6 @@ const EXT_MEDIA: Record<string, string> = {
|
|
|
27
23
|
bmp: "image/bmp",
|
|
28
24
|
}
|
|
29
25
|
|
|
30
|
-
// ─── Auth ───────────────────────────────────────────────────────────────────
|
|
31
|
-
// Resolves a usable API key + the endpoint + model to use. Falls back through
|
|
32
|
-
// a chain so users with any connected opencode subscription (paid opencode-go
|
|
33
|
-
// or free opencode Zen) get a working default with zero config.
|
|
34
|
-
function resolveAuth(): { key: string; endpoint: string; model: string } {
|
|
35
|
-
// 1. Explicit env vars win outright.
|
|
36
|
-
if (process.env.SEE_IMAGE_API_KEY) {
|
|
37
|
-
return { key: process.env.SEE_IMAGE_API_KEY, endpoint: ENDPOINT, model: MODEL }
|
|
38
|
-
}
|
|
39
|
-
|
|
40
|
-
// 2. Walk opencode's auth store and try the configured provider first,
|
|
41
|
-
// then a curated fallback chain (paid → free).
|
|
42
|
-
const authPath = path.join(os.homedir(), ".local/share/opencode/auth.json")
|
|
43
|
-
let auth: any = {}
|
|
44
|
-
try {
|
|
45
|
-
auth = JSON.parse(fs.readFileSync(authPath, "utf8"))
|
|
46
|
-
} catch {
|
|
47
|
-
// ignore — handled by the empty-auth path below
|
|
48
|
-
}
|
|
49
|
-
|
|
50
|
-
// Each candidate: [providerId, endpoint, defaultModel]
|
|
51
|
-
const candidates: Array<[string, string, string]> = [
|
|
52
|
-
[PROVIDER_ID, ENDPOINT, MODEL],
|
|
53
|
-
// Free fallback: OpenCode Zen's big-pickle supports vision at no cost.
|
|
54
|
-
["opencode", "https://opencode.ai/zen/v1/messages", "big-pickle"],
|
|
55
|
-
// Paid fallbacks on opencode-go:
|
|
56
|
-
["opencode-go", "https://opencode.ai/zen/go/v1/messages", "minimax-m3"],
|
|
57
|
-
]
|
|
58
|
-
|
|
59
|
-
const tried: string[] = []
|
|
60
|
-
for (const [pid, ep, mdl] of candidates) {
|
|
61
|
-
const entry = auth[pid]
|
|
62
|
-
const k = entry && (entry.key || entry.access)
|
|
63
|
-
if (k) {
|
|
64
|
-
// If the user pinned PROVIDER_ID but not MODEL, honor the candidate's
|
|
65
|
-
// default model only when provider matches the pinned one; otherwise
|
|
66
|
-
// keep the configured MODEL (may be set via env).
|
|
67
|
-
const useModel =
|
|
68
|
-
pid === PROVIDER_ID || !process.env.SEE_IMAGE_MODEL ? mdl : MODEL
|
|
69
|
-
return { key: k, endpoint: ep, model: useModel }
|
|
70
|
-
}
|
|
71
|
-
tried.push(pid)
|
|
72
|
-
}
|
|
73
|
-
|
|
74
|
-
throw new Error(
|
|
75
|
-
`see_image: no API key. Connect a provider in opencode via /connect — ` +
|
|
76
|
-
`either "opencode-go" (paid, fast MiniMax M3) or "opencode" (free, big-pickle). ` +
|
|
77
|
-
`Or set SEE_IMAGE_API_KEY explicitly. (Checked providers: ${tried.join(", ") || "none"} in ${authPath}.)`,
|
|
78
|
-
)
|
|
79
|
-
}
|
|
80
|
-
|
|
81
|
-
// ─── File resolution ────────────────────────────────────────────────────────
|
|
82
|
-
// When opencode rejects an image attachment, the model only sees a bare
|
|
83
|
-
// filename (no path). This resolves bare filenames by searching the places
|
|
84
|
-
// macOS / opencode tend to stash screenshots.
|
|
85
26
|
function resolveFilePath(name: string, cwd: string): string {
|
|
86
27
|
if (path.isAbsolute(name) && fs.existsSync(name)) return name
|
|
87
28
|
|
|
@@ -91,8 +32,6 @@ function resolveFilePath(name: string, cwd: string): string {
|
|
|
91
32
|
const tmpdir = process.env.TMPDIR || "/tmp"
|
|
92
33
|
const searchDirs: string[] = []
|
|
93
34
|
|
|
94
|
-
// macOS screenshot tool temp dirs (NSIRD_screencaptureui_<rand>) — this is
|
|
95
|
-
// where dragged screenshots actually land, not ~/Desktop.
|
|
96
35
|
const tempItems = path.join(tmpdir, "TemporaryItems")
|
|
97
36
|
if (fs.existsSync(tempItems)) {
|
|
98
37
|
try {
|
|
@@ -116,7 +55,6 @@ function resolveFilePath(name: string, cwd: string): string {
|
|
|
116
55
|
} catch {}
|
|
117
56
|
}
|
|
118
57
|
|
|
119
|
-
// Shallow recursive search in the top-level search dirs.
|
|
120
58
|
for (const dir of searchDirs) {
|
|
121
59
|
if (!dir || !fs.existsSync(dir)) continue
|
|
122
60
|
try {
|
|
@@ -133,103 +71,135 @@ function resolveFilePath(name: string, cwd: string): string {
|
|
|
133
71
|
)
|
|
134
72
|
}
|
|
135
73
|
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
140
|
-
|
|
141
|
-
|
|
142
|
-
|
|
143
|
-
|
|
144
|
-
|
|
145
|
-
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
|
|
156
|
-
|
|
157
|
-
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
|
|
161
|
-
|
|
162
|
-
|
|
163
|
-
|
|
164
|
-
:
|
|
165
|
-
|
|
166
|
-
|
|
167
|
-
|
|
168
|
-
|
|
169
|
-
model,
|
|
170
|
-
max_tokens: 2048,
|
|
171
|
-
messages: [
|
|
172
|
-
{
|
|
173
|
-
role: "user",
|
|
174
|
-
content: [
|
|
175
|
-
{
|
|
176
|
-
type: "image",
|
|
177
|
-
source: { type: "base64", media_type: mediaType, data: b64 },
|
|
178
|
-
},
|
|
74
|
+
async function seeImageViaSDK(
|
|
75
|
+
client: any,
|
|
76
|
+
dataUrl: string,
|
|
77
|
+
mediaType: string,
|
|
78
|
+
prompt: string,
|
|
79
|
+
): Promise<{ text: string; model: string; provider: string }> {
|
|
80
|
+
const envProvider = process.env.SEE_IMAGE_PROVIDER
|
|
81
|
+
const envModel = process.env.SEE_IMAGE_MODEL
|
|
82
|
+
const candidates: Array<{ providerID: string; modelID: string }> = []
|
|
83
|
+
if (envProvider && envModel) {
|
|
84
|
+
candidates.push({ providerID: envProvider, modelID: envModel })
|
|
85
|
+
}
|
|
86
|
+
candidates.push({ providerID: "opencode-go", modelID: "minimax-m3" })
|
|
87
|
+
candidates.push({ providerID: "opencode", modelID: "big-pickle" })
|
|
88
|
+
|
|
89
|
+
const errors: string[] = []
|
|
90
|
+
|
|
91
|
+
for (const { providerID, modelID } of candidates) {
|
|
92
|
+
let sessionID: string | undefined
|
|
93
|
+
try {
|
|
94
|
+
const sessionRes = await client.session.create({ body: {} })
|
|
95
|
+
sessionID = sessionRes.data?.id
|
|
96
|
+
if (!sessionID) {
|
|
97
|
+
errors.push(`${providerID}/${modelID}: no session ID`)
|
|
98
|
+
continue
|
|
99
|
+
}
|
|
100
|
+
|
|
101
|
+
const result = await client.session.prompt({
|
|
102
|
+
path: { id: sessionID },
|
|
103
|
+
body: {
|
|
104
|
+
model: { providerID, modelID },
|
|
105
|
+
parts: [
|
|
106
|
+
{ type: "file", mime: mediaType, url: dataUrl },
|
|
179
107
|
{ type: "text", text: prompt },
|
|
180
108
|
],
|
|
109
|
+
tools: {},
|
|
110
|
+
system:
|
|
111
|
+
"You are a vision assistant. Describe the image accurately and concisely. Answer with text only.",
|
|
181
112
|
},
|
|
182
|
-
|
|
183
|
-
}
|
|
113
|
+
})
|
|
184
114
|
|
|
185
|
-
|
|
186
|
-
|
|
187
|
-
|
|
188
|
-
|
|
189
|
-
"
|
|
190
|
-
"
|
|
191
|
-
|
|
192
|
-
},
|
|
193
|
-
body: JSON.stringify(body),
|
|
194
|
-
})
|
|
115
|
+
const parts = result.data?.parts ?? []
|
|
116
|
+
const text = (parts as any[])
|
|
117
|
+
.filter((p: any) => p.type === "text")
|
|
118
|
+
.map((p: any) => p.text)
|
|
119
|
+
.filter((t: any) => typeof t === "string" && t.length > 0)
|
|
120
|
+
.join("\n")
|
|
121
|
+
.trim()
|
|
195
122
|
|
|
196
|
-
|
|
197
|
-
|
|
198
|
-
|
|
199
|
-
|
|
200
|
-
|
|
123
|
+
if (text) {
|
|
124
|
+
return { text, model: modelID, provider: providerID }
|
|
125
|
+
}
|
|
126
|
+
errors.push(`${providerID}/${modelID}: no text in response`)
|
|
127
|
+
} catch (e: any) {
|
|
128
|
+
errors.push(`${providerID}/${modelID}: ${e?.message ?? e}`)
|
|
129
|
+
} finally {
|
|
130
|
+
if (sessionID) {
|
|
131
|
+
await client.session
|
|
132
|
+
.delete({ path: { id: sessionID } })
|
|
133
|
+
.catch(() => {})
|
|
134
|
+
}
|
|
201
135
|
}
|
|
136
|
+
}
|
|
202
137
|
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
|
|
206
|
-
|
|
207
|
-
?.map((c: any) => c.text)
|
|
208
|
-
.filter((t: any) => typeof t === "string" && t.length > 0)
|
|
209
|
-
.join("\n")
|
|
210
|
-
.trim()
|
|
211
|
-
|
|
212
|
-
if (!text) {
|
|
213
|
-
throw new Error(
|
|
214
|
-
`see_image: model "${model}" returned no text. Response: ${JSON.stringify(data).slice(0, 300)}`,
|
|
215
|
-
)
|
|
216
|
-
}
|
|
138
|
+
throw new Error(
|
|
139
|
+
`see_image: SDK vision call failed for all candidates. ${errors.join("; ")}`,
|
|
140
|
+
)
|
|
141
|
+
}
|
|
217
142
|
|
|
218
|
-
|
|
219
|
-
|
|
220
|
-
|
|
221
|
-
|
|
143
|
+
async function seeImageViaHTTP(
|
|
144
|
+
b64: string,
|
|
145
|
+
mediaType: string,
|
|
146
|
+
prompt: string,
|
|
147
|
+
): Promise<{ text: string; model: string; provider: string }> {
|
|
148
|
+
const key = process.env.SEE_IMAGE_API_KEY!
|
|
149
|
+
const body = {
|
|
150
|
+
model: MODEL,
|
|
151
|
+
max_tokens: 2048,
|
|
152
|
+
messages: [
|
|
153
|
+
{
|
|
154
|
+
role: "user",
|
|
155
|
+
content: [
|
|
156
|
+
{
|
|
157
|
+
type: "image",
|
|
158
|
+
source: { type: "base64", media_type: mediaType, data: b64 },
|
|
159
|
+
},
|
|
160
|
+
{ type: "text", text: prompt },
|
|
161
|
+
],
|
|
162
|
+
},
|
|
163
|
+
],
|
|
164
|
+
}
|
|
165
|
+
|
|
166
|
+
const res = await fetch(ENDPOINT, {
|
|
167
|
+
method: "POST",
|
|
168
|
+
headers: {
|
|
169
|
+
"x-api-key": key,
|
|
170
|
+
"anthropic-version": API_VERSION,
|
|
171
|
+
"content-type": "application/json",
|
|
172
|
+
"user-agent": USER_AGENT,
|
|
173
|
+
},
|
|
174
|
+
body: JSON.stringify(body),
|
|
175
|
+
})
|
|
176
|
+
|
|
177
|
+
if (!res.ok) {
|
|
178
|
+
const errText = await res.text()
|
|
179
|
+
throw new Error(
|
|
180
|
+
`see_image: HTTP vision call to "${MODEL}" failed: HTTP ${res.status}, ${errText.slice(0, 300)}`,
|
|
181
|
+
)
|
|
182
|
+
}
|
|
222
183
|
|
|
223
|
-
|
|
224
|
-
|
|
225
|
-
|
|
184
|
+
const data: any = await res.json()
|
|
185
|
+
const text = data?.content
|
|
186
|
+
?.map((c: any) => c.text)
|
|
187
|
+
.filter((t: any) => typeof t === "string" && t.length > 0)
|
|
188
|
+
.join("\n")
|
|
189
|
+
.trim()
|
|
190
|
+
|
|
191
|
+
if (!text) {
|
|
192
|
+
throw new Error(
|
|
193
|
+
`see_image: model "${MODEL}" returned no text. Response: ${JSON.stringify(data).slice(0, 300)}`,
|
|
194
|
+
)
|
|
195
|
+
}
|
|
196
|
+
|
|
197
|
+
return { text, model: MODEL, provider: PROVIDER_ID }
|
|
198
|
+
}
|
|
226
199
|
|
|
227
|
-
|
|
228
|
-
// Injected via experimental.chat.system.transform so the triggering logic
|
|
229
|
-
// ships with the plugin — no separate SKILL.md install needed.
|
|
230
|
-
const SYSTEM_INSTRUCTIONS = `# See Image (vision bridge) — opencode-see-image plugin
|
|
200
|
+
const SYSTEM_INSTRUCTIONS = `# See Image (vision bridge), opencode-see-image plugin
|
|
231
201
|
|
|
232
|
-
You have access to a \`see_image\` tool. The current model may not support image input directly. When a user attaches a screenshot or image, opencode rejects it and you only receive an error string containing the **filename
|
|
202
|
+
You have access to a \`see_image\` tool. The current model may not support image input directly. When a user attaches a screenshot or image, opencode rejects it and you only receive an error string containing the **filename**, no path, no pixels. Use \`see_image\` to actually view it.
|
|
233
203
|
|
|
234
204
|
## When to use \`see_image\`
|
|
235
205
|
|
|
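The text-extraction step that `seeImageViaSDK` applies to `result.data?.parts` in the hunk above can be isolated as a small pure function. This is a sketch under assumptions: `extractText` and the `Part` interface are hypothetical names, and `Part` models only the two fields the diff reads, not the full SDK type.

```typescript
// Hypothetical minimal shape of a response part; the real SDK type has more fields.
interface Part {
  type: string
  text?: unknown
}

// Keeps only non-empty text parts and joins them with newlines.
function extractText(parts: Part[] | undefined): string {
  return (parts ?? [])
    .filter((p) => p.type === "text")
    .map((p) => p.text)
    .filter((t): t is string => typeof t === "string" && t.length > 0)
    .join("\n")
    .trim()
}
```

An empty result is what makes the loop in the diff record "no text in response" and move on to the next candidate, so a model that returns only non-text parts is treated as a failure rather than as an empty description.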
@@ -238,13 +208,13 @@ Use ONLY when one of these is true:
|
|
|
238
208
|
2. The user references an image/screenshot they expect you to see ("see this", "look at this", "can you see this", ".png"/".jpg")
|
|
239
209
|
3. The user pastes an image path they want you to inspect
|
|
240
210
|
|
|
241
|
-
Do NOT use \`see_image\` for reading text files
|
|
211
|
+
Do NOT use \`see_image\` for reading text files, use the \`read\` tool for those.
|
|
242
212
|
|
|
243
213
|
## How to use it
|
|
244
214
|
|
|
245
215
|
1. **Extract the filename** from the error string (the quoted name), or use the path the user gave.
|
|
246
216
|
2. **Call \`see_image\`** with \`filePath\` set to the bare filename (it auto-locates) or an absolute path. Pass an optional \`question\` if the user asked something specific.
|
|
247
|
-
3. **Answer using the returned description** as if you saw the image. Be natural
|
|
217
|
+
3. **Answer using the returned description** as if you saw the image. Be natural, don't mention that you used another model unless asked.
|
|
248
218
|
|
|
249
219
|
## Important
|
|
250
220
|
|
|
@@ -252,18 +222,11 @@ Do NOT use \`see_image\` for reading text files — use the \`read\` tool for th
|
|
|
252
222
|
- If the tool cannot find the file, tell the user the filename and ask for a full path or to drag the file into the project directory.
|
|
253
223
|
- To inspect a specific detail, pass a targeted \`question\` (e.g. "What error is shown in the terminal?").`
|
|
254
224
|
|
|
255
|
-
// ─── Auto-update ────────────────────────────────────────────────────────────
|
|
256
|
-
// Runs once at plugin init (async, non-blocking). Checks npm for a newer
|
|
257
|
-
// version, runs `bun update` in the opencode plugin cache if available, and
|
|
258
|
-
// toasts the user to restart opencode to apply. Never throws — failures are
|
|
259
|
-
// logged and swallowed so the plugin always loads.
|
|
260
|
-
|
|
261
225
|
const PKG_NAME = "opencode-see-image"
|
|
262
226
|
const REGISTRY_LATEST = `https://registry.npmjs.org/${PKG_NAME}/latest`
|
|
263
227
|
|
|
264
228
|
function currentVersion(): string | null {
|
|
265
229
|
try {
|
|
266
|
-
// import.meta.url points at this module inside the bun cache.
|
|
267
230
|
const here = new URL(".", import.meta.url)
|
|
268
231
|
const pkgPath = new URL("package.json", here)
|
|
269
232
|
const pkg = JSON.parse(fs.readFileSync(pkgPath, "utf8"))
|
|
@@ -291,41 +254,25 @@ async function maybeAutoUpdate(
|
|
|
291
254
|
log: (msg: string, level?: string) => void,
|
|
292
255
|
) {
|
|
293
256
|
const current = currentVersion()
|
|
294
|
-
if (!current)
|
|
295
|
-
log("could not determine current version; skipping update check", "debug")
|
|
296
|
-
return
|
|
297
|
-
}
|
|
257
|
+
if (!current) return
|
|
298
258
|
|
|
299
259
|
let latest: string
|
|
300
260
|
try {
|
|
301
261
|
const res = await fetch(REGISTRY_LATEST, {
|
|
302
262
|
headers: { accept: "application/json" },
|
|
303
263
|
})
|
|
304
|
-
if (!res.ok)
|
|
305
|
-
log(`registry fetch returned HTTP ${res.status}`, "debug")
|
|
306
|
-
return
|
|
307
|
-
}
|
|
264
|
+
if (!res.ok) return
|
|
308
265
|
const data: any = await res.json()
|
|
309
266
|
latest = data?.version
|
|
310
|
-
if (!latest)
|
|
311
|
-
|
|
312
|
-
return
|
|
313
|
-
}
|
|
314
|
-
} catch (e: any) {
|
|
315
|
-
log(`registry fetch failed: ${e?.message ?? e}`, "debug")
|
|
267
|
+
if (!latest) return
|
|
268
|
+
} catch {
|
|
316
269
|
return
|
|
317
270
|
}
|
|
318
271
|
|
|
319
|
-
if (!semverGt(latest, current))
|
|
320
|
-
log(`up to date (${current})`, "debug")
|
|
321
|
-
return
|
|
322
|
-
}
|
|
272
|
+
if (!semverGt(latest, current)) return
|
|
323
273
|
|
|
324
|
-
log(`update available: ${current}
|
|
274
|
+
log(`update available: ${current} -> ${latest}; running bun update`, "info")
|
|
325
275
|
|
|
326
|
-
// Update the plugin in opencode's cache. --no-save keeps the lockfile
|
|
327
|
-
// resolution intact while still pulling the new tarball. We cd into the
|
|
328
|
-
// cache dir because bun operates on the nearest package.json/lockfile.
|
|
329
276
|
const cacheDir = path.join(os.homedir(), ".cache/opencode")
|
|
330
277
|
try {
|
|
331
278
|
await $`cd ${cacheDir} && bun update ${PKG_NAME} --no-save`.quiet()
|
|
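The update check in the hunk above depends on `semverGt(latest, current)`, whose body is not part of this diff. The following is a hypothetical stand-in, assuming plain `MAJOR.MINOR.PATCH` versions and ignoring pre-release tags; the package's own implementation may differ.

```typescript
// Hypothetical stand-in for semverGt; the package's real implementation is not shown in the diff.
// Compares plain MAJOR.MINOR.PATCH strings numerically, component by component.
function semverGt(a: string, b: string): boolean {
  const parse = (v: string): number[] =>
    v.split(".").map((n) => parseInt(n, 10) || 0)
  const pa = parse(a)
  const pb = parse(b)
  for (let i = 0; i < 3; i++) {
    const x = pa[i] ?? 0
    const y = pb[i] ?? 0
    if (x !== y) return x > y
  }
  // Equal versions are not "greater", so no update would be triggered.
  return false
}
```

Comparing numerically rather than as strings matters once a component reaches two digits: a string comparison would rank `0.9.0` above `0.10.0`.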
@@ -334,37 +281,80 @@ async function maybeAutoUpdate(
|
|
|
334
281
|
return
|
|
335
282
|
}
|
|
336
283
|
|
|
337
|
-
// Tell the user to restart. Toast is non-blocking; if it fails, we log.
|
|
338
284
|
try {
|
|
339
285
|
await client?.tui?.showToast?.({
|
|
340
286
|
body: {
|
|
341
|
-
message: `${PKG_NAME} updated to ${latest}
|
|
287
|
+
message: `${PKG_NAME} updated to ${latest}, restart opencode to apply`,
|
|
342
288
|
variant: "success",
|
|
343
289
|
},
|
|
344
290
|
})
|
|
345
291
|
} catch {
|
|
346
|
-
log(`update applied: ${current}
|
|
292
|
+
log(`update applied: ${current} -> ${latest}; restart opencode to load`, "info")
|
|
347
293
|
}
|
|
348
294
|
}
|
|
349
295
|
|
|
350
|
-
// ─── Plugin export ──────────────────────────────────────────────────────────
|
|
351
296
|
const SeeImagePlugin: Plugin = async (ctx) => {
|
|
352
297
|
const { client, $ } = ctx
|
|
353
298
|
|
|
354
299
|
const log = (message: string, level: string = "info") => {
|
|
355
300
|
try {
|
|
356
|
-
client?.app?.log?.({
|
|
357
|
-
|
|
358
|
-
})
|
|
359
|
-
} catch {
|
|
360
|
-
// logging is best-effort
|
|
361
|
-
}
|
|
301
|
+
client?.app?.log?.({ body: { service: PKG_NAME, level, message } })
|
|
302
|
+
} catch {}
|
|
362
303
|
}
|
|
363
304
|
|
|
364
|
-
// Fire-and-forget the update check. Never awaited so plugin init is not
|
|
365
|
-
// delayed by network. Errors are swallowed inside.
|
|
366
305
|
maybeAutoUpdate(client, $, log).catch(() => {})
|
|
367
306
|
|
|
307
|
+
const seeImageTool = tool({
|
|
308
|
+
description:
|
|
309
|
+
'See an image/screenshot that the current model cannot view. Use when the user attaches an image and you get a "this model does not support image input" / "Cannot read" error, or when a screenshot/image is referenced ("see this", "can you see", .png/.jpg). Routes the image to a vision-capable model and returns a detailed textual description you can reason about as if you saw it. Pass filePath as an absolute path OR a bare filename (auto-located in macOS screenshot temp dirs, ~/Desktop, ~/Downloads, cwd).',
|
|
310
|
+
args: {
|
|
311
|
+
filePath: tool.schema
|
|
312
|
+
.string()
|
|
313
|
+
.describe(
|
|
314
|
+
'Path to the image. Absolute path, or a bare filename like "Screenshot 2026-06-18 at 17.32.24.png" to auto-locate.',
|
|
315
|
+
),
|
|
316
|
+
question: tool.schema
|
|
317
|
+
.string()
|
|
318
|
+
.optional()
|
|
319
|
+
.describe(
|
|
320
|
+
"Optional specific question about the image. Defaults to a general detailed description.",
|
|
321
|
+
),
|
|
322
|
+
},
|
|
323
|
+
async execute(args, context) {
|
|
324
|
+
const fullPath = resolveFilePath(args.filePath, context.directory)
|
|
325
|
+
const ext = path.extname(fullPath).slice(1).toLowerCase()
|
|
326
|
+
const mediaType = EXT_MEDIA[ext] || "image/png"
|
|
327
|
+
|
|
328
|
+
const buf = fs.readFileSync(fullPath)
|
|
329
|
+
const b64 = Buffer.from(buf).toString("base64")
|
|
330
|
+
const dataUrl = `data:${mediaType};base64,${b64}`
|
|
331
|
+
|
|
332
|
+
const prompt =
|
|
333
|
+
args.question && args.question.trim().length > 0
|
|
334
|
+
? args.question
|
|
335
|
+
: "Describe this image in detail. If it is a screenshot, describe the UI, text content, and layout precisely. This description will be used by another model to answer the user, so be thorough and accurate."
|
|
336
|
+
|
|
337
|
+
let result: { text: string; model: string; provider: string }
|
|
338
|
+
|
|
339
|
+
if (process.env.SEE_IMAGE_API_KEY) {
|
|
340
|
+
result = await seeImageViaHTTP(b64, mediaType, prompt)
|
|
341
|
+
} else {
|
|
342
|
+
result = await seeImageViaSDK(client, dataUrl, mediaType, prompt)
|
|
343
|
+
}
|
|
344
|
+
|
|
345
|
+
context.metadata({
|
|
346
|
+
title: `see_image: ${path.basename(fullPath)}`,
|
|
347
|
+
metadata: {
|
|
348
|
+
model: result.model,
|
|
349
|
+
provider: result.provider,
|
|
350
|
+
file: fullPath,
|
|
351
|
+
},
|
|
352
|
+
})
|
|
353
|
+
|
|
354
|
+
return result.text
|
|
355
|
+
},
|
|
356
|
+
})
|
|
357
|
+
|
|
368
358
|
return {
|
|
369
359
|
tool: {
|
|
370
360
|
see_image: seeImageTool,
|
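The media-type lookup and data URL construction in the new `execute` body can be sketched without filesystem access. Only the `bmp` entry of `EXT_MEDIA` is visible in this diff, so the other entries below are assumptions, and `mediaTypeFor` and `toDataUrl` are hypothetical helper names.

```typescript
// Only the bmp entry appears in the diff; the remaining entries are assumed for illustration.
const EXT_MEDIA: Record<string, string> = {
  png: "image/png",
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  bmp: "image/bmp",
}

// Maps a file path to a media type, defaulting to image/png as the diff does.
function mediaTypeFor(filePath: string): string {
  const dot = filePath.lastIndexOf(".")
  const ext = dot >= 0 ? filePath.slice(dot + 1).toLowerCase() : ""
  return EXT_MEDIA[ext] || "image/png"
}

// Builds the data URL that is passed to the SDK as a file part.
function toDataUrl(mediaType: string, base64: string): string {
  return `data:${mediaType};base64,${base64}`
}
```

In the diff the same base64 string feeds both paths: the HTTP path sends it as an Anthropic-style `image` source, while the SDK path wraps it in a data URL for a `file` part.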
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "opencode-see-image",
|
|
3
|
-
"version": "0.
|
|
3
|
+
"version": "0.4.0",
|
|
4
4
|
"description": "Give non-vision opencode models the ability to see images/screenshots by routing them to a vision-capable model (MiniMax M3 via opencode-go by default).",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "index.ts",
|