opencode-see-image 0.9.2 → 0.9.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/README.md +68 -67
  2. package/index.ts +11 -1
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -1,18 +1,18 @@
  # opencode-see-image

- Give non-vision opencode models the ability to **see images and screenshots** by routing them to a vision-capable model.
+ give non-vision opencode models the ability to see images and screenshots by routing them to a vision-capable model.

- When a user attaches a screenshot to a text-only model, opencode rejects it with an error. This plugin intercepts that flow by registering a `see_image` tool that sends the image to a vision model and returns a textual description the primary model can reason about.
+ when a user attaches a screenshot to a text-only model, opencode rejects it with an error. This plugin intercepts that flow by registering a `see_image` tool that sends the image to a vision model and returns a textual description the primary model can reason about.

- ## Install
+ ## install

- **Option A, one command (recommended):**
+ **one command (recommended):**
  ```bash
  opencode plugin opencode-see-image --global
  ```
  This installs the package and adds it to your config. Then restart opencode.

- **Option B, edit config manually:**
+ **edit config manually:**

  Add the plugin to your opencode config:

@@ -25,76 +25,76 @@ Add the plugin to your opencode config:
  ```
  Then restart opencode.

- ## Install via your agent
+ ## install via your agent (for some reason?)

- Ask your agent:
+ ask your agent:
  ```
  install the opencode-see-image plugin
  ```
- It'll run `opencode plugin opencode-see-image --global` and tell you to restart.
+ it'll run `opencode plugin opencode-see-image --global` and tell you to restart.

- ## Prerequisites
+ ## prerequisites

- You need a connected vision-capable provider. The plugin auto-detects whichever you have connected, **either of these works**:
+ you need a connected vision-capable provider. The plugin auto-detects whichever you have connected, **either of these work**:

- ### Free (OpenCode Zen)
- 1. Run `/connect` in opencode
- 2. Select **opencode** (OpenCode Zen)
- 3. Paste your API key from [opencode.ai/auth](https://opencode.ai/auth)
+ ### free (OpenCode Zen)
+ 1. run `/connect` in opencode
+ 2. select **opencode** (OpenCode Zen)
+ 3. paste your API key from [opencode.ai/auth](https://opencode.ai/auth)

- The plugin falls back to **mimo-v2.5-free**. No subscription needed.
+ the plugin falls back to **mimo-v2.5-free**.

- ### Paid, w/ OpenCode Go
- 1. Run `/connect` in opencode
- 2. Select **opencode-go**
- 3. Paste your API key from [opencode.ai/auth](https://opencode.ai/auth)
+ ### paid, w/ OpenCode Go
+ 1. run `/connect` in opencode
+ 2. select **opencode-go**
+ 3. paste your API key from [opencode.ai/auth](https://opencode.ai/auth)

- The plugin prefers **minimax-m3** via opencode-go (~3000ms) when available.
+ the plugin prefers **minimax-m3** via opencode-go when available.

- ### Paid, w/ another provider
+ ### paid, w/ another provider

- Set the `SEE_IMAGE_*` env vars to point at any Anthropic-Messages-compatible endpoint. See [Configuration](#configuration) below.
+ set the `SEE_IMAGE_*` env vars to point at any Anthropic-Messages-compatible endpoint. see [Configuration](#configuration) below.

- **Resolution order:** explicit `SEE_IMAGE_API_KEY` env → configured `SEE_IMAGE_PROVIDER` → `opencode-go` (MiniMax M3) → `opencode` (mimo-v2.5-free, free).
+ **the resolve order:** explicit `SEE_IMAGE_API_KEY` env → configured `SEE_IMAGE_PROVIDER` → `opencode-go` (MiniMax M3) → `opencode` (mimo-v2.5-free).
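
Reviewer sketch (not part of the diff): one way to read the resolution order stated in the line above. This is a minimal illustration, not the plugin's actual code. `resolveVisionRoute`, `VisionRoute`, `Env`, and the `connected` set are hypothetical names. The sketch assumes a configured `SEE_IMAGE_PROVIDER` is only used when that provider is connected, and that the model falls back to `minimax-m3` when `SEE_IMAGE_MODEL` is unset, as the README's configuration table suggests.

```typescript
// Hypothetical types and names: an illustration of the documented order only.
type VisionRoute =
  | { mode: "http"; apiKey: string }
  | { mode: "sdk"; provider: string; model: string };

type Env = Record<string, string | undefined>;

function resolveVisionRoute(
  env: Env,
  connected: ReadonlySet<string>,
): VisionRoute | undefined {
  // 1. An explicit API key bypasses the SDK and uses a direct HTTP endpoint.
  const apiKey = env["SEE_IMAGE_API_KEY"];
  if (apiKey) {
    return { mode: "http", apiKey };
  }

  // 2. A configured provider (assumption: honoured only when it is connected).
  const configured = env["SEE_IMAGE_PROVIDER"];
  if (configured && connected.has(configured)) {
    const model = env["SEE_IMAGE_MODEL"] ?? "minimax-m3";
    return { mode: "sdk", provider: configured, model };
  }

  // 3. opencode-go with MiniMax M3.
  if (connected.has("opencode-go")) {
    return { mode: "sdk", provider: "opencode-go", model: "minimax-m3" };
  }

  // 4. OpenCode Zen with the free fallback model.
  if (connected.has("opencode")) {
    return { mode: "sdk", provider: "opencode", model: "mimo-v2.5-free" };
  }

  // Nothing usable is connected.
  return undefined;
}
```

With only Zen connected and no env vars set, this sketch returns the `opencode` / `mimo-v2.5-free` route; setting `SEE_IMAGE_API_KEY` short-circuits every later step.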

- ## How it works
+ ## how the _eye surgery_ works

  ```
  user attaches screenshot
-
-
+ |
+ v
  opencode rejects it: 'this model does not support image input'
- (the model only sees the filename, no pixels)
-
+ | (the model only sees the filename)
+ v
  plugin's system-prompt instructions tell the model to call see_image
-
-
+ |
+ v
  see_image tool:
- 1. queries opencode's SQLite DB for the image (handles clipboard pastes, dragged files, screenshots)
+ 1. queries opencode's SQLite DB for the image
  2. falls back to filesystem search if not in DB
  3. sends the image to the vision model via opencode's SDK
  4. returns the textual description
-
-
+ |
+ v
  primary model answers using the description
  ```

- ## The `see_image` tool
+ ## the `see_image` tool

- The plugin registers a `see_image` tool with two arguments:
+ the plugin registers a `see_image` tool with two arguments:

- | Arg | Type | Required | Description |
+ | arg | type | required? | description |
  |---|---|---|---|
- | `filePath` | string | yes | Path to the image. Absolute path, or a bare filename like `"Screenshot 2026-06-18 at 17.32.24.png"` to auto-locate. |
- | `question` | string | no | A specific question about the image. Defaults to a general detailed description. Use this to focus on a particular detail (e.g. `"What error is shown in the terminal?"`). |
+ | `filePath` | string | y | path to the image. Absolute path, or a bare filename like `"Screenshot 2026-06-18 at 17.32.24.png"` to auto-locate. |
+ | `question` | string | n | a specific question about the image. Defaults to a general detailed description. Use this to focus on a particular detail (e.g. `"What error is shown in the terminal?"`). |

- Your model calls this tool automatically when you attach a screenshot, you don't need to do anything special. The `question` arg is optional; the model uses it when you ask something specific about the image.
+ your model calls this tool automatically when you attach a screenshot, you don't need to do anything special. The `question` arg is optional; the model uses it when you ask something specific about the image.

- ## Configuration
+ ## configuration

- All settings are env-var overrides. The plugin uses opencode's SDK client by default (handles auth automatically). Set `SEE_IMAGE_API_KEY` to bypass the SDK and call an HTTP endpoint directly.
+ all settings are env-var overrides. The plugin uses opencode's SDK client by default (handles auth automatically). Set `SEE_IMAGE_API_KEY` to bypass the SDK and call an HTTP endpoint directly.

- | Env var | Default | Description |
+ | env var | default | description |
  |---|---|---|
  | `SEE_IMAGE_MODEL` | `minimax-m3` | Vision model ID |
  | `SEE_IMAGE_PROVIDER` | `opencode-go` | Provider ID for SDK routing |
@@ -104,9 +104,9 @@ All settings are env-var overrides. The plugin uses opencode's SDK client by def
  | `SEE_IMAGE_USER_AGENT` | _(Chrome UA)_ | User-Agent header (HTTP mode only) |
  | `SEE_IMAGE_TIMEOUT` | `30000` | Per-candidate timeout in ms. Prevents hanging on slow models. |

- ### Using a different vision model
+ ### using a different vision model

- Any Anthropic-Messages-compatible endpoint works. For example, to use a direct MiniMax key:
+ any Anthropic-Messages-compatible endpoint works. for example, to use a direct MiniMax key:

  ```bash
  export SEE_IMAGE_ENDPOINT="https://api.minimax.io/v1/messages"
@@ -114,60 +114,61 @@ export SEE_IMAGE_MODEL="minimax-m3"
  export SEE_IMAGE_API_KEY="your-minimax-key"
  ```

- To use a different opencode-go model (e.g. Kimi K2.7):
+ to use a different opencode-go model (e.g. Kimi K2.7):

  ```bash
  export SEE_IMAGE_MODEL="kimi-k2.7-code"
  ```

- ### Verified vision-capable models
+ ### verified vision-capable models

  **Free (OpenCode Zen):**

- | Model | Speed | Notes |
- |---|---|---|
- | `mimo-v2.5-free` | | Free. Default fallback when only Zen is connected (routed via CLI). |
- | `big-pickle` | ~12000ms | Free. Accurate. Alternative Zen fallback. |
+ | model | Notes |
+ |---|---|
+ | `mimo-v2.5-free` | free. may be a bit slow. default fallback when only Zen is connected (routed via CLI). |
+ | `big-pickle` | for some reason, big pickle works as an image capable model when called through the sdk w/ an active opencode go sub. |

- **Paid (OpenCode Go):**
+ **paid (OpenCode Go):**

- | Model | Speed | Notes |
+ | model | speed | notes |
  |---|---|---|
- | `minimax-m3` | ~3000ms | Default. Fast, clean text output. |
- | `kimi-k2.7-code` | ~7000ms | Clean output, accurate. |
- | `kimi-k2.6` | ~20000ms | Accurate but slow. |
- | `qwen3.7-plus` | ~20000ms | Emits thinking blocks (handled). |
+ | `minimax-m3` | ~3000ms | default. fast, clean, and accurate. |
+ | `kimi-k2.7-code` | ~7000ms | clean and accurate. |
+ | `kimi-k2.6` | ~12000ms | accurate but slow. |
+ | `qwen3.7-plus` | ~15000ms | slow, spends a bit more tokens because of thinking. |

- ## Updating
+ ## updating

- **Auto-update (built in):** the plugin checks npm for a newer version on startup. If one exists, it updates itself via `opencode plugin --force` (uses opencode's bundled bun, no global bun needed) and shows a toast: *"opencode-see-image updated to X.Y.Z, restart opencode to apply"*. You just need to restart opencode to load the new version. Nothing to configure.
+ **auto-update (built in):** uses the opencode-plugin-update-kit and shows a toast: *"opencode-see-image updated to X.Y.Z, restart opencode to apply"*. You just need to restart opencode to load the new version.

- **Manual update** (if you want to force it now):
+ **manual update**:
  ```bash
  opencode plugin opencode-see-image --force --global
  ```
- Then restart opencode. (No bun required, this uses opencode's own bun.)
+ then restart opencode.

- **Pin a version** in your config to opt out of auto-updates:
+ **pin a version** in your config to opt out of auto-updates:
  ```jsonc
  "plugin": ["opencode-see-image@0.4.2"]
  ```

- ## Limitations
+ ## kimitations

- - **macOS-only filesystem search** the filesystem fallback targets macOS screenshot temp dirs. Linux/Windows users should rely on the DB lookup (which is cross-platform) or pass absolute paths.
+ - **macOS-only filesystem search**. the filesystem fallback targets macOS screenshot temp dirs. Linux/Windows users should rely on the DB lookup (which is cross-platform) or pass absolute paths.
+ > if you can add compat for more platforms, i would love a pr.

- ## File search locations
+ ## file search locations

- When opencode rejects an image attachment, the model only receives a bare filename. `see_image` searches these locations in order:
+ when opencode rejects an image attachment, the model only receives a bare filename. `see_image` searches these locations in order:

  1. `$TMPDIR/TemporaryItems/NSIRD_screencaptureui_*/` (where macOS stashes dragged screenshots)
  2. `$TMPDIR/TemporaryItems/`
  3. `~/Desktop` (default screenshot save location)
  4. `~/Downloads`
- 5. Current working directory
+ 5. current working directory

- Pass an absolute `filePath` to skip the search.
+ pass an absolute `filePath` to skip the search.
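
Reviewer sketch (not part of the diff): the five search locations listed above could be walked roughly as follows. This is an illustration under stated assumptions, not the plugin's implementation. `candidateDirs` and `locateImage` are hypothetical helpers, and the SQLite DB lookup that the README says runs first is deliberately left out.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: builds the directory list in the order the README gives.
function candidateDirs(
  tmpDir: string = process.env["TMPDIR"] ?? os.tmpdir(),
  home: string = os.homedir(),
  cwd: string = process.cwd(),
): string[] {
  const temporaryItems = path.join(tmpDir, "TemporaryItems");
  let captureDirs: string[] = [];
  try {
    // 1. $TMPDIR/TemporaryItems/NSIRD_screencaptureui_*/
    captureDirs = fs
      .readdirSync(temporaryItems, { withFileTypes: true })
      .filter((e) => e.isDirectory() && e.name.startsWith("NSIRD_screencaptureui_"))
      .map((e) => path.join(temporaryItems, e.name));
  } catch {
    // TemporaryItems may not exist, e.g. on Linux or Windows.
  }
  return [
    ...captureDirs,
    temporaryItems, // 2. $TMPDIR/TemporaryItems/
    path.join(home, "Desktop"), // 3. ~/Desktop
    path.join(home, "Downloads"), // 4. ~/Downloads
    cwd, // 5. current working directory
  ];
}

// Hypothetical helper: an absolute path skips the search; a bare filename
// is tried in each directory and the first existing match is returned.
function locateImage(filePath: string, dirs: string[]): string | undefined {
  if (path.isAbsolute(filePath)) {
    return fs.existsSync(filePath) ? filePath : undefined;
  }
  for (const dir of dirs) {
    const candidate = path.join(dir, filePath);
    if (fs.existsSync(candidate)) return candidate;
  }
  return undefined;
}
```

Because the directories are tried in order, a screenshot that exists in both `~/Desktop` and `~/Downloads` resolves to the Desktop copy in this sketch.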

  ## License

package/index.ts CHANGED
@@ -522,7 +522,17 @@ const SeeImagePlugin: Plugin = async (ctx) => {
  .string()
  .optional()
  .describe(
- "Optional specific question about the image. Defaults to a general detailed description.",
+ [
+ "What to ask the vision model. Omit for a general detailed description.",
+ "Tailor it to the situation for much better results:",
+ '- Reading/transcribing text or code: "Transcribe all text exactly, preserving layout, line breaks, and code indentation."',
+ '- An error or stack trace screenshot: "Quote the exact error message and stack trace, then state the likely cause."',
+ '- Reproducing a UI as code: "Describe the layout, components, text, colors, and spacing precisely enough to rebuild this UI in code."',
+ '- A technical diagram/architecture: "Explain this diagram: list each component and the relationships and data/flow direction between them."',
+ '- A chart/graph/dashboard: "Read this visualization: axes, series, key values, and the main takeaway."',
+ '- Comparing against an expected design: "Describe this UI in detail so it can be diffed against an expected layout (note any visible defects or misalignment)."',
+ "Otherwise pass the user's own specific question verbatim.",
+ ].join("\n"),
  ),
  },
  async execute(args, context) {
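
Reviewer sketch (not part of the diff): the new `describe()` text tells the calling model how to pick a `question`. The selection rule it describes could be written out roughly as below. `Situation`, `PRESET_QUESTIONS`, and `chooseQuestion` are hypothetical names introduced for illustration; only the quoted prompt strings come from the hunk above. In the plugin itself the choice is made by the model reading the description, not by code like this.

```typescript
// Hypothetical labels for the six situations listed in the describe() text.
type Situation =
  | "transcribe"
  | "error"
  | "rebuild-ui"
  | "diagram"
  | "chart"
  | "design-diff";

// Prompt strings quoted from the new describe() text.
const PRESET_QUESTIONS: Record<Situation, string> = {
  transcribe:
    "Transcribe all text exactly, preserving layout, line breaks, and code indentation.",
  error:
    "Quote the exact error message and stack trace, then state the likely cause.",
  "rebuild-ui":
    "Describe the layout, components, text, colors, and spacing precisely enough to rebuild this UI in code.",
  diagram:
    "Explain this diagram: list each component and the relationships and data/flow direction between them.",
  chart:
    "Read this visualization: axes, series, key values, and the main takeaway.",
  "design-diff":
    "Describe this UI in detail so it can be diffed against an expected layout (note any visible defects or misalignment).",
};

// Returns the question to send, or undefined to request the general description.
function chooseQuestion(
  situation?: Situation,
  userQuestion?: string,
): string | undefined {
  // A recognised situation gets its tailored prompt.
  if (situation) return PRESET_QUESTIONS[situation];
  // Otherwise the user's own question is passed through verbatim.
  const trimmed = userQuestion?.trim();
  return trimmed ? userQuestion : undefined;
}
```

Returning `undefined` here corresponds to omitting `question`, which the description says yields a general detailed description.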
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "opencode-see-image",
- "version": "0.9.2",
+ "version": "0.9.3",
  "description": "Give non-vision opencode models the ability to see images/screenshots by routing them to a vision-capable model (MiniMax M3 via opencode-go by default).",
  "type": "module",
  "main": "index.ts",