opencode-vision 0.2.0 → 0.2.1

Files changed (2)
  1. package/SKILL.md +24 -6
  2. package/package.json +1 -1
package/SKILL.md CHANGED
@@ -15,8 +15,8 @@ description: >-
  ordering/equality/layout/readability/state/diff/describe), asks the
  user once per session which vision model to use, assembles a versioned
  request, delegates, parses the typed report. Image paths from
- screenshot_out_file/filePath; inline-only images saved to /tmp via
- base64 -d.
+ screenshot_out_file/filePath; inline-only images saved to /tmp via node
+ (not shell echo, to avoid embedding image bytes in commands).
  ---
 
  # Vision — Visual Judgment Skill
@@ -199,19 +199,37 @@ Some tool results return image attachments with
  e.g. `cua-driver_zoom` (inline-only, no path param), or
  `playwright_browser_take_screenshot` called without a `filename`. The
  vision subagent needs a file path to `read`. Save the inline image to
- disk first:
+ disk first.
+
+ **Prefer avoiding inline images altogether**: when calling
+ `cua-driver_get_window_state`, always pass `screenshot_out_file` so a
+ file path is available directly. When calling
+ `chrome-devtools_take_screenshot` or `playwright_browser_take_screenshot`,
+ always pass `filePath` / `filename`. This avoids the inline-only case
+ entirely and is the safest path.
+
+ If you must handle an inline-only image, write the base64 payload to a
+ file using `node -e` (not `echo | base64 -d`, which embeds the raw
+ image data in a shell command — screenshots may contain sensitive
+ content like tokens or credentials):
 
  ```
  If a tool result has attachments[].url starting "data:image/...;base64,"
  but no file path:
  1. Extract the base64 payload from the data URL (the part after
  ";base64,").
- 2. Write it to /tmp/vision-<random>.png via bash:
- echo "<base64>" | base64 -d > /tmp/vision-<random>.png
+ 2. Write it to /tmp/vision-<random>.png using node, which avoids
+ passing the base64 through the shell:
+ node -e "require('fs').writeFileSync('/tmp/vision-<random>.png',
+ Buffer.from('<base64>','base64'))"
+ Or write a small script to /tmp and run it, passing the base64 via
+ stdin to avoid it appearing in the command line.
  3. Use that path in the request's images[].path.
  ```
 
- This is the recommended handling for any inline-only image result.
+ Do not use `echo "<base64>" | base64 -d` — it embeds the raw image
+ bytes in the shell command, creating an exfiltration risk if the
+ screenshot contains sensitive data.
 
  ## Step 4. Pick model (once per session)
 
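Reviewer note on the SKILL.md change above: the new step 2 mentions, without showing it, a small script that receives the base64 via stdin. A minimal sketch of such a script follows. It is not part of opencode-vision; the function names, the `--stdin` flag, and the use of `os.tmpdir()` (which is usually, but not always, `/tmp`) are assumptions made for illustration. The `node -e` form in the diff still carries the payload in the process arguments, which is presumably why the stdin variant is offered as the alternative.

```javascript
// Hypothetical helper (not shipped with opencode-vision): save an inline
// image without placing its base64 payload on any command line.
const fs = require('fs');
const os = require('os');
const path = require('path');
const crypto = require('crypto');

// Accept either a full data URL or a bare base64 string; return the payload
// (for a data URL, the part after ";base64,").
function extractBase64(input) {
  const text = input.trim();
  const marker = ';base64,';
  const idx = text.indexOf(marker);
  return idx === -1 ? text : text.slice(idx + marker.length);
}

// Decode the payload and write it to <dir>/vision-<random>.png.
// Returns the path of the file that was written.
function saveInlineImage(input, dir = os.tmpdir()) {
  const bytes = Buffer.from(extractBase64(input), 'base64');
  if (bytes.length === 0) {
    throw new Error('no image data found in input');
  }
  const name = `vision-${crypto.randomBytes(6).toString('hex')}.png`;
  const file = path.join(dir, name);
  // mode 0o600 requests owner-only permissions; flag 'wx' refuses to
  // overwrite an existing file.
  fs.writeFileSync(file, bytes, { mode: 0o600, flag: 'wx' });
  return file;
}

// Read from stdin only when the (assumed) --stdin flag is given.
if (process.argv.includes('--stdin')) {
  const input = fs.readFileSync(0, 'utf8'); // file descriptor 0 is stdin
  console.log(saveInlineImage(input));
}
```

Saved under a name of your choosing (say `/tmp/save-inline-image.js`, also hypothetical), it would be run as `node /tmp/save-inline-image.js --stdin` with the data URL or bare payload piped in. The printed path is what goes into the request's `images[].path`.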
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "opencode-vision",
- "version": "0.2.0",
+ "version": "0.2.1",
  "description": "Typed visual-judgment skill for opencode. Registers 10 vision subagents (one per top-tier vision model across OpenAI, Kimi for Coding, Ollama Cloud, and opencode-go) and a skill that teaches a text-only orchestrator to extract visual-judgment intent, classify it into a typed judgment, and delegate to a vision subagent with a versioned request/report contract.",
  "type": "module",
  "main": "./dist/index.js",