opencode-minimax-easy-vision 1.1.1 → 1.2.0

package/README.md CHANGED
@@ -1,6 +1,10 @@
  # Opencode MiniMax Easy Vision

- MiniMax Easy Vision is a plugin for [OpenCode](https://opencode.ai) that enables **vision support** for models that lack native image attachment support. Originally built for [MiniMax](https://www.minimax.io/) models, it can be configured to work with any model that requires MCP-based image handling. It restores a simple "paste and ask" workflow by automatically handling image assets and routing them through the [MiniMax Coding Plan MCP](https://github.com/MiniMax-AI/MiniMax-Coding-Plan-MCP).
+ MiniMax Easy Vision is a plugin for [OpenCode](https://opencode.ai) that enables **vision support** for models that lack native image attachment support.
+
+ Originally built for [MiniMax](https://www.minimax.io/) models, it can be configured to work with any model that requires MCP-based image handling.
+
+ It restores the "paste and ask" workflow by automatically saving image assets and routing them through the [MiniMax Coding Plan MCP](https://github.com/MiniMax-AI/MiniMax-Coding-Plan-MCP).

  ## Demo

@@ -12,84 +16,75 @@ https://github.com/user-attachments/assets/826f90ea-913f-427e-ace8-0b711302c497

  ## The Problem

- When using MiniMax models (for example, MiniMax M2.1) inside OpenCode, users run into a limitation: **vision is not supported via native image attachments**.
+ When using MiniMax models (like MiniMax M2.1) in OpenCode, native image attachments aren't supported.

- MiniMax models rely on the MiniMax Coding Plan MCP's `understand_image` tool, which requires an explicit file path or URL. This breaks the normal chat workflow:
+ These models expect the MiniMax Coding Plan MCP's `understand_image` tool, which requires an explicit file path. This breaks the normal flow:

- * **Ignored images**: Images pasted directly into chat are ignored by MiniMax models.
- * **Manual steps**: Users must save screenshots, locate file paths, and reference them manually.
- * **Broken flow**: The "paste and ask" vision workflow available in other models is lost.
+ * **Ignored images**: Pasted images are simply ignored by the model.
+ * **Manual steps**: You have to save screenshots manually, find the path, and reference it in your prompt.
+ * **Broken flow**: The "paste and ask" experience available with Claude or GPT models is lost.

  ## What This Plugin Does

- This plugin removes that friction by automating the vision pipeline for configured models.
+ This plugin automates the vision pipeline so you don't have to think about it.

- Internally, it:
+ **How it works:**

- 1. Detects when a configured model is active (MiniMax by default)
- 2. Intercepts images pasted into the chat
- 3. Saves them to a temporary local directory
- 4. Injects the required context so the model can invoke the `understand_image` MCP tool with the correct file path
+ 1. **Detects** when a configured model is active.
+ 2. **Intercepts** images pasted into the chat.
+ 3. **Saves** them to a temporary local directory.
+ 4. **Injects** the necessary context for the model to invoke the `understand_image` tool with the correct path.

- From the user's perspective, pasted images simply work with vision, just like how it works out of the box with other vision-capable models like Claude.
+ **Result:** You just paste the image and ask your question, as you would with Claude or GPT models. The plugin handles the rest.

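
Concretely, the four steps above amount to a message rewrite: the pasted image part is removed and replaced with a text instruction pointing at the saved file. A simplified, single-image sketch of that rewrite (illustrative only; the real prompt is built by `generateInjectionPrompt` in `dist/index.js`, which also handles multiple images):

```javascript
// Illustrative sketch of the injected prompt (single-image case).
// The real plugin builds this in generateInjectionPrompt (dist/index.js)
// and also pluralizes the wording when several images are pasted.
function buildInjectionPrompt(imagePath, userText, toolName) {
  return [
    "The user has shared an image. The image is saved at:",
    `- Image 1: ${imagePath}`,
    "",
    `Use the \`${toolName}\` tool to analyze this image.`,
    "",
    `User's request: ${userText || "(analyze the image)"}`,
  ].join("\n");
}

const prompt = buildInjectionPrompt(
  "/tmp/opencode-minimax-vision/abc.png",
  "What does this error say?",
  "mcp_minimax_understand_image"
);
```

The model never sees the raw image; it only sees this text and then calls the MCP tool with the path.
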
  ## Supported Models

- By default, the plugin activates for MiniMax models, identified by:
+ By default, the plugin activates for MiniMax models:

  * **Provider ID** containing `minimax`
  * **Model ID** containing `minimax` or `abab`

- Examples:
-
+ **Examples:**
  * `minimax/minimax-m2.1`
  * `minimax/abab6.5s-chat`

  ### Custom Model Configuration

- You can configure which models the plugin applies to by creating a config file.
-
- #### Config File Locations
+ You can enable this for other models by creating a config file.

- The plugin looks for configuration in these locations (in order of priority):
+ #### Locations (Priority Order)

  1. **Project level**: `.opencode/opencode-minimax-easy-vision.json`
  2. **User level**: `~/.config/opencode/opencode-minimax-easy-vision.json`

- Project-level config takes precedence over user-level config.
-
- #### Config File Format
+ #### Config Format

  ```json
  {
- "models": ["minimax/*", "glm/*", "openai/gpt-4-vision"]
+ "models": ["minimax/*", "opencode/*", "*/glm-4.7-free"]
  }
  ```

  #### Pattern Syntax

- Model patterns use a `provider/model` format with wildcard support:
-
- | Pattern | Description |
- | -------------- | --------------------------------------------------- |
- | `*` | Match ALL models (global wildcard) |
- | `minimax/*` | Match all models from the `minimax` provider |
- | `*/glm-4v` | Match `glm-4v` model from any provider |
- | `openai/gpt-4` | Exact match for provider and model |
- | `*/abab*` | Match any model containing `abab` from any provider |
+ | Pattern | Matches |
+ | ---------------- | --------------------------------------- |
+ | `*` | All models |
+ | `minimax/*` | All models from the `minimax` provider |
+ | `*/glm-4.7-free` | Specific model from any provider |
+ | `opencode/*` | All models from the `opencode` provider |
+ | `*/abab*` | Any model containing `abab` |

  #### Wildcard Rules

- * `*` at the start matches any prefix: `*suffix` matches values ending with `suffix`
- * `*` at the end matches any suffix: `prefix*` matches values starting with `prefix`
- * `*` alone matches everything
+ * `*suffix` matches values ending with `suffix`
+ * `prefix*` matches values starting with `prefix`
+ * `*` matches everything
  * `*text*` matches values containing `text`
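
These rules mirror the `matchesWildcardPattern` helper in `dist/index.js`. A self-contained sketch of the matching logic (note the order: the `*text*` case must be checked before the prefix and suffix cases):

```javascript
// Wildcard matcher, mirroring matchesWildcardPattern in dist/index.js.
// Matching is case-insensitive; the *contains* check runs before
// prefix*/ *suffix so a pattern like "*abab*" isn't misread.
function matchesWildcardPattern(pattern, value) {
  const p = pattern.toLowerCase();
  const v = value.toLowerCase();
  if (p === "*") return true;                               // everything
  if (p.startsWith("*") && p.endsWith("*") && p.length > 2) {
    return v.includes(p.slice(1, -1));                      // *contains*
  }
  if (p.endsWith("*")) return v.startsWith(p.slice(0, -1)); // prefix*
  if (p.startsWith("*")) return v.endsWith(p.slice(1));     // *suffix
  return v === p;                                           // exact match
}

// matchesWildcardPattern("abab*", "abab6.5s-chat")  → true
// matchesWildcardPattern("minimax", "minimax-m2.1") → false (no wildcard = exact)
```

A `provider/model` pattern is split on the slash and each half is matched with this function; a pattern with no slash is tried against both the provider ID and the model ID.
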
 
- #### Precedence
-
- When multiple patterns are specified, the first matching pattern wins. If the `models` array is empty or the config file doesn't exist, the plugin falls back to default MiniMax-only behavior.
+ If the config is missing or empty, it defaults to MiniMax-only behavior.

- #### Examples
+ #### Configuration Examples

  **Enable for all models:**

@@ -99,35 +94,54 @@ When multiple patterns are specified, the first matching pattern wins. If the `m
  }
  ```

- **Enable for specific providers:**
+ **Specific providers:**

  ```json
  {
- "models": ["minimax/*", "glm/*", "zhipu/*"]
+ "models": ["minimax/*", "opencode/*", "google/*"]
  }
  ```

- **Mix of providers and specific models:**
+ **Mix of providers and models:**

  ```json
  {
- "models": ["minimax/*", "openai/gpt-4-vision", "*/claude-3*"]
+ "models": ["minimax/*", "opencode/gpt-5-nano", "*/claude-3-7-sonnet*"]
  }
  ```

+ ### Custom Image Analysis Tool
+
+ By default, the plugin uses `mcp_minimax_understand_image` from the MiniMax Coding Plan MCP. You can configure a different MCP tool for image analysis:
+
+ ```json
+ {
+ "models": ["*"],
+ "imageAnalysisTool": "mcp_openrouter_analyze_image"
+ }
+ ```
+
+ This allows you to use other MCP servers that provide image analysis capabilities, such as:
+
+ * [openrouter-image-mcp](https://github.com/JonathanJude/openrouter-image-mcp) - Uses OpenRouter with GPT-4V, Claude, Gemini
+ * [mcp-image-recognition](https://github.com/mario-andreschak/mcp-image-recognition) - Uses Anthropic/OpenAI Vision APIs
+ * [Peekaboo](https://github.com/steipete/Peekaboo) - macOS screenshot + AI analysis
+
+ The plugin will instruct the model to use the configured tool. The tool should accept an image file path as input.
+
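
Note that `models` and `imageAnalysisTool` are resolved independently: for each key, a project-level value beats a user-level value, which beats the built-in default. A sketch of that per-key resolution (mirroring the `selectWithPrecedence` helper in `dist/index.js`; the values shown are examples):

```javascript
// Per-key config precedence: project > user > default.
// Mirrors selectWithPrecedence in dist/index.js.
function selectWithPrecedence(projectValue, userValue, defaultValue) {
  if (projectValue !== undefined) return { value: projectValue, source: "project" };
  if (userValue !== undefined) return { value: userValue, source: "user" };
  return { value: defaultValue, source: "default" };
}

// A project config that only sets "models" still inherits the
// user-level "imageAnalysisTool":
const models = selectWithPrecedence(
  ["opencode/*"],                  // project config
  ["minimax/*"],                   // user config
  ["minimax/*", "*/abab*"]         // built-in default
);
const tool = selectWithPrecedence(
  undefined,                       // project config sets no tool
  "mcp_openrouter_analyze_image",  // user config
  "mcp_minimax_understand_image"   // built-in default
);
// models.source === "project", tool.source === "user"
```
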
  ## Supported Image Formats

  * PNG
  * JPEG
  * WebP

- *(These formats are dictated by the limitations of the [MiniMax Coding Plan MCP](https://github.com/MiniMax-AI/MiniMax-Coding-Plan-MCP) `understand_image` tool.)*
+ *(Limited by the [MiniMax Coding Plan MCP](https://github.com/MiniMax-AI/MiniMax-Coding-Plan-MCP) `understand_image` tool.)*

  ## Installation

  ### Via npm

- Add the plugin to the `plugin` array in your `opencode.json` file:
+ Just add the plugin to the `plugin` array in your `opencode.json` file:

  ```json
  {
@@ -136,23 +150,18 @@ Add the plugin to the `plugin` array in your `opencode.json` file:
  }
  ```

- ### From local source
+ ### From Local Source

- 1. Clone or download this repository
+ 1. Clone the repository.
  2. Build the plugin:
-
  ```bash
- npm install
- npm run build
+ npm install && npm run build
  ```
- 3. Copy the built file to your OpenCode plugin directory:
-
- * Project-level: `.opencode/plugin/minimax-easy-vision.js`
- * Global: `~/.config/opencode/plugin/minimax-easy-vision.js`
+ 3. Copy the built `dist/index.js` into your OpenCode plugin directory.

  ## Prerequisites

- The MiniMax Coding Plan MCP server must be configured in `opencode.json`:
+ The MiniMax Coding Plan MCP server must be configured in your `opencode.json`:

  ```json
  {
@@ -169,34 +178,20 @@ The MiniMax Coding Plan MCP server must be configured in `opencode.json`:
  }
  ```

- For full setup details, refer to the MiniMax Coding Plan MCP and MiniMax API documentation.
-
  ## Usage

- 1. Start OpenCode with a supported model (MiniMax by default, or any configured model)
- 2. Paste an image into the chat (`Cmd+V` / `Ctrl+V`)
- 3. Ask a question about the image
-
- What happens internally:
-
- * The image is saved to `{tmpdir}/opencode-minimax-vision/<uuid>.<ext>`
- * Instructions are injected for the model to use the `understand_image` MCP tool
- * The model performs vision analysis and responds
+ 1. Select a supported model in OpenCode.
+ 2. Paste an image (`Cmd+V` / `Ctrl+V`).
+ 3. Ask a question about it, just as you would with models that have native vision support.

- ### Example interaction
+ ### Example Interaction

- ```text
- You: [pasted screenshot] What does this error message say?
-
- # Automatically injected:
- # [SYSTEM: Image Attachment Detected]
- # 1 image has been saved to: /tmp/opencode-minimax-vision/abc123.png
- # To analyze this image, use the understand_image MCP tool...
-
- Model: I'll analyze the screenshot using the understand_image tool.
- [Calls mcp_minimax_understand_image with the saved path]
- Model: The error message indicates a "TypeError: Cannot read property 'foo' of undefined"...
- ```
+ > **You**: [pasted screenshot] Why is this failing?
+ >
+ > **Model**: I'll check the image using the `understand_image` tool.
+ > `[Calls mcp_minimax_understand_image path="/tmp/xyz.png"]`
+ >
+ > **Model**: The error suggests a syntax error on line 12.

  ## Development

@@ -205,11 +200,11 @@ npm install
  npm run build
  ```

- The built plugin will be available at `dist/index.js`.
+ The built plugin will be available at `dist/index.js`.

  ## License

- GPL-3.0. See [LICENSE.md](./LICENCE.md) for details.
+ GPL-3.0. See [LICENSE.md](./LICENSE.md).

  ## References

package/dist/index.d.ts.map CHANGED
@@ -1 +1 @@
- {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,MAAM,EAAE,MAAM,qBAAqB,CAAC;AAgTlD,eAAO,MAAM,uBAAuB,EAAE,MAwGrC,CAAC;AAEF,eAAe,uBAAuB,CAAC"}
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,EAAE,MAAM,EAAE,MAAM,qBAAqB,CAAC;AAqdlD,eAAO,MAAM,uBAAuB,EAAE,MA+DrC,CAAC;AAEF,eAAe,uBAAuB,CAAC"}
package/dist/index.js CHANGED
@@ -3,9 +3,13 @@ import { join } from "node:path";
  import { mkdir, writeFile, readFile } from "node:fs/promises";
  import { existsSync } from "node:fs";
  import { randomUUID } from "node:crypto";
+ // Constants
  const PLUGIN_NAME = "minimax-easy-vision";
  const CONFIG_FILENAME = "opencode-minimax-easy-vision.json";
  const TEMP_DIR_NAME = "opencode-minimax-vision";
+ const MAX_TOOL_NAME_LENGTH = 256;
+ const DEFAULT_MODEL_PATTERNS = ["minimax/*", "*/abab*"];
+ const DEFAULT_IMAGE_ANALYSIS_TOOL = "mcp_minimax_understand_image";
  const SUPPORTED_MIME_TYPES = new Set([
  "image/png",
  "image/jpeg",
@@ -18,135 +22,200 @@ const MIME_TO_EXTENSION = {
  "image/jpg": "jpg",
  "image/webp": "webp",
  };
- const DEFAULT_MODEL_PATTERNS = ["minimax/*", "*/abab*"];
+ // Plugin State
  let pluginConfig = {};
+ // Config: Path Resolution
  function getUserConfigPath() {
  return join(homedir(), ".config", "opencode", CONFIG_FILENAME);
  }
  function getProjectConfigPath(directory) {
  return join(directory, ".opencode", CONFIG_FILENAME);
  }
- async function loadConfigFile(configPath) {
+ // Config: File Parsing
+ function parseModelsArray(value) {
+ if (!Array.isArray(value))
+ return undefined;
+ const models = value.filter((m) => typeof m === "string");
+ return models.length > 0 ? models : undefined;
+ }
+ function parseImageAnalysisTool(value) {
+ if (typeof value !== "string")
+ return undefined;
+ if (value.trim() === "")
+ return undefined;
+ if (value.length > MAX_TOOL_NAME_LENGTH)
+ return undefined;
+ return value;
+ }
+ function parseConfigObject(raw) {
+ if (!raw || typeof raw !== "object")
+ return {};
+ const obj = raw;
+ return {
+ models: parseModelsArray(obj.models),
+ imageAnalysisTool: parseImageAnalysisTool(obj.imageAnalysisTool),
+ };
+ }
+ async function readConfigFile(configPath) {
+ if (!existsSync(configPath))
+ return null;
  try {
- if (!existsSync(configPath)) {
- return null;
- }
  const content = await readFile(configPath, "utf-8");
  const parsed = JSON.parse(content);
- if (parsed && typeof parsed === "object" && parsed !== null) {
- const config = parsed;
- if (Array.isArray(config.models)) {
- const models = config.models.filter((m) => typeof m === "string");
- return { models };
- }
- }
- return {};
+ return parseConfigObject(parsed);
  }
  catch {
  return null;
  }
  }
- // Config precedence: project > user > defaults
+ // Config: Precedence & Merging (project > user > defaults)
+ function selectWithPrecedence(projectValue, userValue, defaultValue) {
+ if (projectValue !== undefined) {
+ return { value: projectValue, source: "project" };
+ }
+ if (userValue !== undefined) {
+ return { value: userValue, source: "user" };
+ }
+ return { value: defaultValue, source: "default" };
+ }
  async function loadPluginConfig(directory, log) {
- const userConfigPath = getUserConfigPath();
- const projectConfigPath = getProjectConfigPath(directory);
- const userConfig = await loadConfigFile(userConfigPath);
- const projectConfig = await loadConfigFile(projectConfigPath);
- if (projectConfig?.models && projectConfig.models.length > 0) {
- pluginConfig = projectConfig;
- log(`Loaded project config from ${projectConfigPath}: ${projectConfig.models.join(", ")}`);
+ const userConfig = await readConfigFile(getUserConfigPath());
+ const projectConfig = await readConfigFile(getProjectConfigPath(directory));
+ // Resolve models with precedence
+ const modelsResult = selectWithPrecedence(projectConfig?.models, userConfig?.models, undefined);
+ if (modelsResult.source !== "default") {
+ log(`Loaded models from ${modelsResult.source} config: ${modelsResult.value.join(", ")}`);
  }
- else if (userConfig?.models && userConfig.models.length > 0) {
- pluginConfig = userConfig;
- log(`Loaded user config from ${userConfigPath}: ${userConfig.models.join(", ")}`);
+ else {
+ log(`Using default models: ${DEFAULT_MODEL_PATTERNS.join(", ")}`);
+ }
+ // Resolve imageAnalysisTool with precedence
+ const toolResult = selectWithPrecedence(projectConfig?.imageAnalysisTool, userConfig?.imageAnalysisTool, undefined);
+ if (toolResult.source !== "default") {
+ log(`Using imageAnalysisTool from ${toolResult.source} config: ${toolResult.value}`);
  }
  else {
- pluginConfig = {};
- log(`No config found, using defaults: ${DEFAULT_MODEL_PATTERNS.join(", ")}`);
+ log(`Using default imageAnalysisTool: ${DEFAULT_IMAGE_ANALYSIS_TOOL}`);
  }
+ pluginConfig = {
+ models: modelsResult.value,
+ imageAnalysisTool: toolResult.value,
+ };
+ }
+ // Config: Accessors
+ function getConfiguredModels() {
+ return pluginConfig.models ?? DEFAULT_MODEL_PATTERNS;
+ }
+ function getImageAnalysisTool() {
+ return pluginConfig.imageAnalysisTool ?? DEFAULT_IMAGE_ANALYSIS_TOOL;
  }
- // Order matters: check *text* before *text or text* to avoid false matches
- function matchesPattern(pattern, value) {
- const lowerPattern = pattern.toLowerCase();
- const lowerValue = value.toLowerCase();
- if (lowerPattern === "*") {
+ // Pattern Matching (supports wildcards: *, prefix*, *suffix, *contains*)
+ function matchesWildcardPattern(pattern, value) {
+ const p = pattern.toLowerCase();
+ const v = value.toLowerCase();
+ // Global wildcard
+ if (p === "*")
  return true;
+ // Contains: *text*
+ if (p.startsWith("*") && p.endsWith("*") && p.length > 2) {
+ return v.includes(p.slice(1, -1));
  }
- if (lowerPattern.startsWith("*") &&
- lowerPattern.endsWith("*") &&
- lowerPattern.length > 2) {
- const middle = lowerPattern.slice(1, -1);
- return lowerValue.includes(middle);
+ // Prefix: text*
+ if (p.endsWith("*")) {
+ return v.startsWith(p.slice(0, -1));
  }
- if (lowerPattern.endsWith("*")) {
- const prefix = lowerPattern.slice(0, -1);
- return lowerValue.startsWith(prefix);
+ // Suffix: *text
+ if (p.startsWith("*")) {
+ return v.endsWith(p.slice(1));
  }
- if (lowerPattern.startsWith("*")) {
- const suffix = lowerPattern.slice(1);
- return lowerValue.endsWith(suffix);
+ // Exact match
+ return v === p;
+ }
+ function matchesSinglePattern(pattern, model) {
+ // Global wildcard matches everything
+ if (pattern === "*")
+ return true;
+ const slashIndex = pattern.indexOf("/");
+ // No slash: match against both provider and model
+ if (slashIndex === -1) {
+ return (matchesWildcardPattern(pattern, model.modelID) ||
+ matchesWildcardPattern(pattern, model.providerID));
  }
- return lowerValue === lowerPattern;
+ // With slash: match provider/model separately
+ const providerPattern = pattern.slice(0, slashIndex);
+ const modelPattern = pattern.slice(slashIndex + 1);
+ return (matchesWildcardPattern(providerPattern, model.providerID) &&
+ matchesWildcardPattern(modelPattern, model.modelID));
  }
- // Pattern format: "provider/model" with wildcards. No slash = match against both.
- function modelMatchesPatterns(model, patterns) {
+ function modelMatchesAnyPattern(model) {
  if (!model)
  return false;
- for (const pattern of patterns) {
- if (pattern === "*") {
- return true;
- }
- const slashIndex = pattern.indexOf("/");
- if (slashIndex === -1) {
- if (matchesPattern(pattern, model.modelID)) {
- return true;
- }
- if (matchesPattern(pattern, model.providerID)) {
- return true;
- }
- }
- else {
- const providerPattern = pattern.slice(0, slashIndex);
- const modelPattern = pattern.slice(slashIndex + 1);
- const providerMatches = matchesPattern(providerPattern, model.providerID);
- const modelMatches = matchesPattern(modelPattern, model.modelID);
- if (providerMatches && modelMatches) {
- return true;
- }
- }
- }
- return false;
- }
- function shouldApplyVisionHook(model) {
- const patterns = pluginConfig.models && pluginConfig.models.length > 0
- ? pluginConfig.models
- : DEFAULT_MODEL_PATTERNS;
- return modelMatchesPatterns(model, patterns);
+ const patterns = getConfiguredModels();
+ return patterns.some((pattern) => matchesSinglePattern(pattern, model));
  }
+ // Type Guards
+ //
+ // Messages in OpenCode contain "parts" - an array of different content types:
+ // - TextPart: The user's typed text
+ // - FilePart: Attached files (images, PDFs, etc.) with mime type and URL
  function isImageFilePart(part) {
  if (part.type !== "file")
  return false;
- const filePart = part;
- return SUPPORTED_MIME_TYPES.has(filePart.mime?.toLowerCase() ?? "");
+ const mime = part.mime?.toLowerCase() ?? "";
+ return SUPPORTED_MIME_TYPES.has(mime);
  }
  function isTextPart(part) {
  return part.type === "text";
  }
- function parseDataUrl(dataUrl) {
+ // Image Processing: URL Handlers
+ //
+ // Images can arrive via different URL schemes:
+ // - file:// → Already on disk, just need the local path
+ // - data: → Base64-encoded, must decode and save to temp file
+ // - http(s): → Remote URL, pass through for MCP tool to fetch directly
+ function handleFileUrl(url, filePart, log) {
+ // Image is already saved locally; strip the file:// prefix to get the path
+ const localPath = url.replace("file://", "");
+ log(`Image already on disk: ${localPath}`);
+ return { path: localPath, mime: filePart.mime, partId: filePart.id };
+ }
+ function parseBase64DataUrl(dataUrl) {
  const match = dataUrl.match(/^data:([^;]+);base64,(.+)$/);
  if (!match)
  return null;
  try {
- return {
- mime: match[1],
- data: Buffer.from(match[2], "base64"),
- };
+ return { mime: match[1], data: Buffer.from(match[2], "base64") };
  }
  catch {
  return null;
  }
  }
- function getExtension(mime) {
+ async function handleDataUrl(url, filePart, log) {
+ // Pasted clipboard images arrive as base64 data URLs.
+ // Decode and save to a temp file so the MCP tool can read it.
+ const parsed = parseBase64DataUrl(url);
+ if (!parsed) {
+ log(`Failed to parse data URL for part ${filePart.id}`);
+ return null;
+ }
+ try {
+ const savedPath = await saveImageToTemp(parsed.data, parsed.mime);
+ log(`Saved image to: ${savedPath}`);
+ return { path: savedPath, mime: parsed.mime, partId: filePart.id };
+ }
+ catch (err) {
+ log(`Failed to save image: ${err}`);
+ return null;
+ }
+ }
+ function handleHttpUrl(url, filePart, log) {
+ // Remote URLs are passed directly to the MCP tool, which can fetch them itself.
+ // This avoids unnecessary network requests and disk I/O.
+ log(`Image is remote URL: ${url}`);
+ return { path: url, mime: filePart.mime, partId: filePart.id };
+ }
+ // Image Processing: File Operations
+ function getExtensionForMime(mime) {
  return MIME_TO_EXTENSION[mime.toLowerCase()] ?? "png";
  }
  async function ensureTempDir() {
@@ -156,91 +225,112 @@ async function ensureTempDir() {
  }
  async function saveImageToTemp(data, mime) {
  const tempDir = await ensureTempDir();
- const ext = getExtension(mime);
- const filename = `${randomUUID()}.${ext}`;
+ const filename = `${randomUUID()}.${getExtensionForMime(mime)}`;
  const filepath = join(tempDir, filename);
  await writeFile(filepath, data);
  return filepath;
  }
- function generateInjectionPrompt(imagePaths, userText) {
- if (imagePaths.length === 0)
+ // Image Processing: Main Processor
+ async function processImagePart(filePart, log) {
+ const url = filePart.url;
+ if (!url) {
+ log(`Skipping image part ${filePart.id}: no URL`);
+ return null;
+ }
+ if (url.startsWith("file://")) {
+ return handleFileUrl(url, filePart, log);
+ }
+ if (url.startsWith("data:")) {
+ return handleDataUrl(url, filePart, log);
+ }
+ if (url.startsWith("http://") || url.startsWith("https://")) {
+ return handleHttpUrl(url, filePart, log);
+ }
+ log(`Unsupported URL scheme for part ${filePart.id}: ${url.substring(0, 50)}...`);
+ return null;
+ }
+ async function extractImagesFromParts(parts, log) {
+ const savedImages = [];
+ for (const part of parts) {
+ if (!isImageFilePart(part))
+ continue;
+ const result = await processImagePart(part, log);
+ if (result) {
+ savedImages.push(result);
+ }
+ }
+ return savedImages;
+ }
+ // Prompt Generation
+ //
+ // Since the target model doesn't natively understand image attachments,
+ // we replace them with text instructions that tell the model to use an
+ // MCP tool (e.g., understand_image) with the file path or URL.
+ // The user's original text is preserved as "User's request: ...".
+ function generateInjectionPrompt(images, userText, toolName) {
+ if (images.length === 0)
  return userText;
- const isSingle = imagePaths.length === 1;
- const imageList = imagePaths
+ const isSingle = images.length === 1;
+ const imageList = images
  .map((img, idx) => `- Image ${idx + 1}: ${img.path}`)
  .join("\n");
- return `The user has shared ${isSingle ? "an image" : `${imagePaths.length} images`}. The ${isSingle ? "image is" : "images are"} saved at:
+ const imageCountText = isSingle ? "an image" : `${images.length} images`;
+ const imagePlural = isSingle ? "image is" : "images are";
+ const analyzeText = isSingle ? "this image" : "each image";
+ return `The user has shared ${imageCountText}. The ${imagePlural} saved at:
  ${imageList}

- Use the \`mcp_minimax_understand_image\` tool to analyze ${isSingle ? "this image" : "each image"}. Pass the file path as \`image_source\` and describe what to look for in \`prompt\`.
+ Use the \`${toolName}\` tool to analyze ${analyzeText}.

  User's request: ${userText || "(analyze the image)"}`;
  }
- async function processMessageImages(parts, log) {
- const savedImages = [];
- for (const part of parts) {
- if (!isImageFilePart(part))
- continue;
- const filePart = part;
- const url = filePart.url;
- if (!url) {
- log(`Skipping image part ${filePart.id}: no URL`);
- continue;
+ // Message Transformation
+ //
+ // The transformation flow:
+ // 1. Find the last user message (most recent request)
+ // 2. Extract and save any images from its parts
+ // 3. Remove the image parts (they can't be sent to the model)
+ // 4. Replace/update the text part with injection instructions
+ function findLastUserMessage(messages) {
+ for (let i = messages.length - 1; i >= 0; i--) {
+ if (messages[i].info.role === "user") {
+ return { message: messages[i], index: i };
  }
- if (url.startsWith("file://")) {
- const localPath = url.replace("file://", "");
- log(`Image already on disk: ${localPath}`);
- savedImages.push({
- path: localPath,
- mime: filePart.mime,
- partId: filePart.id,
- });
- continue;
- }
- if (url.startsWith("data:")) {
- const parsed = parseDataUrl(url);
- if (!parsed) {
- log(`Failed to parse data URL for part ${filePart.id}`);
- continue;
- }
- try {
- const savedPath = await saveImageToTemp(parsed.data, parsed.mime);
- log(`Saved image to: ${savedPath}`);
- savedImages.push({
- path: savedPath,
- mime: parsed.mime,
- partId: filePart.id,
- });
- }
- catch (err) {
- log(`Failed to save image: ${err}`);
- }
- continue;
- }
- if (url.startsWith("http://") || url.startsWith("https://")) {
- log(`Image is remote URL: ${url}`);
- savedImages.push({
- path: url,
- mime: filePart.mime,
- partId: filePart.id,
- });
- continue;
- }
- log(`Unsupported URL scheme for part ${filePart.id}: ${url.substring(0, 50)}...`);
  }
- return savedImages;
+ return null;
+ }
+ function getModelFromMessage(message) {
+ const info = message.info;
+ return info.model;
+ }
+ function removeProcessedImageParts(parts, processedIds) {
+ // Remove image parts that were successfully processed; they've been converted
+ // to file paths in the injection prompt and the model can't interpret raw images.
+ return parts.filter((part) => !(part.type === "file" && processedIds.has(part.id)));
+ }
+ function updateOrCreateTextPart(message, newText) {
+ const textPartIndex = message.parts.findIndex(isTextPart);
+ if (textPartIndex !== -1) {
+ message.parts[textPartIndex].text = newText;
+ }
+ else {
+ const newTextPart = {
+ id: `transformed-${randomUUID()}`,
+ sessionID: message.info.sessionID,
+ messageID: message.info.id,
+ type: "text",
+ text: newText,
+ synthetic: true,
+ };
+ message.parts.unshift(newTextPart);
+ }
  }
+ // Plugin Export
  export const MinimaxEasyVisionPlugin = async (input) => {
  const { client, directory } = input;
  const log = (msg) => {
  client.app
- .log({
- body: {
- service: PLUGIN_NAME,
- level: "info",
- message: msg,
- },
- })
+ .log({ body: { service: PLUGIN_NAME, level: "info", message: msg } })
  .catch(() => { });
  };
  await loadPluginConfig(directory, log);
@@ -248,29 +338,19 @@ export const MinimaxEasyVisionPlugin = async (input) => {
  return {
  "experimental.chat.messages.transform": async (_input, output) => {
  const { messages } = output;
- let lastUserMessage;
- let lastUserIndex = -1;
- for (let i = messages.length - 1; i >= 0; i--) {
- if (messages[i].info.role === "user") {
- lastUserMessage = messages[i];
- lastUserIndex = i;
- break;
- }
- }
- if (!lastUserMessage) {
+ const result = findLastUserMessage(messages);
+ if (!result)
  return;
- }
- const userInfo = lastUserMessage.info;
- if (!shouldApplyVisionHook(userInfo.model)) {
+ const { message: lastUserMessage, index: lastUserIndex } = result;
+ const model = getModelFromMessage(lastUserMessage);
+ if (!modelMatchesAnyPattern(model))
  return;
- }
  log("Model matched, checking for images...");
  const hasImages = lastUserMessage.parts.some(isImageFilePart);
- if (!hasImages) {
+ if (!hasImages)
  return;
- }
  log("Found images in message, processing...");
- const savedImages = await processMessageImages(lastUserMessage.parts, log);
+ const savedImages = await extractImagesFromParts(lastUserMessage.parts, log);
  if (savedImages.length === 0) {
  log("No images were successfully saved");
  return;
@@ -278,25 +358,10 @@ export const MinimaxEasyVisionPlugin = async (input) => {
  log(`Saved ${savedImages.length} image(s), transforming message...`);
  const existingTextPart = lastUserMessage.parts.find(isTextPart);
  const userText = existingTextPart?.text ?? "";
- const transformedText = generateInjectionPrompt(savedImages.map((img) => ({ path: img.path, mime: img.mime })), userText);
- const processedPartIds = new Set(savedImages.map((img) => img.partId));
- lastUserMessage.parts = lastUserMessage.parts.filter((part) => !(part.type === "file" && processedPartIds.has(part.id)));
- const textPartIndex = lastUserMessage.parts.findIndex(isTextPart);
- if (textPartIndex !== -1) {
- const textPart = lastUserMessage.parts[textPartIndex];
- textPart.text = transformedText;
- }
- else {
- const newTextPart = {
- id: `transformed-${randomUUID()}`,
- sessionID: lastUserMessage.info.sessionID,
- messageID: lastUserMessage.info.id,
- type: "text",
- text: transformedText,
- synthetic: true,
- };
- lastUserMessage.parts.unshift(newTextPart);
- }
+ const transformedText = generateInjectionPrompt(savedImages, userText, getImageAnalysisTool());
+ const processedIds = new Set(savedImages.map((img) => img.partId));
+ lastUserMessage.parts = removeProcessedImageParts(lastUserMessage.parts, processedIds);
+ updateOrCreateTextPart(lastUserMessage, transformedText);
  messages[lastUserIndex] = lastUserMessage;
  log("Successfully injected image path instructions");
  },
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "opencode-minimax-easy-vision",
- "version": "1.1.1",
+ "version": "1.2.0",
  "description": "OpenCode plugin that enables vision support for Minimax models by saving pasted images and injecting MCP tool instructions",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",