images-mcp 1.2.1 → 1.2.2

Files changed (6)
  1. package/README.md +220 -40
  2. package/cli.ts +252 -38
  3. package/core.ts +52 -21
  4. package/mcp.ts +94 -99
  5. package/package.json +9 -4
  6. package/schemas.ts +35 -0
package/README.md CHANGED
@@ -1,62 +1,242 @@
  # images-mcp
 
- An MCP server for AI image generation.
+ Generate and edit images from the command line (`images-mcp`) or as an MCP server.
 
- ## Tools
+ ## Requirements
 
- ### `openai_generate_image`
- Generate or edit images using OpenAI.
+ - Bun `>=1.0.0`
 
- | Parameter | Default | Options |
- |-----------|---------|---------|
- | `prompt` | required | Text description or editing instructions |
- | `output_path` | required | File path to save (.png) |
- | `model` | `gpt-image-1.5` | `gpt-image-1.5` |
- | `input_images` | - | Array of image file paths for editing/reference |
- | `size` | `auto` | `auto`, `1024x1024`, `1536x1024`, `1024x1536` |
- | `quality` | `auto` | `auto`, `high`, `medium`, `low` |
- | `background` | `auto` | `auto`, `transparent`, `opaque` |
+ ## CLI (Primary Interface)
 
- ### `gemini_generate_image`
- Generate or edit images using Google Gemini.
+ ### Quick Start
 
- | Parameter | Default | Options |
- |-----------|---------|---------|
- | `prompt` | required | Text description or editing instructions |
- | `output_path` | required | File path to save (.png) |
- | `model` | `gemini-3-pro-image-preview` | `gemini-2.5-flash-image`, `gemini-3-pro-image-preview` |
- | `input_images` | - | Array of image file paths for editing/reference |
- | `aspect_ratio` | - | `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
- | `image_size` | - | `1K`, `2K`, `4K` |
+ ```bash
+ bun install
+ ```
 
- ## Environment Variables
+ Set API keys (use one or both providers):
 
- - `OPENAI_API_KEY` - Required for OpenAI
- - `GEMINI_API_KEY` or `GOOGLE_API_KEY` - Required for Gemini
+ ```bash
+ export OPENAI_API_KEY="..."
+ export GEMINI_API_KEY="..." # or GOOGLE_API_KEY
+ ```
 
- ## MCP Server
+ Run directly from this repo:
 
  ```bash
- bun install
- bun run start
+ bun run cli.ts --help
  ```
 
- ## CLI
+ If `images-mcp` is on your `PATH`, you can run:
 
- The package also works as a CLI with subcommands.
+ ```bash
+ images-mcp --help
+ ```
+
+ ### Install/Run Modes
+
+ 1. Source checkout mode (always works in this repo):
+ - `bun run cli.ts <command> [flags]`
+ 2. Binary-on-path mode (after package/global install or linking):
+ - `images-mcp <command> [flags]`
+ 3. MCP server mode (for MCP clients, stdio transport):
+ - `bun run start` (same as `bun run mcp.ts`)
+
+ ### Commands
 
  ```bash
- # OpenAI
- images-mcp openai --prompt "A neon cat" --output ./cat.png
+ images-mcp openai [flags]
+ images-mcp gemini [flags]
+ images-mcp --help
+ ```
+
+ ### CLI Flags (Full Reference)
+
+ Common flags (both commands):
+
+ | Flag | Required | Default | Notes |
+ |---|---|---|---|
+ | `--prompt <text>` | yes* | - | Prompt or edit instructions (`*` can be read from stdin when `--prompt` is omitted) |
+ | `--output <path>` | yes | - | Output file path |
+ | `--input <path>` | no | - | Repeatable input image path |
+ | `--inputs <path>` | no | - | Backward-compatible alias of `--input` (accepted, not shown in built-in help text) |
+ | `--help`, `-h` | no | - | Print usage |
+
+ OpenAI flags (`images-mcp openai`):
+
+ | Flag | Required | Default | Allowed values |
+ |---|---|---|---|
+ | `--model <value>` | no | `gpt-image-1.5` | `gpt-image-1.5` |
+ | `--size <value>` | no | `auto` | `auto`, `1024x1024`, `1536x1024`, `1024x1536` |
+ | `--quality <value>` | no | `auto` | `auto`, `high`, `medium`, `low` |
+ | `--background <value>` | no | `auto` | `auto`, `transparent`, `opaque` |
+
+ OpenAI output file extensions:
 
- # OpenAI edit
- images-mcp openai --prompt "Make it snowy" --output ./cat-snow.png --input ./cat.png
+ - `.png`, `.jpg`, `.jpeg`, `.webp`
 
- # Gemini
- images-mcp gemini --prompt "A ceramic teapot" --output ./teapot.png
+ Gemini flags (`images-mcp gemini`):
 
- # Gemini edit
- images-mcp gemini --prompt "Make it blue" --output ./teapot-blue.png --input ./teapot.png
+ | Flag | Required | Default | Allowed values |
+ |---|---|---|---|
+ | `--model <value>` | no | `gemini-3-pro-image-preview` | `gemini-3-pro-image-preview`, `gemini-2.5-flash-image` |
+ | `--aspect-ratio <value>` | no | unset | `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
+ | `--image-size <value>` | no | unset | `1K`, `2K`, `4K` |
+
+ Gemini output file extensions:
+
+ - `.png`
+
+ ### CLI Examples
+
+ Generate with OpenAI:
+
+ ```bash
+ bun run cli.ts openai \
+ --prompt "A neon cat in rainy Tokyo, cinematic lighting" \
+ --output ./cat.png
+ ```
+
+ Generate with OpenAI by piping prompt from stdin:
+
+ ```bash
+ cat prompt.txt | bun run cli.ts openai --output ./cat.png
+ ```
+
+ Or with stdin redirection:
+
+ ```bash
+ bun run cli.ts openai --output ./cat.png < prompt.txt
+ ```
+
+ Edit with OpenAI:
+
+ ```bash
+ bun run cli.ts openai \
+ --prompt "Add snow and keep the cat centered" \
+ --output ./cat-snow.png \
+ --input ./cat.png
+ ```
+
+ Generate with Gemini:
+
+ ```bash
+ bun run cli.ts gemini \
+ --prompt "A ceramic teapot product photo on white background" \
+ --output ./teapot.png \
+ --aspect-ratio 4:3 \
+ --image-size 2K
  ```
 
- Run `images-mcp --help` for all flags.
+ Edit with Gemini and multiple references:
+
+ ```bash
+ bun run cli.ts gemini \
+ --prompt "Combine both references into one consistent illustration" \
+ --output ./combined.png \
+ --input ./ref-1.png \
+ --input ./ref-2.png
+ ```
+
+ ### JSON Output Behavior
+
+ Successful runs print formatted JSON to stdout. Shape:
+
+ ```json
+ {
+ "success": true,
+ "path": "/absolute/path/to/output.png",
+ "bytes": 123456,
+ "...provider_fields": "..."
+ }
+ ```
+
+ Provider-specific success fields:
+
+ - OpenAI: `model`, `size`, `quality`, `input_images_count`
+ - Gemini: `model`, `aspect_ratio`, `image_size`, `input_images_count`
+
+ ### Errors and Exit Codes
+
+ - Exit `0`: Help (`--help`) or successful generation.
+ - Exit `1`: Argument parsing/validation errors, runtime errors, API/auth errors, file errors.
+ - Parse/validation errors are written to `stderr` as `<message>` followed by usage.
+ - Runtime errors are written to `stderr` as `Error: <message>`.
+
+ Common parse failures:
+
+ - Missing required flags (when neither `--prompt` nor non-empty piped stdin prompt is provided):
+ - `Missing required --prompt or --output`
+ - Missing value for a flag:
+ - `Missing value for --output`
+ - Unknown flags:
+ - `Unknown flag(s) for openai: --foo`
+ - Unexpected positional argument:
+ - `Unexpected argument: value`
+ - Invalid enum value:
+ - `Invalid value for --size: "500x500". Allowed values: auto, 1024x1024, 1536x1024, 1024x1536`
+
+ ## Troubleshooting
+
+ ### Missing API Key
+
+ OpenAI command without `OPENAI_API_KEY` fails in the OpenAI SDK.
+
+ Gemini command without both `GEMINI_API_KEY` and `GOOGLE_API_KEY` fails with:
+
+ ```text
+ Error: Missing GEMINI_API_KEY or GOOGLE_API_KEY environment variable
+ ```
+
+ Fix:
+
+ ```bash
+ export OPENAI_API_KEY="..."
+ export GEMINI_API_KEY="..." # or GOOGLE_API_KEY
+ ```
+
+ ### Invalid or Unsupported Flags
+
+ If you pass a flag not supported by the selected command, the CLI exits with code `1` and prints a command-specific unknown-flag message.
+
+ If you pass an unsupported value, the CLI prints the allowed values for that flag.
+
+ ## MCP Server (Also Supported)
+
+ The same image functionality is available over MCP/stdin-stdout transport.
+
+ Start server:
+
+ ```bash
+ bun run start
+ ```
+
+ Registered MCP tools:
+
+ 1. `openai_generate_image`
+ 2. `gemini_generate_image`
+
+ Tool parameter defaults/options mirror the same schemas used by the CLI:
+
+ OpenAI tool params:
+
+ | Parameter | Default | Allowed values |
+ |---|---|---|
+ | `prompt` | required | text |
+ | `output_path` | required | path |
+ | `model` | `gpt-image-1.5` | `gpt-image-1.5` |
+ | `input_images` | unset | string[] |
+ | `size` | `auto` | `auto`, `1024x1024`, `1536x1024`, `1024x1536` |
+ | `quality` | `auto` | `auto`, `high`, `medium`, `low` |
+ | `background` | `auto` | `auto`, `transparent`, `opaque` |
+
+ Gemini tool params:
+
+ | Parameter | Default | Allowed values |
+ |---|---|---|
+ | `prompt` | required | text |
+ | `output_path` | required | path |
+ | `model` | `gemini-3-pro-image-preview` | `gemini-3-pro-image-preview`, `gemini-2.5-flash-image` |
+ | `input_images` | unset | string[] |
+ | `aspect_ratio` | unset | `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
+ | `image_size` | unset | `1K`, `2K`, `4K` |
package/cli.ts CHANGED
@@ -1,12 +1,25 @@
  #!/usr/bin/env bun
+ import { Cli, Command, Option } from "clipanion";
+ import tty from "node:tty";
  import {
  generateGeminiImage,
  generateOpenAIImage,
+ } from "./core.ts";
+ import {
+ geminiAspectRatioSchema,
+ geminiImageSizeSchema,
+ geminiModelSchema,
+ geminiParamsSchema,
+ openAIBackgroundSchema,
+ openAIModelSchema,
+ openAIParamsSchema,
+ openAIQualitySchema,
+ openAISizeSchema,
  type GeminiParams,
  type OpenAIParams,
- } from "./core.ts";
+ } from "./schemas.ts";
 
- type ParsedArgs =
+ export type ParsedArgs =
  | { mode: "openai"; params: OpenAIParams }
  | { mode: "gemini"; params: GeminiParams }
  | { mode: "help"; message?: string };
@@ -15,14 +28,15 @@ function printUsage(message?: string) {
  if (message) {
  console.error(message);
  }
- console.error(`images-mcp (CLI)
+ const write = message ? console.error : console.log;
+ write(`images-mcp (CLI)
 
  Usage:
  images-mcp openai [args] Generate/edit via OpenAI
  images-mcp gemini [args] Generate/edit via Gemini
 
  Common args:
- --prompt Text prompt (required)
+ --prompt Text prompt (required unless piped via stdin)
  --output Output file path (required)
  --input Input image path (repeatable)
 
@@ -39,7 +53,138 @@ Gemini args:
  `);
  }
 
- function parseArgs(argv: string[]): ParsedArgs {
+ const formatEnumError = (flag: string, value: string, allowed: readonly string[]) =>
+ `Invalid value for --${flag}: "${value}". Allowed values: ${allowed.join(", ")}`;
+ const isAllowedEnumValue = (value: string, allowed: readonly string[]) => allowed.includes(value);
+
+ const formatUnknownFlagsError = (command: "openai" | "gemini", flags: string[]) =>
+ `Unknown flag(s) for ${command}: ${flags.map((flag) => `--${flag}`).join(", ")}`;
+
+ const uniqueInOrder = (values: string[]) => [...new Set(values)];
+
+ const parseInputImages = (input: string[] | undefined, inputs: string[] | undefined) => {
+ const all = [...(input ?? []), ...(inputs ?? [])];
+ return all.length > 0 ? all : undefined;
+ };
+
+ class OpenAICommand extends Command {
+ static override paths = [["openai"]];
+
+ prompt = Option.String("--prompt", { required: false });
+ output = Option.String("--output", { required: false });
+ input = Option.Array("--input", [], { arity: 1 });
+ inputs = Option.Array("--inputs", [], { arity: 1 });
+ model = Option.String("--model", { required: false });
+ size = Option.String("--size", { required: false });
+ quality = Option.String("--quality", { required: false });
+ background = Option.String("--background", { required: false });
+
+ override async execute() {
+ return 0;
+ }
+ }
+
+ class GeminiCommand extends Command {
+ static override paths = [["gemini"]];
+
+ prompt = Option.String("--prompt", { required: false });
+ output = Option.String("--output", { required: false });
+ input = Option.Array("--input", [], { arity: 1 });
+ inputs = Option.Array("--inputs", [], { arity: 1 });
+ model = Option.String("--model", { required: false });
+ aspectRatio = Option.String("--aspect-ratio", { required: false });
+ imageSize = Option.String("--image-size", { required: false });
+
+ override async execute() {
+ return 0;
+ }
+ }
+
+ function createCli() {
+ const cli = new Cli({
+ binaryName: "images-mcp",
+ binaryLabel: "images-mcp",
+ });
+
+ cli.register(OpenAICommand);
+ cli.register(GeminiCommand);
+
+ return cli;
+ }
+
+ function structuralErrorForArgs(args: string[]): string | undefined {
+ for (let i = 0; i < args.length; i += 1) {
+ const token = args[i];
+ if (!token) break;
+ if (token === "--help" || token === "-h") return undefined;
+ if (!token.startsWith("--")) return `Unexpected argument: ${token}`;
+
+ const key = token.slice(2);
+ const value = args[i + 1];
+ if (!value || value.startsWith("--")) return `Missing value for --${key}`;
+ i += 1;
+ }
+ }
+
+ function findUnknownFlags(command: "openai" | "gemini", args: string[]): string[] {
+ const known = command === "openai"
+ ? new Set(["prompt", "output", "input", "inputs", "model", "size", "quality", "background"])
+ : new Set(["prompt", "output", "input", "inputs", "model", "aspect-ratio", "image-size"]);
+
+ const unknownFlags: string[] = [];
+ for (let i = 0; i < args.length; i += 1) {
+ const token = args[i];
+ if (!token?.startsWith("--")) {
+ continue;
+ }
+
+ const key = token.slice(2);
+ if (!known.has(key)) {
+ unknownFlags.push(key);
+ }
+
+ i += 1;
+ }
+
+ return uniqueInOrder(unknownFlags);
+ }
+
+ function parseCommand(command: "openai" | "gemini", args: string[]) {
+ const cli = createCli();
+
+ try {
+ const parsed = cli.process([command, ...args]);
+ if (parsed instanceof OpenAICommand || parsed instanceof GeminiCommand) {
+ return parsed;
+ }
+ return { mode: "help", message: `Unknown command: ${command}` } as const;
+ } catch (error) {
+ return {
+ mode: "help",
+ message: error instanceof Error ? error.message.split("\n")[0] : String(error),
+ } as const;
+ }
+ }
+
+ function parseOpenAICommand(args: string[]) {
+ const parsed = parseCommand("openai", args);
+ if ("mode" in parsed) return parsed;
+ if (!(parsed instanceof OpenAICommand)) {
+ return { mode: "help", message: "Unknown command: openai" } as const;
+ }
+ return parsed;
+ }
+
+ function parseGeminiCommand(args: string[]) {
+ const parsed = parseCommand("gemini", args);
+ if ("mode" in parsed) return parsed;
+ if (!(parsed instanceof GeminiCommand)) {
+ return { mode: "help", message: "Unknown command: gemini" } as const;
+ }
+ return parsed;
+ }
+
+ export function parseArgs(argv: string[]): ParsedArgs {
  if (argv.length === 0) return { mode: "help" };
 
  const command = argv[0];
@@ -51,58 +196,125 @@ function parseArgs(argv: string[]): ParsedArgs {
  return { mode: "help", message: `Unknown command: ${command}` };
  }
 
- const flags: Record<string, string[]> = {};
- for (let i = 0; i < rest.length; i += 1) {
- const token = rest[i];
- if (!token) break;
- if (token === "--help" || token === "-h") return { mode: "help" };
- if (!token.startsWith("--")) {
- return { mode: "help", message: `Unexpected argument: ${token}` };
- }
- const key = token.slice(2);
- const value = rest[i + 1];
- if (!value || value.startsWith("--")) {
- return { mode: "help", message: `Missing value for --${key}` };
- }
- if (!flags[key]) flags[key] = [];
- flags[key].push(value);
- i += 1;
- }
+ if (rest.includes("--help") || rest.includes("-h")) return { mode: "help" };
 
- const prompt = flags.prompt?.[0];
- const output_path = flags.output?.[0];
- const input_images = flags.input ?? flags.inputs;
+ const structuralError = structuralErrorForArgs(rest);
+ if (structuralError) {
+ return { mode: "help", message: structuralError };
+ }
 
- if (!prompt || !output_path) {
- return { mode: "help", message: "Missing required --prompt or --output" };
+ const unknownFlags = findUnknownFlags(command, rest);
+ if (unknownFlags.length > 0) {
+ return { mode: "help", message: formatUnknownFlagsError(command, unknownFlags) };
  }
 
  if (command === "openai") {
- const params: OpenAIParams = {
+ const parsed = parseOpenAICommand(rest);
+ if ("mode" in parsed) return parsed;
+
+ const prompt = parsed.prompt;
+ const output_path = parsed.output;
+ const input_images = parseInputImages(parsed.input, parsed.inputs);
+ if (!prompt || !output_path) {
+ return { mode: "help", message: "Missing required --prompt or --output" };
+ }
+
+ const model = parsed.model ?? "gpt-image-1.5";
+ const size = parsed.size ?? "auto";
+ const quality = parsed.quality ?? "auto";
+ const background = parsed.background ?? "auto";
+
+ if (!isAllowedEnumValue(model, openAIModelSchema.options)) {
+ return { mode: "help", message: formatEnumError("model", model, openAIModelSchema.options) };
+ }
+ if (!isAllowedEnumValue(size, openAISizeSchema.options)) {
+ return { mode: "help", message: formatEnumError("size", size, openAISizeSchema.options) };
+ }
+ if (!isAllowedEnumValue(quality, openAIQualitySchema.options)) {
+ return { mode: "help", message: formatEnumError("quality", quality, openAIQualitySchema.options) };
+ }
+ if (!isAllowedEnumValue(background, openAIBackgroundSchema.options)) {
+ return { mode: "help", message: formatEnumError("background", background, openAIBackgroundSchema.options) };
+ }
+
+ const validated = openAIParamsSchema.safeParse({
  prompt,
  output_path,
- model: (flags.model?.[0] as OpenAIParams["model"]) ?? "gpt-image-1.5",
+ model,
  input_images,
- size: (flags.size?.[0] as OpenAIParams["size"]) ?? "auto",
- quality: (flags.quality?.[0] as OpenAIParams["quality"]) ?? "auto",
- background: (flags.background?.[0] as OpenAIParams["background"]) ?? "auto",
+ size,
+ quality,
+ background,
+ });
+ if (!validated.success) {
+ return { mode: "help", message: `Invalid OpenAI parameters: ${validated.error.issues[0]?.message ?? "Unknown error"}` };
+ }
+
+ const params: OpenAIParams = {
+ ...validated.data,
  };
  return { mode: "openai", params };
  }
 
- const params: GeminiParams = {
+ const parsed = parseGeminiCommand(rest);
+ if ("mode" in parsed) return parsed;
+
+ const prompt = parsed.prompt;
+ const output_path = parsed.output;
+ const input_images = parseInputImages(parsed.input, parsed.inputs);
+ if (!prompt || !output_path) {
+ return { mode: "help", message: "Missing required --prompt or --output" };
+ }
+
+ const model = parsed.model ?? "gemini-3-pro-image-preview";
+ const aspectRatio = parsed.aspectRatio;
+ const imageSize = parsed.imageSize;
+ if (!isAllowedEnumValue(model, geminiModelSchema.options)) {
+ return { mode: "help", message: formatEnumError("model", model, geminiModelSchema.options) };
+ }
+ if (aspectRatio && !isAllowedEnumValue(aspectRatio, geminiAspectRatioSchema.options)) {
+ return { mode: "help", message: formatEnumError("aspect-ratio", aspectRatio, geminiAspectRatioSchema.options) };
+ }
+ if (imageSize && !isAllowedEnumValue(imageSize, geminiImageSizeSchema.options)) {
+ return { mode: "help", message: formatEnumError("image-size", imageSize, geminiImageSizeSchema.options) };
+ }
+
+ const validated = geminiParamsSchema.safeParse({
  prompt,
  output_path,
- model: (flags.model?.[0] as GeminiParams["model"]) ?? "gemini-3-pro-image-preview",
+ model,
  input_images,
- aspect_ratio: flags["aspect-ratio"]?.[0] as GeminiParams["aspect_ratio"],
- image_size: flags["image-size"]?.[0] as GeminiParams["image_size"],
+ aspect_ratio: aspectRatio,
+ image_size: imageSize,
+ });
+ if (!validated.success) {
+ return { mode: "help", message: `Invalid Gemini parameters: ${validated.error.issues[0]?.message ?? "Unknown error"}` };
+ }
+
+ const params: GeminiParams = {
+ ...validated.data,
  };
  return { mode: "gemini", params };
  }
 
  async function run() {
- const parsed = parseArgs(process.argv.slice(2));
+ const argv = process.argv.slice(2);
+ let parsed = parseArgs(argv);
+
+ const isMissingPromptOrOutput = parsed.mode === "help" && parsed.message === "Missing required --prompt or --output";
+ const hasPromptFlag = argv.includes("--prompt");
+ const stdinIsTTY = tty.isatty(0);
+
+ if (isMissingPromptOrOutput && !hasPromptFlag && !stdinIsTTY) {
+ const promptFromStdin = (await Bun.stdin.text()).trim();
+ if (promptFromStdin) {
+ const command = argv[0];
+ if (command === "openai" || command === "gemini") {
+ parsed = parseArgs([command, "--prompt", promptFromStdin, ...argv.slice(1)]);
+ }
+ }
+ }
+
  if (parsed.mode === "help") {
  printUsage(parsed.message);
  process.exit(parsed.message ? 1 : 0);
@@ -126,4 +338,6 @@ async function run() {
  }
  }
 
- await run();
+ if (import.meta.main) {
+ await run();
+ }
package/core.ts CHANGED
@@ -1,6 +1,8 @@
  import OpenAI from "openai";
  import { GoogleGenAI } from "@google/genai";
  import path from "node:path";
+ import type { GeminiParams, OpenAIParams } from "./schemas.ts";
+ export type { GeminiParams, OpenAIParams } from "./schemas.ts";
 
  const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
@@ -10,27 +12,21 @@ const MIME_TYPES: Record<string, string> = {
  ".webp": "image/webp",
  };
 
+ const OUTPUT_FORMATS = {
+ openai: {
+ ".png": { format: "png", mimeType: "image/png" },
+ ".jpg": { format: "jpeg", mimeType: "image/jpeg" },
+ ".jpeg": { format: "jpeg", mimeType: "image/jpeg" },
+ ".webp": { format: "webp", mimeType: "image/webp" },
+ },
+ gemini: {
+ ".png": { format: "png", mimeType: "image/png" },
+ },
+ } as const;
+
+ type Provider = keyof typeof OUTPUT_FORMATS;
  type GenerateResult<T> = { ok: true; data: T } | { ok: false; error: string };
 
- export type OpenAIParams = {
- prompt: string;
- output_path: string;
- model: "gpt-image-1.5";
- input_images?: string[];
- size: "auto" | "1024x1024" | "1536x1024" | "1024x1536";
- quality: "auto" | "high" | "medium" | "low";
- background: "auto" | "transparent" | "opaque";
- };
-
- export type GeminiParams = {
- prompt: string;
- output_path: string;
- model: "gemini-2.5-flash-image" | "gemini-3-pro-image-preview";
- input_images?: string[];
- aspect_ratio?: "1:1" | "2:3" | "3:2" | "3:4" | "4:3" | "4:5" | "5:4" | "9:16" | "16:9" | "21:9";
- image_size?: "1K" | "2K" | "4K";
- };
-
  export type OpenAIResult = {
  success: true;
  path: string;
@@ -55,6 +51,29 @@ export function getMimeType(filePath: string): string {
  return MIME_TYPES[path.extname(filePath).toLowerCase()] || "image/png";
  }
 
+ export function createOpenAIUploadFile(imageData: ArrayBuffer, imagePathOrName: string): File {
+ return new File([imageData], path.basename(imagePathOrName), { type: getMimeType(imagePathOrName) });
+ }
+
+ export function resolveOutputFormat(provider: Provider, outputPath: string):
+ | { ok: true; format: "png" | "jpeg" | "webp"; mimeType: string }
+ | { ok: false; error: string } {
+ const ext = path.extname(outputPath).toLowerCase();
+ const providerFormats = OUTPUT_FORMATS[provider] as Record<string, { format: "png" | "jpeg" | "webp"; mimeType: string }>;
+ const resolved = providerFormats[ext];
+
+ if (!resolved) {
+ const allowed = Object.keys(providerFormats).join(", ");
+ const shown = ext || "(none)";
+ return {
+ ok: false,
+ error: `Unsupported output extension for ${provider}: ${shown}. Allowed extensions: ${allowed}`,
+ };
+ }
+
+ return { ok: true, ...resolved };
+ }
+
  export async function readImageFile(imagePath: string): Promise<{ data: ArrayBuffer; name: string } | { error: string }> {
  const file = Bun.file(imagePath);
  if (!(await file.exists())) {
@@ -100,6 +119,11 @@ export async function generateOpenAIImage({
  quality,
  background,
  }: OpenAIParams): Promise<GenerateResult<OpenAIResult>> {
+ const outputFormat = resolveOutputFormat("openai", output_path);
+ if (!outputFormat.ok) {
+ return { ok: false, error: outputFormat.error };
+ }
+
  let imageData: string | undefined;
 
  if (input_images?.length) {
@@ -109,7 +133,7 @@ export async function generateOpenAIImage({
  if ("error" in result) {
  return { ok: false, error: result.error };
  }
- imageFiles.push(new File([result.data], result.name, { type: "image/png" }));
+ imageFiles.push(createOpenAIUploadFile(result.data, imagePath));
  }
 
  const response = await getOpenAI().images.edit({
@@ -117,6 +141,8 @@ export async function generateOpenAIImage({
  prompt,
  image: imageFiles.length === 1 ? imageFiles[0] : imageFiles,
  size: size === "auto" ? undefined : size,
+ background,
+ output_format: outputFormat.format,
  } as Parameters<OpenAI["images"]["edit"]>[0]);
 
  imageData = (response as OpenAI.ImagesResponse).data?.[0]?.b64_json;
@@ -128,7 +154,7 @@ export async function generateOpenAIImage({
  size: size === "auto" ? undefined : size,
  quality,
  background,
- output_format: "png",
+ output_format: outputFormat.format,
  } as Parameters<OpenAI["images"]["generate"]>[0]);
 
  imageData = (response as OpenAI.ImagesResponse).data?.[0]?.b64_json;
@@ -160,6 +186,11 @@ export async function generateGeminiImage({
  aspect_ratio,
  image_size,
  }: GeminiParams): Promise<GenerateResult<GeminiResult>> {
+ const outputFormat = resolveOutputFormat("gemini", output_path);
+ if (!outputFormat.ok) {
+ return { ok: false, error: outputFormat.error };
+ }
+
  const contents: Array<{ text: string } | { inlineData: { mimeType: string; data: string } }> = [{ text: prompt }];
 
  if (input_images?.length) {
package/mcp.ts CHANGED
@@ -5,107 +5,91 @@ import { z } from "zod";
  import {
  generateGeminiImage,
  generateOpenAIImage,
- type GeminiParams,
- type OpenAIParams,
  } from "./core.ts";
+ import { geminiInputShape, geminiParamsSchema, openAIInputShape, openAIParamsSchema } from "./schemas.ts";
 
  const textContent = (text: string) => ({ content: [{ type: "text" as const, text }] });
  const errorResponse = (message: string) => textContent(`Error: ${message}`);
  const successResponse = (data: object) => textContent(JSON.stringify(data, null, 2));
-
- const server = new McpServer({
- name: "images-mcp",
- version: "1.0.0",
- });
-
- server.registerTool(
- "openai_generate_image",
- {
- title: "OpenAI Image Generator",
- description: "Generate an image using OpenAI and save it to a file. Can accept input images for editing.",
- inputSchema: {
- prompt: z.string().describe("Description of the image to generate, or editing instructions if input_images provided"),
- output_path: z.string().describe("Path where the image should be saved (e.g., /path/to/image.png)"),
- model: z.enum(["gpt-image-1.5"]).default("gpt-image-1.5").describe("Model: gpt-image-1.5"),
- input_images: z.array(z.string()).optional().describe("Optional array of image file paths for editing/reference"),
- size: z.enum(["auto", "1024x1024", "1536x1024", "1024x1536"]).default("auto").describe("Image size"),
- quality: z.enum(["auto", "high", "medium", "low"]).default("auto").describe("Image quality"),
- background: z.enum(["auto", "transparent", "opaque"]).default("auto").describe("Background type"),
+ const formatValidationError = (error: z.ZodError) =>
+ error.issues.map((issue) => `${issue.path.join(".") || "input"}: ${issue.message}`).join("; ");
+
+ export function createMcpServer(): McpServer {
+ const server = new McpServer({
+ name: "images-mcp",
+ version: "1.0.0",
+ });
+
+ server.registerTool(
+ "openai_generate_image",
+ {
+ title: "OpenAI Image Generator",
+ description: "Generate an image using OpenAI and save it to a file. Can accept input images for editing.",
+ inputSchema: openAIInputShape,
  },
- },
- async ({ prompt, output_path, model, input_images, size, quality, background }) => {
- try {
- const result = await generateOpenAIImage({
- prompt,
- output_path,
- model,
- input_images,
- size,
- quality,
- background,
- } as OpenAIParams);
- if (!result.ok) return errorResponse(result.error);
- return successResponse(result.data);
- } catch (error) {
- return errorResponse(error instanceof Error ? error.message : String(error));
+ async ({ prompt, output_path, model, input_images, size, quality, background }) => {
+ try {
+ const parsed = openAIParamsSchema.safeParse({
+ prompt,
+ output_path,
+ model,
+ input_images,
+ size,
+ quality,
+ background,
+ });
+ if (!parsed.success) return errorResponse(`Invalid OpenAI parameters: ${formatValidationError(parsed.error)}`);
+ const result = await generateOpenAIImage(parsed.data);
+ if (!result.ok) return errorResponse(result.error);
+ return successResponse(result.data);
+ } catch (error) {
+ return errorResponse(error instanceof Error ? error.message : String(error));
+ }
  }
- }
- );
-
- server.registerTool(
- "gemini_generate_image",
- {
- title: "Gemini Image Generator",
- description: "Generate or edit an image using Google Gemini and save it to a file. Can accept input images for editing.",
- inputSchema: {
- prompt: z.string().describe("Description of the image to generate, or editing instructions if input_images provided"),
- output_path: z.string().describe("Path where the image should be saved (e.g., /path/to/image.png)"),
- model: z.enum(["gemini-2.5-flash-image", "gemini-3-pro-image-preview"]).default("gemini-3-pro-image-preview").describe("Model"),
- input_images: z.array(z.string()).optional().describe("Optional array of image file paths for editing/reference"),
- aspect_ratio: z
- .enum(["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"])
- .optional()
- .describe("Aspect ratio"),
- image_size: z.enum(["1K", "2K", "4K"]).optional().describe("Image size"),
+ );
+
+ server.registerTool(
+ "gemini_generate_image",
+ {
+ title: "Gemini Image Generator",
+ description: "Generate or edit an image using Google Gemini and save it to a file. Can accept input images for editing.",
+ inputSchema: geminiInputShape,
  },
- },
- async ({ prompt, output_path, model, input_images, aspect_ratio, image_size }) => {
- try {
- const result = await generateGeminiImage({
- prompt,
- output_path,
- model,
- input_images,
- aspect_ratio,
- image_size,
- } as GeminiParams);
- if (!result.ok) return errorResponse(result.error);
- return successResponse(result.data);
- } catch (error) {
- return errorResponse(error instanceof Error ? error.message : String(error));
+ async ({ prompt, output_path, model, input_images, aspect_ratio, image_size }) => {
+ try {
+ const parsed = geminiParamsSchema.safeParse({
+ prompt,
+ output_path,
+ model,
+ input_images,
+ aspect_ratio,
+ image_size,
+ });
+ if (!parsed.success) return errorResponse(`Invalid Gemini parameters: ${formatValidationError(parsed.error)}`);
+ const result = await generateGeminiImage(parsed.data);
+ if (!result.ok) return errorResponse(result.error);
+ return successResponse(result.data);
72
+ } catch (error) {
73
+ return errorResponse(error instanceof Error ? error.message : String(error));
74
+ }
86
75
  }
87
- }
88
- );
89
-
90
- server.prompt(
91
- "create-image",
92
- {
93
- description: "Generate an image using AI with professional prompting guidance",
94
- arguments: [
95
- {
96
- name: "description",
97
- description: "What image to create",
98
- required: true,
76
+ );
77
+
78
+ server.registerPrompt(
79
+ "create-image",
80
+ {
81
+ description: "Generate an image using AI with professional prompting guidance",
82
+ argsSchema: {
83
+ description: z.string().describe("What image to create"),
99
84
  },
100
- ],
101
- },
102
- async ({ description }) => ({
103
- messages: [
104
- {
105
- role: "user",
106
- content: {
107
- type: "text",
108
- text: `# Generate Image
85
+ },
86
+ async ({ description }) => ({
87
+ messages: [
88
+ {
89
+ role: "user",
90
+ content: {
91
+ type: "text",
92
+ text: `# Generate Image
109
93
 
110
94
  Create an image based on this request: **${description}**
111
95
 
@@ -189,12 +173,23 @@ MUST NOT include:
189
173
  ## Output
190
174
 
191
175
  Save to the current directory with a descriptive filename based on the content.`,
176
+ },
192
177
  },
193
- },
194
- ],
195
- })
196
- );
197
-
198
- const transport = new StdioServerTransport();
199
- await server.connect(transport);
200
- console.error("Images MCP server running on stdio");
178
+ ],
179
+ })
180
+ );
181
+
182
+ return server;
183
+ }
184
+
185
+ export async function startMcpServer() {
186
+ const server = createMcpServer();
187
+ const transport = new StdioServerTransport();
188
+ await server.connect(transport);
189
+ console.error("Images MCP server running on stdio");
190
+ return server;
191
+ }
192
+
193
+ if (import.meta.main) {
194
+ await startMcpServer();
195
+ }
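The new `formatValidationError` helper flattens a Zod error's `issues` array into a single semicolon-separated string, substituting `input` when an issue has no path. A minimal dependency-free sketch of that behavior (the `Issue` type here is a hypothetical stand-in for the shape Zod's `ZodError.issues` entries provide, not part of the package):

```typescript
// Hypothetical stand-in for the shape of a Zod issue: a path into the
// offending field plus a human-readable message.
type Issue = { path: (string | number)[]; message: string };

// Mirrors the helper added in mcp.ts: join each issue's path with dots
// (falling back to "input" for top-level issues), then join with "; ".
const formatValidationError = (error: { issues: Issue[] }) =>
  error.issues.map((issue) => `${issue.path.join(".") || "input"}: ${issue.message}`).join("; ");

const msg = formatValidationError({
  issues: [
    { path: [], message: "Required" },
    { path: ["input_images", 0], message: "Expected string" },
  ],
});
console.log(msg); // "input: Required; input_images.0: Expected string"
```

With this, a tool call that fails validation returns one compact line naming every bad field rather than a raw Zod error dump.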
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "images-mcp",
-  "version": "1.2.1",
+  "version": "1.2.2",
   "description": "MCP server for AI image generation using OpenAI and Google Gemini",
   "type": "module",
   "main": "mcp.ts",
@@ -10,10 +10,14 @@
   "files": [
     "cli.ts",
     "core.ts",
-    "mcp.ts"
+    "mcp.ts",
+    "schemas.ts"
   ],
   "scripts": {
-    "start": "bun run mcp.ts"
+    "start": "bun run mcp.ts",
+    "test": "bun test",
+    "typecheck": "bunx tsc --noEmit",
+    "ci:check": "bun run test && bun run typecheck"
   },
   "keywords": [
     "mcp",
@@ -36,7 +40,7 @@
   },
   "homepage": "https://github.com/ebeloded/images-mcp#readme",
   "engines": {
-    "node": ">=18"
+    "bun": ">=1.0.0"
   },
   "devDependencies": {
     "@types/bun": "latest"
@@ -47,6 +51,7 @@
   "dependencies": {
     "@google/genai": "^1.34.0",
     "@modelcontextprotocol/sdk": "^1.25.1",
+    "clipanion": "^4.0.0-rc.4",
     "openai": "^6.14.0",
     "zod": "^4.2.1"
   }
package/schemas.ts ADDED
@@ -0,0 +1,35 @@
+import { z } from "zod";
+
+export const openAIModelSchema = z.enum(["gpt-image-1.5"]);
+export const openAISizeSchema = z.enum(["auto", "1024x1024", "1536x1024", "1024x1536"]);
+export const openAIQualitySchema = z.enum(["auto", "high", "medium", "low"]);
+export const openAIBackgroundSchema = z.enum(["auto", "transparent", "opaque"]);
+
+export const geminiModelSchema = z.enum(["gemini-2.5-flash-image", "gemini-3-pro-image-preview"]);
+export const geminiAspectRatioSchema = z.enum(["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"]);
+export const geminiImageSizeSchema = z.enum(["1K", "2K", "4K"]);
+
+export const openAIInputShape = {
+  prompt: z.string().describe("Description of the image to generate, or editing instructions if input_images provided"),
+  output_path: z.string().describe("Path where the image should be saved (e.g., /path/to/image.png)"),
+  model: openAIModelSchema.default("gpt-image-1.5").describe("Model: gpt-image-1.5"),
+  input_images: z.array(z.string()).optional().describe("Optional array of image file paths for editing/reference"),
+  size: openAISizeSchema.default("auto").describe("Image size"),
+  quality: openAIQualitySchema.default("auto").describe("Image quality"),
+  background: openAIBackgroundSchema.default("auto").describe("Background type"),
+} as const;
+
+export const geminiInputShape = {
+  prompt: z.string().describe("Description of the image to generate, or editing instructions if input_images provided"),
+  output_path: z.string().describe("Path where the image should be saved (e.g., /path/to/image.png)"),
+  model: geminiModelSchema.default("gemini-3-pro-image-preview").describe("Model"),
+  input_images: z.array(z.string()).optional().describe("Optional array of image file paths for editing/reference"),
+  aspect_ratio: geminiAspectRatioSchema.optional().describe("Aspect ratio"),
+  image_size: geminiImageSizeSchema.optional().describe("Image size"),
+} as const;
+
+export const openAIParamsSchema = z.object(openAIInputShape).strict();
+export const geminiParamsSchema = z.object(geminiInputShape).strict();
+
+export type OpenAIParams = z.infer<typeof openAIParamsSchema>;
+export type GeminiParams = z.infer<typeof geminiParamsSchema>;
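The enum schemas extracted into `schemas.ts` amount to fixed allow-lists of values; `geminiAspectRatioSchema`, for instance, accepts exactly ten ratios. A dependency-free sketch of the same check using a plain readonly tuple instead of `z.enum` (the `isGeminiAspectRatio` guard is illustrative, not part of the package):

```typescript
// The same ten ratios geminiAspectRatioSchema allows, as a readonly tuple.
const GEMINI_ASPECT_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"] as const;
type GeminiAspectRatio = (typeof GEMINI_ASPECT_RATIOS)[number];

// Hypothetical type guard mirroring what the Zod enum enforces at parse time.
const isGeminiAspectRatio = (value: string): value is GeminiAspectRatio =>
  (GEMINI_ASPECT_RATIOS as readonly string[]).includes(value);

console.log(isGeminiAspectRatio("16:9")); // true
console.log(isGeminiAspectRatio("32:9")); // false
```

Because both the CLI and the MCP tools build their parameter objects from the same `openAIInputShape`/`geminiInputShape`, the two entry points cannot drift apart on which values they accept.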