images-mcp 1.2.1 → 1.2.2

Files changed (6)
  1. package/README.md +220 -40
  2. package/cli.ts +252 -38
  3. package/core.ts +52 -21
  4. package/mcp.ts +94 -99
  5. package/package.json +9 -4
  6. package/schemas.ts +35 -0
package/README.md CHANGED
@@ -1,62 +1,242 @@
  # images-mcp
 
- An MCP server for AI image generation.
+ Generate and edit images from the command line (`images-mcp`) or as an MCP server.
 
- ## Tools
+ ## Requirements
 
- ### `openai_generate_image`
- Generate or edit images using OpenAI.
+ - Bun `>=1.0.0`
 
- | Parameter | Default | Options |
- |-----------|---------|---------|
- | `prompt` | required | Text description or editing instructions |
- | `output_path` | required | File path to save (.png) |
- | `model` | `gpt-image-1.5` | `gpt-image-1.5` |
- | `input_images` | - | Array of image file paths for editing/reference |
- | `size` | `auto` | `auto`, `1024x1024`, `1536x1024`, `1024x1536` |
- | `quality` | `auto` | `auto`, `high`, `medium`, `low` |
- | `background` | `auto` | `auto`, `transparent`, `opaque` |
+ ## CLI (Primary Interface)
 
- ### `gemini_generate_image`
- Generate or edit images using Google Gemini.
+ ### Quick Start
 
- | Parameter | Default | Options |
- |-----------|---------|---------|
- | `prompt` | required | Text description or editing instructions |
- | `output_path` | required | File path to save (.png) |
- | `model` | `gemini-3-pro-image-preview` | `gemini-2.5-flash-image`, `gemini-3-pro-image-preview` |
- | `input_images` | - | Array of image file paths for editing/reference |
- | `aspect_ratio` | - | `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
- | `image_size` | - | `1K`, `2K`, `4K` |
+ ```bash
+ bun install
+ ```
 
- ## Environment Variables
+ Set API keys (use one or both providers):
 
- - `OPENAI_API_KEY` - Required for OpenAI
- - `GEMINI_API_KEY` or `GOOGLE_API_KEY` - Required for Gemini
+ ```bash
+ export OPENAI_API_KEY="..."
+ export GEMINI_API_KEY="..." # or GOOGLE_API_KEY
+ ```
 
- ## MCP Server
+ Run directly from this repo:
 
  ```bash
- bun install
- bun run start
+ bun run cli.ts --help
  ```
 
- ## CLI
+ If `images-mcp` is on your `PATH`, you can run:
 
- The package also works as a CLI with subcommands.
+ ```bash
+ images-mcp --help
+ ```
+
+ ### Install/Run Modes
+
+ 1. Source checkout mode (always works in this repo):
+ - `bun run cli.ts <command> [flags]`
+ 2. Binary-on-path mode (after package/global install or linking):
+ - `images-mcp <command> [flags]`
+ 3. MCP server mode (for MCP clients, stdio transport):
+ - `bun run start` (same as `bun run mcp.ts`)
+
+ ### Commands
 
  ```bash
- # OpenAI
- images-mcp openai --prompt "A neon cat" --output ./cat.png
+ images-mcp openai [flags]
+ images-mcp gemini [flags]
+ images-mcp --help
+ ```
+
+ ### CLI Flags (Full Reference)
+
+ Common flags (both commands):
+
+ | Flag | Required | Default | Notes |
+ |---|---|---|---|
+ | `--prompt <text>` | yes* | - | Prompt or edit instructions (`*` can be read from stdin when `--prompt` is omitted) |
+ | `--output <path>` | yes | - | Output file path |
+ | `--input <path>` | no | - | Repeatable input image path |
+ | `--inputs <path>` | no | - | Backward-compatible alias of `--input` (accepted, not shown in built-in help text) |
+ | `--help`, `-h` | no | - | Print usage |
+
+ OpenAI flags (`images-mcp openai`):
+
+ | Flag | Required | Default | Allowed values |
+ |---|---|---|---|
+ | `--model <value>` | no | `gpt-image-1.5` | `gpt-image-1.5` |
+ | `--size <value>` | no | `auto` | `auto`, `1024x1024`, `1536x1024`, `1024x1536` |
+ | `--quality <value>` | no | `auto` | `auto`, `high`, `medium`, `low` |
+ | `--background <value>` | no | `auto` | `auto`, `transparent`, `opaque` |
+
+ OpenAI output file extensions:
 
- # OpenAI edit
- images-mcp openai --prompt "Make it snowy" --output ./cat-snow.png --input ./cat.png
+ - `.png`, `.jpg`, `.jpeg`, `.webp`
 
- # Gemini
- images-mcp gemini --prompt "A ceramic teapot" --output ./teapot.png
+ Gemini flags (`images-mcp gemini`):
 
- # Gemini edit
- images-mcp gemini --prompt "Make it blue" --output ./teapot-blue.png --input ./teapot.png
+ | Flag | Required | Default | Allowed values |
+ |---|---|---|---|
+ | `--model <value>` | no | `gemini-3-pro-image-preview` | `gemini-3-pro-image-preview`, `gemini-2.5-flash-image` |
+ | `--aspect-ratio <value>` | no | unset | `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
+ | `--image-size <value>` | no | unset | `1K`, `2K`, `4K` |
+
+ Gemini output file extensions:
+
+ - `.png`
+
+ ### CLI Examples
+
+ Generate with OpenAI:
+
+ ```bash
+ bun run cli.ts openai \
+ --prompt "A neon cat in rainy Tokyo, cinematic lighting" \
+ --output ./cat.png
+ ```
+
+ Generate with OpenAI by piping prompt from stdin:
+
+ ```bash
+ cat prompt.txt | bun run cli.ts openai --output ./cat.png
+ ```
+
+ Or with stdin redirection:
+
+ ```bash
+ bun run cli.ts openai --output ./cat.png < prompt.txt
+ ```
+
+ Edit with OpenAI:
+
+ ```bash
+ bun run cli.ts openai \
+ --prompt "Add snow and keep the cat centered" \
+ --output ./cat-snow.png \
+ --input ./cat.png
+ ```
+
+ Generate with Gemini:
+
+ ```bash
+ bun run cli.ts gemini \
+ --prompt "A ceramic teapot product photo on white background" \
+ --output ./teapot.png \
+ --aspect-ratio 4:3 \
+ --image-size 2K
  ```
 
- Run `images-mcp --help` for all flags.
+ Edit with Gemini and multiple references:
+
+ ```bash
+ bun run cli.ts gemini \
+ --prompt "Combine both references into one consistent illustration" \
+ --output ./combined.png \
+ --input ./ref-1.png \
+ --input ./ref-2.png
+ ```
+
+ ### JSON Output Behavior
+
+ Successful runs print formatted JSON to stdout. Shape:
+
+ ```json
+ {
+ "success": true,
+ "path": "/absolute/path/to/output.png",
+ "bytes": 123456,
+ "...provider_fields": "..."
+ }
+ ```
+
+ Provider-specific success fields:
+
+ - OpenAI: `model`, `size`, `quality`, `input_images_count`
+ - Gemini: `model`, `aspect_ratio`, `image_size`, `input_images_count`
+
+ ### Errors and Exit Codes
+
+ - Exit `0`: Help (`--help`) or successful generation.
+ - Exit `1`: Argument parsing/validation errors, runtime errors, API/auth errors, file errors.
+ - Parse/validation errors are written to `stderr` as `<message>` followed by usage.
+ - Runtime errors are written to `stderr` as `Error: <message>`.
+
+ Common parse failures:
+
+ - Missing required flags (when neither `--prompt` nor non-empty piped stdin prompt is provided):
+ - `Missing required --prompt or --output`
+ - Missing value for a flag:
+ - `Missing value for --output`
+ - Unknown flags:
+ - `Unknown flag(s) for openai: --foo`
+ - Unexpected positional argument:
+ - `Unexpected argument: value`
+ - Invalid enum value:
+ - `Invalid value for --size: "500x500". Allowed values: auto, 1024x1024, 1536x1024, 1024x1536`
+
+ ## Troubleshooting
+
+ ### Missing API Key
+
+ OpenAI command without `OPENAI_API_KEY` fails in the OpenAI SDK.
+
+ Gemini command without both `GEMINI_API_KEY` and `GOOGLE_API_KEY` fails with:
+
+ ```text
+ Error: Missing GEMINI_API_KEY or GOOGLE_API_KEY environment variable
+ ```
+
+ Fix:
+
+ ```bash
+ export OPENAI_API_KEY="..."
+ export GEMINI_API_KEY="..." # or GOOGLE_API_KEY
+ ```
+
+ ### Invalid or Unsupported Flags
+
+ If you pass a flag not supported by the selected command, the CLI exits with code `1` and prints a command-specific unknown-flag message.
+
+ If you pass an unsupported value, the CLI prints the allowed values for that flag.
+
+ ## MCP Server (Also Supported)
+
+ The same image functionality is available over MCP/stdin-stdout transport.
+
+ Start server:
+
+ ```bash
+ bun run start
+ ```
+
+ Registered MCP tools:
+
+ 1. `openai_generate_image`
+ 2. `gemini_generate_image`
+
+ Tool parameter defaults/options mirror the same schemas used by the CLI:
+
+ OpenAI tool params:
+
+ | Parameter | Default | Allowed values |
+ |---|---|---|
+ | `prompt` | required | text |
+ | `output_path` | required | path |
+ | `model` | `gpt-image-1.5` | `gpt-image-1.5` |
+ | `input_images` | unset | string[] |
+ | `size` | `auto` | `auto`, `1024x1024`, `1536x1024`, `1024x1536` |
+ | `quality` | `auto` | `auto`, `high`, `medium`, `low` |
+ | `background` | `auto` | `auto`, `transparent`, `opaque` |
+
+ Gemini tool params:
+
+ | Parameter | Default | Allowed values |
+ |---|---|---|
+ | `prompt` | required | text |
+ | `output_path` | required | path |
+ | `model` | `gemini-3-pro-image-preview` | `gemini-3-pro-image-preview`, `gemini-2.5-flash-image` |
+ | `input_images` | unset | string[] |
+ | `aspect_ratio` | unset | `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
+ | `image_size` | unset | `1K`, `2K`, `4K` |
package/cli.ts CHANGED
@@ -1,12 +1,25 @@
  #!/usr/bin/env bun
+ import { Cli, Command, Option } from "clipanion";
+ import tty from "node:tty";
  import {
  generateGeminiImage,
  generateOpenAIImage,
+ } from "./core.ts";
+ import {
+ geminiAspectRatioSchema,
+ geminiImageSizeSchema,
+ geminiModelSchema,
+ geminiParamsSchema,
+ openAIBackgroundSchema,
+ openAIModelSchema,
+ openAIParamsSchema,
+ openAIQualitySchema,
+ openAISizeSchema,
  type GeminiParams,
  type OpenAIParams,
- } from "./core.ts";
+ } from "./schemas.ts";
 
- type ParsedArgs =
+ export type ParsedArgs =
  | { mode: "openai"; params: OpenAIParams }
  | { mode: "gemini"; params: GeminiParams }
  | { mode: "help"; message?: string };
@@ -15,14 +28,15 @@ function printUsage(message?: string) {
  if (message) {
  console.error(message);
  }
- console.error(`images-mcp (CLI)
+ const write = message ? console.error : console.log;
+ write(`images-mcp (CLI)
 
  Usage:
  images-mcp openai [args] Generate/edit via OpenAI
  images-mcp gemini [args] Generate/edit via Gemini
 
  Common args:
- --prompt Text prompt (required)
+ --prompt Text prompt (required unless piped via stdin)
  --output Output file path (required)
  --input Input image path (repeatable)
 
@@ -39,7 +53,138 @@ Gemini args:
  `);
  }
 
- function parseArgs(argv: string[]): ParsedArgs {
+ const formatEnumError = (flag: string, value: string, allowed: readonly string[]) =>
+ `Invalid value for --${flag}: "${value}". Allowed values: ${allowed.join(", ")}`;
+ const isAllowedEnumValue = (value: string, allowed: readonly string[]) => allowed.includes(value);
+
+ const formatUnknownFlagsError = (command: "openai" | "gemini", flags: string[]) =>
+ `Unknown flag(s) for ${command}: ${flags.map((flag) => `--${flag}`).join(", ")}`;
+
+ const uniqueInOrder = (values: string[]) => [...new Set(values)];
+
+ const parseInputImages = (input: string[] | undefined, inputs: string[] | undefined) => {
+ const all = [...(input ?? []), ...(inputs ?? [])];
+ return all.length > 0 ? all : undefined;
+ };
+
+ class OpenAICommand extends Command {
+ static override paths = [["openai"]];
+
+ prompt = Option.String("--prompt", { required: false });
+ output = Option.String("--output", { required: false });
+ input = Option.Array("--input", [], { arity: 1 });
+ inputs = Option.Array("--inputs", [], { arity: 1 });
+ model = Option.String("--model", { required: false });
+ size = Option.String("--size", { required: false });
+ quality = Option.String("--quality", { required: false });
+ background = Option.String("--background", { required: false });
+
+ override async execute() {
+ return 0;
+ }
+ }
+
+ class GeminiCommand extends Command {
+ static override paths = [["gemini"]];
+
+ prompt = Option.String("--prompt", { required: false });
+ output = Option.String("--output", { required: false });
+ input = Option.Array("--input", [], { arity: 1 });
+ inputs = Option.Array("--inputs", [], { arity: 1 });
+ model = Option.String("--model", { required: false });
+ aspectRatio = Option.String("--aspect-ratio", { required: false });
+ imageSize = Option.String("--image-size", { required: false });
+
+ override async execute() {
+ return 0;
+ }
+ }
+
+ function createCli() {
+ const cli = new Cli({
+ binaryName: "images-mcp",
+ binaryLabel: "images-mcp",
+ });
+
+ cli.register(OpenAICommand);
+ cli.register(GeminiCommand);
+
+ return cli;
+ }
+
+ function structuralErrorForArgs(args: string[]): string | undefined {
+ for (let i = 0; i < args.length; i += 1) {
+ const token = args[i];
+ if (!token) break;
+ if (token === "--help" || token === "-h") return undefined;
+ if (!token.startsWith("--")) return `Unexpected argument: ${token}`;
+
+ const key = token.slice(2);
+ const value = args[i + 1];
+ if (!value || value.startsWith("--")) return `Missing value for --${key}`;
+ i += 1;
+ }
+ }
+
+ function findUnknownFlags(command: "openai" | "gemini", args: string[]): string[] {
+ const known = command === "openai"
+ ? new Set(["prompt", "output", "input", "inputs", "model", "size", "quality", "background"])
+ : new Set(["prompt", "output", "input", "inputs", "model", "aspect-ratio", "image-size"]);
+
+ const unknownFlags: string[] = [];
+ for (let i = 0; i < args.length; i += 1) {
+ const token = args[i];
+ if (!token?.startsWith("--")) {
+ continue;
+ }
+
+ const key = token.slice(2);
+ if (!known.has(key)) {
+ unknownFlags.push(key);
+ }
+
+ i += 1;
+ }
+
+ return uniqueInOrder(unknownFlags);
+ }
+
+ function parseCommand(command: "openai" | "gemini", args: string[]) {
+ const cli = createCli();
+
+ try {
+ const parsed = cli.process([command, ...args]);
+ if (parsed instanceof OpenAICommand || parsed instanceof GeminiCommand) {
+ return parsed;
+ }
+ return { mode: "help", message: `Unknown command: ${command}` } as const;
+ } catch (error) {
+ return {
+ mode: "help",
+ message: error instanceof Error ? error.message.split("\n")[0] : String(error),
+ } as const;
+ }
+ }
+
+ function parseOpenAICommand(args: string[]) {
+ const parsed = parseCommand("openai", args);
+ if ("mode" in parsed) return parsed;
+ if (!(parsed instanceof OpenAICommand)) {
+ return { mode: "help", message: "Unknown command: openai" } as const;
+ }
+ return parsed;
+ }
+
+ function parseGeminiCommand(args: string[]) {
+ const parsed = parseCommand("gemini", args);
+ if ("mode" in parsed) return parsed;
+ if (!(parsed instanceof GeminiCommand)) {
+ return { mode: "help", message: "Unknown command: gemini" } as const;
+ }
+ return parsed;
+ }
+
+ export function parseArgs(argv: string[]): ParsedArgs {
  if (argv.length === 0) return { mode: "help" };
 
  const command = argv[0];
@@ -51,58 +196,125 @@ function parseArgs(argv: string[]): ParsedArgs {
  return { mode: "help", message: `Unknown command: ${command}` };
  }
 
- const flags: Record<string, string[]> = {};
- for (let i = 0; i < rest.length; i += 1) {
- const token = rest[i];
- if (!token) break;
- if (token === "--help" || token === "-h") return { mode: "help" };
- if (!token.startsWith("--")) {
- return { mode: "help", message: `Unexpected argument: ${token}` };
- }
- const key = token.slice(2);
- const value = rest[i + 1];
- if (!value || value.startsWith("--")) {
- return { mode: "help", message: `Missing value for --${key}` };
- }
- if (!flags[key]) flags[key] = [];
- flags[key].push(value);
- i += 1;
- }
+ if (rest.includes("--help") || rest.includes("-h")) return { mode: "help" };
 
- const prompt = flags.prompt?.[0];
- const output_path = flags.output?.[0];
- const input_images = flags.input ?? flags.inputs;
+ const structuralError = structuralErrorForArgs(rest);
+ if (structuralError) {
+ return { mode: "help", message: structuralError };
+ }
 
- if (!prompt || !output_path) {
- return { mode: "help", message: "Missing required --prompt or --output" };
+ const unknownFlags = findUnknownFlags(command, rest);
+ if (unknownFlags.length > 0) {
+ return { mode: "help", message: formatUnknownFlagsError(command, unknownFlags) };
  }
 
  if (command === "openai") {
- const params: OpenAIParams = {
+ const parsed = parseOpenAICommand(rest);
+ if ("mode" in parsed) return parsed;
+
+ const prompt = parsed.prompt;
+ const output_path = parsed.output;
+ const input_images = parseInputImages(parsed.input, parsed.inputs);
+ if (!prompt || !output_path) {
+ return { mode: "help", message: "Missing required --prompt or --output" };
+ }
+
+ const model = parsed.model ?? "gpt-image-1.5";
+ const size = parsed.size ?? "auto";
+ const quality = parsed.quality ?? "auto";
+ const background = parsed.background ?? "auto";
+
+ if (!isAllowedEnumValue(model, openAIModelSchema.options)) {
+ return { mode: "help", message: formatEnumError("model", model, openAIModelSchema.options) };
+ }
+ if (!isAllowedEnumValue(size, openAISizeSchema.options)) {
+ return { mode: "help", message: formatEnumError("size", size, openAISizeSchema.options) };
+ }
+ if (!isAllowedEnumValue(quality, openAIQualitySchema.options)) {
+ return { mode: "help", message: formatEnumError("quality", quality, openAIQualitySchema.options) };
+ }
+ if (!isAllowedEnumValue(background, openAIBackgroundSchema.options)) {
+ return { mode: "help", message: formatEnumError("background", background, openAIBackgroundSchema.options) };
+ }
+
+ const validated = openAIParamsSchema.safeParse({
  prompt,
  output_path,
- model: (flags.model?.[0] as OpenAIParams["model"]) ?? "gpt-image-1.5",
+ model,
  input_images,
- size: (flags.size?.[0] as OpenAIParams["size"]) ?? "auto",
- quality: (flags.quality?.[0] as OpenAIParams["quality"]) ?? "auto",
- background: (flags.background?.[0] as OpenAIParams["background"]) ?? "auto",
+ size,
+ quality,
+ background,
+ });
+ if (!validated.success) {
+ return { mode: "help", message: `Invalid OpenAI parameters: ${validated.error.issues[0]?.message ?? "Unknown error"}` };
+ }
+
+ const params: OpenAIParams = {
+ ...validated.data,
  };
  return { mode: "openai", params };
  }
 
- const params: GeminiParams = {
+ const parsed = parseGeminiCommand(rest);
+ if ("mode" in parsed) return parsed;
+
+ const prompt = parsed.prompt;
+ const output_path = parsed.output;
+ const input_images = parseInputImages(parsed.input, parsed.inputs);
+ if (!prompt || !output_path) {
+ return { mode: "help", message: "Missing required --prompt or --output" };
+ }
+
+ const model = parsed.model ?? "gemini-3-pro-image-preview";
+ const aspectRatio = parsed.aspectRatio;
+ const imageSize = parsed.imageSize;
+ if (!isAllowedEnumValue(model, geminiModelSchema.options)) {
+ return { mode: "help", message: formatEnumError("model", model, geminiModelSchema.options) };
+ }
+ if (aspectRatio && !isAllowedEnumValue(aspectRatio, geminiAspectRatioSchema.options)) {
+ return { mode: "help", message: formatEnumError("aspect-ratio", aspectRatio, geminiAspectRatioSchema.options) };
+ }
+ if (imageSize && !isAllowedEnumValue(imageSize, geminiImageSizeSchema.options)) {
+ return { mode: "help", message: formatEnumError("image-size", imageSize, geminiImageSizeSchema.options) };
+ }
+
+ const validated = geminiParamsSchema.safeParse({
  prompt,
  output_path,
- model: (flags.model?.[0] as GeminiParams["model"]) ?? "gemini-3-pro-image-preview",
+ model,
  input_images,
- aspect_ratio: flags["aspect-ratio"]?.[0] as GeminiParams["aspect_ratio"],
- image_size: flags["image-size"]?.[0] as GeminiParams["image_size"],
+ aspect_ratio: aspectRatio,
+ image_size: imageSize,
+ });
+ if (!validated.success) {
+ return { mode: "help", message: `Invalid Gemini parameters: ${validated.error.issues[0]?.message ?? "Unknown error"}` };
+ }
+
+ const params: GeminiParams = {
+ ...validated.data,
  };
  return { mode: "gemini", params };
  }
 
  async function run() {
- const parsed = parseArgs(process.argv.slice(2));
+ const argv = process.argv.slice(2);
+ let parsed = parseArgs(argv);
+
+ const isMissingPromptOrOutput = parsed.mode === "help" && parsed.message === "Missing required --prompt or --output";
+ const hasPromptFlag = argv.includes("--prompt");
+ const stdinIsTTY = tty.isatty(0);
+
+ if (isMissingPromptOrOutput && !hasPromptFlag && !stdinIsTTY) {
+ const promptFromStdin = (await Bun.stdin.text()).trim();
+ if (promptFromStdin) {
+ const command = argv[0];
+ if (command === "openai" || command === "gemini") {
+ parsed = parseArgs([command, "--prompt", promptFromStdin, ...argv.slice(1)]);
+ }
+ }
+ }
+
  if (parsed.mode === "help") {
  printUsage(parsed.message);
  process.exit(parsed.message ? 1 : 0);
@@ -126,4 +338,6 @@ async function run() {
  }
  }
 
- await run();
+ if (import.meta.main) {
+ await run();
+ }
package/core.ts CHANGED
@@ -1,6 +1,8 @@
  import OpenAI from "openai";
  import { GoogleGenAI } from "@google/genai";
  import path from "node:path";
+ import type { GeminiParams, OpenAIParams } from "./schemas.ts";
+ export type { GeminiParams, OpenAIParams } from "./schemas.ts";
 
  const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
@@ -10,27 +12,21 @@ const MIME_TYPES: Record<string, string> = {
  ".webp": "image/webp",
  };
 
+ const OUTPUT_FORMATS = {
+ openai: {
+ ".png": { format: "png", mimeType: "image/png" },
+ ".jpg": { format: "jpeg", mimeType: "image/jpeg" },
+ ".jpeg": { format: "jpeg", mimeType: "image/jpeg" },
+ ".webp": { format: "webp", mimeType: "image/webp" },
+ },
+ gemini: {
+ ".png": { format: "png", mimeType: "image/png" },
+ },
+ } as const;
+
+ type Provider = keyof typeof OUTPUT_FORMATS;
  type GenerateResult<T> = { ok: true; data: T } | { ok: false; error: string };
 
- export type OpenAIParams = {
- prompt: string;
- output_path: string;
- model: "gpt-image-1.5";
- input_images?: string[];
- size: "auto" | "1024x1024" | "1536x1024" | "1024x1536";
- quality: "auto" | "high" | "medium" | "low";
- background: "auto" | "transparent" | "opaque";
- };
-
- export type GeminiParams = {
- prompt: string;
- output_path: string;
- model: "gemini-2.5-flash-image" | "gemini-3-pro-image-preview";
- input_images?: string[];
- aspect_ratio?: "1:1" | "2:3" | "3:2" | "3:4" | "4:3" | "4:5" | "5:4" | "9:16" | "16:9" | "21:9";
- image_size?: "1K" | "2K" | "4K";
- };
-
  export type OpenAIResult = {
  success: true;
  path: string;
@@ -55,6 +51,29 @@ export function getMimeType(filePath: string): string {
  return MIME_TYPES[path.extname(filePath).toLowerCase()] || "image/png";
  }
 
+ export function createOpenAIUploadFile(imageData: ArrayBuffer, imagePathOrName: string): File {
+ return new File([imageData], path.basename(imagePathOrName), { type: getMimeType(imagePathOrName) });
+ }
+
+ export function resolveOutputFormat(provider: Provider, outputPath: string):
+ | { ok: true; format: "png" | "jpeg" | "webp"; mimeType: string }
+ | { ok: false; error: string } {
+ const ext = path.extname(outputPath).toLowerCase();
+ const providerFormats = OUTPUT_FORMATS[provider] as Record<string, { format: "png" | "jpeg" | "webp"; mimeType: string }>;
+ const resolved = providerFormats[ext];
+
+ if (!resolved) {
+ const allowed = Object.keys(providerFormats).join(", ");
+ const shown = ext || "(none)";
+ return {
+ ok: false,
+ error: `Unsupported output extension for ${provider}: ${shown}. Allowed extensions: ${allowed}`,
+ };
+ }
+
+ return { ok: true, ...resolved };
+ }
+
  export async function readImageFile(imagePath: string): Promise<{ data: ArrayBuffer; name: string } | { error: string }> {
  const file = Bun.file(imagePath);
  if (!(await file.exists())) {
@@ -100,6 +119,11 @@ export async function generateOpenAIImage({
  quality,
  background,
  }: OpenAIParams): Promise<GenerateResult<OpenAIResult>> {
+ const outputFormat = resolveOutputFormat("openai", output_path);
+ if (!outputFormat.ok) {
+ return { ok: false, error: outputFormat.error };
+ }
+
  let imageData: string | undefined;
 
  if (input_images?.length) {
@@ -109,7 +133,7 @@ export async function generateOpenAIImage({
  if ("error" in result) {
  return { ok: false, error: result.error };
  }
- imageFiles.push(new File([result.data], result.name, { type: "image/png" }));
+ imageFiles.push(createOpenAIUploadFile(result.data, imagePath));
  }
 
  const response = await getOpenAI().images.edit({
@@ -117,6 +141,8 @@ export async function generateOpenAIImage({
  prompt,
  image: imageFiles.length === 1 ? imageFiles[0] : imageFiles,
  size: size === "auto" ? undefined : size,
+ background,
+ output_format: outputFormat.format,
  } as Parameters<OpenAI["images"]["edit"]>[0]);
 
  imageData = (response as OpenAI.ImagesResponse).data?.[0]?.b64_json;
@@ -128,7 +154,7 @@ export async function generateOpenAIImage({
  size: size === "auto" ? undefined : size,
  quality,
  background,
- output_format: "png",
+ output_format: outputFormat.format,
  } as Parameters<OpenAI["images"]["generate"]>[0]);
 
  imageData = (response as OpenAI.ImagesResponse).data?.[0]?.b64_json;
@@ -160,6 +186,11 @@ export async function generateGeminiImage({
  aspect_ratio,
  image_size,
  }: GeminiParams): Promise<GenerateResult<GeminiResult>> {
+ const outputFormat = resolveOutputFormat("gemini", output_path);
+ if (!outputFormat.ok) {
+ return { ok: false, error: outputFormat.error };
+ }
+
  const contents: Array<{ text: string } | { inlineData: { mimeType: string; data: string } }> = [{ text: prompt }];
 
  if (input_images?.length) {
package/mcp.ts CHANGED
@@ -5,107 +5,91 @@ import { z } from "zod";
  import {
  generateGeminiImage,
  generateOpenAIImage,
- type GeminiParams,
- type OpenAIParams,
  } from "./core.ts";
+ import { geminiInputShape, geminiParamsSchema, openAIInputShape, openAIParamsSchema } from "./schemas.ts";
 
  const textContent = (text: string) => ({ content: [{ type: "text" as const, text }] });
  const errorResponse = (message: string) => textContent(`Error: ${message}`);
  const successResponse = (data: object) => textContent(JSON.stringify(data, null, 2));
-
- const server = new McpServer({
- name: "images-mcp",
- version: "1.0.0",
- });
-
- server.registerTool(
- "openai_generate_image",
- {
- title: "OpenAI Image Generator",
- description: "Generate an image using OpenAI and save it to a file. Can accept input images for editing.",
- inputSchema: {
- prompt: z.string().describe("Description of the image to generate, or editing instructions if input_images provided"),
- output_path: z.string().describe("Path where the image should be saved (e.g., /path/to/image.png)"),
- model: z.enum(["gpt-image-1.5"]).default("gpt-image-1.5").describe("Model: gpt-image-1.5"),
- input_images: z.array(z.string()).optional().describe("Optional array of image file paths for editing/reference"),
- size: z.enum(["auto", "1024x1024", "1536x1024", "1024x1536"]).default("auto").describe("Image size"),
- quality: z.enum(["auto", "high", "medium", "low"]).default("auto").describe("Image quality"),
- background: z.enum(["auto", "transparent", "opaque"]).default("auto").describe("Background type"),
+ const formatValidationError = (error: z.ZodError) =>
+ error.issues.map((issue) => `${issue.path.join(".") || "input"}: ${issue.message}`).join("; ");
+
+ export function createMcpServer(): McpServer {
+ const server = new McpServer({
+ name: "images-mcp",
+ version: "1.0.0",
+ });
+
+ server.registerTool(
+ "openai_generate_image",
+ {
+ title: "OpenAI Image Generator",
+ description: "Generate an image using OpenAI and save it to a file. Can accept input images for editing.",
+ inputSchema: openAIInputShape,
  },
- },
- async ({ prompt, output_path, model, input_images, size, quality, background }) => {
- try {
- const result = await generateOpenAIImage({
- prompt,
- output_path,
- model,
- input_images,
- size,
- quality,
- background,
- } as OpenAIParams);
- if (!result.ok) return errorResponse(result.error);
- return successResponse(result.data);
- } catch (error) {
- return errorResponse(error instanceof Error ? error.message : String(error));
+ async ({ prompt, output_path, model, input_images, size, quality, background }) => {
+ try {
+ const parsed = openAIParamsSchema.safeParse({
+ prompt,
+ output_path,
+ model,
+ input_images,
+ size,
+ quality,
+ background,
+ });
+ if (!parsed.success) return errorResponse(`Invalid OpenAI parameters: ${formatValidationError(parsed.error)}`);
+ const result = await generateOpenAIImage(parsed.data);
+ if (!result.ok) return errorResponse(result.error);
+ return successResponse(result.data);
+ } catch (error) {
+ return errorResponse(error instanceof Error ? error.message : String(error));
+ }
  }
- }
- );
-
- server.registerTool(
- "gemini_generate_image",
- {
- title: "Gemini Image Generator",
- description: "Generate or edit an image using Google Gemini and save it to a file. Can accept input images for editing.",
- inputSchema: {
- prompt: z.string().describe("Description of the image to generate, or editing instructions if input_images provided"),
- output_path: z.string().describe("Path where the image should be saved (e.g., /path/to/image.png)"),
- model: z.enum(["gemini-2.5-flash-image", "gemini-3-pro-image-preview"]).default("gemini-3-pro-image-preview").describe("Model"),
- input_images: z.array(z.string()).optional().describe("Optional array of image file paths for editing/reference"),
- aspect_ratio: z
- .enum(["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"])
- .optional()
- .describe("Aspect ratio"),
- image_size: z.enum(["1K", "2K", "4K"]).optional().describe("Image size"),
+ );
+
+ server.registerTool(
+ "gemini_generate_image",
+ {
+ title: "Gemini Image Generator",
+ description: "Generate or edit an image using Google Gemini and save it to a file. Can accept input images for editing.",
+ inputSchema: geminiInputShape,
  },
- },
- async ({ prompt, output_path, model, input_images, aspect_ratio, image_size }) => {
- try {
- const result = await generateGeminiImage({
- prompt,
- output_path,
- model,
- input_images,
- aspect_ratio,
- image_size,
- } as GeminiParams);
- if (!result.ok) return errorResponse(result.error);
- return successResponse(result.data);
- } catch (error) {
- return errorResponse(error instanceof Error ? error.message : String(error));
+ async ({ prompt, output_path, model, input_images, aspect_ratio, image_size }) => {
+ try {
+ const parsed = geminiParamsSchema.safeParse({
+ prompt,
+ output_path,
+ model,
+ input_images,
+ aspect_ratio,
+ image_size,
+ });
+ if (!parsed.success) return errorResponse(`Invalid Gemini parameters: ${formatValidationError(parsed.error)}`);
+ const result = await generateGeminiImage(parsed.data);
+ if (!result.ok) return errorResponse(result.error);
+ return successResponse(result.data);
72
+ } catch (error) {
73
+ return errorResponse(error instanceof Error ? error.message : String(error));
74
+ }
86
75
  }
87
- }
88
- );
89
-
90
- server.prompt(
91
- "create-image",
92
- {
93
- description: "Generate an image using AI with professional prompting guidance",
94
- arguments: [
95
- {
96
- name: "description",
97
- description: "What image to create",
98
- required: true,
76
+ );
77
+
78
+ server.registerPrompt(
79
+ "create-image",
80
+ {
81
+ description: "Generate an image using AI with professional prompting guidance",
82
+ argsSchema: {
83
+ description: z.string().describe("What image to create"),
99
84
  },
100
- ],
101
- },
102
- async ({ description }) => ({
103
- messages: [
104
- {
105
- role: "user",
106
- content: {
107
- type: "text",
108
- text: `# Generate Image
85
+ },
86
+ async ({ description }) => ({
87
+ messages: [
88
+ {
89
+ role: "user",
90
+ content: {
91
+ type: "text",
92
+ text: `# Generate Image
109
93
 
110
94
  Create an image based on this request: **${description}**
111
95
 
@@ -189,12 +173,23 @@ MUST NOT include:
189
173
  ## Output
190
174
 
191
175
  Save to the current directory with a descriptive filename based on the content.`,
176
+ },
192
177
  },
193
- },
194
- ],
195
- })
196
- );
197
-
198
- const transport = new StdioServerTransport();
199
- await server.connect(transport);
200
- console.error("Images MCP server running on stdio");
178
+ ],
179
+ })
180
+ );
181
+
182
+ return server;
183
+ }
184
+
185
+ export async function startMcpServer() {
186
+ const server = createMcpServer();
187
+ const transport = new StdioServerTransport();
188
+ await server.connect(transport);
189
+ console.error("Images MCP server running on stdio");
190
+ return server;
191
+ }
192
+
193
+ if (import.meta.main) {
194
+ await startMcpServer();
195
+ }
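The new `formatValidationError` helper flattens a Zod error's `issues` array into a single semicolon-separated string, substituting `input` when an issue has no path. A minimal dependency-free sketch of that behavior (the `Issue` type here is a hypothetical stand-in for the shape Zod's `ZodError.issues` entries provide, not part of the package):

```typescript
// Hypothetical stand-in for the shape of a Zod issue: a path into the
// offending field plus a human-readable message.
type Issue = { path: (string | number)[]; message: string };

// Mirrors the helper added in mcp.ts: join each issue's path with dots
// (falling back to "input" for top-level issues), then join with "; ".
const formatValidationError = (error: { issues: Issue[] }) =>
  error.issues.map((issue) => `${issue.path.join(".") || "input"}: ${issue.message}`).join("; ");

const msg = formatValidationError({
  issues: [
    { path: [], message: "Required" },
    { path: ["input_images", 0], message: "Expected string" },
  ],
});
console.log(msg); // "input: Required; input_images.0: Expected string"
```

With this, a tool call that fails validation returns one compact line naming every bad field rather than a raw Zod error dump.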
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "images-mcp",
-  "version": "1.2.1",
+  "version": "1.2.2",
   "description": "MCP server for AI image generation using OpenAI and Google Gemini",
   "type": "module",
   "main": "mcp.ts",
@@ -10,10 +10,14 @@
   "files": [
     "cli.ts",
     "core.ts",
-    "mcp.ts"
+    "mcp.ts",
+    "schemas.ts"
   ],
   "scripts": {
-    "start": "bun run mcp.ts"
+    "start": "bun run mcp.ts",
+    "test": "bun test",
+    "typecheck": "bunx tsc --noEmit",
+    "ci:check": "bun run test && bun run typecheck"
   },
   "keywords": [
     "mcp",
@@ -36,7 +40,7 @@
   },
   "homepage": "https://github.com/ebeloded/images-mcp#readme",
   "engines": {
-    "node": ">=18"
+    "bun": ">=1.0.0"
   },
   "devDependencies": {
     "@types/bun": "latest"
@@ -47,6 +51,7 @@
   "dependencies": {
     "@google/genai": "^1.34.0",
     "@modelcontextprotocol/sdk": "^1.25.1",
+    "clipanion": "^4.0.0-rc.4",
     "openai": "^6.14.0",
     "zod": "^4.2.1"
   }
package/schemas.ts ADDED
@@ -0,0 +1,35 @@
+import { z } from "zod";
+
+export const openAIModelSchema = z.enum(["gpt-image-1.5"]);
+export const openAISizeSchema = z.enum(["auto", "1024x1024", "1536x1024", "1024x1536"]);
+export const openAIQualitySchema = z.enum(["auto", "high", "medium", "low"]);
+export const openAIBackgroundSchema = z.enum(["auto", "transparent", "opaque"]);
+
+export const geminiModelSchema = z.enum(["gemini-2.5-flash-image", "gemini-3-pro-image-preview"]);
+export const geminiAspectRatioSchema = z.enum(["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"]);
+export const geminiImageSizeSchema = z.enum(["1K", "2K", "4K"]);
+
+export const openAIInputShape = {
+  prompt: z.string().describe("Description of the image to generate, or editing instructions if input_images provided"),
+  output_path: z.string().describe("Path where the image should be saved (e.g., /path/to/image.png)"),
+  model: openAIModelSchema.default("gpt-image-1.5").describe("Model: gpt-image-1.5"),
+  input_images: z.array(z.string()).optional().describe("Optional array of image file paths for editing/reference"),
+  size: openAISizeSchema.default("auto").describe("Image size"),
+  quality: openAIQualitySchema.default("auto").describe("Image quality"),
+  background: openAIBackgroundSchema.default("auto").describe("Background type"),
+} as const;
+
+export const geminiInputShape = {
+  prompt: z.string().describe("Description of the image to generate, or editing instructions if input_images provided"),
+  output_path: z.string().describe("Path where the image should be saved (e.g., /path/to/image.png)"),
+  model: geminiModelSchema.default("gemini-3-pro-image-preview").describe("Model"),
+  input_images: z.array(z.string()).optional().describe("Optional array of image file paths for editing/reference"),
+  aspect_ratio: geminiAspectRatioSchema.optional().describe("Aspect ratio"),
+  image_size: geminiImageSizeSchema.optional().describe("Image size"),
+} as const;
+
+export const openAIParamsSchema = z.object(openAIInputShape).strict();
+export const geminiParamsSchema = z.object(geminiInputShape).strict();
+
+export type OpenAIParams = z.infer<typeof openAIParamsSchema>;
+export type GeminiParams = z.infer<typeof geminiParamsSchema>;
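The enum schemas extracted into `schemas.ts` amount to fixed allow-lists of values; `geminiAspectRatioSchema`, for instance, accepts exactly ten ratios. A dependency-free sketch of the same check using a plain readonly tuple instead of `z.enum` (the `isGeminiAspectRatio` guard is illustrative, not part of the package):

```typescript
// The same ten ratios geminiAspectRatioSchema allows, as a readonly tuple.
const GEMINI_ASPECT_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"] as const;
type GeminiAspectRatio = (typeof GEMINI_ASPECT_RATIOS)[number];

// Hypothetical type guard mirroring what the Zod enum enforces at parse time.
const isGeminiAspectRatio = (value: string): value is GeminiAspectRatio =>
  (GEMINI_ASPECT_RATIOS as readonly string[]).includes(value);

console.log(isGeminiAspectRatio("16:9")); // true
console.log(isGeminiAspectRatio("32:9")); // false
```

Because both the CLI and the MCP tools build their parameter objects from the same `openAIInputShape`/`geminiInputShape`, the two entry points cannot drift apart on which values they accept.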