imgx-mcp 0.9.0 → 0.9.2

package/CHANGELOG.md CHANGED
@@ -1,5 +1,20 @@
1
1
  # Changelog
2
2
 
3
+ ## 0.9.1 (2026-03-02)
4
+
5
+ ### Added
6
+
7
+ - **Skill included in npm package** — `skills/image-generation/SKILL.md` and `references/providers.md` now ship with the npm package, making it easier to install the Claude Code skill
8
+
9
+ ### Changed
10
+
11
+ - README restructured: Skill section moved after Quick Start, Plugin section moved to bottom
12
+ - Skill install instructions added (npm copy, curl from GitHub, manual placement)
13
+ - SKILL.md: added missing MCP parameters (`output_format`, `output_dir`, `model`, `provider` on edit tools)
14
+ - SKILL.md: CLI fallback updated from plugin path to `npx imgx-mcp`
15
+ - providers.md: OpenAI `OUTPUT_FORMAT` corrected from CLI-only to MCP `output_format` parameter
16
+ - npm keywords: added `skill`, `claude-code`
17
+
3
18
  ## 0.9.0 (2026-02-28)
4
19
 
5
20
  ### Changed
package/README.md CHANGED
@@ -1,216 +1,133 @@
1
1
  # imgx-mcp
2
2
 
3
- AI image generation and editing for Claude Code, Codex CLI, and MCP-compatible AI agents. Provider-agnostic design with capability-based abstraction.
3
+ AI image generation and editing MCP server. Works with Claude Code, Gemini CLI, Cursor, Windsurf, and any MCP-compatible tool.
4
4
 
5
- ## Install
5
+ Generate images from text, edit existing images with text instructions, iterate on results — all from your AI coding environment.
6
6
 
7
- ### As a Claude Code plugin
7
+ ## Quick start
8
8
 
9
- ```
10
- /plugin marketplace add somacoffeekyoto/imgx-mcp
11
- /plugin install imgx-mcp@somacoffeekyoto-imgx-mcp
12
- ```
13
-
14
- After installation, restart Claude Code. The `image-generation` skill becomes available — Claude Code can generate and edit images via natural language instructions.
15
-
16
- ### Update
17
-
18
- #### Claude Code plugin
19
-
20
- You can try updating via the plugin manager:
9
+ Add to your tool's MCP config (`.mcp.json`, `settings.json`, etc.):
21
10
 
22
- ```
23
- /plugin update → select "installed" → imgx-mcp → update
11
+ ```json
12
+ {
13
+ "mcpServers": {
14
+ "imgx": {
15
+ "command": "npx",
16
+ "args": ["--package=imgx-mcp", "-y", "imgx-mcp"],
17
+ "env": { "GEMINI_API_KEY": "your-key" }
18
+ }
19
+ }
20
+ }
24
21
  ```
25
22
 
26
- If the update shows no changes or the plugin doesn't reflect the latest version, uninstall and reinstall:
23
+ That's it. Your AI agent can now generate and edit images.
27
24
 
28
- ```
29
- /plugin uninstall imgx-mcp@somacoffeekyoto-imgx-mcp
30
- /plugin install imgx-mcp@somacoffeekyoto-imgx-mcp
31
- ```
25
+ > **Windows**: Replace `"command": "npx"` with `"command": "cmd"` and prepend `"/c"` to the args array.
32
26
 
33
- Then restart Claude Code.
27
+ ## Skill (Claude Code)
34
28
 
35
- #### Standalone CLI
29
+ For Claude Code users, imgx-mcp includes an `image-generation` skill — a guided prompt that teaches Claude how to use the MCP tools effectively. With the skill installed, type `/image-generation` to start a guided workflow.
36
30
 
37
- ```bash
38
- npm update -g imgx-mcp
39
- ```
31
+ ### Install the skill
40
32
 
41
- ### As a standalone CLI
33
+ Copy the skill directory from the npm package or GitHub repository to your project:
42
34
 
43
35
  ```bash
44
- npm install -g imgx-mcp
45
- ```
46
-
47
- Requires Node.js 18+.
48
-
49
- ## Setup
50
-
51
- Set up at least one provider:
52
-
53
- **Gemini** — get a key from [Google AI Studio](https://aistudio.google.com/apikey) (free tier available):
36
+ # From npm (after a global install: npm install -g imgx-mcp)
37
+ mkdir -p .claude/skills && cp -r "$(npm root -g)/imgx-mcp/skills/." .claude/skills/
54
38
 
55
- ```bash
56
- imgx config set api-key YOUR_GEMINI_API_KEY --provider gemini
39
+ # Or from the GitHub repository
40
+ curl -sL https://raw.githubusercontent.com/somacoffeekyoto/imgx-mcp/main/skills/image-generation/SKILL.md \
41
+ -o .claude/skills/image-generation/SKILL.md --create-dirs
42
+ curl -sL https://raw.githubusercontent.com/somacoffeekyoto/imgx-mcp/main/skills/image-generation/references/providers.md \
43
+ -o .claude/skills/image-generation/references/providers.md --create-dirs
57
44
  ```
58
45
 
59
- **OpenAI**: get a key from [OpenAI Platform](https://platform.openai.com/api-keys):
46
+ Or place skill files manually:
60
47
 
61
- ```bash
62
- imgx config set api-key YOUR_OPENAI_API_KEY --provider openai
63
48
  ```
64
-
65
- Keys are stored in `~/.config/imgx/config.json` (Linux/macOS) or `%APPDATA%\imgx\config.json` (Windows). Alternatively, set environment variables:
66
-
67
- ```bash
68
- export GEMINI_API_KEY="your-api-key"
69
- export OPENAI_API_KEY="your-api-key"
49
+ your-project/
50
+ .mcp.json MCP server config (Quick start above)
51
+ .claude/
52
+ skills/
53
+ image-generation/
54
+ SKILL.md ← skill prompt
55
+ references/
56
+ providers.md ← provider reference
70
57
  ```
71
58
 
72
- Environment variables take precedence over the config file.
73
-
74
- ## Usage
75
-
76
- ### Generate an image from text
77
-
78
- ```bash
79
- imgx generate -p "A coffee cup on a wooden table, morning light" -o output.png
80
- ```
59
+ The skill files are included in the [npm package](https://www.npmjs.com/package/imgx-mcp) under `skills/` and in the [GitHub repository](https://github.com/somacoffeekyoto/imgx-mcp/tree/main/skills/image-generation).
81
60
 
82
- ### Edit an existing image
61
+ > **Personal skill** (all projects): Place in `~/.claude/skills/image-generation/` instead of `.claude/skills/`.
83
62
 
84
- ```bash
85
- imgx edit -i photo.png -p "Change the background to sunset" -o edited.png
86
- ```
63
+ ### What the skill does
87
64
 
88
- ### Iterative editing with `--last`
65
+ The skill guides Claude Code through image workflows: blog covers, iterative editing, provider comparison, icon generation. It knows the MCP tool parameters and best practices, so you get better results with less effort.
89
66
 
90
- ```bash
91
- imgx edit -i photo.png -p "Make the background darker"
92
- # → {"success": true, "filePaths": ["./imgx-a1b2c3d4.png"]}
67
+ ### MCP server vs Skill
93
68
 
94
- imgx edit --last -p "Add warm lighting"
95
- # Uses the previous output as input automatically
69
+ | | MCP server | Skill |
70
+ |---|---|---|
71
+ | What it does | Exposes image tools to AI agents | Guided prompt for using the tools |
72
+ | Works with | Any MCP-compatible tool | Claude Code |
73
+ | Install | Add to `.mcp.json` | Copy skill files to project |
74
+ | Team sharing | Commit `.mcp.json` to repo | Commit `.claude/skills/` to repo |
96
75
 
97
- imgx edit --last -p "Crop to 16:9" -o final.png
98
- ```
76
+ **Recommended**: Set up the MCP server (see Quick start), and also install the skill if you use Claude Code.
99
77
 
100
- ### Options
78
+ ## MCP tools
101
79
 
102
- | Flag | Short | Description |
103
- |------|-------|-------------|
104
- | `--prompt` | `-p` | Image description or edit instruction (required) |
105
- | `--output` | `-o` | Output file path (auto-generated if omitted) |
106
- | `--input` | `-i` | Input image to edit (`edit` command only) |
107
- | `--last` | `-l` | Use last output as input (`edit` command only) |
108
- | `--aspect-ratio` | `-a` | `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2:3`, `3:2` |
109
- | `--resolution` | `-r` | `1K`, `2K`, `4K` |
110
- | `--count` | `-n` | Number of images to generate |
111
- | `--format` | `-f` | Output format: `png`, `jpeg`, `webp` (OpenAI only) |
112
- | `--model` | `-m` | Model name |
113
- | `--provider` | | Provider name (default: `gemini`) |
114
- | `--output-dir` | `-d` | Output directory |
115
-
116
- ### Configuration
80
+ | Tool | Description |
81
+ |------|-------------|
82
+ | `generate_image` | Generate an image from a text prompt |
83
+ | `edit_image` | Edit an existing image with text instructions |
84
+ | `edit_last` | Edit the last generated/edited image (no input path needed) |
85
+ | `list_providers` | List available providers and capabilities |
117
86
 
118
- ```bash
119
- imgx config set api-key <key> --provider gemini # Save Gemini API key
120
- imgx config set api-key <key> --provider openai # Save OpenAI API key
121
- imgx config set model <name> # Set default model
122
- imgx config set output-dir <dir> # Set default output directory
123
- imgx config set aspect-ratio 16:9 # Set default aspect ratio
124
- imgx config set resolution 2K # Set default resolution
125
- imgx config list # Show all settings
126
- imgx config get api-key # Show a specific setting (API key is masked)
127
- imgx config path # Show config file location
128
- ```
87
+ Images are saved to `~/Pictures/imgx/` by default. File paths are returned in the response. Inline image preview is included in MCP responses (base64).
129
88
 
130
- ### Project config (`.imgxrc`)
89
+ ### Iterative editing
131
90
 
132
- Generate a template with `imgx init`:
91
+ The `edit_last` tool uses the output of the previous `generate_image` or `edit_image` call as input. This enables a conversational workflow:
133
92
 
134
- ```bash
135
- imgx init
136
- # → creates .imgxrc in current directory
137
93
  ```
138
-
139
- Or create manually. Place a `.imgxrc` file in your project directory to set project-level defaults:
140
-
141
- ```json
142
- {
143
- "defaults": {
144
- "model": "gemini-2.5-flash-image",
145
- "outputDir": "./assets/images",
146
- "aspectRatio": "16:9"
147
- }
148
- }
94
+ "Generate a coffee shop interior" → generate_image
95
+ "Make the lighting warmer" → edit_last
96
+ "Add a person reading a book" → edit_last
149
97
  ```
150
98
 
151
- Project config is shared via Git. Do not put API keys in `.imgxrc` — use `imgx config set api-key` or environment variables instead.
152
-
153
- ### Settings resolution
99
+ No need to specify file paths between steps.
154
100
 
155
- Settings are resolved in this order (first match wins):
101
+ ## API key setup
156
102
 
157
- 1. CLI flags (`--model`, `--output-dir`, etc.)
158
- 2. Environment variables (`IMGX_MODEL`, `IMGX_OUTPUT_DIR`, etc.)
159
- 3. Project config (`.imgxrc` in current directory)
160
- 4. User config (`~/.config/imgx/config.json` or `%APPDATA%\imgx\config.json`)
161
- 5. Provider defaults
103
+ Set up at least one provider:
162
104
 
163
- ### Other commands
105
+ **Gemini**: get a key from [Google AI Studio](https://aistudio.google.com/apikey) (free tier available):
164
106
 
165
107
  ```bash
166
- imgx providers # List available providers and their capabilities
167
- imgx capabilities # Show detailed capabilities of current provider
108
+ imgx config set api-key YOUR_GEMINI_API_KEY --provider gemini
168
109
  ```
169
110
 
170
- ### Environment variables
171
-
172
- Environment variables override config file settings.
173
-
174
- | Variable | Description |
175
- |----------|-------------|
176
- | `GEMINI_API_KEY` | Gemini API key |
177
- | `OPENAI_API_KEY` | OpenAI API key |
178
- | `IMGX_PROVIDER` | Default provider |
179
- | `IMGX_MODEL` | Default model |
180
- | `IMGX_OUTPUT_DIR` | Default output directory |
181
-
182
- ## Output
183
-
184
- All commands output JSON:
185
-
186
- ```json
187
- {"success": true, "filePaths": ["./output.png"]}
188
- ```
111
+ **OpenAI**: get a key from [OpenAI Platform](https://platform.openai.com/api-keys):
189
112
 
190
- ```json
191
- {"success": false, "error": "error message"}
113
+ ```bash
114
+ imgx config set api-key YOUR_OPENAI_API_KEY --provider openai
192
115
  ```
193
116
 
194
- This makes imgx suitable for scripting, CI pipelines, and integration with other tools.
195
-
196
- ## MCP server
197
-
198
- imgx includes an MCP (Model Context Protocol) server, making it available to any MCP-compatible AI coding tool.
117
+ Keys are stored in `~/.config/imgx/config.json` (Linux/macOS) or `%APPDATA%\imgx\config.json` (Windows). Alternatively, pass keys via the `env` section in your MCP config, or set environment variables:
199
118
 
200
- ### Exposed tools
119
+ ```bash
120
+ export GEMINI_API_KEY="your-api-key"
121
+ export OPENAI_API_KEY="your-api-key"
122
+ ```
201
123
 
202
- | Tool | Description |
203
- |------|-------------|
204
- | `generate_image` | Generate an image from a text prompt |
205
- | `edit_image` | Edit an existing image with text instructions |
206
- | `edit_last` | Edit the last generated/edited image (no input path needed) |
207
- | `list_providers` | List available providers and capabilities |
124
+ Only include the API keys for providers you want to use. At least one is required.
208
125
 
209
- ### Configuration
126
+ ## MCP configuration by tool
210
127
 
211
- Add to your tool's MCP config. The `env` section is optional if you have already run `imgx config set api-key`.
128
+ ### Claude Code
212
129
 
213
- **Claude Code** (`.mcp.json` / `claude mcp add`):
130
+ `.mcp.json` in your project root:
214
131
 
215
132
  ```json
216
133
  {
@@ -224,11 +141,9 @@ Add to your tool's MCP config. The `env` section is optional if you have already
224
141
  }
225
142
  ```
226
143
 
227
- On Windows, replace `"command": "npx"` with `"command": "cmd"` and prepend `"/c"` to the args array.
144
+ ### Gemini CLI
228
145
 
229
- Or install as a [Claude Code plugin](#install) for automatic MCP registration.
230
-
231
- **Gemini CLI** (`~/.gemini/settings.json`):
146
+ `~/.gemini/settings.json`:
232
147
 
233
148
  ```json
234
149
  {
@@ -242,7 +157,9 @@ Or install as a [Claude Code plugin](#install) for automatic MCP registration.
242
157
  }
243
158
  ```
244
159
 
245
- **Claude Desktop** (`claude_desktop_config.json`):
160
+ ### Claude Desktop
161
+
162
+ `claude_desktop_config.json`:
246
163
 
247
164
  macOS / Linux:
248
165
 
@@ -274,9 +191,11 @@ Windows:
274
191
 
275
192
  Config file location: `%APPDATA%\Claude\claude_desktop_config.json` (Windows) or `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS). After editing, restart Claude Desktop.
276
193
 
277
- > **Note:** Claude Desktop runs the MCP server from its own app directory. Images will be saved there by default. To control the output location, add `"IMGX_OUTPUT_DIR": "C:\\Users\\you\\Pictures"` to the `env` section, or run `imgx config set output-dir <path>` beforehand.
194
+ > **Note:** Claude Desktop runs the MCP server from its own app directory. To control image output location, add `"IMGX_OUTPUT_DIR": "C:\\Users\\you\\Pictures"` to the `env` section, or run `imgx config set output-dir <path>` beforehand.
278
195
 
279
- **Codex CLI** (`.codex/config.toml`):
196
+ ### Codex CLI
197
+
198
+ `.codex/config.toml`:
280
199
 
281
200
  ```toml
282
201
  [mcp_servers.imgx]
@@ -285,27 +204,34 @@ args = ["--package=imgx-mcp", "-y", "imgx-mcp"]
285
204
  env = { GEMINI_API_KEY = "your-key", OPENAI_API_KEY = "your-key" }
286
205
  ```
287
206
 
207
+ ### Other tools
208
+
288
209
  The same `npx` pattern works with Cursor, Windsurf, Continue.dev, Cline, Zed, and other MCP-compatible tools. On Windows, use `cmd /c npx` instead of `npx` directly.
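
The per-platform rule above can be sketched as a small helper. This is only an illustration: `buildImgxServerEntry` and `McpServerEntry` are hypothetical names that do not come from imgx-mcp, and the platform string is assumed to follow Node.js `process.platform` values. The command, args, and env shapes are copied from the JSON configs shown in this README.

```typescript
// Hypothetical helper: builds the "imgx" entry for an MCP config file.
// The npx arguments mirror the README; function and type names are illustrative.
type McpServerEntry = {
  command: string;
  args: string[];
  env?: Record<string, string>;
};

function buildImgxServerEntry(
  platform: string, // assumed to be a Node.js process.platform value
  env: Record<string, string> = {},
): McpServerEntry {
  const npxArgs = ["--package=imgx-mcp", "-y", "imgx-mcp"];
  // On Windows the README says to run npx through `cmd /c`.
  const entry: McpServerEntry =
    platform === "win32"
      ? { command: "cmd", args: ["/c", "npx", ...npxArgs] }
      : { command: "npx", args: npxArgs };
  // Attach env only when at least one key is provided.
  if (Object.keys(env).length > 0) {
    entry.env = { ...env };
  }
  return entry;
}

const linuxConfig = {
  mcpServers: { imgx: buildImgxServerEntry("linux", { GEMINI_API_KEY: "your-key" }) },
};
const windowsConfig = {
  mcpServers: { imgx: buildImgxServerEntry("win32", { GEMINI_API_KEY: "your-key" }) },
};
```

Serializing either object with `JSON.stringify(config, null, 2)` should give a file in the same shape as the Claude Code example above.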
289
210
 
290
- Only include the API keys for providers you want to use. At least one is required.
211
+ ## Providers
212
+
213
+ | Provider | Models | Capabilities |
214
+ |----------|--------|-------------|
215
+ | Gemini | `gemini-3-pro-image-preview`, `gemini-2.5-flash-image` | Generate, edit, aspect ratio, resolution, reference images, person control |
216
+ | OpenAI | `gpt-image-1` | Generate, edit, aspect ratio, multi-output, output format (PNG/JPEG/WebP) |
291
217
 
292
218
  ## Architecture
293
219
 
294
220
  imgx separates **model-independent** and **model-dependent** concerns:
295
221
 
296
222
  ```
297
- CLI (argument parsing, output formatting) MCP server (tool definitions, stdio transport)
298
-
223
+ MCP server (tool definitions, stdio transport) CLI (argument parsing, output formatting)
224
+
299
225
  Core (Capability enum, ImageProvider interface, provider registry, file I/O)
300
226
 
301
227
  Provider (model-specific API calls, capability declarations)
302
228
  ```
303
229
 
304
- CLI and MCP server are two entry points into the same core. Both call the same provider functions.
230
+ MCP server and CLI are two entry points into the same core. Both call the same provider functions.
305
231
 
306
- Each provider declares its supported capabilities. The CLI dynamically enables or disables options based on what the active provider supports. Adding a new provider means implementing the `ImageProvider` interface and registering it — no changes to the CLI layer.
232
+ Each provider declares its supported capabilities. Adding a new provider means implementing the `ImageProvider` interface and registering it — no changes to the MCP or CLI layer.
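
To make the "implement and register" step concrete, here is a minimal sketch of what a provider registry could look like. The README names an `ImageProvider` interface and a provider registry but does not show their definitions, so every member, function name, and the `example` provider below is an assumption. The capability names are the ones listed in this package's capability tables, modeled here as a string union rather than an enum.

```typescript
// Illustrative sketch only: the real imgx-mcp types may differ.
type Capability =
  | "TEXT_TO_IMAGE"
  | "IMAGE_EDITING"
  | "ASPECT_RATIO"
  | "RESOLUTION_CONTROL"
  | "REFERENCE_IMAGES"
  | "PERSON_CONTROL"
  | "MULTIPLE_OUTPUTS"
  | "OUTPUT_FORMAT";

// Hypothetical provider shape: a name, declared capabilities, and a generate call.
interface ImageProvider {
  name: string;
  capabilities: ReadonlySet<Capability>;
  generate(prompt: string): Promise<string[]>; // resolves to output file paths
}

// Hypothetical registry: both entry points would look providers up by name.
const registry = new Map<string, ImageProvider>();

function registerProvider(provider: ImageProvider): void {
  if (registry.has(provider.name)) {
    throw new Error(`Provider already registered: ${provider.name}`);
  }
  registry.set(provider.name, provider);
}

function getProvider(name: string): ImageProvider {
  const provider = registry.get(name);
  if (!provider) {
    throw new Error(`Unknown provider: ${name}`);
  }
  return provider;
}

// A stub provider: adding it touches only the registry, not the entry points.
registerProvider({
  name: "example",
  capabilities: new Set<Capability>(["TEXT_TO_IMAGE", "ASPECT_RATIO"]),
  async generate(prompt: string): Promise<string[]> {
    return [`./example-${prompt.length}.png`]; // placeholder path, no API call
  },
});
```

In this sketch the MCP and CLI layers depend only on `getProvider`, which is one way to get the "no changes to the MCP or CLI layer" property described above.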
307
233
 
308
- ### Supported capabilities
234
+ ### Capability system
309
235
 
310
236
  | Capability | Description |
311
237
  |------------|-------------|
@@ -318,12 +244,118 @@ Each provider declares its supported capabilities. The CLI dynamically enables o
318
244
  | `PERSON_CONTROL` | Control person generation in output |
319
245
  | `OUTPUT_FORMAT` | Choose output format (PNG, JPEG, WebP) |
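
One way to use declared capabilities is to check request options against them before calling a provider. The sketch below is an assumption about how such a check might look, not the package's actual logic. The option names follow the MCP tool parameters documented in this package's SKILL.md, the capability sets follow the Providers table, and `unsupportedOptions` is a hypothetical function.

```typescript
// Illustrative sketch: which request options need which capability.
type Capability = "ASPECT_RATIO" | "RESOLUTION_CONTROL" | "MULTIPLE_OUTPUTS" | "OUTPUT_FORMAT";

// Assumed mapping from MCP parameter name to the capability it requires.
const REQUIRED_CAPABILITY = new Map<string, Capability>([
  ["aspect_ratio", "ASPECT_RATIO"],
  ["resolution", "RESOLUTION_CONTROL"],
  ["count", "MULTIPLE_OUTPUTS"],
  ["output_format", "OUTPUT_FORMAT"],
]);

// Optional capabilities per provider, as listed in the Providers table.
const PROVIDER_CAPABILITIES = new Map<string, ReadonlySet<Capability>>([
  ["gemini", new Set<Capability>(["ASPECT_RATIO", "RESOLUTION_CONTROL"])],
  ["openai", new Set<Capability>(["ASPECT_RATIO", "MULTIPLE_OUTPUTS", "OUTPUT_FORMAT"])],
]);

// Returns the option names that the given provider cannot honor.
function unsupportedOptions(provider: string, options: Record<string, unknown>): string[] {
  const supported = PROVIDER_CAPABILITIES.get(provider);
  if (!supported) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return Object.keys(options).filter((name) => {
    const needed = REQUIRED_CAPABILITY.get(name);
    // Options without a capability requirement (such as prompt) always pass.
    return needed !== undefined && !supported.has(needed);
  });
}

const request = { prompt: "Minimalist coffee bean icon", count: 3, resolution: "2K" };
const rejectedForGemini = unsupportedOptions("gemini", request);
const rejectedForOpenai = unsupportedOptions("openai", request);
```

Here `count` is flagged for Gemini and `resolution` for OpenAI, matching the "(OpenAI only)" and "(Gemini only)" notes in the tool parameter tables.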
320
246
 
321
- ### Current providers
247
+ ## CLI
322
248
 
323
- | Provider | Models | Capabilities |
324
- |----------|--------|-------------|
325
- | Gemini | `gemini-3-pro-image-preview`, `gemini-2.5-flash-image` | All 7 base capabilities |
326
- | OpenAI | `gpt-image-1` | Generate, edit, aspect ratio, multi-output, output format |
249
+ imgx-mcp also works as a standalone command-line tool.
250
+
251
+ ### Install
252
+
253
+ ```bash
254
+ npm install -g imgx-mcp
255
+ ```
256
+
257
+ Requires Node.js 18+.
258
+
259
+ ### Usage
260
+
261
+ ```bash
262
+ # Generate
263
+ imgx generate -p "A coffee cup on a wooden table, morning light" -o output.png
264
+
265
+ # Edit
266
+ imgx edit -i photo.png -p "Change the background to sunset" -o edited.png
267
+
268
+ # Iterative editing
269
+ imgx edit -i photo.png -p "Make the background darker"
270
+ imgx edit --last -p "Add warm lighting"
271
+ imgx edit --last -p "Crop to 16:9" -o final.png
272
+
273
+ # Provider management
274
+ imgx providers # List providers and capabilities
275
+ imgx capabilities # Detailed capabilities of current provider
276
+ ```
277
+
278
+ ### CLI options
279
+
280
+ | Flag | Short | Description |
281
+ |------|-------|-------------|
282
+ | `--prompt` | `-p` | Image description or edit instruction (required) |
283
+ | `--output` | `-o` | Output file path (auto-generated if omitted) |
284
+ | `--input` | `-i` | Input image to edit (`edit` command only) |
285
+ | `--last` | `-l` | Use last output as input (`edit` command only) |
286
+ | `--aspect-ratio` | `-a` | `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2:3`, `3:2` |
287
+ | `--resolution` | `-r` | `1K`, `2K`, `4K` |
288
+ | `--count` | `-n` | Number of images to generate |
289
+ | `--format` | `-f` | Output format: `png`, `jpeg`, `webp` (OpenAI only) |
290
+ | `--model` | `-m` | Model name |
291
+ | `--provider` | | Provider name (default: `gemini`) |
292
+ | `--output-dir` | `-d` | Output directory |
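
For scripting, the flags in this table can be assembled programmatically. The helper below is a hypothetical sketch (`buildGenerateArgs` and `GenerateOptions` are not part of imgx-mcp) that maps an options object to an argument list for `imgx generate`, using only the long flags listed above.

```typescript
// Illustrative helper: turns an options object into arguments for `imgx generate`.
type GenerateOptions = {
  prompt: string;
  output?: string;
  aspectRatio?: string;
  resolution?: string;
  count?: number;
  format?: string;
  model?: string;
  provider?: string;
  outputDir?: string;
};

function buildGenerateArgs(options: GenerateOptions): string[] {
  const args = ["generate", "--prompt", options.prompt];
  // Pairs of [flag, value]; undefined values are skipped below.
  const optional: Array<[string, string | number | undefined]> = [
    ["--output", options.output],
    ["--aspect-ratio", options.aspectRatio],
    ["--resolution", options.resolution],
    ["--count", options.count],
    ["--format", options.format],
    ["--model", options.model],
    ["--provider", options.provider],
    ["--output-dir", options.outputDir],
  ];
  for (const [flag, value] of optional) {
    if (value !== undefined) {
      args.push(flag, String(value));
    }
  }
  return args;
}

const generateArgs = buildGenerateArgs({
  prompt: "A coffee cup on a wooden table",
  aspectRatio: "16:9",
  output: "output.png",
});
```

Passing the values as an argument array (for example to Node's `execFile`) rather than as one shell string avoids quoting problems with prompts that contain spaces or quotes.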
293
+
294
+ ### Configuration
295
+
296
+ ```bash
297
+ imgx config set api-key <key> --provider gemini # Save Gemini API key
298
+ imgx config set api-key <key> --provider openai # Save OpenAI API key
299
+ imgx config set model <name> # Set default model
300
+ imgx config set output-dir <dir> # Set default output directory
301
+ imgx config set aspect-ratio 16:9 # Set default aspect ratio
302
+ imgx config set resolution 2K # Set default resolution
303
+ imgx config list # Show all settings
304
+ imgx config get api-key # Show a specific setting (API key is masked)
305
+ imgx config path # Show config file location
306
+ ```
307
+
308
+ ### Project config (`.imgxrc`)
309
+
310
+ Generate a template with `imgx init`:
311
+
312
+ ```bash
313
+ imgx init
314
+ # → creates .imgxrc in current directory
315
+ ```
316
+
317
+ Or create manually:
318
+
319
+ ```json
320
+ {
321
+ "defaults": {
322
+ "model": "gemini-2.5-flash-image",
323
+ "outputDir": "./assets/images",
324
+ "aspectRatio": "16:9"
325
+ }
326
+ }
327
+ ```
328
+
329
+ Project config is shared via Git. Do not put API keys in `.imgxrc`.
330
+
331
+ ### Settings resolution
332
+
333
+ 1. CLI flags (`--model`, `--output-dir`, etc.)
334
+ 2. Environment variables (`IMGX_MODEL`, `IMGX_OUTPUT_DIR`, etc.)
335
+ 3. Project config (`.imgxrc` in current directory)
336
+ 4. User config (`~/.config/imgx/config.json` or `%APPDATA%\imgx\config.json`)
337
+ 5. Provider defaults
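
The list above is ordered by priority; the 0.9.0 README described it as "first match wins". A minimal sketch of that lookup follows, assuming each source has already been loaded into a plain object. `Settings` and `resolveSettings` are hypothetical names, and the sample values are placeholders.

```typescript
// Illustrative sketch of first-match-wins settings resolution.
type Settings = {
  model?: string;
  outputDir?: string;
  aspectRatio?: string;
  resolution?: string;
};

// `sources` must be ordered from highest to lowest priority.
function resolveSettings(sources: Settings[]): Settings {
  const resolved: Settings = {};
  const keys: Array<keyof Settings> = ["model", "outputDir", "aspectRatio", "resolution"];
  for (const key of keys) {
    for (const source of sources) {
      const value = source[key];
      if (value !== undefined) {
        resolved[key] = value;
        break; // the first source that defines the key wins
      }
    }
  }
  return resolved;
}

const resolvedSettings = resolveSettings([
  { outputDir: "./out" }, // 1. CLI flags
  { model: "gemini-2.5-flash-image" }, // 2. environment variables
  { model: "gemini-3-pro-image-preview", aspectRatio: "16:9" }, // 3. project config (.imgxrc)
  { resolution: "2K", outputDir: "~/Pictures/imgx" }, // 4. user config
  { aspectRatio: "1:1" }, // 5. provider defaults
]);
```

In this example the environment variable's model overrides the one in `.imgxrc`, and the CLI flag's output directory overrides the user config.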
338
+
339
+ ### Output format
340
+
341
+ All CLI commands output JSON:
342
+
343
+ ```json
344
+ {"success": true, "filePaths": ["./output.png"]}
345
+ ```
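
A script that consumes this output could parse it as follows. The success shape is the one shown above; the failure shape (`{"success": false, "error": "error message"}`) is taken from the 0.9.0 README and is assumed to still apply. `CliResult` and `parseCliOutput` are hypothetical names for illustration.

```typescript
// Illustrative parser for the CLI's JSON output.
type CliResult =
  | { success: true; filePaths: string[] }
  | { success: false; error: string };

function parseCliOutput(stdout: string): CliResult {
  const data: unknown = JSON.parse(stdout);
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const record = data as Record<string, unknown>;
  const filePaths = record["filePaths"];
  const error = record["error"];
  if (record["success"] === true && Array.isArray(filePaths)) {
    return { success: true, filePaths: filePaths.map(String) };
  }
  if (record["success"] === false && typeof error === "string") {
    return { success: false, error };
  }
  // Anything else is treated as an unexpected shape rather than guessed at.
  throw new Error("Unrecognized imgx output shape");
}

const okResult = parseCliOutput('{"success": true, "filePaths": ["./output.png"]}');
const failedResult = parseCliOutput('{"success": false, "error": "error message"}');
```

Modeling the result as a discriminated union lets callers branch on `success` and have the compiler check that `filePaths` or `error` is only read in the matching branch.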
346
+
347
+ ## Claude Code plugin
348
+
349
+ If you prefer not to configure `.mcp.json` and the skill files manually, the plugin installs the MCP server and the skill in one step:
350
+
351
+ ```
352
+ /plugin marketplace add somacoffeekyoto/imgx-mcp
353
+ /plugin install imgx-mcp@somacoffeekyoto-imgx-mcp
354
+ ```
355
+
356
+ Update: `/plugin` → installed → imgx-mcp → update. If the update shows no changes, uninstall and reinstall.
357
+
358
+ Uninstall: `/plugin uninstall imgx-mcp@somacoffeekyoto-imgx-mcp` then `/plugin marketplace remove somacoffeekyoto-imgx-mcp`.
327
359
 
328
360
  ## Development
329
361
 
@@ -336,28 +368,25 @@ npm run bundle # TypeScript compile + esbuild bundle
336
368
 
337
369
  The build produces two bundles:
338
370
 
339
- - `dist/cli.bundle.js` — CLI entry point
340
371
  - `dist/mcp.bundle.js` — MCP server entry point
372
+ - `dist/cli.bundle.js` — CLI entry point
341
373
 
342
374
  ## Uninstall
343
375
 
344
- ### Claude Code plugin
376
+ ### MCP server
345
377
 
346
- ```
347
- /plugin uninstall imgx-mcp@somacoffeekyoto-imgx-mcp
348
- /plugin marketplace remove somacoffeekyoto-imgx-mcp
349
- ```
378
+ Remove the `imgx` entry from your tool's MCP configuration file.
379
+
380
+ ### Skill
350
381
 
351
- ### Standalone CLI
382
+ Delete the `image-generation/` directory from `.claude/skills/` or `~/.claude/skills/`.
383
+
384
+ ### CLI
352
385
 
353
386
  ```bash
354
387
  npm uninstall -g imgx-mcp
355
388
  ```
356
389
 
357
- ### MCP server
358
-
359
- Remove the `imgx` entry from your tool's MCP configuration file.
360
-
361
390
  ### Clean up configuration (optional)
362
391
 
363
392
  ```bash
@@ -374,6 +403,9 @@ MIT — [SOMA COFFEE KYOTO](https://github.com/somacoffeekyoto)
374
403
 
375
404
  ## Links
376
405
 
406
+ - [Official page](https://somacoffee.net/imgx-mcp/)
377
407
  - [GitHub](https://github.com/somacoffeekyoto/imgx-mcp)
408
+ - [npm](https://www.npmjs.com/package/imgx-mcp)
409
+ - [MCP Registry](https://registry.modelcontextprotocol.io)
378
410
  - [SOMA COFFEE KYOTO](https://somacoffee.net)
379
411
  - [X (@somacoffeekyoto)](https://x.com/somacoffeekyoto)
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "imgx-mcp",
3
- "version": "0.9.0",
3
+ "version": "0.9.2",
4
4
  "mcpName": "io.github.somacoffeekyoto/imgx",
5
5
  "description": "AI image generation and editing for Claude Code, Codex CLI, and MCP-compatible AI agents",
6
6
  "type": "module",
@@ -19,8 +19,10 @@
19
19
  "gemini",
20
20
  "openai",
21
21
  "ai",
22
- "cli",
23
- "mcp"
22
+ "mcp",
23
+ "skill",
24
+ "claude-code",
25
+ "cli"
24
26
  ],
25
27
  "author": "SOMA COFFEE KYOTO",
26
28
  "license": "MIT",
@@ -28,13 +30,14 @@
28
30
  "type": "git",
29
31
  "url": "git+https://github.com/somacoffeekyoto/imgx-mcp.git"
30
32
  },
31
- "homepage": "https://github.com/somacoffeekyoto/imgx-mcp",
33
+ "homepage": "https://somacoffee.net/imgx-mcp/",
32
34
  "bugs": {
33
35
  "url": "https://github.com/somacoffeekyoto/imgx-mcp/issues"
34
36
  },
35
37
  "files": [
36
38
  "dist/cli.bundle.js",
37
39
  "dist/mcp.bundle.js",
40
+ "skills/",
38
41
  "LICENSE",
39
42
  "README.md",
40
43
  "CHANGELOG.md"
package/skills/image-generation/SKILL.md ADDED
@@ -0,0 +1,177 @@
1
+ ---
2
+ name: image-generation
3
+ description: Generate and edit AI images using Gemini or OpenAI. Text-to-image, text-based editing, iterative refinement.
4
+ ---
5
+
6
+ # Image Generation & Editing
7
+
8
+ Generate and edit images using the imgx MCP tools. Gemini and OpenAI providers supported.
9
+
10
+ ## When to use
11
+
12
+ - User asks to create, generate, or make an image
13
+ - User asks to edit, modify, or change an existing image
14
+ - User needs a cover image, diagram, icon, or visual asset
15
+ - User wants to refine an image iteratively ("make it darker", "change the background")
16
+
17
+ ## Setup
18
+
19
+ If the MCP tools (`generate_image`, `edit_image`, `edit_last`, `list_providers`) are already available, skip this section.
20
+
21
+ ### 1. Add MCP server
22
+
23
+ Add imgx-mcp to the project's `.mcp.json` (create the file if it doesn't exist):
24
+
25
+ ```json
26
+ {
27
+ "mcpServers": {
28
+ "imgx": {
29
+ "command": "npx",
30
+ "args": ["--package=imgx-mcp", "-y", "imgx-mcp"],
31
+ "env": { "GEMINI_API_KEY": "your-key" }
32
+ }
33
+ }
34
+ }
35
+ ```
36
+
37
+ On Windows, use `"command": "cmd"` and prepend `"/c"` to args:
38
+ ```json
39
+ {
40
+ "mcpServers": {
41
+ "imgx": {
42
+ "command": "cmd",
43
+ "args": ["/c", "npx", "--package=imgx-mcp", "-y", "imgx-mcp"],
44
+ "env": { "GEMINI_API_KEY": "your-key" }
45
+ }
46
+ }
47
+ }
48
+ ```
49
+
50
+ After adding, restart Claude Code for the MCP server to connect.
51
+
52
+ ### 2. API key
53
+
54
+ Get at least one API key:
55
+
56
+ - **Gemini** (default, free tier available): [Google AI Studio](https://aistudio.google.com/apikey)
57
+ - **OpenAI**: [OpenAI Platform](https://platform.openai.com/api-keys)
58
+
59
+ Set the key in the `.mcp.json` env section (above), or via CLI:
60
+ ```bash
61
+ npx imgx-mcp config set api-key YOUR_KEY --provider gemini
62
+ ```
63
+
64
+ ## MCP tools
65
+
66
+ Use these tools directly. No Bash needed.
67
+
68
+ ### generate_image
69
+
70
+ Generate an image from a text prompt.
71
+
72
+ | Parameter | Required | Description |
73
+ |-----------|----------|-------------|
74
+ | `prompt` | Yes | Image description |
75
+ | `aspect_ratio` | No | `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2:3`, `3:2` |
76
+ | `resolution` | No | `1K`, `2K`, `4K` (Gemini only) |
77
+ | `count` | No | Number of images (OpenAI only) |
78
+ | `output_format` | No | `png`, `jpeg`, `webp` (OpenAI only) |
79
+ | `model` | No | Model name |
80
+ | `provider` | No | `gemini` (default) or `openai` |
81
+ | `output` | No | Output file path |
82
+ | `output_dir` | No | Output directory |
83
+
84
+ ### edit_image
85
+
86
+ Edit an existing image with text instructions. No mask needed — the model determines what to change from the text.
87
+
88
+ | Parameter | Required | Description |
89
+ |-----------|----------|-------------|
90
+ | `input` | Yes | Path to the image to edit |
91
+ | `prompt` | Yes | Edit instruction |
92
+ | `aspect_ratio` | No | Output aspect ratio |
93
+ | `resolution` | No | Output resolution (Gemini only) |
94
+ | `output_format` | No | `png`, `jpeg`, `webp` (OpenAI only) |
95
+ | `model` | No | Model name |
96
+ | `provider` | No | `gemini` (default) or `openai` |
97
+ | `output` | No | Output file path |
98
+ | `output_dir` | No | Output directory |
99
+
100
+ ### edit_last
101
+
102
+ Edit the last generated or edited image. No input path needed — automatically uses the previous output.
103
+
104
+ | Parameter | Required | Description |
105
+ |-----------|----------|-------------|
106
+ | `prompt` | Yes | Edit instruction |
107
+ | `aspect_ratio` | No | Output aspect ratio |
108
+ | `resolution` | No | Output resolution (Gemini only) |
109
+ | `output_format` | No | `png`, `jpeg`, `webp` (OpenAI only) |
110
+ | `model` | No | Model name |
111
+ | `provider` | No | `gemini` (default) or `openai` |
112
+ | `output` | No | Output file path |
113
+ | `output_dir` | No | Output directory |
114
+
115
+ ### list_providers
116
+
117
+ List available providers and their capabilities. No parameters.
118
+
119
+ ## Practical workflows
120
+
121
+ ### Blog cover image
122
+
123
+ ```
124
+ 1. generate_image: prompt="A developer's desk with laptop showing terminal, coffee cup, warm morning light" aspect_ratio="16:9" resolution="2K"
125
+ 2. Review the result with the user
126
+ 3. edit_last: prompt="Make the color palette warmer" (if user wants changes)
127
+ 4. edit_last: prompt="Add subtle vignette effect" (further refinement)
128
+ ```
129
+
130
+ ### Iterative refinement
131
+
132
+ The `edit_last` tool is the key to conversational image editing. Each call takes the previous output as input:
133
+
134
+ ```
135
+ generate_image → edit_last → edit_last → edit_last → done
136
+ ```
137
+
138
+ Tell the user what was generated, ask if they want changes, and use `edit_last` to apply them. This is the most natural workflow.
139
+
140
+ ### Comparing providers
141
+
142
+ Generate the same prompt with different providers to let the user choose:
143
+
144
+ ```
145
+ 1. generate_image: prompt="..." provider="gemini"
146
+ 2. generate_image: prompt="..." provider="openai"
147
+ 3. Show both results. User picks their preferred version
148
+ 4. Refine the chosen one: use edit_last only if the user picked the most recent output (step 2); otherwise use edit_image with the chosen file path as input
149
+ ```
150
+
151
+ ### Icon or logo variations
152
+
153
+ ```
154
+ 1. generate_image: prompt="Minimalist coffee bean icon, white background" aspect_ratio="1:1" count=3
155
+ (count works with OpenAI provider only)
156
+ 2. For Gemini, generate multiple times with slight prompt variations
157
+ ```
158
+
159
+ ## Tips
160
+
161
+ - **Be specific in prompts**: "A wooden table with a ceramic pour-over dripper, steam rising, soft natural light from left" works better than "coffee scene"
162
+ - **Use edit_last for iteration**: Don't ask the user to specify file paths. Just use `edit_last` after any generation or edit
163
+ - **Check provider capabilities**: Use `list_providers` if unsure what a provider supports
164
+ - **Default output**: Images save to `~/Pictures/imgx/` with auto-generated filenames. Use `output` or `output_dir` to customize
165
+ - **Inline preview**: MCP responses include base64 image data for inline display in supported clients
166
+
167
+ ## CLI fallback
168
+
169
+ If MCP tools are not available (MCP server not configured), fall back to CLI via Bash:
170
+
171
+ ```bash
172
+ npx imgx-mcp generate -p "prompt" -o output.png
173
+ npx imgx-mcp edit -i input.png -p "edit instruction"
174
+ npx imgx-mcp edit --last -p "refine further"
175
+ ```
176
+
177
+ See [providers reference](references/providers.md) for detailed provider capabilities.
package/skills/image-generation/references/providers.md ADDED
@@ -0,0 +1,62 @@
1
+ # Provider Reference
2
+
3
+ ## Gemini (default)
4
+
5
+ | Item | Value |
6
+ |------|-------|
7
+ | Provider name | `gemini` |
8
+ | Default model | `gemini-3-pro-image-preview` |
9
+ | Alternative model | `gemini-2.5-flash-image` |
10
+ | API key env var | `GEMINI_API_KEY` |
11
+
12
+ ### Model comparison
13
+
14
+ | Feature | gemini-3-pro-image-preview | gemini-2.5-flash-image |
15
+ |---------|---------------------------|------------------------|
16
+ | Quality | Higher | Good |
17
+ | Speed | Slower | Faster |
18
+ | Cost | ~$0.134/image | Lower |
19
+ | Resolution | 1K, 2K, 4K | 1K, 2K |
20
+
21
+ ### Capabilities
22
+
23
+ | Capability | MCP parameter | Description |
24
+ |------------|---------------|-------------|
25
+ | TEXT_TO_IMAGE | (default) | Generate from text |
26
+ | IMAGE_EDITING | `input` | Edit with text instructions |
27
+ | ASPECT_RATIO | `aspect_ratio` | 7 ratios: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `9:16`, `16:9` |
28
+ | RESOLUTION_CONTROL | `resolution` | `1K`, `2K`, `4K` |
29
+ | REFERENCE_IMAGES | — | Use reference images (future) |
30
+ | PERSON_CONTROL | — | Control person generation (future) |
31
+
32
+ ## OpenAI
33
+
34
+ | Item | Value |
35
+ |------|-------|
36
+ | Provider name | `openai` |
37
+ | Default model | `gpt-image-1` |
38
+ | API key env var | `OPENAI_API_KEY` |
39
+
40
+ ### Capabilities
41
+
42
+ | Capability | MCP parameter | Description |
43
+ |------------|---------------|-------------|
44
+ | TEXT_TO_IMAGE | (default) | Generate from text |
45
+ | IMAGE_EDITING | `input` | Edit with text instructions |
46
+ | ASPECT_RATIO | `aspect_ratio` | 7 ratios: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `9:16`, `16:9` |
47
+ | MULTIPLE_OUTPUTS | `count` | Generate up to 4 images per request |
48
+ | OUTPUT_FORMAT | `output_format` | PNG, JPEG, WebP |
49
+
50
+ ### Provider comparison
51
+
52
+ | Feature | Gemini | OpenAI |
53
+ |---------|--------|--------|
54
+ | Edit (text-only, no mask) | Yes | Yes |
55
+ | Resolution control | Yes (1K/2K/4K) | No |
56
+ | Multiple outputs | No | Yes (up to 4) |
57
+ | Output format selection | No (PNG only) | Yes (PNG/JPEG/WebP) |
58
+ | Iterative editing (`edit_last`) | Yes | Yes |
59
+
60
+ ## Adding new providers
61
+
62
+ Providers implement the `ImageProvider` interface and register via the provider registry. Each provider declares its supported capabilities. The MCP server and CLI dynamically enable/disable options based on the active provider's capabilities.