pixel-surgeon-mcp 1.1.0 → 1.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/README.md +48 -28
  2. package/dist/index.js +4 -4
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -6,13 +6,14 @@

  <p align="center">
  <strong>MCP server for AI image &amp; video generation, editing, and transplant-grade region repair</strong><br/>
- Powered by Gemini 3.1 Flash Image, OpenAI GPT Image 2, and Veo 3
+ Powered by Gemini 3.1 Flash Image, OpenAI GPT Image 2, Grok Imagine, and Veo 3
  </p>

  <p align="center">
  <img src="https://img.shields.io/badge/MCP-stdio-blue" alt="MCP stdio" />
  <img src="https://img.shields.io/badge/Gemini_3.1-Flash_Image-4285F4?logo=google" alt="Gemini" />
  <img src="https://img.shields.io/badge/GPT_Image_2-OpenAI-412991?logo=openai&logoColor=white" alt="OpenAI" />
+ <img src="https://img.shields.io/badge/Grok_Imagine-xAI-000000?logo=x&logoColor=white" alt="Grok" />
  <img src="https://img.shields.io/badge/Veo_3-Video-34A853?logo=google" alt="Veo 3" />
  <img src="https://img.shields.io/badge/TypeScript-5.9-3178C6?logo=typescript&logoColor=white" alt="TypeScript" />
  </p>
@@ -23,15 +24,19 @@ An [MCP](https://modelcontextprotocol.io) server that gives Claude (or any MCP c

  ## How it works

- pixel-surgeon-mcp is a **multi-provider** image generation server. You can use either or both providers, and switch between them per-request:
+ pixel-surgeon-mcp is a **multi-provider** image generation server. You can use any combination of providers and switch between them per-request:

- ### Gemini (Google)
+ ### Gemini (Google) — balanced

- Google's image generation pipeline uses a two-stage approach: **Gemini 3.1 Pro** reasons about your prompt, then **Gemini 3.1 Flash Image** renders the pixels. Supports 9 aspect ratios at 512/1K/2K/4K resolution.
+ Google's image generation pipeline uses a two-stage approach: **Gemini 3.1 Pro** reasons about your prompt, then **Gemini 3.1 Flash Image** renders the pixels. Supports 9 aspect ratios at 512/1K/2K/4K resolution. Best price/performance ratio, with a free tier available.

- ### OpenAI GPT Image 2
+ ### OpenAI GPT Image 2 — highest quality

- OpenAI's latest image model with dramatically improved text rendering and visual fidelity. Supports flexible resolutions — pixel-surgeon maps your chosen size and aspect ratio to the optimal pixel dimensions automatically. Quality levels: `medium` (fast) and `high` (print-ready). **Excellent for infographics, diagrams, and text-heavy images** where Gemini models struggle.
+ OpenAI's latest image model with dramatically improved text rendering and visual fidelity. Supports flexible resolutions — pixel-surgeon maps your chosen size and aspect ratio to the optimal pixel dimensions automatically. Quality levels: `medium` (fast) and `high` (print-ready). **Excellent for infographics, diagrams, and text-heavy images** where other models struggle. Slower and more expensive.
+
+ ### Grok Imagine (xAI) — fastest
+
+ xAI's Aurora-powered image model. Fastest generation speed and lowest cost. Supports 7 aspect ratios at fixed resolutions (~1K). Good for rapid prototyping and iteration.

  ### Veo 3 (Video)

@@ -64,6 +69,7 @@ AI image models struggle with text-heavy images. The fix tools solve this by sen
  | `gemini-2.5-flash-image` | Google | 1K max (free tier) | Quick drafts, prototyping |
  | `gpt-image-2` | OpenAI | Flexible (up to 4K) | Text-heavy images, infographics, diagrams, typography |
  | `gpt-image-1` | OpenAI | 3 fixed sizes | Legacy support |
+ | `grok-imagine` | xAI | Fixed (~1K per ratio) | Fast iteration, lowest cost |

  Force a specific model per-call via the `model` tool parameter, or set `DEFAULT_IMAGE_MODEL` env var.

@@ -80,10 +86,10 @@ Magazine editorial, bold typography, halftone textures. Cream, black, and terrac

  <img src="assets/style-neo-brutalist.png" alt="neo-brutalist style example" width="400" />

- ### `neo-retro-futurism`
- 1960s Space Age meets 1980s arcade. Cathode blue, amber, and salmon palette.
+ ### `duval-software-infographic`
+ Duval Software's signature retro-futurist infographic style. 1960s Space Age meets 1980s arcade. Cathode blue, amber, and salmon palette. Great for diagrams and system overviews.

- <img src="assets/style-neo-retro-futurism.png" alt="neo-retro-futurism style example" width="400" />
+ <img src="assets/style-neo-retro-futurism.png" alt="duval-software-infographic style example" width="400" />

  ### `fractal-arcade`
  Dithered fractals, Sierpinski patterns, low-poly. CRT retro, Amiga/EGA palette.
@@ -99,7 +105,7 @@ Technical diagrams, system flows, data pipelines. Dark navy, cyan, and electric

  ### Get your API key(s)

- You need at least one provider API key. You can use both for maximum flexibility.
+ You need at least one provider API key. You can use any combination for maximum flexibility.

  #### Google (Gemini + Veo 3)

@@ -118,45 +124,59 @@ You need at least one provider API key. You can use both for maximum flexibility

  > GPT Image 2 excels at text rendering, infographics, and diagrams. If you primarily need text-heavy images, this is the provider to use.

- ### Prerequisites
+ #### xAI (Grok Imagine)
+
+ 1. Go to [xAI Console](https://console.x.ai/)
+ 2. Sign in or create an account
+ 3. Create an API key and copy it
+
+ > Grok Imagine is the fastest and cheapest provider. Great for rapid iteration and prototyping. Fixed output resolutions (~1K) with no size control.

- - Node.js 18+
+ ### Quick start (npx)

- ### Install
+ No install needed — run directly with npx. Pass whichever API keys you have:

  ```bash
- git clone https://github.com/j-east/pixel-surgeon-mcp.git
- cd pixel-surgeon-mcp
- npm install
- npm run build
+ npx pixel-surgeon-mcp
  ```

- ### Configure your MCP client
+ #### Claude Code CLI

- Add to your Claude Code or Claude Desktop config. Include whichever API keys you have:
+ ```bash
+ claude mcp add pixel-surgeon \
+ -e GOOGLE_API_KEY=your-google-key \
+ -e OPENAI_API_KEY=your-openai-key \
+ -e XAI_API_KEY=your-xai-key \
+ -- npx pixel-surgeon-mcp
+ ```
+
+ #### Claude Desktop / MCP client config

  ```json
  {
  "mcpServers": {
  "pixel-surgeon": {
- "command": "node",
- "args": ["/path/to/pixel-surgeon-mcp/dist/index.js"],
+ "command": "npx",
+ "args": ["pixel-surgeon-mcp"],
  "env": {
  "GOOGLE_API_KEY": "your-google-api-key",
- "OPENAI_API_KEY": "your-openai-api-key"
+ "OPENAI_API_KEY": "your-openai-api-key",
+ "XAI_API_KEY": "your-xai-api-key"
  }
  }
  }
  }
  ```

- Or via the Claude Code CLI:
+ ### Install from source
+
+ If you prefer a local clone:

  ```bash
- claude mcp add pixel-surgeon \
- -e GOOGLE_API_KEY=your-google-key \
- -e OPENAI_API_KEY=your-openai-key \
- -- node /path/to/pixel-surgeon-mcp/dist/index.js
+ git clone https://github.com/j-east/pixel-surgeon-mcp.git
+ cd pixel-surgeon-mcp
+ npm install
+ npm run build
  ```

  ### Image output
@@ -192,7 +212,7 @@ Add entries to the `STYLE_PRESETS` object in `src/index.ts`. Your PR should incl

  ### Model adapters

- The server currently supports Gemini, OpenAI, and Veo 3. We'd love adapters for other image/video generation APIs — Stable Diffusion, Flux, etc. If you're interested in adding one, open an issue first so we can align on the interface.
+ The server currently supports Gemini, OpenAI, Grok Imagine, and Veo 3. We'd love adapters for other image/video generation APIs — Stable Diffusion, Flux, etc. If you're interested in adding one, open an issue first so we can align on the interface.

  ## Built by Duval Software

package/dist/index.js CHANGED
@@ -1689,8 +1689,8 @@ const STYLE_PRESETS = {
  promptPrefix: "Neo-brutalist minimalist design. Magazine editorial style layout. Off-white / cream background with bold black typography in a heavy-weight grotesque sans-serif font, slightly overlapping and breaking the grid. Accent color: muted burnt orange or terracotta used sparingly as stripe or block elements. Raw, unpolished aesthetic — visible grid lines, asymmetric layout, oversized type that bleeds off edges. Subtle halftone texture overlay. Monospaced subtext in lowercase. No gradients, no glossy effects, no heavy saturation. Clean but edgy, restrained but bold.",
  defaultAspectRatio: "4:5",
  },
- "duval-software-infographic": {
- description: "Duval Software's signature retro-futurist infographic style. 1960s Space Age optimism meets 1980s arcade aesthetics. Cathode blue, warm amber, salmon red, warm green palette. CRT scanlines, atomic-age geometry, pixel-grid accents. Great for diagrams, system overviews, and technical illustrations.",
+ "retro-futuristic-arcade": {
+ description: "Retro-futurist infographic style. 1960s Space Age optimism meets 1980s arcade aesthetics. Cathode blue, warm amber, salmon red, warm green palette. CRT scanlines, atomic-age geometry, pixel-grid accents. Great for diagrams, system overviews, and technical illustrations.",
  promptPrefix: "Neo-retro-futurism style. Blend of 1960s Space Age futurism and 1980s video game aesthetics with a modern neo-retro sensibility. Color palette: deep cathode-ray blue (#1a3a5c to #4a9eff glowing CRT blue), warm amber (#d4a017 to #ffcc44), salmon red (#e8735a to #ff6b6b), and warm muted greens (#5a8a5c to #8bbd7b). Dark background evoking a CRT monitor with subtle scanline texture and faint phosphor glow. Typography: mix of retrofuturist geometric sans-serif (like Eurostile, Microgramma, or Bank Gothic) with pixel-grid or bitmap-style secondary text. Design elements: atomic-age starbursts, orbital ellipses, rounded-rectangle pods, jet-age swooshes, and subtle 8-bit pixel patterns along borders or dividers. Faint CRT curvature vignette at edges. Thin vector grid lines receding to a vanishing point. Icons and illustrations should feel like arcade cabinet art meets Googie architecture meets NASA mission patches. Warm analog glow on all light sources — no harsh pure whites, everything filtered through amber or blue phosphor. The overall mood is optimistic, adventurous, and slightly nostalgic — a future that never was, rendered through a cathode ray tube.",
  defaultAspectRatio: "4:5",
  },
@@ -1699,8 +1699,8 @@ const STYLE_PRESETS = {
  promptPrefix: "Geometric dithered illustration style. All shading done through dithering patterns, halftone dots, and geometric cross-hatch grids — NO smooth gradients anywhere. Every surface rendered with visible pixel-level dithering like a 16-color EGA/VGA palette pushed through ordered Bayer matrix dithering. Fractal geometric patterns in the background — Sierpinski triangles, hexagonal tessellations, recursive diamond grids. Color palette: deep cathode-ray blue (#1a3a5c to #4a9eff), warm amber (#d4a017 to #ffcc44), salmon red (#e8735a), warm muted greens (#5a8a5c). Subjects built from clean geometric shapes — triangular facets, polygonal planes, like a low-poly render but flat and 2D with dithered color fills instead of smooth shading. Think: Saul Bass designed a character select screen for an Amiga game. Geometric line-art icons. Chunky retrofuturist typeface for headers, smaller geometric caps for subtitles. Horizontal scanline overlay. No photorealism, no soft shadows, no AI-gradient smoothness. Every color transition is a hard dither pattern. Clean, precise, geometric, but retro-cool.",
  defaultAspectRatio: "4:5",
  },
- "clean-tech-infographic": {
- description: "Clean technical infographic for architecture diagrams, system flows, and data pipelines. Dark navy background, cyan/electric blue glowing connection lines, geometric nodes, professional and precise.",
+ "duval-software-infographic": {
+ description: "Duval Software's clean technical infographic for architecture diagrams, system flows, and data pipelines. Dark navy background, cyan/electric blue glowing connection lines, geometric nodes, professional and precise.",
  promptPrefix: "Clean, professional technical infographic on a dark navy (#0a1628) background with subtle grid lines. Use cyan (#00d4ff) and electric blue (#4a9eff) glowing connection lines between components. White and light gray text only — no bright colors for text. Components rendered as clean geometric shapes: rounded rectangles, hexagons, circles with thin borders and subtle inner glow. Icons are minimal line-art style (server racks, phones, browsers, databases, cloud services). Typography: modern sans-serif (like Inter or SF Pro) — bold for titles, regular weight for labels, monospace for technical details (ports, protocols, versions). Layout follows clear left-to-right or top-to-bottom data flow with labeled arrows showing protocols and data formats. No decorative illustrations, no clip art, no logos, no random embellishments. Include a thin tech stack bar at the bottom. The overall feel is a polished engineering diagram you'd present to a CTO — precise, minimal, and authoritative.",
  defaultAspectRatio: "16:9",
  },
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "pixel-surgeon-mcp",
- "version": "1.1.0",
+ "version": "1.1.1",
  "mcpName": "io.github.j-east/pixel-surgeon",
  "main": "dist/index.js",
  "type": "module",
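
Note on the `STYLE_PRESETS` hunks in package/dist/index.js: the key `duval-software-infographic` is reused. In 1.1.0 it named the retro-futurist preset, which 1.1.1 calls `retro-futuristic-arcade`; in 1.1.1 it names the dark-navy technical preset that was `clean-tech-infographic`. A caller that stored preset names under 1.1.0 may therefore get a different look after upgrading. The sketch below shows one possible caller-side remap, derived only from the keys visible in this diff. `PRESET_RENAMES_1_1_1` and `migratePresetName` are hypothetical names for illustration and are not part of the package.

```typescript
// Hypothetical caller-side helper; NOT part of pixel-surgeon-mcp.
// Maps a preset key as it appeared in 1.1.0 to the key that carries the
// same promptPrefix in 1.1.1, based only on the dist/index.js diff.
const PRESET_RENAMES_1_1_1: ReadonlyMap<string, string> = new Map([
  // The retro-futurist preset kept its promptPrefix but changed key.
  ["duval-software-infographic", "retro-futuristic-arcade"],
  // The dark-navy technical preset took over the old key.
  ["clean-tech-infographic", "duval-software-infographic"],
]);

function migratePresetName(nameFrom110: string): string {
  // Single lookup only: chaining lookups would wrongly send
  // "clean-tech-infographic" on to "retro-futuristic-arcade".
  return PRESET_RENAMES_1_1_1.get(nameFrom110) ?? nameFrom110;
}

// Names not listed in the map pass through unchanged.
console.log(migratePresetName("clean-tech-infographic"));
```

The lookup is deliberately applied once per name. Because one old key is also a new key, applying the map repeatedly would map both presets to the same style.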