@houtini/gemini-mcp 1.4.2 → 2.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (171)
  1. package/README.md +314 -784
  2. package/claude_desktop_config_example.json +1 -0
  3. package/dist/config/index.d.ts.map +1 -1
  4. package/dist/config/index.js +8 -4
  5. package/dist/config/index.js.map +1 -1
  6. package/dist/config/types.d.ts +5 -0
  7. package/dist/config/types.d.ts.map +1 -1
  8. package/dist/image-viewer/image-viewer-app.html +180 -0
  9. package/dist/image-viewer/src/ui/image-viewer.html +324 -0
  10. package/dist/index-new.d.ts +3 -0
  11. package/dist/index-new.d.ts.map +1 -0
  12. package/dist/index-new.js +7 -0
  13. package/dist/index-new.js.map +1 -0
  14. package/dist/index.d.ts +3 -1
  15. package/dist/index.d.ts.map +1 -1
  16. package/dist/index.js +70 -172
  17. package/dist/index.js.map +1 -1
  18. package/dist/landing-page-viewer/src/ui/landing-page-viewer.html +330 -0
  19. package/dist/services/gemini/export.d.ts +5 -0
  20. package/dist/services/gemini/export.d.ts.map +1 -0
  21. package/dist/services/gemini/export.js +5 -0
  22. package/dist/services/gemini/export.js.map +1 -0
  23. package/dist/services/gemini/image-service.d.ts +45 -0
  24. package/dist/services/gemini/image-service.d.ts.map +1 -0
  25. package/dist/services/gemini/image-service.js +248 -0
  26. package/dist/services/gemini/image-service.js.map +1 -0
  27. package/dist/services/gemini/index.d.ts +7 -2
  28. package/dist/services/gemini/index.d.ts.map +1 -1
  29. package/dist/services/gemini/index.js +132 -56
  30. package/dist/services/gemini/index.js.map +1 -1
  31. package/dist/services/gemini/types.d.ts +32 -0
  32. package/dist/services/gemini/types.d.ts.map +1 -1
  33. package/dist/services/gemini/video-service.d.ts +58 -0
  34. package/dist/services/gemini/video-service.d.ts.map +1 -0
  35. package/dist/services/gemini/video-service.js +325 -0
  36. package/dist/services/gemini/video-service.js.map +1 -0
  37. package/dist/services/media-server.d.ts +28 -0
  38. package/dist/services/media-server.d.ts.map +1 -0
  39. package/dist/services/media-server.js +195 -0
  40. package/dist/services/media-server.js.map +1 -0
  41. package/dist/svg-viewer/src/ui/svg-viewer.html +325 -0
  42. package/dist/tools/gemini-chat.d.ts.map +1 -1
  43. package/dist/tools/gemini-chat.js +7 -1
  44. package/dist/tools/gemini-chat.js.map +1 -1
  45. package/dist/tools/gemini-deep-research.d.ts +1 -2
  46. package/dist/tools/gemini-deep-research.d.ts.map +1 -1
  47. package/dist/tools/gemini-deep-research.js +11 -51
  48. package/dist/tools/gemini-deep-research.js.map +1 -1
  49. package/dist/tools/gemini-help.d.ts +3 -0
  50. package/dist/tools/gemini-help.d.ts.map +1 -0
  51. package/dist/tools/gemini-help.js +534 -0
  52. package/dist/tools/gemini-help.js.map +1 -0
  53. package/dist/tools/gemini-prompt-assistant.d.ts +20 -0
  54. package/dist/tools/gemini-prompt-assistant.d.ts.map +1 -0
  55. package/dist/tools/gemini-prompt-assistant.js +129 -0
  56. package/dist/tools/gemini-prompt-assistant.js.map +1 -0
  57. package/dist/tools/generate-landing-page.d.ts +15 -0
  58. package/dist/tools/generate-landing-page.d.ts.map +1 -0
  59. package/dist/tools/generate-landing-page.js +66 -0
  60. package/dist/tools/generate-landing-page.js.map +1 -0
  61. package/dist/tools/generate-svg.d.ts +14 -0
  62. package/dist/tools/generate-svg.d.ts.map +1 -0
  63. package/dist/tools/generate-svg.js +106 -0
  64. package/dist/tools/generate-svg.js.map +1 -0
  65. package/dist/tools/generate-video.d.ts +24 -0
  66. package/dist/tools/generate-video.d.ts.map +1 -0
  67. package/dist/tools/generate-video.js +163 -0
  68. package/dist/tools/generate-video.js.map +1 -0
  69. package/dist/tools/image-prompt-assistant.d.ts +3 -0
  70. package/dist/tools/image-prompt-assistant.d.ts.map +1 -0
  71. package/dist/tools/image-prompt-assistant.js +790 -0
  72. package/dist/tools/image-prompt-assistant.js.map +1 -0
  73. package/dist/tools/load-image-from-path.d.ts +11 -0
  74. package/dist/tools/load-image-from-path.d.ts.map +1 -0
  75. package/dist/tools/load-image-from-path.js +100 -0
  76. package/dist/tools/load-image-from-path.js.map +1 -0
  77. package/dist/tools/prompt-library/charts.d.ts +325 -0
  78. package/dist/tools/prompt-library/charts.d.ts.map +1 -0
  79. package/dist/tools/prompt-library/charts.js +384 -0
  80. package/dist/tools/prompt-library/charts.js.map +1 -0
  81. package/dist/tools/prompt-library/index.d.ts +8 -0
  82. package/dist/tools/prompt-library/index.d.ts.map +1 -0
  83. package/dist/tools/prompt-library/index.js +10 -0
  84. package/dist/tools/prompt-library/index.js.map +1 -0
  85. package/dist/tools/register-analyze-image.d.ts +3 -0
  86. package/dist/tools/register-analyze-image.d.ts.map +1 -0
  87. package/dist/tools/register-analyze-image.js +67 -0
  88. package/dist/tools/register-analyze-image.js.map +1 -0
  89. package/dist/tools/register-chat.d.ts +3 -0
  90. package/dist/tools/register-chat.d.ts.map +1 -0
  91. package/dist/tools/register-chat.js +71 -0
  92. package/dist/tools/register-chat.js.map +1 -0
  93. package/dist/tools/register-deep-research.d.ts +3 -0
  94. package/dist/tools/register-deep-research.d.ts.map +1 -0
  95. package/dist/tools/register-deep-research.js +59 -0
  96. package/dist/tools/register-deep-research.js.map +1 -0
  97. package/dist/tools/register-describe-image.d.ts +3 -0
  98. package/dist/tools/register-describe-image.d.ts.map +1 -0
  99. package/dist/tools/register-describe-image.js +59 -0
  100. package/dist/tools/register-describe-image.js.map +1 -0
  101. package/dist/tools/register-image-gen.d.ts +3 -0
  102. package/dist/tools/register-image-gen.d.ts.map +1 -0
  103. package/dist/tools/register-image-gen.js +235 -0
  104. package/dist/tools/register-image-gen.js.map +1 -0
  105. package/dist/tools/register-landing-page.d.ts +3 -0
  106. package/dist/tools/register-landing-page.d.ts.map +1 -0
  107. package/dist/tools/register-landing-page.js +79 -0
  108. package/dist/tools/register-landing-page.js.map +1 -0
  109. package/dist/tools/register-list-models.d.ts +3 -0
  110. package/dist/tools/register-list-models.d.ts.map +1 -0
  111. package/dist/tools/register-list-models.js +33 -0
  112. package/dist/tools/register-list-models.js.map +1 -0
  113. package/dist/tools/register-load-image.d.ts +3 -0
  114. package/dist/tools/register-load-image.d.ts.map +1 -0
  115. package/dist/tools/register-load-image.js +66 -0
  116. package/dist/tools/register-load-image.js.map +1 -0
  117. package/dist/tools/register-svg.d.ts +3 -0
  118. package/dist/tools/register-svg.d.ts.map +1 -0
  119. package/dist/tools/register-svg.js +84 -0
  120. package/dist/tools/register-svg.js.map +1 -0
  121. package/dist/tools/register-video.d.ts +3 -0
  122. package/dist/tools/register-video.d.ts.map +1 -0
  123. package/dist/tools/register-video.js +118 -0
  124. package/dist/tools/register-video.js.map +1 -0
  125. package/dist/tools/register-viewers.d.ts +8 -0
  126. package/dist/tools/register-viewers.d.ts.map +1 -0
  127. package/dist/tools/register-viewers.js +89 -0
  128. package/dist/tools/register-viewers.js.map +1 -0
  129. package/dist/tools/schemas.d.ts +33 -0
  130. package/dist/tools/schemas.d.ts.map +1 -0
  131. package/dist/tools/schemas.js +39 -0
  132. package/dist/tools/schemas.js.map +1 -0
  133. package/dist/tools/types.d.ts +12 -0
  134. package/dist/tools/types.d.ts.map +1 -0
  135. package/dist/tools/types.js +2 -0
  136. package/dist/tools/types.js.map +1 -0
  137. package/dist/ui/image-viewer.d.ts +2 -0
  138. package/dist/ui/image-viewer.d.ts.map +1 -0
  139. package/dist/ui/image-viewer.js +42 -0
  140. package/dist/ui/image-viewer.js.map +1 -0
  141. package/dist/utils/chart-design-system.d.ts +92 -0
  142. package/dist/utils/chart-design-system.d.ts.map +1 -0
  143. package/dist/utils/chart-design-system.js +235 -0
  144. package/dist/utils/chart-design-system.js.map +1 -0
  145. package/dist/utils/image-compress.d.ts +9 -0
  146. package/dist/utils/image-compress.d.ts.map +1 -0
  147. package/dist/utils/image-compress.js +43 -0
  148. package/dist/utils/image-compress.js.map +1 -0
  149. package/dist/utils/image-utils.d.ts +9 -0
  150. package/dist/utils/image-utils.d.ts.map +1 -0
  151. package/dist/utils/image-utils.js +257 -0
  152. package/dist/utils/image-utils.js.map +1 -0
  153. package/dist/utils/logger.d.ts.map +1 -1
  154. package/dist/utils/logger.js +45 -11
  155. package/dist/utils/logger.js.map +1 -1
  156. package/dist/utils/resolve-images.d.ts +29 -0
  157. package/dist/utils/resolve-images.d.ts.map +1 -0
  158. package/dist/utils/resolve-images.js +56 -0
  159. package/dist/utils/resolve-images.js.map +1 -0
  160. package/dist/utils/tool-wrapper.d.ts +13 -0
  161. package/dist/utils/tool-wrapper.d.ts.map +1 -0
  162. package/dist/utils/tool-wrapper.js +22 -0
  163. package/dist/utils/tool-wrapper.js.map +1 -0
  164. package/dist/utils/video-utils.d.ts +16 -0
  165. package/dist/utils/video-utils.d.ts.map +1 -0
  166. package/dist/utils/video-utils.js +319 -0
  167. package/dist/utils/video-utils.js.map +1 -0
  168. package/dist/video-viewer/src/ui/video-viewer.html +310 -0
  169. package/houtini-logo.jpg +0 -0
  170. package/package.json +24 -8
  171. package/server.json +30 -0
package/README.md CHANGED
@@ -1,784 +1,314 @@
1
- # Gemini MCP Server
2
-
3
- [![npm version](https://img.shields.io/npm/v/@houtini/gemini-mcp.svg?style=flat-square)](https://www.npmjs.com/package/@houtini/gemini-mcp)
4
- [![MCP Registry](https://img.shields.io/badge/MCP-Registry-blue?style=flat-square)](https://registry.modelcontextprotocol.io)
5
- [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg?style=flat-square)](https://opensource.org/licenses/Apache-2.0)
6
- [![TypeScript](https://img.shields.io/badge/TypeScript-5.3-blue?style=flat-square&logo=typescript)](https://www.typescriptlang.org/)
7
- [![MCP](https://img.shields.io/badge/MCP-Compatible-green?style=flat-square)](https://modelcontextprotocol.io)
8
-
9
- A production-ready Model Context Protocol server for Google's Gemini AI models. I've built this with TypeScript and the latest MCP SDK (1.25.3), focusing on real-world reliability rather than feature bloat.
10
-
11
- ## What This Does
12
-
13
- This server connects Claude Desktop (or any MCP client) to Google's Gemini models. The integration is straightforward: chat with Gemini, get model information, and run deep research tasks with Google Search grounding built in.
14
-
15
- What I think matters here: the server discovers available models automatically from Google's API, which means you're always working with the latest releases without updating configuration files. No hardcoded model lists that go stale.
16
-
17
- ## Quick Start
18
-
19
- The simplest way to use this is with `npx` - no installation required:
20
-
21
- ```bash
22
- # Get your API key from Google AI Studio first
23
- # https://makersuite.google.com/app/apikey
24
-
25
- # Test it works (optional)
26
- npx @houtini/gemini-mcp
27
-
28
- # Add to Claude Desktop (configuration below)
29
- ```
30
-
31
- ## Installation Options
32
-
33
- ### Recommended: npx (No Installation)
34
-
35
- ```bash
36
- npx @houtini/gemini-mcp
37
- ```
38
-
39
- This approach pulls the latest version automatically. I prefer this because you don't clutter your system with global packages, and updates happen transparently.
40
-
41
- ### Alternative: Global Installation
42
-
43
- ```bash
44
- npm install -g @houtini/gemini-mcp
45
- gemini-mcp
46
- ```
47
-
48
- ### Alternative: Local Project
49
-
50
- ```bash
51
- npm install @houtini/gemini-mcp
52
- npx @houtini/gemini-mcp
53
- ```
54
-
55
- ### From Source (Developers)
56
-
57
- ```bash
58
- git clone https://github.com/houtini-ai/gemini-mcp.git
59
- cd gemini-mcp
60
- npm install
61
- npm run build
62
- npm start
63
- ```
64
-
65
- ## Configuration
66
-
67
- ### Step 1: Get Your API Key
68
-
69
- Visit [Google AI Studio](https://makersuite.google.com/app/apikey) to create a free API key. This takes about 30 seconds.
70
-
71
- ### Step 2: Configure Claude Desktop
72
-
73
- Add this to your Claude Desktop config file:
74
-
75
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
76
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
77
-
78
- #### Using npx (Recommended)
79
-
80
- ```json
81
- {
82
- "mcpServers": {
83
- "gemini": {
84
- "command": "npx",
85
- "args": ["@houtini/gemini-mcp"],
86
- "env": {
87
- "GEMINI_API_KEY": "your-api-key-here"
88
- }
89
- }
90
- }
91
- }
92
- ```
93
-
94
- #### Using Global Installation
95
-
96
- ```json
97
- {
98
- "mcpServers": {
99
- "gemini": {
100
- "command": "gemini-mcp",
101
- "env": {
102
- "GEMINI_API_KEY": "your-api-key-here"
103
- }
104
- }
105
- }
106
- }
107
- ```
108
-
109
- Requires `npm install -g @houtini/gemini-mcp` first.
110
-
111
- #### Using Local Build
112
-
113
- ```json
114
- {
115
- "mcpServers": {
116
- "gemini": {
117
- "command": "node",
118
- "args": ["./node_modules/@houtini/gemini-mcp/dist/index.js"],
119
- "env": {
120
- "GEMINI_API_KEY": "your-api-key-here"
121
- }
122
- }
123
- }
124
- }
125
- ```
126
-
127
- Only works if installed locally in the current directory.
128
-
129
- ### Step 3: Restart Claude Desktop
130
-
131
- After updating the config, restart Claude Desktop. The server loads on startup.
132
-
133
- ### Optional: Additional Configuration
134
-
135
- ```json
136
- {
137
- "mcpServers": {
138
- "gemini": {
139
- "command": "npx",
140
- "args": ["@houtini/gemini-mcp"],
141
- "env": {
142
- "GEMINI_API_KEY": "your-api-key-here",
143
- "LOG_LEVEL": "info",
144
- "GEMINI_ALLOW_EXPERIMENTAL": "false"
145
- }
146
- }
147
- }
148
- }
149
- ```
150
-
151
- **Environment Variables:**
152
-
153
- | Variable | Default | What It Does |
154
- |----------|---------|--------------|
155
- | `GEMINI_API_KEY` | *required* | Your Google AI Studio API key |
156
- | `LOG_LEVEL` | `info` | Logging detail: `debug`, `info`, `warn`, `error` |
157
- | `GEMINI_ALLOW_EXPERIMENTAL` | `false` | Include experimental models (set `true` to enable) |
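The environment variable table above could be read into a typed configuration object roughly as follows. This is a minimal sketch, not the package's actual config loader: `loadConfig`, `ServerConfig`, and the choice to fall back to `info` for an unrecognised `LOG_LEVEL` are assumptions made for illustration. Only the variable names, defaults, and allowed log levels come from the table.

```typescript
// Hypothetical helper: the names below are illustrative, not the package's API.
type LogLevel = "debug" | "info" | "warn" | "error";

const LOG_LEVELS: readonly LogLevel[] = ["debug", "info", "warn", "error"];

interface ServerConfig {
  apiKey: string;
  logLevel: LogLevel;
  allowExperimental: boolean;
}

// Takes the environment as a plain object so it can be exercised without Node globals.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiKey = env["GEMINI_API_KEY"];
  if (apiKey === undefined || apiKey === "") {
    // GEMINI_API_KEY is the only required variable.
    throw new Error("GEMINI_API_KEY environment variable not set");
  }

  // LOG_LEVEL defaults to "info"; unknown values also fall back to "info" (assumption).
  const requested = env["LOG_LEVEL"] ?? "info";
  const logLevel = LOG_LEVELS.find((level) => level === requested) ?? "info";

  // Experimental models are opt-in: only the literal string "true" enables them.
  const allowExperimental = env["GEMINI_ALLOW_EXPERIMENTAL"] === "true";

  return { apiKey, logLevel, allowExperimental };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`.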
158
-
159
- ## Dynamic Model Discovery
160
-
161
- The server automatically discovers available Gemini models from Google's API on first use. This happens transparently - you don't need to configure anything.
162
-
163
- ### How It Works
164
-
165
- 1. Server starts instantly with reliable fallback models
166
- 2. First request triggers model discovery from Google's API (adds 1-2 seconds once)
167
- 3. Subsequent requests use the discovered models (no delay)
168
- 4. If discovery fails, fallback models work immediately
169
-
170
- What I've found: this approach keeps you current with Google's releases whilst maintaining instant startup. The server filters to stable production models by default, which avoids experimental model rate limits.
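The four steps above (instant start on fallback models, one-time discovery on first request, cached results afterwards, fallback on failure) can be sketched as a small lazy cache. This is an illustration under stated assumptions, not the server's implementation: `ModelCatalog`, `FALLBACK_MODELS`, and the injected `fetchModels` function are hypothetical names, and the real discovery call to Google's API is deliberately left out.

```typescript
// Hypothetical sketch: none of these names are taken from the package.
type FetchModels = () => Promise<string[]>;

// Assumed fallback list, using stable model names mentioned in the README.
const FALLBACK_MODELS: readonly string[] = ["gemini-2.5-flash", "gemini-2.5-pro"];

class ModelCatalog {
  private models: readonly string[] = FALLBACK_MODELS;
  private discovery: Promise<void> | null = null;
  private readonly fetchModels: FetchModels;

  // The constructor does no network work, so startup stays instant.
  constructor(fetchModels: FetchModels) {
    this.fetchModels = fetchModels;
  }

  // The first call triggers discovery once; later calls reuse the settled promise.
  async list(): Promise<readonly string[]> {
    if (this.discovery === null) {
      this.discovery = this.fetchModels()
        .then((found) => {
          if (found.length > 0) {
            this.models = found;
          }
        })
        .catch(() => {
          // Discovery failed: keep serving the fallback list.
        });
    }
    await this.discovery;
    return this.models;
  }
}
```

Injecting `fetchModels` keeps the sketch independent of any particular Gemini client. A failed discovery is never retried here, which is a simplification.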
171
-
172
- ### What Gets Discovered
173
-
174
- - All available Gemini models (stable and experimental)
175
- - Accurate context window sizes directly from Google
176
- - Model capabilities and recommended use cases
177
- - Latest releases as soon as Google makes them available
178
-
179
- The default model selection prioritises: stable models over experimental, newest version available, Flash variants for speed, and capability matching for your request type.
180
-
181
- ### Performance Impact
182
-
183
- - Startup: 0ms (instant)
184
- - First request: +1-2 seconds (one-time discovery)
185
- - Subsequent requests: 0ms overhead
186
- - Discovery failure: 0ms (uses fallback immediately)
187
-
188
- Check your logs after first request to see what was discovered:
189
- ```
190
- Models discovered from API (count: 38, defaultModel: gemini-2.5-flash)
191
- ```
192
-
193
- ## Experimental Models
194
-
195
- By default, the server uses stable production models. This ensures reliable performance and avoids Google's stricter rate limits on experimental releases.
196
-
197
- ### Stable vs Experimental
198
-
199
- **Stable Models** (default behaviour):
200
- - Production-ready
201
- - Better rate limits
202
- - Consistent performance
203
- - Examples: `gemini-2.5-flash`, `gemini-2.5-pro`, `gemini-2.0-flash`
204
-
205
- **Experimental Models** (opt-in):
206
- - Latest features before stable release
207
- - Stricter rate limits
208
- - Potentially unexpected behaviour
209
- - Can be deprecated quickly
210
- - Examples: `gemini-exp-1206`, `gemini-2.0-flash-thinking-exp`
211
-
212
- ### Enabling Experimental Models
213
-
214
- Set `GEMINI_ALLOW_EXPERIMENTAL=true` in your configuration:
215
-
216
- ```json
217
- {
218
- "mcpServers": {
219
- "gemini": {
220
- "command": "npx",
221
- "args": ["@houtini/gemini-mcp"],
222
- "env": {
223
- "GEMINI_API_KEY": "your-api-key-here",
224
- "GEMINI_ALLOW_EXPERIMENTAL": "true"
225
- }
226
- }
227
- }
228
- }
229
- ```
230
-
231
- This includes experimental models in discovery and makes them eligible as defaults. You can still explicitly request any model regardless of this setting - the flag only affects which models are used automatically.
232
-
233
- ### When to Enable
234
-
235
- Keep experimental disabled if you need reliable, consistent performance or you're building production applications.
236
-
237
- Enable experimental if you're testing cutting-edge features, doing research, or you understand the rate limit trade-offs.
238
-
239
- ## Usage Examples
240
-
241
- ### Basic Chat
242
-
243
- ```
244
- Can you help me understand quantum computing using Gemini?
245
- ```
246
-
247
- Claude automatically uses the `gemini_chat` tool.
248
-
249
- ### Creative Writing
250
-
251
- ```
252
- Use Gemini to write a short story about artificial intelligence discovering creativity.
253
- ```
254
-
255
- ### Technical Analysis
256
-
257
- ```
258
- Use Gemini Pro to explain the differences between various machine learning algorithms.
259
- ```
260
-
261
- ### Model Selection
262
-
263
- ```
264
- Use Gemini 1.5 Pro to analyse this code and suggest improvements.
265
- ```
266
-
267
- ### Getting Model Information
268
-
269
- ```
270
- Show me all available Gemini models and their capabilities.
271
- ```
272
-
273
- ---
274
-
275
- ## Complete Prompting Guide
276
-
277
- Check the **[Comprehensive Prompting Guide](PROMPTING_GUIDE.md)** for:
278
-
279
- - Advanced prompting techniques
280
- - Model selection strategies
281
- - Parameter tuning (temperature, tokens, system prompts)
282
- - Using Google Search grounding
283
- - Creative workflows and use cases
284
- - Best practices
285
- - Troubleshooting
286
-
287
- **[Read the Prompting Guide](PROMPTING_GUIDE.md)**
288
-
289
- ---
290
-
291
- ## Google Search Grounding
292
-
293
- Google Search grounding is built in and enabled by default. This gives Gemini models access to current web information, which significantly improves accuracy for questions requiring up-to-date data.
294
-
295
- ### What It Does
296
-
297
- When you ask a question that benefits from current information:
298
- 1. Analyses your query to determine if web search helps
299
- 2. Generates relevant search queries automatically
300
- 3. Performs Google searches using targeted queries
301
- 4. Processes results and synthesises information
302
- 5. Provides enhanced response with inline citations
303
- 6. Shows search metadata including queries used
304
-
305
- ### Best Use Cases
306
-
307
- **Current Events & News**
308
- ```
309
- What are the latest developments in AI announced this month?
310
- Recent breakthroughs in quantum computing research?
311
- ```
312
-
313
- **Real-time Data**
314
- ```
315
- Current stock prices for major tech companies
316
- Today's weather forecast for London
317
- ```
318
-
319
- **Recent Developments**
320
- ```
321
- New software releases this week
322
- Latest scientific discoveries in medicine
323
- ```
324
-
325
- **Fact Checking**
326
- ```
327
- Verify recent statements about climate change
328
- Check the latest statistics on global internet usage
329
- ```
330
-
331
- ### Controlling Grounding
332
-
333
- Grounding is enabled by default. Disable it for purely creative or hypothetical responses:
334
-
335
- ```
336
- Use Gemini without web search to write a fictional story about dragons in space.
337
- ```
338
-
339
- For API calls, use the `grounding` parameter:
340
-
341
- ```json
342
- {
343
- "message": "Write a creative story about time travel",
344
- "grounding": false
345
- }
346
- ```
347
-
348
- ### Understanding Grounded Responses
349
-
350
- Grounded responses include source citations and search transparency:
351
-
352
- ```
353
- Sources: (https://example.com/article1) (https://example.com/article2)
354
- Search queries used: latest AI developments 2025, OpenAI GPT-5 release
355
- ```
356
-
357
- What I've found: grounding dramatically reduces hallucinations for factual queries whilst maintaining creative flexibility when you need it.
358
-
359
- ## Deep Research
360
-
361
- The server includes deep research capability that performs iterative multi-step research on complex topics. This synthesises comprehensive reports with proper citations.
362
-
363
- ### How It Works
364
-
365
- Deep research conducts multiple research iterations:
366
-
367
- 1. Initial broad exploration
368
- 2. Gap analysis identifying what's missing
369
- 3. Targeted research into specific areas
370
- 4. Synthesis into comprehensive report
371
- 5. Iteration until thorough coverage
372
-
373
- ### Using Deep Research
374
-
375
- ```
376
- Use Gemini deep research to investigate the impact of quantum computing on cybersecurity.
377
- ```
378
-
379
- With parameters:
380
- ```
381
- Use Gemini deep research with 7 iterations to create a comprehensive report on renewable energy trends, focusing on solar and wind power adoption rates.
382
- ```
383
-
384
- ### Research Parameters
385
-
386
- | Parameter | Type | Default | What It Does |
387
- |-----------|------|---------|--------------|
388
- | `research_question` | string | *required* | The topic to investigate |
389
- | `max_iterations` | integer | 5 | Research cycles (3-10) |
390
- | `focus_areas` | array | - | Specific aspects to emphasise |
391
- | `model` | string | *latest stable* | Which model to use |
392
-
393
- ### Best For
394
-
395
- - Academic research and literature reviews
396
- - Market analysis and competitive intelligence
397
- - Technology trend analysis
398
- - Policy research and impact assessments
399
- - Multi-faceted business problems
400
-
401
- ### Configuring Iterations by Environment
402
-
403
- Different AI environments have different timeout tolerances:
404
-
405
- **Claude Desktop (3-5 iterations recommended)**
406
- - Timeout: ~4 minutes
407
- - Safe maximum: 5 iterations
408
- - Use 3-4 for most tasks
409
-
410
- **Agent SDK / IDEs (7-10 iterations recommended)**
411
- - Timeout: 10+ minutes
412
- - Maximum: 10 iterations
413
- - Use 7-10 for comprehensive research
414
-
415
- **AI Platforms like Cline, Roo-Cline (7-10 iterations)**
416
- - Similar to Agent SDK
417
- - Can handle longer processes
418
-
419
- ### Handling Timeouts
420
-
421
- If you hit timeout or thread limits:
422
-
423
- 1. Reduce iterations (start with 3)
424
- 2. Narrow focus using `focus_areas` parameter
425
- 3. Split complex topics into smaller research tasks
426
- 4. Check which environment you're using
427
-
428
- Example with focused research:
429
- ```
430
- Use Gemini deep research with 3 iterations focusing on cost analysis and market adoption to examine solar panel technology trends.
431
- ```
432
-
433
- Deep research takes several minutes. It's designed for comprehensive analysis rather than quick answers.
434
-
435
- ## API Reference
436
-
437
- ### gemini_chat
438
-
439
- Chat with Gemini models.
440
-
441
- **Parameters:**
442
-
443
- | Parameter | Type | Required | Default | What It Does |
444
- |-----------|------|----------|---------|--------------|
445
- | `message` | string | Yes | - | The message to send |
446
- | `model` | string | No | *Latest stable* | Which model to use |
447
- | `temperature` | number | No | 0.7 | Randomness (0.0-1.0) |
448
- | `max_tokens` | integer | No | 8192 | Maximum response length (1-32768) |
449
- | `system_prompt` | string | No | - | System instruction |
450
- | `grounding` | boolean | No | true | Enable Google Search |
451
-
452
- **Example:**
453
- ```json
454
- {
455
- "message": "What are the latest developments in quantum computing?",
456
- "model": "gemini-1.5-pro",
457
- "temperature": 0.5,
458
- "max_tokens": 1000,
459
- "system_prompt": "You are a technology expert. Provide current information with sources.",
460
- "grounding": true
461
- }
462
- ```
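The defaults and ranges in the `gemini_chat` parameter table can be expressed as a small normalisation step. The README's technical notes mention Zod schemas for validation. The sketch below uses plain TypeScript instead so it stays self-contained, and `normalizeChatArgs` and its interfaces are hypothetical names rather than the package's own code.

```typescript
// Hypothetical sketch of the gemini_chat parameter rules; not the package's schema.
interface ChatArgs {
  message: string;
  model?: string;
  temperature?: number;
  max_tokens?: number;
  system_prompt?: string;
  grounding?: boolean;
}

interface NormalizedChatArgs extends ChatArgs {
  temperature: number;
  max_tokens: number;
  grounding: boolean;
}

function normalizeChatArgs(args: ChatArgs): NormalizedChatArgs {
  if (args.message.trim() === "") {
    throw new RangeError("message is required");
  }

  // Default 0.7, allowed range 0.0-1.0. Written this way so NaN is rejected too.
  const temperature = args.temperature ?? 0.7;
  if (!(temperature >= 0 && temperature <= 1)) {
    throw new RangeError("temperature must be between 0.0 and 1.0");
  }

  // Default 8192, allowed range 1-32768, integers only.
  const maxTokens = args.max_tokens ?? 8192;
  if (!Number.isInteger(maxTokens) || maxTokens < 1 || maxTokens > 32768) {
    throw new RangeError("max_tokens must be an integer between 1 and 32768");
  }

  // Grounding (Google Search) is enabled unless explicitly disabled.
  return { ...args, temperature, max_tokens: maxTokens, grounding: args.grounding ?? true };
}
```

Leaving `model` undefined mirrors the table's "latest stable" default, which would be resolved elsewhere.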
463
-
464
- ### gemini_list_models
465
-
466
- Retrieve information about discovered Gemini models.
467
-
468
- **Parameters:** None required
469
-
470
- **Example:**
471
- ```json
472
- {}
473
- ```
474
-
475
- **Response includes:**
476
- - Model names and display names
477
- - Descriptions of strengths
478
- - Context window sizes from Google
479
- - Recommended use cases
480
-
481
- ### gemini_deep_research
482
-
483
- Conduct iterative multi-step research.
484
-
485
- **Parameters:**
486
-
487
- | Parameter | Type | Required | Default | What It Does |
488
- |-----------|------|----------|---------|--------------|
489
- | `research_question` | string | Yes | - | Topic to research |
490
- | `max_iterations` | integer | No | 5 | Research cycles (3-10) |
491
- | `focus_areas` | array | No | - | Specific areas to emphasise |
492
- | `model` | string | No | *Latest stable* | Model to use |
493
-
494
- **Example:**
495
- ```json
496
- {
497
- "research_question": "Impact of AI on healthcare diagnostics",
498
- "max_iterations": 7,
499
- "focus_areas": ["accuracy improvements", "cost implications", "regulatory challenges"]
500
- }
501
- ```
502
-
503
- ### Available Models
504
-
505
- Models are dynamically discovered from Google's API. Typical available models:
506
-
507
- | Model | Best For | Description |
508
- |-------|----------|-------------|
509
- | **gemini-2.5-flash** | General use | Latest Flash - fast, versatile |
510
- | **gemini-2.5-pro** | Complex reasoning | Latest Pro - advanced capabilities |
511
- | **gemini-2.0-flash** | Speed-optimised | Gemini 2.0 Flash - efficient |
512
- | **gemini-1.5-flash** | Quick responses | Gemini 1.5 Flash - fast |
513
- | **gemini-1.5-pro** | Large context | 2M token context window |
514
-
515
- Use `gemini_list_models` to see exact available models with current context limits.
516
-
517
- ## Development
518
-
519
- ### Building from Source
520
-
521
- ```bash
522
- git clone https://github.com/houtini-ai/gemini-mcp.git
523
- cd gemini-mcp
524
- npm install
525
- npm run build
526
- npm run dev
527
- ```
528
-
529
- ### Scripts
530
-
531
- | Command | What It Does |
532
- |---------|--------------|
533
- | `npm run build` | Compile TypeScript |
534
- | `npm run dev` | Development mode with live reload |
535
- | `npm start` | Run compiled server |
536
- | `npm test` | Run tests |
537
- | `npm run lint` | Check code style |
538
- | `npm run lint:fix` | Fix linting issues |
539
-
540
- ### Project Structure
541
-
542
- ```
543
- src/
544
- ├── config/ # Configuration management
545
- ├── services/ # Business logic
546
- │ └── gemini/ # Gemini API integration
547
- ├── tools/ # MCP tool implementations
548
- ├── utils/ # Logger and error handling
549
- ├── cli.ts # CLI entry
550
- └── index.ts # Main server
551
- ```
552
-
553
- ### Architecture
554
-
555
- The server follows clean, layered architecture:
556
-
557
- 1. CLI Layer - Command-line interface
558
- 2. Server Layer - MCP protocol handling
559
- 3. Tools Layer - MCP tool implementations
560
- 4. Service Layer - Business logic and API integration
561
- 5. Utility Layer - Logging and error handling
562
-
563
- ## Troubleshooting
564
-
565
- ### "GEMINI_API_KEY environment variable not set"
566
-
567
- Check your Claude Desktop configuration includes the API key in the `env` section.
568
-
569
- ### Server Not Appearing in Claude Desktop
570
-
571
- 1. Restart Claude Desktop after configuration changes
572
- 2. Verify config file path:
573
- - Windows: `%APPDATA%\Claude\claude_desktop_config.json`
574
- - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
575
- 3. Validate JSON syntax
576
- 4. Test your API key at [Google AI Studio](https://makersuite.google.com/app/apikey)
577
-
578
- ### "Module not found" with npx
579
-
580
- ```bash
581
- # Clear npx cache
582
- npx --yes @houtini/gemini-mcp
583
-
584
- # Or install globally
585
- npm install -g @houtini/gemini-mcp
586
- ```
587
-
588
- ### Node.js Version Issues
589
-
590
- ```bash
591
- # Check version
592
- node --version
593
-
594
- # Should be v18.0.0 or higher
595
- # Update from https://nodejs.org
596
- ```
597
-
598
- ### Debug Mode
599
-
600
- Enable detailed logging:
601
-
602
- ```json
603
- {
604
- "mcpServers": {
605
- "gemini": {
606
- "command": "npx",
607
- "args": ["@houtini/gemini-mcp"],
608
- "env": {
609
- "GEMINI_API_KEY": "your-api-key-here",
610
- "LOG_LEVEL": "debug"
611
- }
612
- }
613
- }
614
- }
615
- ```
616
-
617
- ### Log Files
618
-
619
- Logs are written to:
620
- - Console output (Claude Desktop developer tools)
621
- - `logs/combined.log` - All levels
622
- - `logs/error.log` - Errors only
623
-
624
- ### Testing Your Setup
625
-
626
- Test with these queries:
627
- 1. "Can you list the available Gemini models?"
628
- 2. "Use Gemini to explain photosynthesis."
629
- 3. "Use Gemini 1.5 Pro with temperature 0.9 to write a creative poem about coding."
630
-
631
- ### Performance Tuning
632
-
633
- For better performance:
634
-
635
- - Adjust token limits based on your use case
636
- - Use appropriate models (Flash for speed, Pro for complexity)
637
- - Monitor logs for rate limiting issues
638
- - Set temperature values appropriately (0.7 balanced, 0.3 focused, 0.9 creative)
639
-
640
- ## Contributing
641
-
642
- Contributions welcome. Follow these steps:
643
-
644
- 1. Fork the repository
645
- 2. Create a feature branch: `git checkout -b feature/amazing-feature`
646
- 3. Make your changes and add tests
647
- 4. Run tests: `npm test`
648
- 5. Lint: `npm run lint:fix`
649
- 6. Build: `npm run build`
650
- 7. Commit: `git commit -m 'Add amazing feature'`
651
- 8. Push: `git push origin feature/amazing-feature`
652
- 9. Open a Pull Request
653
-
654
- ### Development Guidelines
655
-
656
- - Follow TypeScript best practices
657
- - Add tests for new functionality
658
- - Update documentation
659
- - Use conventional commit messages
660
- - Maintain backwards compatibility
661
-
662
- ## Technical Details
663
-
664
- ### Migration to MCP SDK 1.25.3
665
-
666
- This server has been migrated to the latest MCP SDK (1.25.3) with ES modules support. Key technical changes:
667
-
668
- **SDK Updates:**
669
- - Migrated from `Server` class to `McpServer` API
670
- - Tool registration uses `registerTool` with Zod validation
671
- - ES modules throughout (`"type": "module"`)
672
- - TypeScript configured for `nodenext` module resolution
673
-
674
- **Compatibility:**
675
- - Node.js 18+ (changed from 24+ for broader compatibility)
676
- - All imports use `.js` extensions for ES module compliance
677
- - Zod schemas for runtime type validation
678
- - Modern MCP protocol implementation
679
-
680
- **Build System:**
681
- - TypeScript compiles to ES2022 modules
682
- - Clean separation between business logic and MCP interface
683
- - Preserved all Gemini API client functionality
684
-
685
- What this means practically: the server now follows modern Node.js and MCP standards, which should prevent compatibility issues with future Claude Desktop updates whilst maintaining all existing functionality.
686
-
687
- ## Licence
688
-
689
- This project is licensed under the Apache 2.0 Licence - see the [LICENSE](LICENSE) file for details.
690
-
691
- ## Disclaimer
692
-
693
- **Use at Your Own Risk**: This software is provided "as is" without warranty. The authors accept no responsibility for damages, data loss, or other issues arising from use.
694
-
695
- **Content Safety**: This server interfaces with Google's Gemini AI models. Whilst content safety settings are implemented, AI-generated content quality cannot be guaranteed. Users are responsible for reviewing AI output before use and ensuring compliance with applicable laws.
696
-
697
- **API Key Security**: Your Google Gemini API key is sensitive. Keep it confidential, don't commit it to version control, rotate if exposed, and manage API usage costs.
698
-
699
- **Data Privacy**: This server processes data through the Model Context Protocol. Avoid sending sensitive or confidential information. Review Google's privacy policy and implement appropriate data handling.
700
-
701
- **Production Use**: Users deploying in production should conduct security audits, implement monitoring, have incident response procedures, and regularly update dependencies.
702
-
703
- **Third-Party Services**: This software relies on external services (Google Gemini API, npm packages). Service availability, pricing, and functionality may change.
704
-
705
- **No Professional Advice**: AI-generated content should not be considered professional advice (legal, medical, financial) without verification by qualified professionals.
706
-
707
- By using this software, you acknowledge these terms and agree to use at your own risk.
708
-
709
- ## Support
710
-
711
- - **GitHub Issues**: [Report bugs or request features](https://github.com/houtini-ai/gemini-mcp/issues)
712
- - **GitHub Discussions**: [Ask questions or share ideas](https://github.com/houtini-ai/gemini-mcp/discussions)
713
-
714
- ## Changelog
715
-
716
- ### v1.3.2 - Node.js 18+ Compatibility & Modern SDK
717
-
718
- **Breaking Changes:** None (all tool interfaces preserved)
719
-
720
- **Technical Updates:**
721
- - Updated to MCP SDK 1.25.3 (from 1.19.1)
722
- - Migrated to ES modules (`"type": "module"`)
723
- - Changed Node.js requirement to >=18.0.0 (from >=24.0.0) for broader compatibility
724
- - Migrated from `Server` to `McpServer` API
725
- - Implemented Zod schema validation for all tools
726
- - Updated TypeScript config to `nodenext` module resolution
727
-
728
- **Fixes:**
729
- - Resolved Node.js v24 ERR_MODULE_NOT_FOUND errors
730
- - Fixed TypeScript compilation with DOM types for fetch API
731
- - All imports now use `.js` extensions for ES module compliance
732
-
733
- **What This Means:**
734
- The server now works reliably with Node.js 18, 20, 22, and 24. All existing functionality preserved - this is purely a technical infrastructure update for better compatibility.
735
-
736
- ### v1.1.0 - Deep Research & Enhanced Discovery
737
-
738
- **New Features:**
739
- - Added deep research capability for iterative analysis
740
- - Enhanced model discovery with better filtering
741
- - Improved default model selection logic
742
- - Better handling of experimental vs stable models
743
-
744
- ### v1.0.4 - Security & Dependencies
745
-
746
- **Updates:**
747
- - Updated @google/generative-ai to v0.24.1
748
- - Updated @modelcontextprotocol/sdk to v1.19.1
749
- - Changed safety settings to BLOCK_MEDIUM_AND_ABOVE
750
- - Added comprehensive disclaimer
751
- - Zero vulnerabilities in dependencies
752
-
753
- ### v1.0.3 - Enhanced Grounding
754
-
755
- **Improvements:**
756
- - Fixed grounding metadata field names
757
- - Enhanced source citation processing
758
- - Improved grounding reliability
759
- - Better error handling for grounding
760
-
761
- ### v1.0.2 - Google Search Grounding
762
-
763
- **New Features:**
764
- - Added Google Search grounding (enabled by default)
765
- - Real-time web search integration
766
- - Source citations in responses
767
- - Configurable grounding parameter
768
-
769
- ### v1.0.0 - Initial Release
770
-
771
- **Core Features:**
772
- - Complete TypeScript rewrite
773
- - Professional modular architecture
774
- - Comprehensive error handling
775
- - Full MCP protocol compliance
776
- - Multiple Gemini model support
777
- - NPM package distribution
778
- - Production-ready build system
779
-
780
- ---
781
-
782
- **Built for the Model Context Protocol community**
783
-
784
- For more about MCP, visit [modelcontextprotocol.io](https://modelcontextprotocol.io)
1
+ # @houtini/gemini-mcp
2
+
3
+ [![npm version](https://img.shields.io/npm/v/@houtini/gemini-mcp.svg?style=flat-square)](https://www.npmjs.com/package/@houtini/gemini-mcp)
4
+ [![MCP Registry](https://img.shields.io/badge/MCP-Registry-blue?style=flat-square)](https://registry.modelcontextprotocol.io)
5
+
6
+ **I've been running this MCP server in my Claude Desktop setup for several months, and it's one of the few I leave enabled permanently.** Not because Gemini replaces Claude -- it doesn't -- but because grounded search, deep research, image generation, and video are things Gemini does well. Having them as tools inside Claude beats switching between browser tabs.
7
+
8
+ Thirteen tools. One `npx` command.
9
+
10
+ ### MCP App previews
11
+
12
+ Generated images and diagrams render inline in Claude Desktop with zoom controls, file paths, and prompt context:
13
+
14
+ | Image generation | SVG / diagram generation |
15
+ |:---:|:---:|
16
+ | ![Image preview in MCP App](image-preview-mcp-app.jpg) | ![Diagram preview in MCP App](diagram-preview-mcp-app.jpg) |
17
+
18
+ ---
19
+
20
+ ## Get started in two minutes
21
+
22
+ **Step 1: Get a Gemini API key**
23
+
24
+ Go to [Google AI Studio](https://aistudio.google.com/apikey) and create one. The free tier covers most development use -- you'll hit rate limits on deep research if you're hammering it, but for day-to-day work it's fine.
25
+
26
+ **Step 2: Add to your Claude Desktop config**
27
+
28
+ Config file locations:
29
+ - Windows: `C:\Users\{username}\AppData\Roaming\Claude\claude_desktop_config.json`
30
+ - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
31
+
32
+ ```json
33
+ {
34
+   "mcpServers": {
35
+     "gemini": {
36
+       "command": "npx",
37
+       "args": ["@houtini/gemini-mcp"],
38
+       "env": {
39
+         "GEMINI_API_KEY": "your-api-key-here"
40
+       }
41
+     }
42
+   }
43
+ }
44
+ ```
45
+
46
+ **Step 3: Restart Claude Desktop**
47
+
48
+ That's it. The tools show up automatically. `npx` pulls the package on first run -- no separate install.
49
+
50
+ ### Local build instead
51
+
52
+ For development, or if you'd rather not rely on npx:
53
+
54
+ ```bash
55
+ git clone https://github.com/houtini-ai/gemini-mcp
56
+ cd gemini-mcp
57
+ npm install --include=dev
58
+ npm run build
59
+ ```
60
+
61
+ Then point your config at the local build:
62
+
63
+ ```json
64
+ {
65
+   "mcpServers": {
66
+     "gemini": {
67
+       "command": "node",
68
+       "args": ["C:/path/to/gemini-mcp/dist/index.js"],
69
+       "env": {
70
+         "GEMINI_API_KEY": "your-api-key-here"
71
+       }
72
+     }
73
+   }
74
+ }
75
+ ```
76
+
77
+ ---
78
+
79
+ ## What it does
80
+
81
+ ### Chat with Google Search grounding
82
+
83
+ ```
84
+ Use gemini:gemini_chat to ask: "What changed in the MCP spec in the last month?"
85
+ ```
86
+
87
+ Grounding is on by default. Gemini searches Google before answering, so you get current information rather than answers limited by the model's training-data cutoff. Sources come back as markdown links.
88
+
89
+ For questions where you want pure reasoning rather than a live search -- "explain this code" or similar -- set `grounding: false`.
90
+
91
+ Supports `thinking_level` on Gemini 3 models: `high` for maximum reasoning depth, `low` to keep it fast, `medium`/`minimal` on Gemini 3 Flash only.
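The model restriction on `thinking_level` can be written as a small check. This is a minimal TypeScript sketch of the rule as stated above, not the server's actual validation code: `isThinkingLevelAllowed` is a hypothetical helper, and detecting Gemini 3 and Flash models by name prefix and substring is an assumption made for illustration.

```typescript
type ThinkingLevel = "minimal" | "low" | "medium" | "high";

// Hypothetical helper: true when the README's stated rule permits this level on this model.
function isThinkingLevelAllowed(model: string, level: ThinkingLevel): boolean {
  // Assumption: Gemini 3 family models are recognisable by their "gemini-3" name prefix.
  if (!model.startsWith("gemini-3")) return false;
  // "high" and "low" are described as available on Gemini 3 models generally.
  if (level === "high" || level === "low") return true;
  // "medium" and "minimal" are described as Gemini 3 Flash only.
  return model.includes("flash");
}
```

With the model names listed in this README, `medium` would be rejected for `gemini-3.1-pro-preview` and accepted for `gemini-3-flash-preview`.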
92
+
93
+ ### Deep research
94
+
95
+ ```
96
+ Use gemini:gemini_deep_research with:
97
+ research_question="What are the current approaches to AI agent memory management?"
98
+ max_iterations=5
99
+ ```
100
+
101
+ Runs multiple grounded search iterations, then synthesises a full report. Takes 2-5 minutes depending on complexity. Worth it for anything where you need comprehensive coverage rather than a quick answer.
102
+
103
+ Set `max_iterations` to 3-4 in Claude Desktop (4-minute tool timeout). In IDEs (Cursor, Windsurf, VS Code) or agent frameworks with longer timeout tolerance, 7-10 iterations produce noticeably better synthesis. Pass `focus_areas` as an array to steer toward specific angles.
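The iteration guidance above can be sketched as a helper that builds the tool arguments and caps `max_iterations` per client. The parameter names `research_question`, `max_iterations`, and `focus_areas` come from this README. The `Client` type, the `buildDeepResearchArgs` function, and the caps of 4 and 10 (the upper ends of the ranges quoted above) are illustrative assumptions, not part of the package.

```typescript
type Client = "claude-desktop" | "ide";

interface DeepResearchArgs {
  research_question: string;
  max_iterations: number;
  focus_areas?: string[];
}

// Hypothetical helper: clamps the requested iteration count to a per-client cap.
function buildDeepResearchArgs(
  question: string,
  client: Client,
  requestedIterations: number,
  focusAreas: string[] = []
): DeepResearchArgs {
  // Assumed caps: 4 for Claude Desktop (short tool timeout), 10 elsewhere.
  const cap = client === "claude-desktop" ? 4 : 10;
  const max_iterations = Math.max(1, Math.min(requestedIterations, cap));
  const args: DeepResearchArgs = { research_question: question, max_iterations };
  // Only include focus_areas when the caller supplied at least one.
  if (focusAreas.length > 0) args.focus_areas = focusAreas;
  return args;
}
```

Asking for 8 iterations from Claude Desktop would be reduced to 4 under these assumed caps, while the same request from an IDE would pass through unchanged.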
104
+
105
+ ### Image generation with search grounding
106
+
107
+ ```
108
+ Use gemini:generate_image with:
109
+ prompt="Stock price chart showing Apple (AAPL) closing prices for the last 5 trading days"
110
+ use_search=true
111
+ aspectRatio="16:9"
112
+ ```
113
+
114
+ Default model is `gemini-3-pro-image-preview` (Nano Banana Pro). Also supports `gemini-2.5-flash-image` for faster generation.
115
+
116
+ When `use_search=true`, Gemini searches Google for current data before generating. Financial and news queries work reliably and return 2-5 grounding sources as markdown links. Weather queries are inconsistent (Gemini API limitation, not a code issue).
117
+
118
+ ### Video generation with Veo 3.1
119
+
120
+ ```
121
+ Use gemini:generate_video with:
122
+ prompt="A close-up shot of a futuristic coffee machine brewing a glowing blue espresso, steam rising dramatically. Cinematic lighting."
123
+ resolution="1080p"
124
+ durationSeconds=8
125
+ ```
126
+
127
+ Uses Google's Veo 3.1 model. Generates 4-8 second videos at up to 4K resolution with native synchronised audio. Processing takes 2-5 minutes -- the tool polls automatically until the video is ready.
128
+
129
+ Options worth knowing about:
130
+ - `aspectRatio` -- `16:9` (landscape, default) or `9:16` (portrait/vertical)
131
+ - `generateAudio` -- on by default, produces dialogue and sound effects matching the prompt
132
+ - `sampleCount` -- generate up to 4 variations in one call
133
+ - `seed` -- for deterministic output across runs
134
+ - `generateThumbnail` -- extracts a frame via ffmpeg (needs ffmpeg in PATH)
135
+ - `generateHTMLPlayer` -- creates a local HTML player alongside the video
136
+
137
+ ### SVG generation
138
+
139
+ ```
140
+ Use gemini:generate_svg with:
141
+ prompt="Architecture diagram showing a microservices system with API gateway, three services, and a shared database"
142
+ style="technical"
143
+ width=1000
144
+ height=600
145
+ ```
146
+
147
+ Generates clean, production-ready SVG code for diagrams, illustrations, icons, and data visualisations. Styles: `technical` (diagrams), `artistic` (illustrations), `minimal` (simple), `data-viz` (charts).
148
+
149
+ ### Image editing and analysis
150
+
151
+ **Conversational editing** -- Gemini 3 Pro Image maintains context across editing turns using thought signatures. The server captures these automatically. Pass them back on subsequent edit calls for full continuity:
152
+
153
+ ```
154
+ Use gemini:edit_image with:
155
+ prompt="Change the colour scheme to blue and green"
156
+ images=[{data: imageBase64, mimeType: "image/png", thoughtSignature: "fromPreviousCall"}]
157
+ ```
158
+
159
+ Skip thought signatures and each edit starts from scratch.
160
+
161
+ **Analysis** -- two tools for different purposes:
162
+ - `describe_image` -- Fast general descriptions using Gemini 3 Flash
163
+ - `analyze_image` -- Structured extraction and detailed reasoning using Gemini 3.1 Pro
164
+
165
+ **Load local files:**
166
+ ```
167
+ Use gemini:load_image_from_path with filePath="C:/screenshots/error.png"
168
+ ```
169
+ Returns base64 data ready for any image tool.
170
+
171
+ ### Media resolution control
172
+
173
+ Reduce token usage by up to 75% whilst maintaining quality:
174
+
175
+ | Level | Tokens per image | Savings vs HIGH / notes | Best for |
176
+ |-------|--------|---------|----------|
177
+ | `MEDIA_RESOLUTION_LOW` | 280 | 75% | Simple tasks, bulk operations |
178
+ | `MEDIA_RESOLUTION_MEDIUM` | 560 | 50% | PDFs/documents (OCR saturates here) |
179
+ | `MEDIA_RESOLUTION_HIGH` | 1120 | default | Detailed analysis |
180
+ | `MEDIA_RESOLUTION_ULTRA_HIGH` | 2000+ | per-image only | Maximum detail |
181
+
182
+ For PDF OCR, MEDIUM gives the same text-extraction quality as HIGH at half the tokens. Set `global_media_resolution` to apply one level to all images, or override per-image with `mediaResolution`.
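The savings figures in the table follow directly from the per-image token counts. A minimal TypeScript sketch of that arithmetic, using only the three fixed values from the table (ULTRA_HIGH is left out because its count is given as "2000+"). The helper names are hypothetical.

```typescript
// Per-image token counts copied from the table above.
const TOKENS_PER_IMAGE = {
  MEDIA_RESOLUTION_LOW: 280,
  MEDIA_RESOLUTION_MEDIUM: 560,
  MEDIA_RESOLUTION_HIGH: 1120,
} as const;

type MediaResolution = keyof typeof TOKENS_PER_IMAGE;

// Estimated image tokens for a batch processed at one resolution level.
function estimateImageTokens(imageCount: number, level: MediaResolution): number {
  return imageCount * TOKENS_PER_IMAGE[level];
}

// Percentage saved relative to the default HIGH level.
function savingsVersusHigh(level: MediaResolution): number {
  const high = TOKENS_PER_IMAGE.MEDIA_RESOLUTION_HIGH;
  return Math.round((1 - TOKENS_PER_IMAGE[level] / high) * 100);
}
```

For a 10-page PDF, MEDIUM works out to 5,600 image tokens against 11,200 at HIGH.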
183
+
184
+ ### Landing page generation
185
+
186
+ ```
187
+ Use gemini:generate_landing_page with:
188
+ brief="A SaaS tool that helps developers monitor API latency"
189
+ companyName="PingWatch"
190
+ primaryColour="#6366F1"
191
+ style="startup"
192
+ sections=["hero", "features", "pricing", "cta"]
193
+ ```
194
+
195
+ Returns a self-contained HTML file -- inline CSS and vanilla JS, no external dependencies. Styles: `minimal`, `bold`, `corporate`, `startup`.
196
+
197
+ ### Professional chart design systems
198
+
199
+ The `gemini_prompt_assistant` tool includes 9 professional chart design systems:
200
+
201
+ | System | Inspiration | Best for |
202
+ |--------|------------|----------|
203
+ | **storytelling** | Cole Nussbaumer Knaflic | Executive presentations -- everything muted except one bold highlight |
204
+ | **financial** | Financial Times | Editorial journalism -- FT Pink background, serif titles |
205
+ | **terminal** | Bloomberg / Fintech | High-density dark mode with electric neon |
206
+ | **modernist** | W.E.B. Du Bois | Bold geometric blocks, stark contrasts |
207
+ | **professional** | IBM Carbon / Tailwind | Enterprise dashboards |
208
+ | **editorial** | FiveThirtyEight / Economist | Data journalism |
209
+ | **scientific** | Nature / Science | Academic rigour |
210
+ | **minimal** | Edward Tufte | Maximum data-ink ratio |
211
+ | **dark** | Observable | Modern dark mode |
212
+
213
+ ```
214
+ Use gemini:gemini_prompt_assistant with:
215
+ request_type="template"
216
+ use_case="product"
217
+ desired_outcome="Generate a professional product comparison chart"
218
+ ```
219
+
220
+ ### Help system
221
+
222
+ ```
223
+ Use gemini:gemini_help with topic="overview"
224
+ ```
225
+
226
+ Documentation for all features without leaving Claude. Topics: `overview`, `image_generation`, `image_editing`, `image_analysis`, `chat`, `deep_research`, `grounding`, `media_resolution`, `models`, `all`.
227
+
228
+ ---
229
+
230
+ ## Image output and storage
231
+
232
+ **Default behaviour:** Images return as inline base64 previews (quality 100, 1024px) rendered directly in Claude.
233
+
234
+ **Persistent storage:** Set `GEMINI_IMAGE_OUTPUT_DIR` to auto-save all generated images:
235
+
236
+ ```json
237
+ "env": {
238
+   "GEMINI_API_KEY": "your-api-key-here",
239
+   "GEMINI_IMAGE_OUTPUT_DIR": "C:/Users/username/Pictures/gemini-output"
240
+ }
241
+ ```
242
+
243
+ Every image saves with a timestamp filename. The tool returns both the inline preview and the file path.
244
+
245
+ **Per-call override:** Pass `outputPath` on any generation tool to save to a specific location.
246
+
247
+ The server uses a two-tier compression approach to handle the MCP protocol's ~1MB JSON-RPC limit whilst preserving full-resolution files on disk:
248
+
249
+ | Tier | Quality | Max dimension | Purpose |
250
+ |------|---------|---------------|---------|
251
+ | **Full-res** | Original | Original | Saved to disk |
252
+ | **Viewer preview** | 100 | 1024px | MCP App inline preview (~400KB) |
253
+
254
+ Gemini returns 2-5MB images. The full image is saved to disk immediately, and a compressed preview is created for the MCP App viewer.
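To see why a preview tier is needed, it helps to work out how large an image becomes once base64-encoded for a JSON-RPC message. This sketch only illustrates the size arithmetic and is not the server's compression code. The 1,000,000-byte budget is an assumed stand-in for the "~1MB" limit, and both function names are hypothetical.

```typescript
// Assumed budget standing in for the "~1MB" JSON-RPC limit mentioned above.
const ASSUMED_MESSAGE_BUDGET_BYTES = 1_000_000;

// Standard padded base64 emits 4 output characters for every 3 input bytes (rounded up).
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// True when the encoded image alone stays under the assumed budget.
function fitsInMessage(imageBytes: number): boolean {
  return base64Length(imageBytes) < ASSUMED_MESSAGE_BUDGET_BYTES;
}
```

A 400KB preview encodes to roughly 546,000 characters and fits. A 2MB original encodes to roughly 2.8 million characters and does not, which is why the full-resolution file goes to disk instead.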
255
+
256
+ ---
257
+
258
+ ## Configuration reference
259
+
260
+ | Variable | Required | Default | Description |
261
+ |----------|----------|---------|-------------|
262
+ | `GEMINI_API_KEY` | Yes | -- | Google AI API key from [AI Studio](https://aistudio.google.com/apikey) |
263
+ | `GEMINI_DEFAULT_MODEL` | No | `gemini-3.1-pro-preview` | Default model for `gemini_chat` and `analyze_image` |
264
+ | `GEMINI_DEFAULT_GROUNDING` | No | `true` | Enable Google Search grounding by default |
265
+ | `GEMINI_IMAGE_OUTPUT_DIR` | No | -- | Auto-save directory for generated images |
266
+ | `GEMINI_ALLOW_EXPERIMENTAL` | No | `false` | Include experimental/preview models in auto-discovery |
267
+ | `GEMINI_MCP_LOG_FILE` | No | `false` | Write logs to `~/.gemini-mcp/logs/` |
268
+ | `DEBUG_MCP` | No | `false` | Log to stderr for debugging tool calls |
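The defaults in the table can be read as a small config loader. A minimal TypeScript sketch under stated assumptions: the variable names and default values come from the table, while `GeminiMcpConfig`, `loadConfig`, and the rule that a flag is enabled only by the literal string `true` are illustrative and may differ from how the package parses its environment.

```typescript
interface GeminiMcpConfig {
  apiKey: string;
  defaultModel: string;
  defaultGrounding: boolean;
  imageOutputDir: string | undefined;
  allowExperimental: boolean;
  logToFile: boolean;
  debug: boolean;
}

// Assumption: a flag is enabled only when set to "true" (case-insensitive).
function parseFlag(value: string | undefined, fallback: boolean): boolean {
  return value === undefined ? fallback : value.trim().toLowerCase() === "true";
}

// Hypothetical loader that applies the defaults listed in the table.
function loadConfig(env: Record<string, string | undefined>): GeminiMcpConfig {
  const apiKey = env.GEMINI_API_KEY;
  if (!apiKey) throw new Error("GEMINI_API_KEY is required");
  return {
    apiKey,
    defaultModel: env.GEMINI_DEFAULT_MODEL ?? "gemini-3.1-pro-preview",
    defaultGrounding: parseFlag(env.GEMINI_DEFAULT_GROUNDING, true),
    imageOutputDir: env.GEMINI_IMAGE_OUTPUT_DIR,
    allowExperimental: parseFlag(env.GEMINI_ALLOW_EXPERIMENTAL, false),
    logToFile: parseFlag(env.GEMINI_MCP_LOG_FILE, false),
    debug: parseFlag(env.DEBUG_MCP, false),
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`. Only the API key is mandatory, and everything else falls back to the documented default.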
269
+
270
+ ---
271
+
272
+ ## Tools reference
273
+
274
+ | Tool | Description |
275
+ |------|-------------|
276
+ | `gemini_chat` | Chat with Gemini 3.1 Pro. Google Search grounding on by default. Supports `thinking_level` for Gemini 3 |
277
+ | `gemini_deep_research` | Multi-step iterative research with Google Search. Synthesises comprehensive reports |
278
+ | `gemini_list_models` | Lists available models from the API |
279
+ | `gemini_help` | Documentation for all features without leaving Claude |
280
+ | `gemini_prompt_assistant` | Expert guidance for image generation with 9 chart design systems |
281
+ | `generate_image` | Image generation with search grounding and thought signatures for conversational editing |
282
+ | `edit_image` | Edit images with natural-language instructions. Supports multi-turn continuity |
283
+ | `describe_image` | Fast image descriptions using Gemini 3 Flash |
284
+ | `analyze_image` | Structured extraction and analysis using Gemini 3.1 Pro |
285
+ | `load_image_from_path` | Read a local image file and return base64 for any image tool |
286
+ | `generate_video` | Video generation with Veo 3.1 -- 4-8 seconds at up to 4K with native audio |
287
+ | `generate_svg` | Production-ready SVG graphics for diagrams, illustrations, and data visualisations |
288
+ | `generate_landing_page` | Self-contained HTML landing pages with inline CSS/JS |
289
+
290
+ ---
291
+
292
+ ## Model reference
293
+
294
+ | Model | Used by | Notes |
295
+ |-------|---------|-------|
296
+ | `gemini-3.1-pro-preview` | `gemini_chat`, `analyze_image` | Default. Advanced reasoning |
297
+ | `gemini-3-pro-image-preview` | `generate_image`, `edit_image` | Nano Banana Pro -- highest quality generation |
298
+ | `gemini-2.5-flash-image` | `generate_image` (optional) | Faster generation, higher volume |
299
+ | `gemini-3-flash-preview` | `describe_image` | Fast general descriptions |
300
+ | `veo-3.1-generate-preview` | `generate_video` | Veo 3.1 -- 4K video with native audio |
301
+
302
+ **Gemini 3 notes:** Temperature is forced to 1.0 on Gemini 3 models (Google's requirement -- lower values cause looping). Thought signatures are captured automatically for conversational image editing. Thinking level only applies to `gemini_chat`.
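The temperature rule can be pictured as a one-line override. This is a sketch of the behaviour described above, not the package's code: `effectiveTemperature` is a hypothetical name, and matching Gemini 3 models by the `gemini-3` prefix is an assumption.

```typescript
// Hypothetical helper: Gemini 3 models always run at 1.0; other models keep the requested value.
function effectiveTemperature(model: string, requested: number): number {
  // Assumption: Gemini 3 family models share the "gemini-3" name prefix.
  return model.startsWith("gemini-3") ? 1.0 : requested;
}
```

Under this sketch a request for 0.3 on `gemini-3.1-pro-preview` is raised to 1.0, while the same request on `gemini-2.5-flash-image` is left at 0.3.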
303
+
304
+ ---
305
+
306
+ ## Requirements
307
+
308
+ - Node.js 18+
309
+ - A Gemini API key from [Google AI Studio](https://aistudio.google.com/apikey)
310
+ - ffmpeg (optional, for video thumbnail extraction)
311
+
312
+ ## Licence
313
+
314
+ Apache-2.0