@erickstryck/simple-vision-mcp 1.0.0

package/.env.example ADDED
@@ -0,0 +1,8 @@
1
+ # Required
2
+ VISION_API_KEY=your-api-key-here
3
+ VISION_BASE_URL=https://api.example.com/v1
4
+ VISION_MODEL=your-vision-model
5
+
6
+ # Optional
7
+ VISION_MAX_TOKENS=4096
8
+ VISION_TIMEOUT=120
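Reviewer note: none of the source files shown in this diff import `dotenv`, although `package.json` declares it, so a `.env` file copied from this example may not be read unless the variables are exported by the shell or supplied by the MCP client. The usual route would be `import 'dotenv/config'` at the top of the entry point. The sketch below only illustrates how the `KEY=value` lines above map to variables; `parseEnvFile` is a hypothetical helper, not dotenv's parser, and it handles only the simple syntax used in this file.

```typescript
// Hypothetical helper: parse simple KEY=value lines, skipping blanks and # comments.
// This is a simplified illustration, not the dotenv package's actual parser.
function parseEnvFile(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq <= 0) continue; // no '=' sign or empty key: ignore the line
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    result[key] = value;
  }
  return result;
}

const example = [
  '# Required',
  'VISION_API_KEY=your-api-key-here',
  'VISION_BASE_URL=https://api.example.com/v1',
  '',
  '# Optional',
  'VISION_TIMEOUT=120',
].join('\n');

const parsed = parseEnvFile(example);
console.log(parsed['VISION_BASE_URL']);
```

Values stay strings at this stage; converting `VISION_MAX_TOKENS` and `VISION_TIMEOUT` to numbers is left to the config loader.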
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2024
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,274 @@
1
+ # Simple Vision MCP
2
+
3
+ A lightweight, focused Model Context Protocol (MCP) server designed specifically for image analysis using OpenAI-compatible APIs. Built with TypeScript and the MCP SDK.
4
+
5
+ ## Motivation
6
+
7
+ When working with AI coding agents that don't natively support vision capabilities, you often need a reliable way to analyze images. Many existing MCP vision servers are tightly coupled to specific providers (like OpenRouter or OpenAI) or come with unnecessary complexity.
8
+
9
+ **Simple Vision MCP** was created to solve a specific problem: enabling any OpenAI-compatible API endpoint to function as a vision analysis backend. It focuses on doing one thing exceptionally well - analyzing images - while remaining flexible enough to work with any OpenAI-compatible provider.
10
+
11
+ ### The Problem We Solved
12
+
13
+ During setup, we encountered several issues:
14
+ 1. Many vision MCP servers only support specific providers (OpenRouter, OpenAI, etc.)
15
+ 2. Container-based solutions had stdio communication issues
16
+ 3. Python-based servers had dependency conflicts
17
+ 4. Existing solutions were overly complex for the basic need
18
+
19
+ Simple Vision MCP addresses these by:
20
+ - Supporting **any** OpenAI-compatible API endpoint
21
+ - Running as a native Node.js process (no containers needed)
22
+ - Minimal, focused codebase that's easy to debug and maintain
23
+ - Minimal runtime dependencies: the MCP SDK and `dotenv`
24
+
25
+ ## Features
26
+
27
+ - **OpenAI-Compatible**: Works with any API that follows the OpenAI chat completions format
28
+ - **Single Tool Focus**: One purpose - image analysis done right
29
+ - **TypeScript**: Full type safety and modern JavaScript
30
+ - **Minimal Dependencies**: Only two runtime dependencies (`@modelcontextprotocol/sdk` and `dotenv`)
31
+ - **STDIO Communication**: Native MCP protocol support
32
+ - **Configurable**: Full control via environment variables
33
+
34
+ ## Installation
35
+
36
+ ### Prerequisites
37
+
38
+ - Node.js 18 or higher
39
+ - An OpenAI-compatible API endpoint with vision capabilities
40
+
41
+ ### From Source
42
+
43
+ ```bash
44
+ git clone https://github.com/erickstryck/simple-vision-mcp.git
45
+ cd simple-vision-mcp
46
+ npm install
47
+ npm run build
48
+ ```
49
+
50
+ ### Global Installation
51
+
52
+ ```bash
53
+ npm install -g @erickstryck/simple-vision-mcp
54
+ ```
55
+
56
+ ## Configuration
57
+
58
+ Simple Vision MCP is configured entirely via environment variables. Create a `.env` file or export variables directly:
59
+
60
+ | Variable | Description | Required | Default |
61
+ |----------|-------------|----------|---------|
62
+ | `VISION_API_KEY` | Your API key (falls back to `OPENAI_API_KEY`) | Yes | - |
63
+ | `VISION_BASE_URL` | API endpoint base URL (falls back to `OPENAI_BASE_URL`) | No | `https://api.openai.com/v1` |
64
+ | `VISION_MODEL` | Model name for vision (falls back to `OPENAI_MODEL`) | No | `gpt-4o-mini` |
65
+ | `VISION_MAX_TOKENS` | Max response tokens | No | `4096` |
66
+ | `VISION_TIMEOUT` | Request timeout (seconds) | No | `120` |
67
+
68
+ ### Example .env File
69
+
70
+ ```env
71
+ VISION_API_KEY=your-api-key-here
72
+ VISION_BASE_URL=https://your-custom-endpoint.com/api/v1
73
+ VISION_MODEL=Qwen3.5-4B-AWQ
74
+ VISION_MAX_TOKENS=4096
75
+ VISION_TIMEOUT=120
76
+ ```
77
+
78
+ ## Usage
79
+
80
+ ### Running the Server
81
+
82
+ ```bash
83
+ # Using the built version
84
+ npm start
85
+
86
+ # Or directly with node
87
+ node dist/index.js
88
+
89
+ # With environment variables
90
+ VISION_API_KEY=your-key VISION_BASE_URL=https://api.example.com/v1 VISION_MODEL=your-model node dist/index.js
91
+ ```
92
+
93
+ ### OpenCode Configuration
94
+
95
+ Add to your `opencode.json`:
96
+
97
+ ```json
98
+ {
99
+ "mcp": {
100
+ "vision": {
101
+ "type": "local",
102
+ "command": ["node", "/path/to/simple-vision-mcp/dist/index.js"],
103
+ "env": {
104
+ "VISION_API_KEY": "your-api-key",
105
+ "VISION_BASE_URL": "https://your-endpoint.com/api/v1",
106
+ "VISION_MODEL": "your-vision-model"
107
+ },
108
+ "enabled": true
109
+ }
110
+ }
111
+ }
112
+ ```
113
+
114
+ ### Claude Desktop Configuration
115
+
116
+ On macOS, add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
117
+
118
+ ```json
119
+ {
120
+ "mcpServers": {
121
+ "vision": {
122
+ "command": "node",
123
+ "args": ["/path/to/simple-vision-mcp/dist/index.js"],
124
+ "env": {
125
+ "VISION_API_KEY": "your-api-key",
126
+ "VISION_BASE_URL": "https://your-endpoint.com/api/v1",
127
+ "VISION_MODEL": "your-vision-model"
128
+ }
129
+ }
130
+ }
131
+ }
132
+ ```
133
+
134
+ ### Cursor Configuration
135
+
136
+ Add to your Cursor MCP settings:
137
+
138
+ ```json
139
+ {
140
+ "mcpServers": {
141
+ "vision": {
142
+ "command": "node",
143
+ "args": ["/path/to/simple-vision-mcp/dist/index.js"],
144
+ "env": {
145
+ "VISION_API_KEY": "your-api-key",
146
+ "VISION_BASE_URL": "https://your-endpoint.com/api/v1",
147
+ "VISION_MODEL": "your-vision-model"
148
+ }
149
+ }
150
+ }
151
+ }
152
+ ```
153
+
154
+ ## Available Tools
155
+
156
+ ### analyze_image
157
+
158
+ Analyzes an image and returns a detailed description.
159
+
160
+ **Parameters:**
161
+
162
+ | Parameter | Type | Description | Required |
163
+ |-----------|------|-------------|----------|
164
+ | `image_path` | string | Path to the image file | Yes |
165
+ | `prompt` | string | Custom analysis prompt | No |
166
+
167
+ **Default Prompt:** "Describe this image in detail, including objects, text, colors, composition, and any notable features."
168
+
169
+ **Example:**
170
+
171
+ ```json
172
+ {
173
+ "name": "analyze_image",
174
+ "arguments": {
175
+ "image_path": "/path/to/image.png",
176
+ "prompt": "What objects are in this image?"
177
+ }
178
+ }
179
+ ```
180
+
181
+ **Response:**
182
+
183
+ ```json
184
+ {
185
+ "content": [
186
+ {
187
+ "type": "text",
188
+ "text": "The image shows a red square with..."
189
+ }
190
+ ]
191
+ }
192
+ ```
193
+
194
+ ## Supported Image Formats
195
+
196
+ - PNG (.png)
197
+ - JPEG (.jpg, .jpeg)
198
+ - GIF (.gif)
199
+ - WebP (.webp)
200
+ - BMP (.bmp)
201
+
202
+ ## Development
203
+
204
+ ### Project Structure
205
+
206
+ ```
207
+ simple-vision-mcp/
208
+ ├── src/
209
+ │ ├── config/
210
+ │ │ └── index.ts # Configuration loading
211
+ │ ├── services/
212
+ │ │ └── visionService.ts # Vision API client
213
+ │ ├── tools/
214
+ │ │ └── analyzeImage.ts # MCP tool definition
215
+ │ ├── utils/
216
+ │ │ └── imageProcessor.ts # Image processing utilities
217
+ │ └── index.ts # Main entry point
218
+ ├── tests/
219
+ │ ├── config.test.ts
220
+ │ ├── imageProcessor.test.ts
221
+ │ └── visionService.test.ts
222
+ ├── package.json
223
+ ├── tsconfig.json
224
+ └── README.md
225
+ ```
226
+
227
+ ### Building
228
+
229
+ ```bash
230
+ npm run build
231
+ ```
232
+
233
+ ### Testing
234
+
235
+ ```bash
236
+ # Run tests once
237
+ npm test
238
+
239
+ # Watch mode
240
+ npm run test:watch
241
+ ```
242
+
243
+ ### Design Principles
244
+
245
+ 1. **Single Responsibility**: Each module has one clear purpose
246
+ 2. **Dependency Injection**: Services receive dependencies via constructor
247
+ 3. **Functional Core**: Business logic is pure and testable
248
+ 4. **Explicit over Implicit**: Clear types and function signatures
249
+
250
+ ## Troubleshooting
251
+
252
+ ### "VISION_API_KEY or OPENAI_API_KEY environment variable is required"
253
+
254
+ Ensure you've set `VISION_API_KEY` (or `OPENAI_API_KEY`, which is used as a fallback) before starting the server.
255
+
256
+ ### "Unsupported image format"
257
+
258
+ The file extension is not one of the supported types. The check is based on the extension only, so ensure the file ends in `.png`, `.jpg`, `.jpeg`, `.gif`, `.webp`, or `.bmp`.
259
+
260
+ ### "Vision API error: 401"
261
+
262
+ Authentication failed. Verify your API key is correct and has access to vision capabilities.
263
+
264
+ ### "Vision API error: 4xx/5xx"
265
+
266
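+ Check that your `VISION_BASE_URL` is correct and that the API endpoint is reachable. The server appends `/chat/completions` to this URL, and the response body returned by the API is included in the error message.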
267
+
268
+ ## License
269
+
270
+ MIT License - see LICENSE file for details.
271
+
272
+ ## Contributing
273
+
274
+ Contributions welcome! Please feel free to submit a Pull Request.
@@ -0,0 +1,9 @@
1
+ export interface ServerConfig {
2
+ apiKey: string;
3
+ baseUrl: string;
4
+ model: string;
5
+ maxTokens: number;
6
+ timeout: number;
7
+ }
8
+ export declare function loadConfig(): ServerConfig;
9
+ //# sourceMappingURL=index.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../src/config/index.ts"],"names":[],"mappings":"AAAA,MAAM,WAAW,YAAY;IAC3B,MAAM,EAAE,MAAM,CAAC;IACf,OAAO,EAAE,MAAM,CAAC;IAChB,KAAK,EAAE,MAAM,CAAC;IACd,SAAS,EAAE,MAAM,CAAC;IAClB,OAAO,EAAE,MAAM,CAAC;CACjB;AAED,wBAAgB,UAAU,IAAI,YAAY,CAYzC"}
@@ -0,0 +1,12 @@
1
+ export function loadConfig() {
2
+ const apiKey = process.env.VISION_API_KEY || process.env.OPENAI_API_KEY || '';
3
+ const baseUrl = process.env.VISION_BASE_URL || process.env.OPENAI_BASE_URL || 'https://api.openai.com/v1';
4
+ const model = process.env.VISION_MODEL || process.env.OPENAI_MODEL || 'gpt-4o-mini';
5
+ const maxTokens = parseInt(process.env.VISION_MAX_TOKENS || '4096', 10);
6
+ const timeout = parseInt(process.env.VISION_TIMEOUT || '120', 10);
7
+ if (!apiKey) {
8
+ throw new Error('VISION_API_KEY or OPENAI_API_KEY environment variable is required');
9
+ }
10
+ return { apiKey, baseUrl, model, maxTokens, timeout };
11
+ }
12
+ //# sourceMappingURL=index.js.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"index.js","sourceRoot":"","sources":["../../src/config/index.ts"],"names":[],"mappings":"AAQA,MAAM,UAAU,UAAU;IACxB,MAAM,MAAM,GAAG,OAAO,CAAC,GAAG,CAAC,cAAc,IAAI,OAAO,CAAC,GAAG,CAAC,cAAc,IAAI,EAAE,CAAC;IAC9E,MAAM,OAAO,GAAG,OAAO,CAAC,GAAG,CAAC,eAAe,IAAI,OAAO,CAAC,GAAG,CAAC,eAAe,IAAI,2BAA2B,CAAC;IAC1G,MAAM,KAAK,GAAG,OAAO,CAAC,GAAG,CAAC,YAAY,IAAI,OAAO,CAAC,GAAG,CAAC,YAAY,IAAI,aAAa,CAAC;IACpF,MAAM,SAAS,GAAG,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,iBAAiB,IAAI,MAAM,EAAE,EAAE,CAAC,CAAC;IACxE,MAAM,OAAO,GAAG,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,cAAc,IAAI,KAAK,EAAE,EAAE,CAAC,CAAC;IAElE,IAAI,CAAC,MAAM,EAAE,CAAC;QACZ,MAAM,IAAI,KAAK,CAAC,mEAAmE,CAAC,CAAC;IACvF,CAAC;IAED,OAAO,EAAE,MAAM,EAAE,OAAO,EAAE,KAAK,EAAE,SAAS,EAAE,OAAO,EAAE,CAAC;AACxD,CAAC"}
@@ -0,0 +1,2 @@
1
+ export {};
2
+ //# sourceMappingURL=index.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":""}
package/dist/index.js ADDED
@@ -0,0 +1,46 @@
1
+ import { Server } from '@modelcontextprotocol/sdk/server/index.js';
2
+ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
3
+ import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
4
+ import { loadConfig } from './config/index.js';
5
+ import { VisionService } from './services/visionService.js';
6
+ import { createAnalyzeImageTool } from './tools/analyzeImage.js';
7
+ class VisionMCPServer {
8
+ server;
9
+ visionService;
10
+ constructor() {
11
+ const config = loadConfig();
12
+ this.visionService = new VisionService(config);
13
+ this.server = new Server({
14
+ name: 'simple-vision-mcp',
15
+ version: '1.0.0',
16
+ }, {
17
+ capabilities: {
18
+ tools: {},
19
+ },
20
+ });
21
+ this.setupHandlers();
22
+ }
23
+ setupHandlers() {
24
+ this.server.setRequestHandler(ListToolsRequestSchema, () => ({
25
+ tools: [createAnalyzeImageTool(this.visionService)],
26
+ }));
27
+ this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
28
+ const { name, arguments: args } = request.params;
29
+ if (name === 'analyze_image') {
30
+ const tool = createAnalyzeImageTool(this.visionService);
31
+ return await tool.handler(args);
32
+ }
33
+ throw new Error(`Unknown tool: ${name}`);
34
+ });
35
+ }
36
+ async run() {
37
+ const transport = new StdioServerTransport();
38
+ await this.server.connect(transport);
39
+ }
40
+ }
41
+ const server = new VisionMCPServer();
42
+ server.run().catch((error) => {
43
+ console.error('Failed to start server:', error);
44
+ process.exit(1);
45
+ });
46
+ //# sourceMappingURL=index.js.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"index.js","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,MAAM,EAAE,MAAM,2CAA2C,CAAC;AACnE,OAAO,EAAE,oBAAoB,EAAE,MAAM,2CAA2C,CAAC;AACjF,OAAO,EAAE,qBAAqB,EAAE,sBAAsB,EAAE,MAAM,oCAAoC,CAAC;AACnG,OAAO,EAAE,UAAU,EAAE,MAAM,mBAAmB,CAAC;AAC/C,OAAO,EAAE,aAAa,EAAE,MAAM,6BAA6B,CAAC;AAC5D,OAAO,EAAE,sBAAsB,EAAE,MAAM,yBAAyB,CAAC;AAEjE,MAAM,eAAe;IACX,MAAM,CAAS;IACf,aAAa,CAAgB;IAErC;QACE,MAAM,MAAM,GAAG,UAAU,EAAE,CAAC;QAC5B,IAAI,CAAC,aAAa,GAAG,IAAI,aAAa,CAAC,MAAM,CAAC,CAAC;QAE/C,IAAI,CAAC,MAAM,GAAG,IAAI,MAAM,CACtB;YACE,IAAI,EAAE,mBAAmB;YACzB,OAAO,EAAE,OAAO;SACjB,EACD;YACE,YAAY,EAAE;gBACZ,KAAK,EAAE,EAAE;aACV;SACF,CACF,CAAC;QAEF,IAAI,CAAC,aAAa,EAAE,CAAC;IACvB,CAAC;IAEO,aAAa;QACnB,IAAI,CAAC,MAAM,CAAC,iBAAiB,CAAC,sBAAsB,EAAE,GAAG,EAAE,CAAC,CAAC;YAC3D,KAAK,EAAE,CAAC,sBAAsB,CAAC,IAAI,CAAC,aAAa,CAAC,CAAC;SACpD,CAAC,CAAC,CAAC;QAEJ,IAAI,CAAC,MAAM,CAAC,iBAAiB,CAAC,qBAAqB,EAAE,KAAK,EAAE,OAAO,EAAE,EAAE;YACrE,MAAM,EAAE,IAAI,EAAE,SAAS,EAAE,IAAI,EAAE,GAAG,OAAO,CAAC,MAAM,CAAC;YAEjD,IAAI,IAAI,KAAK,eAAe,EAAE,CAAC;gBAC7B,MAAM,IAAI,GAAG,sBAAsB,CAAC,IAAI,CAAC,aAAa,CAAC,CAAC;gBACxD,OAAO,MAAM,IAAI,CAAC,OAAO,CAAC,IAA+C,CAAC,CAAC;YAC7E,CAAC;YAED,MAAM,IAAI,KAAK,CAAC,iBAAiB,IAAI,EAAE,CAAC,CAAC;QAC3C,CAAC,CAAC,CAAC;IACL,CAAC;IAED,KAAK,CAAC,GAAG;QACP,MAAM,SAAS,GAAG,IAAI,oBAAoB,EAAE,CAAC;QAC7C,MAAM,IAAI,CAAC,MAAM,CAAC,OAAO,CAAC,SAAS,CAAC,CAAC;IACvC,CAAC;CACF;AAED,MAAM,MAAM,GAAG,IAAI,eAAe,EAAE,CAAC;AACrC,MAAM,CAAC,GAAG,EAAE,CAAC,KAAK,CAAC,CAAC,KAAK,EAAE,EAAE;IAC3B,OAAO,CAAC,KAAK,CAAC,yBAAyB,EAAE,KAAK,CAAC,CAAC;IAChD,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC"}
@@ -0,0 +1,20 @@
1
+ import { ServerConfig } from '../config/index.js';
2
+ export interface VisionRequest {
3
+ imageDataUrl: string;
4
+ prompt: string;
5
+ maxTokens: number;
6
+ }
7
+ export interface VisionResponse {
8
+ content: string;
9
+ usage?: {
10
+ promptTokens: number;
11
+ completionTokens: number;
12
+ totalTokens: number;
13
+ };
14
+ }
15
+ export declare class VisionService {
16
+ private readonly config;
17
+ constructor(config: ServerConfig);
18
+ analyze(request: VisionRequest): Promise<VisionResponse>;
19
+ }
20
+ //# sourceMappingURL=visionService.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"visionService.d.ts","sourceRoot":"","sources":["../../src/services/visionService.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,YAAY,EAAE,MAAM,oBAAoB,CAAC;AAElD,MAAM,WAAW,aAAa;IAC5B,YAAY,EAAE,MAAM,CAAC;IACrB,MAAM,EAAE,MAAM,CAAC;IACf,SAAS,EAAE,MAAM,CAAC;CACnB;AAED,MAAM,WAAW,cAAc;IAC7B,OAAO,EAAE,MAAM,CAAC;IAChB,KAAK,CAAC,EAAE;QACN,YAAY,EAAE,MAAM,CAAC;QACrB,gBAAgB,EAAE,MAAM,CAAC;QACzB,WAAW,EAAE,MAAM,CAAC;KACrB,CAAC;CACH;AAED,qBAAa,aAAa;IACZ,OAAO,CAAC,QAAQ,CAAC,MAAM;gBAAN,MAAM,EAAE,YAAY;IAE3C,OAAO,CAAC,OAAO,EAAE,aAAa,GAAG,OAAO,CAAC,cAAc,CAAC;CA6C/D"}
@@ -0,0 +1,44 @@
1
+ export class VisionService {
2
+ config;
3
+ constructor(config) {
4
+ this.config = config;
5
+ }
6
+ async analyze(request) {
7
+ const { imageDataUrl, prompt, maxTokens } = request;
8
+ const response = await fetch(`${this.config.baseUrl}/chat/completions`, {
9
+ method: 'POST',
10
+ headers: {
11
+ 'Content-Type': 'application/json',
12
+ 'Authorization': `Bearer ${this.config.apiKey}`,
13
+ },
14
+ body: JSON.stringify({
15
+ model: this.config.model,
16
+ messages: [
17
+ {
18
+ role: 'user',
19
+ content: [
20
+ { type: 'text', text: prompt },
21
+ { type: 'image_url', image_url: { url: imageDataUrl } },
22
+ ],
23
+ },
24
+ ],
25
+ max_tokens: maxTokens,
26
+ }),
27
+ });
28
+ if (!response.ok) {
29
+ const errorBody = await response.text();
30
+ throw new Error(`Vision API error: ${response.status} - ${errorBody}`);
31
+ }
32
+ const data = await response.json();
33
+ const content = data.choices[0]?.message?.content || '';
34
+ return {
35
+ content,
36
+ usage: data.usage ? {
37
+ promptTokens: data.usage.prompt_tokens,
38
+ completionTokens: data.usage.completion_tokens,
39
+ totalTokens: data.usage.total_tokens,
40
+ } : undefined,
41
+ };
42
+ }
43
+ }
44
+ //# sourceMappingURL=visionService.js.map
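Reviewer note: the request built in `analyze` follows the OpenAI chat completions shape, with one user message whose content mixes a text part and an `image_url` part carrying a data URL. Below is a minimal sketch of just the payload construction, separated from the network call so it can be inspected or unit-tested. `buildVisionPayload`, `ContentPart` and `VisionPayload` are hypothetical names, not part of the package.

```typescript
// Hypothetical types describing the payload shape used by analyze() above.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

interface VisionPayload {
  model: string;
  messages: Array<{ role: 'user'; content: ContentPart[] }>;
  max_tokens: number;
}

// Build the JSON body without performing any I/O.
function buildVisionPayload(
  model: string,
  prompt: string,
  imageDataUrl: string,
  maxTokens: number,
): VisionPayload {
  return {
    model,
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: prompt },
          { type: 'image_url', image_url: { url: imageDataUrl } },
        ],
      },
    ],
    max_tokens: maxTokens,
  };
}

const payload = buildVisionPayload(
  'your-vision-model',
  'What objects are in this image?',
  'data:image/png;base64,AAAA', // placeholder data URL, not a real image
  4096,
);
console.log(JSON.stringify(payload, null, 2));
```

Keeping the builder pure means a test can assert on the exact JSON sent to a provider without mocking `fetch`.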
@@ -0,0 +1 @@
1
+ {"version":3,"file":"visionService.js","sourceRoot":"","sources":["../../src/services/visionService.ts"],"names":[],"mappings":"AAiBA,MAAM,OAAO,aAAa;IACK;IAA7B,YAA6B,MAAoB;QAApB,WAAM,GAAN,MAAM,CAAc;IAAG,CAAC;IAErD,KAAK,CAAC,OAAO,CAAC,OAAsB;QAClC,MAAM,EAAE,YAAY,EAAE,MAAM,EAAE,SAAS,EAAE,GAAG,OAAO,CAAC;QAEpD,MAAM,QAAQ,GAAG,MAAM,KAAK,CAAC,GAAG,IAAI,CAAC,MAAM,CAAC,OAAO,mBAAmB,EAAE;YACtE,MAAM,EAAE,MAAM;YACd,OAAO,EAAE;gBACP,cAAc,EAAE,kBAAkB;gBAClC,eAAe,EAAE,UAAU,IAAI,CAAC,MAAM,CAAC,MAAM,EAAE;aAChD;YACD,IAAI,EAAE,IAAI,CAAC,SAAS,CAAC;gBACnB,KAAK,EAAE,IAAI,CAAC,MAAM,CAAC,KAAK;gBACxB,QAAQ,EAAE;oBACR;wBACE,IAAI,EAAE,MAAM;wBACZ,OAAO,EAAE;4BACP,EAAE,IAAI,EAAE,MAAM,EAAE,IAAI,EAAE,MAAM,EAAE;4BAC9B,EAAE,IAAI,EAAE,WAAW,EAAE,SAAS,EAAE,EAAE,GAAG,EAAE,YAAY,EAAE,EAAE;yBACxD;qBACF;iBACF;gBACD,UAAU,EAAE,SAAS;aACtB,CAAC;SACH,CAAC,CAAC;QAEH,IAAI,CAAC,QAAQ,CAAC,EAAE,EAAE,CAAC;YACjB,MAAM,SAAS,GAAG,MAAM,QAAQ,CAAC,IAAI,EAAE,CAAC;YACxC,MAAM,IAAI,KAAK,CAAC,qBAAqB,QAAQ,CAAC,MAAM,MAAM,SAAS,EAAE,CAAC,CAAC;QACzE,CAAC;QAED,MAAM,IAAI,GAAG,MAAM,QAAQ,CAAC,IAAI,EAG/B,CAAC;QAEF,MAAM,OAAO,GAAG,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,EAAE,OAAO,EAAE,OAAO,IAAI,EAAE,CAAC;QAExD,OAAO;YACL,OAAO;YACP,KAAK,EAAE,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC;gBAClB,YAAY,EAAE,IAAI,CAAC,KAAK,CAAC,aAAa;gBACtC,gBAAgB,EAAE,IAAI,CAAC,KAAK,CAAC,iBAAiB;gBAC9C,WAAW,EAAE,IAAI,CAAC,KAAK,CAAC,YAAY;aACrC,CAAC,CAAC,CAAC,SAAS;SACd,CAAC;IACJ,CAAC;CACF"}
@@ -0,0 +1,31 @@
1
+ import { VisionService } from '../services/visionService.js';
2
+ export interface AnalyzeImageParams {
3
+ image_path: string;
4
+ prompt?: string;
5
+ }
6
+ export declare function createAnalyzeImageTool(visionService: VisionService): {
7
+ name: string;
8
+ description: string;
9
+ inputSchema: {
10
+ type: string;
11
+ properties: {
12
+ image_path: {
13
+ type: string;
14
+ description: string;
15
+ };
16
+ prompt: {
17
+ type: string;
18
+ description: string;
19
+ default: string;
20
+ };
21
+ };
22
+ required: string[];
23
+ };
24
+ handler: (params: AnalyzeImageParams) => Promise<{
25
+ content: {
26
+ type: string;
27
+ text: string;
28
+ }[];
29
+ }>;
30
+ };
31
+ //# sourceMappingURL=analyzeImage.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"analyzeImage.d.ts","sourceRoot":"","sources":["../../src/tools/analyzeImage.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,aAAa,EAAE,MAAM,8BAA8B,CAAC;AAG7D,MAAM,WAAW,kBAAkB;IACjC,UAAU,EAAE,MAAM,CAAC;IACnB,MAAM,CAAC,EAAE,MAAM,CAAC;CACjB;AAED,wBAAgB,sBAAsB,CAAC,aAAa,EAAE,aAAa;;;;;;;;;;;;;;;;;;sBAmBvC,kBAAkB;;;;;;EAiB7C"}
@@ -0,0 +1,36 @@
1
+ import { imagePathToData, imageDataToDataUrl } from '../utils/imageProcessor.js';
2
+ export function createAnalyzeImageTool(visionService) {
3
+ return {
4
+ name: 'analyze_image',
5
+ description: 'Analyzes an image and returns a detailed description. Supports PNG, JPEG, GIF, WebP, and BMP formats.',
6
+ inputSchema: {
7
+ type: 'object',
8
+ properties: {
9
+ image_path: {
10
+ type: 'string',
11
+ description: 'Path to the image file to analyze',
12
+ },
13
+ prompt: {
14
+ type: 'string',
15
+ description: 'Custom prompt for image analysis',
16
+ default: 'Describe this image in detail, including objects, text, colors, composition, and any notable features.',
17
+ },
18
+ },
19
+ required: ['image_path'],
20
+ },
21
+ handler: async (params) => {
22
+ const { image_path, prompt } = params;
23
+ const imageData = imagePathToData(image_path);
24
+ const imageDataUrl = imageDataToDataUrl(imageData);
25
+ const result = await visionService.analyze({
26
+ imageDataUrl,
27
+ prompt: prompt || 'Describe this image in detail, including objects, text, colors, composition, and any notable features.',
28
+ maxTokens: 4096,
29
+ });
30
+ return {
31
+ content: [{ type: 'text', text: result.content }],
32
+ };
33
+ },
34
+ };
35
+ }
36
+ //# sourceMappingURL=analyzeImage.js.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"analyzeImage.js","sourceRoot":"","sources":["../../src/tools/analyzeImage.ts"],"names":[],"mappings":"AACA,OAAO,EAAE,eAAe,EAAE,kBAAkB,EAAE,MAAM,4BAA4B,CAAC;AAOjF,MAAM,UAAU,sBAAsB,CAAC,aAA4B;IACjE,OAAO;QACL,IAAI,EAAE,eAAe;QACrB,WAAW,EAAE,uGAAuG;QACpH,WAAW,EAAE;YACX,IAAI,EAAE,QAAQ;YACd,UAAU,EAAE;gBACV,UAAU,EAAE;oBACV,IAAI,EAAE,QAAQ;oBACd,WAAW,EAAE,mCAAmC;iBACjD;gBACD,MAAM,EAAE;oBACN,IAAI,EAAE,QAAQ;oBACd,WAAW,EAAE,kCAAkC;oBAC/C,OAAO,EAAE,wGAAwG;iBAClH;aACF;YACD,QAAQ,EAAE,CAAC,YAAY,CAAC;SACzB;QACD,OAAO,EAAE,KAAK,EAAE,MAA0B,EAAE,EAAE;YAC5C,MAAM,EAAE,UAAU,EAAE,MAAM,EAAE,GAAG,MAAM,CAAC;YAEtC,MAAM,SAAS,GAAG,eAAe,CAAC,UAAU,CAAC,CAAC;YAC9C,MAAM,YAAY,GAAG,kBAAkB,CAAC,SAAS,CAAC,CAAC;YAEnD,MAAM,MAAM,GAAG,MAAM,aAAa,CAAC,OAAO,CAAC;gBACzC,YAAY;gBACZ,MAAM,EAAE,MAAM,IAAI,wGAAwG;gBAC1H,SAAS,EAAE,IAAI;aAChB,CAAC,CAAC;YAEH,OAAO;gBACL,OAAO,EAAE,CAAC,EAAE,IAAI,EAAE,MAAM,EAAE,IAAI,EAAE,MAAM,CAAC,OAAO,EAAE,CAAC;aAClD,CAAC;QACJ,CAAC;KACF,CAAC;AACJ,CAAC"}
@@ -0,0 +1,10 @@
1
+ export interface ImageData {
2
+ base64: string;
3
+ mimeType: string;
4
+ filename: string;
5
+ }
6
+ export declare function getMimeType(filePath: string): string;
7
+ export declare function isValidImageFormat(filePath: string): boolean;
8
+ export declare function imagePathToData(filePath: string): ImageData;
9
+ export declare function imageDataToDataUrl(imageData: ImageData): string;
10
+ //# sourceMappingURL=imageProcessor.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"imageProcessor.d.ts","sourceRoot":"","sources":["../../src/utils/imageProcessor.ts"],"names":[],"mappings":"AAGA,MAAM,WAAW,SAAS;IACxB,MAAM,EAAE,MAAM,CAAC;IACf,QAAQ,EAAE,MAAM,CAAC;IACjB,QAAQ,EAAE,MAAM,CAAC;CAClB;AAWD,wBAAgB,WAAW,CAAC,QAAQ,EAAE,MAAM,GAAG,MAAM,CAGpD;AAED,wBAAgB,kBAAkB,CAAC,QAAQ,EAAE,MAAM,GAAG,OAAO,CAG5D;AAED,wBAAgB,eAAe,CAAC,QAAQ,EAAE,MAAM,GAAG,SAAS,CAW3D;AAED,wBAAgB,kBAAkB,CAAC,SAAS,EAAE,SAAS,GAAG,MAAM,CAE/D"}
@@ -0,0 +1,32 @@
1
+ import { readFileSync } from 'fs';
2
+ import { extname, basename } from 'path';
3
+ const MIME_TYPES = {
4
+ '.png': 'image/png',
5
+ '.jpg': 'image/jpeg',
6
+ '.jpeg': 'image/jpeg',
7
+ '.gif': 'image/gif',
8
+ '.webp': 'image/webp',
9
+ '.bmp': 'image/bmp',
10
+ };
11
+ export function getMimeType(filePath) {
12
+ const ext = extname(filePath).toLowerCase();
13
+ return MIME_TYPES[ext] || 'application/octet-stream';
14
+ }
15
+ export function isValidImageFormat(filePath) {
16
+ const ext = extname(filePath).toLowerCase();
17
+ return ext in MIME_TYPES;
18
+ }
19
+ export function imagePathToData(filePath) {
20
+ if (!isValidImageFormat(filePath)) {
21
+ throw new Error(`Unsupported image format: ${extname(filePath)}`);
22
+ }
23
+ const imageBuffer = readFileSync(filePath);
24
+ const base64 = imageBuffer.toString('base64');
25
+ const mimeType = getMimeType(filePath);
26
+ const filename = basename(filePath);
27
+ return { base64, mimeType, filename };
28
+ }
29
+ export function imageDataToDataUrl(imageData) {
30
+ return `data:${imageData.mimeType};base64,${imageData.base64}`;
31
+ }
32
+ //# sourceMappingURL=imageProcessor.js.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"imageProcessor.js","sourceRoot":"","sources":["../../src/utils/imageProcessor.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,YAAY,EAAE,MAAM,IAAI,CAAC;AAClC,OAAO,EAAE,OAAO,EAAE,QAAQ,EAAE,MAAM,MAAM,CAAC;AAQzC,MAAM,UAAU,GAA2B;IACzC,MAAM,EAAE,WAAW;IACnB,MAAM,EAAE,YAAY;IACpB,OAAO,EAAE,YAAY;IACrB,MAAM,EAAE,WAAW;IACnB,OAAO,EAAE,YAAY;IACrB,MAAM,EAAE,WAAW;CACpB,CAAC;AAEF,MAAM,UAAU,WAAW,CAAC,QAAgB;IAC1C,MAAM,GAAG,GAAG,OAAO,CAAC,QAAQ,CAAC,CAAC,WAAW,EAAE,CAAC;IAC5C,OAAO,UAAU,CAAC,GAAG,CAAC,IAAI,0BAA0B,CAAC;AACvD,CAAC;AAED,MAAM,UAAU,kBAAkB,CAAC,QAAgB;IACjD,MAAM,GAAG,GAAG,OAAO,CAAC,QAAQ,CAAC,CAAC,WAAW,EAAE,CAAC;IAC5C,OAAO,GAAG,IAAI,UAAU,CAAC;AAC3B,CAAC;AAED,MAAM,UAAU,eAAe,CAAC,QAAgB;IAC9C,IAAI,CAAC,kBAAkB,CAAC,QAAQ,CAAC,EAAE,CAAC;QAClC,MAAM,IAAI,KAAK,CAAC,6BAA6B,OAAO,CAAC,QAAQ,CAAC,EAAE,CAAC,CAAC;IACpE,CAAC;IAED,MAAM,WAAW,GAAG,YAAY,CAAC,QAAQ,CAAC,CAAC;IAC3C,MAAM,MAAM,GAAG,WAAW,CAAC,QAAQ,CAAC,QAAQ,CAAC,CAAC;IAC9C,MAAM,QAAQ,GAAG,WAAW,CAAC,QAAQ,CAAC,CAAC;IACvC,MAAM,QAAQ,GAAG,QAAQ,CAAC,QAAQ,CAAC,CAAC;IAEpC,OAAO,EAAE,MAAM,EAAE,QAAQ,EAAE,QAAQ,EAAE,CAAC;AACxC,CAAC;AAED,MAAM,UAAU,kBAAkB,CAAC,SAAoB;IACrD,OAAO,QAAQ,SAAS,CAAC,QAAQ,WAAW,SAAS,CAAC,MAAM,EAAE,CAAC;AACjE,CAAC"}
package/package.json ADDED
@@ -0,0 +1,36 @@
1
+ {
2
+ "name": "@erickstryck/simple-vision-mcp",
3
+ "version": "1.0.0",
4
+ "description": "A lightweight MCP server for image analysis using OpenAI-compatible APIs",
5
+ "type": "module",
6
+ "main": "dist/index.js",
7
+ "bin": {
8
+ "simple-vision-mcp": "dist/index.js"
9
+ },
10
+ "scripts": {
11
+ "build": "tsc",
12
+ "start": "node dist/index.js",
13
+ "dev": "tsc && node dist/index.js",
14
+ "test": "vitest run",
15
+ "test:watch": "vitest"
16
+ },
17
+ "keywords": ["mcp", "vision", "image-analysis", "openai-compatible"],
18
+ "author": "erickstryck <erickstryck@hotmail.com>",
19
+ "license": "MIT",
20
+ "repository": {
21
+ "type": "git",
22
+ "url": "https://github.com/erickstryck/simple-vision-mcp"
23
+ },
24
+ "dependencies": {
25
+ "@modelcontextprotocol/sdk": "^1.0.0",
26
+ "dotenv": "^16.4.5"
27
+ },
28
+ "devDependencies": {
29
+ "@types/node": "^22.0.0",
30
+ "typescript": "^5.6.0",
31
+ "vitest": "^2.0.0"
32
+ },
33
+ "engines": {
34
+ "node": ">=18.0.0"
35
+ }
36
+ }
@@ -0,0 +1,21 @@
1
+ export interface ServerConfig {
2
+ apiKey: string;
3
+ baseUrl: string;
4
+ model: string;
5
+ maxTokens: number;
6
+ timeout: number;
7
+ }
8
+
9
+ export function loadConfig(): ServerConfig {
10
+ const apiKey = process.env.VISION_API_KEY || process.env.OPENAI_API_KEY || '';
11
+ const baseUrl = process.env.VISION_BASE_URL || process.env.OPENAI_BASE_URL || 'https://api.openai.com/v1';
12
+ const model = process.env.VISION_MODEL || process.env.OPENAI_MODEL || 'gpt-4o-mini';
13
+ const maxTokens = parseInt(process.env.VISION_MAX_TOKENS || '4096', 10);
14
+ const timeout = parseInt(process.env.VISION_TIMEOUT || '120', 10);
15
+
16
+ if (!apiKey) {
17
+ throw new Error('VISION_API_KEY or OPENAI_API_KEY environment variable is required');
18
+ }
19
+
20
+ return { apiKey, baseUrl, model, maxTokens, timeout };
21
+ }
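Reviewer note: `parseInt` returns `NaN` for values such as `VISION_TIMEOUT=abc`, and `loadConfig` passes that through unchecked. Below is a minimal sketch of a stricter parse, assuming that falling back to the default is acceptable (throwing would be an equally reasonable choice). `parsePositiveInt` and `resolveNumericSettings` are hypothetical helper names, and the environment is passed in as a plain object so the function is easy to test.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: parse a positive integer, falling back when the
// value is missing, not numeric, or not greater than zero.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === '') return fallback;
  const value = Number.parseInt(raw, 10);
  return Number.isFinite(value) && value > 0 ? value : fallback;
}

// Resolve the two numeric settings with the same defaults as loadConfig.
function resolveNumericSettings(env: Env): { maxTokens: number; timeout: number } {
  return {
    maxTokens: parsePositiveInt(env['VISION_MAX_TOKENS'], 4096),
    timeout: parsePositiveInt(env['VISION_TIMEOUT'], 120),
  };
}

const settings = resolveNumericSettings({ VISION_MAX_TOKENS: '2048', VISION_TIMEOUT: 'abc' });
console.log(settings);
```

In the real loader, `process.env` could be passed as the `env` argument.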
package/src/index.ts ADDED
@@ -0,0 +1,58 @@
1
+ import { Server } from '@modelcontextprotocol/sdk/server/index.js';
2
+ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
3
+ import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
4
+ import { loadConfig } from './config/index.js';
5
+ import { VisionService } from './services/visionService.js';
6
+ import { createAnalyzeImageTool } from './tools/analyzeImage.js';
7
+
8
+ class VisionMCPServer {
9
+ private server: Server;
10
+ private visionService: VisionService;
11
+
12
+ constructor() {
13
+ const config = loadConfig();
14
+ this.visionService = new VisionService(config);
15
+
16
+ this.server = new Server(
17
+ {
18
+ name: 'simple-vision-mcp',
19
+ version: '1.0.0',
20
+ },
21
+ {
22
+ capabilities: {
23
+ tools: {},
24
+ },
25
+ }
26
+ );
27
+
28
+ this.setupHandlers();
29
+ }
30
+
31
+ private setupHandlers(): void {
32
+ this.server.setRequestHandler(ListToolsRequestSchema, () => ({
33
+ tools: [createAnalyzeImageTool(this.visionService)],
34
+ }));
35
+
36
+ this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
37
+ const { name, arguments: args } = request.params;
38
+
39
+ if (name === 'analyze_image') {
40
+ const tool = createAnalyzeImageTool(this.visionService);
41
+ return await tool.handler(args as { image_path: string; prompt?: string });
42
+ }
43
+
44
+ throw new Error(`Unknown tool: ${name}`);
45
+ });
46
+ }
47
+
48
+ async run(): Promise<void> {
49
+ const transport = new StdioServerTransport();
50
+ await this.server.connect(transport);
51
+ }
52
+ }
53
+
54
+ const server = new VisionMCPServer();
55
+ server.run().catch((error) => {
56
+ console.error('Failed to start server:', error);
57
+ process.exit(1);
58
+ });
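Reviewer note: the `CallToolRequestSchema` handler lets exceptions from `tool.handler` (unsupported format, unreadable file, API error) propagate, which surfaces them as protocol-level errors. The MCP tool result shape also allows failures to be reported in-band through an `isError` flag, so the calling model can read the message. Below is a minimal sketch of such a wrapper, assuming that behavior is wanted; `withToolErrors` and `ToolResult` are hypothetical names, not SDK exports.

```typescript
// Hypothetical local type mirroring the text-only result returned by the handler above.
interface ToolResult {
  content: Array<{ type: 'text'; text: string }>;
  isError?: boolean;
}

// Wrap a tool handler so thrown errors become an in-band error result.
function withToolErrors<P>(
  handler: (params: P) => Promise<ToolResult>,
): (params: P) => Promise<ToolResult> {
  return async (params: P) => {
    try {
      return await handler(params);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      return { content: [{ type: 'text', text: message }], isError: true };
    }
  };
}

// Usage with a stand-in handler that always fails.
const failing = withToolErrors(
  async (_params: { image_path: string }): Promise<ToolResult> => {
    throw new Error('Unsupported image format: .tiff');
  },
);

const pending = failing({ image_path: '/path/to/image.tiff' });
pending.then((result) => console.log(result.isError, result.content[0]?.text));
```

Whether to throw or to return `isError` is a design choice: throwing suits unknown tool names, while in-band errors suit failures the model could act on, such as a wrong file path.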
@@ -0,0 +1,66 @@
1
+ import { ServerConfig } from '../config/index.js';
2
+
3
+ export interface VisionRequest {
4
+ imageDataUrl: string;
5
+ prompt: string;
6
+ maxTokens: number;
7
+ }
8
+
9
+ export interface VisionResponse {
10
+ content: string;
11
+ usage?: {
12
+ promptTokens: number;
13
+ completionTokens: number;
14
+ totalTokens: number;
15
+ };
16
+ }
17
+
18
+ export class VisionService {
19
+ constructor(private readonly config: ServerConfig) {}
20
+
21
+ async analyze(request: VisionRequest): Promise<VisionResponse> {
22
+ const { imageDataUrl, prompt, maxTokens } = request;
23
+
24
+ const response = await fetch(`${this.config.baseUrl}/chat/completions`, {
25
+ method: 'POST',
26
+ headers: {
27
+ 'Content-Type': 'application/json',
28
+ 'Authorization': `Bearer ${this.config.apiKey}`,
29
+ },
30
+ body: JSON.stringify({
31
+ model: this.config.model,
32
+ messages: [
33
+ {
34
+ role: 'user',
35
+ content: [
36
+ { type: 'text', text: prompt },
37
+ { type: 'image_url', image_url: { url: imageDataUrl } },
38
+ ],
39
+ },
40
+ ],
41
+ max_tokens: maxTokens,
42
+ }),
43
+ });
44
+
45
+ if (!response.ok) {
46
+ const errorBody = await response.text();
47
+ throw new Error(`Vision API error: ${response.status} - ${errorBody}`);
48
+ }
49
+
50
+ const data = await response.json() as {
51
+ choices: Array<{ message: { content: string } }>;
52
+ usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
53
+ };
54
+
55
+ const content = data.choices[0]?.message?.content || '';
56
+
57
+ return {
58
+ content,
59
+ usage: data.usage ? {
60
+ promptTokens: data.usage.prompt_tokens,
61
+ completionTokens: data.usage.completion_tokens,
62
+ totalTokens: data.usage.total_tokens,
63
+ } : undefined,
64
+ };
65
+ }
66
+ }
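Reviewer note: `ServerConfig.timeout` is loaded from `VISION_TIMEOUT`, but the `fetch` call in `analyze` does not pass a `signal`, so as written the timeout does not appear to be enforced. Below is a minimal sketch of one way to apply it with `AbortController`. `fetchWithTimeout` is a hypothetical helper, and the `fetchImpl` parameter exists only so the behavior can be exercised without a network.

```typescript
// Hypothetical helper: abort the request once timeoutSeconds have elapsed.
async function fetchWithTimeout<T>(
  url: string,
  timeoutSeconds: number,
  fetchImpl: (url: string, init: { signal: AbortSignal }) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutSeconds * 1000);
  try {
    return await fetchImpl(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer); // always release the timer, on success or failure
  }
}

// Stand-in for fetch that never responds and rejects only when aborted.
const neverResponds = (_url: string, init: { signal: AbortSignal }): Promise<string> =>
  new Promise((_resolve, reject) => {
    init.signal.addEventListener('abort', () => reject(new Error('aborted')));
  });

const attempt = fetchWithTimeout(
  'https://api.example.com/v1/chat/completions',
  0.01, // 10 ms, only to keep the demonstration fast
  neverResponds,
).then(() => 'completed', (error: Error) => error.message);

attempt.then((outcome) => console.log(outcome));
```

With the real `fetch`, the third argument could be something like `(u, init) => fetch(u, { method: 'POST', headers, body, signal: init.signal })`, with `config.timeout` supplied as the number of seconds.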
@@ -0,0 +1,45 @@
1
+ import { VisionService } from '../services/visionService.js';
2
+ import { imagePathToData, imageDataToDataUrl } from '../utils/imageProcessor.js';
3
+
4
+ export interface AnalyzeImageParams {
5
+ image_path: string;
6
+ prompt?: string;
7
+ }
8
+
9
+ export function createAnalyzeImageTool(visionService: VisionService) {
10
+ return {
11
+ name: 'analyze_image',
12
+ description: 'Analyzes an image and returns a detailed description. Supports PNG, JPEG, GIF, WebP, and BMP formats.',
13
+ inputSchema: {
14
+ type: 'object',
15
+ properties: {
16
+ image_path: {
17
+ type: 'string',
18
+ description: 'Path to the image file to analyze',
19
+ },
20
+ prompt: {
21
+ type: 'string',
22
+ description: 'Custom prompt for image analysis',
23
+ default: 'Describe this image in detail, including objects, text, colors, composition, and any notable features.',
24
+ },
25
+ },
26
+ required: ['image_path'],
27
+ },
28
+ handler: async (params: AnalyzeImageParams) => {
29
+ const { image_path, prompt } = params;
30
+
31
+ const imageData = imagePathToData(image_path);
32
+ const imageDataUrl = imageDataToDataUrl(imageData);
33
+
34
+ const result = await visionService.analyze({
35
+ imageDataUrl,
36
+ prompt: prompt || 'Describe this image in detail, including objects, text, colors, composition, and any notable features.',
37
+ maxTokens: 4096,
38
+ });
39
+
40
+ return {
41
+ content: [{ type: 'text', text: result.content }],
42
+ };
43
+ },
44
+ };
45
+ }
@@ -0,0 +1,44 @@
1
+ import { readFileSync } from 'fs';
2
+ import { extname, basename } from 'path';
3
+
4
+ export interface ImageData {
5
+ base64: string;
6
+ mimeType: string;
7
+ filename: string;
8
+ }
9
+
10
+ const MIME_TYPES: Record<string, string> = {
11
+ '.png': 'image/png',
12
+ '.jpg': 'image/jpeg',
13
+ '.jpeg': 'image/jpeg',
14
+ '.gif': 'image/gif',
15
+ '.webp': 'image/webp',
16
+ '.bmp': 'image/bmp',
17
+ };
18
+
19
+ export function getMimeType(filePath: string): string {
20
+ const ext = extname(filePath).toLowerCase();
21
+ return MIME_TYPES[ext] || 'application/octet-stream';
22
+ }
23
+
24
+ export function isValidImageFormat(filePath: string): boolean {
25
+ const ext = extname(filePath).toLowerCase();
26
+ return ext in MIME_TYPES;
27
+ }
28
+
29
+ export function imagePathToData(filePath: string): ImageData {
30
+ if (!isValidImageFormat(filePath)) {
31
+ throw new Error(`Unsupported image format: ${extname(filePath)}`);
32
+ }
33
+
34
+ const imageBuffer = readFileSync(filePath);
35
+ const base64 = imageBuffer.toString('base64');
36
+ const mimeType = getMimeType(filePath);
37
+ const filename = basename(filePath);
38
+
39
+ return { base64, mimeType, filename };
40
+ }
41
+
42
+ export function imageDataToDataUrl(imageData: ImageData): string {
43
+ return `data:${imageData.mimeType};base64,${imageData.base64}`;
44
+ }
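Note on the image utilities above: `getMimeType` and `isValidImageFormat` decide the type from the file extension alone, so a renamed file is passed through with whatever MIME type its name implies. A possible complement, sketched here as an assumption and not as part of the package, is to check the leading "magic" bytes of the file. `sniffMimeType` and `startsWith` are hypothetical helpers; the signatures are the standard ones for the five formats in `MIME_TYPES`.

```typescript
// Hypothetical helper, not part of the package: true if `bytes` contains
// `signature` starting at `offset`.
function startsWith(bytes: Uint8Array, signature: number[], offset = 0): boolean {
  if (bytes.length < offset + signature.length) return false;
  return signature.every((value, i) => bytes[offset + i] === value);
}

// Hypothetical helper: guess the MIME type from file contents.
// Returns undefined when no known signature matches.
function sniffMimeType(bytes: Uint8Array): string | undefined {
  if (startsWith(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'image/png';
  if (startsWith(bytes, [0xff, 0xd8, 0xff])) return 'image/jpeg';
  if (startsWith(bytes, [0x47, 0x49, 0x46, 0x38])) return 'image/gif'; // "GIF8"
  // WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
  if (
    startsWith(bytes, [0x52, 0x49, 0x46, 0x46]) &&
    startsWith(bytes, [0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return 'image/webp';
  }
  // "BM" is only two bytes, so this check is a weak heuristic.
  if (startsWith(bytes, [0x42, 0x4d])) return 'image/bmp';
  return undefined;
}
```

Because a Node.js `Buffer` is a `Uint8Array`, the result of `readFileSync(filePath)` could be passed to `sniffMimeType` directly. Whether a mismatch with the extension-derived type should raise an error or only override the MIME type is a design choice left to the caller.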
@@ -0,0 +1,63 @@
1
+ import { describe, it, expect, vi, beforeEach } from 'vitest';
2
+ import { loadConfig } from '../src/config/index.js';
3
+
4
+ describe('Config', () => {
5
+ const originalEnv = process.env;
6
+
7
+ beforeEach(() => {
8
+ vi.resetModules();
9
+ process.env = { ...originalEnv };
10
+ });
11
+
12
+ it('should load config from environment variables', () => {
13
+ process.env.VISION_API_KEY = 'test-key';
14
+ process.env.VISION_BASE_URL = 'https://test.com/v1';
15
+ process.env.VISION_MODEL = 'test-model';
16
+
17
+ const config = loadConfig();
18
+
19
+ expect(config.apiKey).toBe('test-key');
20
+ expect(config.baseUrl).toBe('https://test.com/v1');
21
+ expect(config.model).toBe('test-model');
22
+ });
23
+
24
+ it('should use fallback values for optional settings', () => {
25
+ process.env.VISION_API_KEY = 'test-key';
26
+
27
+ const config = loadConfig();
28
+
29
+ expect(config.maxTokens).toBe(4096);
30
+ expect(config.timeout).toBe(120);
31
+ });
32
+
33
+ it('should throw error when API key is missing', () => {
34
+ delete process.env.VISION_API_KEY;
35
+ delete process.env.OPENAI_API_KEY;
36
+
37
+ expect(() => loadConfig()).toThrow('VISION_API_KEY or OPENAI_API_KEY environment variable is required');
38
+ });
39
+
40
+ it('should prefer VISION_* variables over OPENAI_* variables', () => {
41
+ process.env.VISION_API_KEY = 'vision-key';
42
+ process.env.OPENAI_API_KEY = 'openai-key';
43
+ process.env.VISION_BASE_URL = 'https://vision.com/v1';
44
+ process.env.OPENAI_BASE_URL = 'https://openai.com/v1';
45
+
46
+ const config = loadConfig();
47
+
48
+ expect(config.apiKey).toBe('vision-key');
49
+ expect(config.baseUrl).toBe('https://vision.com/v1');
50
+ });
51
+
52
+ it('should use OPENAI_* variables when VISION_* are not set', () => {
53
+ process.env.OPENAI_API_KEY = 'openai-key';
54
+ process.env.OPENAI_BASE_URL = 'https://openai.com/v1';
55
+ process.env.OPENAI_MODEL = 'gpt-4o';
56
+
57
+ const config = loadConfig();
58
+
59
+ expect(config.apiKey).toBe('openai-key');
60
+ expect(config.baseUrl).toBe('https://openai.com/v1');
61
+ expect(config.model).toBe('gpt-4o');
62
+ });
63
+ });
@@ -0,0 +1,97 @@
1
+ import { describe, it, expect, beforeEach } from 'vitest';
2
+ import { writeFileSync, unlinkSync, mkdirSync, existsSync } from 'fs';
3
+ import { join } from 'path';
4
+ import { imagePathToData, imageDataToDataUrl, getMimeType, isValidImageFormat } from '../src/utils/imageProcessor.js';
5
+
6
+ describe('ImageProcessor', () => {
7
+ const testDir = '/tmp/simple-vision-mcp-tests';
8
+
9
+ beforeEach(() => {
10
+ if (!existsSync(testDir)) {
11
+ mkdirSync(testDir, { recursive: true });
12
+ }
13
+ });
14
+
15
+ describe('getMimeType', () => {
16
+ it('should return correct mime type for png', () => {
17
+ expect(getMimeType('/path/to/image.png')).toBe('image/png');
18
+ });
19
+
20
+ it('should return correct mime type for jpeg', () => {
21
+ expect(getMimeType('/path/to/image.jpg')).toBe('image/jpeg');
22
+ expect(getMimeType('/path/to/image.jpeg')).toBe('image/jpeg');
23
+ });
24
+
25
+ it('should return correct mime type for webp', () => {
26
+ expect(getMimeType('/path/to/image.webp')).toBe('image/webp');
27
+ });
28
+
29
+ it('should return correct mime type for gif', () => {
30
+ expect(getMimeType('/path/to/image.gif')).toBe('image/gif');
31
+ });
32
+
33
+ it('should return octet-stream for unknown formats', () => {
34
+ expect(getMimeType('/path/to/image.xyz')).toBe('application/octet-stream');
35
+ });
36
+ });
37
+
38
+ describe('isValidImageFormat', () => {
39
+ it('should return true for supported formats', () => {
40
+ expect(isValidImageFormat('image.png')).toBe(true);
41
+ expect(isValidImageFormat('image.jpg')).toBe(true);
42
+ expect(isValidImageFormat('image.jpeg')).toBe(true);
43
+ expect(isValidImageFormat('image.webp')).toBe(true);
44
+ expect(isValidImageFormat('image.gif')).toBe(true);
45
+ expect(isValidImageFormat('image.bmp')).toBe(true);
46
+ });
47
+
48
+ it('should return false for unsupported formats', () => {
49
+ expect(isValidImageFormat('image.pdf')).toBe(false);
50
+ expect(isValidImageFormat('image.txt')).toBe(false);
51
+ expect(isValidImageFormat('image.svg')).toBe(false);
52
+ });
53
+ });
54
+
55
+ describe('imagePathToData', () => {
56
+ it('should convert png file to base64 data', () => {
57
+ const testFile = join(testDir, 'test.png');
58
+ const pngData = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
59
+ writeFileSync(testFile, pngData);
60
+
61
+ const result = imagePathToData(testFile);
62
+
63
+ expect(result.mimeType).toBe('image/png');
64
+ expect(result.base64).toBe(pngData.toString('base64'));
65
+ expect(result.filename).toBe('test.png');
66
+
67
+ unlinkSync(testFile);
68
+ });
69
+
70
+ it('should throw error for unsupported format', () => {
71
+ const testFile = join(testDir, 'test.pdf');
72
+ writeFileSync(testFile, Buffer.from('test'));
73
+
74
+ expect(() => imagePathToData(testFile)).toThrow('Unsupported image format: .pdf');
75
+
76
+ unlinkSync(testFile);
77
+ });
78
+
79
+ it('should throw error for non-existent file', () => {
80
+ expect(() => imagePathToData('/non/existent/file.png')).toThrow();
81
+ });
82
+ });
83
+
84
+ describe('imageDataToDataUrl', () => {
85
+ it('should convert image data to data URL', () => {
86
+ const imageData = {
87
+ base64: 'testbase64',
88
+ mimeType: 'image/png',
89
+ filename: 'test.png',
90
+ };
91
+
92
+ const result = imageDataToDataUrl(imageData);
93
+
94
+ expect(result).toBe('data:image/png;base64,testbase64');
95
+ });
96
+ });
97
+ });
@@ -0,0 +1,77 @@
1
+ import { describe, it, expect, vi, beforeEach } from 'vitest';
2
+ import { VisionService } from '../src/services/visionService.js';
3
+ import { ServerConfig } from '../src/config/index.js';
4
+
5
+ describe('VisionService', () => {
6
+ const mockConfig: ServerConfig = {
7
+ apiKey: 'test-api-key',
8
+ baseUrl: 'https://api.test.com/v1',
9
+ model: 'test-model',
10
+ maxTokens: 4096,
11
+ timeout: 120,
12
+ };
13
+
14
+ let visionService: VisionService;
15
+ let mockFetch: ReturnType<typeof vi.fn>;
16
+
17
+ beforeEach(() => {
18
+ mockFetch = vi.fn();
19
+ global.fetch = mockFetch;
20
+ visionService = new VisionService(mockConfig);
21
+ });
22
+
23
+ it('should analyze image successfully', async () => {
24
+ const mockResponse = {
25
+ ok: true,
26
+ json: () => Promise.resolve({
27
+ choices: [{ message: { content: 'Test description' } }],
28
+ usage: { prompt_tokens: 100, completion_tokens: 50, total_tokens: 150 },
29
+ }),
30
+ };
31
+ mockFetch.mockResolvedValue(mockResponse);
32
+
33
+ const result = await visionService.analyze({
34
+ imageDataUrl: 'data:image/png;base64,test',
35
+ prompt: 'Describe this image',
36
+ maxTokens: 4096,
37
+ });
38
+
39
+ expect(result.content).toBe('Test description');
40
+ expect(result.usage?.totalTokens).toBe(150);
41
+ });
42
+
43
+ it('should throw error on API failure', async () => {
44
+ const mockResponse = {
45
+ ok: false,
46
+ status: 401,
47
+ text: () => Promise.resolve('Unauthorized'),
48
+ };
49
+ mockFetch.mockResolvedValue(mockResponse);
50
+
51
+ await expect(
52
+ visionService.analyze({
53
+ imageDataUrl: 'data:image/png;base64,test',
54
+ prompt: 'Describe this image',
55
+ maxTokens: 4096,
56
+ })
57
+ ).rejects.toThrow('Vision API error: 401 - Unauthorized');
58
+ });
59
+
60
+ it('should handle empty response', async () => {
61
+ const mockResponse = {
62
+ ok: true,
63
+ json: () => Promise.resolve({
64
+ choices: [{ message: { content: '' } }],
65
+ }),
66
+ };
67
+ mockFetch.mockResolvedValue(mockResponse);
68
+
69
+ const result = await visionService.analyze({
70
+ imageDataUrl: 'data:image/png;base64,test',
71
+ prompt: 'Describe this image',
72
+ maxTokens: 4096,
73
+ });
74
+
75
+ expect(result.content).toBe('');
76
+ });
77
+ });
package/tsconfig.json ADDED
@@ -0,0 +1,20 @@
1
+ {
2
+ "compilerOptions": {
3
+ "target": "ES2022",
4
+ "module": "NodeNext",
5
+ "moduleResolution": "NodeNext",
6
+ "lib": ["ES2022"],
7
+ "outDir": "./dist",
8
+ "rootDir": "./src",
9
+ "strict": true,
10
+ "esModuleInterop": true,
11
+ "skipLibCheck": true,
12
+ "forceConsistentCasingInFileNames": true,
13
+ "resolveJsonModule": true,
14
+ "declaration": true,
15
+ "declarationMap": true,
16
+ "sourceMap": true
17
+ },
18
+ "include": ["src/**/*"],
19
+ "exclude": ["node_modules", "dist", "tests"]
20
+ }
@@ -0,0 +1,12 @@
1
+ import { defineConfig } from 'vitest/config';
2
+
3
+ export default defineConfig({
4
+ test: {
5
+ globals: true,
6
+ environment: 'node',
7
+ include: ['tests/**/*.test.ts'],
8
+ coverage: {
9
+ reporter: ['text', 'json', 'html'],
10
+ },
11
+ },
12
+ });