@it-master/image-recognition-mcp 1.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
MIT License

Copyright (c) 2025 akirose

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md ADDED
# Image Recognition MCP Server

A Model Context Protocol (MCP) server that provides AI-powered image recognition and description capabilities using OpenAI-compatible vision models.

## Overview

This MCP server enables AI assistants to analyze and describe images through a simple URL-based interface. It supports OpenAI's vision models as well as OpenAI-compatible local models (such as LM Studio and Ollama), providing detailed image descriptions and making it easy to integrate image analysis into your AI workflows.

## Features

- **Image Analysis**: Analyze images from URLs and get detailed descriptions
- **Flexible Model Support**: Works with OpenAI's vision models and OpenAI-compatible local models (LM Studio, Ollama, etc.)
- **MCP Protocol**: Fully compatible with the Model Context Protocol standard
- **TypeScript**: Built with TypeScript for type safety and a better development experience
- **Simple API**: Easy-to-use interface for image description requests

## Installation

### Prerequisites

- Node.js 18+
- npm or yarn
- OpenAI API key or a local vision model server (e.g., LM Studio, Ollama)

### MCP Client Configuration

To use this server with an MCP client, add the following configuration:

```json
{
  "mcpServers": {
    "image-recognition": {
      "command": "npx",
      "args": ["-y", "@it-master/image-recognition-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-actual-openai-api-key-here"
      }
    }
  }
}
```

To allow access to image files from any path, set `ALLOW_ALL_PATHS` to `true`:

```json
{
  "mcpServers": {
    "image-recognition": {
      "command": "npx",
      "args": ["-y", "@it-master/image-recognition-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-actual-openai-api-key-here",
        "ALLOW_ALL_PATHS": "true"
      }
    }
  }
}
```

**⚠️ IMPORTANT:** The `env` section with your API key is required; the server cannot function without it. For local models, use any placeholder value for `OPENAI_API_KEY` and set `OPENAI_BASE_URL` to point at your local server.

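As an example, a local-model configuration might look like the following (the base URL and model name are illustrative — they match LM Studio's default endpoint and the model used by this package's integration tests; adjust them to your setup):

```json
{
  "mcpServers": {
    "image-recognition": {
      "command": "npx",
      "args": ["-y", "@it-master/image-recognition-mcp"],
      "env": {
        "OPENAI_API_KEY": "placeholder",
        "OPENAI_BASE_URL": "http://127.0.0.1:1234/v1",
        "OPENAI_MODEL": "qwen/qwen3-vl-4b"
      }
    }
  }
}
```
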
### Environment Variables

The server supports the following environment variables:

- `OPENAI_API_KEY` - Your OpenAI API key, or any placeholder value when using local models (required)
- `OPENAI_BASE_URL` - Base URL for the OpenAI API or an OpenAI-compatible API server (optional, defaults to OpenAI's official API)
  - Example for LM Studio: `"http://127.0.0.1:1234/v1"`
  - Example for Ollama: `"http://localhost:11434/v1"`
- `OPENAI_MODEL` - The vision model to use for image recognition (optional, defaults to `"gpt-5-mini"`)
  - For OpenAI: `"gpt-5-mini"`, `"gpt-4o"`, `"gpt-4o-mini"`, etc.
  - For local models: `"llava"`, `"qwen/qwen3-vl-4b"`, or any locally available vision model
- `ALLOWED_IMAGE_PATHS` - Comma-separated list of allowed local file paths (optional, defaults to `"./images,./assets"`)
  - Example: `"./images,./assets,./downloads"`
- `ALLOW_ALL_PATHS` - Set to `"true"` to allow access to image files from any path; when enabled, only image file extensions (.jpg, .jpeg, .png, .gif, .webp) are accepted, for security (optional, defaults to `false`)
- `ALLOWED_DOMAINS` - Comma-separated list of allowed URL domains for enhanced security (optional, defaults to allowing all domains)
  - Example: `"example.com,cdn.example.com,images.example.org"`
  - When not set: all domains are allowed
  - When set: only the listed domains are allowed for URL-based image requests

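These variables are resolved once at startup; a simplified sketch of the parsing, mirroring the logic in `dist/index.js`, looks like this:

```javascript
// How the server derives its settings from the environment (simplified)
const model = process.env.OPENAI_MODEL || "gpt-5-mini";

// Comma-separated lists are split and trimmed
const allowedPaths = process.env.ALLOWED_IMAGE_PATHS
  ? process.env.ALLOWED_IMAGE_PATHS.split(",").map(p => p.trim())
  : ["./images", "./assets"];

// Anything other than the exact string "true" leaves this disabled
const allowAllPaths = process.env.ALLOW_ALL_PATHS === "true";

// null means no whitelist, i.e. all domains are allowed
const allowedDomains = process.env.ALLOWED_DOMAINS
  ? process.env.ALLOWED_DOMAINS.split(",").map(d => d.trim())
  : null;
```
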
## Usage

### Available Tools

#### `describe-image`

Analyzes an image from a URL or local file path and provides a detailed description.

**Parameters:**

- `imageUrl` (string): The URL of the image to analyze, or a local file path
- `prompt` (string, optional): The question or prompt to ask about the image (defaults to "what's in this image?")

**Example with URL:**

```json
{
  "tool": "describe-image",
  "arguments": {
    "imageUrl": "https://example.com/image.jpg",
    "prompt": "what's in this image?"
  }
}
```

**Example with local file:**

```json
{
  "tool": "describe-image",
  "arguments": {
    "imageUrl": "./images/my-image.png",
    "prompt": "Describe the objects in this image"
  }
}
```

**Response:**

```json
{
  "content": [
    {
      "type": "text",
      "text": "The image shows a beautiful sunset over a mountain landscape with vibrant orange and pink colors in the sky..."
    }
  ]
}
```

### Integration with AI Assistants

This MCP server can be integrated with various AI assistants that support the MCP protocol, such as:

- Claude Desktop
- Other MCP-compatible AI systems

## Development

### Project Structure

```
image-recognition-mcp/
├── src/
│   ├── index.ts                            # Main server implementation
│   ├── path-validator.ts                   # Path validation and security functions
│   └── image-processor.ts                  # Image processing utilities
├── test/
│   ├── index.test.ts                       # Unit tests
│   ├── describe-image-integration.test.ts  # Integration tests
│   ├── test.png                            # Test image
│   └── README.md                           # Test documentation
├── dist/                                   # Compiled JavaScript output
├── package.json                            # Project dependencies and scripts
├── tsconfig.json                           # TypeScript configuration
└── README.md                               # This file
```

### Running Tests

The project includes both unit tests and integration tests:

```bash
# Run all tests
npm test

# Run unit tests only
npm run test:unit

# Run integration tests with a local OpenAI-compatible server
npm run test:integration
```

**Integration test requirements:**

- A running OpenAI-compatible API server at `http://127.0.0.1:1234/v1`
- The server should support vision models (e.g., `qwen/qwen3-vl-4b`)
- You can use LM Studio, Ollama, or other compatible servers
- The integration tests use the `OPENAI_BASE_URL` and `OPENAI_MODEL` environment variables

The integration tests will:

- Test actual API calls to the vision model
- Verify image processing with the test image (`test/test.png`)
- Validate the complete MCP tool workflow with both default and custom prompts
- Test error handling and edge cases

### Security Features

The server includes several security features:

- **Path Validation**: Restricts local file access to allowed directories
- **Extension Validation**: Only allows specific image file extensions (.jpg, .jpeg, .png, .gif, .webp)
- **Domain Restriction**: Optional URL domain whitelist for enhanced security
- **File Existence Checks**: Validates that files exist before processing

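The domain restriction in particular amounts to a hostname whitelist. A minimal sketch follows — `isDomainAllowed` is an illustrative helper, not a package export; the shipped `validateUrlDomain` in `dist/path-validator.js` throws on a disallowed hostname instead of returning a boolean:

```javascript
// Illustrative boolean variant of the domain whitelist check.
function isDomainAllowed(url, allowedDomains) {
  if (!allowedDomains) return true; // no whitelist configured: allow all domains
  return allowedDomains.includes(new URL(url).hostname);
}
```
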
### Error Handling

The server includes robust error handling for:

- Invalid image URLs
- Unauthorized file paths or domains
- Network connectivity issues
- OpenAI API errors
- Invalid input parameters
- Unsupported file formats

## Troubleshooting

### Common Issues

**Server fails to start or doesn't work:**

- ✅ **Check that the OpenAI API key is set**: This is the most common cause of issues
  ```bash
  echo $OPENAI_API_KEY  # Should show your API key
  ```
- ✅ **Verify the API key is valid**: Test it against OpenAI's API directly
- ✅ **Check that the API key has sufficient credits**: Ensure your OpenAI account has available credits

**"Authentication failed" errors:**

- The OpenAI API key is missing or invalid
- Set the environment variable: `export OPENAI_API_KEY="your-key"`

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## License

This project is licensed under the MIT License. See the `LICENSE` file for details.

## Support

For support, please open an issue in the GitHub repository or contact the maintainer.
@@ -0,0 +1,3 @@
#!/usr/bin/env node

import "../dist/index.js"
package/dist/image-processor.js ADDED
import fs from "fs";
import path from "path";

/**
 * Get MIME type from file extension
 */
export function getMimeType(filePath) {
    const ext = path.extname(filePath).toLowerCase();
    switch (ext) {
        case '.jpg':
        case '.jpeg':
            return 'image/jpeg';
        case '.png':
            return 'image/png';
        case '.gif':
            return 'image/gif';
        case '.webp':
            return 'image/webp';
        default:
            throw new Error(`Unsupported extension for MIME type: ${ext}`);
    }
}

/**
 * Read image file and convert to base64 data URL
 */
export function convertImageToDataUrl(filePath) {
    const fileBuffer = fs.readFileSync(filePath);
    const base64Data = fileBuffer.toString('base64');
    const mimeType = getMimeType(filePath);
    return `data:${mimeType};base64,${base64Data}`;
}

/**
 * Create image content object for local file
 */
export function createLocalImageContent(filePath) {
    const dataUrl = convertImageToDataUrl(filePath);
    return {
        type: "image_url",
        image_url: { url: dataUrl }
    };
}

/**
 * Create image content object for URL
 */
export function createUrlImageContent(imageUrl) {
    return {
        type: "image_url",
        image_url: { url: imageUrl }
    };
}
package/dist/index.js ADDED
#!/usr/bin/env node
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import OpenAI from "openai";
import { z } from "zod";
import { isPathLocal, validateLocalPath, validateUrlDomain } from "./path-validator.js";
import { createLocalImageContent, createUrlImageContent } from "./image-processor.js";

const openai = new OpenAI({
    baseURL: process.env.OPENAI_BASE_URL,
});

// Get the model from environment variable or use default
const model = process.env.OPENAI_MODEL || "gpt-5-mini";
// Get allowed paths from environment variable or use default
const allowedPaths = process.env.ALLOWED_IMAGE_PATHS
    ? process.env.ALLOWED_IMAGE_PATHS.split(',').map(p => p.trim())
    : ['./images', './assets'];
// Check if all paths should be allowed (only for image files)
const allowAllPaths = process.env.ALLOW_ALL_PATHS === 'true';
// Get allowed domains from environment variable (optional security feature)
const allowedDomains = process.env.ALLOWED_DOMAINS
    ? process.env.ALLOWED_DOMAINS.split(',').map(d => d.trim())
    : null; // null means all domains are allowed

// Create an MCP server
const server = new McpServer({
    name: "Image Recognition",
    version: "1.0.0",
}, {
    capabilities: {
        tools: {
            list: true,
            call: true,
        },
    },
});

server.registerTool("describe-image", {
    title: "Describe Image",
    description: "Describe an image by URL or local file path with a custom prompt",
    inputSchema: {
        imageUrl: z.string().describe("The image url or local file path to describe"),
        prompt: z.string().describe("The question or prompt to ask about the image").default("what's in this image?"),
    },
}, async ({ imageUrl, prompt }) => {
    let imageContent;
    if (isPathLocal(imageUrl)) {
        // Validate the local path, then convert the file to image content
        const absolutePath = validateLocalPath(imageUrl, allowedPaths, allowAllPaths);
        imageContent = createLocalImageContent(absolutePath);
    }
    else {
        // Handle as URL - validate the domain if ALLOWED_DOMAINS is set
        validateUrlDomain(imageUrl, allowedDomains);
        imageContent = createUrlImageContent(imageUrl);
    }
    try {
        const response = await openai.chat.completions.create({
            model: model,
            messages: [
                {
                    role: "user",
                    content: [
                        { type: "text", text: prompt },
                        imageContent,
                    ],
                },
            ],
        });
        return {
            content: [{ type: "text", text: response.choices[0].message.content ?? "" }],
        };
    }
    catch (error) {
        console.error("Error calling OpenAI API:", error);
        return {
            content: [{ type: "text", text: `Error describing the image: ${error instanceof Error ? error.message : 'Unknown error'}. Please check the server logs.` }],
        };
    }
});

// Start receiving messages on stdin and sending messages on stdout
const transport = new StdioServerTransport();
await server.connect(transport);
package/dist/path-validator.js ADDED
import path from "path";
import fs from "fs";

// Allowed image file extensions
const allowedExtensions = ['.jpg', '.jpeg', '.png', '.gif', '.webp'];

/**
 * Check if the given path is a local file path
 */
export function isPathLocal(imagePath) {
    return imagePath.startsWith('/') || imagePath.startsWith('./') || imagePath.startsWith('../');
}

/**
 * Validate file extension
 */
export function validateExtension(filePath) {
    const ext = path.extname(filePath).toLowerCase();
    if (!allowedExtensions.includes(ext)) {
        throw new Error(`Invalid file type: ${ext}. Allowed extensions: ${allowedExtensions.join(', ')}`);
    }
}

/**
 * Check if the file path is within allowed directories
 */
export function isPathAllowed(absolutePath, allowedPaths) {
    return allowedPaths.some(allowedPath => {
        const resolved = path.resolve(allowedPath);
        // Require an exact match or a separator boundary so that e.g.
        // "/images-evil" is not treated as being inside an allowed "/images"
        return absolutePath === resolved || absolutePath.startsWith(resolved + path.sep);
    });
}
/**
 * Validate that the file exists
 */
export function validateFileExists(filePath) {
    if (!fs.existsSync(filePath)) {
        throw new Error(`File not found: ${filePath}`);
    }
}

/**
 * Validate local file path with all security checks
 */
export function validateLocalPath(imageUrl, allowedPaths, allowAllPaths) {
    const absolutePath = path.resolve(imageUrl);
    // Check the file extension first
    validateExtension(absolutePath);
    // Ensure the path is within an allowed directory (unless all paths are allowed)
    if (!allowAllPaths && !isPathAllowed(absolutePath, allowedPaths)) {
        throw new Error(`File path not allowed: ${imageUrl}. Allowed paths: ${allowedPaths.join(', ')}`);
    }
    // Check that the file exists
    validateFileExists(absolutePath);
    return absolutePath;
}

/**
 * Validate URL domain (optional security feature)
 */
export function validateUrlDomain(url, allowedDomains) {
    if (allowedDomains) {
        const parsedUrl = new URL(url);
        if (!allowedDomains.includes(parsedUrl.hostname)) {
            throw new Error(`URL domain not allowed: ${parsedUrl.hostname}. Allowed domains: ${allowedDomains.join(', ')}`);
        }
    }
}
package/package.json ADDED
{
  "name": "@it-master/image-recognition-mcp",
  "version": "1.1.2",
  "mcpName": "io.github.shalevshalit/image-recognition-mcp",
  "description": "MCP server for AI-powered image recognition and description using OpenAI-compatible vision models.",
  "type": "module",
  "files": [
    "dist/",
    "bin/",
    "README.md",
    "LICENSE"
  ],
  "main": "dist/index.js",
  "bin": {
    "@xitmasterx/image-recognition-mcp": "dist/index.js"
  },
  "repository": {
    "type": "git",
    "url": "git+https://github.com/xITmasterx/image-recognition-mcp.git"
  },
  "publishConfig": {
    "access": "public"
  },
  "author": "akirose, xITmasterx",
  "license": "MIT",
  "engines": {
    "node": ">=18.0.0"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.15.0",
    "openai": "^5.8.3"
  },
  "devDependencies": {
    "@types/node": "^24.0.12",
    "shx": "^0.4.0",
    "typescript": "^5.0.0",
    "vitest": "^1.6.0"
  },
  "scripts": {
    "test": "vitest run",
    "test:unit": "vitest run test/index.test.ts",
    "test:integration": "OPENAI_BASE_URL=http://127.0.0.1:1234/v1 OPENAI_MODEL=qwen/qwen3-vl-4b ALLOWED_IMAGE_PATHS=./test vitest run test/describe-image-integration.test.ts",
    "start": "node index.js",
    "build": "tsc && shx chmod +x dist/*.js",
    "prepare": "npm run build",
    "prepublishOnly": "npm run build"
  }
}