@it-master/image-recognition-mcp 1.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
MIT License

Copyright (c) 2025 akirose

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md ADDED
# Image Recognition MCP Server

A Model Context Protocol (MCP) server that provides AI-powered image recognition and description capabilities using OpenAI-compatible vision models.

## Overview

This MCP server enables AI assistants to analyze and describe images through a simple URL-based interface. It supports OpenAI's vision models as well as OpenAI-compatible local models (such as LM Studio and Ollama), providing detailed image descriptions and making it easy to integrate image analysis into your AI workflows.

## Features

- **Image Analysis**: Analyze images from URLs and get detailed descriptions
- **Flexible Model Support**: Works with OpenAI's vision models and OpenAI-compatible local models (LM Studio, Ollama, etc.)
- **MCP Protocol**: Fully compatible with the Model Context Protocol standard
- **TypeScript**: Built with TypeScript for type safety and a better development experience
- **Simple API**: Easy-to-use interface for image description requests

## Installation

### Prerequisites

- Node.js 18+
- npm or yarn
- OpenAI API key or a local vision model server (e.g., LM Studio, Ollama)

### MCP Client Configuration

To use this server with an MCP client, add the following configuration:

```json
{
  "mcpServers": {
    "image-recognition": {
      "command": "npx",
      "args": ["-y", "@it-master/image-recognition-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-actual-openai-api-key-here"
      }
    }
  }
}
```

To allow access to image files from any path, set `ALLOW_ALL_PATHS` to `true`:

```json
{
  "mcpServers": {
    "image-recognition": {
      "command": "npx",
      "args": ["-y", "@it-master/image-recognition-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-actual-openai-api-key-here",
        "ALLOW_ALL_PATHS": "true"
      }
    }
  }
}
```

**⚠️ IMPORTANT:** The `env` section with your API key is required; the server cannot function without it. For local models, use any placeholder value for `OPENAI_API_KEY` and set `OPENAI_BASE_URL` to point at your local server.

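As an example, a local-model configuration might look like the following (the base URL and model name are illustrative — they match LM Studio's default endpoint and the model used by this package's integration tests; adjust them to your setup):

```json
{
  "mcpServers": {
    "image-recognition": {
      "command": "npx",
      "args": ["-y", "@it-master/image-recognition-mcp"],
      "env": {
        "OPENAI_API_KEY": "placeholder",
        "OPENAI_BASE_URL": "http://127.0.0.1:1234/v1",
        "OPENAI_MODEL": "qwen/qwen3-vl-4b"
      }
    }
  }
}
```
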
### Environment Variables

The server supports the following environment variables:

- `OPENAI_API_KEY` - Your OpenAI API key, or any placeholder value when using local models (required)
- `OPENAI_BASE_URL` - Base URL for the OpenAI API or an OpenAI-compatible API server (optional, defaults to OpenAI's official API)
  - Example for LM Studio: `"http://127.0.0.1:1234/v1"`
  - Example for Ollama: `"http://localhost:11434/v1"`
- `OPENAI_MODEL` - The vision model to use for image recognition (optional, defaults to `"gpt-5-mini"`)
  - For OpenAI: `"gpt-5-mini"`, `"gpt-4o"`, `"gpt-4o-mini"`, etc.
  - For local models: `"llava"`, `"qwen/qwen3-vl-4b"`, or any locally available vision model
- `ALLOWED_IMAGE_PATHS` - Comma-separated list of allowed local file paths (optional, defaults to `"./images,./assets"`)
  - Example: `"./images,./assets,./downloads"`
- `ALLOW_ALL_PATHS` - Set to `"true"` to allow access to image files from any path; when enabled, only image file extensions (.jpg, .jpeg, .png, .gif, .webp) are accepted, for security (optional, defaults to `false`)
- `ALLOWED_DOMAINS` - Comma-separated list of allowed URL domains for enhanced security (optional, defaults to allowing all domains)
  - Example: `"example.com,cdn.example.com,images.example.org"`
  - When not set: all domains are allowed
  - When set: only the listed domains are allowed for URL-based image requests

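These variables are resolved once at startup; a simplified sketch of the parsing, mirroring the logic in `dist/index.js`, looks like this:

```javascript
// How the server derives its settings from the environment (simplified)
const model = process.env.OPENAI_MODEL || "gpt-5-mini";

// Comma-separated lists are split and trimmed
const allowedPaths = process.env.ALLOWED_IMAGE_PATHS
  ? process.env.ALLOWED_IMAGE_PATHS.split(",").map(p => p.trim())
  : ["./images", "./assets"];

// Anything other than the exact string "true" leaves this disabled
const allowAllPaths = process.env.ALLOW_ALL_PATHS === "true";

// null means no whitelist, i.e. all domains are allowed
const allowedDomains = process.env.ALLOWED_DOMAINS
  ? process.env.ALLOWED_DOMAINS.split(",").map(d => d.trim())
  : null;
```
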
## Usage

### Available Tools

#### `describe-image`

Analyzes an image from a URL or local file path and provides a detailed description.

**Parameters:**

- `imageUrl` (string): The URL of the image to analyze, or a local file path
- `prompt` (string, optional): The question or prompt to ask about the image (defaults to "what's in this image?")

**Example with URL:**

```json
{
  "tool": "describe-image",
  "arguments": {
    "imageUrl": "https://example.com/image.jpg",
    "prompt": "what's in this image?"
  }
}
```

**Example with local file:**

```json
{
  "tool": "describe-image",
  "arguments": {
    "imageUrl": "./images/my-image.png",
    "prompt": "Describe the objects in this image"
  }
}
```

**Response:**

```json
{
  "content": [
    {
      "type": "text",
      "text": "The image shows a beautiful sunset over a mountain landscape with vibrant orange and pink colors in the sky..."
    }
  ]
}
```

### Integration with AI Assistants

This MCP server can be integrated with various AI assistants that support the MCP protocol, such as:

- Claude Desktop
- Other MCP-compatible AI systems

## Development

### Project Structure

```
image-recognition-mcp/
├── src/
│   ├── index.ts                            # Main server implementation
│   ├── path-validator.ts                   # Path validation and security functions
│   └── image-processor.ts                  # Image processing utilities
├── test/
│   ├── index.test.ts                       # Unit tests
│   ├── describe-image-integration.test.ts  # Integration tests
│   ├── test.png                            # Test image
│   └── README.md                           # Test documentation
├── dist/                                   # Compiled JavaScript output
├── package.json                            # Project dependencies and scripts
├── tsconfig.json                           # TypeScript configuration
└── README.md                               # This file
```

### Running Tests

The project includes both unit tests and integration tests:

```bash
# Run all tests
npm test

# Run unit tests only
npm run test:unit

# Run integration tests with a local OpenAI-compatible server
npm run test:integration
```

**Integration test requirements:**

- A running OpenAI-compatible API server at `http://127.0.0.1:1234/v1`
- The server should support vision models (e.g., `qwen/qwen3-vl-4b`)
- You can use LM Studio, Ollama, or other compatible servers
- The integration tests use the `OPENAI_BASE_URL` and `OPENAI_MODEL` environment variables

The integration tests will:

- Test actual API calls to the vision model
- Verify image processing with the test image (`test/test.png`)
- Validate the complete MCP tool workflow with both default and custom prompts
- Test error handling and edge cases

### Security Features

The server includes several security features:

- **Path Validation**: Restricts local file access to allowed directories
- **Extension Validation**: Only allows specific image file extensions (.jpg, .jpeg, .png, .gif, .webp)
- **Domain Restriction**: Optional URL domain whitelist for enhanced security
- **File Existence Checks**: Validates that files exist before processing

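The domain restriction in particular amounts to a hostname whitelist. A minimal sketch follows — `isDomainAllowed` is an illustrative helper, not a package export; the shipped `validateUrlDomain` in `dist/path-validator.js` throws on a disallowed hostname instead of returning a boolean:

```javascript
// Illustrative boolean variant of the domain whitelist check.
function isDomainAllowed(url, allowedDomains) {
  if (!allowedDomains) return true; // no whitelist configured: allow all domains
  return allowedDomains.includes(new URL(url).hostname);
}
```
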
### Error Handling

The server includes robust error handling for:

- Invalid image URLs
- Unauthorized file paths or domains
- Network connectivity issues
- OpenAI API errors
- Invalid input parameters
- Unsupported file formats

## Troubleshooting

### Common Issues

**Server fails to start or doesn't work:**

- ✅ **Check that the OpenAI API key is set**: This is the most common cause of issues
  ```bash
  echo $OPENAI_API_KEY  # Should show your API key
  ```
- ✅ **Verify the API key is valid**: Test it against OpenAI's API directly
- ✅ **Check that the API key has sufficient credits**: Ensure your OpenAI account has available credits

**"Authentication failed" errors:**

- The OpenAI API key is missing or invalid
- Set the environment variable: `export OPENAI_API_KEY="your-key"`

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## License

This project is licensed under the MIT License. See the `LICENSE` file for details.

## Support

For support, please open an issue in the GitHub repository or contact the maintainer.
@@ -0,0 +1,3 @@
#!/usr/bin/env node

import "../dist/index.js"
package/dist/image-processor.js ADDED
import fs from "fs";
import path from "path";

/**
 * Get MIME type from file extension
 */
export function getMimeType(filePath) {
    const ext = path.extname(filePath).toLowerCase();
    switch (ext) {
        case '.jpg':
        case '.jpeg':
            return 'image/jpeg';
        case '.png':
            return 'image/png';
        case '.gif':
            return 'image/gif';
        case '.webp':
            return 'image/webp';
        default:
            throw new Error(`Unsupported extension for MIME type: ${ext}`);
    }
}

/**
 * Read image file and convert to base64 data URL
 */
export function convertImageToDataUrl(filePath) {
    const fileBuffer = fs.readFileSync(filePath);
    const base64Data = fileBuffer.toString('base64');
    const mimeType = getMimeType(filePath);
    return `data:${mimeType};base64,${base64Data}`;
}

/**
 * Create image content object for local file
 */
export function createLocalImageContent(filePath) {
    const dataUrl = convertImageToDataUrl(filePath);
    return {
        type: "image_url",
        image_url: { url: dataUrl }
    };
}

/**
 * Create image content object for URL
 */
export function createUrlImageContent(imageUrl) {
    return {
        type: "image_url",
        image_url: { url: imageUrl }
    };
}
package/dist/index.js ADDED
#!/usr/bin/env node
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import OpenAI from "openai";
import { z } from "zod";
import { isPathLocal, validateLocalPath, validateUrlDomain } from "./path-validator.js";
import { createLocalImageContent, createUrlImageContent } from "./image-processor.js";

const openai = new OpenAI({
    baseURL: process.env.OPENAI_BASE_URL,
});

// Get the model from environment variable or use default
const model = process.env.OPENAI_MODEL || "gpt-5-mini";
// Get allowed paths from environment variable or use default
const allowedPaths = process.env.ALLOWED_IMAGE_PATHS
    ? process.env.ALLOWED_IMAGE_PATHS.split(',').map(p => p.trim())
    : ['./images', './assets'];
// Check if all paths should be allowed (only for image files)
const allowAllPaths = process.env.ALLOW_ALL_PATHS === 'true';
// Get allowed domains from environment variable (optional security feature)
const allowedDomains = process.env.ALLOWED_DOMAINS
    ? process.env.ALLOWED_DOMAINS.split(',').map(d => d.trim())
    : null; // null means all domains are allowed

// Create an MCP server
const server = new McpServer({
    name: "Image Recognition",
    version: "1.0.0",
}, {
    capabilities: {
        tools: {
            list: true,
            call: true,
        },
    },
});

server.registerTool("describe-image", {
    title: "Describe Image",
    description: "Describe an image by URL or local file path with a custom prompt",
    inputSchema: {
        imageUrl: z.string().describe("The image url or local file path to describe"),
        prompt: z.string().describe("The question or prompt to ask about the image").default("what's in this image?"),
    },
}, async ({ imageUrl, prompt }) => {
    let imageContent;
    if (isPathLocal(imageUrl)) {
        // Validate the local path, then convert the file to image content
        const absolutePath = validateLocalPath(imageUrl, allowedPaths, allowAllPaths);
        imageContent = createLocalImageContent(absolutePath);
    }
    else {
        // Handle as URL - validate the domain if ALLOWED_DOMAINS is set
        validateUrlDomain(imageUrl, allowedDomains);
        imageContent = createUrlImageContent(imageUrl);
    }
    try {
        const response = await openai.chat.completions.create({
            model: model,
            messages: [
                {
                    role: "user",
                    content: [
                        { type: "text", text: prompt },
                        imageContent,
                    ],
                },
            ],
        });
        return {
            content: [{ type: "text", text: response.choices[0].message.content ?? "" }],
        };
    }
    catch (error) {
        console.error("Error calling OpenAI API:", error);
        return {
            content: [{ type: "text", text: `Error describing the image: ${error instanceof Error ? error.message : 'Unknown error'}. Please check the server logs.` }],
        };
    }
});

// Start receiving messages on stdin and sending messages on stdout
const transport = new StdioServerTransport();
await server.connect(transport);
package/dist/path-validator.js ADDED
import path from "path";
import fs from "fs";

// Allowed image file extensions
const allowedExtensions = ['.jpg', '.jpeg', '.png', '.gif', '.webp'];

/**
 * Check if the given path is a local file path
 */
export function isPathLocal(imagePath) {
    return imagePath.startsWith('/') || imagePath.startsWith('./') || imagePath.startsWith('../');
}

/**
 * Validate file extension
 */
export function validateExtension(filePath) {
    const ext = path.extname(filePath).toLowerCase();
    if (!allowedExtensions.includes(ext)) {
        throw new Error(`Invalid file type: ${ext}. Allowed extensions: ${allowedExtensions.join(', ')}`);
    }
}

/**
 * Check if the file path is within allowed directories
 */
export function isPathAllowed(absolutePath, allowedPaths) {
    return allowedPaths.some(allowedPath => {
        const resolved = path.resolve(allowedPath);
        // Require an exact match or a separator boundary so that e.g.
        // "/images-evil" is not treated as being inside an allowed "/images"
        return absolutePath === resolved || absolutePath.startsWith(resolved + path.sep);
    });
}
/**
 * Validate that the file exists
 */
export function validateFileExists(filePath) {
    if (!fs.existsSync(filePath)) {
        throw new Error(`File not found: ${filePath}`);
    }
}

/**
 * Validate local file path with all security checks
 */
export function validateLocalPath(imageUrl, allowedPaths, allowAllPaths) {
    const absolutePath = path.resolve(imageUrl);
    // Check the file extension first
    validateExtension(absolutePath);
    // Ensure the path is within an allowed directory (unless all paths are allowed)
    if (!allowAllPaths && !isPathAllowed(absolutePath, allowedPaths)) {
        throw new Error(`File path not allowed: ${imageUrl}. Allowed paths: ${allowedPaths.join(', ')}`);
    }
    // Check that the file exists
    validateFileExists(absolutePath);
    return absolutePath;
}

/**
 * Validate URL domain (optional security feature)
 */
export function validateUrlDomain(url, allowedDomains) {
    if (allowedDomains) {
        const parsedUrl = new URL(url);
        if (!allowedDomains.includes(parsedUrl.hostname)) {
            throw new Error(`URL domain not allowed: ${parsedUrl.hostname}. Allowed domains: ${allowedDomains.join(', ')}`);
        }
    }
}
package/package.json ADDED
{
  "name": "@it-master/image-recognition-mcp",
  "version": "1.1.2",
  "mcpName": "io.github.shalevshalit/image-recognition-mcp",
  "description": "MCP server for AI-powered image recognition and description using OpenAI-compatible vision models.",
  "type": "module",
  "files": [
    "dist/",
    "bin/",
    "README.md",
    "LICENSE"
  ],
  "main": "dist/index.js",
  "bin": {
    "@xitmasterx/image-recognition-mcp": "dist/index.js"
  },
  "repository": {
    "type": "git",
    "url": "git+https://github.com/xITmasterx/image-recognition-mcp.git"
  },
  "publishConfig": {
    "access": "public"
  },
  "author": "akirose, xITmasterx",
  "license": "MIT",
  "engines": {
    "node": ">=18.0.0"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.15.0",
    "openai": "^5.8.3"
  },
  "devDependencies": {
    "@types/node": "^24.0.12",
    "shx": "^0.4.0",
    "typescript": "^5.0.0",
    "vitest": "^1.6.0"
  },
  "scripts": {
    "test": "vitest run",
    "test:unit": "vitest run test/index.test.ts",
    "test:integration": "OPENAI_BASE_URL=http://127.0.0.1:1234/v1 OPENAI_MODEL=qwen/qwen3-vl-4b ALLOWED_IMAGE_PATHS=./test vitest run test/describe-image-integration.test.ts",
    "start": "node index.js",
    "build": "tsc && shx chmod +x dist/*.js",
    "prepare": "npm run build",
    "prepublishOnly": "npm run build"
  }
}