mcp-vision-web-bridge (npm) 0.2.0 → 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/README.md +97 -73
package/package.json +1 -1

package/README.md CHANGED

@@ -1,115 +1,139 @@
 # mcp-vision-web-bridge
-Local MCP server that gives Claude Desktop, Claude Code, and other MCP clients two practical bridge tools:
+MCP server for Claude Code / Claude Desktop that bridges text-only models with vision capabilities:
-- read an image from a recent Claude upload, clipboard, local path, URL, or base64 input, then send it to an OpenAI-compatible vision model;
-- read web links locally, extract readable text, then send the result to an OpenAI-compatible model.
+- **Image recognition** — read images from local path, clipboard, Claude upload, URL, or base64, then send to an OpenAI-compatible vision model
+- **Web page reading** — fetch web links, extract readable text, then ask the model to summarize or answer questions
-It does not include a model or any model credits. You bring your own OpenAI-compatible API endpoint and API key.
+Perfect for using models like DeepSeek V4 that don't natively support image input.
-## Use Cases
+> **No model included.** You bring your own OpenAI-compatible API endpoint and key.
-- Your Claude client can accept images, but the third-party model behind it does not reliably receive image content.
-- Your model provider supports vision, but your MCP client needs a local tool to collect image inputs.
-- You want a safer local web reader with private-network blocking enabled by default.
+## Quick Start (npx, recommended)
-## Capabilities
-| Capability | macOS | Windows | Linux |
-| --- | --- | --- | --- |
-| MCP server | Supported | Supported | Supported |
-| OpenAI-compatible chat completions | Supported | Supported | Supported |
-| Web page reading | Supported | Supported | Supported |
-| Recent Claude upload image | Supported | Best effort | Best effort |
-| Clipboard image | Supported | Supported via PowerShell / Windows Forms | Supported via `wl-paste` or `xclip` |
+Add to your Claude Code `.claude.json` or Claude Desktop `mcpServers` config:
-Recent Claude upload paths are client implementation details, not a stable public API. If auto-detection does not work in your environment, set `CLAUDE_UPLOAD_DIRS` manually.
-## Security Defaults
-The default configuration is intentionally conservative:
-- `.env` is ignored and should never be committed.
-- API keys are only read from environment variables.
-- Explicit local image paths are disabled unless `ALLOW_LOCAL_IMAGE_PATHS=true`.
-- Clipboard image reading is disabled unless `ALLOW_CLIPBOARD_IMAGES=true`.
-- Private-network web and image URLs are disabled unless `ALLOW_PRIVATE_NETWORK_URLS=true`.
-- Jina Reader fallback is disabled unless `USE_JINA_READER=true`.
-- The server does not log prompts, image data, API keys, or fetched page bodies.
-## Requirements
+```json
+{
+  "mcpServers": {
+    "vision-web-bridge": {
+      "command": "npx",
+      "args": ["-y", "mcp-vision-web-bridge"],
+      "env": {
+        "MODEL_BASE_URL": "https://api.openai.com/v1",
+        "MODEL_API_KEY": "your-api-key",
+        "MODEL_NAME": "gpt-4o",
+        "ALLOW_LOCAL_IMAGE_PATHS": "true",
+        "ALLOW_CLIPBOARD_IMAGES": "true"
+      }
+    }
+  }
+}
+```
-- Node.js 20 or newer
-- npm
+Restart Claude Code, then check `/mcp` — `vision-web-bridge` should show ✔ connected.
-This is a Node.js project. It does not require Python or a virtual environment.
+### Provider Examples
-## Install
+**Xiaomi MiMo:**
+```json
+"env": {
+  "MODEL_BASE_URL": "https://api.xiaomimimo.com/v1",
+  "MODEL_API_KEY": "sk-...",
+  "MODEL_NAME": "mimo-v2-omni"
+}
+```
-```bash
-npm install
-cp .env.example .env
+**SiliconFlow:**
+```json
+"env": {
+  "MODEL_BASE_URL": "https://api.siliconflow.cn/v1",
+  "MODEL_API_KEY": "sk-...",
+  "MODEL_NAME": "Qwen/Qwen3-VL-8B-Instruct"
+}
 ```
-Edit `.env`:
+**OpenAI:**
+```json
+"env": {
+  "MODEL_BASE_URL": "https://api.openai.com/v1",
+  "MODEL_API_KEY": "sk-...",
+  "MODEL_NAME": "gpt-4o"
+}
+```
-```bash
-MODEL_BASE_URL=https://api.example.com/v1
-MODEL_API_KEY=replace-with-your-own-key
-MODEL_NAME=replace-with-your-vision-model
-ALLOW_LOCAL_IMAGE_PATHS=false
-ALLOW_CLIPBOARD_IMAGES=false
-ALLOW_PRIVATE_NETWORK_URLS=false
-USE_JINA_READER=false
-MAX_IMAGE_BYTES=10485760
+**Gemini (via OpenAI-compatible layer):**
+```json
+"env": {
+  "MODEL_BASE_URL": "https://generativelanguage.googleapis.com/v1beta/openai",
+  "MODEL_API_KEY": "AIza...",
+  "MODEL_NAME": "gemini-2.0-flash"
+}
 ```
-`MODEL_BASE_URL` must be an OpenAI-compatible `/v1` endpoint.
+Any OpenAI-compatible `/v1` endpoint works.
-### SiliconFlow Example
+## Alternative: Local Install
+If you prefer to run from a local checkout:
 ```bash
-MODEL_BASE_URL=https://api.siliconflow.cn/v1
-MODEL_API_KEY=replace-with-your-own-key
-MODEL_NAME=Qwen/Qwen3-VL-8B-Instruct
+git clone https://github.com/dangpolly927-eng/mcp-vision-web-bridge.git
+cd mcp-vision-web-bridge
+npm install
 ```
-Use a model that supports vision input.
-## Claude Desktop Config
-Use absolute paths for your local checkout:
+Then configure with local path:
 ```json
 {
   "mcpServers": {
     "vision-web-bridge": {
       "command": "node",
-      "args": [
-        "--env-file-if-exists=/absolute/path/to/mcp-vision-web-bridge/.env",
-        "/absolute/path/to/mcp-vision-web-bridge/src/server.mjs"
-      ]
+      "args": ["D:\\path\\to\\mcp-vision-web-bridge\\src\\server.mjs"],
+      "env": {
+        "MODEL_BASE_URL": "https://api.example.com/v1",
+        "MODEL_API_KEY": "your-api-key",
+        "MODEL_NAME": "your-vision-model"
+      }
     }
   }
 }
 ```
-Restart Claude Desktop after changing the config.
+You can also use a `.env` file with `--env-file-if-exists` instead of inline env vars. See `.env.example`.
+## Requirements
+- Node.js >= 20
+## Capabilities
+| Capability | macOS | Windows | Linux |
+| --- | --- | --- | --- |
+| MCP server | Supported | Supported | Supported |
+| Local image path | Supported | Supported | Supported |
+| Clipboard image | Supported | PowerShell / WinForms | `wl-paste` / `xclip` |
+| Claude upload image | Supported | Best effort | Best effort |
+| Web page reading | Supported | Supported | Supported |
-## Claude Code Config
+## Security Defaults
-Add the same server to your Claude Code MCP config. After restart, check `/mcp` and confirm that `vision-web-bridge` is connected.
+All dangerous features are **opt-in** (disabled by default):
-The server exposes two tools:
+| Feature | Default |
+| --- | --- |
+| `ALLOW_LOCAL_IMAGE_PATHS` | `false` |
+| `ALLOW_CLIPBOARD_IMAGES` | `false` |
+| `ALLOW_PRIVATE_NETWORK_URLS` | `false` |
+| `USE_JINA_READER` | `false` |
-- `read_image_with_model`
-- `read_links_with_model`
+Set to `"true"` in env vars to enable.
-It also exposes two MCP prompts:
+## Tools
-- `/mcp__vision-web-bridge__img`
-- `/mcp__vision-web-bridge__clipboard-image`
+- **`read_image_with_model`** — Read image from local path, clipboard, URL, base64, or latest Claude upload
+- **`read_links_with_model`** — Fetch and summarize web page content
 ## Usage
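
Note on the Security Defaults table in the 0.2.1 README: the four flags default to `false` and are enabled by setting the string `"true"` in the env block. A minimal sketch of how such opt-in parsing could look, assuming a strict comparison against `"true"`. The `readFlags` helper and `FLAG_NAMES` list are hypothetical and are not taken from the package source, which may parse the values differently.

```javascript
// Hypothetical helper for illustration; not the package's actual code.
const FLAG_NAMES = [
  "ALLOW_LOCAL_IMAGE_PATHS",
  "ALLOW_CLIPBOARD_IMAGES",
  "ALLOW_PRIVATE_NETWORK_URLS",
  "USE_JINA_READER",
];

function readFlags(env = process.env) {
  const flags = {};
  for (const name of FLAG_NAMES) {
    // Assumption: only the exact string "true" enables a feature.
    // Unset, empty, or any other value leaves it disabled.
    flags[name] = env[name] === "true";
  }
  return flags;
}
```

Under this assumption, the Quick Start config (which sets `ALLOW_LOCAL_IMAGE_PATHS` and `ALLOW_CLIPBOARD_IMAGES` to `"true"`) would enable those two features and leave private-network URLs and the Jina Reader fallback disabled.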

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "mcp-vision-web-bridge",
-  "version": "0.2.0",
+  "version": "0.2.1",
   "type": "module",
   "description": "A local MCP server that adds image understanding and web-page reading through an OpenAI-compatible model API.",
   "files": [
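
Note on the package description: it refers to image understanding "through an OpenAI-compatible model API". As a rough sketch of what a request to such an endpoint can look like, the helper below builds (but does not send) a chat completions request that carries an image as a base64 data URL. The `buildVisionRequest` name and its parameters are hypothetical; the package's own implementation is not shown in this diff and may differ.

```javascript
// Hypothetical helper for illustration; not the package's actual code.
// Builds a request in the OpenAI chat completions format with one image part.
function buildVisionRequest({ baseUrl, apiKey, model, prompt, imageBase64, mimeType = "image/png" }) {
  // Strip trailing slashes so the path joins cleanly onto MODEL_BASE_URL.
  const url = `${baseUrl.replace(/\/+$/, "")}/chat/completions`;
  const body = {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          // The image is passed inline as a data URL.
          { type: "image_url", image_url: { url: `data:${mimeType};base64,${imageBase64}` } },
        ],
      },
    ],
  };
  return {
    url,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}
```

On Node.js 20 or newer, the result could be passed to the built-in `fetch` as `fetch(url, options)`, with `baseUrl`, `apiKey`, and `model` read from `MODEL_BASE_URL`, `MODEL_API_KEY`, and `MODEL_NAME`.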