mcp-vision-web-bridge 0.2.0 → 0.2.1

Files changed (2)
  1. package/README.md +97 -73
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -1,115 +1,139 @@
  # mcp-vision-web-bridge
 
- Local MCP server that gives Claude Desktop, Claude Code, and other MCP clients two practical bridge tools:
+ MCP server for Claude Code / Claude Desktop that bridges text-only models with vision capabilities:
 
- - read an image from a recent Claude upload, clipboard, local path, URL, or base64 input, then send it to an OpenAI-compatible vision model;
- - read web links locally, extract readable text, then send the result to an OpenAI-compatible model.
+ - **Image recognition** read images from local path, clipboard, Claude upload, URL, or base64, then send to an OpenAI-compatible vision model
+ - **Web page reading** — fetch web links, extract readable text, then ask the model to summarize or answer questions
 
- It does not include a model or any model credits. You bring your own OpenAI-compatible API endpoint and API key.
+ Perfect for using models like DeepSeek V4 that don't natively support image input.
 
- ## Use Cases
+ > **No model included.** You bring your own OpenAI-compatible API endpoint and key.
 
- - Your Claude client can accept images, but the third-party model behind it does not reliably receive image content.
- - Your model provider supports vision, but your MCP client needs a local tool to collect image inputs.
- - You want a safer local web reader with private-network blocking enabled by default.
+ ## Quick Start (npx, recommended)
 
- ## Capabilities
-
- | Capability | macOS | Windows | Linux |
- | --- | --- | --- | --- |
- | MCP server | Supported | Supported | Supported |
- | OpenAI-compatible chat completions | Supported | Supported | Supported |
- | Web page reading | Supported | Supported | Supported |
- | Recent Claude upload image | Supported | Best effort | Best effort |
- | Clipboard image | Supported | Supported via PowerShell / Windows Forms | Supported via `wl-paste` or `xclip` |
+ Add to your Claude Code `.claude.json` or Claude Desktop `mcpServers` config:
 
- Recent Claude upload paths are client implementation details, not a stable public API. If auto-detection does not work in your environment, set `CLAUDE_UPLOAD_DIRS` manually.
-
- ## Security Defaults
-
- The default configuration is intentionally conservative:
-
- - `.env` is ignored and should never be committed.
- - API keys are only read from environment variables.
- - Explicit local image paths are disabled unless `ALLOW_LOCAL_IMAGE_PATHS=true`.
- - Clipboard image reading is disabled unless `ALLOW_CLIPBOARD_IMAGES=true`.
- - Private-network web and image URLs are disabled unless `ALLOW_PRIVATE_NETWORK_URLS=true`.
- - Jina Reader fallback is disabled unless `USE_JINA_READER=true`.
- - The server does not log prompts, image data, API keys, or fetched page bodies.
-
- ## Requirements
+ ```json
+ {
+ "mcpServers": {
+ "vision-web-bridge": {
+ "command": "npx",
+ "args": ["-y", "mcp-vision-web-bridge"],
+ "env": {
+ "MODEL_BASE_URL": "https://api.openai.com/v1",
+ "MODEL_API_KEY": "your-api-key",
+ "MODEL_NAME": "gpt-4o",
+ "ALLOW_LOCAL_IMAGE_PATHS": "true",
+ "ALLOW_CLIPBOARD_IMAGES": "true"
+ }
+ }
+ }
+ }
+ ```
 
- - Node.js 20 or newer
- - npm
+ Restart Claude Code, then check `/mcp` — `vision-web-bridge` should show connected.
 
- This is a Node.js project. It does not require Python or a virtual environment.
+ ### Provider Examples
 
- ## Install
+ **Xiaomi MiMo:**
+ ```json
+ "env": {
+ "MODEL_BASE_URL": "https://api.xiaomimimo.com/v1",
+ "MODEL_API_KEY": "sk-...",
+ "MODEL_NAME": "mimo-v2-omni"
+ }
+ ```
 
- ```bash
- npm install
- cp .env.example .env
+ **SiliconFlow:**
+ ```json
+ "env": {
+ "MODEL_BASE_URL": "https://api.siliconflow.cn/v1",
+ "MODEL_API_KEY": "sk-...",
+ "MODEL_NAME": "Qwen/Qwen3-VL-8B-Instruct"
+ }
  ```
 
- Edit `.env`:
+ **OpenAI:**
+ ```json
+ "env": {
+ "MODEL_BASE_URL": "https://api.openai.com/v1",
+ "MODEL_API_KEY": "sk-...",
+ "MODEL_NAME": "gpt-4o"
+ }
+ ```
 
- ```bash
- MODEL_BASE_URL=https://api.example.com/v1
- MODEL_API_KEY=replace-with-your-own-key
- MODEL_NAME=replace-with-your-vision-model
-
- ALLOW_LOCAL_IMAGE_PATHS=false
- ALLOW_CLIPBOARD_IMAGES=false
- ALLOW_PRIVATE_NETWORK_URLS=false
- USE_JINA_READER=false
- MAX_IMAGE_BYTES=10485760
+ **Gemini (via OpenAI-compatible layer):**
+ ```json
+ "env": {
+ "MODEL_BASE_URL": "https://generativelanguage.googleapis.com/v1beta/openai",
+ "MODEL_API_KEY": "AIza...",
+ "MODEL_NAME": "gemini-2.0-flash"
+ }
  ```
 
- `MODEL_BASE_URL` must be an OpenAI-compatible `/v1` endpoint.
+ Any OpenAI-compatible `/v1` endpoint works.
 
- ### SiliconFlow Example
+ ## Alternative: Local Install
+
+ If you prefer to run from a local checkout:
 
  ```bash
- MODEL_BASE_URL=https://api.siliconflow.cn/v1
- MODEL_API_KEY=replace-with-your-own-key
- MODEL_NAME=Qwen/Qwen3-VL-8B-Instruct
+ git clone https://github.com/dangpolly927-eng/mcp-vision-web-bridge.git
+ cd mcp-vision-web-bridge
+ npm install
  ```
 
- Use a model that supports vision input.
-
- ## Claude Desktop Config
-
- Use absolute paths for your local checkout:
+ Then configure with local path:
 
  ```json
  {
  "mcpServers": {
  "vision-web-bridge": {
  "command": "node",
- "args": [
- "--env-file-if-exists=/absolute/path/to/mcp-vision-web-bridge/.env",
- "/absolute/path/to/mcp-vision-web-bridge/src/server.mjs"
- ]
+ "args": ["D:\\path\\to\\mcp-vision-web-bridge\\src\\server.mjs"],
+ "env": {
+ "MODEL_BASE_URL": "https://api.example.com/v1",
+ "MODEL_API_KEY": "your-api-key",
+ "MODEL_NAME": "your-vision-model"
+ }
  }
  }
  }
  ```
 
- Restart Claude Desktop after changing the config.
+ You can also use a `.env` file with `--env-file-if-exists` instead of inline env vars. See `.env.example`.
+
+ ## Requirements
+
+ - Node.js >= 20
+
+ ## Capabilities
+
+ | Capability | macOS | Windows | Linux |
+ | --- | --- | --- | --- |
+ | MCP server | Supported | Supported | Supported |
+ | Local image path | Supported | Supported | Supported |
+ | Clipboard image | Supported | PowerShell / WinForms | `wl-paste` / `xclip` |
+ | Claude upload image | Supported | Best effort | Best effort |
+ | Web page reading | Supported | Supported | Supported |
 
- ## Claude Code Config
+ ## Security Defaults
 
- Add the same server to your Claude Code MCP config. After restart, check `/mcp` and confirm that `vision-web-bridge` is connected.
+ All dangerous features are **opt-in** (disabled by default):
 
- The server exposes two tools:
+ | Feature | Default |
+ | --- | --- |
+ | `ALLOW_LOCAL_IMAGE_PATHS` | `false` |
+ | `ALLOW_CLIPBOARD_IMAGES` | `false` |
+ | `ALLOW_PRIVATE_NETWORK_URLS` | `false` |
+ | `USE_JINA_READER` | `false` |
 
- - `read_image_with_model`
- - `read_links_with_model`
+ Set to `"true"` in env vars to enable.
 
- It also exposes two MCP prompts:
+ ## Tools
 
- - `/mcp__vision-web-bridge__img`
- - `/mcp__vision-web-bridge__clipboard-image`
+ - **`read_image_with_model`** — Read image from local path, clipboard, URL, base64, or latest Claude upload
+ - **`read_links_with_model`** — Fetch and summarize web page content
 
  ## Usage
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "mcp-vision-web-bridge",
- "version": "0.2.0",
+ "version": "0.2.1",
  "type": "module",
  "description": "A local MCP server that adds image understanding and web-page reading through an OpenAI-compatible model API.",
  "files": [