luma-mcp 1.0.3 → 1.1.0
- package/.env.example +39 -0
- package/CHANGELOG.md +50 -0
- package/README.md +163 -24
- package/build/config.d.ts +2 -0
- package/build/config.d.ts.map +1 -1
- package/build/config.js +24 -8
- package/build/config.js.map +1 -1
- package/build/index.js +17 -6
- package/build/index.js.map +1 -1
- package/build/siliconflow-client.d.ts +23 -0
- package/build/siliconflow-client.d.ts.map +1 -0
- package/build/siliconflow-client.js +85 -0
- package/build/siliconflow-client.js.map +1 -0
- package/build/vision-client.d.ts +18 -0
- package/build/vision-client.d.ts.map +1 -0
- package/build/vision-client.js +5 -0
- package/build/vision-client.js.map +1 -0
- package/build/zhipu-client.d.ts +6 -1
- package/build/zhipu-client.d.ts.map +1 -1
- package/build/zhipu-client.js +6 -0
- package/build/zhipu-client.js.map +1 -1
- package/package.json +8 -3
- package/test/test-deepseek-raw.ts +94 -0
- package/test/test-local.ts +20 -7
package/.env.example
ADDED
@@ -0,0 +1,39 @@
+# Luma MCP configuration example
+
+# ==========================================
+# Model provider selection
+# ==========================================
+# Allowed values: zhipu, siliconflow
+# Default: zhipu
+MODEL_PROVIDER=zhipu
+
+# ==========================================
+# Zhipu GLM-4.5V settings (required when using Zhipu)
+# ==========================================
+ZHIPU_API_KEY=your-zhipu-api-key-here
+
+# ==========================================
+# SiliconFlow DeepSeek-OCR settings (required when using SiliconFlow)
+# ==========================================
+# SILICONFLOW_API_KEY=your-siliconflow-api-key-here
+
+# ==========================================
+# Common settings (optional)
+# ==========================================
+# Model name (leave empty to use the default)
+# zhipu default: glm-4.5v
+# siliconflow default: deepseek-ai/DeepSeek-OCR
+# MODEL_NAME=
+
+# Maximum tokens to generate (default: 4096)
+# MAX_TOKENS=4096
+
+# Temperature, 0-1 (default: 0.7)
+# TEMPERATURE=0.7
+
+# Top-p, 0-1 (default: 0.7)
+# TOP_P=0.7
+
+# Enable thinking mode (GLM-4.5V only, default: false)
+# Improves analysis accuracy when enabled, but uses 20-30% more tokens
+# ENABLE_THINKING=false
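The example file above uses plain `KEY=VALUE` lines with `#` comments, and most optional settings are commented out. Whether luma-mcp itself loads a `.env` file is not visible in this diff, so the following is only a sketch of how such a file could be read into a key/value map. The helper name `parseEnvFile` is an assumption made for illustration.

```typescript
// Sketch only: `parseEnvFile` is an illustrative helper, not part of luma-mcp.
function parseEnvFile(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments (this also skips commented-out settings).
    if (line === '' || line.charAt(0) === '#') continue;
    const eq = line.indexOf('=');
    // Skip lines with no "=" or with nothing before it.
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    // Everything after the first "=" is the value, so values may contain "=".
    const value = line.slice(eq + 1).trim();
    result[key] = value;
  }
  return result;
}
```

Applied to the example file as shipped, only `MODEL_PROVIDER` and `ZHIPU_API_KEY` would end up in the map, because every other setting is commented out.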
package/CHANGELOG.md
ADDED
@@ -0,0 +1,50 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+## [1.1.0] - 2025-11-13
+
+### Added
+- 🎉 **Multi-model support**: added SiliconFlow DeepSeek-OCR support
+- 🆓 **Free option**: DeepSeek-OCR is available through SiliconFlow as a completely free OCR service
+- 📐 **Unified interface**: added a VisionClient interface so that more vision models can be plugged in
+- ⚙️ **Flexible configuration**: switch models with the `MODEL_PROVIDER` environment variable
+
+### Changed
+- 🔧 Environment variable names reworked to support common settings (`MODEL_NAME`, `MAX_TOKENS`, etc.)
+- 📝 Documentation updated with dual-model configuration notes and selection guidance
+- 🏗️ Code structure refactored for maintainability
+
+### Technical Details
+- New files:
+  - `src/vision-client.ts` - unified vision model client interface
+  - `src/siliconflow-client.ts` - SiliconFlow API client implementation
+  - `.env.example` - example configuration file
+- Modified files:
+  - `src/config.ts` - supports multi-provider configuration
+  - `src/zhipu-client.ts` - implements the VisionClient interface
+  - `src/index.ts` - selects the client dynamically based on configuration
+  - `README.md` - complete dual-model usage documentation
+
+## [1.0.3] - 2025-11-XX
+
+### Features
+- Vision understanding based on Zhipu GLM-4.5V
+- Supports local files and remote URLs
+- Built-in retry mechanism
+- Thinking mode support
+
+---
+
+**Model comparison**:
+
+| Feature | GLM-4.5V | DeepSeek-OCR |
+|----------|----------|--------------|
+| Cost | Paid | **Free** |
+| Chinese understanding | Excellent | Good |
+| OCR capability | Good | **Excellent** |
+| Thinking mode | ✅ | ❌ |
+
+**Recommended use cases**:
+- Need OCR or text recognition → **DeepSeek-OCR** (free)
+- Need deep image understanding → **GLM-4.5V**
package/README.md
CHANGED
@@ -1,26 +1,28 @@
 # Luma MCP
 
-
+A multi-model vision understanding MCP server that gives visual capabilities to AI assistants that cannot process images.
 
 [English](./docs/README_EN.md) | 中文
 
 ## Features
 
+- **Multi-model support**: supports GLM-4.5V (Zhipu) and DeepSeek-OCR (SiliconFlow)
 - **Simple design**: a single `analyze_image` tool handles all image analysis tasks
 - **Smart understanding**: automatically recognizes code, UI, errors, and other scenarios
-- **Comprehensive support**:
+- **Comprehensive support**: code screenshots, UI designs, error diagnosis, OCR text recognition
 - **Standard MCP protocol**: integrates seamlessly with Claude Desktop, Cline, and other MCP clients
--
+- **Free option**: DeepSeek-OCR can be called for free through SiliconFlow
 - **URL support**: supports local files and remote image URLs
 - **Retry mechanism**: built-in exponential backoff retries for better reliability
-- **Thinking mode**: deep analysis enabled by default
 
 ## Quick Start
 
 ### Prerequisites
 
 - Node.js >= 18.0.0
--
+- **Choose one model**:
+  - **Option A**: Zhipu AI API Key ([get one here](https://open.bigmodel.cn/)) - excellent Chinese understanding
+  - **Option B**: SiliconFlow API Key ([get one here](https://cloud.siliconflow.cn/)) - **free to use**, strong OCR
 
 ### Installation
 
@@ -47,7 +49,7 @@ npx luma-mcp
 
 **macOS config file location**: `~/Library/Application Support/Claude/claude_desktop_config.json`
 
-
+**Option A: using Zhipu GLM-4.5V**:
 
 ```json
 {
@@ -63,7 +65,24 @@ npx luma-mcp
 }
 ```
 
-
+**Option B: using SiliconFlow DeepSeek-OCR (free)**:
+
+```json
+{
+  "mcpServers": {
+    "luma": {
+      "command": "npx",
+      "args": ["-y", "luma-mcp"],
+      "env": {
+        "MODEL_PROVIDER": "siliconflow",
+        "SILICONFLOW_API_KEY": "your-siliconflow-api-key"
+      }
+    }
+  }
+}
+```
+
+**Local development (Zhipu)**:
 
 ```json
 {
@@ -79,13 +98,30 @@ npx luma-mcp
 }
 ```
 
+**Local development (SiliconFlow)**:
+
+```json
+{
+  "mcpServers": {
+    "luma": {
+      "command": "node",
+      "args": ["D:\\codes\\Luma_mcp\\build\\index.js"],
+      "env": {
+        "MODEL_PROVIDER": "siliconflow",
+        "SILICONFLOW_API_KEY": "your-siliconflow-api-key"
+      }
+    }
+  }
+}
+```
+
 After configuring, restart Claude Desktop.
 
 #### Cline (VSCode)
 
-
+Create `mcp.json` in the project root or in the `.vscode/` directory
 
-
+**Option A: using Zhipu GLM-4.5V**:
 
 ```json
 {
@@ -101,16 +137,17 @@ npx luma-mcp
 }
 ```
 
-
+**Option B: using SiliconFlow DeepSeek-OCR (free)**:
 
 ```json
 {
   "mcpServers": {
     "luma": {
-      "command": "
-      "args": ["
+      "command": "npx",
+      "args": ["-y", "luma-mcp"],
       "env": {
-        "
+        "MODEL_PROVIDER": "siliconflow",
+        "SILICONFLOW_API_KEY": "your-siliconflow-api-key"
       }
     }
   }
@@ -119,10 +156,16 @@ npx luma-mcp
 
 #### Claude Code (command line)
 
+**Using Zhipu GLM-4.5V**:
 ```bash
 claude mcp add -s user luma-mcp --env ZHIPU_API_KEY=your-api-key -- npx -y luma-mcp
 ```
 
+**Using SiliconFlow DeepSeek-OCR (free)**:
+```bash
+claude mcp add -s user luma-mcp --env MODEL_PROVIDER=siliconflow --env SILICONFLOW_API_KEY=your-api-key -- npx -y luma-mcp
+```
+
 #### Other tools
 
 For other MCP client configurations, see the [official Zhipu documentation](https://docs.bigmodel.cn/cn/coding-plan/mcp/vision-mcp-server#claude-code)
@@ -164,6 +207,7 @@ Claude: [automatically calls the analyze_image tool]
 
 You can test without an MCP client:
 
+**Testing Zhipu GLM-4.5V**:
 ```bash
 # Set the API key
 export ZHIPU_API_KEY="your-api-key"  # macOS/Linux
@@ -171,7 +215,23 @@ $env:ZHIPU_API_KEY="your-api-key" # Windows PowerShell
 
 # Test a local image
 npm run test:local ./test.png
+```
+
+**Testing SiliconFlow DeepSeek-OCR**:
+```bash
+# Set the API key and provider
+export MODEL_PROVIDER=siliconflow
+export SILICONFLOW_API_KEY="your-api-key"  # macOS/Linux
 
+$env:MODEL_PROVIDER="siliconflow"
+$env:SILICONFLOW_API_KEY="your-api-key"  # Windows PowerShell
+
+# Test a local image
+npm run test:local ./test.png
+```
+
+**Other test commands**:
+```bash
 # Test with a question
 npm run test:local ./code-error.png "What is wrong with this code?"
 
@@ -217,14 +277,32 @@ analyze_image({
 
 ## Environment Variables
 
-
-
-|
-
-| `
-| `
-| `
-| `
+### Common settings
+
+| Variable | Required | Default | Description |
+|------------------|------|-------------|---------------------------------------|
+| `MODEL_PROVIDER` | No | `zhipu` | Model provider: `zhipu` or `siliconflow` |
+| `MODEL_NAME` | No | see below | Model name (chosen automatically per provider) |
+| `MAX_TOKENS` | No | `4096` | Maximum tokens to generate |
+| `TEMPERATURE` | No | `0.7` | Temperature (0-1) |
+| `TOP_P` | No | `0.7` | Top-p (0-1) |
+| `ENABLE_THINKING`| No | `false` | Enable thinking mode (GLM-4.5V only) |
+
+### Zhipu GLM-4.5V only
+
+| Variable | Required | Default | Description |
+|------------------|---------------------|-------------|----------------------|
+| `ZHIPU_API_KEY` | Yes (when using Zhipu) | - | Zhipu AI API key |
+
+Default model: `glm-4.5v`
+
+### SiliconFlow DeepSeek-OCR only
+
+| Variable | Required | Default | Description |
+|------------------------|-------------------------|---------------------------------|----------------------------|
+| `SILICONFLOW_API_KEY` | Yes (when using SiliconFlow) | - | SiliconFlow API key |
+
+Default model: `deepseek-ai/DeepSeek-OCR`
 
 **Thinking mode notes**:
 - Enabled by default; improves the accuracy and level of detail of image analysis
@@ -264,8 +342,10 @@ npm run test:local <image path> [question]
 luma-mcp/
 ├── src/
 │   ├── index.ts              # MCP server entry point
-│   ├── config.ts             #
+│   ├── config.ts             # Configuration management (multi-model)
+│   ├── vision-client.ts      # Vision model client interface
 │   ├── zhipu-client.ts       # GLM-4.5V API client
+│   ├── siliconflow-client.ts # DeepSeek-OCR API client
 │   ├── image-processor.ts    # Image processing
 │   ├── prompts.ts            # Prompt templates
 │   └── utils/
@@ -285,11 +365,18 @@ luma-mcp/
 
 ### How do I get an API key?
 
+**Zhipu GLM-4.5V**:
 1. Visit the [Zhipu Open Platform](https://open.bigmodel.cn/)
 2. Sign up or log in
 3. Create an API key in the console
 4. Copy the API key into your config file
 
+**SiliconFlow DeepSeek-OCR (free)**:
+1. Visit the [SiliconFlow platform](https://cloud.siliconflow.cn/)
+2. Sign up or log in
+3. Create an API key under API management
+4. Copy the API key into your config file
+
 ### Which image formats are supported?
 
 JPG, PNG, WebP, and GIF are supported. JPG is recommended for its better compression.
@@ -313,15 +400,31 @@ luma-mcp/
 
 ### What does it cost?
 
-
+**SiliconFlow DeepSeek-OCR**: **completely free**, no payment required!
 
-
+**Zhipu GLM-4.5V**: for pricing, see the [official Zhipu pricing page](https://open.bigmodel.cn/pricing).
+
+Typical usage estimates (GLM-4.5V):
 - Simple image understanding: 500-1000 tokens
 - Code screenshot analysis: 1500-2500 tokens
 - Detailed UI analysis: 2000-3000 tokens
 
 Enabling thinking mode adds roughly 20-30% more tokens.
 
+### How do I choose a model?
+
+| Feature | GLM-4.5V (Zhipu) | DeepSeek-OCR (SiliconFlow) |
+|------------|----------------|------------------------|
+| **Cost** | Paid | **Completely free** |
+| **Chinese understanding** | Excellent | Good |
+| **OCR capability** | Good | **Excellent** |
+| **Thinking mode** | Supported | Not supported |
+| **Best for** | General image analysis | OCR, text recognition |
+
+**Recommendation**:
+- Need OCR or text recognition: choose **DeepSeek-OCR** (free)
+- Need deep image understanding: choose **GLM-4.5V**
+
 ## Contributing
 
 Issues and pull requests are welcome!
@@ -334,8 +437,44 @@ MIT License
 
 - [Zhipu AI Open Platform](https://open.bigmodel.cn/)
 - [GLM-4.5V documentation](https://docs.bigmodel.cn/cn/guide/models/vlm/glm-4.5v)
+- [SiliconFlow platform](https://cloud.siliconflow.cn/)
+- [DeepSeek-OCR documentation](https://docs.siliconflow.cn/cn/api-reference/chat-completions/chat-completions)
 - [MCP protocol documentation](https://modelcontextprotocol.io/)
 
+## Changelog
+
+### [1.1.0] - 2025-11-13
+
+#### Added
+- 🎉 **Multi-model support**: added SiliconFlow DeepSeek-OCR support
+- 🆓 **Free option**: DeepSeek-OCR is available through SiliconFlow as a completely free OCR service
+- 📐 **Unified interface**: added a VisionClient interface so that more vision models can be plugged in
+- ⚙️ **Flexible configuration**: switch models with the `MODEL_PROVIDER` environment variable
+
+#### Changed
+- 🔧 Environment variable names reworked to support common settings (`MODEL_NAME`, `MAX_TOKENS`, etc.)
+- 📝 Documentation updated with dual-model configuration notes and selection guidance
+- 🏭️ Code structure refactored for maintainability
+
+#### Technical details
+- New files:
+  - `src/vision-client.ts` - unified vision model client interface
+  - `src/siliconflow-client.ts` - SiliconFlow API client implementation
+  - `.env.example` - example configuration file
+- Modified files:
+  - `src/config.ts` - supports multi-provider configuration
+  - `src/zhipu-client.ts` - implements the VisionClient interface
+  - `src/index.ts` - selects the client dynamically based on configuration
+
+### [1.0.3] - 2025-11-12
+
+- Vision understanding based on Zhipu GLM-4.5V
+- Supports local files and remote URLs
+- Built-in retry mechanism
+- Thinking mode support
+
+For the full history, see [CHANGELOG.md](./CHANGELOG.md)
+
 ## Author
 
 Jochen
package/build/config.d.ts
CHANGED
package/build/config.d.ts.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"config.d.ts","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,MAAM,WAAW,UAAU;IACzB,MAAM,EAAE,MAAM,CAAC;IACf,KAAK,EAAE,MAAM,CAAC;IACd,SAAS,EAAE,MAAM,CAAC;IAClB,WAAW,EAAE,MAAM,CAAC;IACpB,IAAI,EAAE,MAAM,CAAC;IACb,cAAc,EAAE,OAAO,CAAC;CACzB;AAED;;GAEG;AACH,wBAAgB,UAAU,IAAI,UAAU,
|
|
1
|
+
{"version":3,"file":"config.d.ts","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,MAAM,MAAM,aAAa,GAAG,OAAO,GAAG,aAAa,CAAC;AAEpD,MAAM,WAAW,UAAU;IACzB,QAAQ,EAAE,aAAa,CAAC;IACxB,MAAM,EAAE,MAAM,CAAC;IACf,KAAK,EAAE,MAAM,CAAC;IACd,SAAS,EAAE,MAAM,CAAC;IAClB,WAAW,EAAE,MAAM,CAAC;IACpB,IAAI,EAAE,MAAM,CAAC;IACb,cAAc,EAAE,OAAO,CAAC;CACzB;AAED;;GAEG;AACH,wBAAgB,UAAU,IAAI,UAAU,CAiCvC"}
|
package/build/config.js
CHANGED
@@ -6,17 +6,33 @@
  * Load configuration from environment variables
  */
 export function loadConfig() {
-
-
-
+    // Determine which model provider to use
+    const provider = (process.env.MODEL_PROVIDER?.toLowerCase() || 'zhipu');
+    // Get the API key for the chosen provider
+    let apiKey;
+    let defaultModel;
+    if (provider === 'siliconflow') {
+        apiKey = process.env.SILICONFLOW_API_KEY;
+        defaultModel = 'deepseek-ai/DeepSeek-OCR';
+        if (!apiKey) {
+            throw new Error('SILICONFLOW_API_KEY environment variable is required when using SiliconFlow provider');
+        }
+    }
+    else {
+        apiKey = process.env.ZHIPU_API_KEY;
+        defaultModel = 'glm-4.5v';
+        if (!apiKey) {
+            throw new Error('ZHIPU_API_KEY environment variable is required when using Zhipu provider');
+        }
     }
     return {
+        provider,
         apiKey,
-        model: process.env.
-        maxTokens: parseInt(process.env.
-        temperature: parseFloat(process.env.
-        topP: parseFloat(process.env.
-        enableThinking: process.env.
+        model: process.env.MODEL_NAME || defaultModel,
+        maxTokens: parseInt(process.env.MAX_TOKENS || '4096', 10),
+        temperature: parseFloat(process.env.TEMPERATURE || '0.7'),
+        topP: parseFloat(process.env.TOP_P || '0.7'),
+        enableThinking: process.env.ENABLE_THINKING === 'true',
     };
 }
 //# sourceMappingURL=config.js.map
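The hunk above picks the API key and default model from `MODEL_PROVIDER`, then falls back to shared defaults for the sampling parameters. A minimal TypeScript sketch of the same logic follows, assuming the environment is passed in as a parameter so it can be exercised without touching `process.env`. The names `Env` and `loadConfigFrom` are assumptions made for illustration, and unlike the shipped code this sketch normalizes any unrecognized provider value to `zhipu` instead of keeping the raw string.

```typescript
// Sketch only: mirrors the defaults visible in the config.js diff.
// `Env` and `loadConfigFrom` are illustrative names, not part of luma-mcp.
type Env = Record<string, string | undefined>;
type ModelProvider = 'zhipu' | 'siliconflow';

interface LumaConfig {
  provider: ModelProvider;
  apiKey: string;
  model: string;
  maxTokens: number;
  temperature: number;
  topP: number;
  enableThinking: boolean;
}

function loadConfigFrom(env: Env): LumaConfig {
  // Assumption: anything other than "siliconflow" is normalized to "zhipu".
  const rawProvider = (env['MODEL_PROVIDER'] || 'zhipu').toLowerCase();
  const provider: ModelProvider = rawProvider === 'siliconflow' ? 'siliconflow' : 'zhipu';

  const keyName = provider === 'siliconflow' ? 'SILICONFLOW_API_KEY' : 'ZHIPU_API_KEY';
  const defaultModel = provider === 'siliconflow' ? 'deepseek-ai/DeepSeek-OCR' : 'glm-4.5v';

  const apiKey = env[keyName];
  if (!apiKey) {
    throw new Error(`${keyName} environment variable is required when using the ${provider} provider`);
  }

  return {
    provider,
    apiKey,
    model: env['MODEL_NAME'] || defaultModel,
    maxTokens: parseInt(env['MAX_TOKENS'] || '4096', 10),
    temperature: parseFloat(env['TEMPERATURE'] || '0.7'),
    topP: parseFloat(env['TOP_P'] || '0.7'),
    // Only the exact string "true" enables thinking mode, as in the diff.
    enableThinking: env['ENABLE_THINKING'] === 'true',
  };
}
```

Passing the environment explicitly is a testing convenience here; the package itself reads `process.env` directly.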
package/build/config.js.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA;;;GAGG;
|
|
1
|
+
{"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAcH;;GAEG;AACH,MAAM,UAAU,UAAU;IACxB,aAAa;IACb,MAAM,QAAQ,GAAG,CAAC,OAAO,CAAC,GAAG,CAAC,cAAc,EAAE,WAAW,EAAE,IAAI,OAAO,CAAkB,CAAC;IAEzF,kBAAkB;IAClB,IAAI,MAA0B,CAAC;IAC/B,IAAI,YAAoB,CAAC;IAEzB,IAAI,QAAQ,KAAK,aAAa,EAAE,CAAC;QAC/B,MAAM,GAAG,OAAO,CAAC,GAAG,CAAC,mBAAmB,CAAC;QACzC,YAAY,GAAG,0BAA0B,CAAC;QAE1C,IAAI,CAAC,MAAM,EAAE,CAAC;YACZ,MAAM,IAAI,KAAK,CAAC,sFAAsF,CAAC,CAAC;QAC1G,CAAC;IACH,CAAC;SAAM,CAAC;QACN,MAAM,GAAG,OAAO,CAAC,GAAG,CAAC,aAAa,CAAC;QACnC,YAAY,GAAG,UAAU,CAAC;QAE1B,IAAI,CAAC,MAAM,EAAE,CAAC;YACZ,MAAM,IAAI,KAAK,CAAC,0EAA0E,CAAC,CAAC;QAC9F,CAAC;IACH,CAAC;IAED,OAAO;QACL,QAAQ;QACR,MAAM;QACN,KAAK,EAAE,OAAO,CAAC,GAAG,CAAC,UAAU,IAAI,YAAY;QAC7C,SAAS,EAAE,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,UAAU,IAAI,MAAM,EAAE,EAAE,CAAC;QACzD,WAAW,EAAE,UAAU,CAAC,OAAO,CAAC,GAAG,CAAC,WAAW,IAAI,KAAK,CAAC;QACzD,IAAI,EAAE,UAAU,CAAC,OAAO,CAAC,GAAG,CAAC,KAAK,IAAI,KAAK,CAAC;QAC5C,cAAc,EAAE,OAAO,CAAC,GAAG,CAAC,eAAe,KAAK,MAAM;KACvD,CAAC;AACJ,CAAC"}
|
package/build/index.js
CHANGED
@@ -11,6 +11,7 @@ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
 import { z } from 'zod';
 import { loadConfig } from './config.js';
 import { ZhipuClient } from './zhipu-client.js';
+import { SiliconFlowClient } from './siliconflow-client.js';
 import { imageToBase64, validateImageSource } from './image-processor.js';
 import { buildAnalysisPrompt } from './prompts.js';
 import { withRetry, createSuccessResponse, createErrorResponse } from './utils/helpers.js';
@@ -21,7 +22,14 @@ async function createServer() {
     logger.info('Initializing Luma MCP Server');
     // Load configuration
     const config = loadConfig();
-
+    // Select the model client based on configuration
+    const visionClient = config.provider === 'siliconflow'
+        ? new SiliconFlowClient(config)
+        : new ZhipuClient(config);
+    logger.info('Vision client initialized', {
+        provider: config.provider,
+        model: visionClient.getModelName()
+    });
     // Create the server using McpServer
     const server = new McpServer({
         name: 'luma-mcp',
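The hunk above chooses between two clients with a ternary on `config.provider`. If more providers were added, one possible alternative is a lookup table keyed by provider name. The sketch below assumes that approach: `createVisionClient`, `StubClient`, `ClientConfig`, and the registry are hypothetical names, and `StubClient` stands in for the real `ZhipuClient` and `SiliconFlowClient` so that the example runs without network access.

```typescript
// Sketch only: a registry-based alternative to the ternary in index.js.
// All names below except the VisionClient method signatures are illustrative.
interface VisionClient {
  analyzeImage(imageDataUrl: string, prompt: string, enableThinking?: boolean): Promise<string>;
  getModelName(): string;
}

type ModelProvider = 'zhipu' | 'siliconflow';

interface ClientConfig {
  provider: ModelProvider;
  model: string;
}

// Stand-in for the real API clients; it echoes the prompt instead of calling an API.
class StubClient implements VisionClient {
  private label: string;
  private config: ClientConfig;

  constructor(label: string, config: ClientConfig) {
    this.label = label;
    this.config = config;
  }

  async analyzeImage(_imageDataUrl: string, prompt: string): Promise<string> {
    return `[${this.label}] ${prompt}`;
  }

  getModelName(): string {
    return this.config.model;
  }
}

// One factory per provider; adding a provider means adding one entry here.
const clientRegistry: Record<ModelProvider, (config: ClientConfig) => VisionClient> = {
  zhipu: (config) => new StubClient('zhipu', config),
  siliconflow: (config) => new StubClient('siliconflow', config),
};

function createVisionClient(config: ClientConfig): VisionClient {
  return clientRegistry[config.provider](config);
}
```

Because the registry is typed as `Record<ModelProvider, ...>`, the compiler would flag a provider that is added to the union but not to the table.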
@@ -37,15 +45,18 @@ async function createServer() {
         await validateImageSource(imageSource);
         // 2. Process the image (read it, or pass the URL through)
         const imageDataUrl = await imageToBase64(imageSource);
-        // 3.
-
-
-
+        // 3. Build the prompt
+        // DeepSeek-OCR needs a concise prompt and does not support complex formatting
+        const fullPrompt = config.provider === 'siliconflow'
+            ? prompt // DeepSeek-OCR: use the raw prompt as-is
+            : buildAnalysisPrompt(prompt); // GLM-4.5V: use the structured prompt
+        // 4. Call the vision model to analyze the image
+        return await visionClient.analyzeImage(imageDataUrl, fullPrompt);
     }, 2, // retry at most 2 times
     1000 // initial delay of 1 second
     );
     // Register the tool using the McpServer.tool() API
-    server.tool('analyze_image',
+    server.tool('analyze_image', `Analyze image content with a vision model. Supports GLM-4.5V (Zhipu) and DeepSeek-OCR (SiliconFlow).
 
 **When to call this tool automatically**:
 1. The user provides an image file path (including temporary, relative, and absolute paths)
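The analysis function above is wrapped with `withRetry(fn, 2, 1000)`, and the README describes the retries as exponential backoff. The body of `withRetry` lives in `utils/helpers.js` and is not part of this diff, so what follows is only a sketch of what a wrapper with that call shape might look like. The fourth `sleep` parameter is an assumption added here so the delays can be observed in a test without actually waiting.

```typescript
// Sketch only: the real `withRetry` is not shown in this diff.
// The `sleep` parameter is an illustrative addition for testability.
function withRetry<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
  maxRetries: number,
  initialDelayMs: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => { setTimeout(resolve, ms); })
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    let lastError: unknown;
    // One initial attempt plus up to `maxRetries` retries.
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return await fn(...args);
      } catch (error) {
        lastError = error;
        if (attempt < maxRetries) {
          // The delay doubles after each failure: d, 2d, 4d, ...
          await sleep(initialDelayMs * Math.pow(2, attempt));
        }
      }
    }
    throw lastError;
  };
}
```

With `maxRetries = 2` and `initialDelayMs = 1000`, a call that keeps failing would be attempted three times, waiting 1000 ms and then 2000 ms between attempts, before the last error is rethrown.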
package/build/index.js.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"index.js","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AAEA;;;GAGG;AAEH,wCAAwC;AACxC,OAAO,EAAE,uBAAuB,EAAE,MAAM,EAAE,MAAM,mBAAmB,CAAC;AACpE,uBAAuB,EAAE,CAAC;AAE1B,OAAO,EAAE,SAAS,EAAE,MAAM,yCAAyC,CAAC;AACpE,OAAO,EAAE,oBAAoB,EAAE,MAAM,2CAA2C,CAAC;AACjF,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAExB,OAAO,EAAE,UAAU,EAAE,MAAM,aAAa,CAAC;
|
|
1
|
+
{"version":3,"file":"index.js","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AAEA;;;GAGG;AAEH,wCAAwC;AACxC,OAAO,EAAE,uBAAuB,EAAE,MAAM,EAAE,MAAM,mBAAmB,CAAC;AACpE,uBAAuB,EAAE,CAAC;AAE1B,OAAO,EAAE,SAAS,EAAE,MAAM,yCAAyC,CAAC;AACpE,OAAO,EAAE,oBAAoB,EAAE,MAAM,2CAA2C,CAAC;AACjF,OAAO,EAAE,CAAC,EAAE,MAAM,KAAK,CAAC;AAExB,OAAO,EAAE,UAAU,EAAE,MAAM,aAAa,CAAC;AAEzC,OAAO,EAAE,WAAW,EAAE,MAAM,mBAAmB,CAAC;AAChD,OAAO,EAAE,iBAAiB,EAAE,MAAM,yBAAyB,CAAC;AAC5D,OAAO,EAAE,aAAa,EAAE,mBAAmB,EAAE,MAAM,sBAAsB,CAAC;AAC1E,OAAO,EAAE,mBAAmB,EAAE,MAAM,cAAc,CAAC;AACnD,OAAO,EAAE,SAAS,EAAE,qBAAqB,EAAE,mBAAmB,EAAE,MAAM,oBAAoB,CAAC;AAE3F;;GAEG;AACH,KAAK,UAAU,YAAY;IACzB,MAAM,CAAC,IAAI,CAAC,8BAA8B,CAAC,CAAC;IAE5C,OAAO;IACP,MAAM,MAAM,GAAG,UAAU,EAAE,CAAC;IAE5B,cAAc;IACd,MAAM,YAAY,GAAiB,MAAM,CAAC,QAAQ,KAAK,aAAa;QAClE,CAAC,CAAC,IAAI,iBAAiB,CAAC,MAAM,CAAC;QAC/B,CAAC,CAAC,IAAI,WAAW,CAAC,MAAM,CAAC,CAAC;IAE5B,MAAM,CAAC,IAAI,CAAC,2BAA2B,EAAE;QACvC,QAAQ,EAAE,MAAM,CAAC,QAAQ;QACzB,KAAK,EAAE,YAAY,CAAC,YAAY,EAAE;KACnC,CAAC,CAAC;IAEH,uBAAuB;IACvB,MAAM,MAAM,GAAG,IAAI,SAAS,CAC1B;QACE,IAAI,EAAE,UAAU;QAChB,OAAO,EAAE,OAAO;KACjB,EACD;QACE,YAAY,EAAE;YACZ,KAAK,EAAE,EAAE;SACV;KACF,CACF,CAAC;IAEF,aAAa;IACb,MAAM,gBAAgB,GAAG,SAAS,CAChC,KAAK,EAAE,WAAmB,EAAE,MAAc,EAAE,EAAE;QAC5C,YAAY;QACZ,MAAM,mBAAmB,CAAC,WAAW,CAAC,CAAC;QAEvC,oBAAoB;QACpB,MAAM,YAAY,GAAG,MAAM,aAAa,CAAC,WAAW,CAAC,CAAC;QAEtD,WAAW;QACX,qCAAqC;QACrC,MAAM,UAAU,GAAG,MAAM,CAAC,QAAQ,KAAK,aAAa;YAClD,CAAC,CAAC,MAAM,CAAE,8BAA8B;YACxC,CAAC,CAAC,mBAAmB,CAAC,MAAM,CAAC,CAAC,CAAE,yBAAyB;QAE3D,gBAAgB;QAChB,OAAO,MAAM,YAAY,CAAC,YAAY,CAAC,YAAY,EAAE,UAAU,CAAC,CAAC;IACnE,CAAC,EACD,CAAC,EAAE,SAAS;IACZ,IAAI,CAAC,SAAS;KACf,CAAC;IAEF,iCAAiC;IACjC,MAAM,CAAC,IAAI,CACT,eAAe,EACf;;;;;;;;;;;;;uCAamC,EACnC;QACE,YAAY,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,CAAC,+IAA+I,CAAC;QAClL,MAAM,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,CAAC,8HAA8H,CAAC;KAC5J,EACD,KAAK,EAAE,MAAM,EAAE,EAAE;QACf,IAAI,CAAC;YACH,MAAM,CAAC,IAAI,CAAC,iBAAiB,EAAE;gBAC7B,MAAM,EAAE,MAAM,CAAC,YAAY;gBAC3B,MAAM,EAAE,MAAM,CAAC,MA
AM;aACtB,CAAC,CAAC;YAEH,YAAY;YACZ,MAAM,MAAM,GAAG,MAAM,gBAAgB,CAAC,MAAM,CAAC,YAAY,EAAE,MAAM,CAAC,MAAM,CAAC,CAAC;YAE1E,MAAM,CAAC,IAAI,CAAC,uCAAuC,CAAC,CAAC;YACrD,OAAO,qBAAqB,CAAC,MAAM,CAAC,CAAC;QACvC,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,MAAM,CAAC,KAAK,CAAC,uBAAuB,EAAE;gBACpC,KAAK,EAAE,KAAK,YAAY,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,KAAK,CAAC;aAC9D,CAAC,CAAC;YAEH,OAAO,mBAAmB,CACxB,KAAK,YAAY,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC,CAAC,eAAe,CACzD,CAAC;QACJ,CAAC;IACH,CAAC,CACF,CAAC;IAEF,OAAO,MAAM,CAAC;AAChB,CAAC;AAED;;GAEG;AACH,KAAK,UAAU,IAAI;IACjB,IAAI,CAAC;QACH,MAAM,MAAM,GAAG,MAAM,YAAY,EAAE,CAAC;QACpC,MAAM,SAAS,GAAG,IAAI,oBAAoB,EAAE,CAAC;QAC7C,MAAM,MAAM,CAAC,OAAO,CAAC,SAAS,CAAC,CAAC;QAEhC,MAAM,CAAC,IAAI,CAAC,+CAA+C,CAAC,CAAC;IAC/D,CAAC;IAAC,OAAO,KAAK,EAAE,CAAC;QACf,MAAM,CAAC,KAAK,CAAC,iCAAiC,EAAE;YAC9C,KAAK,EAAE,KAAK,YAAY,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,KAAK,CAAC;SAC9D,CAAC,CAAC;QACH,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;IAClB,CAAC;AACH,CAAC;AAED,SAAS;AACT,OAAO,CAAC,EAAE,CAAC,mBAAmB,EAAE,CAAC,KAAK,EAAE,EAAE;IACxC,MAAM,CAAC,KAAK,CAAC,oBAAoB,EAAE,EAAE,KAAK,EAAE,KAAK,CAAC,OAAO,EAAE,KAAK,EAAE,KAAK,CAAC,KAAK,EAAE,CAAC,CAAC;IACjF,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,OAAO,CAAC,EAAE,CAAC,oBAAoB,EAAE,CAAC,MAAM,EAAE,EAAE;IAC1C,MAAM,CAAC,KAAK,CAAC,qBAAqB,EAAE,EAAE,MAAM,EAAE,CAAC,CAAC;IAChD,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,OAAO,CAAC,EAAE,CAAC,QAAQ,EAAE,GAAG,EAAE;IACxB,MAAM,CAAC,IAAI,CAAC,2CAA2C,CAAC,CAAC;IACzD,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,OAAO,CAAC,EAAE,CAAC,SAAS,EAAE,GAAG,EAAE;IACzB,MAAM,CAAC,IAAI,CAAC,4CAA4C,CAAC,CAAC;IAC1D,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,IAAI,EAAE,CAAC"}
|
package/build/siliconflow-client.d.ts
ADDED
@@ -0,0 +1,23 @@
+/**
+ * SiliconFlow DeepSeek-OCR API client
+ * Built on the OpenAI-compatible API
+ */
+import type { LumaConfig } from './config.js';
+import type { VisionClient } from './vision-client.js';
+/**
+ * SiliconFlow API client
+ */
+export declare class SiliconFlowClient implements VisionClient {
+    private config;
+    private apiEndpoint;
+    constructor(config: LumaConfig);
+    /**
+     * Analyze an image
+     */
+    analyzeImage(imageDataUrl: string, prompt: string, enableThinking?: boolean): Promise<string>;
+    /**
+     * Get the model name
+     */
+    getModelName(): string;
+}
+//# sourceMappingURL=siliconflow-client.d.ts.map
package/build/siliconflow-client.d.ts.map
ADDED
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"siliconflow-client.d.ts","sourceRoot":"","sources":["../src/siliconflow-client.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAGH,OAAO,KAAK,EAAE,UAAU,EAAE,MAAM,aAAa,CAAC;AAC9C,OAAO,KAAK,EAAE,YAAY,EAAE,MAAM,oBAAoB,CAAC;AA2CvD;;GAEG;AACH,qBAAa,iBAAkB,YAAW,YAAY;IACpD,OAAO,CAAC,MAAM,CAAa;IAC3B,OAAO,CAAC,WAAW,CAAoD;gBAE3D,MAAM,EAAE,UAAU;IAI9B;;OAEG;IACG,YAAY,CAAC,YAAY,EAAE,MAAM,EAAE,MAAM,EAAE,MAAM,EAAE,cAAc,CAAC,EAAE,OAAO,GAAG,OAAO,CAAC,MAAM,CAAC;IAsEnG;;OAEG;IACH,YAAY,IAAI,MAAM;CAGvB"}
|
|
package/build/siliconflow-client.js
ADDED
@@ -0,0 +1,85 @@
+/**
+ * SiliconFlow DeepSeek-OCR API client
+ * Built on the OpenAI-compatible API
+ */
+import axios from 'axios';
+import { logger } from './utils/logger.js';
+/**
+ * SiliconFlow API client
+ */
+export class SiliconFlowClient {
+    config;
+    apiEndpoint = 'https://api.siliconflow.cn/v1/chat/completions';
+    constructor(config) {
+        this.config = config;
+    }
+    /**
+     * Analyze an image
+     */
+    async analyzeImage(imageDataUrl, prompt, enableThinking) {
+        const requestBody = {
+            model: this.config.model,
+            messages: [
+                {
+                    role: 'user',
+                    content: [
+                        {
+                            type: 'image_url',
+                            image_url: {
+                                url: imageDataUrl,
+                            },
+                        },
+                        {
+                            type: 'text',
+                            text: prompt,
+                        },
+                    ],
+                },
+            ],
+            temperature: this.config.temperature,
+            max_tokens: this.config.maxTokens,
+            top_p: this.config.topP,
+            stream: false,
+        };
+        logger.info('Calling SiliconFlow DeepSeek-OCR API', {
+            model: this.config.model,
+        });
+        try {
+            const response = await axios.post(this.apiEndpoint, requestBody, {
+                headers: {
+                    'Authorization': `Bearer ${this.config.apiKey}`,
+                    'Content-Type': 'application/json',
+                },
+                timeout: 60000, // 60-second timeout
+            });
+            if (!response.data.choices || response.data.choices.length === 0) {
+                throw new Error('No response from DeepSeek-OCR');
+            }
+            const result = response.data.choices[0].message.content;
+            const usage = response.data.usage;
+            logger.info('SiliconFlow API call successful', {
+                tokens: usage?.total_tokens || 0,
+                model: response.data.model
+            });
+            return result;
+        }
+        catch (error) {
+            logger.error('SiliconFlow API call failed', {
+                error: error instanceof Error ? error.message : String(error)
+            });
+            if (axios.isAxiosError(error)) {
+                const message = error.response?.data?.error?.message || error.message;
+                const status = error.response?.status;
+                throw new Error(`SiliconFlow API error (${status || 'unknown'}): ${message}`);
+            }
+            throw error;
+        }
+    }
+    /**
+     * Get the model name
+     */
+    getModelName() {
+        return this.config.model;
+    }
+}
+//# sourceMappingURL=siliconflow-client.js.map
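The client above sends a single user message whose content holds the image first and the text prompt second, together with the sampling parameters from the config. As a sketch of that request shape in isolation, the following builds the same body as a pure function with explicit types. The names `buildChatRequest`, `SamplingOptions`, `ContentPart`, and `ChatRequest` are assumptions made for illustration, and the types describe only the fields visible in this diff, not the full SiliconFlow API.

```typescript
// Sketch only: types cover just the fields that appear in the diff.
// `buildChatRequest` and the type names are illustrative, not part of luma-mcp.
interface SamplingOptions {
  model: string;
  temperature: number;
  maxTokens: number;
  topP: number;
}

type ContentPart =
  | { type: 'image_url'; image_url: { url: string } }
  | { type: 'text'; text: string };

interface ChatRequest {
  model: string;
  messages: { role: 'user'; content: ContentPart[] }[];
  temperature: number;
  max_tokens: number;
  top_p: number;
  stream: false;
}

function buildChatRequest(imageDataUrl: string, prompt: string, options: SamplingOptions): ChatRequest {
  return {
    model: options.model,
    messages: [
      {
        role: 'user',
        // Image first, then the text prompt, matching the order in the diff.
        content: [
          { type: 'image_url', image_url: { url: imageDataUrl } },
          { type: 'text', text: prompt },
        ],
      },
    ],
    // camelCase config fields map to the snake_case names used on the wire.
    temperature: options.temperature,
    max_tokens: options.maxTokens,
    top_p: options.topP,
    stream: false,
  };
}
```

Keeping the body construction separate from the HTTP call makes the camelCase to snake_case mapping easy to check without a network request.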
package/build/siliconflow-client.js.map
ADDED
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"siliconflow-client.js","sourceRoot":"","sources":["../src/siliconflow-client.ts"],"names":[],"mappings":"AAAA;;;GAGG;AAEH,OAAO,KAAK,MAAM,OAAO,CAAC;AAG1B,OAAO,EAAE,MAAM,EAAE,MAAM,mBAAmB,CAAC;AA0C3C;;GAEG;AACH,MAAM,OAAO,iBAAiB;IACpB,MAAM,CAAa;IACnB,WAAW,GAAG,gDAAgD,CAAC;IAEvE,YAAY,MAAkB;QAC5B,IAAI,CAAC,MAAM,GAAG,MAAM,CAAC;IACvB,CAAC;IAED;;OAEG;IACH,KAAK,CAAC,YAAY,CAAC,YAAoB,EAAE,MAAc,EAAE,cAAwB;QAC/E,MAAM,WAAW,GAAuB;YACtC,KAAK,EAAE,IAAI,CAAC,MAAM,CAAC,KAAK;YACxB,QAAQ,EAAE;gBACR;oBACE,IAAI,EAAE,MAAM;oBACZ,OAAO,EAAE;wBACP;4BACE,IAAI,EAAE,WAAW;4BACjB,SAAS,EAAE;gCACT,GAAG,EAAE,YAAY;6BAClB;yBACF;wBACD;4BACE,IAAI,EAAE,MAAM;4BACZ,IAAI,EAAE,MAAM;yBACb;qBACF;iBACF;aACF;YACD,WAAW,EAAE,IAAI,CAAC,MAAM,CAAC,WAAW;YACpC,UAAU,EAAE,IAAI,CAAC,MAAM,CAAC,SAAS;YACjC,KAAK,EAAE,IAAI,CAAC,MAAM,CAAC,IAAI;YACvB,MAAM,EAAE,KAAK;SACd,CAAC;QAEF,MAAM,CAAC,IAAI,CAAC,sCAAsC,EAAE;YAClD,KAAK,EAAE,IAAI,CAAC,MAAM,CAAC,KAAK;SACzB,CAAC,CAAC;QAEH,IAAI,CAAC;YACH,MAAM,QAAQ,GAAG,MAAM,KAAK,CAAC,IAAI,CAC/B,IAAI,CAAC,WAAW,EAChB,WAAW,EACX;gBACE,OAAO,EAAE;oBACP,eAAe,EAAE,UAAU,IAAI,CAAC,MAAM,CAAC,MAAM,EAAE;oBAC/C,cAAc,EAAE,kBAAkB;iBACnC;gBACD,OAAO,EAAE,KAAK,EAAE,QAAQ;aACzB,CACF,CAAC;YAEF,IAAI,CAAC,QAAQ,CAAC,IAAI,CAAC,OAAO,IAAI,QAAQ,CAAC,IAAI,CAAC,OAAO,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;gBACjE,MAAM,IAAI,KAAK,CAAC,+BAA+B,CAAC,CAAC;YACnD,CAAC;YAED,MAAM,MAAM,GAAG,QAAQ,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,CAAC,OAAO,CAAC,OAAO,CAAC;YACxD,MAAM,KAAK,GAAG,QAAQ,CAAC,IAAI,CAAC,KAAK,CAAC;YAElC,MAAM,CAAC,IAAI,CAAC,iCAAiC,EAAE;gBAC7C,MAAM,EAAE,KAAK,EAAE,YAAY,IAAI,CAAC;gBAChC,KAAK,EAAE,QAAQ,CAAC,IAAI,CAAC,KAAK;aAC3B,CAAC,CAAC;YAEH,OAAO,MAAM,CAAC;QAChB,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,MAAM,CAAC,KAAK,CAAC,6BAA6B,EAAE;gBAC1C,KAAK,EAAE,KAAK,YAAY,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,KAAK,CAAC;aAC9D,CAAC,CAAC;YAEH,IAAI,KAAK,CAAC,YAAY,CAAC,KAAK,CAAC,EAAE,CAAC;gBAC9B,MAAM,OAAO,GAAG,KAAK,CAAC,QAAQ,EAAE,IAAI,EAAE,KAAK,EAAE,OAAO,IAAI,KAAK,CAAC,OAAO,CAAC;gBACtE,MAAM,MAAM,GAAG,KAAK,CAAC,QAAQ,EAAE,MAAM,CAAC
;gBACtC,MAAM,IAAI,KAAK,CAAC,0BAA0B,MAAM,IAAI,SAAS,MAAM,OAAO,EAAE,CAAC,CAAC;YAChF,CAAC;YACD,MAAM,KAAK,CAAC;QACd,CAAC;IACH,CAAC;IAED;;OAEG;IACH,YAAY;QACV,OAAO,IAAI,CAAC,MAAM,CAAC,KAAK,CAAC;IAC3B,CAAC;CACF"}
|
|
package/build/vision-client.d.ts
ADDED
@@ -0,0 +1,18 @@
+/**
+ * Unified interface for vision model clients
+ */
+export interface VisionClient {
+    /**
+     * Analyze an image
+     * @param imageDataUrl Image data URL or URL
+     * @param prompt Analysis prompt
+     * @param enableThinking Whether to enable thinking mode (if the model supports it)
+     * @returns Analysis result text
+     */
+    analyzeImage(imageDataUrl: string, prompt: string, enableThinking?: boolean): Promise<string>;
+    /**
+     * Get the model name
+     */
+    getModelName(): string;
+}
+//# sourceMappingURL=vision-client.d.ts.map
package/build/vision-client.d.ts.map
ADDED
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"vision-client.d.ts","sourceRoot":"","sources":["../src/vision-client.ts"],"names":[],"mappings":"AAAA;;GAEG;AAEH,MAAM,WAAW,YAAY;IAC3B;;;;;;OAMG;IACH,YAAY,CAAC,YAAY,EAAE,MAAM,EAAE,MAAM,EAAE,MAAM,EAAE,cAAc,CAAC,EAAE,OAAO,GAAG,OAAO,CAAC,MAAM,CAAC,CAAC;IAE9F;;OAEG;IACH,YAAY,IAAI,MAAM,CAAC;CACxB"}
|
|
package/build/vision-client.js.map
ADDED
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"vision-client.js","sourceRoot":"","sources":["../src/vision-client.ts"],"names":[],"mappings":"AAAA;;GAEG"}
|
package/build/zhipu-client.d.ts
CHANGED
|
@@ -2,10 +2,11 @@
|
|
|
2
2
|
* 智谱 GLM-4.5V API 客户端
|
|
3
3
|
*/
|
|
4
4
|
import type { LumaConfig } from './config.js';
|
|
5
|
+
import type { VisionClient } from './vision-client.js';
|
|
5
6
|
/**
|
|
6
7
|
* Zhipu API client
|
|
7
8
|
*/
|
|
8
|
-
export declare class ZhipuClient {
|
|
9
|
+
export declare class ZhipuClient implements VisionClient {
|
|
9
10
|
private config;
|
|
10
11
|
private apiEndpoint;
|
|
11
12
|
constructor(config: LumaConfig);
|
|
@@ -13,5 +14,9 @@ export declare class ZhipuClient {
|
|
|
13
14
|
* Analyze an image
|
|
14
15
|
*/
|
|
15
16
|
analyzeImage(imageDataUrl: string, prompt: string, enableThinking?: boolean): Promise<string>;
|
|
17
|
+
/**
|
|
18
|
+
* Get the model name
|
|
19
|
+
*/
|
|
20
|
+
getModelName(): string;
|
|
16
21
|
}
|
|
17
22
|
//# sourceMappingURL=zhipu-client.d.ts.map
|
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"zhipu-client.d.ts","sourceRoot":"","sources":["../src/zhipu-client.ts"],"names":[],"mappings":"AAAA;;GAEG;AAGH,OAAO,KAAK,EAAE,UAAU,EAAE,MAAM,aAAa,CAAC;
|
|
1
|
+
{"version":3,"file":"zhipu-client.d.ts","sourceRoot":"","sources":["../src/zhipu-client.ts"],"names":[],"mappings":"AAAA;;GAEG;AAGH,OAAO,KAAK,EAAE,UAAU,EAAE,MAAM,aAAa,CAAC;AAC9C,OAAO,KAAK,EAAE,YAAY,EAAE,MAAM,oBAAoB,CAAC;AA4CvD;;GAEG;AACH,qBAAa,WAAY,YAAW,YAAY;IAC9C,OAAO,CAAC,MAAM,CAAa;IAC3B,OAAO,CAAC,WAAW,CAA2D;gBAElE,MAAM,EAAE,UAAU;IAI9B;;OAEG;IACG,YAAY,CAAC,YAAY,EAAE,MAAM,EAAE,MAAM,EAAE,MAAM,EAAE,cAAc,CAAC,EAAE,OAAO,GAAG,OAAO,CAAC,MAAM,CAAC;IA4EnG;;OAEG;IACH,YAAY,IAAI,MAAM;CAGvB"}
|
package/build/zhipu-client.js
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"zhipu-client.js","sourceRoot":"","sources":["../src/zhipu-client.ts"],"names":[],"mappings":"AAAA;;GAEG;AAEH,OAAO,KAAK,MAAM,OAAO,CAAC;
|
|
1
|
+
{"version":3,"file":"zhipu-client.js","sourceRoot":"","sources":["../src/zhipu-client.ts"],"names":[],"mappings":"AAAA;;GAEG;AAEH,OAAO,KAAK,MAAM,OAAO,CAAC;AAG1B,OAAO,EAAE,MAAM,EAAE,MAAM,mBAAmB,CAAC;AA2C3C;;GAEG;AACH,MAAM,OAAO,WAAW;IACd,MAAM,CAAa;IACnB,WAAW,GAAG,uDAAuD,CAAC;IAE9E,YAAY,MAAkB;QAC5B,IAAI,CAAC,MAAM,GAAG,MAAM,CAAC;IACvB,CAAC;IAED;;OAEG;IACH,KAAK,CAAC,YAAY,CAAC,YAAoB,EAAE,MAAc,EAAE,cAAwB;QAC/E,MAAM,WAAW,GAAiB;YAChC,KAAK,EAAE,IAAI,CAAC,MAAM,CAAC,KAAK;YACxB,QAAQ,EAAE;gBACR;oBACE,IAAI,EAAE,MAAM;oBACZ,OAAO,EAAE;wBACP;4BACE,IAAI,EAAE,WAAW;4BACjB,SAAS,EAAE;gCACT,GAAG,EAAE,YAAY;6BAClB;yBACF;wBACD;4BACE,IAAI,EAAE,MAAM;4BACZ,IAAI,EAAE,MAAM;yBACb;qBACF;iBACF;aACF;YACD,WAAW,EAAE,IAAI,CAAC,MAAM,CAAC,WAAW;YACpC,UAAU,EAAE,IAAI,CAAC,MAAM,CAAC,SAAS;YACjC,KAAK,EAAE,IAAI,CAAC,MAAM,CAAC,IAAI;YACvB,QAAQ,EAAE,EAAE,IAAI,EAAE,SAAS,EAAE,EAAE,mBAAmB;SACnD,CAAC;QAEF,2BAA2B;QAC3B,IAAI,IAAI,CAAC,MAAM,CAAC,cAAc,KAAK,KAAK,IAAI,cAAc,KAAK,KAAK,EAAE,CAAC;YACrE,OAAO,WAAW,CAAC,QAAQ,CAAC;QAC9B,CAAC;QAED,MAAM,CAAC,IAAI,CAAC,sBAAsB,EAAE;YAClC,KAAK,EAAE,IAAI,CAAC,MAAM,CAAC,KAAK;YACxB,QAAQ,EAAE,CAAC,CAAC,WAAW,CAAC,QAAQ;SACjC,CAAC,CAAC;QAEH,IAAI,CAAC;YACH,MAAM,QAAQ,GAAG,MAAM,KAAK,CAAC,IAAI,CAC/B,IAAI,CAAC,WAAW,EAChB,WAAW,EACX;gBACE,OAAO,EAAE;oBACP,eAAe,EAAE,UAAU,IAAI,CAAC,MAAM,CAAC,MAAM,EAAE;oBAC/C,cAAc,EAAE,kBAAkB;iBACnC;gBACD,OAAO,EAAE,KAAK,EAAE,QAAQ;aACzB,CACF,CAAC;YAEF,IAAI,CAAC,QAAQ,CAAC,IAAI,CAAC,OAAO,IAAI,QAAQ,CAAC,IAAI,CAAC,OAAO,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;gBACjE,MAAM,IAAI,KAAK,CAAC,2BAA2B,CAAC,CAAC;YAC/C,CAAC;YAED,MAAM,MAAM,GAAG,QAAQ,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,CAAC,OAAO,CAAC,OAAO,CAAC;YACxD,MAAM,KAAK,GAAG,QAAQ,CAAC,IAAI,CAAC,KAAK,CAAC;YAElC,MAAM,CAAC,IAAI,CAAC,8BAA8B,EAAE;gBAC1C,MAAM,EAAE,KAAK,EAAE,YAAY,IAAI,CAAC;gBAChC,KAAK,EAAE,QAAQ,CAAC,IAAI,CAAC,KAAK;aAC3B,CAAC,CAAC;YAEH,OAAO,MAAM,CAAC;QAChB,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,MAAM,CAAC,KAAK,CAAC,0BAA0B,EAAE;gBACvC,KAAK,EAAE,KAAK,YAAY,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,KAAK,CAAC;aAC9D,CAAC,CAAC;YAEH,
IAAI,KAAK,CAAC,YAAY,CAAC,KAAK,CAAC,EAAE,CAAC;gBAC9B,MAAM,OAAO,GAAG,KAAK,CAAC,QAAQ,EAAE,IAAI,EAAE,KAAK,EAAE,OAAO,IAAI,KAAK,CAAC,OAAO,CAAC;gBACtE,MAAM,MAAM,GAAG,KAAK,CAAC,QAAQ,EAAE,MAAM,CAAC;gBACtC,MAAM,IAAI,KAAK,CAAC,uBAAuB,MAAM,IAAI,SAAS,MAAM,OAAO,EAAE,CAAC,CAAC;YAC7E,CAAC;YACD,MAAM,KAAK,CAAC;QACd,CAAC;IACH,CAAC;IAED;;OAEG;IACH,YAAY;QACV,OAAO,IAAI,CAAC,MAAM,CAAC,KAAK,CAAC;IAC3B,CAAC;CACF"}
|
package/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "luma-mcp",
|
|
3
|
-
"version": "1.0
|
|
4
|
-
"description": "
|
|
3
|
+
"version": "1.1.0",
|
|
4
|
+
"description": "Multi-model vision understanding MCP server. Supports GLM-4.5V (Zhipu) and DeepSeek-OCR (SiliconFlow - Free)",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"bin": {
|
|
7
7
|
"luma-mcp": "build/index.js"
|
|
@@ -19,7 +19,12 @@
|
|
|
19
19
|
"ai",
|
|
20
20
|
"glm-4.5v",
|
|
21
21
|
"zhipu",
|
|
22
|
-
"
|
|
22
|
+
"deepseek-ocr",
|
|
23
|
+
"siliconflow",
|
|
24
|
+
"ocr",
|
|
25
|
+
"free",
|
|
26
|
+
"image-understanding",
|
|
27
|
+
"multi-model"
|
|
23
28
|
],
|
|
24
29
|
"author": "Jochen",
|
|
25
30
|
"license": "MIT",
|
|
@@ -0,0 +1,94 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Test the DeepSeek-OCR API directly (no wrapper)
|
|
3
|
+
*/
|
|
4
|
+
|
|
5
|
+
import axios from 'axios';
|
|
6
|
+
import * as fs from 'fs';
|
|
7
|
+
import * as path from 'path';
|
|
8
|
+
|
|
9
|
+
async function testDeepSeekOCR(imagePath: string) {
|
|
10
|
+
console.log('\n🧪 Testing the DeepSeek-OCR API (raw call)\n');
|
|
11
|
+
|
|
12
|
+
const apiKey = process.env.SILICONFLOW_API_KEY ?? ''; // Read the key from the environment instead of hard-coding a secret
|
|
13
|
+
|
|
14
|
+
// Read the image and convert it to base64
|
|
15
|
+
const imageBuffer = fs.readFileSync(imagePath);
|
|
16
|
+
const base64Image = imageBuffer.toString('base64');
|
|
17
|
+
const mimeType = imagePath.endsWith('.png') ? 'image/png' : 'image/jpeg';
|
|
18
|
+
const imageDataUrl = `data:${mimeType};base64,${base64Image}`;
|
|
19
|
+
|
|
20
|
+
console.log(`📸 Image: ${imagePath}`);
|
|
21
|
+
console.log(`📦 Size: ${(imageBuffer.length / 1024).toFixed(2)} KB\n`);
|
|
22
|
+
|
|
23
|
+
// Try different prompts
|
|
24
|
+
const prompts = [
|
|
25
|
+
'识别图片中的所有文字',
|
|
26
|
+
'OCR',
|
|
27
|
+
'Extract all text from this image',
|
|
28
|
+
'What do you see in this image?',
|
|
29
|
+
'请详细描述这张图片'
|
|
30
|
+
];
|
|
31
|
+
|
|
32
|
+
for (const prompt of prompts) {
|
|
33
|
+
console.log(`\n🔍 Testing prompt: "${prompt}"`);
|
|
34
|
+
console.log('─'.repeat(50));
|
|
35
|
+
|
|
36
|
+
try {
|
|
37
|
+
const response = await axios.post(
|
|
38
|
+
'https://api.siliconflow.cn/v1/chat/completions',
|
|
39
|
+
{
|
|
40
|
+
model: 'deepseek-ai/DeepSeek-OCR',
|
|
41
|
+
messages: [
|
|
42
|
+
{
|
|
43
|
+
role: 'user',
|
|
44
|
+
content: [
|
|
45
|
+
{
|
|
46
|
+
type: 'image_url',
|
|
47
|
+
image_url: {
|
|
48
|
+
url: imageDataUrl,
|
|
49
|
+
},
|
|
50
|
+
},
|
|
51
|
+
{
|
|
52
|
+
type: 'text',
|
|
53
|
+
text: prompt,
|
|
54
|
+
},
|
|
55
|
+
],
|
|
56
|
+
},
|
|
57
|
+
],
|
|
58
|
+
temperature: 0.7,
|
|
59
|
+
max_tokens: 4096,
|
|
60
|
+
},
|
|
61
|
+
{
|
|
62
|
+
headers: {
|
|
63
|
+
'Authorization': `Bearer ${apiKey}`,
|
|
64
|
+
'Content-Type': 'application/json',
|
|
65
|
+
},
|
|
66
|
+
timeout: 60000,
|
|
67
|
+
}
|
|
68
|
+
);
|
|
69
|
+
|
|
70
|
+
const result = response.data.choices?.[0]?.message?.content;
|
|
71
|
+
const usage = response.data.usage;
|
|
72
|
+
|
|
73
|
+
console.log(`✅ Tokens: ${usage.total_tokens} (prompt: ${usage.prompt_tokens}, completion: ${usage.completion_tokens})`);
|
|
74
|
+
console.log(`📝 Response length: ${result?.length || 0} characters`);
|
|
75
|
+
|
|
76
|
+
if (result && result.trim().length > 0) {
|
|
77
|
+
console.log('\n📊 Result:');
|
|
78
|
+
console.log('─'.repeat(50));
|
|
79
|
+
console.log(result);
|
|
80
|
+
console.log('─'.repeat(50));
|
|
81
|
+
console.log('\n✅ Found a valid response!');
|
|
82
|
+
break;
|
|
83
|
+
} else {
|
|
84
|
+
console.log('❌ Empty response');
|
|
85
|
+
}
|
|
86
|
+
} catch (error: any) {
|
|
87
|
+
console.log(`❌ Error: ${error.message}`);
|
|
88
|
+
}
|
|
89
|
+
}
|
|
90
|
+
}
|
|
91
|
+
|
|
92
|
+
// Run the test
|
|
93
|
+
const imagePath = path.join(process.cwd(), 'test.png');
|
|
94
|
+
testDeepSeekOCR(imagePath).catch(console.error);
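The test script above builds a Data URL from a file and authenticates with an API key. The following is a minimal sketch of those two steps as small, checkable helpers, assuming the `SILICONFLOW_API_KEY` variable name from `.env.example`; `toDataUrl`, `requireApiKey`, and the extension table are hypothetical names introduced for illustration, not functions from the package.

```typescript
import { Buffer } from 'buffer';
import * as path from 'path';

// Hypothetical extension table; extend it as needed.
const MIME_BY_EXT: Record<string, string> = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.webp': 'image/webp',
  '.gif': 'image/gif',
};

// Encode raw image bytes as a Data URL, rejecting unknown extensions
// instead of silently labelling them as JPEG.
function toDataUrl(fileName: string, bytes: Buffer): string {
  const ext = path.extname(fileName).toLowerCase();
  const mimeType = MIME_BY_EXT[ext];
  if (!mimeType) {
    throw new Error(`Unsupported image extension: ${ext || '(none)'}`);
  }
  return `data:${mimeType};base64,${bytes.toString('base64')}`;
}

// Read the API key from an environment-like object and fail early if it is missing.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env.SILICONFLOW_API_KEY;
  if (!key) {
    throw new Error('SILICONFLOW_API_KEY is not set');
  }
  return key;
}

console.log(toDataUrl('test.png', Buffer.from('hi'))); // prints "data:image/png;base64,aGk="
```

In a script like the one above, `requireApiKey(process.env)` would stand in for a literal key, so no secret needs to appear in the published package.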
|
package/test/test-local.ts
CHANGED
|
@@ -4,7 +4,9 @@
|
|
|
4
4
|
*/
|
|
5
5
|
|
|
6
6
|
import { loadConfig } from '../src/config.js';
|
|
7
|
+
import type { VisionClient } from '../src/vision-client.js';
|
|
7
8
|
import { ZhipuClient } from '../src/zhipu-client.js';
|
|
9
|
+
import { SiliconFlowClient } from '../src/siliconflow-client.js';
|
|
8
10
|
import { imageToBase64, validateImageSource } from '../src/image-processor.js';
|
|
9
11
|
import { buildAnalysisPrompt } from '../src/prompts.js';
|
|
10
12
|
import { logger } from '../src/utils/logger.js';
|
|
@@ -18,7 +20,7 @@ async function testImageAnalysis(imagePath: string, question?: string) {
|
|
|
18
20
|
// 1. Load configuration
|
|
19
21
|
console.log('📝 Loading configuration...');
|
|
20
22
|
const config = loadConfig();
|
|
21
|
-
console.log(`✅ Configuration loaded: model ${config.model}\n`);
|
|
23
|
+
console.log(`✅ Configuration loaded: provider ${config.provider}, model ${config.model}\n`);
|
|
22
24
|
|
|
23
25
|
// 2. Validate the image
|
|
24
26
|
console.log('🔍 Validating image source...');
|
|
@@ -33,12 +35,19 @@ async function testImageAnalysis(imagePath: string, question?: string) {
|
|
|
33
35
|
|
|
34
36
|
// 4. Build the prompt
|
|
35
37
|
console.log('💬 Building prompt...');
|
|
36
|
-
|
|
38
|
+
// DeepSeek-OCR needs a concise prompt
|
|
39
|
+
const prompt = config.provider === 'siliconflow'
|
|
40
|
+
? (question || '请详细分析这张图片的内容')
|
|
41
|
+
: buildAnalysisPrompt(question);
|
|
37
42
|
console.log(`✅ Prompt: ${question || 'general description'}\n`);
|
|
38
43
|
|
|
39
|
-
// 5.
|
|
40
|
-
|
|
41
|
-
|
|
44
|
+
// 5. Create the client and call the API
|
|
45
|
+
const client: VisionClient = config.provider === 'siliconflow'
|
|
46
|
+
? new SiliconFlowClient(config)
|
|
47
|
+
: new ZhipuClient(config);
|
|
48
|
+
|
|
49
|
+
const modelName = config.provider === 'siliconflow' ? 'DeepSeek-OCR' : 'GLM-4.5V';
|
|
50
|
+
console.log(`🤖 Calling ${modelName} API...`);
|
|
42
51
|
const result = await client.analyzeImage(imageDataUrl, prompt);
|
|
43
52
|
|
|
44
53
|
// 6. Display the result
|
|
@@ -76,8 +85,12 @@ if (args.length === 0) {
|
|
|
76
85
|
npm run test:local https://example.com/image.jpg
|
|
77
86
|
|
|
78
87
|
Environment variables:
|
|
79
|
-
|
|
80
|
-
|
|
88
|
+
# Use Zhipu GLM-4.5V
|
|
89
|
+
ZHIPU_API_KEY=your-api-key
|
|
90
|
+
|
|
91
|
+
# Use SiliconFlow DeepSeek-OCR
|
|
92
|
+
MODEL_PROVIDER=siliconflow
|
|
93
|
+
SILICONFLOW_API_KEY=your-api-key
|
|
81
94
|
`);
|
|
82
95
|
process.exit(1);
|
|
83
96
|
}
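The changes to `test-local.ts` above branch on `config.provider` in three places: once for the prompt, once for the client, and once for the display name. The following is a minimal sketch of one way to keep that per-provider behaviour in a single table, assuming the two provider values and the `zhipu` default documented in `.env.example`. `PROFILES`, `parseProvider`, and `structuredPrompt` are hypothetical names; `structuredPrompt` only stands in for `buildAnalysisPrompt`, whose real output is not shown in this diff.

```typescript
type Provider = 'zhipu' | 'siliconflow';

interface ProviderProfile {
  displayName: string;
  buildPrompt: (question?: string) => string;
}

// Hypothetical stand-in for buildAnalysisPrompt from src/prompts.js.
function structuredPrompt(question?: string): string {
  return question
    ? `Analyze the image and answer: ${question}`
    : 'Describe the image in detail.';
}

// One entry per provider, so adding a provider touches a single place.
const PROFILES: Record<Provider, ProviderProfile> = {
  zhipu: { displayName: 'GLM-4.5V', buildPrompt: structuredPrompt },
  siliconflow: {
    displayName: 'DeepSeek-OCR',
    // Pass the question through unchanged, since DeepSeek-OCR needs a concise prompt.
    buildPrompt: (question) => question || 'Please analyze the content of this image in detail',
  },
};

// Validate a raw MODEL_PROVIDER value, defaulting to zhipu when it is unset.
function parseProvider(raw: string | undefined): Provider {
  const value = (raw ?? 'zhipu').toLowerCase();
  if (value === 'zhipu' || value === 'siliconflow') {
    return value;
  }
  throw new Error(`Unknown MODEL_PROVIDER: ${raw}`);
}

const profile = PROFILES[parseProvider('siliconflow')];
console.log(`${profile.displayName}: ${profile.buildPrompt('OCR')}`); // prints "DeepSeek-OCR: OCR"
```

Rejecting unknown provider values up front is a design choice of this sketch; the diff does not show how `loadConfig` treats them.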
|