cicy-desktop 1.0.8
- package/.github/workflows/build.yml +85 -0
- package/.kiro/steering/dev-workflow.md +166 -0
- package/AGENTS.md +247 -0
- package/CLAUDE.md +162 -0
- package/DOCKER.md +85 -0
- package/Dockerfile +46 -0
- package/README.md +720 -0
- package/TODO-anti-detection.md +326 -0
- package/bin/cicy +176 -0
- package/bin/preinstall.sh +32 -0
- package/copy-to-desktop.sh +26 -0
- package/docs/AUTOMATION-API.md +342 -0
- package/docs/REQUEST_MONITORING.md +435 -0
- package/docs/REST-API-FEATURE.md +155 -0
- package/docs/REST-API.md +319 -0
- package/docs/feature-distributed-multi-agent.md +555 -0
- package/docs/yaml.md +255 -0
- package/electron-mcp-fixed.command +134 -0
- package/electron-mcp-simple.command +135 -0
- package/electron-mcp.command +92 -0
- package/generate-openapi.js +158 -0
- package/jest.config.js +10 -0
- package/jest.setup.global.js +13 -0
- package/jest.teardown.global.js +7 -0
- package/package.json +75 -0
- package/service.sh +164 -0
- package/src/config.js +8 -0
- package/src/extension/inject.js +135 -0
- package/src/main-old.js +837 -0
- package/src/main.js +403 -0
- package/src/preload-rpc.js +4 -0
- package/src/server/args-parser.js +37 -0
- package/src/server/electron-setup.js +33 -0
- package/src/server/express-app.js +166 -0
- package/src/server/logging.js +58 -0
- package/src/server/mcp-server.js +53 -0
- package/src/server/tool-registry.js +77 -0
- package/src/server/ui-routes.js +81 -0
- package/src/swagger-ui.html +41 -0
- package/src/tools/account-tools.js +194 -0
- package/src/tools/automation-tools.js +297 -0
- package/src/tools/cdp-tools.js +444 -0
- package/src/tools/clipboard-tools.js +180 -0
- package/src/tools/download-tools.js +57 -0
- package/src/tools/exec-js.js +297 -0
- package/src/tools/exec-tools.js +139 -0
- package/src/tools/file-tools.js +212 -0
- package/src/tools/hook-chatgpt.js +489 -0
- package/src/tools/hook-gemini.js +454 -0
- package/src/tools/index.js +19 -0
- package/src/tools/ipc-bridge.js +31 -0
- package/src/tools/ping.js +60 -0
- package/src/tools/r-reset.js +28 -0
- package/src/tools/screenshot-tools.js +28 -0
- package/src/tools/system-tools.js +531 -0
- package/src/tools/window-tools.js +882 -0
- package/src/ui.html +914 -0
- package/src/utils/auth.js +81 -0
- package/src/utils/cdp-utils.js +8 -0
- package/src/utils/download-manager.js +41 -0
- package/src/utils/process-utils.js +185 -0
- package/src/utils/snapshot-utils.js +56 -0
- package/src/utils/window-monitor.js +605 -0
- package/src/utils/window-state.js +137 -0
- package/src/utils/window-utils.js +336 -0
- package/update-desktop.sh +33 -0
@@ -0,0 +1,555 @@
# Feature: Distributed Multi-Agent Cluster

## Overview

Upgrade the Electron MCP Server into a distributed multi-agent cluster that supports large-scale parallel automation tasks.

## Goals

- Support 100+ Agents working in parallel
- Distributed deployment with elastic scaling
- Unified task scheduling and monitoring
- High availability and fault-tolerant design

## Architecture

### 1. Master Node (Scheduling Center)

**Responsibilities:**
- Task dispatch and scheduling
- Load balancing
- Worker health checks
- Agent Hub management UI
- Global state monitoring

**Tech stack:**
- Express.js (HTTP API)
- Socket.IO (real-time communication)
- Redis (task queue + state storage)
- Bull (task queue management)

**Core API:**
```javascript
POST   /api/tasks              // Create a task
GET    /api/tasks/:id          // Query task status
GET    /api/workers            // List Workers
GET    /api/agents             // List all Agents
POST   /api/agents/create      // Create a new Agent
DELETE /api/agents/:id         // Shut down an Agent
```

### 2. Worker Node (Execution Node)

**Responsibilities:**
- Run multiple Electron windows (Agents)
- Pull tasks from the queue
- Execute automation operations
- Report status and results

**Configuration:**
```javascript
{
  "workerId": "worker-1",
  "masterUrl": "http://master:8100",
  "maxAgents": 10,              // Maximum number of Agents
  "port": 8101,
  "resources": {
    "cpu": 4,
    "memory": "8GB"
  }
}
```

**Heartbeat:**
- Send a heartbeat to the Master every 5 seconds
- Report CPU, memory, and Agent status
- A Worker with no heartbeat for more than 30 seconds is considered offline
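
The heartbeat rule can be sketched as a pure function on the Master side. This is a minimal illustration, assuming the Master keeps the timestamp of each Worker's last heartbeat; `isWorkerOffline` and `classifyWorkers` are hypothetical names, not part of the package.

```javascript
// Sketch only: names below are illustrative, not part of the project.
const OFFLINE_TIMEOUT_MS = 30000; // no heartbeat for more than 30 s => offline

// Returns true when the last heartbeat is older than the offline timeout.
function isWorkerOffline(lastHeartbeatAt, now = Date.now()) {
  return now - lastHeartbeatAt > OFFLINE_TIMEOUT_MS;
}

// Splits a map of workerId -> last heartbeat timestamp into online/offline lists.
function classifyWorkers(lastHeartbeats, now = Date.now()) {
  const online = [];
  const offline = [];
  for (const [workerId, ts] of Object.entries(lastHeartbeats)) {
    (isWorkerOffline(ts, now) ? offline : online).push(workerId);
  }
  return { online, offline };
}
```

With a 5-second interval and a 30-second timeout, a Worker is marked offline only after roughly six consecutive heartbeats are missed, which tolerates short network hiccups.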

### 3. Agent (Window Instance)

**Properties:**
```javascript
{
  "agentId": "agent-1",
  "workerId": "worker-1",
  "windowId": 1,
  "accountIdx": 0,
  "status": "idle",             // idle | busy | error
  "currentTask": "task-123",
  "createdAt": "2026-02-09T05:00:00Z"
}
```

**State transitions:**
```
idle → busy → idle
  ↓      ↓
error ← error
```
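
One way to enforce these transitions is a small lookup table. The sketch below reads the diagram as "idle and busy can each move to error, and busy returns to idle"; the recovery path out of `error` is not specified in the diagram, so it is left empty here as an assumption. `TRANSITIONS`, `canTransition`, and `transition` are hypothetical names.

```javascript
// Allowed next states for each Agent status (an interpretation of the diagram).
const TRANSITIONS = {
  idle: ['busy', 'error'],
  busy: ['idle', 'error'],
  error: [], // assumption: leaving `error` requires a restart, not modeled here
};

// Returns true if moving from `from` to `to` is allowed.
function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}

// Returns a copy of the agent with the new status, or throws on an illegal move.
function transition(agent, to) {
  if (!canTransition(agent.status, to)) {
    throw new Error(`Illegal transition: ${agent.status} -> ${to}`);
  }
  return { ...agent, status: to };
}
```

Rejecting illegal moves in one place keeps status updates from different code paths (task dispatch, crash handling) consistent.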

## Core Features

### 1. Task Queue System

**Task structure:**
```javascript
{
  "taskId": "task-123",
  "type": "scrape",             // scrape | test | automation
  "priority": 5,                // 1-10
  "payload": {
    "url": "https://example.com",
    "actions": [
      { "tool": "open_window", "args": { /* ... */ } },
      { "tool": "cdp_click", "args": { /* ... */ } },
      { "tool": "exec_js", "args": { /* ... */ } }
    ]
  },
  "requirements": {
    "accountIdx": 0,            // Target account
    "workerId": null,           // Target Worker (optional)
    "timeout": 300000           // Timeout (ms)
  },
  "status": "pending",          // pending | running | completed | failed
  "result": { /* ... */ },
  "createdAt": "2026-02-09T05:00:00Z",
  "startedAt": null,
  "completedAt": null
}
```

**Queue priorities:**
- High Priority Queue (priority 8-10)
- Normal Queue (priority 4-7)
- Low Priority Queue (priority 1-3)
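
The priority bands map onto queues as in the following sketch. The function name `queueForPriority` is hypothetical; the queue names follow the `queue:high` / `queue:normal` / `queue:low` keys listed in the Redis data model of this document.

```javascript
// Maps a task priority (1-10) to the queue it belongs in.
function queueForPriority(priority) {
  if (!Number.isInteger(priority) || priority < 1 || priority > 10) {
    throw new RangeError(`priority must be an integer from 1 to 10, got ${priority}`);
  }
  if (priority >= 8) return 'queue:high';   // 8-10
  if (priority >= 4) return 'queue:normal'; // 4-7
  return 'queue:low';                       // 1-3
}
```

Validating the range up front avoids silently routing malformed tasks into the low-priority queue.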

### 2. Load-Balancing Strategies

**Available strategies:**
1. **Round Robin**: assign Workers in rotation
2. **Least Connections**: assign to the Worker with the fewest busy Agents
3. **Resource Based**: assign according to CPU/memory load
4. **Affinity**: route tasks for the same account to the same Worker

**Implementation:**
```javascript
class LoadBalancer {
  constructor(strategy = 'round-robin') {
    this.strategy = strategy;
    this.roundRobinIndex = 0;
  }

  selectWorker(task, workers) {
    // Keep only healthy Workers
    const healthy = workers.filter(w => w.status === 'online');
    if (healthy.length === 0) return null; // no Worker available

    // Pick according to the configured strategy
    switch (this.strategy) {
      case 'least-connections':
        return healthy.sort((a, b) => a.busyAgents - b.busyAgents)[0];
      case 'resource-based':
        return healthy.sort((a, b) => a.cpuUsage - b.cpuUsage)[0];
      default: // round robin
        return healthy[this.roundRobinIndex++ % healthy.length];
    }
  }
}
```
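
The Affinity strategy is listed but not covered by the implementation. A minimal sketch follows, assuming an in-memory map from `accountIdx` to the last Worker used and a fallback to the least busy Worker; `selectByAffinity` and `affinityMap` are hypothetical names, and a real cluster would likely keep this mapping in shared storage.

```javascript
// Picks the Worker previously used for the task's account, if it is still online.
// Otherwise falls back to the least busy online Worker and remembers the choice.
function selectByAffinity(task, workers, affinityMap = new Map()) {
  const healthy = workers.filter(w => w.status === 'online');
  if (healthy.length === 0) return null;

  const accountIdx = task.requirements.accountIdx;
  const pinnedId = affinityMap.get(accountIdx);
  const pinned = healthy.find(w => w.workerId === pinnedId);
  if (pinned) return pinned;

  // No usable pin: choose the Worker with the fewest busy Agents.
  const chosen = healthy.reduce((best, w) => (w.busyAgents < best.busyAgents ? w : best));
  affinityMap.set(accountIdx, chosen.workerId);
  return chosen;
}
```

Pinning by account keeps a session's cookies and window state on one Worker, at the cost of less even load when a few accounts dominate the traffic.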

### 3. Agent Hub Management UI

**Page structure:**
```
/agent-hub
├── Dashboard
│   ├── Overview statistics (Worker/Agent/Task counts)
│   ├── Real-time monitoring charts (CPU/memory/task throughput)
│   └── Alerts
├── Workers
│   ├── Worker list (status, resources, Agent count)
│   ├── Add Worker
│   └── Worker details (Agent list, logs)
├── Agents
│   ├── Agent list (with thumbnails)
│   ├── Create Agent
│   ├── Agent details (current task, history)
│   └── Bulk operations
├── Tasks
│   ├── Task list (status, progress)
│   ├── Create task
│   ├── Task details (execution logs, results)
│   └── Task templates
└── Settings
    ├── Load-balancing strategy
    ├── Resource limits
    └── Alert configuration
```

**Tech stack:**
- React + TypeScript
- Ant Design / Material-UI
- Socket.IO Client (real-time updates)
- ECharts (monitoring charts)

### 4. Health Checks and Fault Tolerance

**Health check:**
```javascript
const os = require('os');
const axios = require('axios');

// Worker heartbeat: `worker` and `masterUrl` come from the Worker's configuration
function startHeartbeat(worker, masterUrl) {
  return setInterval(() => {
    axios.post(`${masterUrl}/api/heartbeat`, {
      workerId: worker.workerId,
      status: 'online',
      agents: worker.getAgentStatus(),
      resources: {
        cpuUsage: os.loadavg()[0],          // 1-minute load average
        memoryUsage: process.memoryUsage(),
        diskUsage: getDiskUsage()           // helper not shown here
      }
    }).catch(err => console.error('Heartbeat failed:', err.message));
  }, 5000);
}
```
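
Note that `os.loadavg()[0]` is a load average, not a 0 to 1 utilization figure, while the `HighCPUUsage` alert rule in this document compares `worker_cpu_usage` against `0.8`. One possible way to reconcile the two, offered as an assumption rather than the project's method, is to divide the load average by the number of CPU cores. `normalizedCpuLoad` is a hypothetical helper.

```javascript
const os = require('node:os');

// Approximates CPU pressure as load average per core.
// Values above 1 mean more runnable work than cores.
// On Windows, os.loadavg() always returns zeros, so this yields 0 there.
function normalizedCpuLoad(loadAvg1m = os.loadavg()[0], cpuCount = os.cpus().length) {
  if (cpuCount <= 0) return 0;
  return loadAvg1m / cpuCount;
}
```

For example, a 1-minute load average of 2 on a 4-core Worker gives `normalizedCpuLoad(2, 4) === 0.5`, which stays below the 0.8 alert threshold.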

**Fault tolerance:**
1. **Worker offline**:
   - Reassign its tasks to other Workers
   - Mark its Agents as offline
   - Send an alert notification

2. **Task timeout**:
   - Retry automatically (up to 3 times)
   - Mark the task as failed once retries are exhausted
   - Log the error

3. **Agent crash**:
   - Restart the Agent automatically
   - Resume task execution
   - Report the exception
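
The task-timeout rule can be expressed as a small state update. This is a sketch under the assumption that each task carries a retry counter; the `retries` and `error` fields and the `onTaskTimeout` function are hypothetical and do not appear in the task structure of this document.

```javascript
const MAX_RETRIES = 3; // "up to 3 times" from the fault-tolerance rules

// Called when a running task exceeds its timeout.
// Re-queues the task while retries remain, otherwise marks it failed.
function onTaskTimeout(task, maxRetries = MAX_RETRIES) {
  const retries = task.retries || 0;
  if (retries < maxRetries) {
    return { ...task, status: 'pending', retries: retries + 1, startedAt: null };
  }
  return { ...task, status: 'failed', error: `timed out after ${retries} retries` };
}
```

Returning a new object instead of mutating the task makes it straightforward to persist the updated state in a single write.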

### 5. Service Discovery and Registration

**Using Redis as the registry:**
```javascript
// Worker registration
redis.hset('workers', workerId, JSON.stringify({
  workerId,
  url: `http://${ip}:${port}`,
  status: 'online',
  maxAgents: 10,
  registeredAt: Date.now()
}));

// Liveness key with a 30-second TTL, refreshed on every heartbeat.
// EXPIRE applies to whole keys, not to individual fields of the `workers` hash,
// so a separate per-Worker key is used (ioredis-style call; the client is an assumption).
redis.set(`heartbeat:${workerId}`, Date.now(), 'EX', 30);

// Master discovers Workers
const workers = await redis.hgetall('workers');
```

## Data Model

### Redis Data Structures

```
# Worker registry
workers:{workerId} → {workerId, url, status, maxAgents, ...}

# Agent state
agents:{agentId} → {agentId, workerId, windowId, status, ...}

# Task queues
queue:high → [taskId1, taskId2, ...]
queue:normal → [taskId3, taskId4, ...]
queue:low → [taskId5, taskId6, ...]

# Task state
tasks:{taskId} → {taskId, status, result, ...}

# Worker heartbeat
heartbeat:{workerId} → timestamp (TTL: 30s)
```

## API Design

### Master API

```javascript
// Task management
POST   /api/tasks                  // Create a task
GET    /api/tasks                  // List tasks
GET    /api/tasks/:id              // Task details
DELETE /api/tasks/:id              // Cancel a task

// Worker management
GET    /api/workers                // List Workers
GET    /api/workers/:id            // Worker details
POST   /api/workers/:id/restart    // Restart a Worker

// Agent management
GET    /api/agents                 // List Agents
POST   /api/agents                 // Create an Agent
DELETE /api/agents/:id             // Shut down an Agent
GET    /api/agents/:id/screenshot  // Agent screenshot

// Monitoring
GET    /api/stats                  // Statistics
GET    /api/metrics                // Prometheus metrics
```

### Worker API

```javascript
// Keep the existing MCP tool API
POST   /rpc/{tool_name}            // Execute a tool

// New cluster-related endpoints
POST   /api/heartbeat              // Heartbeat report
POST   /api/tasks/:id/result       // Task result report
GET    /api/agents                 // List local Agents
```

## Deployment

### Single-Machine Deployment (Development/Testing)

```bash
# Start the Master
npm run start:master

# Start a Worker
npm run start:worker -- --master=http://localhost:8100
```

### Distributed Deployment (Production)

```yaml
# docker-compose.yml
version: '3.8'
services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

  master:
    build: .
    command: npm run start:master
    ports:
      - "8100:8100"
    environment:
      - REDIS_URL=redis://redis:6379
      - NODE_ENV=production

  worker:
    build: .
    command: npm run start:worker
    environment:
      - MASTER_URL=http://master:8100
      - REDIS_URL=redis://redis:6379
      - DISPLAY=:99
    deploy:
      replicas: 3
```

### Kubernetes Deployment

```yaml
# k8s/master-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: electron-mcp-master
spec:
  replicas: 1
  selector:
    matchLabels:
      app: electron-mcp-master
  template:
    metadata:
      labels:
        app: electron-mcp-master
    spec:
      containers:
      - name: master
        image: electron-mcp:latest
        command: ["npm", "run", "start:master"]
        ports:
        - containerPort: 8100
        env:
        - name: REDIS_URL
          value: "redis://redis-service:6379"

---
# k8s/worker-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: electron-mcp-worker
spec:
  replicas: 5
  selector:
    matchLabels:
      app: electron-mcp-worker
  template:
    metadata:
      labels:
        app: electron-mcp-worker
    spec:
      containers:
      - name: worker
        image: electron-mcp:latest
        command: ["npm", "run", "start:worker"]
        env:
        - name: MASTER_URL
          value: "http://master-service:8100"
        - name: DISPLAY
          value: ":99"
        resources:
          limits:
            memory: "4Gi"
            cpu: "2"
```

## Monitoring and Alerting

### Prometheus Metrics

```javascript
const { Gauge, Counter, Histogram, register } = require('prom-client');

// Metric definitions
const metrics = {
  workers_total: new Gauge({ name: 'workers_total', help: 'Total workers' }),
  agents_total: new Gauge({ name: 'agents_total', help: 'Total agents' }),
  tasks_total: new Counter({ name: 'tasks_total', help: 'Total tasks' }),
  tasks_duration: new Histogram({ name: 'tasks_duration_seconds', help: 'Task duration' }),
  worker_cpu_usage: new Gauge({ name: 'worker_cpu_usage', help: 'Worker CPU usage' }),
  worker_memory_usage: new Gauge({ name: 'worker_memory_usage', help: 'Worker memory usage' })
};

// Expose the metrics
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});
```

### Grafana Dashboard

- Worker status panel
- Agent count trend
- Task throughput
- Resource utilization
- Error rate statistics

### Alert Rules

```yaml
# alerts.yml
groups:
  - name: electron-mcp
    rules:
      - alert: WorkerDown
        expr: up{job="electron-mcp-worker"} == 0
        for: 1m
        annotations:
          summary: "Worker {{ $labels.instance }} is down"

      - alert: HighCPUUsage
        expr: worker_cpu_usage > 0.8
        for: 5m
        annotations:
          summary: "Worker {{ $labels.worker_id }} CPU usage > 80%"

      # Note: tasks_pending is not among the metrics defined in this document
      # and would need to be exported separately.
      - alert: TaskQueueBacklog
        expr: tasks_pending > 100
        for: 10m
        annotations:
          summary: "Task queue has {{ $value }} pending tasks"
```

## Security Considerations

1. **Authentication and authorization**:
   - The Master API requires a Bearer Token
   - Worker registration requires a pre-shared key
   - Agent Hub requires login

2. **Network isolation**:
   - Workers can only reach the Master
   - Agent network traffic can be restricted

3. **Resource limits**:
   - Maximum number of Agents per Worker
   - Tasks are terminated automatically on timeout
   - Memory/CPU limits
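
The Bearer Token requirement could be checked along the following lines. This is a minimal sketch using only Node's built-in `crypto` module, assuming a single shared token; `extractBearerToken` and `isAuthorized` are hypothetical names and not taken from the package's existing auth code.

```javascript
const crypto = require('node:crypto');

// Extracts the token from an "Authorization: Bearer <token>" header value.
function extractBearerToken(header) {
  const match = /^Bearer\s+(\S+)$/i.exec(header || '');
  return match ? match[1] : null;
}

// Compares the presented token with the expected one in constant time.
// Both values are hashed first so timingSafeEqual receives equal-length buffers.
function isAuthorized(header, expectedToken) {
  const token = extractBearerToken(header);
  if (token === null) return false;
  const a = crypto.createHash('sha256').update(token).digest();
  const b = crypto.createHash('sha256').update(expectedToken).digest();
  return crypto.timingSafeEqual(a, b);
}
```

In an Express route this check would typically run in middleware that responds with 401 when `isAuthorized(req.headers.authorization, token)` returns false.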

## Performance Optimization

1. **Connection pooling**:
   - Redis connection pool
   - HTTP connection reuse

2. **Batch operations**:
   - Create Agents in batches
   - Query status in batches

3. **Caching**:
   - Worker list cache (5 seconds)
   - Agent status cache (1 second)

4. **Asynchronous processing**:
   - Report task results asynchronously
   - Write logs asynchronously
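
The short-lived caches can be sketched with a small time-to-live map. `TtlCache` is a hypothetical class for illustration; the injectable `now` function is an assumption made so that expiry can be tested without waiting.

```javascript
// Minimal TTL cache: entries expire `ttlMs` milliseconds after being set.
class TtlCache {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Returns the cached value, or undefined if missing or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A Worker list cache would then be `new TtlCache(5000)` and an Agent status cache `new TtlCache(1000)`, matching the 5-second and 1-second windows listed above.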

## Implementation Plan

### Phase 1: Core Infrastructure (2 weeks)
- [ ] Master Node base framework
- [ ] Worker registration and heartbeat
- [ ] Redis task queue
- [ ] Basic API implementation

### Phase 2: Task Scheduling (2 weeks)
- [ ] Task dispatch logic
- [ ] Load-balancing implementation
- [ ] Task execution and result reporting
- [ ] Fault-tolerance mechanisms

### Phase 3: Management UI (2 weeks)
- [ ] Agent Hub frontend development
- [ ] Dashboard real-time monitoring
- [ ] Worker/Agent management
- [ ] Task management UI

### Phase 4: Monitoring and Alerting (1 week)
- [ ] Prometheus integration
- [ ] Grafana Dashboard
- [ ] Alert rule configuration
- [ ] Log collection

### Phase 5: Deployment Optimization (1 week)
- [ ] Docker image optimization
- [ ] Kubernetes configuration
- [ ] Performance testing
- [ ] Documentation improvements

## Acceptance Criteria

1. **Functional completeness**:
   - ✅ Supports 100+ Agents in parallel
   - ✅ Automatic task dispatch and execution
   - ✅ Dynamic Worker scaling
   - ✅ Complete management UI

2. **Performance targets**:
   - Task dispatch latency < 100 ms
   - 10+ Agents per Worker
   - Task throughput > 1000 per minute
   - System availability > 99.9%

3. **Maintainability**:
   - Complete monitoring metrics
   - Detailed logging
   - Clear error messages
   - Thorough documentation
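
As a sanity check on the performance targets, the throughput and parallelism figures imply an upper bound on average task duration. The helpers below are a hypothetical back-of-the-envelope calculation that ignores dispatch latency and idle time.

```javascript
// Longest average task duration (seconds) that still meets the throughput target
// when `agents` Agents run one task at a time.
function maxAvgTaskSeconds(agents, tasksPerMinute) {
  return (agents * 60) / tasksPerMinute;
}

// Minimum number of Workers needed to host `agents` Agents.
function minWorkers(agents, agentsPerWorker) {
  return Math.ceil(agents / agentsPerWorker);
}
```

With 100 Agents and 1000 tasks per minute, `maxAvgTaskSeconds(100, 1000)` is 6, so tasks must average at most 6 seconds each; tasks approaching the 300000 ms timeout from the task example would need far more Agents. At 10 Agents per Worker, `minWorkers(100, 10)` gives 10 Workers.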

## References

- [Bull Queue Documentation](https://github.com/OptimalBits/bull)
- [Socket.IO Documentation](https://socket.io/docs/)
- [Prometheus Best Practices](https://prometheus.io/docs/practices/)
- [Kubernetes Patterns](https://kubernetes.io/docs/concepts/)