deepspider 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/agents/check.md +122 -0
- package/.claude/agents/debug.md +106 -0
- package/.claude/agents/dispatch.md +214 -0
- package/.claude/agents/implement.md +96 -0
- package/.claude/agents/plan.md +396 -0
- package/.claude/agents/research.md +120 -0
- package/.claude/commands/evolve/merge.md +80 -0
- package/.claude/commands/trellis/before-backend-dev.md +13 -0
- package/.claude/commands/trellis/before-frontend-dev.md +13 -0
- package/.claude/commands/trellis/break-loop.md +107 -0
- package/.claude/commands/trellis/check-backend.md +13 -0
- package/.claude/commands/trellis/check-cross-layer.md +153 -0
- package/.claude/commands/trellis/check-frontend.md +13 -0
- package/.claude/commands/trellis/create-command.md +154 -0
- package/.claude/commands/trellis/finish-work.md +129 -0
- package/.claude/commands/trellis/integrate-skill.md +219 -0
- package/.claude/commands/trellis/onboard.md +358 -0
- package/.claude/commands/trellis/parallel.md +193 -0
- package/.claude/commands/trellis/record-session.md +62 -0
- package/.claude/commands/trellis/start.md +280 -0
- package/.claude/commands/trellis/update-spec.md +213 -0
- package/.claude/hooks/inject-subagent-context.py +758 -0
- package/.claude/hooks/ralph-loop.py +374 -0
- package/.claude/hooks/session-start.py +126 -0
- package/.claude/settings.json +41 -0
- package/.claude/skills/deepagents-guide/SKILL.md +428 -0
- package/.cursor/commands/trellis-before-backend-dev.md +13 -0
- package/.cursor/commands/trellis-before-frontend-dev.md +13 -0
- package/.cursor/commands/trellis-break-loop.md +107 -0
- package/.cursor/commands/trellis-check-backend.md +13 -0
- package/.cursor/commands/trellis-check-cross-layer.md +153 -0
- package/.cursor/commands/trellis-check-frontend.md +13 -0
- package/.cursor/commands/trellis-create-command.md +154 -0
- package/.cursor/commands/trellis-finish-work.md +129 -0
- package/.cursor/commands/trellis-integrate-skill.md +219 -0
- package/.cursor/commands/trellis-onboard.md +358 -0
- package/.cursor/commands/trellis-record-session.md +62 -0
- package/.cursor/commands/trellis-start.md +156 -0
- package/.cursor/commands/trellis-update-spec.md +213 -0
- package/.env.example +11 -0
- package/.husky/pre-commit +1 -0
- package/.mcp.json +8 -0
- package/.trellis/.template-hashes.json +65 -0
- package/.trellis/.version +1 -0
- package/.trellis/scripts/add-session.sh +384 -0
- package/.trellis/scripts/common/developer.sh +129 -0
- package/.trellis/scripts/common/git-context.sh +263 -0
- package/.trellis/scripts/common/paths.sh +208 -0
- package/.trellis/scripts/common/phase.sh +150 -0
- package/.trellis/scripts/common/registry.sh +247 -0
- package/.trellis/scripts/common/task-queue.sh +142 -0
- package/.trellis/scripts/common/task-utils.sh +151 -0
- package/.trellis/scripts/common/worktree.sh +128 -0
- package/.trellis/scripts/create-bootstrap.sh +299 -0
- package/.trellis/scripts/get-context.sh +7 -0
- package/.trellis/scripts/get-developer.sh +15 -0
- package/.trellis/scripts/init-developer.sh +34 -0
- package/.trellis/scripts/multi-agent/cleanup.sh +396 -0
- package/.trellis/scripts/multi-agent/create-pr.sh +241 -0
- package/.trellis/scripts/multi-agent/plan.sh +207 -0
- package/.trellis/scripts/multi-agent/start.sh +310 -0
- package/.trellis/scripts/multi-agent/status.sh +828 -0
- package/.trellis/scripts/task.sh +1118 -0
- package/.trellis/spec/backend/deepagents-guide.md +337 -0
- package/.trellis/spec/backend/directory-structure.md +126 -0
- package/.trellis/spec/backend/examples/skills/deepagents-guide/README.md +11 -0
- package/.trellis/spec/backend/examples/skills/deepagents-guide/agent.js.template +20 -0
- package/.trellis/spec/backend/examples/skills/deepagents-guide/skills-config.js.template +13 -0
- package/.trellis/spec/backend/examples/skills/deepagents-guide/subagent.js.template +19 -0
- package/.trellis/spec/backend/hook-guidelines.md +178 -0
- package/.trellis/spec/backend/index.md +36 -0
- package/.trellis/spec/backend/quality-guidelines.md +201 -0
- package/.trellis/spec/backend/state-management.md +76 -0
- package/.trellis/spec/backend/tool-guidelines.md +144 -0
- package/.trellis/spec/backend/type-safety.md +71 -0
- package/.trellis/spec/guides/code-reuse-thinking-guide.md +92 -0
- package/.trellis/spec/guides/cross-layer-thinking-guide.md +94 -0
- package/.trellis/spec/guides/index.md +79 -0
- package/.trellis/tasks/archive/02-02-evolving-skills/prd.md +61 -0
- package/.trellis/tasks/archive/02-02-evolving-skills/task.json +29 -0
- package/.trellis/tasks/archive/2026-02/00-bootstrap-guidelines/prd.md +86 -0
- package/.trellis/tasks/archive/2026-02/00-bootstrap-guidelines/task.json +27 -0
- package/.trellis/tasks/archive/2026-02/02-02-skills-system/check.jsonl +3 -0
- package/.trellis/tasks/archive/2026-02/02-02-skills-system/debug.jsonl +2 -0
- package/.trellis/tasks/archive/2026-02/02-02-skills-system/implement.jsonl +5 -0
- package/.trellis/tasks/archive/2026-02/02-02-skills-system/prd.md +33 -0
- package/.trellis/tasks/archive/2026-02/02-02-skills-system/task.json +41 -0
- package/.trellis/workflow.md +407 -0
- package/.trellis/workspace/index.md +123 -0
- package/.trellis/workspace/pony/index.md +40 -0
- package/.trellis/workspace/pony/journal-1.md +7 -0
- package/.trellis/worktree.yaml +47 -0
- package/AGENTS.md +18 -0
- package/CLAUDE.md +292 -0
- package/README.md +134 -0
- package/agents/deepspider.md +142 -0
- package/docs/DEBUG.md +42 -0
- package/docs/GUIDE.md +334 -0
- package/docs/PROMPT.md +60 -0
- package/docs/USAGE.md +226 -0
- package/eslint.config.js +51 -0
- package/package.json +78 -0
- package/requirements-crypto.txt +14 -0
- package/src/agent/index.js +97 -0
- package/src/agent/logger.js +164 -0
- package/src/agent/middleware/filterTools.js +64 -0
- package/src/agent/middleware/report.js +79 -0
- package/src/agent/prompts/system.js +315 -0
- package/src/agent/run.js +575 -0
- package/src/agent/skills/anti-detect/SKILL.md +28 -0
- package/src/agent/skills/anti-detect/evolved.md +12 -0
- package/src/agent/skills/captcha/SKILL.md +37 -0
- package/src/agent/skills/captcha/evolved.md +12 -0
- package/src/agent/skills/config.js +30 -0
- package/src/agent/skills/crawler/SKILL.md +9 -0
- package/src/agent/skills/crawler/evolved.md +16 -0
- package/src/agent/skills/dynamic-analysis/SKILL.md +91 -0
- package/src/agent/skills/dynamic-analysis/evolved.md +12 -0
- package/src/agent/skills/env/SKILL.md +72 -0
- package/src/agent/skills/env/evolved.md +12 -0
- package/src/agent/skills/evolve.js +79 -0
- package/src/agent/skills/general/SKILL.md +12 -0
- package/src/agent/skills/general/evolved.md +12 -0
- package/src/agent/skills/js2python/SKILL.md +30 -0
- package/src/agent/skills/js2python/evolved.md +13 -0
- package/src/agent/skills/report/SKILL.md +21 -0
- package/src/agent/skills/report/evolved.md +12 -0
- package/src/agent/skills/sandbox/SKILL.md +22 -0
- package/src/agent/skills/sandbox/evolved.md +16 -0
- package/src/agent/skills/static-analysis/SKILL.md +93 -0
- package/src/agent/skills/static-analysis/evolved.md +12 -0
- package/src/agent/skills/xpath/SKILL.md +119 -0
- package/src/agent/subagents/anti-detect.js +45 -0
- package/src/agent/subagents/captcha.js +51 -0
- package/src/agent/subagents/crawler.js +138 -0
- package/src/agent/subagents/dynamic.js +64 -0
- package/src/agent/subagents/env-agent.js +82 -0
- package/src/agent/subagents/index.js +37 -0
- package/src/agent/subagents/js2python.js +72 -0
- package/src/agent/subagents/sandbox.js +55 -0
- package/src/agent/subagents/static.js +66 -0
- package/src/agent/tools/analysis.js +135 -0
- package/src/agent/tools/analyzer.js +85 -0
- package/src/agent/tools/anti-detect.js +89 -0
- package/src/agent/tools/antidebug.js +64 -0
- package/src/agent/tools/async.js +43 -0
- package/src/agent/tools/browser.js +324 -0
- package/src/agent/tools/captcha.js +223 -0
- package/src/agent/tools/capture.js +179 -0
- package/src/agent/tools/correlate.js +303 -0
- package/src/agent/tools/crawler.js +116 -0
- package/src/agent/tools/cryptohook.js +80 -0
- package/src/agent/tools/debug.js +246 -0
- package/src/agent/tools/deobfuscator.js +90 -0
- package/src/agent/tools/env.js +83 -0
- package/src/agent/tools/envdump.js +92 -0
- package/src/agent/tools/evolve.js +164 -0
- package/src/agent/tools/extract.js +114 -0
- package/src/agent/tools/extractor.js +54 -0
- package/src/agent/tools/file.js +224 -0
- package/src/agent/tools/hook.js +84 -0
- package/src/agent/tools/hookManager.js +178 -0
- package/src/agent/tools/index.js +137 -0
- package/src/agent/tools/nodejs.js +101 -0
- package/src/agent/tools/patch.js +46 -0
- package/src/agent/tools/preprocess.js +71 -0
- package/src/agent/tools/profile.js +122 -0
- package/src/agent/tools/python.js +627 -0
- package/src/agent/tools/report.js +124 -0
- package/src/agent/tools/runtime.js +132 -0
- package/src/agent/tools/sandbox.js +79 -0
- package/src/agent/tools/store.js +73 -0
- package/src/agent/tools/trace.js +74 -0
- package/src/agent/tools/tracing.js +201 -0
- package/src/agent/tools/utils.js +51 -0
- package/src/agent/tools/verify.js +184 -0
- package/src/agent/tools/webcrack.js +109 -0
- package/src/analyzer/ASTAnalyzer.js +387 -0
- package/src/analyzer/CallStackAnalyzer.js +379 -0
- package/src/analyzer/Deobfuscator.js +289 -0
- package/src/analyzer/EncryptionAnalyzer.js +99 -0
- package/src/analyzer/index.js +22 -0
- package/src/browser/EnvBridge.js +186 -0
- package/src/browser/cdp.js +168 -0
- package/src/browser/client.js +197 -0
- package/src/browser/collector.js +444 -0
- package/src/browser/collectors/RequestCryptoLinker.js +109 -0
- package/src/browser/collectors/ResponseSearcher.js +107 -0
- package/src/browser/collectors/ScriptCollector.js +158 -0
- package/src/browser/collectors/index.js +26 -0
- package/src/browser/defaultHooks.js +932 -0
- package/src/browser/hooks/crypto.js +55 -0
- package/src/browser/hooks/index.js +64 -0
- package/src/browser/hooks/native.js +9 -0
- package/src/browser/hooks/network.js +33 -0
- package/src/browser/index.js +42 -0
- package/src/browser/interceptors/NetworkInterceptor.js +116 -0
- package/src/browser/interceptors/ScriptInterceptor.js +76 -0
- package/src/browser/interceptors/index.js +6 -0
- package/src/browser/ui/analysisPanel.js +1782 -0
- package/src/browser/ui/confirmDialog.js +158 -0
- package/src/browser/ui/panel.html +152 -0
- package/src/browser/ui/selector.js +170 -0
- package/src/config/index.js +5 -0
- package/src/config/paths.js +71 -0
- package/src/config/patterns/crypto.js +36 -0
- package/src/config/profiles/chrome.json +71 -0
- package/src/config/profiles/firefox.json +44 -0
- package/src/config/profiles/safari.json +38 -0
- package/src/core/EnvMonitor.js +200 -0
- package/src/core/PatchGenerator.js +278 -0
- package/src/core/Sandbox.js +181 -0
- package/src/env/AntiAntiDebug.js +111 -0
- package/src/env/AsyncHook.js +68 -0
- package/src/env/BrowserAPIList.js +265 -0
- package/src/env/CookieHook.js +48 -0
- package/src/env/CryptoHook.js +205 -0
- package/src/env/EnvCodeGenerator.js +157 -0
- package/src/env/EnvDumper.js +356 -0
- package/src/env/EnvExtractor.js +220 -0
- package/src/env/HookBase.js +618 -0
- package/src/env/NetworkHook.js +159 -0
- package/src/env/modules/bom/history.js +29 -0
- package/src/env/modules/bom/location.js +26 -0
- package/src/env/modules/bom/navigator.js +70 -0
- package/src/env/modules/bom/screen.js +26 -0
- package/src/env/modules/bom/storage.js +23 -0
- package/src/env/modules/dom/document.js +110 -0
- package/src/env/modules/dom/event.js +51 -0
- package/src/env/modules/index.js +34 -0
- package/src/env/modules/webapi/fetch.js +46 -0
- package/src/env/modules/webapi/url.js +47 -0
- package/src/env/modules/webapi/xhr.js +48 -0
- package/src/index.js +27 -0
- package/src/mcp/server.js +89 -0
- package/src/store/DataStore.js +708 -0
- package/src/store/Store.js +158 -0
- package/src/store/Validator.js +24 -0
- package/test/analyze.test.js +90 -0
- package/test/envdump.test.js +74 -0
- package/test/flow.test.js +90 -0
- package/test/hooks.test.js +138 -0
- package/test/plugin.test.js +35 -0
- package/test/refactor-full.test.js +30 -0
- package/test/refactor.test.js +21 -0
- package/test/samples/obfuscated.js +61 -0
- package/test/samples/original.js +66 -0
- package/test/samples/v10_eval_chain.js +52 -0
- package/test/samples/v11_bytecode_vm.js +81 -0
- package/test/samples/v12_polymorphic.js +69 -0
- package/test/samples/v1_ob_basic.js +98 -0
- package/test/samples/v2_ob_advanced.js +99 -0
- package/test/samples/v3_jjencode.js +77 -0
- package/test/samples/v4_aaencode.js +73 -0
- package/test/samples/v5_control_flow.js +86 -0
- package/test/samples/v6_string_encryption.js +71 -0
- package/test/samples/v7_jsvmp.js +83 -0
- package/test/samples/v8_anti_debug.js +79 -0
- package/test/samples/v9_proxy_trap.js +49 -0
- package/test/samples.test.js +96 -0
- package/test/webcrack.test.js +55 -0
|
@@ -0,0 +1,36 @@
# DeepSpider Development Guidelines

> DeepSpider project development guidelines

---

## Overview

DeepSpider is a JavaScript reverse-engineering analysis engine built on DeepAgents and Patchright.
This directory contains the project's development guidelines and code patterns.

---

## Guidelines Index

| Guide | Description | Status |
|-------|-------------|--------|
| [Directory Structure](./directory-structure.md) | Project directory layout and module organization | Done |
| [DeepAgents Guide](./deepagents-guide.md) | Guide to using the DeepAgents framework | Done |
| [Tool Guidelines](./tool-guidelines.md) | Conventions for defining LangChain tools | Done |
| [Hook Guidelines](./hook-guidelines.md) | Conventions for injecting browser hooks | Done |
| [State Management](./state-management.md) | Agent state and data storage | Done |
| [Quality Guidelines](./quality-guidelines.md) | Code quality conventions | Done |
| [Type Safety](./type-safety.md) | Conventions for Zod type validation | Done |

---

## Quick Reference

Key points of the core conventions:

1. **Agent creation**: use `createDeepAgent()` with a configuration object
2. **Tool definition**: use `@langchain/core/tools` with a Zod schema
3. **Browser interaction**: prefer CDP; avoid `page.evaluate()`
4. **AST traversal**: use `@babel/traverse`
5. **Data storage**: use the `getDataStore()` singleton
@@ -0,0 +1,201 @@
# Quality Guidelines

> DeepSpider code quality guidelines

---

## Overview

DeepSpider follows the coding conventions defined in CLAUDE.md, with a focus on:
- CDP-first browser interaction
- Babel AST traversal patterns
- LangChain tool definition conventions

---

## Forbidden Patterns

### 1. Using page.evaluate instead of CDP

```javascript
// ❌ Forbidden
const result = await page.evaluate(() => { ... });

// ✅ Use CDP
const cdp = await browser.getCDPSession();
const result = await cdp.send('Runtime.evaluate', { ... });
```

### 2. Accessing internal properties of a wrapper class directly

```javascript
// ❌ Forbidden: exposes the internal implementation
cdpSession.client.on('Debugger.paused', handler);

// ✅ Use the methods the wrapper class provides
cdpSession.on('Debugger.paused', handler);
```

**Reason**: Accessing `.client` directly leaks the encapsulation; callers break when the internal implementation changes.
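
A minimal sketch of how such a wrapper can keep the underlying client private. The class name `CdpSessionWrapper`, the stub client, and its `emit` helper are hypothetical and not taken from the DeepSpider source; the sketch only assumes the wrapped client exposes `on(event, handler)` and `send(method, params)`.

```javascript
// Hypothetical wrapper: callers subscribe through on()/send() and never touch the client.
class CdpSessionWrapper {
  #client; // private field, not reachable as wrapper.client

  constructor(client) {
    this.#client = client;
  }

  on(event, handler) {
    this.#client.on(event, handler);
    return this; // allow chaining
  }

  send(method, params = {}) {
    return this.#client.send(method, params);
  }
}

// Stub client for illustration only; a real CDP session would come from the browser.
function createStubClient() {
  const handlers = new Map();
  return {
    on(event, handler) {
      const list = handlers.get(event) ?? [];
      list.push(handler);
      handlers.set(event, list);
    },
    send(method, params) {
      return Promise.resolve({ method, params });
    },
    emit(event, payload) {
      for (const handler of handlers.get(event) ?? []) handler(payload);
    },
  };
}

const stubClient = createStubClient();
const session = new CdpSessionWrapper(stubClient);
const pausedEvents = [];
session.on('Debugger.paused', (event) => pausedEvents.push(event));
stubClient.emit('Debugger.paused', { reason: 'breakpoint' });
```

Because the client lives in a private field, swapping the transport later only touches the wrapper, not its callers.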

### 3. Subagents without middleware

```javascript
// ❌ Forbidden: middleware configured only on the main agent, not on the subagent
// index.js
const agent = createDeepAgent({
  middleware: [createFilterToolsMiddleware()],
  subagents: [subagent1, subagent2],
});

// subagent1.js - no middleware
export const subagent1 = {
  name: 'subagent1',
  tools: [...],
  middleware: [], // empty!
};

// ✅ Subagents need the same middleware configured
export const subagent1 = {
  name: 'subagent1',
  tools: [...],
  middleware: [
    createFilterToolsMiddleware(), // required
    createSkillsMiddleware({ ... }),
  ],
};
```

**Reason**: DeepAgents subagents do not inherit the main agent's middleware configuration. If the main agent filters out built-in tools, each subagent must configure the filtering middleware separately; otherwise the subagent can still use the filtered tools.
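
One way to keep the middleware lists from drifting apart is to build them from a single list of factories. This is a sketch under stated assumptions: `withSharedMiddleware` and the stand-in factories below are hypothetical helpers, not part of DeepAgents; only the `name`, `tools`, and `middleware` fields come from the example above.

```javascript
// Hypothetical helper: returns a copy of the subagent whose middleware starts
// with a fresh instance from every shared factory.
function withSharedMiddleware(subagent, sharedFactories) {
  const shared = sharedFactories.map((create) => create());
  return {
    ...subagent,
    middleware: [...shared, ...(subagent.middleware ?? [])],
  };
}

// Stand-ins for the real middleware factories (assumed to return one object per call).
const createFilterToolsStub = () => ({ name: 'filter-tools' });
const createSkillsStub = () => ({ name: 'skills' });
const sharedFactories = [createFilterToolsStub, createSkillsStub];

const staticSubagent = withSharedMiddleware(
  { name: 'static', tools: [], middleware: [] },
  sharedFactories,
);
const dynamicSubagent = withSharedMiddleware(
  { name: 'dynamic', tools: [] },
  sharedFactories,
);
```

Each subagent gets its own middleware instances, so any per-instance state is not shared between them.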

### 4. Async callbacks in setInterval

```javascript
// ❌ Forbidden: the async callback is not awaited, which can cause concurrency problems
setInterval(async () => {
  const result = await detectCaptcha();
  await handleResult(result);
}, 30000);

// ✅ Keep the callback synchronous: only check state and set a flag
// (lastEventTime and timeout are maintained elsewhere)
let needsCheck = false;
setInterval(() => {
  const elapsed = Date.now() - lastEventTime;
  if (elapsed > timeout) {
    needsCheck = true;
    console.log('[Hint] Timed out, please check the page');
  }
}, 30000);
```

**Reason**: setInterval does not wait for an async callback to finish, so repeated ticks can run concurrently.
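
When the periodic work really must be asynchronous, a re-entrancy guard is one way to keep runs from overlapping. The sketch below is an illustration, not DeepSpider code: `createGuardedTask` is a hypothetical helper, and the task passed in stands in for something like `detectCaptcha()`.

```javascript
// Hypothetical helper: wraps an async task so a new run is skipped
// while the previous one is still pending.
function createGuardedTask(task) {
  let running = false;
  return function tick() {
    if (running) return false; // previous run still in flight: skip this tick
    running = true;
    Promise.resolve()
      .then(task)
      .catch((err) => console.error('[guarded task] failed:', err))
      .finally(() => {
        running = false;
      });
    return true; // a new run was started
  };
}

// Illustration: a task that never settles, so the guard stays engaged.
const tick = createGuardedTask(() => new Promise(() => {}));
const firstTickStarted = tick();
const secondTickStarted = tick();

// Typical wiring (commented out so the sketch exits immediately):
// const timer = setInterval(tick, 30000);
```

The interval callback itself stays synchronous; only the guarded task is asynchronous, and a slow run causes later ticks to be skipped rather than stacked.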

### 5. Relying on a `timeout` option for spawn

```javascript
import { spawn } from 'node:child_process';

// ❌ Forbidden: relying on spawn's `timeout` option, which older Node.js versions silently ignore
const proc = spawn('node', ['-e', code], {
  timeout: 10000, // has no effect before Node.js 14.18.0 / 15.13.0
});

// ✅ Implement the timeout manually
const proc = spawn('node', ['-e', code]);
let killed = false;

const timer = setTimeout(() => {
  killed = true; // lets later handlers tell a timeout from a normal exit
  proc.kill('SIGTERM');
}, 10000);

proc.on('close', () => {
  clearTimeout(timer);
});
```

**Reason**: `timeout` is a long-standing option of `execSync` and the other `exec*` helpers, but `spawn` only accepts it from Node.js 14.18.0 / 15.13.0 onward and silently ignores it on earlier versions. Implement the timeout manually when using spawn so the behavior does not depend on the runtime version.

---

## Required Patterns

### 1. Babel AST traversal

```javascript
import traverse from '@babel/traverse';

// Under native Node.js ESM, the CommonJS build exposes the function on `.default`
traverse.default(ast, {
  FunctionDeclaration(path) {
    // handle the node
  }
});
```

### 2. Reusing the CDP session

```javascript
const cdp = await browser.getCDPSession();
```

---

## Testing Requirements

Run the tests:

```bash
pnpm test
```

---

## Code Review Checklist

- [ ] Tool names use snake_case
- [ ] Parameters have a `describe` description
- [ ] Browser interaction goes through CDP
- [ ] AST traversal uses Babel
- [ ] Array bounds are checked before access
- [ ] Objects are null-checked before access

---

## Defensive Programming

### 1. Array index bounds checks

```javascript
// ❌ Forbidden: direct access may go out of bounds
const stage = stages[parseInt(index)];
stage.fields.push(field);

// ✅ Check bounds first (and reject NaN from a failed parse)
const idx = Number.parseInt(index, 10);
if (!Number.isInteger(idx) || idx < 0 || idx >= stages.length) return;
const stage = stages[idx];
```

### 2. Factory functions to avoid repeated structures

```javascript
// ❌ Forbidden: the same object literal repeated in several places
stages.push({ name: 'list', fields: [], entry: null });
// ... elsewhere
stages = [{ name: 'list', fields: [], entry: null }];

// ✅ Use a factory function
function createStage(name) {
  return { name, fields: [], entry: null, pagination: null };
}
stages.push(createStage('list'));
```

### 3. Null checks

```javascript
// ❌ Forbidden: assumes the object exists
currentStage.fields.splice(index, 1);

// ✅ Check first
if (!currentStage) return;
if (index < 0 || index >= currentStage.fields.length) return;
currentStage.fields.splice(index, 1);
```
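
The bounds and null checks above can be combined into one small helper. A minimal sketch, assuming stages shaped like the `createStage` output; `removeField` and `demoStages` are hypothetical names used only for illustration.

```javascript
// Hypothetical helper combining the bounds and null checks shown above.
// Returns true when a field was removed, false when the input was rejected.
function removeField(stages, stageIndex, fieldIndex) {
  const s = Number.parseInt(stageIndex, 10);
  const f = Number.parseInt(fieldIndex, 10);
  if (!Array.isArray(stages)) return false;
  if (!Number.isInteger(s) || s < 0 || s >= stages.length) return false;
  const stage = stages[s];
  if (!stage || !Array.isArray(stage.fields)) return false;
  if (!Number.isInteger(f) || f < 0 || f >= stage.fields.length) return false;
  stage.fields.splice(f, 1);
  return true;
}

const demoStages = [{ name: 'list', fields: ['title', 'url'], entry: null }];
const removedValid = removeField(demoStages, '0', 1); // removes 'url'
const removedOutOfRange = removeField(demoStages, 3, 0); // no such stage
const removedNaN = removeField(demoStages, 'abc', 0); // parseInt yields NaN
```

Returning a boolean instead of silently returning lets the caller decide whether a rejected index deserves a log line.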
@@ -0,0 +1,76 @@
# State Management

> Conventions for agent state and data storage

---

## Overview

DeepSpider manages data with the DeepAgents state backend and filesystem storage.
Agent state is persisted through FilesystemBackend; collected data is stored through DataStore.

---

## State Categories

| Type | Storage | Example |
|------|---------|---------|
| Agent state | FilesystemBackend | `.deepspider-agent/` |
| Collected data | DataStore | `.deepspider-data/` |
| Session state | MemorySaver | In memory |

---

## DataStore Pattern

The data store uses the singleton pattern:

```javascript
import { getDataStore } from '../store/DataStore.js';

const store = getDataStore();
await store.saveResponse(data);
```

**Example**: `src/store/DataStore.js:699-706`

---

## Agent Backend

Agent state backend configuration:

```javascript
import { FilesystemBackend } from 'deepagents';

const backend = new FilesystemBackend({
  rootDir: './.deepspider-agent'
});
```

**Example**: `src/agent/index.js:59-62`

---

## Common Mistakes

### 1. Not using the singleton

```javascript
// ❌ Wrong: creates a new instance every time
const store = new DataStore();

// ✅ Correct: use the singleton
const store = getDataStore();
```

### 2. Forgetting to start a session

```javascript
// ❌ Wrong: saving directly
await store.saveResponse(data);

// ✅ Correct: start a session first
store.startSession();
await store.saveResponse(data);
```
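
Both mistakes can be made harder to commit in the store itself. A minimal sketch, assuming a module-level singleton and a store that refuses writes before a session starts; `ExampleStore` and `getExampleStore` are hypothetical and do not mirror the real `src/store/DataStore.js`.

```javascript
// Hypothetical store used only to illustrate the two rules above.
class ExampleStore {
  #sessionId = null;
  #responses = [];

  startSession() {
    this.#sessionId = `session-${Date.now()}`;
    return this.#sessionId;
  }

  async saveResponse(data) {
    if (this.#sessionId === null) {
      throw new Error('startSession() must be called before saveResponse()');
    }
    this.#responses.push({ sessionId: this.#sessionId, data });
    return this.#responses.length; // number of responses stored so far
  }
}

let instance = null;

// Lazily creates the store once and hands the same instance to every caller.
function getExampleStore() {
  if (instance === null) instance = new ExampleStore();
  return instance;
}

const storeA = getExampleStore();
const storeB = getExampleStore();
```

Failing loudly on a missing session turns a silent data-loss bug into an immediate error at the call site.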
@@ -0,0 +1,144 @@
# Tool Guidelines

> Conventions for defining LangChain tools

---

## Overview

DeepSpider defines agent tools with `@langchain/core/tools`.
Each tool is an independent functional unit whose parameter types are defined by a Zod schema.

---

## Tool Structure

Standard tool definition structure:

```javascript
import { z } from 'zod';
import { tool } from '@langchain/core/tools';

export const myTool = tool(
  async ({ param1, param2 }) => {
    // Tool logic goes here; this placeholder just echoes the input
    const result = { param1, param2 };
    return JSON.stringify(result, null, 2);
  },
  {
    name: 'tool_name', // snake_case naming
    description: 'Tool description', // concise and specific
    schema: z.object({
      param1: z.string().describe('Parameter description'),
      param2: z.number().optional().default(100),
    }),
  }
);

// Export the tool array
export const myTools = [myTool];
```

**Example**: `src/agent/tools/analyzer.js:14-38`

---

## Schema Conventions

Define the parameter schema with Zod:

```javascript
schema: z.object({
  // Required parameter
  code: z.string().describe('JS code'),

  // Optional parameter with a default
  extractFunctions: z.boolean().optional().default(true),

  // Enum type
  mode: z.enum(['fast', 'deep']).optional().default('fast'),

  // Array type
  patterns: z.array(z.string()).optional(),
})
```

**Example**: `src/agent/tools/analyzer.js:32-36`

---

## Return Value Patterns

Return value conventions for tools:

```javascript
// Return a JSON string (recommended)
return JSON.stringify(result, null, 2);

// Return a plain string
return `Analysis complete: ${count} functions`;

// Error handling
try {
  // ...
} catch (e) {
  return JSON.stringify({ error: e.message });
}
```
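
The three return shapes above can be centralized so every tool handles them the same way. A sketch under stated assumptions: `formatToolResult`, `formatToolError`, and `withStringResult` are hypothetical helpers, not part of LangChain or DeepSpider.

```javascript
// Hypothetical helpers: normalize any tool outcome to a string.
function formatToolResult(value) {
  if (typeof value === 'string') return value;
  // JSON.stringify(undefined) is undefined, so fall back to null
  return JSON.stringify(value ?? null, null, 2);
}

function formatToolError(err) {
  const message = err instanceof Error ? err.message : String(err);
  return JSON.stringify({ error: message });
}

// Wraps an async handler so it always resolves to a string, even on failure.
function withStringResult(handler) {
  return async (input) => {
    try {
      return formatToolResult(await handler(input));
    } catch (err) {
      return formatToolError(err);
    }
  };
}

const okText = formatToolResult({ count: 3 });
const errorText = formatToolError(new Error('parse failed'));
```

A handler wrapped with `withStringResult` can then be passed as the first argument to `tool(...)` in place of a hand-written try/catch.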

---

## Tool Organization

Tool file organization:

```javascript
// src/agent/tools/analyzer.js

// 1. Import dependencies
import { z } from 'zod';
import { tool } from '@langchain/core/tools';
import { ASTAnalyzer } from '../../analyzer/ASTAnalyzer.js';

// 2. Define each tool
export const analyzeAst = tool(...);
export const analyzeCallstack = tool(...);

// 3. Export the tool array
export const analyzerTools = [analyzeAst, analyzeCallstack];
```

**Example**: `src/agent/tools/analyzer.js`

---

## Common Mistakes

### 1. Nonstandard tool names

```javascript
// ❌ Wrong: camelCase
name: 'analyzeAst'

// ✅ Correct: snake_case
name: 'analyze_ast'
```
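
The naming rule is easy to check mechanically, for example in a unit test over the exported tool arrays. A minimal sketch: the regular expression and the `isSnakeCase` / `findBadToolNames` helpers are assumptions for illustration, and the exact definition of snake_case (lowercase words joined by single underscores) is one reasonable reading of the rule.

```javascript
// Hypothetical check: lowercase alphanumeric words separated by single underscores.
const SNAKE_CASE = /^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$/;

function isSnakeCase(name) {
  return typeof name === 'string' && SNAKE_CASE.test(name);
}

// Returns the names that break the convention, given objects with a `name` field.
function findBadToolNames(tools) {
  return tools.map((t) => t.name).filter((name) => !isSnakeCase(name));
}

const badNames = findBadToolNames([
  { name: 'analyze_ast' },
  { name: 'analyzeAst' },
  { name: 'analyze__callstack' },
]);
```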

### 2. Missing parameter descriptions

```javascript
// ❌ Wrong: no description
param1: z.string()

// ✅ Correct: with a description
param1: z.string().describe('JS code')
```

### 3. Returning a non-string

```javascript
// ❌ Wrong: returns an object
return result;

// ✅ Correct: returns a JSON string
return JSON.stringify(result, null, 2);
```
@@ -0,0 +1,71 @@
# Type Safety

> Conventions for Zod type validation

---

## Overview

DeepSpider is a pure JavaScript project that uses Zod for runtime type validation.
Zod is mainly used to define parameter schemas for LangChain tools.

---

## Zod Schema

Tool parameters are defined with Zod:

```javascript
import { z } from 'zod';

const schema = z.object({
  code: z.string().describe('JS code'),
  mode: z.enum(['fast', 'deep']).default('fast'),
});
```

---

## Validation

Common Zod types:

| Type | Usage |
|------|-------|
| String | `z.string()` |
| Number | `z.number()` |
| Boolean | `z.boolean()` |
| Enum | `z.enum(['a', 'b'])` |
| Array | `z.array(z.string())` |
| Optional | `.optional()` |
| Default value | `.default(value)` |

---

## Common Patterns

Parameter description patterns:

```javascript
schema: z.object({
  // Required, with a description
  code: z.string().describe('JS code'),

  // Optional, with a default
  deep: z.boolean().optional().default(false),
})
```

---

## Forbidden Patterns

### 1. Missing describe

```javascript
// ❌ Wrong
code: z.string()

// ✅ Correct
code: z.string().describe('JS code')
```
@@ -0,0 +1,92 @@
# Code Reuse Thinking Guide

> **Purpose**: Stop and think before creating new code - does it already exist?

---

## The Problem

**Duplicated code is the #1 source of inconsistency bugs.**

When you copy-paste or rewrite existing logic:
- Bug fixes don't propagate
- Behavior diverges over time
- Codebase becomes harder to understand

---

## Before Writing New Code

### Step 1: Search First

```bash
# Search for similar function names
grep -r "functionName" .

# Search for similar logic
grep -r "keyword" .
```

### Step 2: Ask These Questions

| Question | If Yes... |
|----------|-----------|
| Does a similar function exist? | Use or extend it |
| Is this pattern used elsewhere? | Follow the existing pattern |
| Could this be a shared utility? | Create it in the right place |
| Am I copying code from another file? | **STOP** - extract to shared |

---

## Common Duplication Patterns

### Pattern 1: Copy-Paste Functions

**Bad**: Copying a validation function to another file

**Good**: Extract to shared utilities, import where needed

### Pattern 2: Similar Components

**Bad**: Creating a new component that's 80% similar to an existing one

**Good**: Extend the existing component with props/variants

### Pattern 3: Repeated Constants

**Bad**: Defining the same constant in multiple files

**Good**: Single source of truth, import everywhere

---

## When to Abstract

**Abstract when**:
- Same code appears 3+ times
- Logic is complex enough to have bugs
- Multiple people might need this

**Don't abstract when**:
- Only used once
- Trivial one-liner
- Abstraction would be more complex than duplication

---

## After Batch Modifications

When you've made similar changes to multiple files:

1. **Review**: Did you catch all instances?
2. **Search**: Run grep to find any you missed
3. **Consider**: Should this be abstracted?

---

## Checklist Before Commit

- [ ] Searched for existing similar code
- [ ] No copy-pasted logic that should be shared
- [ ] Constants defined in one place
- [ ] Similar patterns follow same structure
@@ -0,0 +1,94 @@
# Cross-Layer Thinking Guide

> **Purpose**: Think through data flow across layers before implementing.

---

## The Problem

**Most bugs happen at layer boundaries**, not within layers.

Common cross-layer bugs:
- API returns format A, frontend expects format B
- Database stores X, service transforms to Y, but loses data
- Multiple layers implement the same logic differently

---

## Before Implementing Cross-Layer Features

### Step 1: Map the Data Flow

Draw out how data moves:

```
Source → Transform → Store → Retrieve → Transform → Display
```

For each arrow, ask:
- What format is the data in?
- What could go wrong?
- Who is responsible for validation?

### Step 2: Identify Boundaries

| Boundary | Common Issues |
|----------|---------------|
| API ↔ Service | Type mismatches, missing fields |
| Service ↔ Database | Format conversions, null handling |
| Backend ↔ Frontend | Serialization, date formats |
| Component ↔ Component | Props shape changes |

### Step 3: Define Contracts

For each boundary:
- What is the exact input format?
- What is the exact output format?
- What errors can occur?
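
The contract questions can be answered in code by converting and validating once, at the boundary. A sketch with hypothetical names (`parseCapturedRecord`, the `url` and `capturedAt` fields); none of them come from this guide.

```javascript
// Hypothetical boundary function: accepts loosely typed input from another layer
// and returns either a normalized record or an explicit error.
function parseCapturedRecord(raw) {
  if (raw === null || typeof raw !== 'object') {
    return { ok: false, error: 'record must be an object' };
  }
  if (typeof raw.url !== 'string' || raw.url.length === 0) {
    return { ok: false, error: 'url must be a non-empty string' };
  }
  // Explicit format conversion: accept epoch milliseconds or a date string,
  // always emit an ISO 8601 string.
  const date = new Date(raw.capturedAt);
  if (Number.isNaN(date.getTime())) {
    return { ok: false, error: 'capturedAt is not a valid date' };
  }
  return { ok: true, value: { url: raw.url, capturedAt: date.toISOString() } };
}

const parsedOk = parseCapturedRecord({ url: 'https://example.com', capturedAt: 0 });
const parsedBad = parseCapturedRecord({ url: 'https://example.com', capturedAt: 'not a date' });
```

The input format, the output format, and the possible errors are all visible in one function, which is the contract the three questions ask for.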

---

## Common Cross-Layer Mistakes

### Mistake 1: Implicit Format Assumptions

**Bad**: Assuming date format without checking

**Good**: Explicit format conversion at boundaries

### Mistake 2: Scattered Validation

**Bad**: Validating the same thing in multiple layers

**Good**: Validate once at the entry point

### Mistake 3: Leaky Abstractions

**Bad**: Component knows about database schema

**Good**: Each layer only knows its neighbors

---

## Checklist for Cross-Layer Features

Before implementation:
- [ ] Mapped the complete data flow
- [ ] Identified all layer boundaries
- [ ] Defined format at each boundary
- [ ] Decided where validation happens

After implementation:
- [ ] Tested with edge cases (null, empty, invalid)
- [ ] Verified error handling at each boundary
- [ ] Checked data survives round-trip

---

## When to Create Flow Documentation

Create detailed flow docs when:
- Feature spans 3+ layers
- Multiple teams are involved
- Data format is complex
- Feature has caused bugs before