deepspider 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (261)
  1. package/.claude/agents/check.md +122 -0
  2. package/.claude/agents/debug.md +106 -0
  3. package/.claude/agents/dispatch.md +214 -0
  4. package/.claude/agents/implement.md +96 -0
  5. package/.claude/agents/plan.md +396 -0
  6. package/.claude/agents/research.md +120 -0
  7. package/.claude/commands/evolve/merge.md +80 -0
  8. package/.claude/commands/trellis/before-backend-dev.md +13 -0
  9. package/.claude/commands/trellis/before-frontend-dev.md +13 -0
  10. package/.claude/commands/trellis/break-loop.md +107 -0
  11. package/.claude/commands/trellis/check-backend.md +13 -0
  12. package/.claude/commands/trellis/check-cross-layer.md +153 -0
  13. package/.claude/commands/trellis/check-frontend.md +13 -0
  14. package/.claude/commands/trellis/create-command.md +154 -0
  15. package/.claude/commands/trellis/finish-work.md +129 -0
  16. package/.claude/commands/trellis/integrate-skill.md +219 -0
  17. package/.claude/commands/trellis/onboard.md +358 -0
  18. package/.claude/commands/trellis/parallel.md +193 -0
  19. package/.claude/commands/trellis/record-session.md +62 -0
  20. package/.claude/commands/trellis/start.md +280 -0
  21. package/.claude/commands/trellis/update-spec.md +213 -0
  22. package/.claude/hooks/inject-subagent-context.py +758 -0
  23. package/.claude/hooks/ralph-loop.py +374 -0
  24. package/.claude/hooks/session-start.py +126 -0
  25. package/.claude/settings.json +41 -0
  26. package/.claude/skills/deepagents-guide/SKILL.md +428 -0
  27. package/.cursor/commands/trellis-before-backend-dev.md +13 -0
  28. package/.cursor/commands/trellis-before-frontend-dev.md +13 -0
  29. package/.cursor/commands/trellis-break-loop.md +107 -0
  30. package/.cursor/commands/trellis-check-backend.md +13 -0
  31. package/.cursor/commands/trellis-check-cross-layer.md +153 -0
  32. package/.cursor/commands/trellis-check-frontend.md +13 -0
  33. package/.cursor/commands/trellis-create-command.md +154 -0
  34. package/.cursor/commands/trellis-finish-work.md +129 -0
  35. package/.cursor/commands/trellis-integrate-skill.md +219 -0
  36. package/.cursor/commands/trellis-onboard.md +358 -0
  37. package/.cursor/commands/trellis-record-session.md +62 -0
  38. package/.cursor/commands/trellis-start.md +156 -0
  39. package/.cursor/commands/trellis-update-spec.md +213 -0
  40. package/.env.example +11 -0
  41. package/.husky/pre-commit +1 -0
  42. package/.mcp.json +8 -0
  43. package/.trellis/.template-hashes.json +65 -0
  44. package/.trellis/.version +1 -0
  45. package/.trellis/scripts/add-session.sh +384 -0
  46. package/.trellis/scripts/common/developer.sh +129 -0
  47. package/.trellis/scripts/common/git-context.sh +263 -0
  48. package/.trellis/scripts/common/paths.sh +208 -0
  49. package/.trellis/scripts/common/phase.sh +150 -0
  50. package/.trellis/scripts/common/registry.sh +247 -0
  51. package/.trellis/scripts/common/task-queue.sh +142 -0
  52. package/.trellis/scripts/common/task-utils.sh +151 -0
  53. package/.trellis/scripts/common/worktree.sh +128 -0
  54. package/.trellis/scripts/create-bootstrap.sh +299 -0
  55. package/.trellis/scripts/get-context.sh +7 -0
  56. package/.trellis/scripts/get-developer.sh +15 -0
  57. package/.trellis/scripts/init-developer.sh +34 -0
  58. package/.trellis/scripts/multi-agent/cleanup.sh +396 -0
  59. package/.trellis/scripts/multi-agent/create-pr.sh +241 -0
  60. package/.trellis/scripts/multi-agent/plan.sh +207 -0
  61. package/.trellis/scripts/multi-agent/start.sh +310 -0
  62. package/.trellis/scripts/multi-agent/status.sh +828 -0
  63. package/.trellis/scripts/task.sh +1118 -0
  64. package/.trellis/spec/backend/deepagents-guide.md +337 -0
  65. package/.trellis/spec/backend/directory-structure.md +126 -0
  66. package/.trellis/spec/backend/examples/skills/deepagents-guide/README.md +11 -0
  67. package/.trellis/spec/backend/examples/skills/deepagents-guide/agent.js.template +20 -0
  68. package/.trellis/spec/backend/examples/skills/deepagents-guide/skills-config.js.template +13 -0
  69. package/.trellis/spec/backend/examples/skills/deepagents-guide/subagent.js.template +19 -0
  70. package/.trellis/spec/backend/hook-guidelines.md +178 -0
  71. package/.trellis/spec/backend/index.md +36 -0
  72. package/.trellis/spec/backend/quality-guidelines.md +201 -0
  73. package/.trellis/spec/backend/state-management.md +76 -0
  74. package/.trellis/spec/backend/tool-guidelines.md +144 -0
  75. package/.trellis/spec/backend/type-safety.md +71 -0
  76. package/.trellis/spec/guides/code-reuse-thinking-guide.md +92 -0
  77. package/.trellis/spec/guides/cross-layer-thinking-guide.md +94 -0
  78. package/.trellis/spec/guides/index.md +79 -0
  79. package/.trellis/tasks/archive/02-02-evolving-skills/prd.md +61 -0
  80. package/.trellis/tasks/archive/02-02-evolving-skills/task.json +29 -0
  81. package/.trellis/tasks/archive/2026-02/00-bootstrap-guidelines/prd.md +86 -0
  82. package/.trellis/tasks/archive/2026-02/00-bootstrap-guidelines/task.json +27 -0
  83. package/.trellis/tasks/archive/2026-02/02-02-skills-system/check.jsonl +3 -0
  84. package/.trellis/tasks/archive/2026-02/02-02-skills-system/debug.jsonl +2 -0
  85. package/.trellis/tasks/archive/2026-02/02-02-skills-system/implement.jsonl +5 -0
  86. package/.trellis/tasks/archive/2026-02/02-02-skills-system/prd.md +33 -0
  87. package/.trellis/tasks/archive/2026-02/02-02-skills-system/task.json +41 -0
  88. package/.trellis/workflow.md +407 -0
  89. package/.trellis/workspace/index.md +123 -0
  90. package/.trellis/workspace/pony/index.md +40 -0
  91. package/.trellis/workspace/pony/journal-1.md +7 -0
  92. package/.trellis/worktree.yaml +47 -0
  93. package/AGENTS.md +18 -0
  94. package/CLAUDE.md +292 -0
  95. package/README.md +134 -0
  96. package/agents/deepspider.md +142 -0
  97. package/docs/DEBUG.md +42 -0
  98. package/docs/GUIDE.md +334 -0
  99. package/docs/PROMPT.md +60 -0
  100. package/docs/USAGE.md +226 -0
  101. package/eslint.config.js +51 -0
  102. package/package.json +78 -0
  103. package/requirements-crypto.txt +14 -0
  104. package/src/agent/index.js +97 -0
  105. package/src/agent/logger.js +164 -0
  106. package/src/agent/middleware/filterTools.js +64 -0
  107. package/src/agent/middleware/report.js +79 -0
  108. package/src/agent/prompts/system.js +315 -0
  109. package/src/agent/run.js +575 -0
  110. package/src/agent/skills/anti-detect/SKILL.md +28 -0
  111. package/src/agent/skills/anti-detect/evolved.md +12 -0
  112. package/src/agent/skills/captcha/SKILL.md +37 -0
  113. package/src/agent/skills/captcha/evolved.md +12 -0
  114. package/src/agent/skills/config.js +30 -0
  115. package/src/agent/skills/crawler/SKILL.md +9 -0
  116. package/src/agent/skills/crawler/evolved.md +16 -0
  117. package/src/agent/skills/dynamic-analysis/SKILL.md +91 -0
  118. package/src/agent/skills/dynamic-analysis/evolved.md +12 -0
  119. package/src/agent/skills/env/SKILL.md +72 -0
  120. package/src/agent/skills/env/evolved.md +12 -0
  121. package/src/agent/skills/evolve.js +79 -0
  122. package/src/agent/skills/general/SKILL.md +12 -0
  123. package/src/agent/skills/general/evolved.md +12 -0
  124. package/src/agent/skills/js2python/SKILL.md +30 -0
  125. package/src/agent/skills/js2python/evolved.md +13 -0
  126. package/src/agent/skills/report/SKILL.md +21 -0
  127. package/src/agent/skills/report/evolved.md +12 -0
  128. package/src/agent/skills/sandbox/SKILL.md +22 -0
  129. package/src/agent/skills/sandbox/evolved.md +16 -0
  130. package/src/agent/skills/static-analysis/SKILL.md +93 -0
  131. package/src/agent/skills/static-analysis/evolved.md +12 -0
  132. package/src/agent/skills/xpath/SKILL.md +119 -0
  133. package/src/agent/subagents/anti-detect.js +45 -0
  134. package/src/agent/subagents/captcha.js +51 -0
  135. package/src/agent/subagents/crawler.js +138 -0
  136. package/src/agent/subagents/dynamic.js +64 -0
  137. package/src/agent/subagents/env-agent.js +82 -0
  138. package/src/agent/subagents/index.js +37 -0
  139. package/src/agent/subagents/js2python.js +72 -0
  140. package/src/agent/subagents/sandbox.js +55 -0
  141. package/src/agent/subagents/static.js +66 -0
  142. package/src/agent/tools/analysis.js +135 -0
  143. package/src/agent/tools/analyzer.js +85 -0
  144. package/src/agent/tools/anti-detect.js +89 -0
  145. package/src/agent/tools/antidebug.js +64 -0
  146. package/src/agent/tools/async.js +43 -0
  147. package/src/agent/tools/browser.js +324 -0
  148. package/src/agent/tools/captcha.js +223 -0
  149. package/src/agent/tools/capture.js +179 -0
  150. package/src/agent/tools/correlate.js +303 -0
  151. package/src/agent/tools/crawler.js +116 -0
  152. package/src/agent/tools/cryptohook.js +80 -0
  153. package/src/agent/tools/debug.js +246 -0
  154. package/src/agent/tools/deobfuscator.js +90 -0
  155. package/src/agent/tools/env.js +83 -0
  156. package/src/agent/tools/envdump.js +92 -0
  157. package/src/agent/tools/evolve.js +164 -0
  158. package/src/agent/tools/extract.js +114 -0
  159. package/src/agent/tools/extractor.js +54 -0
  160. package/src/agent/tools/file.js +224 -0
  161. package/src/agent/tools/hook.js +84 -0
  162. package/src/agent/tools/hookManager.js +178 -0
  163. package/src/agent/tools/index.js +137 -0
  164. package/src/agent/tools/nodejs.js +101 -0
  165. package/src/agent/tools/patch.js +46 -0
  166. package/src/agent/tools/preprocess.js +71 -0
  167. package/src/agent/tools/profile.js +122 -0
  168. package/src/agent/tools/python.js +627 -0
  169. package/src/agent/tools/report.js +124 -0
  170. package/src/agent/tools/runtime.js +132 -0
  171. package/src/agent/tools/sandbox.js +79 -0
  172. package/src/agent/tools/store.js +73 -0
  173. package/src/agent/tools/trace.js +74 -0
  174. package/src/agent/tools/tracing.js +201 -0
  175. package/src/agent/tools/utils.js +51 -0
  176. package/src/agent/tools/verify.js +184 -0
  177. package/src/agent/tools/webcrack.js +109 -0
  178. package/src/analyzer/ASTAnalyzer.js +387 -0
  179. package/src/analyzer/CallStackAnalyzer.js +379 -0
  180. package/src/analyzer/Deobfuscator.js +289 -0
  181. package/src/analyzer/EncryptionAnalyzer.js +99 -0
  182. package/src/analyzer/index.js +22 -0
  183. package/src/browser/EnvBridge.js +186 -0
  184. package/src/browser/cdp.js +168 -0
  185. package/src/browser/client.js +197 -0
  186. package/src/browser/collector.js +444 -0
  187. package/src/browser/collectors/RequestCryptoLinker.js +109 -0
  188. package/src/browser/collectors/ResponseSearcher.js +107 -0
  189. package/src/browser/collectors/ScriptCollector.js +158 -0
  190. package/src/browser/collectors/index.js +26 -0
  191. package/src/browser/defaultHooks.js +932 -0
  192. package/src/browser/hooks/crypto.js +55 -0
  193. package/src/browser/hooks/index.js +64 -0
  194. package/src/browser/hooks/native.js +9 -0
  195. package/src/browser/hooks/network.js +33 -0
  196. package/src/browser/index.js +42 -0
  197. package/src/browser/interceptors/NetworkInterceptor.js +116 -0
  198. package/src/browser/interceptors/ScriptInterceptor.js +76 -0
  199. package/src/browser/interceptors/index.js +6 -0
  200. package/src/browser/ui/analysisPanel.js +1782 -0
  201. package/src/browser/ui/confirmDialog.js +158 -0
  202. package/src/browser/ui/panel.html +152 -0
  203. package/src/browser/ui/selector.js +170 -0
  204. package/src/config/index.js +5 -0
  205. package/src/config/paths.js +71 -0
  206. package/src/config/patterns/crypto.js +36 -0
  207. package/src/config/profiles/chrome.json +71 -0
  208. package/src/config/profiles/firefox.json +44 -0
  209. package/src/config/profiles/safari.json +38 -0
  210. package/src/core/EnvMonitor.js +200 -0
  211. package/src/core/PatchGenerator.js +278 -0
  212. package/src/core/Sandbox.js +181 -0
  213. package/src/env/AntiAntiDebug.js +111 -0
  214. package/src/env/AsyncHook.js +68 -0
  215. package/src/env/BrowserAPIList.js +265 -0
  216. package/src/env/CookieHook.js +48 -0
  217. package/src/env/CryptoHook.js +205 -0
  218. package/src/env/EnvCodeGenerator.js +157 -0
  219. package/src/env/EnvDumper.js +356 -0
  220. package/src/env/EnvExtractor.js +220 -0
  221. package/src/env/HookBase.js +618 -0
  222. package/src/env/NetworkHook.js +159 -0
  223. package/src/env/modules/bom/history.js +29 -0
  224. package/src/env/modules/bom/location.js +26 -0
  225. package/src/env/modules/bom/navigator.js +70 -0
  226. package/src/env/modules/bom/screen.js +26 -0
  227. package/src/env/modules/bom/storage.js +23 -0
  228. package/src/env/modules/dom/document.js +110 -0
  229. package/src/env/modules/dom/event.js +51 -0
  230. package/src/env/modules/index.js +34 -0
  231. package/src/env/modules/webapi/fetch.js +46 -0
  232. package/src/env/modules/webapi/url.js +47 -0
  233. package/src/env/modules/webapi/xhr.js +48 -0
  234. package/src/index.js +27 -0
  235. package/src/mcp/server.js +89 -0
  236. package/src/store/DataStore.js +708 -0
  237. package/src/store/Store.js +158 -0
  238. package/src/store/Validator.js +24 -0
  239. package/test/analyze.test.js +90 -0
  240. package/test/envdump.test.js +74 -0
  241. package/test/flow.test.js +90 -0
  242. package/test/hooks.test.js +138 -0
  243. package/test/plugin.test.js +35 -0
  244. package/test/refactor-full.test.js +30 -0
  245. package/test/refactor.test.js +21 -0
  246. package/test/samples/obfuscated.js +61 -0
  247. package/test/samples/original.js +66 -0
  248. package/test/samples/v10_eval_chain.js +52 -0
  249. package/test/samples/v11_bytecode_vm.js +81 -0
  250. package/test/samples/v12_polymorphic.js +69 -0
  251. package/test/samples/v1_ob_basic.js +98 -0
  252. package/test/samples/v2_ob_advanced.js +99 -0
  253. package/test/samples/v3_jjencode.js +77 -0
  254. package/test/samples/v4_aaencode.js +73 -0
  255. package/test/samples/v5_control_flow.js +86 -0
  256. package/test/samples/v6_string_encryption.js +71 -0
  257. package/test/samples/v7_jsvmp.js +83 -0
  258. package/test/samples/v8_anti_debug.js +79 -0
  259. package/test/samples/v9_proxy_trap.js +49 -0
  260. package/test/samples.test.js +96 -0
  261. package/test/webcrack.test.js +55 -0
@@ -0,0 +1,36 @@
1
+ # DeepSpider Development Guidelines
2
+
3
+ > DeepSpider project development guidelines
4
+
5
+ ---
6
+
7
+ ## Overview
8
+
9
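+ DeepSpider is a JavaScript reverse-engineering analysis engine built on DeepAgents and Patchright.
10
+ This directory contains the project's development guidelines and code patterns.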
11
+
12
+ ---
13
+
14
+ ## Guidelines Index
15
+
16
+ | Guide | Description | Status |
17
+ |-------|-------------|--------|
18
+ | [Directory Structure](./directory-structure.md) | Project directory layout and module organization | Done |
19
+ | [DeepAgents Guide](./deepagents-guide.md) | DeepAgents framework usage guide | Done |
20
+ | [Tool Guidelines](./tool-guidelines.md) | LangChain tool definition conventions | Done |
21
+ | [Hook Guidelines](./hook-guidelines.md) | Browser hook injection conventions | Done |
22
+ | [State Management](./state-management.md) | Agent state and data storage | Done |
23
+ | [Quality Guidelines](./quality-guidelines.md) | Code quality conventions | Done |
24
+ | [Type Safety](./type-safety.md) | Zod type validation conventions | Done |
25
+
26
+ ---
27
+
28
+ ## Quick Reference
29
+
30
+ Key conventions:
31
+
32
+ 1. **Agent creation**: use `createDeepAgent()` with a configuration object
33
+ 2. **Tool definitions**: use `@langchain/core/tools` with a Zod schema
34
+ 3. **Browser interaction**: prefer CDP; avoid `page.evaluate()`
35
+ 4. **AST traversal**: use `@babel/traverse`
36
+ 5. **Data storage**: use the `getDataStore()` singleton
@@ -0,0 +1,201 @@
1
+ # Quality Guidelines
2
+
3
+ > DeepSpider code quality guidelines
4
+
5
+ ---
6
+
7
+ ## Overview
8
+
9
+ DeepSpider follows the code conventions defined in CLAUDE.md, with a focus on:
10
+ - CDP-first browser interaction
11
+ - Babel AST traversal patterns
12
+ - LangChain tool definition conventions
13
+
14
+ ---
15
+
16
+ ## Forbidden Patterns
17
+
18
+ ### 1. Using page.evaluate instead of CDP
19
+
20
+ ```javascript
21
+ // ❌ Forbidden
22
+ const result = await page.evaluate(() => { ... });
23
+
24
+ // ✅ Use CDP
25
+ const cdp = await browser.getCDPSession();
26
+ const result = await cdp.send('Runtime.evaluate', { ... });
27
+ ```
28
+
29
+ ### 2. Accessing a wrapper class's internal properties directly
30
+
31
+ ```javascript
32
+ // ❌ Forbidden: exposes the internal implementation
33
+ cdpSession.client.on('Debugger.paused', handler);
34
+
35
+ // ✅ Use the methods the wrapper class provides
36
+ cdpSession.on('Debugger.paused', handler);
37
+ ```
38
+
39
+ **Reason**: Accessing `.client` directly leaks the wrapper's internals; callers break when the internal implementation changes.
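As a minimal sketch of this rule, a wrapper can keep the raw client in a private field and expose only forwarding methods. The class name `CdpSessionWrapper` and the shape of the underlying client (an object with `on` and `send`) are assumptions for illustration, not DeepSpider's actual wrapper.

```javascript
// Hypothetical wrapper: the real DeepSpider class may differ.
class CdpSessionWrapper {
  #client; // private field, unreachable from outside the class

  constructor(client) {
    this.#client = client;
  }

  // Forward event subscriptions instead of exposing the client itself.
  on(event, handler) {
    this.#client.on(event, handler);
    return this;
  }

  // Forward protocol commands the same way.
  send(method, params = {}) {
    return this.#client.send(method, params);
  }
}

// Stand-in client that records calls, used here only for demonstration.
const calls = [];
const fakeClient = {
  on: (event) => { calls.push(`on:${event}`); },
  send: (method) => {
    calls.push(`send:${method}`);
    return Promise.resolve({});
  },
};

const session = new CdpSessionWrapper(fakeClient);
session.on('Debugger.paused', () => {});
session.send('Debugger.enable');
```

Because the client lives in a `#private` field, `session.client` is simply `undefined`, so callers cannot come to depend on it.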
40
+
41
+ ### 3. Subagents without middleware
42
+
43
+ ```javascript
44
+ // ❌ Forbidden: middleware configured only on the main agent, not on subagents
45
+ // index.js
46
+ const agent = createDeepAgent({
47
+ middleware: [createFilterToolsMiddleware()],
48
+ subagents: [subagent1, subagent2],
49
+ });
50
+
51
+ // subagent1.js - no middleware
52
+ export const subagent1 = {
53
+ name: 'subagent1',
54
+ tools: [...],
55
+ middleware: [], // empty!
56
+ };
57
+
58
+ // ✅ Subagents need the same middleware configured
59
+ export const subagent1 = {
60
+ name: 'subagent1',
61
+ tools: [...],
62
+ middleware: [
63
+ createFilterToolsMiddleware(), // required
64
+ createSkillsMiddleware({ ... }),
65
+ ],
66
+ };
67
+ ```
68
+
69
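+ **Reason**: DeepAgents subagents do not inherit the main agent's middleware configuration. If the main agent filters out built-in tools, each subagent must configure the filtering middleware itself; otherwise the subagent can still use the filtered tools.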
70
+
71
+ ### 4. async callbacks in setInterval
72
+
73
+ ```javascript
74
+ // ❌ Forbidden: the async callback is not awaited, which can cause concurrency problems
75
+ setInterval(async () => {
76
+ const result = await detectCaptcha();
77
+ await handleResult(result);
78
+ }, 30000);
79
+
80
+ // ✅ Keep the callback synchronous: only check state and set flags
81
+ let needsCheck = false;
82
+ setInterval(() => {
83
+ const elapsed = Date.now() - lastEventTime;
84
+ if (elapsed > timeout) {
85
+ console.log('[Hint] Timed out, please check the page');
86
+ }
87
+ }, 30000);
88
+ ```
89
+
90
+ **Reason**: setInterval does not wait for an async callback to finish, so successive ticks can run concurrently.
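If a periodic task genuinely needs asynchronous work, one possible approach (not prescribed by the guideline itself) is to guard the callback so that a tick is skipped while the previous run is still in flight. `skipWhileRunning` is a hypothetical helper name used only for this sketch.

```javascript
// Hypothetical helper: wraps an async task so overlapping runs are skipped.
function skipWhileRunning(task) {
  let running = false;
  return async function guarded(...args) {
    if (running) return false; // previous run not finished: skip this tick
    running = true;
    try {
      await task(...args);
      return true;
    } finally {
      running = false; // always release the guard, even if the task throws
    }
  };
}

// Demonstration with a task that never settles, so the guard stays held.
let started = 0;
const guardedTask = skipWhileRunning(() => {
  started += 1;
  return new Promise(() => {});
});

guardedTask(); // starts the task
guardedTask(); // skipped: the first run is still pending
```

Passing `guardedTask` to `setInterval` would then never run two instances at once. Errors still reject the returned promise, so the wrapped task should catch its own failures.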
91
+
92
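+ ### 5. Relying on the timeout option of spawn
93
+
94
+ ```javascript
95
+ // ❌ Forbidden: older Node.js versions ignore spawn's timeout option, so the timeout may never fire
96
+ const proc = spawn('node', ['-e', code], {
97
+ timeout: 10000, // ignored on older Node.js!
98
+ });
99
+
100
+ // ✅ Implement the timeout manually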
101
+ const proc = spawn('node', ['-e', code]);
102
+ let killed = false;
103
+
104
+ const timer = setTimeout(() => {
105
+ killed = true;
106
+ proc.kill('SIGTERM');
107
+ }, 10000);
108
+
109
+ proc.on('close', () => {
110
+ clearTimeout(timer);
111
+ });
112
+ ```
113
+
114
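+ **Reason**: `timeout` is best known as an option of `execSync`; `spawn` only gained it in Node.js v15.13.0 / v14.18.0 and ignores it on older versions. Implementing the timeout manually works on every version.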
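The manual timeout can be factored into a small helper. This is a sketch under assumptions: `attachKillTimer` is a hypothetical name, and it relies only on the process object having `kill()` and `once('close', ...)`, which the `ChildProcess` instances returned by `spawn` provide.

```javascript
// Hypothetical helper: kill `proc` if it has not closed within `ms` milliseconds.
function attachKillTimer(proc, ms, signal = 'SIGTERM') {
  const state = { timedOut: false };
  const timer = setTimeout(() => {
    state.timedOut = true; // lets the caller tell a timeout from a normal exit
    proc.kill(signal);
  }, ms);
  // Cancel the timer as soon as the process closes on its own.
  proc.once('close', () => clearTimeout(timer));
  return state;
}

// Stand-in for a ChildProcess, used only to demonstrate the helper.
const handlers = {};
let killSignal = null;
const fakeProc = {
  kill: (sig) => { killSignal = sig; },
  once: (event, cb) => { handlers[event] = cb; },
};

const timerState = attachKillTimer(fakeProc, 10000);
handlers.close(); // simulate the process exiting before the deadline
```

With a real process this would be `attachKillTimer(spawn('node', ['-e', code]), 10000)`, and the caller can inspect `timedOut` on the returned object inside its own `close` handler.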
115
+
116
+ ---
117
+
118
+ ## Required Patterns
119
+
120
+ ### 1. Babel AST traversal
121
+
122
+ ```javascript
123
+ import traverse from '@babel/traverse';
124
+
125
+ traverse.default(ast, {
126
+ FunctionDeclaration(path) {
127
+ // handle the node
128
+ }
129
+ });
130
+ ```
131
+
132
+ ### 2. Reuse the CDP session
133
+
134
+ ```javascript
135
+ const cdp = await browser.getCDPSession();
136
+ ```
137
+
138
+ ---
139
+
140
+ ## Testing Requirements
141
+
142
+ Run the tests:
143
+
144
+ ```bash
145
+ pnpm test
146
+ ```
147
+
148
+ ---
149
+
150
+ ## Code Review Checklist
151
+
152
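+ - [ ] Tool names use snake_case
153
+ - [ ] Parameters have a describe() description
154
+ - [ ] Browser interaction uses CDP
155
+ - [ ] AST traversal uses Babel
156
+ - [ ] Array bounds are checked before access
157
+ - [ ] Objects are null-checked before access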
158
+
159
+ ---
160
+
161
+ ## Defensive Programming
162
+
163
+ ### 1. Array index bounds checks
164
+
165
+ ```javascript
166
+ // ❌ Forbidden: direct access may go out of bounds
167
+ const stage = stages[parseInt(index)];
168
+ stage.fields.push(field);
169
+
170
+ // ✅ Check bounds first (also rejects NaN from non-numeric input)
171
+ const idx = Number.parseInt(index, 10);
172
+ if (Number.isNaN(idx) || idx < 0 || idx >= stages.length) return;
173
+ const stage = stages[idx];
174
+ ```
175
+
176
+ ### 2. Factory functions to avoid repeated structures
177
+
178
+ ```javascript
179
+ // ❌ Forbidden: the same object literal repeated in several places
180
+ stages.push({ name: 'list', fields: [], entry: null });
181
+ // ... elsewhere
182
+ stages = [{ name: 'list', fields: [], entry: null }];
183
+
184
+ // ✅ Use a factory function
185
+ function createStage(name) {
186
+ return { name, fields: [], entry: null, pagination: null };
187
+ }
188
+ stages.push(createStage('list'));
189
+ ```
190
+
191
+ ### 3. Null checks
192
+
193
+ ```javascript
194
+ // ❌ Forbidden: assuming the object exists
195
+ currentStage.fields.splice(index, 1);
196
+
197
+ // ✅ Check first
198
+ if (!currentStage) return;
199
+ if (index < 0 || index >= currentStage.fields.length) return;
200
+ currentStage.fields.splice(index, 1);
201
+ ```
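The null check and the bounds check from this section can be combined into one guarded helper. This is only a sketch: `removeFieldAt` is a hypothetical name, and the stage shape (an object with a `fields` array) follows the examples in this section.

```javascript
// Hypothetical helper combining the null check and the bounds check.
// Returns the removed field, or undefined when nothing was removed.
function removeFieldAt(stage, rawIndex) {
  if (!stage || !Array.isArray(stage.fields)) return undefined;
  const idx = Number.parseInt(rawIndex, 10);
  // Number.isInteger rejects NaN, which plain < / >= comparisons let through.
  if (!Number.isInteger(idx) || idx < 0 || idx >= stage.fields.length) {
    return undefined;
  }
  return stage.fields.splice(idx, 1)[0];
}

const demoStage = { name: 'list', fields: ['title', 'url'], entry: null };
const removed = removeFieldAt(demoStage, '1');   // valid index given as a string
const ignored = removeFieldAt(demoStage, 'abc'); // NaN: nothing removed
const missing = removeFieldAt(null, 0);          // no stage: nothing removed
```

Centralizing the guards in one function also follows the factory-function advice: callers no longer repeat the same checks before each `splice`.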
@@ -0,0 +1,76 @@
1
+ # State Management
2
+
3
+ > Agent state and data storage conventions
4
+
5
+ ---
6
+
7
+ ## Overview
8
+
9
+ DeepSpider manages data with the DeepAgents state backend and filesystem storage.
10
+ Agent state is persisted through FilesystemBackend; collected data is stored through DataStore.
11
+
12
+ ---
13
+
14
+ ## State Categories
15
+
16
+ | Type | Storage | Example |
17
+ |------|----------|------|
18
+ | Agent state | FilesystemBackend | `.deepspider-agent/` |
19
+ | Collected data | DataStore | `.deepspider-data/` |
20
+ | Session state | MemorySaver | In memory |
21
+
22
+ ---
23
+
24
+ ## DataStore Pattern
25
+
26
+ Data storage uses the singleton pattern:
27
+
28
+ ```javascript
29
+ import { getDataStore } from '../store/DataStore.js';
30
+
31
+ const store = getDataStore();
32
+ await store.saveResponse(data);
33
+ ```
34
+
35
+ **Example**: `src/store/DataStore.js:699-706`
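For readers unfamiliar with the pattern, here is a minimal sketch of how a lazily created singleton accessor can work. The `DataStore` class body is a placeholder; the real implementation in `src/store/DataStore.js` is not shown in this section and may differ.

```javascript
// Placeholder class standing in for the real DataStore.
class DataStore {
  constructor() {
    this.responses = [];
  }
}

let instance = null; // module-level cache shared by all callers

// Create the store on first use, then always return the same object.
function getDataStore() {
  if (instance === null) {
    instance = new DataStore();
  }
  return instance;
}

const first = getDataStore();
const second = getDataStore();
first.responses.push({ url: 'https://example.com' });
```

Both variables refer to the same object, so data saved through one is visible through the other; calling `new DataStore()` directly would create a separate, empty store.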
36
+
37
+ ---
38
+
39
+ ## Agent Backend
40
+
41
+ Agent state backend configuration:
42
+
43
+ ```javascript
44
+ import { FilesystemBackend } from 'deepagents';
45
+
46
+ const backend = new FilesystemBackend({
47
+ rootDir: './.deepspider-agent'
48
+ });
49
+ ```
50
+
51
+ **Example**: `src/agent/index.js:59-62`
52
+
53
+ ---
54
+
55
+ ## Common Mistakes
56
+
57
+ ### 1. Not using the singleton
58
+
59
+ ```javascript
60
+ // ❌ Wrong: creates a new instance every time
61
+ const store = new DataStore();
62
+
63
+ // ✅ Correct: use the singleton
64
+ const store = getDataStore();
65
+ ```
66
+
67
+ ### 2. Forgetting to start a session
68
+
69
+ ```javascript
70
+ // ❌ Wrong: saving directly
71
+ await store.saveResponse(data);
72
+
73
+ // ✅ Correct: start a session first
74
+ store.startSession();
75
+ await store.saveResponse(data);
76
+ ```
@@ -0,0 +1,144 @@
1
+ # Tool Guidelines
2
+
3
+ > LangChain tool definition conventions
4
+
5
+ ---
6
+
7
+ ## Overview
8
+
9
+ DeepSpider uses `@langchain/core/tools` to define agent tools.
10
+ Each tool is a self-contained unit of functionality whose parameter types are defined by a Zod schema.
11
+
12
+ ---
13
+
14
+ ## Tool Structure
15
+
16
+ Standard tool definition structure:
17
+
18
+ ```javascript
19
+ import { z } from 'zod';
20
+ import { tool } from '@langchain/core/tools';
21
+
22
+ export const myTool = tool(
23
+ async ({ param1, param2 }) => {
24
+ // tool logic
25
+ return JSON.stringify(result, null, 2);
26
+ },
27
+ {
28
+ name: 'tool_name', // snake_case naming
29
+ description: 'Tool description', // concise and unambiguous
30
+ schema: z.object({
31
+ param1: z.string().describe('Parameter description'),
32
+ param2: z.number().optional().default(100),
33
+ }),
34
+ }
35
+ );
36
+
37
+ // export the tools array
38
+ export const myTools = [myTool];
39
+ ```
40
+
41
+ **Example**: `src/agent/tools/analyzer.js:14-38`
42
+
43
+ ---
44
+
45
+ ## Schema Conventions
46
+
47
+ Define parameter schemas with Zod:
48
+
49
+ ```javascript
50
+ schema: z.object({
51
+ // required parameter
52
+ code: z.string().describe('JS code'),
53
+
54
+ // optional parameter with a default value
55
+ extractFunctions: z.boolean().optional().default(true),
56
+
57
+ // enum type
58
+ mode: z.enum(['fast', 'deep']).optional().default('fast'),
59
+
60
+ // array type
61
+ patterns: z.array(z.string()).optional(),
62
+ })
63
+ ```
64
+
65
+ **Example**: `src/agent/tools/analyzer.js:32-36`
66
+
67
+ ---
68
+
69
+ ## Return Value Patterns
70
+
71
+ Tool return value conventions:
72
+
73
+ ```javascript
74
+ // return a JSON string (recommended)
75
+ return JSON.stringify(result, null, 2);
76
+
77
+ // return a plain string
78
+ return `Analysis complete: ${count} functions`;
79
+
80
+ // error handling
81
+ try {
82
+ // ...
83
+ } catch (e) {
84
+ return JSON.stringify({ error: e.message });
85
+ }
86
+ ```
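These return conventions can be gathered into two small helpers so that every tool serializes results and errors the same way. `formatToolResult` and `formatToolError` are hypothetical names for this sketch; they are not part of LangChain or DeepSpider.

```javascript
// Hypothetical helper: strings pass through, other values become pretty-printed JSON.
function formatToolResult(value) {
  if (typeof value === 'string') return value;
  return JSON.stringify(value, null, 2);
}

// Hypothetical helper: report failures as a JSON string instead of throwing.
function formatToolError(err) {
  const message = err instanceof Error ? err.message : String(err);
  return JSON.stringify({ error: message });
}

const okResult = formatToolResult({ count: 2 });
const plainResult = formatToolResult('done');
const failedResult = formatToolError(new Error('parse failed'));
```

A tool body would then wrap its logic in `try { return formatToolResult(result); } catch (e) { return formatToolError(e); }`, so the agent always receives a string.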
87
+
88
+ ---
89
+
90
+ ## Tool Organization
91
+
92
+ Tool file organization:
93
+
94
+ ```javascript
95
+ // src/agent/tools/analyzer.js
96
+
97
+ // 1. Import dependencies
98
+ import { z } from 'zod';
99
+ import { tool } from '@langchain/core/tools';
100
+ import { ASTAnalyzer } from '../../analyzer/ASTAnalyzer.js';
101
+
102
+ // 2. Define the individual tools
103
+ export const analyzeAst = tool(...);
104
+ export const analyzeCallstack = tool(...);
105
+
106
+ // 3. Export the tools array
107
+ export const analyzerTools = [analyzeAst, analyzeCallstack];
108
+ ```
109
+
110
+ **Example**: `src/agent/tools/analyzer.js`
111
+
112
+ ---
113
+
114
+ ## Common Mistakes
115
+
116
+ ### 1. Non-conforming tool names
117
+
118
+ ```javascript
119
+ // ❌ Wrong: uses camelCase
120
+ name: 'analyzeAst'
121
+
122
+ // ✅ Correct: uses snake_case
123
+ name: 'analyze_ast'
124
+ ```
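The naming rule is easy to check mechanically, for example in a unit test over all exported tools. The regular expression below is one reasonable reading of "snake_case" (lowercase words joined by single underscores) and `isSnakeCase` is a hypothetical helper, not something the project is known to ship.

```javascript
// Lowercase alphanumeric words separated by single underscores, e.g. analyze_ast.
const SNAKE_CASE = /^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$/;

// Hypothetical helper: true when `name` follows the snake_case convention.
function isSnakeCase(name) {
  return SNAKE_CASE.test(name);
}

const nameChecks = ['analyze_ast', 'analyzeAst', 'analyze__ast'].map(isSnakeCase);
```

Only the first name passes: the second contains an uppercase letter and the third has a doubled underscore.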
125
+
126
+ ### 2. Missing parameter descriptions
127
+
128
+ ```javascript
129
+ // ❌ Wrong: no description
130
+ param1: z.string()
131
+
132
+ // ✅ Correct: has a description
133
+ param1: z.string().describe('JS code')
134
+ ```
135
+
136
+ ### 3. Returning a non-string value
137
+
138
+ ```javascript
139
+ // ❌ Wrong: returns an object
140
+ return result;
141
+
142
+ // ✅ Correct: returns a JSON string
143
+ return JSON.stringify(result, null, 2);
144
+ ```
@@ -0,0 +1,71 @@
1
+ # Type Safety
2
+
3
+ > Zod type validation conventions
4
+
5
+ ---
6
+
7
+ ## Overview
8
+
9
+ DeepSpider is a pure JavaScript project that uses Zod for runtime type validation.
10
+ Zod is used mainly to define parameter schemas for LangChain tools.
11
+
12
+ ---
13
+
14
+ ## Zod Schema
15
+
16
+ Tool parameters are defined with Zod:
17
+
18
+ ```javascript
19
+ import { z } from 'zod';
20
+
21
+ const schema = z.object({
22
+ code: z.string().describe('JS code'),
23
+ mode: z.enum(['fast', 'deep']).default('fast'),
24
+ });
25
+ ```
26
+
27
+ ---
28
+
29
+ ## Validation
30
+
31
+ Commonly used Zod types:
32
+
33
+ | Type | Usage |
34
+ |------|------|
35
+ | String | `z.string()` |
36
+ | Number | `z.number()` |
37
+ | Boolean | `z.boolean()` |
38
+ | Enum | `z.enum(['a', 'b'])` |
39
+ | Array | `z.array(z.string())` |
40
+ | Optional | `.optional()` |
41
+ | Default value | `.default(value)` |
42
+
43
+ ---
44
+
45
+ ## Common Patterns
46
+
47
+ Parameter description patterns:
48
+
49
+ ```javascript
50
+ schema: z.object({
51
+ // required, with a description
52
+ code: z.string().describe('JS code'),
53
+
54
+ // optional, with a default value
55
+ deep: z.boolean().optional().default(false),
56
+ })
57
+ ```
58
+
59
+ ---
60
+
61
+ ## Forbidden Patterns
62
+
63
+ ### 1. Missing describe
64
+
65
+ ```javascript
66
+ // ❌ Wrong
67
+ code: z.string()
68
+
69
+ // ✅ Correct
70
+ code: z.string().describe('JS code')
71
+ ```
@@ -0,0 +1,92 @@
1
+ # Code Reuse Thinking Guide
2
+
3
+ > **Purpose**: Stop and think before creating new code - does it already exist?
4
+
5
+ ---
6
+
7
+ ## The Problem
8
+
9
+ **Duplicated code is the #1 source of inconsistency bugs.**
10
+
11
+ When you copy-paste or rewrite existing logic:
12
+ - Bug fixes don't propagate
13
+ - Behavior diverges over time
14
+ - Codebase becomes harder to understand
15
+
16
+ ---
17
+
18
+ ## Before Writing New Code
19
+
20
+ ### Step 1: Search First
21
+
22
+ ```bash
23
+ # Search for similar function names
24
+ grep -r "functionName" .
25
+
26
+ # Search for similar logic
27
+ grep -r "keyword" .
28
+ ```
29
+
30
+ ### Step 2: Ask These Questions
31
+
32
+ | Question | If Yes... |
33
+ |----------|-----------|
34
+ | Does a similar function exist? | Use or extend it |
35
+ | Is this pattern used elsewhere? | Follow the existing pattern |
36
+ | Could this be a shared utility? | Create it in the right place |
37
+ | Am I copying code from another file? | **STOP** - extract to shared |
38
+
39
+ ---
40
+
41
+ ## Common Duplication Patterns
42
+
43
+ ### Pattern 1: Copy-Paste Functions
44
+
45
+ **Bad**: Copying a validation function to another file
46
+
47
+ **Good**: Extract to shared utilities, import where needed
48
+
49
+ ### Pattern 2: Similar Components
50
+
51
+ **Bad**: Creating a new component that's 80% similar to existing
52
+
53
+ **Good**: Extend existing component with props/variants
54
+
55
+ ### Pattern 3: Repeated Constants
56
+
57
+ **Bad**: Defining the same constant in multiple files
58
+
59
+ **Good**: Single source of truth, import everywhere
60
+
61
+ ---
62
+
63
+ ## When to Abstract
64
+
65
+ **Abstract when**:
66
+ - Same code appears 3+ times
67
+ - Logic is complex enough to have bugs
68
+ - Multiple people might need this
69
+
70
+ **Don't abstract when**:
71
+ - Only used once
72
+ - Trivial one-liner
73
+ - Abstraction would be more complex than duplication
74
+
75
+ ---
76
+
77
+ ## After Batch Modifications
78
+
79
+ When you've made similar changes to multiple files:
80
+
81
+ 1. **Review**: Did you catch all instances?
82
+ 2. **Search**: Run grep to find any instances you missed
83
+ 3. **Consider**: Should this be abstracted?
84
+
85
+ ---
86
+
87
+ ## Checklist Before Commit
88
+
89
+ - [ ] Searched for existing similar code
90
+ - [ ] No copy-pasted logic that should be shared
91
+ - [ ] Constants defined in one place
92
+ - [ ] Similar patterns follow same structure
@@ -0,0 +1,94 @@
1
+ # Cross-Layer Thinking Guide
2
+
3
+ > **Purpose**: Think through data flow across layers before implementing.
4
+
5
+ ---
6
+
7
+ ## The Problem
8
+
9
+ **Most bugs happen at layer boundaries**, not within layers.
10
+
11
+ Common cross-layer bugs:
12
+ - API returns format A, frontend expects format B
13
+ - Database stores X, service transforms to Y, but loses data
14
+ - Multiple layers implement the same logic differently
15
+
16
+ ---
17
+
18
+ ## Before Implementing Cross-Layer Features
19
+
20
+ ### Step 1: Map the Data Flow
21
+
22
+ Draw out how data moves:
23
+
24
+ ```
25
+ Source → Transform → Store → Retrieve → Transform → Display
26
+ ```
27
+
28
+ For each arrow, ask:
29
+ - What format is the data in?
30
+ - What could go wrong?
31
+ - Who is responsible for validation?
32
+
33
+ ### Step 2: Identify Boundaries
34
+
35
+ | Boundary | Common Issues |
36
+ |----------|---------------|
37
+ | API ↔ Service | Type mismatches, missing fields |
38
+ | Service ↔ Database | Format conversions, null handling |
39
+ | Backend ↔ Frontend | Serialization, date formats |
40
+ | Component ↔ Component | Props shape changes |
41
+
42
+ ### Step 3: Define Contracts
43
+
44
+ For each boundary:
45
+ - What is the exact input format?
46
+ - What is the exact output format?
47
+ - What errors can occur?
48
+
49
+ ---
50
+
51
+ ## Common Cross-Layer Mistakes
52
+
53
+ ### Mistake 1: Implicit Format Assumptions
54
+
55
+ **Bad**: Assuming date format without checking
56
+
57
+ **Good**: Explicit format conversion at boundaries
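As an illustration of explicit conversion at a boundary, the sketch below serializes a `Date` to an ISO 8601 string on the way out and parses and validates it on the way in. The function names and the `createdAt` field are hypothetical; the guide itself does not prescribe a specific format.

```javascript
// Outbound: convert the Date to an ISO 8601 string before it leaves this layer.
function toWire(record) {
  return { ...record, createdAt: record.createdAt.toISOString() };
}

// Inbound: parse and validate at the boundary instead of trusting the format.
function fromWire(payload) {
  const createdAt = new Date(payload.createdAt);
  if (Number.isNaN(createdAt.getTime())) {
    throw new TypeError(`Invalid createdAt: ${payload.createdAt}`);
  }
  return { ...payload, createdAt };
}

const wire = toWire({ id: 1, createdAt: new Date(0) });
const restored = fromWire(wire);
```

Keeping both directions next to each other makes the contract at the boundary visible and lets a round-trip test catch format drift early.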
58
+
59
+ ### Mistake 2: Scattered Validation
60
+
61
+ **Bad**: Validating the same thing in multiple layers
62
+
63
+ **Good**: Validate once at the entry point
64
+
65
+ ### Mistake 3: Leaky Abstractions
66
+
67
+ **Bad**: Component knows about database schema
68
+
69
+ **Good**: Each layer only knows its neighbors
70
+
71
+ ---
72
+
73
+ ## Checklist for Cross-Layer Features
74
+
75
+ Before implementation:
76
+ - [ ] Mapped the complete data flow
77
+ - [ ] Identified all layer boundaries
78
+ - [ ] Defined format at each boundary
79
+ - [ ] Decided where validation happens
80
+
81
+ After implementation:
82
+ - [ ] Tested with edge cases (null, empty, invalid)
83
+ - [ ] Verified error handling at each boundary
84
+ - [ ] Checked data survives round-trip
85
+
86
+ ---
87
+
88
+ ## When to Create Flow Documentation
89
+
90
+ Create detailed flow docs when:
91
+ - Feature spans 3+ layers
92
+ - Multiple teams are involved
93
+ - Data format is complex
94
+ - Feature has caused bugs before