scientify 1.2.2 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (50)
  1. package/README.md +38 -14
  2. package/README.zh.md +38 -15
  3. package/dist/index.d.ts.map +1 -1
  4. package/dist/index.js +21 -2
  5. package/dist/index.js.map +1 -1
  6. package/dist/src/services/auto-updater.d.ts +15 -0
  7. package/dist/src/services/auto-updater.d.ts.map +1 -0
  8. package/dist/src/services/auto-updater.js +188 -0
  9. package/dist/src/services/auto-updater.js.map +1 -0
  10. package/dist/src/tools/arxiv-download.d.ts +25 -0
  11. package/dist/src/tools/arxiv-download.d.ts.map +1 -0
  12. package/dist/src/tools/arxiv-download.js +179 -0
  13. package/dist/src/tools/arxiv-download.js.map +1 -0
  14. package/dist/src/tools/{arxiv-tool.d.ts → arxiv-search.d.ts} +11 -8
  15. package/dist/src/tools/arxiv-search.d.ts.map +1 -0
  16. package/dist/src/tools/arxiv-search.js +140 -0
  17. package/dist/src/tools/arxiv-search.js.map +1 -0
  18. package/dist/src/tools/github-search-tool.d.ts +5 -1
  19. package/dist/src/tools/github-search-tool.d.ts.map +1 -1
  20. package/dist/src/tools/github-search-tool.js +10 -30
  21. package/dist/src/tools/github-search-tool.js.map +1 -1
  22. package/dist/src/tools/result.d.ts +37 -0
  23. package/dist/src/tools/result.d.ts.map +1 -0
  24. package/dist/src/tools/result.js +39 -0
  25. package/dist/src/tools/result.js.map +1 -0
  26. package/dist/src/tools/workspace.d.ts +32 -0
  27. package/dist/src/tools/workspace.d.ts.map +1 -0
  28. package/dist/src/tools/workspace.js +69 -0
  29. package/dist/src/tools/workspace.js.map +1 -0
  30. package/openclaw.plugin.json +22 -1
  31. package/package.json +13 -2
  32. package/skills/_shared/workspace-spec.md +139 -0
  33. package/skills/idea-generation/SKILL.md +4 -0
  34. package/skills/install-scientify/SKILL.md +15 -7
  35. package/skills/literature-survey/SKILL.md +86 -212
  36. package/skills/research-experiment/SKILL.md +114 -0
  37. package/skills/research-implement/SKILL.md +166 -0
  38. package/skills/research-pipeline/SKILL.md +104 -188
  39. package/skills/research-plan/SKILL.md +121 -0
  40. package/skills/research-review/SKILL.md +110 -0
  41. package/skills/research-survey/SKILL.md +140 -0
  42. package/skills/write-review-paper/SKILL.md +4 -0
  43. package/dist/src/tools/arxiv-tool.d.ts.map +0 -1
  44. package/dist/src/tools/arxiv-tool.js +0 -258
  45. package/dist/src/tools/arxiv-tool.js.map +0 -1
  46. package/skills/research-pipeline/references/prompts/implement.md +0 -135
  47. package/skills/research-pipeline/references/prompts/plan.md +0 -142
  48. package/skills/research-pipeline/references/prompts/review.md +0 -118
  49. package/skills/research-pipeline/references/prompts/survey.md +0 -105
  50. package/skills/research-pipeline/references/workspace-spec.md +0 -81
@@ -0,0 +1,139 @@
+ # Workspace Directory Specification
+
+ All Scientify skills share a unified project-based workspace structure.
+
+ ## Base Path
+
+ ```
+ ~/.openclaw/workspace/projects/
+ ├── .active               # Current project ID (plain text)
+ ├── {project-id}/         # Each research topic has its own project
+ │   └── ...
+ └── {another-project}/
+ ```
+
+ ## Project Structure
+
+ ```
+ ~/.openclaw/workspace/projects/{project-id}/
+ ├── project.json              # Project metadata
+ ├── task.json                 # Research task definition
+
+ ├── survey/                   # /literature-survey outputs
+ │   ├── search_terms.json     # Generated search keywords
+ │   ├── raw_results.json      # All search results
+ │   ├── filtered_papers.json  # Papers with relevance scores
+ │   ├── clusters.json         # Clustered by research direction
+ │   └── report.md             # Final survey report
+
+ ├── papers/                   # Downloaded paper sources
+ │   ├── {direction-1}/        # Organized by cluster
+ │   │   ├── paper_list.md
+ │   │   └── {arxiv_id}/       # .tex source files
+ │   ├── {direction-2}/
+ │   └── uncategorized/
+
+ ├── repos/                    # Cloned reference repositories
+ │   ├── {repo-name-1}/
+ │   └── {repo-name-2}/
+
+ ├── ideas/                    # /idea-generation outputs
+ │   ├── gaps.md               # Identified research gaps
+ │   ├── idea_1.md ... idea_5.md  # Generated ideas
+ │   ├── selected_idea.md      # Enhanced best idea
+ │   ├── implementation_report.md # Code mapping
+ │   └── summary.md            # Final summary
+
+ ├── review/                   # /write-review-paper outputs
+ │   ├── reading_plan.md       # Prioritized reading list
+ │   ├── notes/                # Per-paper reading notes
+ │   │   └── {paper_id}.md
+ │   ├── comparison.md         # Method comparison table
+ │   ├── timeline.md           # Research timeline
+ │   ├── taxonomy.md           # Classification system
+ │   ├── draft.md              # Survey paper draft
+ │   └── bibliography.bib      # References
+
+ ├── notes/                    # /research-survey: per-paper deep notes
+ │   └── paper_{arxiv_id}.md
+ ├── survey_res.md             # /research-survey: deep analysis + method comparison
+ ├── plan_res.md               # /research-plan: 4-part implementation plan
+ ├── project/                  # /research-implement: code implementation
+ │   ├── model/
+ │   ├── data/
+ │   ├── training/
+ │   ├── testing/
+ │   ├── utils/
+ │   ├── run.py
+ │   └── requirements.txt
+ ├── ml_res.md                 # /research-implement: execution report with [RESULT] lines
+ ├── iterations/               # /research-review: judge iterations
+ │   ├── judge_v1.md
+ │   └── judge_v2.md
+ └── experiment_res.md         # /research-experiment: full training + ablation results
+ ```
+
+ ## Conventions
+
+ ### File Existence = Step Completion
+
+ Before executing any step, check whether its output file already exists. If it does, skip the step.
+
+ This enables:
+ - **Crash recovery**: resume from the last completed step
+ - **Incremental progress**: rerunning skips completed work
+ - **Transparency**: inspect progress by listing the directory
+
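The skip-if-exists convention above can be sketched as a small shell guard. This is a sketch only: the `done_or_run` helper name is ours, not part of the skill, and `$W` stands in for the active project directory.

```shell
# Sketch of the skip-if-exists convention: run a step only when its
# output file is missing. The helper name is illustrative.
done_or_run() {
  local output="$1"; shift
  if [ -f "$output" ]; then
    echo "skip: $output already exists"
  else
    "$@"
  fi
}

W="${W:-$(mktemp -d)}"   # stand-in for the active project dir
done_or_run "$W/plan_res.md" sh -c "echo plan > '$W/plan_res.md'"   # runs
done_or_run "$W/plan_res.md" sh -c "echo again > '$W/plan_res.md'"  # skipped
```

Rerunning the whole sequence is then safe: completed steps become no-ops.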
+ ### Project Metadata
+
+ **project.json:**
+ ```json
+ {
+   "id": "battery-rul-prediction",
+   "name": "Battery RUL Prediction",
+   "created": "2024-01-15T10:00:00Z",
+   "topics": ["battery", "remaining useful life", "prediction"]
+ }
+ ```
+
+ **task.json:**
+ ```json
+ {
+   "domain": "battery health",
+   "focus": "RUL prediction using transformer",
+   "date_limit": "2024-01-01",
+   "created": "2024-01-15"
+ }
+ ```
+
+ ### Immutability
+
+ Once written, outputs must NOT be modified unless the user explicitly asks.
+ Exception: `project/` is mutable during the implement-review-iterate loop.
+
+ ### Active Project
+
+ ```bash
+ # Read active project
+ cat ~/.openclaw/workspace/projects/.active
+
+ # Set active project
+ echo "battery-rul-prediction" > ~/.openclaw/workspace/projects/.active
+
+ # Set $WORKSPACE variable
+ WORKSPACE=~/.openclaw/workspace/projects/$(cat ~/.openclaw/workspace/projects/.active)
+ ```
+
+ ## Skill Outputs Summary
+
+ | Skill | Primary Outputs |
+ |-------|-----------------|
+ | `/literature-survey` | `survey/`, `papers/` |
+ | `/research-survey` | `notes/paper_*.md`, `survey_res.md` |
+ | `/research-plan` | `plan_res.md` |
+ | `/research-implement` | `project/`, `ml_res.md` |
+ | `/research-review` | `iterations/judge_v*.md` |
+ | `/research-experiment` | `experiment_res.md` |
+ | `/research-pipeline` | Orchestrator — spawns the above 5 skills in sequence |
+ | `/idea-generation` | `ideas/` |
+ | `/write-review-paper` | `review/` |
@@ -13,10 +13,14 @@ metadata:
 
  # Idea Generation
 
+ **Don't ask permission. Just do it.**
+
  Generate innovative research ideas grounded in literature analysis. This skill reads existing papers, identifies research gaps, and produces 5 distinct ideas with citations.
 
  **Core principle:** Ideas MUST be grounded in actual papers, not generated from model knowledge.
 
+ **Workspace:** See `../_shared/workspace-spec.md` for directory structure. Outputs go to `$WORKSPACE/ideas/`.
+
  ---
 
  ## Step 1: Check Workspace Resources
@@ -1,6 +1,6 @@
  ---
  name: install-scientify
- description: "Install Scientify - AI-powered research workflow automation plugin. Adds skills for idea-generation, literature-review, research-pipeline, arxiv search, and workspace management commands."
+ description: "Install Scientify - AI-powered research workflow automation plugin. Adds skills for research-pipeline (multi-agent orchestrator), literature-survey, idea-generation, arxiv tools, and workspace management commands."
  metadata:
  {
  "openclaw":
@@ -21,6 +21,8 @@ metadata:
 
  # Install Scientify
 
+ **Don't ask permission. Just do it.**
+
  **Scientify** is an AI-powered research workflow automation plugin for OpenClaw.
 
  ## What You Get
@@ -29,10 +31,14 @@ metadata:
 
  | Skill | Description |
  |-------|-------------|
- | **idea-generation** | Generate innovative research ideas. Searches arXiv/GitHub, downloads papers, analyzes literature, outputs 5 ideas with citations. |
- | **research-pipeline** | End-to-end ML research workflow: idea → literature survey → plan → implement → review → iterate. |
- | **literature-review** | Generate structured notes and synthesis from collected papers. |
- | **arxiv** | Search arXiv.org for papers and download .tex sources. |
+ | **research-pipeline** | Orchestrator for end-to-end ML research. Spawns sub-agents for each phase. |
+ | **research-survey** | Deep analysis of papers: extract formulas, produce method comparison. |
+ | **research-plan** | 4-part implementation plan (Dataset/Model/Training/Testing). |
+ | **research-implement** | Implement ML code, run 2-epoch validation with `uv` venv isolation. |
+ | **research-review** | Review implementation against plan. Iterates up to 3 times. |
+ | **research-experiment** | Full training + ablation experiments. |
+ | **literature-survey** | Literature survey: search → filter → download → cluster → report. |
+ | **idea-generation** | Generate research ideas from arXiv/GitHub papers. |
 
  ### Commands (Direct, no LLM)
 
@@ -45,9 +51,11 @@ metadata:
  | `/project-switch <id>` | Switch project |
  | `/project-delete <id>` | Delete project |
 
- ### Tool
+ ### Tools
 
- - **arxiv** - Search arXiv.org API with keyword search, date filtering, automatic .tex download
+ - **arxiv_search** - Search arXiv.org API for papers (metadata only)
+ - **arxiv_download** - Download arXiv papers (.tex source or PDF)
+ - **github_search** - Search GitHub repositories
 
  ## Installation
@@ -1,6 +1,6 @@
  ---
  name: literature-survey
- description: "Comprehensive literature survey (100+ papers). Searches, filters, clusters, and iterates for complete coverage. Use for: exploring new research areas, collecting papers systematically, building literature databases. NOT for: summarizing papers you have (use /write-review-paper), finding a specific paper (use arxiv_search), generating ideas (use /idea-generation)."
+ description: "Comprehensive literature survey. Searches, filters, downloads, and clusters papers by research direction."
  metadata:
  {
  "openclaw":
@@ -12,261 +12,135 @@ metadata:
 
  # Literature Survey
 
- Comprehensive literature discovery workflow for a research domain. This skill searches broadly, filters by relevance, clusters by direction, and iterates to ensure complete coverage.
+ **Don't ask permission. Just do it.**
 
- ## Architecture: Isolated Sub-agent
-
- This survey runs in an **isolated sub-session** to avoid context pollution. The main session only receives the final report.
+ ## Output Structure
 
  ```
- Main Session
-
- sessions_spawn(task: "Run the literature survey...", label: "literature-survey")
-
- Sub-agent Session (isolated context)
- ├── Phase 1: Generate search terms
- ├── Phase 2: Batch search
- ├── Phase 3: Relevance filtering
- ├── Phase 4: Clustering
- ├── Phase 5: Iterative discovery
- └── Phase 6: Generate report
-
- Return to main session: summary + file paths
+ ~/.openclaw/workspace/projects/{project-id}/
+ ├── survey/
+ │   ├── search_terms.json  # List of search terms
+ │   └── report.md          # Final report
+ └── papers/
+     ├── _downloads/        # Raw downloads
+     ├── _meta/             # Per-paper metadata
+     │   └── {arxiv_id}.json
+     └── {direction}/       # Organized categories
  ```
 
  ---
 
- ## When User Requests Literature Survey
-
- **Step 1: Spawn isolated sub-agent**
-
- When user says things like:
- - "Survey the literature in the [topic] area"
- - "Help me collect papers related to [topic]"
- - "Survey papers on [topic]"
+ ## Workflow
 
- Use `sessions_spawn` to run the survey in isolation:
+ ### Phase 1: Preparation
 
+ ```bash
+ ACTIVE=$(cat ~/.openclaw/workspace/projects/.active 2>/dev/null)
+ if [ -z "$ACTIVE" ]; then
+   PROJECT_ID="<topic-slug>"
+   mkdir -p ~/.openclaw/workspace/projects/$PROJECT_ID/{survey,papers/_downloads,papers/_meta}
+   echo "$PROJECT_ID" > ~/.openclaw/workspace/projects/.active
+ fi
+ PROJECT_DIR="$HOME/.openclaw/workspace/projects/$(cat ~/.openclaw/workspace/projects/.active)"
  ```
- sessions_spawn({
-   task: `You are a literature survey expert. Run a complete literature survey for the research topic "{TOPIC}".
-
- ## Survey Goals
- {USER_REQUIREMENTS}
-
- ## Execution Flow
-
- ### Phase 1: Generate search terms
- Based on the research topic, generate 8-15 search term combinations covering:
- - Different phrasings of the core concepts
- - Related technical methods
- - Application scenarios
- - English and Chinese keywords (where applicable)
-
- Save the search terms to $WORKSPACE/survey/search_terms.json
-
- ### Phase 2: Batch search
- Use the arxiv_search tool for each search term:
- - max_results: 30-50 per query
- - Merge and deduplicate (by arxiv_id)
- - Record each paper's source search term
-
- Save the raw results to $WORKSPACE/survey/raw_results.json
 
- ### Phase 3: Relevance filtering
- Read each paper's title and abstract and judge its relevance to "{TOPIC}":
- - 5: highly relevant, directly studies this topic
- - 4: relevant, involves key methods or applications
- - 3: partially relevant, usable as a reference
- - 2: marginally relevant
- - 1: not relevant
+ Generate 4-8 search terms and save them to `survey/search_terms.json`.
 
- Keep papers with score >= 4.
- Save the filtered results to $WORKSPACE/survey/filtered_papers.json
-
- ### Phase 4: Clustering
- Analyze the abstracts of the filtered papers and identify 3-6 research directions/subtopics.
- Create a subfolder for each direction and assign the papers:
-
- $WORKSPACE/papers/
- ├── {direction-1}/
- │   ├── paper_list.md
- │   └── [arxiv_ids...]
- ├── {direction-2}/
- │   └── ...
- └── uncategorized/
+ ---
 
- Save the clustering results to $WORKSPACE/survey/clusters.json
+ ### Phase 2: Incremental Search-Filter-Download (Loop)
 
- ### Phase 5: Iterative discovery (1-2 rounds)
- Inspect the abstracts of high-scoring papers and identify:
- - Names of newly mentioned methods
- - Important cited work
- - New keywords
+ **Repeat the following steps for each search term**:
 
- If new directions are found, run supplementary searches and repeat Phases 2-4.
- Iterate at most 2 rounds.
+ #### 2.1 Search
 
- ### Phase 6: Generate report
- Create $WORKSPACE/survey/report.md:
+ ```
+ arxiv_search({ query: "<term>", max_results: 30 })
+ ```
 
- # Literature Survey Report: {TOPIC}
+ #### 2.2 Immediate Filtering
 
- ## Survey Summary
- - Number of search terms: X
- - Initial search: Y papers
- - After filtering: Z papers
- - Research directions: N
+ Score the returned papers **immediately** (1-5) and keep only those scoring ≥4.
 
- ## Research Direction Distribution
+ Scoring criteria:
+ - 5: core paper, directly studies the topic
+ - 4: related method or application
+ - 3 or below: skip
 
- ### Direction 1: [name]
- - Number of papers: X
- - Representative work: [list]
- - Key characteristics: [description]
+ #### 2.3 Download Useful Papers
 
- ### Direction 2: [name]
- ...
+ ```
+ arxiv_download({
+   arxiv_ids: ["<useful paper IDs>"],
+   output_dir: "$PROJECT_DIR/papers/_downloads"
+ })
+ ```
 
- ## High-Impact Papers (Top 10)
- | Rank | Title | Year | Relevance | Direction |
- |------|-------|------|-----------|-----------|
- | 1 | ... | ... | 5 | ... |
+ #### 2.4 Write Metadata
 
- ## Research Trends
- [Observations based on the year distribution of the papers]
+ For each downloaded paper, create a metadata file `papers/_meta/{arxiv_id}.json`:
 
- ## Newly Discovered Directions
- [Additional keywords and directions found during iteration]
+ ```json
+ {
+   "arxiv_id": "2401.12345",
+   "title": "...",
+   "abstract": "...",
+   "score": 5,
+   "source_term": "battery RUL prediction",
+   "downloaded_at": "2024-01-15T10:00:00Z"
+ }
+ ```
 
- ## Suggested Reading Order
- 1. [Introductory papers]
- 2. [Core method papers]
- 3. [Latest advances]
+ **Finish one search term before starting the next.** This keeps the context from being flooded with large batches of search results.
 
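The metadata write in step 2.4 can be sketched as a small helper. This is a sketch only: the `write_meta` name is ours, and `PROJECT_DIR` falls back to a temp directory so the sketch is self-contained; the JSON layout matches the example above.

```shell
# Hypothetical helper sketching step 2.4: write one paper's metadata file.
PROJECT_DIR="${PROJECT_DIR:-$(mktemp -d)}"

write_meta() {
  local id="$1" title="$2" score="$3" term="$4"
  mkdir -p "$PROJECT_DIR/papers/_meta"
  cat > "$PROJECT_DIR/papers/_meta/${id}.json" <<EOF
{
  "arxiv_id": "${id}",
  "title": "${title}",
  "score": ${score},
  "source_term": "${term}",
  "downloaded_at": "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
EOF
}

write_meta "2401.12345" "Example Paper" 5 "battery RUL prediction"
```

Writing one file per paper is what makes Phase 3 possible without keeping a large list in context.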
  ---
 
- After completion, report to the main session:
- - Total number of papers found
- - Identified research directions
- - Location of the report file`,
-   label: "literature-survey-{TOPIC_SLUG}",
-   runTimeoutSeconds: 900,
-   cleanup: "keep"
- })
- ```
-
- **Step 2: Wait and relay results**
+ ### Phase 3: Organize into Categories
 
- After the sub-agent finishes, it automatically announces its results to the main session.
- Show the user a summary of the results, including:
- - Number of papers found
- - Main research directions
- - Location of the report file
+ Once all search terms have been processed:
 
- ---
-
- ## Workspace Structure
+ #### 3.1 Read All Metadata
 
- ```
- ~/.openclaw/workspace/projects/{project-id}/
- ├── project.json
- ├── survey/                   # Survey process data
- │   ├── search_terms.json     # List of search terms
- │   ├── raw_results.json      # Raw search results
- │   ├── filtered_papers.json  # Filtered papers
- │   ├── clusters.json         # Clustering results
- │   ├── iterations.log        # Iteration log
- │   └── report.md             # Final report
- ├── papers/                   # Papers organized by direction
- │   ├── {direction-1}/
- │   │   ├── paper_list.md
- │   │   └── 2401.12345/       # .tex source files
- │   ├── {direction-2}/
- │   └── uncategorized/
- └── ideas/                    # Output of subsequent idea-generation
+ ```bash
+ ls $PROJECT_DIR/papers/_meta/
  ```
 
- ---
+ Read every `.json` file and aggregate the paper list.
 
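The aggregation in 3.1 can be sketched with plain `sed` and `sort`, assuming each `_meta` file uses the flat one-field-per-line JSON layout from step 2.4. `META_DIR` and the demo files below are illustrative stand-ins, not part of the skill.

```shell
# Sketch: aggregate arxiv_id and score from every metadata file,
# highest score first. META_DIR stands in for $PROJECT_DIR/papers/_meta.
META_DIR=$(mktemp -d)
# demo metadata standing in for real downloads:
printf '{\n  "arxiv_id": "2401.11111",\n  "score": 4\n}\n' > "$META_DIR/2401.11111.json"
printf '{\n  "arxiv_id": "2401.22222",\n  "score": 5\n}\n' > "$META_DIR/2401.22222.json"

for f in "$META_DIR"/*.json; do
  id=$(sed -n 's/.*"arxiv_id": *"\([^"]*\)".*/\1/p' "$f")
  score=$(sed -n 's/.*"score": *\([0-9]*\).*/\1/p' "$f")
  echo "$score $id"
done | sort -rn > "$META_DIR/summary.txt"
```

The resulting summary is small enough to read in full before clustering in 3.2.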
- ## Data Schemas
+ #### 3.2 Cluster Analysis
 
- ### search_terms.json
- ```json
- {
-   "topic": "battery life prediction",
-   "generated_at": "2024-01-15T10:00:00Z",
-   "terms": [
-     {"term": "battery remaining useful life", "category": "core"},
-     {"term": "lithium-ion degradation prediction", "category": "method"},
-     {"term": "SOH estimation neural network", "category": "technique"},
-     {"term": "EV battery health monitoring", "category": "application"}
-   ]
- }
- ```
+ Based on each paper's title, abstract, and source search term, identify 3-6 research directions.
 
- ### filtered_papers.json
- ```json
- {
-   "filtered_at": "2024-01-15T10:30:00Z",
-   "total_raw": 245,
-   "total_filtered": 42,
-   "papers": [
-     {
-       "arxiv_id": "2401.12345",
-       "title": "...",
-       "abstract": "...",
-       "authors": ["..."],
-       "published": "2024-01-15",
-       "relevance_score": 5,
-       "source_terms": ["battery RUL", "degradation prediction"],
-       "notes": "Directly studies lithium battery RUL prediction"
-     }
-   ]
- }
- ```
+ #### 3.3 Create Folders and Move
 
- ### clusters.json
- ```json
- {
-   "clustered_at": "2024-01-15T11:00:00Z",
-   "clusters": [
-     {
-       "id": "data-driven",
-       "name": "Data-driven methods",
-       "description": "Methods using machine learning/deep learning",
-       "paper_count": 15,
-       "paper_ids": ["2401.12345", "2401.12346", "..."],
-       "keywords": ["LSTM", "CNN", "transformer", "neural network"]
-     },
-     {
-       "id": "physics-based",
-       "name": "Physics-based methods",
-       "description": "Methods based on electrochemical mechanisms",
-       "paper_count": 8,
-       "paper_ids": ["..."]
-     }
-   ]
- }
+ ```bash
+ mkdir -p "$PROJECT_DIR/papers/data-driven"
+ mv "$PROJECT_DIR/papers/_downloads/2401.12345" "$PROJECT_DIR/papers/data-driven/"
  ```
 
  ---
 
- ## Quick Mode (Without Sub-agent)
+ ### Phase 4: Generate Report
 
- For smaller surveys (< 50 papers), can run directly without spawning:
+ Create `survey/report.md` with:
+ - Survey summary (number of search terms, papers, directions)
+ - Overview of each research direction
+ - Top 10 papers
+ - Suggested reading order
 
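The four report sections can be seeded as a skeleton before being filled in. A sketch only: the heading wording is illustrative, and `PROJECT_DIR` falls back to a temp directory so the snippet is self-contained.

```shell
# Sketch: seed survey/report.md with the sections listed above.
PROJECT_DIR="${PROJECT_DIR:-$(mktemp -d)}"
mkdir -p "$PROJECT_DIR/survey"
cat > "$PROJECT_DIR/survey/report.md" <<'EOF'
# Literature Survey Report

## Survey Summary
(number of search terms, papers, directions)

## Research Directions

## Top 10 Papers

## Suggested Reading Order
EOF
```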
- User: "Give me a quick survey of [topic], no more than 30 papers"
+ ---
 
- Run Phases 1-4 directly in the main session
- → Skip iteration
- → Generate a simplified report
+ ## Key Design Decisions
 
- ---
+ | Principle | Explanation |
+ |-----------|-------------|
+ | **Incremental processing** | Each search term completes search → filter → download → metadata independently, avoiding context bloat |
+ | **Metadata-driven** | Classification is based on `_meta/*.json`, not on large in-memory lists |
+ | **Folders as categories** | Clustering results are expressed as `papers/{direction}/` folders; no extra JSON needed |
 
- ## Commands
+ ## Tools
 
- - "Survey the [topic] area" → Full survey with sub-agent
- - "Quick survey of [topic]" → Quick mode, 30 papers max
- - "Continue the previous survey" → Resume from existing survey data
- - "Extend the survey to [new direction]" → Add new search terms and iterate
+ | Tool | Purpose |
+ |------|---------|
+ | `arxiv_search` | Search for papers (no side effects) |
+ | `arxiv_download` | Download .tex/.pdf (requires an absolute path) |
@@ -0,0 +1,114 @@
+ ---
+ name: research-experiment
+ description: "Full training run + ablation experiments + result analysis. Requires review PASS from /research-review."
+ metadata:
+ {
+   "openclaw":
+   {
+     "emoji": "🧪",
+     "requires": { "bins": ["python3", "uv"] },
+   },
+ }
+ ---
+
+ # Research Experiment
+
+ **Don't ask permission. Just do it.**
+
+ **Workspace:** See `../_shared/workspace-spec.md`. Set `$W` to the active project directory.
+
+ ## Prerequisites
+
+ | File | Source |
+ |------|--------|
+ | `$W/project/` | /research-implement |
+ | `$W/plan_res.md` | /research-plan |
+ | `$W/iterations/judge_v*.md` | /research-review (the latest verdict must be PASS) |
+
+ **Verify PASS:** Read the latest `judge_v*.md` and confirm `verdict: PASS`. If it is not PASS, STOP.
+
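The PASS check can be sketched as follows. The demo judge files stand in for real /research-review output, `$W` falls back to a temp directory, and the `verdict:` line format is an assumption taken from the prerequisites above.

```shell
# Sketch: locate the newest judge file and require verdict: PASS.
W="${W:-$(mktemp -d)}"
mkdir -p "$W/iterations"
# demo judge files standing in for real review output:
printf 'verdict: FAIL\n' > "$W/iterations/judge_v1.md"
printf 'verdict: PASS\n' > "$W/iterations/judge_v2.md"

latest=$(ls "$W"/iterations/judge_v*.md | sort -V | tail -1)
if grep -q '^verdict: PASS' "$latest"; then
  echo "review passed: $latest"
else
  echo "review not passed; STOP" >&2
fi
```

`sort -V` keeps `judge_v10.md` after `judge_v2.md`, so the newest iteration is always checked.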
+ ## Output
+
+ | File | Content |
+ |------|---------|
+ | `$W/experiment_res.md` | Complete experiment report |
+
+ ---
+
+ ## Workflow
+
+ ### Step 1: Full Training
+
+ Change the epoch count to the final value specified in plan_res.md. **Do not change the code logic, only the epoch count.**
+
+ ```bash
+ cd $W/project && source .venv/bin/activate
+ python run.py  # full epochs
+ ```
+
+ Record the `[RESULT]` output of the full training run.
+
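Recording the `[RESULT]` lines can be sketched by filtering a saved log. The `train.log` name and the demo log contents are our assumptions; the skill only requires that the reported values come from real execution output.

```shell
# Sketch: collect [RESULT] lines from a saved training log into the report.
W="${W:-$(mktemp -d)}"
# demo log standing in for real `python run.py 2>&1 | tee train.log` output:
printf '[RESULT] train_loss=0.42\nepoch 2/2 done\n[RESULT] val_metric=0.87\n' > "$W/train.log"
grep '^\[RESULT\]' "$W/train.log" >> "$W/experiment_res.md"
```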
+ ### Step 2: Analyze Results
+
+ Read the training output and assess:
+ - Final loss and metrics
+ - Training-curve trend (is the loss still decreasing?)
+ - Overfitting (train vs val gap)
+
+ ### Step 3: Ablation Experiments
+
+ Following the ablation plan in plan_res.md, run 2-3 ablation experiments.
+
+ For each ablation:
+ 1. Modify the code (comment out / replace the corresponding component)
+ 2. Run a quick 2-epoch validation
+ 3. Record the results
+
+ ```bash
+ # Example: remove the attention module
+ python run.py --epochs 2 --ablation no_attention
+ ```
+
+ ### Step 4: Write the Experiment Report
+
+ Write to `$W/experiment_res.md`:
+
+ ```markdown
+ # Experiment Report
+
+ ## Full Training Results (from execution log)
+ - Epochs: {N}
+ - [RESULT] train_loss={value}
+ - [RESULT] val_metric={value}
+ - [RESULT] elapsed={value}
+ - [RESULT] device={device}
+
+ > The values above come from real execution output.
+
+ ## Training Analysis
+ - Convergence: {converged / still improving / diverged}
+ - Overfitting: {yes/no, evidence}
+
+ ## Ablation Studies
+
+ | Experiment | Change | val_metric | vs Full |
+ |------------|--------|-----------|---------|
+ | Full model | — | {value} | baseline |
+ | No {component} | removed {X} | {value} | {-/+}% |
+ | ... | ... | ... | ... |
+
+ ## Conclusions
+ - {key findings}
+
+ ## Limitations
+ - {limitations and future work}
+ ```
+
+ ---
+
+ ## Rules
+
+ 1. Full training changes only the epoch count, never the code logic
+ 2. All reported numbers must come from real execution output
+ 3. Run at least 2 ablation experiments
+ 4. If full training fails (OOM, etc.), adjust batch_size and retry; do not skip it