scientify 1.2.2 → 1.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +38 -14
- package/README.zh.md +38 -15
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +21 -2
- package/dist/index.js.map +1 -1
- package/dist/src/services/auto-updater.d.ts +15 -0
- package/dist/src/services/auto-updater.d.ts.map +1 -0
- package/dist/src/services/auto-updater.js +188 -0
- package/dist/src/services/auto-updater.js.map +1 -0
- package/dist/src/tools/arxiv-download.d.ts +25 -0
- package/dist/src/tools/arxiv-download.d.ts.map +1 -0
- package/dist/src/tools/arxiv-download.js +179 -0
- package/dist/src/tools/arxiv-download.js.map +1 -0
- package/dist/src/tools/{arxiv-tool.d.ts → arxiv-search.d.ts} +11 -8
- package/dist/src/tools/arxiv-search.d.ts.map +1 -0
- package/dist/src/tools/arxiv-search.js +140 -0
- package/dist/src/tools/arxiv-search.js.map +1 -0
- package/dist/src/tools/github-search-tool.d.ts +5 -1
- package/dist/src/tools/github-search-tool.d.ts.map +1 -1
- package/dist/src/tools/github-search-tool.js +10 -30
- package/dist/src/tools/github-search-tool.js.map +1 -1
- package/dist/src/tools/result.d.ts +37 -0
- package/dist/src/tools/result.d.ts.map +1 -0
- package/dist/src/tools/result.js +39 -0
- package/dist/src/tools/result.js.map +1 -0
- package/dist/src/tools/workspace.d.ts +32 -0
- package/dist/src/tools/workspace.d.ts.map +1 -0
- package/dist/src/tools/workspace.js +69 -0
- package/dist/src/tools/workspace.js.map +1 -0
- package/openclaw.plugin.json +22 -1
- package/package.json +13 -2
- package/skills/_shared/workspace-spec.md +139 -0
- package/skills/idea-generation/SKILL.md +4 -0
- package/skills/install-scientify/SKILL.md +15 -7
- package/skills/literature-survey/SKILL.md +86 -212
- package/skills/research-experiment/SKILL.md +114 -0
- package/skills/research-implement/SKILL.md +166 -0
- package/skills/research-pipeline/SKILL.md +104 -188
- package/skills/research-plan/SKILL.md +121 -0
- package/skills/research-review/SKILL.md +110 -0
- package/skills/research-survey/SKILL.md +140 -0
- package/skills/write-review-paper/SKILL.md +4 -0
- package/dist/src/tools/arxiv-tool.d.ts.map +0 -1
- package/dist/src/tools/arxiv-tool.js +0 -258
- package/dist/src/tools/arxiv-tool.js.map +0 -1
- package/skills/research-pipeline/references/prompts/implement.md +0 -135
- package/skills/research-pipeline/references/prompts/plan.md +0 -142
- package/skills/research-pipeline/references/prompts/review.md +0 -118
- package/skills/research-pipeline/references/prompts/survey.md +0 -105
- package/skills/research-pipeline/references/workspace-spec.md +0 -81
|
@@ -0,0 +1,166 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: research-implement
|
|
3
|
+
description: "Implement ML code from plan, run 2-epoch validation, verify real results. Requires plan_res.md from /research-plan."
|
|
4
|
+
metadata:
|
|
5
|
+
{
|
|
6
|
+
"openclaw":
|
|
7
|
+
{
|
|
8
|
+
"emoji": "💻",
|
|
9
|
+
"requires": { "bins": ["python3", "uv"] },
|
|
10
|
+
},
|
|
11
|
+
}
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
# Research Implement
|
|
15
|
+
|
|
16
|
+
**Don't ask permission. Just do it.**
|
|
17
|
+
|
|
18
|
+
**Workspace:** See `../_shared/workspace-spec.md`. Set `$W` to the active project directory.
|
|
19
|
+
|
|
20
|
+
## Prerequisites
|
|
21
|
+
|
|
22
|
+
| File | Source |
|
|
23
|
+
|------|--------|
|
|
24
|
+
| `$W/plan_res.md` | /research-plan |
|
|
25
|
+
| `$W/survey_res.md` | /research-survey |
|
|
26
|
+
| `$W/repos/` (optional) | reference code |
|
|
27
|
+
|
|
28
|
+
**If `plan_res.md` is missing, STOP:** "Run /research-plan first to complete the implementation plan"
|
|
29
|
+
|
|
30
|
+
## Output
|
|
31
|
+
|
|
32
|
+
| File | Content |
|
|
33
|
+
|------|---------|
|
|
34
|
+
| `$W/project/` | Complete, runnable code |
|
|
35
|
+
| `$W/ml_res.md` | Implementation report (with real execution results) |
|
|
36
|
+
|
|
37
|
+
---
|
|
38
|
+
|
|
39
|
+
## Workflow
|
|
40
|
+
|
|
41
|
+
### Step 1: Read the plan
|
|
42
|
+
|
|
43
|
+
Read `$W/plan_res.md` and extract:
|
|
44
|
+
- The list of all components
|
|
45
|
+
- Dataset information
|
|
46
|
+
- Training parameters
|
|
47
|
+
|
|
48
|
+
### Step 2: Create the project structure
|
|
49
|
+
|
|
50
|
+
```
|
|
51
|
+
$W/project/
|
|
52
|
+
model/ # model components (one file per component)
|
|
53
|
+
data/ # data loading
|
|
54
|
+
training/ # training loop + loss
|
|
55
|
+
testing/ # evaluation
|
|
56
|
+
utils/ # utility functions
|
|
57
|
+
run.py # entry point (must print [RESULT] lines)
|
|
58
|
+
requirements.txt
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
### Step 3: Implement the code
|
|
62
|
+
|
|
63
|
+
Implement in this order, verifying each step as soon as it is complete:
|
|
64
|
+
|
|
65
|
+
**3a. requirements.txt**: list all dependencies and pin major versions
|
|
66
|
+
|
|
67
|
+
**3b. Data pipeline**
|
|
68
|
+
```bash
|
|
69
|
+
cd $W/project && uv venv .venv && source .venv/bin/activate
|
|
70
|
+
uv pip install -r requirements.txt
|
|
71
|
+
python -c "from data.dataset import *; print('data OK')"
|
|
72
|
+
```
|
|
73
|
+
Verify: the import raises no errors
|
|
74
|
+
|
|
75
|
+
**3c. Model architecture**
|
|
76
|
+
```bash
|
|
77
|
+
python -c "from model import *; import torch; x = torch.randn(2, ...); print(model(x).shape)"
|
|
78
|
+
```
|
|
79
|
+
Verify: the output shape is correct
|
|
80
|
+
|
|
81
|
+
**3d. Loss + training loop**
|
|
82
|
+
|
|
83
|
+
**3e. Evaluation logic**
|
|
84
|
+
|
|
85
|
+
**3f. run.py**: must include:
|
|
86
|
+
```python
|
|
87
|
+
print(f"[RESULT] train_loss={train_loss:.6f}")
|
|
88
|
+
print(f"[RESULT] val_metric={val_metric:.6f}")
|
|
89
|
+
print(f"[RESULT] elapsed={elapsed:.1f}s")
|
|
90
|
+
print(f"[RESULT] device={device}")
|
|
91
|
+
```
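To show where those four print statements sit, here is a minimal sketch of a `run.py` entry point. It assumes an `--epochs` flag (which Step 4 passes on the command line) and uses a toy one-parameter "model" purely as a stand-in; `train_one_epoch`, the fixed `device` string, and reusing the last training loss as `val_metric` are all hypothetical placeholders, not part of the skill.

```python
import argparse
import time


def train_one_epoch(weight: float, lr: float = 0.1) -> tuple[float, float]:
    """Toy stand-in for a real epoch: one gradient step on loss = (w - 3)^2."""
    # MOCK DATA: illustration only, there is no real dataset here
    grad = 2.0 * (weight - 3.0)
    weight -= lr * grad
    return weight, (weight - 3.0) ** 2


def main(argv=None) -> list[float]:
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=2)
    args = parser.parse_args(argv)
    if args.epochs < 1:
        parser.error("--epochs must be at least 1")

    device = "cpu"  # placeholder; a real run.py would detect the device
    start = time.time()
    weight, losses = 0.0, []
    for _ in range(args.epochs):
        weight, loss = train_one_epoch(weight)
        losses.append(loss)

    train_loss = losses[-1]
    val_metric = train_loss  # placeholder: the toy has no validation split
    elapsed = time.time() - start

    # The four lines the skill requires, in the same format
    print(f"[RESULT] train_loss={train_loss:.6f}")
    print(f"[RESULT] val_metric={val_metric:.6f}")
    print(f"[RESULT] elapsed={elapsed:.1f}s")
    print(f"[RESULT] device={device}")
    return losses
```

A real script would end with an `if __name__ == "__main__": main()` guard; it is left out here so the sketch can be imported and called as `main(["--epochs", "2"])`.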
|
|
92
|
+
|
|
93
|
+
### Step 4: Set up the environment and run
|
|
94
|
+
|
|
95
|
+
```bash
|
|
96
|
+
cd $W/project
|
|
97
|
+
uv venv .venv
|
|
98
|
+
source .venv/bin/activate
|
|
99
|
+
|
|
100
|
+
# Auto-detect the dependency format
|
|
101
|
+
if [ -f "pyproject.toml" ]; then
|
|
102
|
+
uv pip install -e .
|
|
103
|
+
elif [ -f "requirements.txt" ]; then
|
|
104
|
+
uv pip install -r requirements.txt
|
|
105
|
+
fi
|
|
106
|
+
|
|
107
|
+
# 2-epoch validation
|
|
108
|
+
python run.py --epochs 2
|
|
109
|
+
```
|
|
110
|
+
|
|
111
|
+
### Step 5: Verify the execution results
|
|
112
|
+
|
|
113
|
+
**After execution completes, you must:**
|
|
114
|
+
|
|
115
|
+
1. Read the complete stdout/stderr output
|
|
116
|
+
2. Confirm that `[RESULT]` lines are present
|
|
117
|
+
3. Confirm the loss is not NaN/Inf
|
|
118
|
+
4. Confirm the loss shows a downward trend (even a slight one)
|
|
119
|
+
|
|
120
|
+
**If execution fails:**
|
|
121
|
+
- Read the error message
|
|
122
|
+
- Fix the code
|
|
123
|
+
- Re-run
|
|
124
|
+
- Retry at most 3 times
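The first three checks in this step can be automated against the captured output. The following is a minimal sketch, assuming the `[RESULT] key=value` format from step 3f; the function names `parse_results` and `check_run` are hypothetical. It does not cover the downward-trend check, since that needs per-epoch losses and the `[RESULT]` lines only carry the final value.

```python
import math
import re

# Matches lines such as "[RESULT] train_loss=0.123456"
RESULT_RE = re.compile(r"^\[RESULT\]\s+(\w+)=(.+)$")


def parse_results(output: str) -> dict[str, str]:
    """Collect key/value pairs from every [RESULT] line in the captured output."""
    results = {}
    for line in output.splitlines():
        match = RESULT_RE.match(line.strip())
        if match:
            results[match.group(1)] = match.group(2).strip()
    return results


def check_run(output: str) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    results = parse_results(output)
    if not results:
        return ["no [RESULT] lines found"]
    problems = []
    if "train_loss" not in results:
        problems.append("train_loss missing")
    else:
        try:
            loss = float(results["train_loss"])
        except ValueError:
            problems.append("train_loss is not a number")
        else:
            # float() accepts "nan" and "inf", so test finiteness explicitly
            if not math.isfinite(loss):
                problems.append("train_loss is NaN/Inf")
    return problems
```

Returning a list of problems rather than a boolean keeps the failure reason available for the report when a retry is needed.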
|
|
125
|
+
|
|
126
|
+
### Step 6: Write the report
|
|
127
|
+
|
|
128
|
+
Write `$W/ml_res.md`:
|
|
129
|
+
|
|
130
|
+
```markdown
|
|
131
|
+
# Implementation Report
|
|
132
|
+
|
|
133
|
+
## Data Source
|
|
134
|
+
- Dataset: {name} — real / mock (reason)
|
|
135
|
+
- If mock: steps to obtain real data: [...]
|
|
136
|
+
|
|
137
|
+
## Components Implemented
|
|
138
|
+
- {module}: {description}
|
|
139
|
+
|
|
140
|
+
## Quick Validation Results (from execution log)
|
|
141
|
+
- Epochs: 2
|
|
142
|
+
- [RESULT] train_loss={copied from execution output}
|
|
143
|
+
- [RESULT] val_metric={copied from execution output}
|
|
144
|
+
- [RESULT] elapsed={copied from execution output}
|
|
145
|
+
- [RESULT] device={copied from execution output}
|
|
146
|
+
|
|
147
|
+
> The values above are quoted directly from the code's execution output.
|
|
148
|
+
> If any value cannot be verified against the execution log, mark it ⚠️ UNVERIFIED.
|
|
149
|
+
|
|
150
|
+
## Deviations from Plan
|
|
151
|
+
- {changes and why}
|
|
152
|
+
|
|
153
|
+
## Known Issues
|
|
154
|
+
- {issues}
|
|
155
|
+
```
|
|
156
|
+
|
|
157
|
+
---
|
|
158
|
+
|
|
159
|
+
## Critical Rules
|
|
160
|
+
|
|
161
|
+
1. **Never fabricate results.** Every number must come from the code's execution output. If execution fails, report the failure.
|
|
162
|
+
2. **Never use global pip.** Isolate with uv venv.
|
|
163
|
+
3. **Never import from repos/ directly**; rewrite and adapt the code instead.
|
|
164
|
+
4. **Mock data must be labeled**: `# MOCK DATA: <reason>` in the code, and declared in the report.
|
|
165
|
+
5. **run.py must print `[RESULT]` lines**, and the report must quote that output.
|
|
166
|
+
6. If it still fails after 3 retries, write a failure report and stop.
|
|
@@ -1,267 +1,183 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: research-pipeline
|
|
3
|
-
description: "
|
|
3
|
+
description: "Orchestrates the full research workflow by spawning sub-agents for each phase. Checks workspace state, dispatches tasks, verifies outputs. Use for: end-to-end ML research. Each phase runs in an isolated context via sessions_spawn."
|
|
4
4
|
metadata:
|
|
5
5
|
{
|
|
6
6
|
"openclaw":
|
|
7
7
|
{
|
|
8
8
|
"emoji": "🔬",
|
|
9
|
-
"requires": { "bins": ["git", "python3"] },
|
|
9
|
+
"requires": { "bins": ["git", "python3", "uv"] },
|
|
10
10
|
},
|
|
11
11
|
}
|
|
12
12
|
---
|
|
13
13
|
|
|
14
|
-
# Research Pipeline
|
|
14
|
+
# Research Pipeline (Orchestrator)
|
|
15
15
|
|
|
16
|
-
|
|
16
|
+
**Don't ask permission. Just do it.**
|
|
17
17
|
|
|
18
|
-
|
|
18
|
+
You are the orchestrator. You do not do the research work yourself; instead, you:
|
|
19
|
+
1. Check the state of the workspace files
|
|
20
|
+
2. Construct the task description for the next step
|
|
21
|
+
3. Dispatch it to a sub-agent with `sessions_spawn`
|
|
22
|
+
4. Wait for completion, then verify the output
|
|
23
|
+
5. Repeat until the workflow is finished
|
|
19
24
|
|
|
20
|
-
|
|
25
|
+
**Workspace:** See `../_shared/workspace-spec.md`. Set `$W` to the active project directory.
|
|
21
26
|
|
|
22
|
-
|
|
27
|
+
---
|
|
23
28
|
|
|
24
|
-
|
|
29
|
+
## Step 0: Initialization
|
|
25
30
|
|
|
26
|
-
### Check Active Project
|
|
27
31
|
```bash
|
|
28
|
-
cat ~/.openclaw/workspace/projects/.active 2>/dev/null
|
|
32
|
+
ACTIVE=$(cat ~/.openclaw/workspace/projects/.active 2>/dev/null)
|
|
29
33
|
```
|
|
30
34
|
|
|
31
|
-
|
|
35
|
+
If there is no active project:
|
|
36
|
+
1. Ask the user: what is the research topic?
|
|
37
|
+
2. Create the project directory
|
|
38
|
+
3. Write `task.json`
|
|
32
39
|
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
### Directory Structure
|
|
36
|
-
```
|
|
37
|
-
$WORKSPACE/
|
|
38
|
-
├── project.json # Project metadata
|
|
39
|
-
├── task.json # Research task/idea definition
|
|
40
|
-
├── search_results.md # Search results (Step 2)
|
|
41
|
-
├── prepare_res.md # Selected repos (Step 3)
|
|
42
|
-
├── papers/ # Downloaded papers (Step 4)
|
|
43
|
-
├── repos/ # Cloned repositories (Step 3)
|
|
44
|
-
├── notes/ # Paper notes (Step 5)
|
|
45
|
-
├── survey_res.md # Literature survey (Step 5)
|
|
46
|
-
├── plan_res.md # Implementation plan (Step 6)
|
|
47
|
-
├── project/ # Code implementation (Step 7)
|
|
48
|
-
├── ml_res.md # Implementation report (Step 7)
|
|
49
|
-
├── iterations/ # Review iterations (Step 8-9)
|
|
50
|
-
│ ├── judge_v1.md
|
|
51
|
-
│ └── ...
|
|
52
|
-
└── experiment_res.md # Final results (Step 10)
|
|
53
|
-
```
|
|
40
|
+
Set `$W = ~/.openclaw/workspace/projects/{project-id}`
|
|
54
41
|
|
|
55
42
|
---
|
|
56
43
|
|
|
57
|
-
##
|
|
58
|
-
|
|
59
|
-
Read `$WORKSPACE/task.json`. If it does not exist, ask the user for:
|
|
44
|
+
## Dispatch loop
|
|
60
45
|
|
|
61
|
-
|
|
62
|
-
- **references** (optional): ArXiv IDs or paper titles as starting points.
|
|
63
|
-
- **domain** (optional): e.g. "recommendation systems", "NLP", "computer vision".
|
|
46
|
+
Check each phase in order. **Run only one phase at a time.**
|
|
64
47
|
|
|
65
|
-
|
|
48
|
+
### Phase 1: Literature Survey
|
|
66
49
|
|
|
67
|
-
|
|
68
|
-
{
|
|
69
|
-
"idea": "...",
|
|
70
|
-
"references": ["2401.12345", "..."],
|
|
71
|
-
"domain": "...",
|
|
72
|
-
"date_limit": "2024-01-01"
|
|
73
|
-
}
|
|
74
|
-
```
|
|
50
|
+
**Check:** does the `$W/papers/_meta/` directory exist and contain `.json` files?
|
|
75
51
|
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
## Step 2: Search
|
|
79
|
-
|
|
80
|
-
Use the `arxiv` tool to search for 5-10 related papers based on the idea and any reference paper titles. Use the `github_search` tool to find related repositories.
|
|
81
|
-
|
|
82
|
-
Combine results into a markdown report:
|
|
52
|
+
**If missing, spawn:**
|
|
83
53
|
|
|
84
54
|
```
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
|
|
88
|
-
|
|
89
|
-
- [repo_name](url) — stars — language — summary of relevance
|
|
55
|
+
sessions_spawn({
|
|
56
|
+
task: "Working directory: $W\nRun the /literature-survey skill\n\nResearch topic: {extracted from task.json}\nSearch for, filter, and download relevant papers to $W/papers/",
|
|
57
|
+
label: "Literature Survey"
|
|
58
|
+
})
|
|
90
59
|
```
|
|
91
60
|
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
## Step 3: Prepare References
|
|
95
|
-
|
|
96
|
-
Read `$WORKSPACE/search_results.md`. Select 3-5 of the most relevant repositories.
|
|
61
|
+
**Verify:** `ls $W/papers/_meta/*.json` lists at least 3 files
|
|
97
62
|
|
|
98
|
-
|
|
99
|
-
|
|
100
|
-
```bash
|
|
101
|
-
git clone --depth 1 <url> $WORKSPACE/repos/<repo_name>
|
|
102
|
-
```
|
|
103
|
-
|
|
104
|
-
Write a summary of selected repos and their relevance to the idea.
|
|
105
|
-
|
|
106
|
-
**Output:** `$WORKSPACE/prepare_res.md`
|
|
107
|
-
|
|
108
|
-
## Step 4: Download Papers
|
|
109
|
-
|
|
110
|
-
For each important paper from Step 2, use the `arxiv` tool with `download: true` and `output_dir: "$WORKSPACE/papers/"` to get .tex source files.
|
|
111
|
-
|
|
112
|
-
If download fails for any paper, note the failure and continue. The survey step can work with abstracts alone.
|
|
63
|
+
---
|
|
113
64
|
|
|
114
|
-
|
|
65
|
+
### Phase 2: Deep Survey
|
|
115
66
|
|
|
116
|
-
|
|
67
|
+
**Check:** does `$W/survey_res.md` exist?
|
|
117
68
|
|
|
118
|
-
|
|
69
|
+
**If missing, first read the Phase 1 summary, then spawn:**
|
|
119
70
|
|
|
120
|
-
|
|
71
|
+
```
|
|
72
|
+
sessions_spawn({
|
|
73
|
+
task: "Working directory: $W\nRun the /research-survey skill\n\nContext: {N} papers downloaded, covering directions {directions}\nAnalyze the papers in depth, extract formulas, and write survey_res.md",
|
|
74
|
+
label: "Deep Survey"
|
|
75
|
+
})
|
|
76
|
+
```
|
|
121
77
|
|
|
122
|
-
|
|
123
|
-
2. Extract: core method, mathematical formulas, key contributions.
|
|
124
|
-
3. Read the corresponding reference codebase in `$WORKSPACE/repos/`.
|
|
125
|
-
4. Map math formulas to code implementations.
|
|
126
|
-
5. Write structured notes to `$WORKSPACE/notes/paper_NNN.md`.
|
|
78
|
+
**Verify:** `$W/survey_res.md` exists and contains the "核心方法对比" (core method comparison) table
|
|
127
79
|
|
|
128
|
-
|
|
80
|
+
---
|
|
129
81
|
|
|
130
|
-
|
|
131
|
-
# [Paper Title]
|
|
82
|
+
### Phase 3: Implementation Plan
|
|
132
83
|
|
|
133
|
-
|
|
134
|
-
...
|
|
84
|
+
**Check:** does `$W/plan_res.md` exist?
|
|
135
85
|
|
|
136
|
-
|
|
137
|
-
...
|
|
86
|
+
**If missing, read a summary of survey_res.md, then spawn:**
|
|
138
87
|
|
|
139
|
-
## Code Implementation
|
|
140
|
-
File: repos/<repo>/path/to/file.py
|
|
141
|
-
```python
|
|
142
|
-
# relevant code excerpt
|
|
143
88
|
```
|
|
144
|
-
|
|
145
|
-
|
|
146
|
-
|
|
89
|
+
sessions_spawn({
|
|
90
|
+
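task: "Working directory: $W\nRun the /research-plan skill\n\nContext: the survey found the core method to be {method}; recommended technical route {route}\nWrite a complete implementation plan to plan_res.md",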
|
|
91
|
+
label: "Research Plan"
|
|
92
|
+
})
|
|
147
93
|
```
|
|
148
94
|
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
**Output:** `$WORKSPACE/notes/paper_*.md` + `$WORKSPACE/survey_res.md`
|
|
152
|
-
|
|
153
|
-
## Step 6: Implementation Plan
|
|
154
|
-
|
|
155
|
-
Read `references/prompts/plan.md` for detailed guidance.
|
|
95
|
+
**Verify:** `$W/plan_res.md` exists and contains 4 sections (Dataset/Model/Training/Testing)
|
|
156
96
|
|
|
157
|
-
|
|
158
|
-
|
|
159
|
-
1. **Dataset Plan**: data source, loading pipeline, preprocessing, dataloader design.
|
|
160
|
-
2. **Model Plan**: architecture, math formulas to implement, reference code to adapt.
|
|
161
|
-
3. **Training Plan**: loss functions, optimizer, hyperparameters, monitoring.
|
|
162
|
-
4. **Testing Plan**: metrics, evaluation protocol, baselines.
|
|
163
|
-
|
|
164
|
-
**Output:** `$WORKSPACE/plan_res.md`
|
|
97
|
+
---
|
|
165
98
|
|
|
166
|
-
|
|
99
|
+
### Phase 4: Implementation
|
|
167
100
|
|
|
168
|
-
|
|
101
|
+
**Check:** does `$W/ml_res.md` exist?
|
|
169
102
|
|
|
170
|
-
|
|
103
|
+
**If missing, read the key points of plan_res.md, then spawn:**
|
|
171
104
|
|
|
172
105
|
```
|
|
173
|
-
|
|
174
|
-
|
|
175
|
-
|
|
176
|
-
|
|
177
|
-
testing/ # evaluation scripts
|
|
178
|
-
utils/ # shared utilities
|
|
179
|
-
run.py # main entry point
|
|
180
|
-
requirements.txt
|
|
106
|
+
sessions_spawn({
|
|
107
|
+
task: "Working directory: $W\nRun the /research-implement skill\n\nContext:\n- The plan contains {N} components: {list}\n- Dataset: {dataset}\n- Framework: PyTorch\nImplement the code in $W/project/, run a 2-epoch validation, and write ml_res.md",
|
|
108
|
+
label: "Research Implement"
|
|
109
|
+
})
|
|
181
110
|
```
|
|
182
111
|
|
|
183
|
-
|
|
112
|
+
**Verify:**
|
|
113
|
+
- `$W/project/run.py` exists
|
|
114
|
+
- `$W/ml_res.md` contains `[RESULT]` lines
|
|
115
|
+
- the loss value is not NaN/Inf
|
|
184
116
|
|
|
185
|
-
|
|
186
|
-
- Implement EVERY component from `plan_res.md`.
|
|
187
|
-
- Use real datasets, not toy data.
|
|
188
|
-
- First run: 2 epochs only (quick validation).
|
|
189
|
-
|
|
190
|
-
Execute:
|
|
191
|
-
|
|
192
|
-
```bash
|
|
193
|
-
cd $WORKSPACE/project && pip install -r requirements.txt && python run.py --epochs 2
|
|
194
|
-
```
|
|
117
|
+
---
|
|
195
118
|
|
|
196
|
-
|
|
119
|
+
### Phase 5: Review
|
|
197
120
|
|
|
198
|
-
|
|
121
|
+
**Check:** is the verdict of the latest `judge_v*.md` under `$W/iterations/` PASS?
|
|
199
122
|
|
|
200
|
-
|
|
123
|
+
**If there is no PASS, spawn:**
|
|
201
124
|
|
|
202
|
-
|
|
125
|
+
```
|
|
126
|
+
sessions_spawn({
|
|
127
|
+
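task: "Working directory: $W\nRun the /research-review skill\n\nContext:\n- Implementation report: ml_res.md shows train_loss={value}\n- The plan is in plan_res.md\nReview the code; if changes are needed, iterate on fixes (at most 3 rounds)",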
|
|
128
|
+
label: "Research Review"
|
|
129
|
+
})
|
|
130
|
+
```
|
|
203
131
|
|
|
204
|
-
|
|
132
|
+
**Verify:** the latest `judge_v*.md` contains `verdict: PASS` or `verdict: BLOCKED`
|
|
205
133
|
|
|
206
|
-
|
|
207
|
-
- The plan from `plan_res.md`: are all components present?
|
|
208
|
-
- Code quality: no toy implementations, proper error handling, correct data pipeline.
|
|
134
|
+
If BLOCKED → report to the user and wait for instructions
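Finding the "latest" review file is easy to get wrong with a plain string sort, because `judge_v10.md` sorts before `judge_v9.md`. The sketch below compares the numeric suffix instead. It assumes the files are named `judge_vN.md` and contain a `verdict: <WORD>` line as described above; the function name `latest_verdict` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

JUDGE_RE = re.compile(r"^judge_v(\d+)\.md$")
VERDICT_RE = re.compile(r"verdict:\s*(\w+)", re.IGNORECASE)


def latest_verdict(iterations_dir: Path) -> Optional[str]:
    """Return the verdict from the highest-numbered judge_vN.md, or None."""
    candidates = []
    if iterations_dir.is_dir():
        for path in iterations_dir.iterdir():
            match = JUDGE_RE.match(path.name)
            if match:
                # Keep the version as an int so v10 ranks above v9
                candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    _, newest = max(candidates, key=lambda item: item[0])
    found = VERDICT_RE.search(newest.read_text(encoding="utf-8"))
    return found.group(1).upper() if found else None
```

A result of `"PASS"` lets the orchestrator move on, `"BLOCKED"` means stop and report to the user, and `None` or any other value means the review phase still needs to run.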
|
|
209
135
|
|
|
210
|
-
|
|
136
|
+
---
|
|
211
137
|
|
|
212
|
-
|
|
213
|
-
# Review v1
|
|
138
|
+
### Phase 6: Full Experiment
|
|
214
139
|
|
|
215
|
-
|
|
140
|
+
**Check:** does `$W/experiment_res.md` exist?
|
|
216
141
|
|
|
217
|
-
|
|
218
|
-
- [ ] Dataset loading matches plan
|
|
219
|
-
- [ ] Model architecture matches formulas
|
|
220
|
-
- [ ] Loss function correct
|
|
221
|
-
- [ ] Training loop proper
|
|
222
|
-
- [ ] Evaluation metrics correct
|
|
142
|
+
**If missing, spawn:**
|
|
223
143
|
|
|
224
|
-
|
|
225
|
-
|
|
226
|
-
|
|
144
|
+
```
|
|
145
|
+
sessions_spawn({
|
|
146
|
+
task: "Working directory: $W\nRun the /research-experiment skill\n\nContext:\n- Review PASS, code verified\n- plan_res.md specifies the full epochs\nRun the full training plus ablation experiments, and write experiment_res.md",
|
|
147
|
+
label: "Research Experiment"
|
|
148
|
+
})
|
|
227
149
|
```
|
|
228
150
|
|
|
229
|
-
|
|
230
|
-
|
|
231
|
-
## Step 9: Iterate
|
|
232
|
-
|
|
233
|
-
If the review verdict is `NEEDS_REVISION`:
|
|
234
|
-
|
|
235
|
-
1. Read `$WORKSPACE/iterations/judge_vN.md` for the latest suggestions.
|
|
236
|
-
2. Fix each issue in `$WORKSPACE/project/`.
|
|
237
|
-
3. Re-run the 2-epoch validation.
|
|
238
|
-
4. Write a new review to `$WORKSPACE/iterations/judge_v(N+1).md`.
|
|
239
|
-
5. Repeat until `PASS` or 3 iterations reached.
|
|
151
|
+
**Verify:** `$W/experiment_res.md` contains `[RESULT]` lines and an ablation table
|
|
240
152
|
|
|
241
|
-
|
|
153
|
+
---
|
|
242
154
|
|
|
243
|
-
|
|
155
|
+
## Completion
|
|
244
156
|
|
|
245
|
-
|
|
157
|
+
Once every phase has passed verification, print the final summary:
|
|
246
158
|
|
|
247
|
-
|
|
159
|
+
```
|
|
160
|
+
Research workflow complete!
|
|
161
|
+
- Papers: {N} analyzed
|
|
162
|
+
- Code: $W/project/
|
|
163
|
+
- Results: $W/experiment_res.md
|
|
164
|
+
- Review: $W/iterations/ ({N} rounds)
|
|
165
|
+
```
|
|
248
166
|
|
|
249
|
-
|
|
250
|
-
2. Execute full training run.
|
|
251
|
-
3. Collect and analyze results.
|
|
167
|
+
---
|
|
252
168
|
|
|
253
|
-
|
|
169
|
+
## Context-bridging rules
|
|
254
170
|
|
|
255
|
-
|
|
171
|
+
Before every spawn, the orchestrator must:
|
|
172
|
+
1. **Read** the output file of the previous step
|
|
173
|
+
2. **Summarize** the key information in 2-5 lines (do not copy the full text)
|
|
174
|
+
3. **Write** it into the "Context" section of the spawn task
|
|
256
175
|
|
|
257
|
-
|
|
176
|
+
This gives the sub-agent enough information to start without polluting its context with the full output of earlier steps.
|
|
258
177
|
|
|
259
178
|
## Recovery
|
|
260
179
|
|
|
261
|
-
|
|
262
|
-
|
|
263
|
-
|
|
264
|
-
|
|
265
|
-
3. Resume from the first missing output file.
|
|
266
|
-
|
|
267
|
-
Never re-do a step whose output file already exists unless the user explicitly asks.
|
|
180
|
+
If the orchestrator is interrupted:
|
|
181
|
+
1. Re-run /research-pipeline
|
|
182
|
+
2. The orchestrator automatically checks all files and skips completed phases
|
|
183
|
+
3. It resumes from the first missing output file
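Because each phase is gated on a file, resuming amounts to walking the phases in order and stopping at the first one whose output is absent. Below is a minimal sketch of that lookup, assuming the file names and spawn labels used in the phases above; `next_phase` and `has_paper_meta` are hypothetical names. The Review phase is deliberately left out, since it is gated on a verdict inside a file rather than on a file's existence.

```python
from pathlib import Path
from typing import Optional


def has_paper_meta(w: Path) -> bool:
    """Phase 1 check: papers/_meta/ exists and holds at least one .json file."""
    meta = w / "papers" / "_meta"
    return meta.is_dir() and any(meta.glob("*.json"))


# (spawn label, "is this phase already done?") in pipeline order.
# Review is omitted: it depends on the verdict in the latest judge_v*.md.
PHASES = [
    ("Literature Survey", has_paper_meta),
    ("Deep Survey", lambda w: (w / "survey_res.md").is_file()),
    ("Research Plan", lambda w: (w / "plan_res.md").is_file()),
    ("Research Implement", lambda w: (w / "ml_res.md").is_file()),
    ("Research Experiment", lambda w: (w / "experiment_res.md").is_file()),
]


def next_phase(w: Path) -> Optional[str]:
    """Return the label of the first phase whose output is missing, else None."""
    for label, done in PHASES:
        if not done(w):
            return label
    return None
```

Keeping the checks in one ordered table means a fresh start and a recovery run follow exactly the same code path.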
|
|
@@ -0,0 +1,121 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: research-plan
|
|
3
|
+
description: "Create a structured implementation plan from survey results. Produces dataset/model/training/testing plans. Requires survey_res.md from /research-survey."
|
|
4
|
+
metadata:
|
|
5
|
+
{
|
|
6
|
+
"openclaw":
|
|
7
|
+
{
|
|
8
|
+
"emoji": "📋",
|
|
9
|
+
},
|
|
10
|
+
}
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# Research Plan
|
|
14
|
+
|
|
15
|
+
**Don't ask permission. Just do it.**
|
|
16
|
+
|
|
17
|
+
**Workspace:** See `../_shared/workspace-spec.md`. Set `$W` to the active project directory.
|
|
18
|
+
|
|
19
|
+
## Prerequisites
|
|
20
|
+
|
|
21
|
+
| File | Source |
|
|
22
|
+
|------|--------|
|
|
23
|
+
| `$W/task.json` | /research-pipeline or user |
|
|
24
|
+
| `$W/survey_res.md` | /research-survey |
|
|
25
|
+
| `$W/notes/paper_*.md` | /research-survey |
|
|
26
|
+
| `$W/repos/` (optional) | git clone |
|
|
27
|
+
|
|
28
|
+
**If `survey_res.md` is missing, STOP:** "Run /research-survey first to complete the deep analysis"
|
|
29
|
+
|
|
30
|
+
## Output
|
|
31
|
+
|
|
32
|
+
| File | Content |
|
|
33
|
+
|------|---------|
|
|
34
|
+
| `$W/plan_res.md` | Four-part implementation plan |
|
|
35
|
+
|
|
36
|
+
---
|
|
37
|
+
|
|
38
|
+
## Workflow
|
|
39
|
+
|
|
40
|
+
### Step 1: Read the context
|
|
41
|
+
|
|
42
|
+
Read the following files to understand the research goal and technical approach:
|
|
43
|
+
- `$W/task.json`: research goal
|
|
44
|
+
- `$W/survey_res.md`: recommended technical route and core formulas
|
|
45
|
+
- Browse the directory structure of `$W/repos/` (if present)
|
|
46
|
+
|
|
47
|
+
### Step 2: Draft the four-part plan
|
|
48
|
+
|
|
49
|
+
Write `$W/plan_res.md`:
|
|
50
|
+
|
|
51
|
+
```markdown
|
|
52
|
+
# Implementation Plan
|
|
53
|
+
|
|
54
|
+
## 1. Dataset Plan
|
|
55
|
+
|
|
56
|
+
- **Dataset name:** {name}
|
|
57
|
+
- **Source:** {URL or description}
|
|
58
|
+
- **Size:** {samples / size}
|
|
59
|
+
- **Preprocessing steps:**
|
|
60
|
+
1. {step}
|
|
61
|
+
2. {step}
|
|
62
|
+
- **DataLoader design:**
|
|
63
|
+
- batch_size: {value}
|
|
64
|
+
- Input format: {shape}
|
|
65
|
+
- Output format: {shape}
|
|
66
|
+
|
|
67
|
+
## 2. Model Plan
|
|
68
|
+
|
|
69
|
+
- **Architecture overview:** {1-2 sentences}
|
|
70
|
+
- **Component list:**
|
|
71
|
+
|
|
72
|
+
| Component | Formula | Reference code | Input → Output |
|
|
73
|
+
|------|----------|----------|-------------|
|
|
74
|
+
| {component} | $formula$ | `repos/xxx/file.py` | {shape} → {shape} |
|
|
75
|
+
|
|
76
|
+
- **Estimated parameter count:** {approximate}
|
|
77
|
+
|
|
78
|
+
## 3. Training Plan
|
|
79
|
+
|
|
80
|
+
- **Loss function:** {formula + description}
|
|
81
|
+
- **Optimizer:** {Adam/SGD/...}, lr={value}
|
|
82
|
+
- **Scheduler:** {if any}
|
|
83
|
+
- **Training parameters:**
|
|
84
|
+
- epochs (validation): 2
|
|
85
|
+
- epochs (full): {value}
|
|
86
|
+
- batch_size: {value}
|
|
87
|
+
- **Monitored metrics:** {loss, metrics to log}
|
|
88
|
+
|
|
89
|
+
## 4. Testing Plan
|
|
90
|
+
|
|
91
|
+
- **Evaluation metrics:**
|
|
92
|
+
|
|
93
|
+
| Metric | Formula/description | Expected range |
|
|
94
|
+
|--------|-----------|----------|
|
|
95
|
+
| {metric} | {description} | {range} |
|
|
96
|
+
|
|
97
|
+
- **Baselines:** {what to compare against}
|
|
98
|
+
- **Ablation experiments (preliminary plan):**
|
|
99
|
+
1. {ablation 1}
|
|
100
|
+
2. {ablation 2}
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
### Step 3: Self-check
|
|
104
|
+
|
|
105
|
+
Verify that the plan is complete:
|
|
106
|
+
- [ ] Every model component has a corresponding formula
|
|
107
|
+
- [ ] The dataset has a concrete way to obtain it
|
|
108
|
+
- [ ] The loss function has a mathematical definition
|
|
109
|
+
- [ ] The evaluation metrics are clearly defined
|
|
110
|
+
- [ ] Training parameters are reasonable (e.g. not lr=0.1 for Adam)
|
|
111
|
+
|
|
112
|
+
If anything is uncertain, mark it in the plan as `⚠️ TODO: {reason}`
|
|
113
|
+
|
|
114
|
+
---
|
|
115
|
+
|
|
116
|
+
## Rules
|
|
117
|
+
|
|
118
|
+
1. Every component in the plan must be traceable to a formula or method in survey_res.md
|
|
119
|
+
2. Do not write a "generic" plan: every parameter needs a concrete value or a reasonable estimate
|
|
120
|
+
3. If reference repositories exist, the component table must include reference code paths
|
|
121
|
+
4. plan_res.md counts as complete when all four sections exist and are non-empty
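The completion criterion in rule 4 is mechanical enough to check in code. The sketch below assumes the level-2 headings from the template in Step 2 (`## 1. Dataset Plan` through `## 4. Testing Plan`); the function name `missing_sections` is hypothetical, and "non-empty" is interpreted here as "has any non-whitespace text before the next level-2 heading".

```python
import re

REQUIRED = ("Dataset Plan", "Model Plan", "Training Plan", "Testing Plan")


def missing_sections(plan_text: str) -> list[str]:
    """Return required section titles that are absent or have an empty body."""
    # With one capture group, re.split yields
    # [preamble, title1, body1, title2, body2, ...]
    parts = re.split(r"^##[ \t]+(.+)$", plan_text, flags=re.MULTILINE)
    bodies = {}
    for title, body in zip(parts[1::2], parts[2::2]):
        # Drop a leading number such as "1. " from the heading
        name = re.sub(r"^\d+\.\s*", "", title.strip())
        bodies[name] = body.strip()
    return [name for name in REQUIRED if not bodies.get(name)]
```

An empty return value means all four sections are present with content; anything else names the sections that still need work.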
|