scientify 2.1.0 → 3.0.0
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +1 -76
- package/dist/index.js.map +1 -1
- package/dist/src/cli/research.d.ts.map +1 -1
- package/dist/src/cli/research.js +6 -23
- package/dist/src/cli/research.js.map +1 -1
- package/dist/src/commands/metabolism-status.d.ts.map +1 -1
- package/dist/src/commands/metabolism-status.js +5 -25
- package/dist/src/commands/metabolism-status.js.map +1 -1
- package/dist/src/commands.d.ts +8 -8
- package/dist/src/commands.d.ts.map +1 -1
- package/dist/src/commands.js +91 -262
- package/dist/src/commands.js.map +1 -1
- package/dist/src/templates/bootstrap.d.ts.map +1 -1
- package/dist/src/templates/bootstrap.js +45 -59
- package/dist/src/templates/bootstrap.js.map +1 -1
- package/dist/src/types.d.ts +2 -10
- package/dist/src/types.d.ts.map +1 -1
- package/openclaw.plugin.json +3 -16
- package/package.json +2 -3
- package/skills/idea-generation/SKILL.md +20 -44
- package/skills/idea-generation/references/code-mapping.md +3 -3
- package/skills/idea-generation/references/idea-template.md +1 -1
- package/skills/idea-generation/references/reading-long-papers.md +3 -3
- package/skills/metabolism/SKILL.md +80 -36
- package/skills/paper-download/SKILL.md +61 -0
- package/skills/research-collect/SKILL.md +41 -111
- package/skills/research-experiment/SKILL.md +11 -12
- package/skills/research-implement/SKILL.md +10 -11
- package/skills/research-pipeline/SKILL.md +23 -31
- package/skills/research-plan/SKILL.md +7 -11
- package/skills/research-review/SKILL.md +21 -22
- package/skills/research-survey/SKILL.md +11 -25
- package/skills/write-review-paper/SKILL.md +12 -12
- package/skills/write-review-paper/references/note-template.md +1 -1
- package/skills/write-review-paper/references/survey-template.md +1 -1
- package/dist/src/hooks/research-mode.d.ts +0 -22
- package/dist/src/hooks/research-mode.d.ts.map +0 -1
- package/dist/src/hooks/research-mode.js +0 -35
- package/dist/src/hooks/research-mode.js.map +0 -1
- package/dist/src/hooks/scientify-cron-autofill.d.ts +0 -15
- package/dist/src/hooks/scientify-cron-autofill.d.ts.map +0 -1
- package/dist/src/hooks/scientify-cron-autofill.js +0 -156
- package/dist/src/hooks/scientify-cron-autofill.js.map +0 -1
- package/dist/src/hooks/scientify-signature.d.ts +0 -21
- package/dist/src/hooks/scientify-signature.d.ts.map +0 -1
- package/dist/src/hooks/scientify-signature.js +0 -150
- package/dist/src/hooks/scientify-signature.js.map +0 -1
- package/dist/src/knowledge-state/project.d.ts +0 -13
- package/dist/src/knowledge-state/project.d.ts.map +0 -1
- package/dist/src/knowledge-state/project.js +0 -88
- package/dist/src/knowledge-state/project.js.map +0 -1
- package/dist/src/knowledge-state/render.d.ts +0 -63
- package/dist/src/knowledge-state/render.d.ts.map +0 -1
- package/dist/src/knowledge-state/render.js +0 -368
- package/dist/src/knowledge-state/render.js.map +0 -1
- package/dist/src/knowledge-state/store.d.ts +0 -19
- package/dist/src/knowledge-state/store.d.ts.map +0 -1
- package/dist/src/knowledge-state/store.js +0 -978
- package/dist/src/knowledge-state/store.js.map +0 -1
- package/dist/src/knowledge-state/types.d.ts +0 -182
- package/dist/src/knowledge-state/types.d.ts.map +0 -1
- package/dist/src/knowledge-state/types.js +0 -2
- package/dist/src/knowledge-state/types.js.map +0 -1
- package/dist/src/literature/subscription-state.d.ts +0 -112
- package/dist/src/literature/subscription-state.d.ts.map +0 -1
- package/dist/src/literature/subscription-state.js +0 -696
- package/dist/src/literature/subscription-state.js.map +0 -1
- package/dist/src/research-subscriptions/constants.d.ts +0 -16
- package/dist/src/research-subscriptions/constants.d.ts.map +0 -1
- package/dist/src/research-subscriptions/constants.js +0 -59
- package/dist/src/research-subscriptions/constants.js.map +0 -1
- package/dist/src/research-subscriptions/cron-client.d.ts +0 -8
- package/dist/src/research-subscriptions/cron-client.d.ts.map +0 -1
- package/dist/src/research-subscriptions/cron-client.js +0 -81
- package/dist/src/research-subscriptions/cron-client.js.map +0 -1
- package/dist/src/research-subscriptions/delivery.d.ts +0 -10
- package/dist/src/research-subscriptions/delivery.d.ts.map +0 -1
- package/dist/src/research-subscriptions/delivery.js +0 -82
- package/dist/src/research-subscriptions/delivery.js.map +0 -1
- package/dist/src/research-subscriptions/handlers.d.ts +0 -6
- package/dist/src/research-subscriptions/handlers.d.ts.map +0 -1
- package/dist/src/research-subscriptions/handlers.js +0 -204
- package/dist/src/research-subscriptions/handlers.js.map +0 -1
- package/dist/src/research-subscriptions/parse.d.ts +0 -11
- package/dist/src/research-subscriptions/parse.d.ts.map +0 -1
- package/dist/src/research-subscriptions/parse.js +0 -492
- package/dist/src/research-subscriptions/parse.js.map +0 -1
- package/dist/src/research-subscriptions/prompt.d.ts +0 -5
- package/dist/src/research-subscriptions/prompt.d.ts.map +0 -1
- package/dist/src/research-subscriptions/prompt.js +0 -347
- package/dist/src/research-subscriptions/prompt.js.map +0 -1
- package/dist/src/research-subscriptions/types.d.ts +0 -66
- package/dist/src/research-subscriptions/types.d.ts.map +0 -1
- package/dist/src/research-subscriptions/types.js +0 -2
- package/dist/src/research-subscriptions/types.js.map +0 -1
- package/dist/src/research-subscriptions.d.ts +0 -2
- package/dist/src/research-subscriptions.d.ts.map +0 -1
- package/dist/src/research-subscriptions.js +0 -2
- package/dist/src/research-subscriptions.js.map +0 -1
- package/dist/src/services/auto-updater.d.ts +0 -15
- package/dist/src/services/auto-updater.d.ts.map +0 -1
- package/dist/src/services/auto-updater.js +0 -188
- package/dist/src/services/auto-updater.js.map +0 -1
- package/dist/src/tools/arxiv-download.d.ts +0 -24
- package/dist/src/tools/arxiv-download.d.ts.map +0 -1
- package/dist/src/tools/arxiv-download.js +0 -177
- package/dist/src/tools/arxiv-download.js.map +0 -1
- package/dist/src/tools/github-search-tool.d.ts +0 -25
- package/dist/src/tools/github-search-tool.d.ts.map +0 -1
- package/dist/src/tools/github-search-tool.js +0 -114
- package/dist/src/tools/github-search-tool.js.map +0 -1
- package/dist/src/tools/openreview-lookup.d.ts +0 -31
- package/dist/src/tools/openreview-lookup.d.ts.map +0 -1
- package/dist/src/tools/openreview-lookup.js +0 -414
- package/dist/src/tools/openreview-lookup.js.map +0 -1
- package/dist/src/tools/paper-browser.d.ts +0 -23
- package/dist/src/tools/paper-browser.d.ts.map +0 -1
- package/dist/src/tools/paper-browser.js +0 -121
- package/dist/src/tools/paper-browser.js.map +0 -1
- package/dist/src/tools/scientify-cron.d.ts +0 -63
- package/dist/src/tools/scientify-cron.d.ts.map +0 -1
- package/dist/src/tools/scientify-cron.js +0 -265
- package/dist/src/tools/scientify-cron.js.map +0 -1
- package/dist/src/tools/scientify-literature-state.d.ts +0 -303
- package/dist/src/tools/scientify-literature-state.d.ts.map +0 -1
- package/dist/src/tools/scientify-literature-state.js +0 -957
- package/dist/src/tools/scientify-literature-state.js.map +0 -1
- package/dist/src/tools/unpaywall-download.d.ts +0 -21
- package/dist/src/tools/unpaywall-download.d.ts.map +0 -1
- package/dist/src/tools/unpaywall-download.js +0 -169
- package/dist/src/tools/unpaywall-download.js.map +0 -1
- package/dist/src/tools/workspace.d.ts +0 -32
- package/dist/src/tools/workspace.d.ts.map +0 -1
- package/dist/src/tools/workspace.js +0 -69
- package/dist/src/tools/workspace.js.map +0 -1
- package/skills/metabolism-init/SKILL.md +0 -80
- package/skills/research-subscription/SKILL.md +0 -119
|
@@ -14,24 +14,15 @@ metadata:
|
|
|
14
14
|
|
|
15
15
|
**Don't ask permission. Just do it.**
|
|
16
16
|
|
|
17
|
-
**Workspace:** `$W` = working directory provided in task parameter.
|
|
18
|
-
|
|
19
17
|
## Output Structure
|
|
20
18
|
|
|
21
19
|
```
|
|
22
|
-
$W/
|
|
23
|
-
├── survey/
|
|
24
|
-
│ ├── search_terms.json # search term list
|
|
25
|
-
│ └── report.md # final report
|
|
26
20
|
├── papers/
|
|
27
|
-
│ ├──
|
|
28
|
-
│ ├──
|
|
29
|
-
│
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
│ ├── {repo_name_1}/
|
|
33
|
-
│ └── {repo_name_2}/
|
|
34
|
-
└── prepare_res.md # repository selection report (Phase 3)
|
|
21
|
+
│ ├── {arxiv_id}/ # arXiv paper source files
|
|
22
|
+
│ ├── {doi_slug}.pdf # PDF of a DOI paper
|
|
23
|
+
│ └── {direction}/ # category directory after organizing
|
|
24
|
+
├── repos/ # reference code repositories (Phase 3)
|
|
25
|
+
└── survey_report.md # survey report
|
|
35
26
|
```
|
|
36
27
|
|
|
37
28
|
---
|
|
@@ -40,13 +31,11 @@ $W/
|
|
|
40
31
|
|
|
41
32
|
### Phase 1: Preparation
|
|
42
33
|
|
|
43
|
-
Ensure the working directory structure exists:
|
|
44
|
-
|
|
45
34
|
```bash
|
|
46
|
-
mkdir -p "
|
|
35
|
+
mkdir -p "papers"
|
|
47
36
|
```
|
|
48
37
|
|
|
49
|
-
Generate 4-8
|
|
38
|
+
Generate 4-8 search terms.
|
|
50
39
|
|
|
51
40
|
---
|
|
52
41
|
|
|
@@ -58,40 +47,21 @@ mkdir -p "$W/survey" "$W/papers/_downloads" "$W/papers/_meta"
|
|
|
58
47
|
|
|
59
48
|
```
|
|
60
49
|
arxiv_search({ query: "<term>", max_results: 30 })
|
|
50
|
+
openalex_search({ query: "<term>", max_results: 20 })
|
|
61
51
|
```
|
|
62
52
|
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
Score the returned papers **immediately** (1-5) and keep only those scoring ≥4.
|
|
53
|
+
Merge the results from both sources and deduplicate by arXiv ID / DOI.
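A minimal sketch of the merge-and-deduplicate step, assuming each search result is a dict that may carry an `arxiv_id` and/or a `doi` field. Those field names, and the helpers `paper_keys` and `merge_results`, are hypothetical and are not taken from the package.

```python
import re


def paper_keys(paper):
    """Return the set of identity keys for one result (hypothetical fields)."""
    keys = set()
    arxiv_id = (paper.get("arxiv_id") or "").strip()
    if arxiv_id:
        # Strip a trailing version suffix such as "v2" so versions collapse.
        keys.add(("arxiv", re.sub(r"v\d+$", "", arxiv_id)))
    doi = (paper.get("doi") or "").strip().lower()
    if doi:
        # DOIs are case-insensitive, so compare them lowercased.
        keys.add(("doi", doi))
    return keys


def merge_results(arxiv_results, openalex_results):
    """Merge both result lists, keeping the first record seen per paper."""
    merged, seen = [], set()
    for paper in [*arxiv_results, *openalex_results]:
        keys = paper_keys(paper)
        if keys & seen:
            # Duplicate: remember any extra identifiers it carries, then skip.
            seen |= keys
            continue
        seen |= keys
        merged.append(paper)  # records without any identifier are kept as-is
    return merged
```

Records that expose neither identifier cannot be matched this way, so the sketch keeps them rather than guessing.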
|
|
66
54
|
|
|
67
|
-
|
|
68
|
-
- 5 points: core paper that directly studies the topic
|
|
69
|
-
- 4 points: related method or application
|
|
70
|
-
- 3 points or below: skip
|
|
55
|
+
#### 2.2 Filtering
|
|
71
56
|
|
|
72
|
-
|
|
57
|
+
Consider **relevance** only: is this paper directly related to the research topic?
|
|
73
58
|
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
arxiv_ids: ["<IDs of useful papers>"],
|
|
77
|
-
output_dir: "papers/_downloads"
|
|
78
|
-
})
|
|
79
|
-
```
|
|
59
|
+
- **Relevant**: directly studies the topic, or proposes a method worth borrowing → keep
|
|
60
|
+
- **Not relevant**: off-topic, overlapping only on keywords → skip
|
|
80
61
|
|
|
81
|
-
#### 2.
|
|
62
|
+
#### 2.3 Download papers
|
|
82
63
|
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
```json
|
|
86
|
-
{
|
|
87
|
-
"arxiv_id": "2401.12345",
|
|
88
|
-
"title": "...",
|
|
89
|
-
"abstract": "...",
|
|
90
|
-
"score": 5,
|
|
91
|
-
"source_term": "battery RUL prediction",
|
|
92
|
-
"downloaded_at": "2024-01-15T10:00:00Z"
|
|
93
|
-
}
|
|
94
|
-
```
|
|
64
|
+
Download the papers into `papers/` following /paper-download.
|
|
95
65
|
|
|
96
66
|
**Finish one search term before moving on to the next.** This keeps the context from being polluted by large volumes of search results.
|
|
97
67
|
|
|
@@ -101,9 +71,9 @@ arxiv_download({
|
|
|
101
71
|
|
|
102
72
|
**Goal**: provide open-source reference implementations for the downstream skills (research-survey, research-plan, research-implement).
|
|
103
73
|
|
|
104
|
-
#### 3.1
|
|
74
|
+
#### 3.1 Select papers
|
|
105
75
|
|
|
106
|
-
|
|
76
|
+
Select the **Top 5** most relevant papers from `papers/`.
|
|
107
77
|
|
|
108
78
|
#### 3.2 Search for reference repositories
|
|
109
79
|
|
|
@@ -112,87 +82,47 @@ arxiv_download({
|
|
|
112
82
|
- Core method name + author name
|
|
113
83
|
- Dataset name mentioned in the paper + task name
|
|
114
84
|
|
|
115
|
-
|
|
116
|
-
|
|
117
|
-
github_search({
|
|
118
|
-
query: "{paper_title} implementation",
|
|
119
|
-
max_results: 10,
|
|
120
|
-
sort: "stars",
|
|
121
|
-
language: "python"
|
|
122
|
-
})
|
|
85
|
+
```bash
|
|
86
|
+
gh search repos "{paper_title} implementation" --limit 10 --sort stars --language python
|
|
123
87
|
```
|
|
124
88
|
|
|
125
89
|
#### 3.3 Filter and clone
|
|
126
90
|
|
|
127
|
-
|
|
128
|
-
- Star count (>100 recommended)
|
|
129
|
-
- Code quality (has a README, has requirements.txt, clear code structure)
|
|
130
|
-
- How well it matches the paper
|
|
131
|
-
|
|
132
|
-
Select the **3-5** most relevant repositories and clone them into `$W/repos/`:
|
|
91
|
+
Select the **3-5** most relevant repositories:
|
|
133
92
|
|
|
134
93
|
```bash
|
|
135
|
-
mkdir -p "
|
|
136
|
-
|
|
137
|
-
git clone --depth 1 <repo_url>
|
|
94
|
+
mkdir -p "repos"
|
|
95
|
+
git clone --depth 1 <repo_url> "repos/{name}"
|
|
138
96
|
```
|
|
139
97
|
|
|
140
|
-
|
|
141
|
-
|
|
142
|
-
Create `$W/prepare_res.md`:
|
|
143
|
-
|
|
144
|
-
```markdown
|
|
145
|
-
# Reference repository selection
|
|
146
|
-
|
|
147
|
-
| Repository | Corresponding paper | Stars | Reason for selection |
|
|
148
|
-
|------|----------|-------|----------|
|
|
149
|
-
| repos/{repo_name} | {paper_title} (arxiv:{id}) | {N} | {reason} |
|
|
150
|
-
|
|
151
|
-
## Key files in each repository
|
|
152
|
-
|
|
153
|
-
### {repo_name}
|
|
154
|
-
- **Model implementation**: `model/` or `models/`
|
|
155
|
-
- **Training script**: `train.py` or `main.py`
|
|
156
|
-
- **Data loading**: `data/` or `dataset.py`
|
|
157
|
-
- **Core files**: `{key file path}` - {description}
|
|
158
|
-
```
|
|
159
|
-
|
|
160
|
-
**If no relevant repository is found**, note "no usable reference repository" in `prepare_res.md`; downstream skills will then not rely on code mapping.
|
|
98
|
+
**If no relevant repository is found**, skip this phase.
|
|
161
99
|
|
|
162
100
|
---
|
|
163
101
|
|
|
164
102
|
### Phase 4: Categorize and organize
|
|
165
103
|
|
|
166
|
-
|
|
167
|
-
|
|
168
|
-
#### 4.1 Read all metadata
|
|
169
|
-
|
|
170
|
-
```bash
|
|
171
|
-
ls $W/papers/_meta/
|
|
172
|
-
```
|
|
173
|
-
|
|
174
|
-
Read all `.json` files and compile the paper list.
|
|
104
|
+
After all search terms are done:
|
|
175
105
|
|
|
176
|
-
#### 4.
|
|
106
|
+
#### 4.1 Cluster analysis
|
|
177
107
|
|
|
178
|
-
|
|
108
|
+
Identify 3-6 research directions from the titles and abstracts of the downloaded papers.
|
|
179
109
|
|
|
180
|
-
#### 4.
|
|
110
|
+
#### 4.2 Create category directories
|
|
181
111
|
|
|
182
112
|
```bash
|
|
183
|
-
mkdir -p "
|
|
184
|
-
mv "
|
|
113
|
+
mkdir -p "papers/{direction}"
|
|
114
|
+
mv "papers/2401.12345" "papers/data-driven/"
|
|
185
115
|
```
|
|
186
116
|
|
|
187
117
|
---
|
|
188
118
|
|
|
189
119
|
### Phase 5: Generate the report
|
|
190
120
|
|
|
191
|
-
Create
|
|
121
|
+
Create `survey_report.md`:
|
|
192
122
|
- Survey summary (number of search terms, papers, and directions)
|
|
193
123
|
- Overview of each research direction
|
|
194
|
-
- Top 10
|
|
195
|
-
-
|
|
124
|
+
- Top 10 papers (title + ID + one-sentence value)
|
|
125
|
+
- Summary of reference repositories (if any)
|
|
196
126
|
- Suggested reading order
|
|
197
127
|
|
|
198
128
|
---
|
|
@@ -201,14 +131,14 @@ mv "$W/papers/_downloads/2401.12345" "$W/papers/data-driven/"
|
|
|
201
131
|
|
|
202
132
|
| Principle | Description |
|
|
203
133
|
|------|------|
|
|
204
|
-
| **Incremental processing** |
|
|
205
|
-
|
|
|
206
|
-
| **Folders are the categories** | Clustering results are expressed through `papers/{direction}/`; no extra JSON is needed |
|
|
134
|
+
| **Incremental processing** | Each search term completes search → filter → download on its own, avoiding context bloat |
|
|
135
|
+
| **Folders are the categories** | Clustering results are expressed through `papers/{direction}/` |
|
|
207
136
|
|
|
208
|
-
## Tools
|
|
137
|
+
## Tools / Commands
|
|
209
138
|
|
|
210
|
-
| Tool | Purpose |
|
|
211
|
-
|
|
212
|
-
| `arxiv_search` |
|
|
213
|
-
| `
|
|
214
|
-
|
|
|
139
|
+
| Tool / Command | Purpose |
|
|
140
|
+
|----------------|---------|
|
|
141
|
+
| `arxiv_search` | Search arXiv papers |
|
|
142
|
+
| `openalex_search` | Search cross-disciplinary papers (broader coverage) |
|
|
143
|
+
| /paper-download | Download papers (arXiv .tex/PDF, DOI via Unpaywall) |
|
|
144
|
+
| `gh search repos "query"` | Search GitHub repositories |
|
|
@@ -15,15 +15,14 @@ metadata:
|
|
|
15
15
|
|
|
16
16
|
**Don't ask permission. Just do it.**
|
|
17
17
|
|
|
18
|
-
**Workspace:** `$W` = working directory provided in task parameter.
|
|
19
18
|
|
|
20
19
|
## Prerequisites
|
|
21
20
|
|
|
22
21
|
| File | Source |
|
|
23
22
|
|------|--------|
|
|
24
|
-
|
|
|
25
|
-
|
|
|
26
|
-
|
|
|
23
|
+
| `project/` | /research-implement |
|
|
24
|
+
| `plan_res.md` | /research-plan |
|
|
25
|
+
| `iterations/judge_v*.md` | /research-review (the last verdict must be PASS) |
|
|
27
26
|
|
|
28
27
|
**Verify PASS:** read the latest `judge_v*.md` and confirm `verdict: PASS`. If not, STOP.
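The PASS check above might be scripted roughly as follows. This is only a sketch under assumptions: it takes "latest" to mean the highest numeric suffix in `judge_v{N}.md`, and it assumes the verdict appears on its own line as `verdict: PASS`. The function name `latest_verdict_is_pass` is hypothetical.

```python
import re
from pathlib import Path


def latest_verdict_is_pass(iterations_dir="iterations"):
    """Return True if the highest-numbered judge_v{N}.md says `verdict: PASS`."""
    reports = []
    for path in Path(iterations_dir).glob("judge_v*.md"):
        match = re.fullmatch(r"judge_v(\d+)\.md", path.name)
        if match:
            # Sort numerically so that v10 ranks after v2.
            reports.append((int(match.group(1)), path))
    if not reports:
        return False  # no review report yet, so nothing has passed
    _, latest = max(reports, key=lambda item: item[0])
    text = latest.read_text(encoding="utf-8")
    # Assumed format: a line that starts with "verdict: PASS".
    return re.search(r"^[ \t]*verdict:\s*PASS\b", text, flags=re.MULTILINE) is not None
```

Sorting on the parsed integer rather than the file name avoids treating `judge_v10.md` as older than `judge_v2.md`.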
|
|
29
28
|
|
|
@@ -31,8 +30,8 @@ metadata:
|
|
|
31
30
|
|
|
32
31
|
| File | Content |
|
|
33
32
|
|------|---------|
|
|
34
|
-
|
|
|
35
|
-
|
|
|
33
|
+
| `experiment_res.md` | Full experiment report (including full training + ablations + supplementary experiments) |
|
|
34
|
+
| `experiment_analysis/analysis_{N}.md` | Per-round experiment analysis report (produced during iteration) |
|
|
36
35
|
|
|
37
36
|
---
|
|
38
37
|
|
|
@@ -43,7 +42,7 @@ metadata:
|
|
|
43
42
|
Change the number of epochs to the final value specified in plan_res.md. **Do not change the code logic; change only the epochs.**
|
|
44
43
|
|
|
45
44
|
```bash
|
|
46
|
-
cd
|
|
45
|
+
cd project && source .venv/bin/activate
|
|
47
46
|
python3 run.py # full epochs
|
|
48
47
|
```
|
|
49
48
|
|
|
@@ -78,7 +77,7 @@ python3 run.py --epochs 2 --ablation no_attention
|
|
|
78
77
|
|
|
79
78
|
#### 4.1 Analyze the current results
|
|
80
79
|
|
|
81
|
-
Read all current experiment results (full training + ablations) and write the analysis report
|
|
80
|
+
Read all current experiment results (full training + ablations) and write the analysis report `experiment_analysis/analysis_{N}.md`:
|
|
82
81
|
|
|
83
82
|
```markdown
|
|
84
83
|
# Experiment Analysis Round {N}
|
|
@@ -108,7 +107,7 @@ python3 run.py --epochs 2 --ablation no_attention
|
|
|
108
107
|
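Following the plan in the analysis report, modify the code and run the supplementary experiments. **Change only experiment-related parameters/configuration, not the core algorithm logic.**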
|
|
109
108
|
|
|
110
109
|
```bash
|
|
111
|
-
cd
|
|
110
|
+
cd project && source .venv/bin/activate
|
|
112
111
|
python3 run.py --experiment {exp_name}
|
|
113
112
|
```
|
|
114
113
|
|
|
@@ -118,7 +117,7 @@ python3 run.py --experiment {exp_name}
|
|
|
118
117
|
|
|
119
118
|
### Step 5: Write the final experiment report
|
|
120
119
|
|
|
121
|
-
Summarize all experiment results (full training + ablations + 2 rounds of supplementary experiments) and write them to
|
|
120
|
+
Summarize all experiment results (full training + ablations + 2 rounds of supplementary experiments) and write them to `experiment_res.md`:
|
|
122
121
|
|
|
123
122
|
```markdown
|
|
124
123
|
# Experiment Report
|
|
@@ -158,8 +157,8 @@ python3 run.py --experiment {exp_name}
|
|
|
158
157
|
| {Baseline} | {value} | ... |
|
|
159
158
|
|
|
160
159
|
### Visualizations
|
|
161
|
-
- Training curve:
|
|
162
|
-
- {other visualizations}:
|
|
160
|
+
- Training curve: `project/figures/training_curve.png`
|
|
161
|
+
- {other visualizations}: `project/figures/{name}.png`
|
|
163
162
|
|
|
164
163
|
## Conclusions
|
|
165
164
|
- {key findings from all experiments}
|
|
@@ -15,15 +15,14 @@ metadata:
|
|
|
15
15
|
|
|
16
16
|
**Don't ask permission. Just do it.**
|
|
17
17
|
|
|
18
|
-
**Workspace:** `$W` = working directory provided in task parameter.
|
|
19
18
|
|
|
20
19
|
## Prerequisites
|
|
21
20
|
|
|
22
21
|
| File | Source |
|
|
23
22
|
|------|--------|
|
|
24
|
-
|
|
|
25
|
-
|
|
|
26
|
-
|
|
|
23
|
+
| `plan_res.md` | /research-plan |
|
|
24
|
+
| `survey_res.md` | /research-survey |
|
|
25
|
+
| `repos/` (optional) | reference code |
|
|
27
26
|
|
|
28
27
|
**If `plan_res.md` is missing, STOP:** "Run /research-plan first to complete the implementation plan"
|
|
29
28
|
|
|
@@ -31,8 +30,8 @@ metadata:
|
|
|
31
30
|
|
|
32
31
|
| File | Content |
|
|
33
32
|
|------|---------|
|
|
34
|
-
|
|
|
35
|
-
|
|
|
33
|
+
| `project/` | Complete runnable code |
|
|
34
|
+
| `ml_res.md` | Implementation report (with real execution results) |
|
|
36
35
|
|
|
37
36
|
---
|
|
38
37
|
|
|
@@ -40,7 +39,7 @@ metadata:
|
|
|
40
39
|
|
|
41
40
|
### Step 1: Read the plan
|
|
42
41
|
|
|
43
|
-
Read
|
|
42
|
+
Read `plan_res.md` and extract:
|
|
44
43
|
- List of all components
|
|
45
44
|
- Dataset information
|
|
46
45
|
- Training parameters
|
|
@@ -48,7 +47,7 @@ metadata:
|
|
|
48
47
|
### Step 2: Create the project structure
|
|
49
48
|
|
|
50
49
|
```
|
|
51
|
-
|
|
50
|
+
project/
|
|
52
51
|
model/ # model components (one file per component)
|
|
53
52
|
data/ # data loading
|
|
54
53
|
training/ # training loop + loss
|
|
@@ -66,7 +65,7 @@ $W/project/
|
|
|
66
65
|
|
|
67
66
|
**3b. Data pipeline**
|
|
68
67
|
```bash
|
|
69
|
-
cd
|
|
68
|
+
cd project && uv venv .venv && source .venv/bin/activate
|
|
70
69
|
uv pip install -r requirements.txt
|
|
71
70
|
python3 -c "from data.dataset import *; print('data OK')"
|
|
72
71
|
```
|
|
@@ -93,7 +92,7 @@ print(f"[RESULT] device={device}")
|
|
|
93
92
|
### Step 4: Environment setup + execution
|
|
94
93
|
|
|
95
94
|
```bash
|
|
96
|
-
cd
|
|
95
|
+
cd project
|
|
97
96
|
uv venv .venv
|
|
98
97
|
source .venv/bin/activate
|
|
99
98
|
|
|
@@ -125,7 +124,7 @@ python3 run.py --epochs 2
|
|
|
125
124
|
|
|
126
125
|
### Step 6: Write the report
|
|
127
126
|
|
|
128
|
-
Write to
|
|
127
|
+
Write to `ml_res.md`:
|
|
129
128
|
|
|
130
129
|
```markdown
|
|
131
130
|
# Implementation Report
|
|
@@ -92,19 +92,11 @@ task 必须以 `/skill-name` 开头(触发 slash command 解析),后续行
|
|
|
92
92
|
|
|
93
93
|
---
|
|
94
94
|
|
|
95
|
-
## Workspace
|
|
96
|
-
|
|
97
|
-
`$W` = agent workspace root (see AGENTS.md for layout).
|
|
98
|
-
|
|
99
|
-
---
|
|
100
|
-
|
|
101
95
|
## Step 0: Initialization
|
|
102
96
|
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
Check whether `$W/SOUL.md` contains research-direction information. If not (BOOTSTRAP incomplete), prompt the user to finish the BOOTSTRAP configuration first.
|
|
97
|
+
Check whether `SOUL.md` contains research-direction information. If not (BOOTSTRAP incomplete), prompt the user to finish the BOOTSTRAP configuration first.
|
|
106
98
|
|
|
107
|
-
Ensure
|
|
99
|
+
Ensure the `papers/`, `knowledge/`, `ideas/`, and `experiments/` directories exist.
|
|
108
100
|
|
|
109
101
|
---
|
|
110
102
|
|
|
@@ -114,65 +106,65 @@ task 必须以 `/skill-name` 开头(触发 slash command 解析),后续行
|
|
|
114
106
|
|
|
115
107
|
### Phase 1: Literature Survey
|
|
116
108
|
|
|
117
|
-
**Check:**
|
|
109
|
+
**Check:** does the `papers/` directory exist and contain paper files?
|
|
118
110
|
|
|
119
111
|
**If missing, call the sessions_spawn tool (then stop and wait for the completion notification):**
|
|
120
|
-
- task: `"/research-collect\n
|
|
112
|
+
- task: `"/research-collect\nResearch topic: {extracted from SOUL.md}\nPlease search for, filter, and download papers into papers/ under the working directory."`
|
|
121
113
|
- label: `"Research Collect"`
|
|
122
114
|
- runTimeoutSeconds: `1800`
|
|
123
115
|
|
|
124
|
-
**Verify:** `ls
|
|
116
|
+
**Verify:** `ls papers/` shows at least 3 papers
|
|
125
117
|
|
|
126
118
|
---
|
|
127
119
|
|
|
128
120
|
### Phase 2: Deep Survey
|
|
129
121
|
|
|
130
|
-
**Check:**
|
|
122
|
+
**Check:** does `survey_res.md` exist?
|
|
131
123
|
|
|
132
124
|
**If missing, first read the Phase 1 summary (paper count, directions), then call the sessions_spawn tool (then stop and wait for the completion notification):**
|
|
133
|
-
- task: `"/research-survey\n
|
|
125
|
+
- task: `"/research-survey\nContext: {N} papers downloaded, covering the directions {directions}.\nKey papers: {top 3 arxiv_id and titles}\nPlease analyze the papers in depth, extract the formulas, and write survey_res.md."`
|
|
134
126
|
- label: `"Deep Survey"`
|
|
135
127
|
- runTimeoutSeconds: `1800`
|
|
136
128
|
|
|
137
|
-
**Verify:**
|
|
129
|
+
**Verify:** `survey_res.md` exists and contains the "核心方法对比" (core method comparison) table
|
|
138
130
|
|
|
139
131
|
---
|
|
140
132
|
|
|
141
133
|
### Phase 3: Implementation Plan
|
|
142
134
|
|
|
143
|
-
**Check:**
|
|
135
|
+
**Check:** does `plan_res.md` exist?
|
|
144
136
|
|
|
145
137
|
**If missing, read the survey_res.md summary, then call the sessions_spawn tool (then stop and wait for the completion notification):**
|
|
146
|
-
- task: `"/research-plan\n
|
|
138
|
+
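- task: `"/research-plan\nContext: the survey found the core method is {method}; the recommended technical route is {route}.\nKey formulas: {1-2 formulas}\nPlease write the implementation plan to plan_res.md."`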
|
|
147
139
|
- label: `"Research Plan"`
|
|
148
140
|
- runTimeoutSeconds: `1800`
|
|
149
141
|
|
|
150
|
-
**Verify:**
|
|
142
|
+
**Verify:** `plan_res.md` exists and contains 4 sections (Dataset/Model/Training/Testing)
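One way to check the four sections, sketched under the assumption that each section is a Markdown heading whose text contains the section name. The actual heading wording in `plan_res.md` is not shown in this diff, and `missing_plan_sections` is a hypothetical helper.

```python
import re

REQUIRED_SECTIONS = ("Dataset", "Model", "Training", "Testing")


def missing_plan_sections(markdown_text):
    """Return the required section names with no matching Markdown heading."""
    # Collect the text of every ATX heading ("#" through "######").
    headings = re.findall(r"^#{1,6}[ \t]+(.*)$", markdown_text, flags=re.MULTILINE)
    return [
        name
        for name in REQUIRED_SECTIONS
        if not any(name.lower() in heading.lower() for heading in headings)
    ]
```

An empty return value means all four sections were found; anything else lists what is still missing.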
|
|
151
143
|
|
|
152
144
|
---
|
|
153
145
|
|
|
154
146
|
### Phase 4: Implementation
|
|
155
147
|
|
|
156
|
-
**Check:**
|
|
148
|
+
**Check:** does `ml_res.md` exist?
|
|
157
149
|
|
|
158
150
|
**If missing, read the key points of plan_res.md, then call the sessions_spawn tool (then stop and wait for the completion notification):**
|
|
159
|
-
- task: `"/research-implement\n
|
|
151
|
+
- task: `"/research-implement\nContext:\n- The plan contains {N} components: {list}\n- Dataset: {dataset}\n- Framework: PyTorch\nPlease implement the code in project/, run a 2-epoch validation, and write ml_res.md."`
|
|
160
152
|
- label: `"Research Implement"`
|
|
161
153
|
- runTimeoutSeconds: `1800`
|
|
162
154
|
|
|
163
155
|
**Verify:**
|
|
164
|
-
-
|
|
165
|
-
-
|
|
156
|
+
- `project/run.py` exists
|
|
157
|
+
- `ml_res.md` contains a `[RESULT]` line
|
|
166
158
|
- the loss value is not NaN/Inf
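The three checks above could be automated along these lines. This is a sketch, not the package's own logic: it assumes `[RESULT]` lines use `key=value` pairs such as `train_loss=0.43`, which is inferred from the `train_loss={value}` placeholder elsewhere in this diff, and the name `implementation_checks` is hypothetical.

```python
import math
import re
from pathlib import Path

RESULT_LINE = re.compile(r"^[ \t]*\[RESULT\].*$", re.MULTILINE)
LOSS_VALUE = re.compile(r"loss=([^\s,]+)")  # also matches e.g. "train_loss=..."


def implementation_checks(root="."):
    """Run the three Phase 4 checks and return one boolean per check."""
    root = Path(root)
    report = root / "ml_res.md"
    text = report.read_text(encoding="utf-8") if report.is_file() else ""
    result_lines = RESULT_LINE.findall(text)
    losses = []
    for line in result_lines:
        for raw in LOSS_VALUE.findall(line):
            try:
                losses.append(float(raw))  # float() also parses "nan" and "inf"
            except ValueError:
                losses.append(math.nan)  # an unparsable value counts as invalid
    return {
        "run_py_exists": (root / "project" / "run.py").is_file(),
        "has_result_line": bool(result_lines),
        "loss_is_finite": bool(losses) and all(math.isfinite(v) for v in losses),
    }
```

A report with no parsable loss value fails the last check, which errs on the side of re-running the implementation phase.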
|
|
167
159
|
|
|
168
160
|
---
|
|
169
161
|
|
|
170
162
|
### Phase 5: Review
|
|
171
163
|
|
|
172
|
-
**Check:**
|
|
164
|
+
**Check:** is the verdict of the latest `judge_v*.md` under `iterations/` PASS?
|
|
173
165
|
|
|
174
166
|
**If there is no PASS, call the sessions_spawn tool (then stop and wait for the completion notification):**
|
|
175
|
-
- task: `"/research-review\n
|
|
167
|
+
- task: `"/research-review\nContext:\n- ml_res.md shows train_loss={value}\n- The plan is in plan_res.md\nPlease review the code and, if changes are needed, fix iteratively (at most 3 rounds)."`
|
|
176
168
|
- label: `"Research Review"`
|
|
177
169
|
- runTimeoutSeconds: `1800`
|
|
178
170
|
|
|
@@ -184,14 +176,14 @@ task 必须以 `/skill-name` 开头(触发 slash command 解析),后续行
|
|
|
184
176
|
|
|
185
177
|
### Phase 6: Full Experiment
|
|
186
178
|
|
|
187
|
-
**Check:**
|
|
179
|
+
**Check:** does `experiment_res.md` exist?
|
|
188
180
|
|
|
189
181
|
**If missing, call the sessions_spawn tool (then stop and wait for the completion notification):**
|
|
190
|
-
- task: `"/research-experiment\n
|
|
182
|
+
- task: `"/research-experiment\nContext:\n- Review PASS, code verified\n- Full epochs are specified in plan_res.md\nPlease run the full training + ablation experiments and write experiment_res.md."`
|
|
191
183
|
- label: `"Research Experiment"`
|
|
192
184
|
- runTimeoutSeconds: `1800`
|
|
193
185
|
|
|
194
|
-
**Verify:**
|
|
186
|
+
**Verify:** `experiment_res.md` contains a `[RESULT]` line and the ablation table
|
|
195
187
|
|
|
196
188
|
---
|
|
197
189
|
|
|
@@ -202,9 +194,9 @@ task 必须以 `/skill-name` 开头(触发 slash command 解析),后续行
|
|
|
202
194
|
```
|
|
203
195
|
Research pipeline complete!
|
|
204
196
|
- Papers: {N} analyzed
|
|
205
|
-
- Code:
|
|
206
|
-
- Results:
|
|
207
|
-
- Review:
|
|
197
|
+
- Code: project/
|
|
198
|
+
- Results: experiment_res.md
|
|
199
|
+
- Review: iterations/ ({N} rounds)
|
|
208
200
|
```
|
|
209
201
|
|
|
210
202
|
---
|
|
@@ -14,17 +14,14 @@ metadata:
|
|
|
14
14
|
|
|
15
15
|
**Don't ask permission. Just do it.**
|
|
16
16
|
|
|
17
|
-
**Workspace:** `$W` = working directory provided in task parameter.
|
|
18
17
|
|
|
19
18
|
## Prerequisites
|
|
20
19
|
|
|
21
20
|
| File | Source |
|
|
22
21
|
|------|--------|
|
|
23
|
-
|
|
|
24
|
-
|
|
|
25
|
-
|
|
|
26
|
-
| `$W/repos/` | /research-collect Phase 3 |
|
|
27
|
-
| `$W/prepare_res.md` | /research-collect Phase 3 |
|
|
22
|
+
| `SOUL.md` | Research direction and goals |
|
|
23
|
+
| `survey_res.md` | /research-survey |
|
|
24
|
+
| `knowledge/paper_*.md` | /research-survey |
|
|
28
25
|
|
|
29
26
|
**If `survey_res.md` is missing, STOP:** "Run /research-survey first to complete the deep analysis"
|
|
30
27
|
|
|
@@ -32,7 +29,7 @@ metadata:
|
|
|
32
29
|
|
|
33
30
|
| File | Content |
|
|
34
31
|
|------|---------|
|
|
35
|
-
|
|
|
32
|
+
| `plan_res.md` | Four-part implementation plan |
|
|
36
33
|
|
|
37
34
|
---
|
|
38
35
|
|
|
@@ -41,9 +38,8 @@ metadata:
|
|
|
41
38
|
### Step 1: Read the context
|
|
42
39
|
|
|
43
40
|
Read the following files to understand the research goals and the technical approach:
|
|
44
|
-
-
|
|
45
|
-
-
|
|
46
|
-
- `$W/prepare_res.md`: list of reference repositories and notes on their key files
|
|
41
|
+
- `SOUL.md`: research direction and goals
|
|
42
|
+
- `survey_res.md`: recommended technical route, core formulas, method comparison
|
|
47
43
|
|
|
48
44
|
### Step 2: In-depth analysis of the reference code
|
|
49
45
|
|
|
@@ -59,7 +55,7 @@ metadata:
|
|
|
59
55
|
|
|
60
56
|
### Step 3: Draw up the four-part plan
|
|
61
57
|
|
|
62
|
-
Write to
|
|
58
|
+
Write to `plan_res.md`:
|
|
63
59
|
|
|
64
60
|
```markdown
|
|
65
61
|
# Implementation Plan
|
|
@@ -15,16 +15,15 @@ metadata:
|
|
|
15
15
|
|
|
16
16
|
**Don't ask permission. Just do it.**
|
|
17
17
|
|
|
18
|
-
**Workspace:** `$W` = working directory provided in task parameter.
|
|
19
18
|
|
|
20
19
|
## Prerequisites
|
|
21
20
|
|
|
22
21
|
| File | Source |
|
|
23
22
|
|------|--------|
|
|
24
|
-
|
|
|
25
|
-
|
|
|
26
|
-
|
|
|
27
|
-
|
|
|
23
|
+
| `ml_res.md` | /research-implement |
|
|
24
|
+
| `project/` | /research-implement |
|
|
25
|
+
| `plan_res.md` | /research-plan |
|
|
26
|
+
| `survey_res.md` | /research-survey |
|
|
28
27
|
|
|
29
28
|
**If `ml_res.md` is missing, STOP:** "Run /research-implement first to complete the code implementation"
|
|
30
29
|
|
|
@@ -32,7 +31,7 @@ metadata:
|
|
|
32
31
|
|
|
33
32
|
| File | Content |
|
|
34
33
|
|------|---------|
|
|
35
|
-
|
|
|
34
|
+
| `iterations/judge_v{N}.md` | Review report for each round |
|
|
36
35
|
|
|
37
36
|
`verdict: PASS` in the final report means the review passed.
|
|
38
37
|
|
|
@@ -43,16 +42,16 @@ metadata:
|
|
|
43
42
|
### Step 1: Review the code
|
|
44
43
|
|
|
45
44
|
Read the following:
|
|
46
|
-
-
|
|
47
|
-
-
|
|
48
|
-
-
|
|
49
|
-
-
|
|
45
|
+
- `plan_res.md`: what each component is expected to do
|
|
46
|
+
- `survey_res.md`: core formulas
|
|
47
|
+
- `project/`: the actual code
|
|
48
|
+
- `ml_res.md`: execution results
|
|
50
49
|
|
|
51
50
|
### Step 2: Extract the list of atomic concepts
|
|
52
51
|
|
|
53
52
|
**⚠️ This is the core mechanism of the Novix Judge Agent: check every atomic academic concept one by one.**
|
|
54
53
|
|
|
55
|
-
From
|
|
54
|
+
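From the "关键公式汇总" (key formula summary) and "核心方法对比" (core method comparison) sections of `survey_res.md`, extract every **atomic academic concept** that must be implemented in code (each formula and each core component counts as one concept).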
|
|
56
55
|
|
|
57
56
|
For each concept, record:
|
|
58
57
|
- Concept name (e.g. "Multi-Head Attention", "Contrastive Loss", "Batch Normalization")
|
|
@@ -123,7 +122,7 @@ metadata:
|
|
|
123
122
|
|
|
124
123
|
### Step 4: Write the review report
|
|
125
124
|
|
|
126
|
-
Write to
|
|
125
|
+
Write to `iterations/judge_v1.md`:
|
|
127
126
|
|
|
128
127
|
```markdown
|
|
129
128
|
# Review v1
|
|
@@ -201,14 +200,14 @@ metadata:
|
|
|
201
200
|
Loop at most 3 times:
|
|
202
201
|
|
|
203
202
|
1. Read the suggested changes in `judge_v{N}.md`
|
|
204
|
-
2. **Anti-drift check: re-read**
|
|
203
|
+
2. **Anti-drift check: re-read** `survey_res.md` and `plan_res.md`
|
|
205
204
|
- Compare against the original academic design goals
|
|
206
205
|
- Make sure the changes do not sacrifice academic rigor just to "get past the review"
|
|
207
206
|
- Confirm the changes match the formula definitions in the survey and the design intent in the plan
|
|
208
|
-
3. Modify
|
|
207
|
+
3. Modify the code in `project/` (fix bugs, fill in missing implementations)
|
|
209
208
|
4. Re-run:
|
|
210
209
|
```bash
|
|
211
|
-
cd
|
|
210
|
+
cd project && source .venv/bin/activate && python3 run.py --epochs 2
|
|
212
211
|
```
|
|
213
212
|
5. Read the execution output and verify the fix
|
|
214
213
|
6. **Re-run Steps 2-4** (extract the concept list → check each item → write the report) and write to `judge_v{N+1}.md`
|
|
@@ -225,10 +224,10 @@ metadata:
|
|
|
225
224
|
#### 5b.1 Performance diagnosis
|
|
226
225
|
|
|
227
226
|
Re-read the following materials for the diagnosis:
|
|
228
|
-
-
|
|
229
|
-
-
|
|
230
|
-
-
|
|
231
|
-
-
|
|
227
|
+
- `ml_res.md`: the concrete numbers from the 2-epoch validation
|
|
228
|
+
- `survey_res.md`: hyperparameter settings of the baseline methods (especially learning rate and batch size)
|
|
229
|
+
- `plan_res.md`: hyperparameter configuration of the current implementation
|
|
230
|
+
- `project/run.py` and `project/training/`: training configuration code
|
|
232
231
|
|
|
233
232
|
**Diagnosis checklist**:
|
|
234
233
|
|
|
@@ -255,11 +254,11 @@ metadata:
|
|
|
255
254
|
1. **Adjust the learning rate** (priority: high, expected improvement: significant)
|
|
256
255
|
- **Current value**: lr=1e-5 (from plan_res.md)
|
|
257
256
|
- **Suggested value**: lr=1e-3 (from survey_res.md Table 2, all baselines use 1e-3)
|
|
258
|
-
- **Where to change**:
|
|
257
|
+
- **Where to change**: `project/run.py:L15` - `optimizer = Adam(lr=1e-3)`
|
|
259
258
|
- **Rationale**: the loss dropped by only 0.9%, far below the normal 10%+; the lr is very likely too small
|
|
260
259
|
|
|
261
260
|
2. **Add data normalization** (priority: medium, expected improvement: moderate)
|
|
262
|
-
- **Check**:
|
|
261
|
+
- **Check**: whether `project/data/dataset.py` applies normalization
|
|
263
262
|
- **Suggestion**: add `transforms.Normalize(mean=[0.5], std=[0.5])`
|
|
264
263
|
- **Rationale**: if the input data range is [0,255], the model will converge very slowly
|
|
265
264
|
```
|
|
@@ -269,7 +268,7 @@ metadata:
|
|
|
269
268
|
1. **Try the suggestions one at a time** (starting with the highest priority)
|
|
270
269
|
2. After each change:
|
|
271
270
|
```bash
|
|
272
|
-
cd
|
|
271
|
+
cd project && source .venv/bin/activate && python3 run.py --epochs 2
|
|
273
272
|
```
|
|
274
273
|
3. Read the new execution output and compare before and after the change:
|
|
275
274
|
- Did the loss reduction improve? (e.g. 0.9% → 12%)
|