scientify 1.1.5 → 1.2.1

@@ -1,217 +1,82 @@
1
1
  # Idea Template
2
2
 
3
- Use this template for each idea in `~/.openclaw/workspace/ideas/`.
3
+ Use this template for each generated idea (`$WORKSPACE/ideas/idea_N.md`).
4
4
 
5
- ---
5
+ ## Required Sections
6
6
 
7
- # Idea N: [Short Descriptive Title]
7
+ ```markdown
8
+ # Idea N: [Title]
8
9
 
9
- ## One-Line Summary
10
-
11
- [A single sentence that captures the core innovation. Should be understandable without context.]
12
-
13
- ---
10
+ ## One-line Summary
11
+ [Single sentence describing the core innovation]
14
12
 
15
13
  ## Challenges Addressed
14
+ 1. [Challenge 1]
15
+ 2. [Challenge 2]
16
16
 
17
- What problems in the current research landscape does this idea solve?
18
-
19
- - **[Challenge 1]**: [Brief description of the technical limitation]
20
- - **[Challenge 2]**: [Brief description of unsolved problem]
21
- - **[Challenge 3]**: [Brief description of key bottleneck]
22
-
23
- ---
17
+ ## Existing Methods & Limitations
24
18
 
25
- ## Existing Methods & Their Limitations
19
+ ### Method A [arXiv:XXXX.XXXXX]
20
+ - **Approach**: [Brief description]
21
+ - **Limitation**: [Why insufficient]
26
22
 
27
- | Method | Paper/Source | Strength | Weakness This Idea Addresses |
28
- |--------|--------------|----------|------------------------------|
29
- | [Method A] | [Citation] | [What it does well] | [Limitation] |
30
- | [Method B] | [Citation] | [What it does well] | [Limitation] |
31
- | [Method C] | [Citation] | [What it does well] | [Limitation] |
32
-
33
- ---
23
+ ### Method B [arXiv:XXXX.XXXXX]
24
+ - **Approach**: [Brief description]
25
+ - **Limitation**: [Why insufficient]
34
26
 
35
27
  ## Motivation
36
-
37
- ### Why is this problem important?
38
-
39
- [Explain the significance of solving this problem. Who benefits? What applications are enabled?]
40
-
41
- ### What gap does this fill?
42
-
43
- [Describe the specific research gap this idea addresses. Reference the limitations above.]
44
-
45
- ### Potential impact
46
-
47
- [Quantify if possible: "Could improve X metric by Y%" or "Enables new application Z"]
48
-
49
- ---
28
+ [Why this gap matters, what opportunity exists]
50
29
 
51
30
  ## Proposed Method
52
31
 
53
- ### Core Insight
54
-
55
- [2-3 sentences describing the key innovation. What is the "aha" moment?]
32
+ ### Core Idea
33
+ [Main innovation in 2-3 sentences]
56
34
 
57
35
  ### Technical Approach
36
+ [Detailed description]
58
37
 
59
- **Overview:**
60
-
61
- [1 paragraph high-level description]
62
-
63
- **Step-by-step methodology:**
64
-
65
- 1. **[Step 1 Name]**: [Description]
66
- - Input: [what this step takes]
67
- - Output: [what this step produces]
68
- - Key operation: [main computation]
69
-
70
- 2. **[Step 2 Name]**: [Description]
71
- - Input: ...
72
- - Output: ...
73
- - Key operation: ...
74
-
75
- 3. **[Step 3 Name]**: [Description]
76
- - ...
77
-
78
- ### Mathematical Formulation
79
-
80
- **Problem Setup:**
81
-
82
- Let $X \in \mathbb{R}^{n \times d}$ denote [description]...
83
-
84
- **Core Equations:**
85
-
86
- ```latex
87
- % Main loss function
88
- \mathcal{L} = \mathcal{L}_{task} + \lambda \mathcal{L}_{reg}
89
-
90
- % Where task loss is:
91
- \mathcal{L}_{task} = ...
92
-
93
- % And regularization term is:
94
- \mathcal{L}_{reg} = ...
95
- ```
96
-
97
- **Key derivations (if applicable):**
98
-
99
- [Show important mathematical steps that justify the approach]
100
-
101
- ### Architecture / Algorithm
102
-
103
- ```
104
- Algorithm: [Name]
105
- Input: [inputs]
106
- Output: [outputs]
107
-
108
- 1. Initialize [parameters]
109
- 2. For each [iteration]:
110
- a. Compute [something]
111
- b. Update [something]
112
- 3. Return [result]
113
- ```
114
-
115
- Or for neural architectures:
116
-
117
- ```
118
- [Input] → [Layer 1] → [Layer 2] → ... → [Output]
119
- (dim: ...) (dim: ...) (dim: ...)
120
- ```
38
+ ### Key Equations
39
+ $$
40
+ \mathcal{L} = ...
41
+ $$
121
42
 
122
- ---
43
+ ## How This Improves on Cited Papers
44
+ | Paper | Their Limitation | Our Improvement |
45
+ |-------|------------------|-----------------|
46
+ | [A] | ... | ... |
47
+ | [B] | ... | ... |
123
48
 
124
49
  ## Expected Advantages
125
-
126
- Why should this approach work better than existing methods?
127
-
128
- - **[Advantage 1]**: [Explanation with reasoning]
129
- - **[Advantage 2]**: [Explanation with reasoning]
130
- - **[Advantage 3]**: [Explanation with reasoning]
131
-
132
- **Theoretical justification (if applicable):**
133
-
134
- [Brief argument for why this should work]
135
-
136
- ---
137
-
138
- ## Potential Challenges
139
-
140
- What could go wrong? How to mitigate?
141
-
142
- | Challenge | Risk Level | Mitigation Strategy |
143
- |-----------|------------|---------------------|
144
- | [Challenge 1] | High/Med/Low | [How to address] |
145
- | [Challenge 2] | High/Med/Low | [How to address] |
146
- | [Challenge 3] | High/Med/Low | [How to address] |
147
-
148
- ---
50
+ 1. [Advantage 1]
51
+ 2. [Advantage 2]
149
52
 
150
53
  ## Evaluation Plan
151
54
 
152
55
  ### Datasets
153
-
154
- | Dataset | Task | Size | Why Chosen |
155
- |---------|------|------|------------|
156
- | [Dataset 1] | [Task] | [Size] | [Reason] |
157
- | [Dataset 2] | [Task] | [Size] | [Reason] |
56
+ - [Dataset 1]: [Why chosen]
158
57
 
159
58
  ### Baselines
160
-
161
- | Method | Paper | Why Compare |
162
- |--------|-------|-------------|
163
- | [Baseline 1] | [Citation] | [Reason] |
164
- | [Baseline 2] | [Citation] | [Reason] |
59
+ - [Method from Paper A]
60
+ - [Method from Paper B]
165
61
 
166
62
  ### Metrics
63
+ - [Metric 1]: [What it measures]
167
64
 
168
- | Metric | Description | Expected Improvement |
169
- |--------|-------------|---------------------|
170
- | [Metric 1] | [What it measures] | [X% over baseline] |
171
- | [Metric 2] | [What it measures] | [Y% over baseline] |
172
-
173
- ### Ablation Studies
174
-
175
- What components to ablate to understand contribution?
176
-
177
- 1. [Component 1]: Remove/replace to test [hypothesis]
178
- 2. [Component 2]: Remove/replace to test [hypothesis]
179
-
180
- ---
181
-
182
- ## Scores
183
-
184
- | Criterion | Score (1-5) | Justification |
185
- |-----------|-------------|---------------|
186
- | **Novelty** | [X] | [Why this score] |
187
- | **Feasibility** | [X] | [Why this score] |
188
- | **Impact** | [X] | [Why this score] |
189
- | **Total** | [Sum] | |
190
-
191
- ---
192
-
193
- ## Implementation Notes
194
-
195
- ### Recommended Libraries
196
-
197
- - [Library 1]: For [purpose]
198
- - [Library 2]: For [purpose]
199
-
200
- ### Reference Code
201
-
202
- - [Repo 1](URL): [What to reference]
203
- - [Repo 2](URL): [What to reference]
204
-
205
- ### Estimated Effort
206
-
207
- - Model implementation: [X days]
208
- - Data pipeline: [X days]
209
- - Training & evaluation: [X days]
210
- - Total: [X days]
65
+ ## Scores (1-5)
66
+ - **Novelty**: X/5 - [Justification]
67
+ - **Feasibility**: X/5 - [Justification]
68
+ - **Impact**: X/5 - [Justification]
69
+ - **Total**: XX/15
70
+ ```
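The `Total` line in the Scores section is the sum of the three 1-5 criteria. A minimal Python sketch of that arithmetic follows; the class name `IdeaScores` and its methods are hypothetical helpers for illustration and are not part of the package.

```python
from dataclasses import dataclass


@dataclass
class IdeaScores:
    """Hypothetical container for the three template scores (each 1-5)."""

    novelty: int
    feasibility: int
    impact: int

    def __post_init__(self) -> None:
        # Reject values outside the 1-5 range used by the template.
        for name in ("novelty", "feasibility", "impact"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    @property
    def total(self) -> int:
        # Total is the plain sum, so it ranges from 3 to 15.
        return self.novelty + self.feasibility + self.impact

    def total_line(self) -> str:
        # Render the line in the same style as the template's Total entry.
        return f"- **Total**: {self.total}/15"


if __name__ == "__main__":
    print(IdeaScores(novelty=4, feasibility=3, impact=5).total_line())
```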
211
71
 
212
- ---
72
+ ## Idea Strategies
213
73
 
214
- ## Related Ideas
74
+ Generate 5 ideas using different strategies:
215
75
 
216
- - **Idea [M]**: [How it relates - could be combined? alternative approach?]
217
- - **Future extension**: [What could come next after this idea]
76
+ | Idea | Strategy | Description |
77
+ |------|----------|-------------|
78
+ | 1 | Combination | Merge techniques from 2+ papers |
79
+ | 2 | Simplification | Simplify complex method |
80
+ | 3 | Generalization | Extend to new domain/task |
81
+ | 4 | Constraint Relaxation | Remove limiting assumption |
82
+ | 5 | Architecture Innovation | Novel model design |
@@ -0,0 +1,43 @@
1
+ # Reading Long Papers Strategy
2
+
3
+ For papers >50KB or >15k tokens, use chunked reading.
4
+
5
+ ## Step 1: Structure Scan
6
+
7
+ ```bash
8
+ ls -la $WORKSPACE/papers/{arxiv_id}/
9
+ wc -l $WORKSPACE/papers/{arxiv_id}/*.tex
10
+ ```
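The size rule above (>50KB or >15k tokens) could be checked before reading with a sketch like the one below. Treating 50KB as 50 * 1024 bytes and estimating tokens at roughly 3 characters each are both assumptions made here for illustration; the source does not say how tokens are counted, and the function names are hypothetical.

```python
import os

SIZE_LIMIT_BYTES = 50 * 1024   # assumption: "50KB" read as 50 * 1024 bytes
TOKEN_LIMIT = 15_000           # from the ">15k tokens" rule


def estimate_tokens(text: str, chars_per_token: float = 3.0) -> int:
    # Rough heuristic only; a real tokenizer would give different counts.
    return int(len(text) / chars_per_token)


def needs_chunked_reading(path: str, chars_per_token: float = 3.0) -> bool:
    """Return True if the file exceeds either the size or the token limit."""
    if os.path.getsize(path) > SIZE_LIMIT_BYTES:
        return True
    with open(path, encoding="utf-8", errors="replace") as handle:
        return estimate_tokens(handle.read(), chars_per_token) > TOKEN_LIMIT
```

Checking the byte size first avoids reading very large files just to estimate their token count.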
11
+
12
+ ## Step 2: Chunked Reading
13
+
14
+ Use the Read tool with `offset` and `limit`:
15
+
16
+ ```
17
+ Tool: Read
18
+ Arguments:
19
+ file_path: "$WORKSPACE/papers/2404.04429/main.tex"
20
+ offset: 1
21
+ limit: 500 # First 500 lines
22
+ ```
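The `offset`/`limit` pattern can be mimicked in plain Python to see how successive chunks line up. This is a sketch, not the Read tool's implementation: it assumes `offset` is a 1-based line number (as the example's `offset: 1` for the first 500 lines suggests), and `read_chunk` and `iter_chunks` are hypothetical names.

```python
from itertools import islice
from typing import Iterator, List, Tuple


def read_chunk(path: str, offset: int = 1, limit: int = 500) -> List[str]:
    """Return up to `limit` lines starting at 1-based line `offset`."""
    if offset < 1 or limit < 1:
        raise ValueError("offset and limit must both be >= 1")
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(islice(handle, offset - 1, offset - 1 + limit))


def iter_chunks(path: str, limit: int = 500) -> Iterator[Tuple[int, List[str]]]:
    """Yield (offset, lines) pairs that cover the whole file in order."""
    offset = 1
    while True:
        lines = read_chunk(path, offset, limit)
        if not lines:
            break
        yield offset, lines
        # The next chunk starts right after the last line just read.
        offset += len(lines)
```

For a 1200-line file with `limit=500`, the offsets would be 1, 501, and 1001.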
23
+
24
+ ## Priority Sections
25
+
26
+ | Priority | Section | Why |
27
+ |----------|---------|-----|
28
+ | 1 | Abstract | Core contribution |
29
+ | 2 | Method | Technical details |
30
+ | 3 | Experiments | Results |
31
+ | 4 | Conclusion | Limitations |
32
+
33
+ ## Skip These
34
+
35
+ - Appendix, Acknowledgments
36
+ - Detailed hyperparameter tables
37
+
38
+ ## Quick Extraction
39
+
40
+ ```bash
41
+ grep -n "\\\\section{" main.tex
42
+ sed -n '/\\begin{abstract}/,/\\end{abstract}/p' main.tex
43
+ ```
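Where shell tools are unavailable, roughly the same extraction can be sketched in Python. Unlike the `grep` call, this version also matches starred `\section*{...}` headings, and unlike the `sed` call it returns only the text between the abstract delimiters; both are choices made for this illustration, and the function names are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Matches \section{Title} and \section*{Title}; nested braces are not handled.
SECTION_RE = re.compile(r"\\section\*?\{([^}]*)\}")
ABSTRACT_RE = re.compile(r"\\begin\{abstract\}(.*?)\\end\{abstract\}", re.DOTALL)


def list_sections(tex: str) -> List[Tuple[int, str]]:
    """Return (1-based line number, title) for each section heading."""
    found = []
    for number, line in enumerate(tex.splitlines(), start=1):
        match = SECTION_RE.search(line)
        if match:
            found.append((number, match.group(1)))
    return found


def extract_abstract(tex: str) -> Optional[str]:
    """Return the abstract body, or None if no abstract environment exists."""
    match = ABSTRACT_RE.search(tex)
    return match.group(1).strip() if match else None
```

The line numbers from `list_sections` can be used directly as `offset` values when reading a priority section in chunks.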
@@ -0,0 +1,272 @@
1
+ ---
2
+ name: literature-survey
3
+ description: "Comprehensive literature survey (100+ papers). Searches, filters, clusters, and iterates for complete coverage. Use for: exploring new research areas, collecting papers systematically, building literature databases. NOT for: summarizing papers you have (use /write-review-paper), finding a specific paper (use arxiv_search), generating ideas (use /idea-generation)."
4
+ metadata:
5
+ {
6
+ "openclaw":
7
+ {
8
+ "emoji": "🔍",
9
+ },
10
+ }
11
+ ---
12
+
13
+ # Literature Survey
14
+
15
+ Comprehensive literature discovery workflow for a research domain. This skill searches broadly, filters by relevance, clusters by direction, and iterates to ensure complete coverage.
16
+
17
+ ## Architecture: Isolated Sub-agent
18
+
19
+ This survey runs in an **isolated sub-session** to avoid context pollution. The main session only receives the final report.
20
+
21
+ ```
22
+ Main Session
23
+
24
+ sessions_spawn(task: "Run the literature survey...", label: "literature-survey")
25
+
26
+ Sub-agent Session (isolated context)
27
+ ├── Phase 1: Generate search terms
28
+ ├── Phase 2: Batch search
29
+ ├── Phase 3: Relevance filtering
30
+ ├── Phase 4: Clustering
31
+ ├── Phase 5: Iterative discovery
32
+ └── Phase 6: Report generation
33
+
34
+ Return to main session: summary + file paths
35
+ ```
36
+
37
+ ---
38
+
39
+ ## When User Requests Literature Survey
40
+
41
+ **Step 1: Spawn isolated sub-agent**
42
+
43
+ When the user says things like:
44
+ - "Survey the literature in the [topic] field"
45
+ - "Help me collect papers related to [topic]"
46
+ - "Survey papers on [topic]"
47
+
48
+ Use `sessions_spawn` to run the survey in isolation:
49
+
50
+ ```
51
+ sessions_spawn({
52
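+ task: `You are a literature survey expert. Run a complete literature survey for the research topic "{TOPIC}".
53
+
54
+ ## Survey Goals
55
+ {USER_REQUIREMENTS}
56
+
57
+ ## Workflow
58
+
59
+ ### Phase 1: Generate Search Terms
60
+ Based on the research topic, generate 8-15 search-term combinations covering:
61
+ - Different phrasings of the core concepts
62
+ - Related techniques and methods
63
+ - Application scenarios
64
+ - English and Chinese keywords (where applicable)
65
+
66
+ Save the search terms to $WORKSPACE/survey/search_terms.json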
67
+
68
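+ ### Phase 2: Batch Search
69
+ For each search term, use the arxiv_search tool:
70
+ - max_results: 30-50 per query
71
+ - Merge and deduplicate results (by arxiv_id)
72
+ - Record which search terms each paper came from
73
+
74
+ Save the raw results to $WORKSPACE/survey/raw_results.json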
75
+
76
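+ ### Phase 3: Relevance Filtering
77
+ Read each paper's title and abstract and rate its relevance to "{TOPIC}":
78
+ - 5: Highly relevant, studies this topic directly
79
+ - 4: Relevant, involves key methods or applications
80
+ - 3: Partially relevant, usable as a reference
81
+ - 2: Marginally relevant
82
+ - 1: Not relevant
83
+
84
+ Keep papers with score >= 4.
85
+ Save the filtered results to $WORKSPACE/survey/filtered_papers.json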
86
+
87
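+ ### Phase 4: Clustering
88
+ Analyze the abstracts of the filtered papers and identify 3-6 research directions/subtopics.
89
+ Create a subfolder for each direction and assign papers to it: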
90
+
91
+ $WORKSPACE/papers/
92
+ ├── {direction-1}/
93
+ │ ├── paper_list.md
94
+ │ └── [arxiv_ids...]
95
+ ├── {direction-2}/
96
+ │ └── ...
97
+ └── uncategorized/
98
+
99
+ Save the clustering results to $WORKSPACE/survey/clusters.json
100
+
101
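+ ### Phase 5: Iterative Discovery (1-2 rounds)
102
+ Review the abstracts of high-scoring papers and identify:
103
+ - Names of newly mentioned methods
104
+ - Important cited works
105
+ - New keywords
106
+
107
+ If new directions are found, run additional searches and repeat Phases 2-4.
108
+ Iterate at most 2 rounds.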
109
+
110
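+ ### Phase 6: Generate Report
111
+ Create $WORKSPACE/survey/report.md:
112
+
113
+ # Literature Survey Report: {TOPIC}
114
+
115
+ ## Survey Summary
116
+ - Number of search terms: X
117
+ - Initial results: Y papers
118
+ - After filtering: Z papers
119
+ - Research directions: N
120
+
121
+ ## Distribution of Research Directions
122
+
123
+ ### Direction 1: [Name]
124
+ - Number of papers: X
125
+ - Representative works: [List]
126
+ - Key characteristics: [Description]
127
+
128
+ ### Direction 2: [Name]
129
+ ...
130
+
131
+ ## High-Impact Papers (Top 10)
132
+ | Rank | Title | Year | Relevance | Direction |
133
+ |-----|------|-----|-------|-----|
134
+ | 1 | ... | ... | 5 | ... |
135
+
136
+ ## Research Trends
137
+ [Observations based on the distribution of publication years]
138
+
139
+ ## Newly Discovered Directions
140
+ [Additional keywords and directions found during iteration]
141
+
142
+ ## Suggested Reading Order
143
+ 1. [Introductory papers]
144
+ 2. [Core method papers]
145
+ 3. [Latest advances]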
146
+
147
+ ---
148
+
149
+ When finished, report to the main session:
150
+ - Total number of papers found
151
+ - Research directions identified
152
+ - Location of the report file`,
153
+ label: "literature-survey-{TOPIC_SLUG}",
154
+ runTimeoutSeconds: 900,
155
+ cleanup: "keep"
156
+ })
157
+ ```
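The merge-and-deduplicate step of Phase 2 could be sketched in Python roughly as follows. The shape of the `arxiv_search` results (a list of dicts, each carrying an `arxiv_id` key) and the helper name `merge_results` are assumptions for illustration, not the tool's documented output.

```python
from typing import Dict, List


def merge_results(results_by_term: Dict[str, List[dict]]) -> List[dict]:
    """Merge per-term search results, deduplicating by arxiv_id.

    Each merged paper gets a `source_terms` list naming every search term
    that returned it, matching the field used in filtered_papers.json.
    """
    merged: Dict[str, dict] = {}
    for term, papers in results_by_term.items():
        for paper in papers:
            arxiv_id = paper["arxiv_id"]
            if arxiv_id not in merged:
                # First sighting: copy the record and start its term list.
                merged[arxiv_id] = {**paper, "source_terms": []}
            if term not in merged[arxiv_id]["source_terms"]:
                merged[arxiv_id]["source_terms"].append(term)
    return list(merged.values())
```

Keying on `arxiv_id` keeps the first copy of each record, so a paper returned by several queries appears once with all of its source terms attached.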
158
+
159
+ **Step 2: Wait and relay results**
160
+
161
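+ When the sub-agent finishes, it automatically announces its results to the main session.
162
+ Present a summary of the results to the user, including:
163
+ - Number of papers found
164
+ - Main research directions
165
+ - Location of the report file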
166
+
167
+ ---
168
+
169
+ ## Workspace Structure
170
+
171
+ ```
172
+ ~/.openclaw/workspace/projects/{project-id}/
173
+ ├── project.json
174
+ ├── survey/ # Survey process data
175
+ │ ├── search_terms.json # Search term list
176
+ │ ├── raw_results.json # Raw search results
177
+ │ ├── filtered_papers.json # Papers after filtering
178
+ │ ├── clusters.json # Clustering results
179
+ │ ├── iterations.log # Iteration log
180
+ │ └── report.md # Final report
181
+ ├── papers/ # Papers organized by direction
182
+ │ ├── {direction-1}/
183
+ │ │ ├── paper_list.md
184
+ │ │ └── 2401.12345/ # .tex source files
185
+ │ ├── {direction-2}/
186
+ │ └── uncategorized/
187
+ └── ideas/ # Output of the later idea-generation step
188
+ ```
189
+
190
+ ---
191
+
192
+ ## Data Schemas
193
+
194
+ ### search_terms.json
195
+ ```json
196
+ {
197
+ "topic": "battery life prediction",
198
+ "generated_at": "2024-01-15T10:00:00Z",
199
+ "terms": [
200
+ {"term": "battery remaining useful life", "category": "core"},
201
+ {"term": "lithium-ion degradation prediction", "category": "method"},
202
+ {"term": "SOH estimation neural network", "category": "technique"},
203
+ {"term": "EV battery health monitoring", "category": "application"}
204
+ ]
205
+ }
206
+ ```
207
+
208
+ ### filtered_papers.json
209
+ ```json
210
+ {
211
+ "filtered_at": "2024-01-15T10:30:00Z",
212
+ "total_raw": 245,
213
+ "total_filtered": 42,
214
+ "papers": [
215
+ {
216
+ "arxiv_id": "2401.12345",
217
+ "title": "...",
218
+ "abstract": "...",
219
+ "authors": ["..."],
220
+ "published": "2024-01-15",
221
+ "relevance_score": 5,
222
+ "source_terms": ["battery RUL", "degradation prediction"],
223
+ "notes": "Directly studies RUL prediction for lithium batteries"
224
+ }
225
+ ]
226
+ }
227
+ ```
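A document in this shape could be assembled from scored papers with a sketch like the one below. The `score >= 4` cutoff comes from Phase 3; the function name `build_filtered` and the choice to sort by score are assumptions added for illustration.

```python
from datetime import datetime, timezone
from typing import List


def build_filtered(scored_papers: List[dict], min_score: int = 4) -> dict:
    """Keep papers at or above `min_score` and wrap them in the schema above."""
    kept = [p for p in scored_papers if p["relevance_score"] >= min_score]
    # Highest relevance first; the sort is stable, so ties keep input order.
    kept.sort(key=lambda p: p["relevance_score"], reverse=True)
    return {
        "filtered_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "total_raw": len(scored_papers),
        "total_filtered": len(kept),
        "papers": kept,
    }
```

The result can be written with `json.dump(doc, handle, ensure_ascii=False, indent=2)` so that non-ASCII notes stay readable in the file.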
228
+
229
+ ### clusters.json
230
+ ```json
231
+ {
232
+ "clustered_at": "2024-01-15T11:00:00Z",
233
+ "clusters": [
234
+ {
235
+ "id": "data-driven",
236
+ "name": "Data-driven methods",
237
+ "description": "Methods that use machine learning / deep learning",
238
+ "paper_count": 15,
239
+ "paper_ids": ["2401.12345", "2401.12346", "..."],
240
+ "keywords": ["LSTM", "CNN", "transformer", "neural network"]
241
+ },
242
+ {
243
+ "id": "physics-based",
244
+ "name": "Physics-based methods",
245
+ "description": "Methods based on electrochemical mechanisms",
246
+ "paper_count": 8,
247
+ "paper_ids": ["..."]
248
+ }
249
+ ]
250
+ }
251
+ ```
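Because `paper_count` duplicates information already present in `paper_ids`, a small consistency check can catch drift between the two. The sketch below is one possible check under the schema shown above; `check_clusters` is a hypothetical helper, and routing unassigned papers to `uncategorized/` follows the folder layout in Phase 4.

```python
from typing import List, Set, Tuple


def check_clusters(clusters_doc: dict, filtered_ids: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (problems, uncategorized_ids) for a clusters.json document."""
    problems: List[str] = []
    assigned: Set[str] = set()
    for cluster in clusters_doc["clusters"]:
        ids = cluster["paper_ids"]
        if cluster["paper_count"] != len(ids):
            problems.append(
                f"{cluster['id']}: paper_count={cluster['paper_count']} "
                f"but {len(ids)} ids listed"
            )
        # Ids that never passed filtering point to a stale or mistyped entry.
        unknown = sorted(set(ids) - filtered_ids)
        if unknown:
            problems.append(f"{cluster['id']}: unknown ids {unknown}")
        assigned.update(ids)
    # Filtered papers that no cluster claimed belong in uncategorized/.
    uncategorized = sorted(filtered_ids - assigned)
    return problems, uncategorized
```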
252
+
253
+ ---
254
+
255
+ ## Quick Mode (Without Sub-agent)
256
+
257
+ For smaller surveys (< 50 papers), run the phases directly without spawning a sub-agent:
258
+
259
+ User: "Do a quick survey of [topic], no more than 30 papers"
260
+
261
+ → Run Phase 1-4 directly in main session
262
+ → Skip iteration
263
+ → Generate simplified report
264
+
265
+ ---
266
+
267
+ ## Commands
268
+
269
+ - "Survey the [topic] field" → Full survey with sub-agent
270
+ - "Quick survey of [topic]" → Quick mode, 30 papers max
271
+ - "Continue the previous survey" → Resume from existing survey data
272
+ - "Extend the survey to [new direction]" → Add new search terms and iterate
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: research-pipeline
3
- description: "End-to-end research automation: from idea to code implementation with literature review, planning, and iterative refinement. Use arxiv, github_search, and exec tools."
3
+ description: "End-to-end research automation: idea → literature → plan → implement → review → iterate. Use for: implementing a specific research idea, full ML research workflow. NOT for: just exploring literature (use /literature-survey), just generating ideas (use /idea-generation), just writing review (use /write-review-paper)."
4
4
  metadata:
5
5
  {
6
6
  "openclaw":