scientify 1.2.1 → 1.3.0

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "scientify",
3
- "version": "1.2.1",
3
+ "version": "1.3.0",
4
4
  "description": "Scientify - AI-powered research workflow automation for OpenClaw. Includes idea generation, literature review, research pipeline skills, and arxiv tool.",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
@@ -0,0 +1,129 @@
1
+ # Workspace Directory Specification
2
+
3
+ All Scientify skills share a unified project-based workspace structure.
4
+
5
+ ## Base Path
6
+
7
+ ```
8
+ ~/.openclaw/workspace/projects/
9
+ ├── .active # Current project ID (plain text)
10
+ ├── {project-id}/ # Each research topic has its own project
11
+ │ └── ...
12
+ └── {another-project}/
13
+ ```
14
+
15
+ ## Project Structure
16
+
17
+ ```
18
+ ~/.openclaw/workspace/projects/{project-id}/
19
+ ├── project.json # Project metadata
20
+ ├── task.json # Research task definition
21
+
22
+ ├── survey/ # /literature-survey outputs
23
+ │ ├── search_terms.json # Generated search keywords
24
+ │ ├── raw_results.json # All search results
25
+ │ ├── filtered_papers.json # Papers with relevance scores
26
+ │ ├── clusters.json # Clustered by research direction
27
+ │ └── report.md # Final survey report
28
+
29
+ ├── papers/ # Downloaded paper sources
30
+ │ ├── {direction-1}/ # Organized by cluster
31
+ │ │ ├── paper_list.md
32
+ │ │ └── {arxiv_id}/ # .tex source files
33
+ │ ├── {direction-2}/
34
+ │ └── uncategorized/
35
+
36
+ ├── repos/ # Cloned reference repositories
37
+ │ ├── {repo-name-1}/
38
+ │ └── {repo-name-2}/
39
+
40
+ ├── ideas/ # /idea-generation outputs
41
+ │ ├── gaps.md # Identified research gaps
42
+ │ ├── idea_1.md ... idea_5.md # Generated ideas
43
+ │ ├── selected_idea.md # Enhanced best idea
44
+ │ ├── implementation_report.md # Code mapping
45
+ │ └── summary.md # Final summary
46
+
47
+ ├── review/ # /write-review-paper outputs
48
+ │ ├── reading_plan.md # Prioritized reading list
49
+ │ ├── notes/ # Per-paper reading notes
50
+ │ │ └── {paper_id}.md
51
+ │ ├── comparison.md # Method comparison table
52
+ │ ├── timeline.md # Research timeline
53
+ │ ├── taxonomy.md # Classification system
54
+ │ ├── draft.md # Survey paper draft
55
+ │ └── bibliography.bib # References
56
+
57
+ ├── plan_res.md # /research-pipeline: implementation plan
58
+ ├── project/ # /research-pipeline: code implementation
59
+ │ ├── model/
60
+ │ ├── data/
61
+ │ ├── training/
62
+ │ ├── testing/
63
+ │ ├── run.py
64
+ │ └── requirements.txt
65
+ ├── iterations/ # Review iterations
66
+ │ ├── judge_v1.md
67
+ │ └── judge_v2.md
68
+ └── experiment_res.md # Final results
69
+ ```
70
+
71
+ ## Conventions
72
+
73
+ ### File Existence = Step Completion
74
+
75
+ Check output file before executing any step. If exists, skip.
76
+
77
+ Enables:
78
+ - **Crash recovery**: resume from last completed step
79
+ - **Incremental progress**: rerunning skips completed work
80
+ - **Transparency**: inspect progress by listing directory
81
+
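The "file existence = step completion" convention above can be sketched as a small guard around each step. This is an illustrative sketch only: `run_step` and the temporary-file rename are assumptions, not part of the Scientify package.

```python
from pathlib import Path
from typing import Callable


def run_step(output: Path, step: Callable[[], str]) -> bool:
    """Run `step` only if `output` does not exist yet (hypothetical helper).

    Returns True if the step ran, False if it was skipped.
    """
    if output.exists():
        return False  # file existence = step completion
    content = step()
    output.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: write under a temporary name first, so a crash mid-write
    # does not leave a partial file that looks like a completed step.
    tmp = output.with_name(output.name + ".tmp")
    tmp.write_text(content, encoding="utf-8")
    tmp.replace(output)
    return True
```

Rerunning a pipeline built from such guards skips every step whose output is already on disk, which is what gives crash recovery and incremental progress.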
82
+ ### Project Metadata
83
+
84
+ **project.json:**
85
+ ```json
86
+ {
87
+ "id": "battery-rul-prediction",
88
+ "name": "Battery RUL Prediction",
89
+ "created": "2024-01-15T10:00:00Z",
90
+ "topics": ["battery", "remaining useful life", "prediction"]
91
+ }
92
+ ```
93
+
94
+ **task.json:**
95
+ ```json
96
+ {
97
+ "domain": "battery health",
98
+ "focus": "RUL prediction using transformer",
99
+ "date_limit": "2024-01-01",
100
+ "created": "2024-01-15"
101
+ }
102
+ ```
103
+
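A reader of `task.json` might look like the sketch below. The spec shows the four fields but not which are required, so treating `domain` as mandatory and the rest as optional is an assumption; `Task` and `parse_task` are hypothetical names.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Task:
    domain: str
    focus: Optional[str] = None
    date_limit: Optional[date] = None
    created: Optional[date] = None


def parse_task(text: str) -> Task:
    """Parse the contents of task.json (hypothetical helper)."""
    raw = json.loads(text)
    if not raw.get("domain"):
        # Assumption: only `domain` is treated as required here.
        raise ValueError("task.json needs a non-empty 'domain'")

    def as_date(key: str) -> Optional[date]:
        value = raw.get(key)
        return date.fromisoformat(value) if value else None

    return Task(
        domain=raw["domain"],
        focus=raw.get("focus"),
        date_limit=as_date("date_limit"),
        created=as_date("created"),
    )
```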
104
+ ### Immutability
105
+
106
+ Once written, do NOT modify outputs unless user explicitly asks.
107
+ Exception: `project/` is mutable during implement-review-iterate loop.
108
+
109
+ ### Active Project
110
+
111
+ ```bash
112
+ # Read active project
113
+ cat ~/.openclaw/workspace/projects/.active
114
+
115
+ # Set active project
116
+ echo "battery-rul-prediction" > ~/.openclaw/workspace/projects/.active
117
+
118
+ # Set $WORKSPACE variable
119
+ WORKSPACE=~/.openclaw/workspace/projects/$(cat ~/.openclaw/workspace/projects/.active)
120
+ ```
121
+
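For tooling written in Python, the same lookup could be sketched as follows. `active_workspace` is a hypothetical helper; returning `None` when `.active` is missing, empty, or names a directory that does not exist is an assumption about how callers would want to handle those cases.

```python
from pathlib import Path
from typing import Optional

PROJECTS_ROOT = Path.home() / ".openclaw" / "workspace" / "projects"


def active_workspace(root: Path = PROJECTS_ROOT) -> Optional[Path]:
    """Return the active project's directory, or None if there is none."""
    marker = root / ".active"
    if not marker.is_file():
        return None
    # `echo ... > .active` leaves a trailing newline, so strip it.
    project_id = marker.read_text(encoding="utf-8").strip()
    if not project_id:
        return None
    workspace = root / project_id
    return workspace if workspace.is_dir() else None
```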
122
+ ## Skill Outputs Summary
123
+
124
+ | Skill | Primary Outputs |
125
+ |-------|-----------------|
126
+ | `/literature-survey` | `survey/`, `papers/` |
127
+ | `/idea-generation` | `ideas/` |
128
+ | `/write-review-paper` | `review/` |
129
+ | `/research-pipeline` | `project/`, `iterations/`, `experiment_res.md` |
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: idea-generation
3
- description: "Generate innovative research ideas from a topic. SEARCHES arXiv/GitHub automatically, downloads papers, analyzes literature, outputs 5 novel ideas with citations. Use for: 找研究方向, 生成创新点, find research gaps, propose new methods. NOT for: summarizing papers (use /write-review-paper), literature survey (use /literature-survey)."
3
+ description: "Generate 5 innovative research ideas from collected papers. Analyzes literature, identifies gaps, proposes novel methods with citations. Use for: 找研究方向, 生成创新点, find research gaps. Requires papers in workspace (run /literature-survey first if needed)."
4
4
  metadata:
5
5
  {
6
6
  "openclaw":
@@ -13,389 +13,168 @@ metadata:
13
13
 
14
14
  # Idea Generation
15
15
 
16
- End-to-end workflow for generating innovative research ideas from a research topic. This skill implements a full research idea generation pipeline:
16
+ Generate innovative research ideas grounded in literature analysis. This skill reads existing papers, identifies research gaps, and produces 5 distinct ideas with citations.
17
17
 
18
- 1. Search papers and code repositories
19
- 2. Select and download references
20
- 3. Analyze literature and codebases
21
- 4. Generate multiple ideas
22
- 5. Select and enhance the best idea
23
- 6. Map to code implementations
18
+ **Core principle:** Ideas MUST be grounded in actual papers, not generated from model knowledge.
24
19
 
25
- ---
26
-
27
- ## ⚠️ CRITICAL: EXECUTION MODE
28
-
29
- **AUTONOMOUS EXECUTION**: Execute ALL steps without asking for user confirmation at each step.
30
- - Do NOT ask "要我继续吗?" or "Should I proceed?"
31
- - You MAY spawn subagents for parallel tasks (e.g., downloading multiple papers)
32
- - Only ask user when there's a genuine ambiguity (e.g., which focus area to choose)
33
- - Checkpoints are for YOUR internal verification, not for asking user
34
-
35
- **Run the entire workflow from Step 1 to Step 8 automatically.**
20
+ **Workspace:** See `../_shared/workspace-spec.md` for directory structure. Outputs go to `$WORKSPACE/ideas/`.
36
21
 
37
22
  ---
38
23
 
39
- ## ⚠️ CRITICAL: MANDATORY TOOL USAGE
40
-
41
- **DO NOT generate ideas from your own knowledge.** All ideas MUST be grounded in actual literature research.
42
-
43
- ### Blocking Requirements
44
-
45
- 1. **MUST call `arxiv` tool** to search papers - NO EXCEPTIONS
46
- 2. **MUST call `github_search` tool** to find repositories - NO EXCEPTIONS
47
- 3. **MUST write `search_results.md`** BEFORE proceeding to idea generation
48
- 4. **MUST reference specific papers** (with arXiv IDs) in generated ideas
49
- 5. **MUST clone actual repos** before code survey
50
-
51
- ### Anti-Pattern: DO NOT DO THIS
24
+ ## Step 1: Check Workspace Resources
52
25
 
53
- User asks about "time series forecasting" → Agent immediately lists methods from memory
54
- ❌ Agent generates ideas without calling any search tools
55
- ❌ Agent skips to idea generation without `search_results.md` existing
26
+ First, check what resources already exist:
56
27
 
57
- ### Correct Pattern: DO THIS
58
-
59
- User asks about "time series forecasting" → Agent calls `arxiv` tool with query
60
- ✅ Agent calls `github_search` tool to find implementations
61
- ✅ Agent writes search results to file
62
- ✅ Agent reads downloaded papers before generating ideas
63
- ✅ Ideas reference specific papers by arXiv ID
64
-
65
- ---
66
-
67
- ## Workspace Convention (Project-based)
28
+ ```bash
29
+ # Check active project
30
+ cat ~/.openclaw/workspace/projects/.active 2>/dev/null
68
31
 
69
- **IMPORTANT**: Each research topic uses its own project directory. Agent auto-selects or creates projects.
32
+ # Check papers
33
+ ls ~/.openclaw/workspace/projects/*/papers/ 2>/dev/null | head -20
70
34
 
71
- ```
72
- ~/.openclaw/workspace/
73
- └── projects/
74
- ├── .active # Current project ID (plain text file)
75
- ├── nlp-summarization/ # Project A
76
- │ ├── project.json # Project metadata
77
- │ ├── task.json # Research task definition
78
- │ ├── search_results.md # Search results
79
- │ ├── prepare_res.md # Selected repos summary
80
- │ ├── papers/ # Downloaded papers
81
- │ ├── repos/ # Cloned repositories
82
- │ └── ideas/ # Generated ideas
83
- ├── image-segmentation/ # Project B
84
- │ └── ...
85
- └── ...
35
+ # Check survey results
36
+ cat ~/.openclaw/workspace/projects/*/survey/clusters.json 2>/dev/null | head -5
86
37
  ```
87
38
 
88
- **All paths are project-relative**: `~/.openclaw/workspace/projects/{project_id}/`
39
+ ### Assess Available Resources
89
40
 
90
- **File existence = step completion.** Skip steps whose output already exists.
41
+ | Resource | Location | Status |
42
+ |----------|----------|--------|
43
+ | Papers | `$WORKSPACE/papers/` | Count: ? |
44
+ | Survey clusters | `$WORKSPACE/survey/clusters.json` | Exists: Y/N |
45
+ | Repos | `$WORKSPACE/repos/` | Count: ? |
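The status column of this table could be filled in with a sketch like the one below. It assumes paper directories are named by new-style arXiv ID (as in `papers/{direction}/{arxiv_id}/`) and may sit at any depth under `papers/`; `assess_workspace` is a hypothetical name.

```python
import re
from pathlib import Path

# Assumption: paper directories are named like 2404.04429 or 2404.04429v2.
ARXIV_DIR = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


def assess_workspace(workspace: Path) -> dict:
    """Count papers and repos and check for survey clusters."""
    papers = workspace / "papers"
    repos = workspace / "repos"
    paper_count = 0
    if papers.is_dir():
        paper_count = sum(
            1 for p in papers.rglob("*") if p.is_dir() and ARXIV_DIR.match(p.name)
        )
    repo_count = 0
    if repos.is_dir():
        repo_count = sum(1 for p in repos.iterdir() if p.is_dir())
    return {
        "papers": paper_count,
        "clusters": (workspace / "survey" / "clusters.json").is_file(),
        "repos": repo_count,
    }
```

A caller can then compare the `papers` count against the five-paper threshold the skill uses to decide which prompt to show.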
91
46
 
92
47
  ---
93
48
 
94
- ## Step 0: Auto Project Management (REQUIRED)
49
+ ## Step 2: Ask User About Search Strategy
95
50
 
96
- **Autonomous - DO NOT ask user for confirmation.**
51
+ Based on workspace state, ask user:
97
52
 
98
- 1. Extract topic from user query → convert to kebab-case ID
99
- 2. Check `~/.openclaw/workspace/projects/` for existing match
100
- 3. Use existing or create new: `mkdir -p $PROJECT_ID/{papers,repos,ideas}`
101
- 4. Update `.active` file and set `$WORKSPACE` path
53
+ **If papers exist (≥5):**
54
+ > 📚 Found {N} papers in workspace from previous survey.
55
+ >
56
+ > Options:
57
+ > 1. **Use existing papers** - Generate ideas from current collection
58
+ > 2. **Search more** - Run `/literature-survey` to expand collection
59
+ > 3. **Quick search** - Add 5-10 more papers on specific topic
102
60
 
103
- > 📁 Using project: `{project_id}` (new/existing)
61
+ **If no papers:**
62
+ > 📭 No papers found in workspace.
63
+ >
64
+ > To generate grounded ideas, I need literature. Options:
65
+ > 1. **Run /literature-survey** - Comprehensive search (100+ papers, recommended)
66
+ > 2. **Quick search** - Fetch 10-15 papers on your topic now
67
+ > 3. **You provide papers** - Point me to existing PDFs/tex files
104
68
 
105
69
  ---
106
70
 
107
- ## Step 1: Parse Task
71
+ ## Step 3: Acquire Resources (if needed)
108
72
 
109
- Check `$WORKSPACE/task.json`. If missing, extract from user query:
73
+ ### Option A: Delegate to /literature-survey (Recommended)
110
74
 
111
- - **domain**: Research domain (e.g., "graph neural networks", "recommendation")
112
- - **focus** (optional): Specific problem or technique
113
- - **date_limit** (optional): Only consider papers before this date
114
-
115
- ```bash
116
- cat $WORKSPACE/task.json 2>/dev/null || echo "No task.json"
75
+ If user wants comprehensive search:
117
76
  ```
77
+ Please run: /literature-survey {topic}
118
78
 
119
- Create task.json:
120
- ```json
121
- {
122
- "domain": "graph neural networks",
123
- "focus": "scalable transformers for node classification",
124
- "date_limit": "2024-01-01",
125
- "created": "2024-XX-XX"
126
- }
127
- ```
128
-
129
- **Output:** `$WORKSPACE/task.json`
130
-
131
- ---
132
-
133
- ## Step 2: Search Papers and Code (MANDATORY)
134
-
135
- **⚠️ BLOCKING: You MUST complete this step before ANY idea generation.**
136
-
137
- ### 2.1 ArXiv Search (REQUIRED)
138
-
139
- **You MUST call the `arxiv` tool.** Example:
79
+ This will:
80
+ - Search 100+ papers systematically
81
+ - Filter by relevance (score ≥4)
82
+ - Cluster into research directions
83
+ - Save to $WORKSPACE/papers/
140
84
 
141
- ```
142
- Tool: arxiv
143
- Arguments:
144
- query: "text summarization transformer model"
145
- max_results: 10
146
- sort_by: "relevance"
147
- ```
148
-
149
- If `arxiv` tool is not available, use `WebSearch` with `site:arxiv.org`:
150
- ```
151
- Tool: WebSearch
152
- Arguments:
153
- query: "site:arxiv.org text summarization transformer model"
85
+ After survey completes, run /idea-generation again.
154
86
  ```
155
87
 
156
- ### 2.2 GitHub Search (REQUIRED)
88
+ ### Option B: Quick Search (5-10 papers)
157
89
 
158
- **You MUST call the `github_search` tool or search GitHub.** Example:
90
+ For fast iteration, do minimal search:
159
91
 
92
+ 1. **ArXiv search:**
160
93
  ```
161
- Tool: github_search
94
+ Tool: arxiv_search
162
95
  Arguments:
163
- query: "text summarization pytorch huggingface"
164
- sort: "stars"
165
- max_results: 20
166
- ```
167
-
168
- If `github_search` tool is not available, use `WebSearch`:
169
- ```
170
- Tool: WebSearch
171
- Arguments:
172
- query: "site:github.com text summarization pytorch stars:>100"
173
- ```
174
-
175
- ### 2.3 CHECKPOINT: Verify Search Completed
176
-
177
- Before proceeding, confirm:
178
- - [ ] Called arxiv/WebSearch for papers
179
- - [ ] Called github_search/WebSearch for repositories
180
- - [ ] Have at least 5 paper results
181
- - [ ] Have at least 5 repository results
182
-
183
- **If search returns 0 results, try different queries. DO NOT proceed without results.**
184
-
185
- ### 2.4 Compile Results
186
-
187
- Write to `$WORKSPACE/search_results.md`:
188
-
189
- ```markdown
190
- # Search Results
191
-
192
- ## Task
193
- - Domain: {domain}
194
- - Focus: {focus}
195
- - Date: {date}
196
-
197
- ## ArXiv Papers Found
198
-
199
- | # | Title | ArXiv ID | Year | Relevance |
200
- |---|-------|----------|------|-----------|
201
- | 1 | [Title](pdf_url) | 2401.xxxxx | 2024 | [Why relevant] |
202
- | 2 | ... | ... | ... | ... |
203
-
204
- ## GitHub Repositories Found
205
-
206
- | # | Repository | Stars | Language | Relevance |
207
- |---|------------|-------|----------|-----------|
208
- | 1 | [owner/repo](url) | 1.2k | Python | [Why relevant] |
209
- | 2 | ... | ... | ... | ... |
96
+ query: "{user_topic}"
97
+ max_results: 10
210
98
  ```
211
99
 
212
- **Output:** `$WORKSPACE/search_results.md`
213
-
214
- ---
215
-
216
- ## Step 3: Prepare - Select Repositories
217
-
218
- Read search results and select **at least 5** most valuable repositories.
219
-
220
- Selection criteria:
221
- - Direct implementation of relevant papers
222
- - High code quality (stars, documentation)
223
- - Active maintenance
224
- - Covers key techniques in the domain
225
-
226
- ### 3.1 Clone Selected Repos
227
-
100
+ 2. **Clone 3-5 reference repos:**
228
101
  ```bash
229
102
  mkdir -p $WORKSPACE/repos
230
- cd $WORKSPACE/repos
231
-
232
- # For each selected repo:
233
- git clone --depth 1 https://github.com/owner/repo1.git
234
- git clone --depth 1 https://github.com/owner/repo2.git
235
- # ... at least 5 repos
236
- ```
237
-
238
- ### 3.2 Document Selection
239
-
240
- Write to `$WORKSPACE/prepare_res.md`:
241
-
242
- ```markdown
243
- # Selected Reference Codebases
244
-
245
- ## Selection Rationale
246
- [Why these repos were chosen]
247
-
248
- ## Repositories
249
-
250
- ### 1. repo1
251
- - **URL**: https://github.com/owner/repo1
252
- - **Paper**: [Associated paper if any]
253
- - **Key Components**:
254
- - `model/` - Model architecture
255
- - `train.py` - Training loop
256
- - **Usage**: [How this will help implement our idea]
257
-
258
- ### 2. repo2
259
- ...
260
-
261
- ## Reference Papers
262
- Based on these repos, the key papers to read are:
263
- 1. [Paper Title 1] - ArXiv: 2401.xxxxx
264
- 2. [Paper Title 2] - ArXiv: 2401.xxxxx
265
- ...
103
+ git clone --depth 1 {repo_url} $WORKSPACE/repos/{name}
266
104
  ```
267
105
 
268
- **Output:** `$WORKSPACE/prepare_res.md` + `$WORKSPACE/repos/`
269
-
270
- ---
271
-
272
- ## Step 4: Download Papers
273
-
274
- For each paper referenced in prepare_res.md, download the source.
275
-
276
- **IMPORTANT: Download .tex source, NOT PDF.** .tex files are much easier for AI to read and extract information from.
277
-
278
- ### 4.1 Download .tex Source (RECOMMENDED - Use arxiv tool)
279
-
280
- Use the `arxiv` tool with `download: true` to automatically download and extract .tex sources:
281
-
282
- ```
283
- Tool: arxiv
284
- Arguments:
285
- query: "abstractive summarization long document"
286
- max_results: 10
287
- download: true
288
- output_dir: "$WORKSPACE/papers"
289
- ```
290
-
291
- The tool will:
292
- 1. Search for papers matching your query
293
- 2. Download .tex source from `https://arxiv.org/src/{arxiv_id}`
294
- 3. Extract tar.gz archives automatically
295
- 4. Fall back to PDF if .tex is unavailable
296
- 5. Return a `downloads` array showing what was downloaded
297
-
298
- **Output format:**
299
- ```json
300
- {
301
- "papers": [...],
302
- "downloads": [
303
- {"arxiv_id": "2404.04429", "format": "tex", "files": ["main.tex", "methods.tex"]},
304
- {"arxiv_id": "2308.03664", "format": "pdf", "files": ["2308.03664.pdf"], "error": "tex unavailable"}
305
- ],
306
- "output_dir": "$WORKSPACE/papers"
307
- }
308
- ```
309
-
310
- ### 4.2 Manual Download (Fallback)
311
-
312
- If the arxiv tool is unavailable, use bash:
106
+ 3. **Download paper sources:**
313
107
  ```bash
314
108
  mkdir -p $WORKSPACE/papers/{arxiv_id}
315
- cd $WORKSPACE/papers/{arxiv_id}
316
- curl -L "https://arxiv.org/src/{arxiv_id}" -o source.tar.gz
317
- tar -xzf source.tar.gz 2>/dev/null || mv source.tar.gz main.tex
109
+ curl -L "https://arxiv.org/src/{arxiv_id}" | tar -xz -C $WORKSPACE/papers/{arxiv_id}
318
110
  ```
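The one-line `curl | tar` pipeline assumes the download is a gzipped tarball. The removed manual-download step handled the case where it is not (`tar ... || mv source.tar.gz main.tex`), so a more defensive unpacking step might look like the sketch below. `unpack_source` and the `source.bin` fallback name are hypothetical, and the sketch assumes the payload has already been downloaded into memory.

```python
import gzip
import io
import tarfile
from pathlib import Path


def unpack_source(payload: bytes, dest: Path) -> str:
    """Unpack a downloaded source payload into `dest`; return the format found."""
    dest.mkdir(parents=True, exist_ok=True)
    try:
        with tarfile.open(fileobj=io.BytesIO(payload), mode="r:*") as archive:
            # Use the safer "data" extraction filter where this Python has it.
            kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
            archive.extractall(dest, **kwargs)
        return "tar"
    except tarfile.ReadError:
        pass  # not a tar archive; try a single gzipped file next
    try:
        text = gzip.decompress(payload)
    except OSError:
        # Neither tar nor gzip (for example a PDF): keep the raw bytes.
        (dest / "source.bin").write_bytes(payload)
        return "unknown"
    (dest / "main.tex").write_bytes(text)
    return "gzip"
```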
319
111
 
320
- ### 4.3 Document Downloads
321
-
322
- Write to `$WORKSPACE/papers/download_log.md`:
323
-
324
- ```markdown
325
- # Downloaded Papers
326
-
327
- | ArXiv ID | Title | Format | Status |
328
- |----------|-------|--------|--------|
329
- | 2404.04429 | Physics-Informed ML for Battery... | .tex | ✓ |
330
- | 2308.03664 | Two-stage Early Prediction... | .tex | ✓ |
331
- | 2401.99999 | Some Other Paper | .pdf | ✓ (tex unavailable) |
332
- ```
333
-
334
- **Output:** `$WORKSPACE/papers/`
335
-
336
112
  ---
337
113
 
338
- ## Step 5: Generate Ideas (5 Ideas)
114
+ ## Step 4: Analyze Literature
339
115
 
340
- **⚠️ BLOCKING: DO NOT start this step unless Steps 2-4 are complete.**
116
+ **Prerequisites:** At least 5 papers in `$WORKSPACE/papers/`
341
117
 
342
- ### Pre-requisite Checkpoint
118
+ ### 4.1 Read Papers
343
119
 
344
- Before generating ANY ideas, verify these files exist:
345
- - [ ] `$WORKSPACE/search_results.md` - search results from Step 2
346
- - [ ] `$WORKSPACE/prepare_res.md` - selected repos from Step 3
347
- - [ ] At least 3 papers downloaded in `$WORKSPACE/papers/`
120
+ For each paper, extract:
121
+ - Core contribution (1 sentence)
122
+ - Key method/formula
123
+ - Limitations mentioned
124
+ - Future work suggestions
348
125
 
349
- **If any file is missing, GO BACK and complete the previous steps.**
126
+ **Long papers (>50KB):** See `references/reading-long-papers.md`
350
127
 
351
- This is the core intellectual step. Generate **exactly 5 distinct innovative ideas**.
352
-
353
- **IMPORTANT: Ideas must be grounded in the literature you just read. Each idea MUST:**
354
- - Reference at least 2 specific papers by arXiv ID
355
- - Identify specific limitations from those papers
356
- - Propose improvements based on gaps found in the literature
357
-
358
- ### 5.1 Analyze Literature First (REQUIRED)
359
-
360
- For each paper in `papers/`:
361
- 1. Read thoroughly (especially: abstract, method, experiments, limitations)
362
- 2. Extract: core contribution, math formulas, limitations, future work
363
- 3. Note connections to other papers
364
-
365
- **Long Papers (>50KB):** See `references/reading-long-papers.md` for chunked reading strategy.
366
-
367
- For each repo in `repos/`:
368
- 1. Understand structure: `gen_code_tree_structure` equivalent
369
- 2. Identify key implementations
370
- 3. Note reusable components
371
-
372
- ### 5.2 Identify Research Gaps
128
+ ### 4.2 Identify Research Gaps
373
129
 
374
130
  Look for:
375
131
  - Common limitations across papers
376
- - Unexplored combinations of techniques
132
+ - Unexplored technique combinations
377
133
  - Scalability issues
378
134
  - Assumptions that could be relaxed
379
135
 
380
- ### 5.3 Generate 5 Ideas
136
+ Document gaps in `$WORKSPACE/ideas/gaps.md`:
137
+ ```markdown
138
+ # Research Gaps Identified
139
+
140
+ ## Gap 1: [Description]
141
+ - Mentioned in: [paper1], [paper2]
142
+ - Why important: ...
143
+
144
+ ## Gap 2: [Description]
145
+ ...
146
+ ```
147
+
148
+ ---
149
+
150
+ ## Step 5: Generate 5 Ideas
381
151
 
382
152
  Create `$WORKSPACE/ideas/idea_1.md` through `idea_5.md` using template in `references/idea-template.md`.
383
153
 
384
- **Key requirements:**
385
- - Each idea must cite ≥2 papers by arXiv ID
386
- - Use different strategies (see template): combination, simplification, generalization, constraint relaxation, architecture innovation
154
+ **Requirements:**
155
+ - Each idea cites ≥2 papers by arXiv ID
156
+ - Use different strategies:
387
157
 
388
- **❌ REJECTED if:** No arXiv IDs cited, or ideas not grounded in literature
158
+ | Idea | Strategy |
159
+ |------|----------|
160
+ | 1 | Combination - merge 2+ techniques |
161
+ | 2 | Simplification - reduce complexity |
162
+ | 3 | Generalization - extend to new domain |
163
+ | 4 | Constraint relaxation - remove assumption |
164
+ | 5 | Architecture innovation - new design |
389
165
 
390
- **Output:** `$WORKSPACE/ideas/idea_1.md` through `idea_5.md`
166
+ **❌ REJECTED if:** No arXiv IDs cited, or ideas not grounded in literature
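The citation requirement can be checked mechanically before an idea file is accepted. The sketch below is an assumption-laden illustration: it only recognizes new-style arXiv IDs (`YYMM.NNNNN`, optionally with a version suffix), it may match unrelated numbers of the same shape, and `cited_arxiv_ids` and `is_grounded` are hypothetical names.

```python
import re

# New-style arXiv identifiers such as 2404.04429 or 2308.03664v2.
ARXIV_ID = re.compile(r"\b\d{4}\.\d{4,5}(?:v\d+)?\b")


def cited_arxiv_ids(markdown: str) -> set:
    """Return the distinct arXiv IDs in the text, with version suffixes dropped."""
    return {m.group(0).split("v")[0] for m in ARXIV_ID.finditer(markdown)}


def is_grounded(markdown: str, minimum: int = 2) -> bool:
    """True if the idea cites at least `minimum` distinct arXiv IDs."""
    return len(cited_arxiv_ids(markdown)) >= minimum
```

A check like this only confirms that IDs are present; whether the idea actually builds on those papers still needs a reader.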
391
167
 
392
168
  ---
393
169
 
394
170
  ## Step 6: Select and Enhance Best Idea
395
171
 
396
- ### 6.1 Evaluate All Ideas
172
+ ### 6.1 Score All Ideas
397
173
 
398
- Score each idea on Novelty/Feasibility/Impact (1-5). Select highest total.
174
+ | Idea | Novelty | Feasibility | Impact | Total |
175
+ |------|---------|-------------|--------|-------|
176
+ | 1 | /5 | /5 | /5 | /15 |
177
+ | ... | | | | |
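Selection from the scoring table can be sketched as below. The 1-5 range per criterion comes from the previous version of this step; the tie-break rule (lowest idea number wins) and the name `select_best` are assumptions for illustration.

```python
from typing import Dict, Tuple

Scores = Tuple[int, int, int]  # (novelty, feasibility, impact), each 1-5


def select_best(scores: Dict[int, Scores]) -> int:
    """Return the idea number with the highest total score."""
    for idea, triple in scores.items():
        if len(triple) != 3 or any(not 1 <= s <= 5 for s in triple):
            raise ValueError(f"idea {idea}: each score must be in 1..5")
    # max() keeps the first maximum it sees, so iterating in sorted order
    # makes ties go to the lower idea number (an arbitrary choice).
    return max(sorted(scores), key=lambda idea: sum(scores[idea]))
```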
399
178
 
400
179
  ### 6.2 Enhance Selected Idea
401
180
 
@@ -404,62 +183,25 @@ Create `$WORKSPACE/ideas/selected_idea.md` with:
404
183
  - Architecture choices
405
184
  - Hyperparameters
406
185
  - Implementation roadmap
407
- - Failure modes & mitigations
408
-
409
- **Output:** `$WORKSPACE/ideas/selected_idea.md`
410
186
 
411
187
  ---
412
188
 
413
- ## Step 7: Code Survey - Map Idea to Implementations
189
+ ## Step 7: Code Survey
414
190
 
415
- Map each **atomic concept** in the selected idea to code in reference repos.
191
+ Map idea concepts to reference implementations.
416
192
 
417
- See `references/code-mapping.md` for detailed template.
418
-
419
- **Quick steps:**
420
- 1. Extract atomic concepts from `selected_idea.md`
421
- 2. Search repos: `grep -r "class.*Attention" $WORKSPACE/repos/`
422
- 3. Document mapping to `$WORKSPACE/ideas/implementation_report.md`
193
+ See `references/code-mapping.md` for template.
423
194
 
424
195
  **Output:** `$WORKSPACE/ideas/implementation_report.md`
425
196
 
426
197
  ---
427
198
 
428
- ## Step 8: Final Summary
199
+ ## Step 8: Summary
429
200
 
430
- Create `$WORKSPACE/ideas/summary.md` with:
431
- - Task overview (domain, focus)
432
- - Resources gathered (papers, repos count)
201
+ Create `$WORKSPACE/ideas/summary.md`:
433
202
  - All 5 ideas with scores
434
203
  - Selected idea details
435
- - Next steps: `/research-pipeline` or manual implementation
436
-
437
- **Output:** `$WORKSPACE/ideas/summary.md`
438
-
439
- ---
440
-
441
- ## Quality Checklist
442
-
443
- Before completing, verify:
444
-
445
- - [ ] At least 5 repos cloned in `repos/`
446
- - [ ] At least 3 papers downloaded in `papers/`
447
- - [ ] All 5 ideas are substantially different
448
- - [ ] Selected idea has complete math formulations
449
- - [ ] Implementation report covers ALL atomic concepts
450
- - [ ] Each concept has actual code reference (not placeholder)
451
- - [ ] Evaluation plan has specific datasets and metrics
452
-
453
- ---
454
-
455
- ## Integration with Other Skills
456
-
457
- **After idea-generation:**
458
- - `/research-pipeline` → Implement the selected idea
459
-
460
- **To gather more resources:**
461
- - `/literature-survey` → Comprehensive paper collection
462
- - `/write-review-paper` → Synthesize into review
204
+ - Next steps: `/research-pipeline` to implement
463
205
 
464
206
  ---
465
207
 
@@ -467,19 +209,15 @@ Before completing, verify:
467
209
 
468
210
  | User Says | Action |
469
211
  |-----------|--------|
470
- | "Generate research ideas for NLP" | Full workflow (Steps 1-8) |
471
- | "Search papers on X" | Steps 1-2 only |
472
- | "I have papers, generate ideas" | Skip to Step 5 |
473
- | "Enhance this idea: ..." | Skip to Step 6-7 |
474
- | "Map this idea to code" | Step 7 only |
212
+ | "Generate ideas for X" | Check workspace → ask strategy → generate |
213
+ | "I have papers, generate ideas" | Skip to Step 4 |
214
+ | "Enhance idea N" | Jump to Step 6 |
215
+ | "Map to code" | Jump to Step 7 |
475
216
 
476
217
  ---
477
218
 
478
- ## Batch Processing Rule
479
-
480
- If more than 10 papers/repos to analyze:
481
- 1. First pass: Quick scan all (abstract/README only)
482
- 2. Select top 5-7 for deep analysis
483
- 3. Generate ideas from deep analysis
219
+ ## Integration
484
220
 
485
- Do NOT process all resources with full detail - context will overflow.
221
+ - **Before:** `/literature-survey` to collect papers
222
+ - **After:** `/research-pipeline` to implement selected idea
223
+ - **Alternative:** `/write-review-paper` to write survey instead
@@ -14,6 +14,8 @@ metadata:
14
14
 
15
15
  Comprehensive literature discovery workflow for a research domain. This skill searches broadly, filters by relevance, clusters by direction, and iterates to ensure complete coverage.
16
16
 
17
+ **Workspace:** See `../_shared/workspace-spec.md` for directory structure. Outputs go to `$WORKSPACE/survey/` and `$WORKSPACE/papers/`.
18
+
17
19
  ## Architecture: Isolated Sub-agent
18
20
 
19
21
  This survey runs in an **isolated sub-session** to avoid context pollution. The main session only receives the final report.
@@ -13,44 +13,22 @@ metadata:
13
13
 
14
14
  # Research Pipeline
15
15
 
16
- Automate an end-to-end ML research workflow: idea -> literature search -> survey -> plan -> implement -> review -> iterate.
16
+ Automate an end-to-end ML research workflow: idea → literature → survey → plan → implement → review → iterate.
17
17
 
18
- All intermediate results live in a project-based workspace directory. **File existence = step completion.** If a step's output file already exists, skip that step and move on. This enables crash recovery and incremental progress.
18
+ **Workspace:** See `../_shared/workspace-spec.md` for directory structure. Outputs go to `$WORKSPACE/project/`, `$WORKSPACE/iterations/`.
19
19
 
20
- ---
20
+ **File existence = step completion.** Skip steps whose output already exists.
21
21
 
22
- ## Workspace Convention (Project-based)
22
+ ---
23
23
 
24
- **IMPORTANT**: This skill uses the same project-based workspace as `idea-generation`. Check or set the active project first.
24
+ ## Step 0: Check Active Project
25
25
 
26
- ### Check Active Project
27
26
  ```bash
28
27
  cat ~/.openclaw/workspace/projects/.active 2>/dev/null
29
28
  ```
30
29
 
31
- If a project is active, set `$WORKSPACE = ~/.openclaw/workspace/projects/{project_id}/`.
32
-
33
- If no active project exists, create one based on the research idea (see Step 1).
34
-
35
- ### Directory Structure
36
- ```
37
- $WORKSPACE/
38
- ├── project.json # Project metadata
39
- ├── task.json # Research task/idea definition
40
- ├── search_results.md # Search results (Step 2)
41
- ├── prepare_res.md # Selected repos (Step 3)
42
- ├── papers/ # Downloaded papers (Step 4)
43
- ├── repos/ # Cloned repositories (Step 3)
44
- ├── notes/ # Paper notes (Step 5)
45
- ├── survey_res.md # Literature survey (Step 5)
46
- ├── plan_res.md # Implementation plan (Step 6)
47
- ├── project/ # Code implementation (Step 7)
48
- ├── ml_res.md # Implementation report (Step 7)
49
- ├── iterations/ # Review iterations (Step 8-9)
50
- │ ├── judge_v1.md
51
- │ └── ...
52
- └── experiment_res.md # Final results (Step 10)
53
- ```
30
+ If active, set `$WORKSPACE = ~/.openclaw/workspace/projects/{project_id}/`.
31
+ If none, create based on research idea in Step 1.
54
32
 
55
33
  ---
56
34
 
@@ -1,81 +1,5 @@
1
- # Workspace Directory Specification
1
+ # Workspace Specification
2
2
 
3
- All research pipeline artifacts live in a `workspace/` directory. The location is either specified by the user or defaults to the current working directory plus `workspace/`.
3
+ **This file has moved to the shared location.**
4
4
 
5
- ## Directory Layout
6
-
7
- ```
8
- workspace/
9
- task.json # Input: research task definition
10
- search_results.md # Step 2: arxiv + github search results
11
- prepare_res.md # Step 3: selected repos and rationale
12
- survey_res.md # Step 5: synthesized literature survey
13
- plan_res.md # Step 6: four-part implementation plan
14
- ml_res.md # Step 7: implementation report
15
- experiment_res.md # Step 10: full training results
16
-
17
- repos/ # Step 3: cloned reference repositories
18
- repo-name-1/
19
- repo-name-2/
20
-
21
- papers/ # Step 4: downloaded paper sources
22
- 2401.12345.tex
23
- 2401.67890.tex
24
-
25
- notes/ # Step 5: per-paper survey notes
26
- paper_001.md
27
- paper_002.md
28
-
29
- iterations/ # Steps 8-9: review history
30
- judge_v1.md
31
- judge_v2.md
32
-
33
- project/ # Step 7: implementation code
34
- model/
35
- data/
36
- training/
37
- testing/
38
- utils/
39
- run.py
40
- requirements.txt
41
- ```
42
-
43
- ## Conventions
44
-
45
- ### File Existence = Step Completion
46
-
47
- The research pipeline uses file existence as the checkpoint mechanism. Before executing any step, check whether its output file already exists. If it does, skip the step.
48
-
49
- This enables:
50
- - **Crash recovery**: resume from the last completed step.
51
- - **Incremental progress**: re-running the pipeline skips completed work.
52
- - **Transparency**: a human can inspect progress by listing the directory.
53
-
54
- ### Naming Rules
55
-
56
- - Markdown files (`.md`) for human-readable outputs.
57
- - JSON files (`.json`) for structured data (task definition).
58
- - Paper notes use sequential numbering: `paper_001.md`, `paper_002.md`.
59
- - Review iterations use version numbering: `judge_v1.md`, `judge_v2.md`.
60
-
61
- ### Immutability
62
-
63
- Once a step's output is written, do NOT modify it unless the user explicitly asks. If a step needs to be re-done, delete the output file first, then re-execute.
64
-
65
- Exception: `workspace/project/` is mutable during the implement-review-iterate loop (Steps 7-9).
66
-
67
- ### task.json Schema
68
-
69
- ```json
70
- {
71
- "idea": "A 1-3 sentence description of the research idea",
72
- "references": ["2401.12345", "paper title string"],
73
- "domain": "recommendation systems",
74
- "date_limit": "2024-01-01"
75
- }
76
- ```
77
-
78
- - `idea` (required): The core research idea to implement.
79
- - `references` (optional): ArXiv IDs or paper titles as starting points.
80
- - `domain` (optional): Research domain for focused searching.
81
- - `date_limit` (optional): Only consider papers published after this date.
5
+ See: `../../_shared/workspace-spec.md` for the unified workspace specification used by all Scientify skills.
@@ -14,6 +14,8 @@ metadata:
14
14
 
15
15
  Guide for writing a structured literature review or survey paper from papers you've already collected. This skill helps with reading strategy, note organization, and academic writing.
16
16
 
17
+ **Workspace:** See `../_shared/workspace-spec.md` for directory structure. Outputs go to `$WORKSPACE/review/`.
18
+
17
19
  ## Prerequisites
18
20
 
19
21
  Before starting, ensure you have: