scientify 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,261 @@
---
name: literature-review
description: "Generate reading notes and summaries from EXISTING papers (PDF/.tex files the user already has). Use for: summarizing papers, creating reading notes, writing a literature review section. Does NOT search for new papers or generate research ideas."
metadata:
  {
    "openclaw":
      {
        "emoji": "📖"
      }
  }
---

# Literature Review

Generate structured notes and synthesis documents from academic papers. Use this skill when the user wants to:
- Summarize papers they've collected
- Create reading notes for a research topic
- Write a literature review section
- Compare methods across multiple papers

## Workspace Convention (Project-based)

**IMPORTANT**: OpenClaw uses project-based workspaces. Each research topic has its own project directory.

### Check Active Project First

Before starting, check the active project:
```bash
cat ~/.openclaw/workspace/projects/.active 2>/dev/null
```

If a project is active, use `$WORKSPACE = ~/.openclaw/workspace/projects/{project_id}/`.

If no active project, use the flat structure: `~/.openclaw/workspace/`.
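The two cases above can be combined into one resolution step. A minimal Python sketch (`resolve_workspace` is a hypothetical helper; it assumes `.active` contains a bare project id):

```python
import os

def resolve_workspace(root=None):
    # Active project dir if projects/.active names one, else the flat workspace
    root = root or os.path.expanduser("~/.openclaw/workspace")
    active_file = os.path.join(root, "projects", ".active")
    if os.path.isfile(active_file):
        with open(active_file) as f:
            project_id = f.read().strip()
        if project_id:
            return os.path.join(root, "projects", project_id)
    return root
```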

### Project-based Structure (Recommended)

```
~/.openclaw/workspace/projects/{project_id}/
├── project.json            # Project metadata
├── papers/                 # Downloaded PDFs/tex files
│   ├── 2401.12345/
│   │   └── main.tex
│   └── ...
└── literature/             # Generated outputs
    ├── notes/              # Per-paper notes
    │   ├── 2401.12345.md
    │   └── ...
    ├── synthesis.md        # Cross-paper synthesis
    ├── bibliography.bib    # BibTeX entries
    └── review_draft.md     # Optional: formatted review
```

### Flat Structure (Fallback)

```
~/.openclaw/workspace/
├── papers/
└── literature/
    ├── notes/
    ├── synthesis.md
    └── ...
```

**File existence = step completion.** Skip steps whose output already exists.

**In the steps below**, `$WORKSPACE` refers to the active project directory or `~/.openclaw/workspace/` if no project is active.

## Step 1: Gather Papers

Check what papers are available:

1. **Check active project first**: `cat ~/.openclaw/workspace/projects/.active`
2. **Look in project papers directory**: `ls -la $WORKSPACE/papers/`
3. **Check user input**: did the user provide URLs or arXiv IDs?

If no papers are found, ask the user to provide:
- arXiv IDs (e.g., "2401.12345")
- PDF URLs
- Local file paths

## Step 2: Read and Annotate Each Paper

For each paper, create `$WORKSPACE/literature/notes/<paper_id>.md`:

First, ensure the output directory exists:
```bash
mkdir -p "$WORKSPACE/literature/notes"
```

````markdown
# [Paper Title]

**ArXiv/DOI**: [id]
**Authors**: [list]
**Year**: [year]
**Venue**: [conference/journal if known]

## TL;DR
[1-2 sentence summary of the main contribution]

## Problem Statement
[What problem does this paper address?]

## Method
[Key approach, algorithm, or framework]

### Core Idea
[The central insight or innovation]

### Technical Details
[Important formulas, architectures, or algorithms]

```latex
[Key equations if applicable]
```

## Experiments
- **Datasets**: [list]
- **Baselines**: [list]
- **Main Results**: [key numbers]

## Strengths
- [strength 1]
- [strength 2]

## Weaknesses / Limitations
- [limitation 1]
- [limitation 2]

## Relevance to My Research
[How does this paper relate to the user's work? Leave blank if unknown]

## Key Quotes
> "[Important quote from the paper]" (Section X)

## References to Follow
- [Paper A]: [why interesting]
- [Paper B]: [why interesting]
````

### Reading Strategy by Format

| Format | Method |
|--------|--------|
| `.tex` | Use `read` directly. Search for `\section`, `\begin{equation}` |
| `.pdf` | Use `read` (OpenClaw supports PDF). Focus on abstract, intro, method, experiments |
| URL | Use `web_fetch` to get content, then summarize |
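For `.tex` sources, the `\section` scan from the table above can be sketched as (a hypothetical helper that assumes standard LaTeX sectioning commands):

```python
import re

def list_sections(tex: str) -> list:
    # Capture \section{...}, \subsection{...}, \subsubsection{...}, starred or not
    return re.findall(r"\\(?:sub)*section\*?\{([^}]*)\}", tex)
```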

### Quality Checklist

Before finishing a note, verify:
- [ ] TL;DR captures the main contribution
- [ ] Method section explains the approach clearly
- [ ] At least 2 strengths and 2 limitations identified
- [ ] Key equations/algorithms included if applicable

## Step 3: Generate BibTeX

Create `$WORKSPACE/literature/bibliography.bib`:

```bibtex
@article{author2024title,
  title={Full Paper Title},
  author={Last, First and Last2, First2},
  journal={arXiv preprint arXiv:2401.12345},
  year={2024}
}
```

For arXiv papers, use this format. For published papers, include the venue, volume, and pages.
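Entry generation can be sketched as (a minimal sketch; `arxiv_bibtex` is a hypothetical helper that assumes authors are given as "Last, First" strings):

```python
def arxiv_bibtex(arxiv_id: str, title: str, authors: list, year: int) -> str:
    # Citation key: first author's last name + year + first title word
    first_author_last = authors[0].split(",")[0].strip().lower().replace(" ", "")
    first_word = title.split()[0].lower()
    key = f"{first_author_last}{year}{first_word}"
    return "\n".join([
        f"@article{{{key},",
        f"  title={{{title}}},",
        f"  author={{{' and '.join(authors)}}},",
        f"  journal={{arXiv preprint arXiv:{arxiv_id}}},",
        f"  year={{{year}}}",
        "}",
    ])
```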

## Step 4: Synthesize Across Papers

Create `$WORKSPACE/literature/synthesis.md`:

```markdown
# Literature Synthesis: [Topic]

## Overview
[Brief introduction to the research area]

## Taxonomy of Approaches

### Category A: [Name]
Papers: [list]
Key characteristics: [describe]

### Category B: [Name]
Papers: [list]
Key characteristics: [describe]

## Comparison Table

| Paper | Method | Dataset | Key Metric | Result |
|-------|--------|---------|------------|--------|
| [A]   | ...    | ...     | ...        | ...    |
| [B]   | ...    | ...     | ...        | ...    |

## Evolution of Ideas
[How has the field progressed? What are the trends?]

## Open Problems
1. [Gap 1]
2. [Gap 2]

## Recommendations
[Which papers to read first? Which approaches are most promising?]
```

## Step 5 (Optional): Draft Literature Review

If the user requests a formal review, create `$WORKSPACE/literature/review_draft.md`:

```markdown
# Literature Review: [Topic]

## 1. Introduction
[Context and motivation for the review]

## 2. Background
[Essential concepts the reader needs]

## 3. Survey of Methods

### 3.1 [Category A]
[Describe approaches in this category, cite papers]

### 3.2 [Category B]
[Describe approaches in this category, cite papers]

## 4. Empirical Comparison
[Summarize experimental findings across papers]

## 5. Discussion
[Trends, gaps, and future directions]

## 6. Conclusion
[Key takeaways]

## References
[BibTeX citations]
```

## Batch Processing

If reviewing more than 10 papers:
1. First pass: Generate TL;DRs only, for all papers
2. User selects which papers need full notes
3. Second pass: Full notes for the selected papers
4. Final pass: Synthesis

Do NOT process all papers in full detail in a single session; the context window will overflow.

## Commands

The user can say:
- "Review these papers" → Full workflow
- "Just summarize [paper]" → Single-paper note
- "Compare [paper A] and [paper B]" → Focused comparison
- "Write a literature review on [topic]" → Full review draft
@@ -0,0 +1,267 @@
---
name: research-pipeline
description: "End-to-end research automation: from idea to code implementation, with literature review, planning, and iterative refinement. Uses the arxiv, github_search, and exec tools."
metadata:
  {
    "openclaw":
      {
        "emoji": "🔬",
        "requires": { "bins": ["git", "python3"] }
      }
  }
---

# Research Pipeline

Automate an end-to-end ML research workflow: idea -> literature search -> survey -> plan -> implement -> review -> iterate.

All intermediate results live in a project-based workspace directory. **File existence = step completion.** If a step's output file already exists, skip that step and move on. This enables crash recovery and incremental progress.

---

## Workspace Convention (Project-based)

**IMPORTANT**: This skill uses the same project-based workspace as `idea-generation`. Check or set the active project first.

### Check Active Project
```bash
cat ~/.openclaw/workspace/projects/.active 2>/dev/null
```

If a project is active, set `$WORKSPACE = ~/.openclaw/workspace/projects/{project_id}/`.

If no active project exists, create one based on the research idea (see Step 1).

### Directory Structure
```
$WORKSPACE/
├── project.json         # Project metadata
├── task.json            # Research task/idea definition
├── search_results.md    # Search results (Step 2)
├── prepare_res.md       # Selected repos (Step 3)
├── papers/              # Downloaded papers (Step 4)
├── repos/               # Cloned repositories (Step 3)
├── notes/               # Paper notes (Step 5)
├── survey_res.md        # Literature survey (Step 5)
├── plan_res.md          # Implementation plan (Step 6)
├── project/             # Code implementation (Step 7)
├── ml_res.md            # Implementation report (Step 7)
├── iterations/          # Review iterations (Steps 8-9)
│   ├── judge_v1.md
│   └── ...
└── experiment_res.md    # Final results (Step 10)
```

---

## Step 1: Parse Task

Read `$WORKSPACE/task.json`. If it does not exist, ask the user for:

- **idea**: A description of the research idea (1-3 sentences).
- **references** (optional): arXiv IDs or paper titles as starting points.
- **domain** (optional): e.g. "recommendation systems", "NLP", "computer vision".

Write the result to `$WORKSPACE/task.json`:

```json
{
  "idea": "...",
  "references": ["2401.12345", "..."],
  "domain": "...",
  "date_limit": "2024-01-01"
}
```

**Output:** `$WORKSPACE/task.json`
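The read-or-prompt logic can be sketched as (a minimal sketch; `load_task` is a hypothetical helper returning `None` when the user still needs to be asked):

```python
import json
import os

def load_task(workspace: str):
    # Parsed task definition, or None if Step 1 still needs user input
    path = os.path.join(workspace, "task.json")
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)
```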

## Step 2: Search

Use the `arxiv` tool to search for 5-10 related papers based on the idea and any reference paper titles. Use the `github_search` tool to find related repositories.

Combine results into a markdown report:

```
## ArXiv Papers
- [title](pdf_url) — arxiv_id — summary of relevance

## GitHub Repositories
- [repo_name](url) — stars — language — summary of relevance
```

**Output:** `$WORKSPACE/search_results.md`

## Step 3: Prepare References

Read `$WORKSPACE/search_results.md`. Select 3-5 of the most relevant repositories.

For each selected repo, clone it into `$WORKSPACE/repos/`:

```bash
git clone --depth 1 <url> $WORKSPACE/repos/<repo_name>
```

Write a summary of selected repos and their relevance to the idea.

**Output:** `$WORKSPACE/prepare_res.md`
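The clone step can be sketched as (a minimal sketch; `repo_name` and `clone_repos` are hypothetical helpers that assume GitHub-style URLs):

```python
import subprocess

def repo_name(url: str) -> str:
    # "https://github.com/user/repo.git" -> "repo"
    return url.rstrip("/").removesuffix(".git").rsplit("/", 1)[-1]

def clone_repos(urls: list, dest_dir: str) -> None:
    # Shallow-clone each selected repository into dest_dir/<repo_name>
    for url in urls:
        subprocess.run(
            ["git", "clone", "--depth", "1", url, f"{dest_dir}/{repo_name(url)}"],
            check=True,
        )
```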

## Step 4: Download Papers

For each important paper from Step 2, use the `arxiv` tool with `download: true` and `output_dir: "$WORKSPACE/papers/"` to get .tex source files.

If a download fails for any paper, note the failure and continue. The survey step can work with abstracts alone.

**Output:** `$WORKSPACE/papers/*.tex` (or `.md` summaries if .tex unavailable)

## Step 5: Literature Survey

This is the most intellectually demanding step. Read `references/prompts/survey.md` for detailed guidance.

For each paper:

1. Read the .tex source (or abstract) thoroughly.
2. Extract: core method, mathematical formulas, key contributions.
3. Read the corresponding reference codebase in `$WORKSPACE/repos/`.
4. Map math formulas to code implementations.
5. Write structured notes to `$WORKSPACE/notes/paper_NNN.md`.

Each note file should contain:

````markdown
# [Paper Title]

## Core Method
...

## Math Formulas
...

## Code Implementation
File: repos/<repo>/path/to/file.py
```python
# relevant code excerpt
```

## Key Insights
...
````

After all papers are surveyed, write a synthesis combining all notes.

**Output:** `$WORKSPACE/notes/paper_*.md` + `$WORKSPACE/survey_res.md`

## Step 6: Implementation Plan

Read `references/prompts/plan.md` for detailed guidance.

Based on `survey_res.md`, `prepare_res.md`, and `task.json`, create a four-part plan:

1. **Dataset Plan**: data source, loading pipeline, preprocessing, dataloader design.
2. **Model Plan**: architecture, math formulas to implement, reference code to adapt.
3. **Training Plan**: loss functions, optimizer, hyperparameters, monitoring.
4. **Testing Plan**: metrics, evaluation protocol, baselines.

**Output:** `$WORKSPACE/plan_res.md`

## Step 7: Implement

Read `references/prompts/implement.md` for detailed guidance.

Create a self-contained project in `$WORKSPACE/project/`:

```
$WORKSPACE/project/
  model/            # model architecture
  data/             # data loading and preprocessing
  training/         # training loop and configs
  testing/          # evaluation scripts
  utils/            # shared utilities
  run.py            # main entry point
  requirements.txt
```

**Critical rules:**

- Do NOT import directly from `$WORKSPACE/repos/`. Adapt and rewrite code.
- Implement EVERY component from `plan_res.md`.
- Use real datasets, not toy data.
- First run: 2 epochs only (quick validation).

Execute:

```bash
cd $WORKSPACE/project && pip install -r requirements.txt && python run.py --epochs 2
```

If using GPU containers, ensure the sandbox has `gpus` and `shmSize` configured.

**Output:** `$WORKSPACE/project/` (code) + `$WORKSPACE/ml_res.md` (implementation report)

## Step 8: Review

Read `references/prompts/review.md` for detailed guidance.

Review the implementation against:

- Each atomic idea from `survey_res.md`: is the math correctly translated to code?
- The plan from `plan_res.md`: are all components present?
- Code quality: no toy implementations, proper error handling, correct data pipeline.

Write a structured review:

```markdown
# Review v1

## Verdict: PASS / NEEDS_REVISION

## Checklist
- [ ] Dataset loading matches plan
- [ ] Model architecture matches formulas
- [ ] Loss function correct
- [ ] Training loop implemented correctly
- [ ] Evaluation metrics correct

## Issues (if NEEDS_REVISION)
1. Issue description → suggested fix
2. ...
```

**Output:** `$WORKSPACE/iterations/judge_v1.md`
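Later steps need to read the verdict back out of the review file. A minimal sketch (`verdict` is a hypothetical helper; it assumes the judge writes a single value on the `## Verdict:` line):

```python
def verdict(review_md: str) -> str:
    # Extract the verdict from a judge_vN.md review; default to NEEDS_REVISION
    for line in review_md.splitlines():
        if line.startswith("## Verdict:"):
            return line.split(":", 1)[1].strip()
    return "NEEDS_REVISION"
```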

## Step 9: Iterate

If the review verdict is `NEEDS_REVISION`:

1. Read `$WORKSPACE/iterations/judge_vN.md` for the latest suggestions.
2. Fix each issue in `$WORKSPACE/project/`.
3. Re-run the 2-epoch validation.
4. Write a new review to `$WORKSPACE/iterations/judge_v(N+1).md`.
5. Repeat until `PASS` or 3 iterations are reached.

If 3 iterations are exhausted without a PASS, summarize the remaining issues and ask the user for guidance.

**Output:** `$WORKSPACE/iterations/judge_v*.md` (review history)

## Step 10: Full Training

Once the review passes:

1. Update the epoch count in `run.py` to the full training value.
2. Execute the full training run.
3. Collect and analyze the results.

**Output:** `$WORKSPACE/experiment_res.md`

## Batch Processing Rule

When you need to apply the same LLM operation to more than 10 files (e.g., summarizing all papers), do NOT process them one by one in conversation. Instead, write a script to handle them in batch.
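The batching itself can be sketched as (a minimal sketch; `chunk` and `paper_batches` are hypothetical helpers, and the per-batch summarization depends on the tooling available):

```python
from pathlib import Path

def chunk(items: list, size: int = 10) -> list:
    # Split a list into fixed-size batches
    return [items[i:i + size] for i in range(0, len(items), size)]

def paper_batches(paper_dir: str, size: int = 10) -> list:
    # Batch all downloaded .tex sources so each script run stays small
    return chunk(sorted(Path(paper_dir).glob("**/*.tex")), size)
```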

## Recovery

If the session crashes or context fills up:

1. List files in `$WORKSPACE/` to see which steps have completed.
2. Read the most recent output file to understand the current state.
3. Resume from the first missing output file.
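The resume check can be sketched as (a minimal sketch; `next_step` is a hypothetical helper, and the step/output pairs follow the directory structure above):

```python
import os

# Step -> expected output file, in pipeline order (subset shown)
STEP_OUTPUTS = [
    ("parse_task", "task.json"),
    ("search", "search_results.md"),
    ("prepare", "prepare_res.md"),
    ("survey", "survey_res.md"),
    ("plan", "plan_res.md"),
    ("implement", "ml_res.md"),
    ("full_training", "experiment_res.md"),
]

def next_step(workspace: str):
    # Resume from the first step whose output file is missing; None if all done
    for step, output in STEP_OUTPUTS:
        if not os.path.exists(os.path.join(workspace, output)):
            return step
    return None
```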

Never re-do a step whose output file already exists unless the user explicitly asks.