scientify 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +243 -0
- package/README.zh.md +244 -0
- package/dist/index.d.ts +3 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +51 -0
- package/dist/index.js.map +1 -0
- package/dist/src/commands.d.ts +26 -0
- package/dist/src/commands.d.ts.map +1 -0
- package/dist/src/commands.js +261 -0
- package/dist/src/commands.js.map +1 -0
- package/dist/src/tools/arxiv-tool.d.ts +26 -0
- package/dist/src/tools/arxiv-tool.d.ts.map +1 -0
- package/dist/src/tools/arxiv-tool.js +258 -0
- package/dist/src/tools/arxiv-tool.js.map +1 -0
- package/package.json +47 -0
- package/skills/arxiv/SKILL.md +138 -0
- package/skills/idea-generation/SKILL.md +751 -0
- package/skills/literature-review/SKILL.md +261 -0
- package/skills/research-pipeline/SKILL.md +267 -0
package/skills/literature-review/SKILL.md
@@ -0,0 +1,261 @@
---
name: literature-review
description: "Generate reading notes and summaries from EXISTING papers (PDF/.tex files the user already has). Use for: summarize papers, create reading notes, write a literature review section. Does NOT search for new papers or generate research ideas."
metadata:
  {
    "openclaw":
      {
        "emoji": "📖",
      },
  }
---

# Literature Review

Generate structured notes and synthesis documents from academic papers. Use this skill when the user wants to:
- Summarize papers they've collected
- Create reading notes for a research topic
- Write a literature review section
- Compare methods across multiple papers

## Workspace Convention (Project-based)

**IMPORTANT**: OpenClaw uses project-based workspaces. Each research topic has its own project directory.

### Check Active Project First

Before starting, check the active project:
```bash
cat ~/.openclaw/workspace/projects/.active 2>/dev/null
```

If a project is active, use `$WORKSPACE = ~/.openclaw/workspace/projects/{project_id}/`.

If no project is active, use the flat structure: `~/.openclaw/workspace/`.
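The two cases above collapse into one lookup. A minimal sketch, assuming the layout described here (`resolve_workspace` and the `WORKSPACE_ROOT` override are illustrative, not part of OpenClaw):

```shell
# Resolve $WORKSPACE from the active-project marker, falling back to the flat layout.
# WORKSPACE_ROOT is overridable so the logic can be exercised outside ~/.openclaw.
resolve_workspace() {
  root="${WORKSPACE_ROOT:-$HOME/.openclaw/workspace}"
  if [ -s "$root/projects/.active" ]; then
    printf '%s/projects/%s\n' "$root" "$(cat "$root/projects/.active")"
  else
    printf '%s\n' "$root"
  fi
}
WORKSPACE="$(resolve_workspace)"
```

Note that `$(cat ...)` strips the trailing newline from `.active`, so the project id joins cleanly into the path.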

### Project-based Structure (Recommended)

```
~/.openclaw/workspace/projects/{project-id}/
├── project.json          # Project metadata
├── papers/               # Downloaded PDFs/tex files
│   ├── 2401.12345/
│   │   └── main.tex
│   └── ...
└── literature/           # Generated outputs
    ├── notes/            # Per-paper notes
    │   ├── 2401.12345.md
    │   └── ...
    ├── synthesis.md      # Cross-paper synthesis
    ├── bibliography.bib  # BibTeX entries
    └── review_draft.md   # Optional: formatted review
```

### Flat Structure (Fallback)

```
~/.openclaw/workspace/
├── papers/
└── literature/
    ├── notes/
    ├── synthesis.md
    └── ...
```

**File existence = step completion.** Skip steps whose output already exists.

**In the steps below**, `$WORKSPACE` refers to the active project directory, or `~/.openclaw/workspace/` if no project is active.

## Step 1: Gather Papers

Check what papers are available:

1. **Check the active project first**: `cat ~/.openclaw/workspace/projects/.active`
2. **Look in the project papers directory**: `ls -la $WORKSPACE/papers/`
3. Check whether the user provided URLs or arXiv IDs

If no papers are found, ask the user to provide:
- ArXiv IDs (e.g., "2401.12345")
- PDF URLs
- Local file paths

## Step 2: Read and Annotate Each Paper

For each paper, create `$WORKSPACE/literature/notes/<paper_id>.md`.

First, ensure the output directory exists:
```bash
mkdir -p "$WORKSPACE/literature/notes"
```

Then fill in this template:

````markdown
# [Paper Title]

**ArXiv/DOI**: [id]
**Authors**: [list]
**Year**: [year]
**Venue**: [conference/journal if known]

## TL;DR
[1-2 sentence summary of the main contribution]

## Problem Statement
[What problem does this paper address?]

## Method
[Key approach, algorithm, or framework]

### Core Idea
[The central insight or innovation]

### Technical Details
[Important formulas, architectures, or algorithms]

```latex
[Key equations if applicable]
```

## Experiments
- **Datasets**: [list]
- **Baselines**: [list]
- **Main Results**: [key numbers]

## Strengths
- [strength 1]
- [strength 2]

## Weaknesses / Limitations
- [limitation 1]
- [limitation 2]

## Relevance to My Research
[How does this paper relate to the user's work? Leave blank if unknown]

## Key Quotes
> "[Important quote from the paper]" (Section X)

## References to Follow
- [Paper A]: [why interesting]
- [Paper B]: [why interesting]
````

### Reading Strategy by Format

| Format | Method |
|--------|--------|
| `.tex` | Use `read` directly. Search for `\section`, `\begin{equation}` |
| `.pdf` | Use `read` (OpenClaw supports PDF). Focus on the abstract, intro, method, and experiments |
| URL | Use `web_fetch` to get the content, then summarize |

### Quality Checklist

Before finishing a note, verify:
- [ ] The TL;DR captures the main contribution
- [ ] The Method section explains the approach clearly
- [ ] At least 2 strengths and 2 limitations are identified
- [ ] Key equations/algorithms are included if applicable
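The per-paper loop can be scripted so reruns skip finished notes. A sketch, assuming papers live in per-ID subdirectories as shown above (`scaffold_notes` is an illustrative name, not an OpenClaw command):

```shell
# Create an empty note skeleton per paper directory; existing notes are left
# untouched, matching the "file existence = step completion" convention.
scaffold_notes() {
  mkdir -p "$WORKSPACE/literature/notes"
  for dir in "$WORKSPACE"/papers/*/; do
    [ -d "$dir" ] || continue                # no papers yet
    id="$(basename "$dir")"
    note="$WORKSPACE/literature/notes/$id.md"
    [ -e "$note" ] || printf '# %s\n\n## TL;DR\n' "$id" > "$note"
  done
}
```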

## Step 3: Generate BibTeX

Create `$WORKSPACE/literature/bibliography.bib`:

```bibtex
@article{author2024title,
  title={Full Paper Title},
  author={Last, First and Last2, First2},
  journal={arXiv preprint arXiv:2401.12345},
  year={2024}
}
```

For arXiv papers, use this format. For published papers, include venue, volume, pages.
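For the arXiv case, the stub can be derived mechanically from the ID, since post-2007 IDs encode year and month in their first four digits. A sketch (`arxiv_bib_stub` is an illustrative helper; title and authors still need to be filled in by hand):

```shell
# Print an @article stub for an arXiv preprint, deriving the year from the
# YYMM prefix of the ID (e.g. 2401.12345 -> 2024).
arxiv_bib_stub() {
  id="$1"; key="$2"
  yymm="${id%%.*}"
  printf '@article{%s,\n  title={TODO},\n  author={TODO},\n  journal={arXiv preprint arXiv:%s},\n  year={20%s}\n}\n' \
    "$key" "$id" "${yymm%??}"
}
```

Usage: `arxiv_bib_stub 2401.12345 author2024title >> "$WORKSPACE/literature/bibliography.bib"`.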

## Step 4: Synthesize Across Papers

Create `$WORKSPACE/literature/synthesis.md`:

```markdown
# Literature Synthesis: [Topic]

## Overview
[Brief introduction to the research area]

## Taxonomy of Approaches

### Category A: [Name]
Papers: [list]
Key characteristics: [describe]

### Category B: [Name]
Papers: [list]
Key characteristics: [describe]

## Comparison Table

| Paper | Method | Dataset | Key Metric | Result |
|-------|--------|---------|------------|--------|
| [A] | ... | ... | ... | ... |
| [B] | ... | ... | ... | ... |

## Evolution of Ideas
[How has the field progressed? What are the trends?]

## Open Problems
1. [Gap 1]
2. [Gap 2]

## Recommendations
[Which papers to read first? Which approaches are most promising?]
```

## Step 5 (Optional): Draft Literature Review

If the user requests a formal review, create `$WORKSPACE/literature/review_draft.md`:

```markdown
# Literature Review: [Topic]

## 1. Introduction
[Context and motivation for the review]

## 2. Background
[Essential concepts the reader needs]

## 3. Survey of Methods

### 3.1 [Category A]
[Describe approaches in this category, cite papers]

### 3.2 [Category B]
[Describe approaches in this category, cite papers]

## 4. Empirical Comparison
[Summarize experimental findings across papers]

## 5. Discussion
[Trends, gaps, and future directions]

## 6. Conclusion
[Key takeaways]

## References
[BibTeX citations]
```

## Batch Processing

If reviewing more than 10 papers:
1. First pass: generate a TL;DR only for all papers
2. The user selects which papers need full notes
3. Second pass: full notes for the selected papers
4. Final pass: synthesis

Do NOT process all papers in full detail in a single session; the context will overflow.

## Commands

The user can say:
- "Review these papers" → full workflow
- "Just summarize [paper]" → single paper note
- "Compare [paper A] and [paper B]" → focused comparison
- "Write a literature review on [topic]" → full review draft
package/skills/research-pipeline/SKILL.md
@@ -0,0 +1,267 @@
---
name: research-pipeline
description: "End-to-end research automation: from idea to code implementation, with literature review, planning, and iterative refinement. Uses the arxiv, github_search, and exec tools."
metadata:
  {
    "openclaw":
      {
        "emoji": "🔬",
        "requires": { "bins": ["git", "python3"] },
      },
  }
---

# Research Pipeline

Automate an end-to-end ML research workflow: idea -> literature search -> survey -> plan -> implement -> review -> iterate.

All intermediate results live in a project-based workspace directory. **File existence = step completion.** If a step's output file already exists, skip that step and move on. This enables crash recovery and incremental progress.

---

## Workspace Convention (Project-based)

**IMPORTANT**: This skill uses the same project-based workspace as `idea-generation`. Check or set the active project first.

### Check Active Project
```bash
cat ~/.openclaw/workspace/projects/.active 2>/dev/null
```

If a project is active, set `$WORKSPACE = ~/.openclaw/workspace/projects/{project_id}/`.

If no active project exists, create one based on the research idea (see Step 1).

### Directory Structure
```
$WORKSPACE/
├── project.json        # Project metadata
├── task.json           # Research task/idea definition
├── search_results.md   # Search results (Step 2)
├── prepare_res.md      # Selected repos (Step 3)
├── papers/             # Downloaded papers (Step 4)
├── repos/              # Cloned repositories (Step 3)
├── notes/              # Paper notes (Step 5)
├── survey_res.md       # Literature survey (Step 5)
├── plan_res.md         # Implementation plan (Step 6)
├── project/            # Code implementation (Step 7)
├── ml_res.md           # Implementation report (Step 7)
├── iterations/         # Review iterations (Steps 8-9)
│   ├── judge_v1.md
│   └── ...
└── experiment_res.md   # Final results (Step 10)
```

---

## Step 1: Parse Task

Read `$WORKSPACE/task.json`. If it does not exist, ask the user for:

- **idea**: A description of the research idea (1-3 sentences).
- **references** (optional): ArXiv IDs or paper titles as starting points.
- **domain** (optional): e.g. "recommendation systems", "NLP", "computer vision".

Write the result to `$WORKSPACE/task.json`:

```json
{
  "idea": "...",
  "references": ["2401.12345", "..."],
  "domain": "...",
  "date_limit": "2024-01-01"
}
```

**Output:** `$WORKSPACE/task.json`

## Step 2: Search

Use the `arxiv` tool to search for 5-10 related papers based on the idea and any reference paper titles. Use the `github_search` tool to find related repositories.

Combine the results into a markdown report:

```
## ArXiv Papers
- [title](pdf_url) — arxiv_id — summary of relevance

## GitHub Repositories
- [repo_name](url) — stars — language — summary of relevance
```

**Output:** `$WORKSPACE/search_results.md`

## Step 3: Prepare References

Read `$WORKSPACE/search_results.md`. Select the 3-5 most relevant repositories.

For each selected repo, clone it into `$WORKSPACE/repos/`:

```bash
git clone --depth 1 <url> $WORKSPACE/repos/<repo_name>
```

Write a summary of the selected repos and their relevance to the idea.

**Output:** `$WORKSPACE/prepare_res.md`
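Cloning can be batched and made restartable. A sketch (`clone_selected` and the URL list on stdin are illustrative; an existing checkout is treated as already done):

```shell
# Shallow-clone each repo URL read from stdin into $WORKSPACE/repos/,
# skipping repos that are already checked out (crash recovery).
clone_selected() {
  mkdir -p "$WORKSPACE/repos"
  while read -r url; do
    [ -n "$url" ] || continue
    name="$(basename "$url" .git)"
    [ -d "$WORKSPACE/repos/$name" ] || git clone --depth 1 "$url" "$WORKSPACE/repos/$name"
  done
}
```

Usage: `printf '%s\n' https://github.com/owner/repo.git | clone_selected`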

## Step 4: Download Papers

For each important paper from Step 2, use the `arxiv` tool with `download: true` and `output_dir: "$WORKSPACE/papers/"` to get the .tex source files.

If the download fails for any paper, note the failure and continue. The survey step can work with abstracts alone.

**Output:** `$WORKSPACE/papers/*.tex` (or `.md` summaries if .tex is unavailable)

## Step 5: Literature Survey

This is the most intellectually demanding step. Read `references/prompts/survey.md` for detailed guidance.

For each paper:

1. Read the .tex source (or abstract) thoroughly.
2. Extract the core method, mathematical formulas, and key contributions.
3. Read the corresponding reference codebase in `$WORKSPACE/repos/`.
4. Map the math formulas to their code implementations.
5. Write structured notes to `$WORKSPACE/notes/paper_NNN.md`.

Each note file should contain:

````markdown
# [Paper Title]

## Core Method
...

## Math Formulas
...

## Code Implementation
File: repos/<repo>/path/to/file.py
```python
# relevant code excerpt
```

## Key Insights
...
````

After all papers are surveyed, write a synthesis combining all notes.

**Output:** `$WORKSPACE/notes/paper_*.md` + `$WORKSPACE/survey_res.md`

## Step 6: Implementation Plan

Read `references/prompts/plan.md` for detailed guidance.

Based on `survey_res.md`, `prepare_res.md`, and `task.json`, create a four-part plan:

1. **Dataset Plan**: data source, loading pipeline, preprocessing, dataloader design.
2. **Model Plan**: architecture, math formulas to implement, reference code to adapt.
3. **Training Plan**: loss functions, optimizer, hyperparameters, monitoring.
4. **Testing Plan**: metrics, evaluation protocol, baselines.

**Output:** `$WORKSPACE/plan_res.md`

## Step 7: Implement

Read `references/prompts/implement.md` for detailed guidance.

Create a self-contained project in `$WORKSPACE/project/`:

```
$WORKSPACE/project/
  model/          # model architecture
  data/           # data loading and preprocessing
  training/       # training loop and configs
  testing/        # evaluation scripts
  utils/          # shared utilities
  run.py          # main entry point
  requirements.txt
```

**Critical rules:**

- Do NOT import directly from `$WORKSPACE/repos/`. Adapt and rewrite the code.
- Implement EVERY component from `plan_res.md`.
- Use real datasets, not toy data.
- First run: 2 epochs only (quick validation).

Execute:

```bash
cd $WORKSPACE/project && pip install -r requirements.txt && python run.py --epochs 2
```

If using GPU containers, ensure the sandbox has `gpus` and `shmSize` configured.

**Output:** `$WORKSPACE/project/` (code) + `$WORKSPACE/ml_res.md` (implementation report)

## Step 8: Review

Read `references/prompts/review.md` for detailed guidance.

Review the implementation against:

- Each atomic idea from `survey_res.md`: is the math correctly translated to code?
- The plan from `plan_res.md`: are all components present?
- Code quality: no toy implementations, proper error handling, a correct data pipeline.

Write a structured review:

```markdown
# Review v1

## Verdict: PASS / NEEDS_REVISION

## Checklist
- [ ] Dataset loading matches plan
- [ ] Model architecture matches formulas
- [ ] Loss function correct
- [ ] Training loop proper
- [ ] Evaluation metrics correct

## Issues (if NEEDS_REVISION)
1. Issue description → suggested fix
2. ...
```

**Output:** `$WORKSPACE/iterations/judge_v1.md`

## Step 9: Iterate

If the review verdict is `NEEDS_REVISION`:

1. Read `$WORKSPACE/iterations/judge_vN.md` for the latest suggestions.
2. Fix each issue in `$WORKSPACE/project/`.
3. Re-run the 2-epoch validation.
4. Write a new review to `$WORKSPACE/iterations/judge_v(N+1).md`.
5. Repeat until `PASS` or 3 iterations are reached.

If 3 iterations are exhausted without a PASS, summarize the remaining issues and ask the user for guidance.
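The `judge_vN` bookkeeping can be derived from the files on disk rather than tracked in conversation, which also survives a crashed session. A sketch (`next_judge_version` is an illustrative helper):

```shell
# Count existing judge_v*.md files and print the next version number.
next_judge_version() {
  n=0
  for f in "$WORKSPACE"/iterations/judge_v*.md; do
    [ -e "$f" ] && n=$((n + 1))
  done
  echo $((n + 1))
}
```

Usage: `review_file="$WORKSPACE/iterations/judge_v$(next_judge_version).md"`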

**Output:** `$WORKSPACE/iterations/judge_v*.md` (review history)

## Step 10: Full Training

Once the review passes:

1. Update the epoch count in `run.py` to the full training value.
2. Execute the full training run.
3. Collect and analyze the results.

**Output:** `$WORKSPACE/experiment_res.md`

## Batch Processing Rule

When you need to apply the same LLM operation to more than 10 files (e.g., summarizing all papers), do NOT process them one by one in conversation. Instead, write a script to handle them in batch.
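As a concrete shape for such a script, a sketch (`summarize_paper` stands in for whatever batch summarization command is actually available and must be supplied; existing TL;DR files are skipped):

```shell
# First-pass batch: produce one TL;DR file per downloaded paper source.
# summarize_paper is a placeholder for the real summarization command.
batch_tldr() {
  mkdir -p "$WORKSPACE/notes"
  for tex in "$WORKSPACE"/papers/*.tex; do
    [ -e "$tex" ] || continue                 # no papers downloaded yet
    out="$WORKSPACE/notes/$(basename "$tex" .tex).tldr.md"
    [ -e "$out" ] || summarize_paper "$tex" > "$out"
  done
}
```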

## Recovery

If the session crashes or the context fills up:

1. List the files in `$WORKSPACE/` to see which steps completed.
2. Read the most recent output file to understand the current state.
3. Resume from the first missing output file.

Never re-do a step whose output file already exists unless the user explicitly asks.
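The resume point follows directly from the step outputs listed in the directory structure. A sketch (`first_missing` is an illustrative helper and checks only the top-level file outputs, in pipeline order):

```shell
# Walk the pipeline outputs in order and print the first one that is missing;
# that is the step to resume from.
first_missing() {
  for f in task.json search_results.md prepare_res.md survey_res.md \
           plan_res.md ml_res.md experiment_res.md; do
    if [ ! -e "$WORKSPACE/$f" ]; then
      echo "$f"
      return 0
    fi
  done
  echo "all steps complete"
}
```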