scientify 1.2.2 → 1.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +38 -14
- package/README.zh.md +38 -15
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +21 -2
- package/dist/index.js.map +1 -1
- package/dist/src/services/auto-updater.d.ts +15 -0
- package/dist/src/services/auto-updater.d.ts.map +1 -0
- package/dist/src/services/auto-updater.js +188 -0
- package/dist/src/services/auto-updater.js.map +1 -0
- package/dist/src/tools/arxiv-download.d.ts +25 -0
- package/dist/src/tools/arxiv-download.d.ts.map +1 -0
- package/dist/src/tools/arxiv-download.js +179 -0
- package/dist/src/tools/arxiv-download.js.map +1 -0
- package/dist/src/tools/{arxiv-tool.d.ts → arxiv-search.d.ts} +11 -8
- package/dist/src/tools/arxiv-search.d.ts.map +1 -0
- package/dist/src/tools/arxiv-search.js +140 -0
- package/dist/src/tools/arxiv-search.js.map +1 -0
- package/dist/src/tools/github-search-tool.d.ts +5 -1
- package/dist/src/tools/github-search-tool.d.ts.map +1 -1
- package/dist/src/tools/github-search-tool.js +10 -30
- package/dist/src/tools/github-search-tool.js.map +1 -1
- package/dist/src/tools/result.d.ts +37 -0
- package/dist/src/tools/result.d.ts.map +1 -0
- package/dist/src/tools/result.js +39 -0
- package/dist/src/tools/result.js.map +1 -0
- package/dist/src/tools/workspace.d.ts +32 -0
- package/dist/src/tools/workspace.d.ts.map +1 -0
- package/dist/src/tools/workspace.js +69 -0
- package/dist/src/tools/workspace.js.map +1 -0
- package/openclaw.plugin.json +22 -1
- package/package.json +13 -2
- package/skills/_shared/workspace-spec.md +139 -0
- package/skills/idea-generation/SKILL.md +4 -0
- package/skills/install-scientify/SKILL.md +15 -7
- package/skills/literature-survey/SKILL.md +86 -212
- package/skills/research-experiment/SKILL.md +114 -0
- package/skills/research-implement/SKILL.md +166 -0
- package/skills/research-pipeline/SKILL.md +104 -188
- package/skills/research-plan/SKILL.md +121 -0
- package/skills/research-review/SKILL.md +110 -0
- package/skills/research-survey/SKILL.md +140 -0
- package/skills/write-review-paper/SKILL.md +4 -0
- package/dist/src/tools/arxiv-tool.d.ts.map +0 -1
- package/dist/src/tools/arxiv-tool.js +0 -258
- package/dist/src/tools/arxiv-tool.js.map +0 -1
- package/skills/research-pipeline/references/prompts/implement.md +0 -135
- package/skills/research-pipeline/references/prompts/plan.md +0 -142
- package/skills/research-pipeline/references/prompts/review.md +0 -118
- package/skills/research-pipeline/references/prompts/survey.md +0 -105
- package/skills/research-pipeline/references/workspace-spec.md +0 -81
@@ -1,142 +0,0 @@
-# Implementation Planning Guide
-
-You are creating a detailed, actionable implementation plan. This plan must be specific enough that the implementation step can follow it without ambiguity.
-
-## Prerequisites
-
-Before planning, you must have:
-
-- `workspace/task.json` — the research idea
-- `workspace/survey_res.md` — the literature survey with theory-to-code mappings
-- `workspace/prepare_res.md` — selected reference repositories
-
-Read ALL of these files thoroughly before writing the plan. Also browse the reference codebases in `workspace/repos/` to understand their structure and reusable components.
-
-## Plan Structure
-
-The plan has four mandatory sections. Write all four to `workspace/plan_res.md`.
-
-### 1. Dataset Plan
-
-```markdown
-## Dataset Plan
-
-### Data Source
-- Dataset name and where to obtain it
-- Size and format
-- Any preprocessing requirements
-
-### Data Loading Pipeline
-1. **Read**: How to load raw data (file format, library to use)
-2. **Preprocess**: Transformations, tokenization, normalization, feature extraction
-3. **DataLoader**: Batch construction, sampling strategy, collate function
-
-### Data Splits
-- Train/validation/test split ratios
-- Any special handling (e.g., cold-start users, temporal splits)
-```
-
-Refer to the reference codebases for data loading patterns. Cite specific files: "See `repos/xyz/data/loader.py` for the graph construction approach."
-
-### 2. Model Plan
-
-```markdown
-## Model Plan
-
-### Architecture Overview
-[High-level description of the model architecture]
-
-### Components (one per atomic definition)
-
-#### [Atomic Definition 1]
-- **Math**: $formula$
-- **Implementation**: Class name, input/output shapes, key methods
-- **Reference**: repos/xyz/model/attention.py, class MultiHeadAttention
-- **Adaptation notes**: [What to change from the reference]
-
-#### [Atomic Definition 2]
-...
-
-### Forward Pass
-[Step-by-step description of the forward pass, connecting all components]
-
-### Parameter Count Estimate
-[Rough estimate to sanity-check the architecture]
-```
-
-Every atomic definition from `survey_res.md` must appear as a component. If a definition doesn't map to a model component, explain why.
-
-### 3. Training Plan
-
-```markdown
-## Training Plan
-
-### Loss Function
-- Formula: $L = ...$
-- Components: [list each loss term and its purpose]
-- Reference: repos/xyz/training/loss.py
-
-### Optimizer
-- Algorithm: [Adam, AdamW, SGD, etc.]
-- Learning rate: [value] with [schedule: cosine, step, warmup, etc.]
-- Weight decay: [value]
-
-### Hyperparameters
-| Parameter | Value | Rationale |
-|-----------|-------|-----------|
-| Batch size | ... | ... |
-| Hidden dim | ... | ... |
-| Num layers | ... | ... |
-| Dropout | ... | ... |
-
-### Training Loop
-1. Forward pass
-2. Compute loss
-3. Backward pass
-4. Gradient clipping (if applicable)
-5. Optimizer step
-6. Logging (every N steps)
-7. Validation (every M epochs)
-
-### Quick Validation
-- Epochs: 2 (for initial validation)
-- Expected behavior: loss should decrease, no NaN/Inf
-
-### Full Training
-- Epochs: [value from reference papers]
-- Early stopping: [criteria]
-- Checkpoint: save best model by validation metric
-```
-
-### 4. Testing Plan
-
-```markdown
-## Testing Plan
-
-### Metrics
-- Primary: [e.g., NDCG@10, BLEU, F1]
-- Secondary: [e.g., Recall@20, Hit Rate]
-- Reference: repos/xyz/evaluation/metrics.py
-
-### Evaluation Protocol
-1. Load best checkpoint
-2. Run inference on test set
-3. Compute metrics
-4. Compare against baselines (from papers)
-
-### Baselines
-| Method | Metric | Value | Source |
-|--------|--------|-------|--------|
-| ... | ... | ... | [paper] |
-
-### Expected Results
-[Reasonable range for the proposed method based on paper claims]
-```
-
-## Quality Criteria
-
-- Every section must reference specific files from `workspace/repos/` where applicable.
-- Hyperparameter values should come from reference papers or standard practice, not guesses.
-- The plan must be implementable end-to-end without additional research.
-- If any information is missing (e.g., dataset not publicly available), flag it explicitly.
-- Do not over-engineer: plan what's needed for a solid implementation, not a production system.
@@ -1,118 +0,0 @@
-# Code Review Guide
-
-You are reviewing the implementation in `workspace/project/` to verify it correctly implements the research idea. This is a quality gate before full training.
-
-## Review Process
-
-### Phase 1: Verify Against Survey
-
-Read `workspace/survey_res.md` and extract the list of atomic definitions. For each atomic definition:
-
-1. Find the corresponding code in `workspace/project/`.
-2. Compare the code implementation against the mathematical formula.
-3. Check: does the code faithfully implement the math? Watch for:
-   - Missing terms in equations.
-   - Incorrect tensor operations (e.g., sum vs mean, wrong axis).
-   - Hardcoded values where parameters should be used.
-   - Simplifications that change the method's behavior.
-
-### Phase 2: Verify Against Plan
-
-Read `workspace/plan_res.md`. Check each section:
-
-**Dataset Plan:**
-- Is the correct dataset used (not a substitute)?
-- Does the preprocessing match the plan?
-- Is the DataLoader configured correctly (batch size, sampling)?
-
-**Model Plan:**
-- Are all components present?
-- Does the forward pass match the described architecture?
-- Are parameter counts reasonable?
-
-**Training Plan:**
-- Is the loss function correct (all terms present, correct weighting)?
-- Is the optimizer configured as planned?
-- Do the hyperparameters match the plan?
-
-**Testing Plan:**
-- Are the correct metrics implemented?
-- Is the evaluation protocol correct?
-
-### Phase 3: Code Quality
-
-Check for implementation quality issues:
-
-- **Not a toy**: The implementation should be substantive, not a simplified stub.
-- **Correctness**: No obvious bugs (wrong indices, missing gradients, data leakage).
-- **Completeness**: All imports resolved, all functions implemented (no `pass` or `TODO`).
-- **Runnability**: The code should run end-to-end without errors.
-
-### Phase 4: Cross-Reference with Codebases
-
-If needed, compare against reference codebases in `workspace/repos/`:
-
-- Are key algorithmic patterns correctly adapted?
-- Were critical implementation details preserved during adaptation?
-
-## Review Output
-
-Write the review to `workspace/iterations/judge_vN.md` (increment N for each review iteration):
-
-```markdown
-# Review vN
-
-## Verdict: PASS / NEEDS_REVISION
-
-## Atomic Definition Checklist
-
-| Definition | Implemented | Correct | Notes |
-|------------|-------------|---------|-------|
-| [def 1] | Yes/No | Yes/No | [details] |
-| [def 2] | Yes/No | Yes/No | [details] |
-| ... | ... | ... | ... |
-
-## Plan Compliance
-
-| Section | Status | Notes |
-|---------|--------|-------|
-| Dataset | OK / Issue | ... |
-| Model | OK / Issue | ... |
-| Training | OK / Issue | ... |
-| Testing | OK / Issue | ... |
-
-## Issues (if NEEDS_REVISION)
-
-### Issue 1: [Title]
-- **Location**: `project/model/attention.py`, line ~42
-- **Problem**: [Description of what's wrong]
-- **Expected**: [What the correct implementation should do]
-- **Suggestion**: [Specific fix]
-
-### Issue 2: [Title]
-...
-
-## Summary
-[Brief overall assessment: what's good, what needs work]
-```
-
-## Verdict Criteria
-
-**PASS** if:
-- All atomic definitions are implemented and correct.
-- All plan sections are satisfied.
-- Code runs end-to-end with decreasing loss.
-- No critical bugs.
-
-**NEEDS_REVISION** if:
-- Any atomic definition is missing or incorrectly implemented.
-- Any plan section has significant gaps.
-- Code has bugs that prevent correct execution.
-- Implementation is a toy/stub rather than a genuine attempt.
-
-## Iteration Rules
-
-- Each review is independent: re-evaluate everything, not just previously flagged issues.
-- Be specific in suggestions: cite file names, line numbers, and concrete fixes.
-- After 3 iterations of NEEDS_REVISION, escalate to the user with a summary of remaining issues.
-- Never approve code that doesn't run or produces NaN/Inf.
@@ -1,105 +0,0 @@
-# Literature Survey Guide
-
-You are performing a literature survey to bridge theory and implementation. Your goal is to extract actionable knowledge from papers and codebases that will directly inform the implementation.
-
-## Process
-
-### Phase 1: Decompose the Idea
-
-Before reading any papers, break the research idea (from `task.json`) into **atomic academic definitions**. Each atomic definition must:
-
-- Be a single, self-contained concept (e.g., "multi-head attention", "contrastive loss", "graph convolution").
-- Have clear mathematical foundations.
-- Be implementable as a code module.
-- Be traceable to specific papers.
-
-Write down your list of atomic definitions before proceeding. This ensures systematic coverage.
-
-### Phase 2: Paper Reading (per paper)
-
-For each paper in `workspace/papers/`:
-
-1. **Skim first**: Read title, abstract, introduction, and conclusion to understand the paper's scope.
-2. **Targeted reading**: For each atomic definition relevant to this paper, find:
-   - The formal definition (usually in a "Method" or "Approach" section).
-   - Mathematical formulas (equations, loss functions, update rules).
-   - Key theoretical claims or properties.
-3. **Search strategically**: Use keyword search within the .tex file. Look for `\begin{equation}`, `\mathcal`, `\text{loss}`, etc.
-
-### Phase 3: Code Reading (per repo)
-
-For each reference codebase in `workspace/repos/`:
-
-1. **Understand structure**: List the directory tree first.
-2. **Find implementations**: Map each mathematical formula to its code implementation:
-   - Model architecture classes → model definition formulas
-   - Loss function implementations → loss formulas
-   - Data processing pipelines → input/output specifications
-3. **Document the mapping**: For each formula, note the exact file, class, and function that implements it.
-
-### Phase 4: Write Notes
-
-For each paper, create `workspace/notes/paper_NNN.md` with this structure:
-
-```markdown
-# [Paper Title]
-ArXiv: [id] | Authors: [first author et al.]
-
-## Core Method
-[1-2 paragraph summary of the paper's main contribution]
-
-## Atomic Definitions Covered
-[List which atomic definitions from Phase 1 this paper addresses]
-
-## Math Formulas
-
-### [Definition Name 1]
-$$formula$$
-- Variables: [explain each variable]
-- Context: [when/where this is applied]
-
-### [Definition Name 2]
-...
-
-## Code Implementation
-
-### [Definition Name 1]
-- **Repo**: repos/[name]
-- **File**: path/to/file.py, class ClassName, method method_name
-- **Key logic**:
-```python
-# Excerpt of the most relevant 10-30 lines
-```
-- **Notes**: [any adaptations, simplifications, or deviations from the paper]
-
-## Key Insights
-- [Insight 1: anything surprising or important for implementation]
-- [Insight 2: ...]
-```
-
-### Phase 5: Synthesize
-
-After all papers are surveyed, write `workspace/survey_res.md`:
-
-```markdown
-# Literature Survey: [Research Idea]
-
-## Atomic Definitions
-[Complete list with brief descriptions]
-
-## Theory-to-Code Mapping
-[For each atomic definition: the formula, which papers define it, which repos implement it, and the recommended implementation approach]
-
-## Implementation Recommendations
-[Which reference implementations to adapt, potential pitfalls, suggested architecture]
-
-## Open Questions
-[Anything unclear that may need user input]
-```
-
-## Quality Criteria
-
-- Every atomic definition must appear in at least one paper note.
-- Every formula must have a corresponding code reference (or be flagged as "no reference found").
-- Do not skip papers. If a paper is not relevant, note why and move on.
-- Err on the side of extracting more detail rather than less. The implementation step depends on this survey.
@@ -1,81 +0,0 @@
-# Workspace Directory Specification
-
-All research pipeline artifacts live in a `workspace/` directory. The location is either specified by the user or defaults to the current working directory plus `workspace/`.
-
-## Directory Layout
-
-```
-workspace/
-  task.json            # Input: research task definition
-  search_results.md    # Step 2: arxiv + github search results
-  prepare_res.md       # Step 3: selected repos and rationale
-  survey_res.md        # Step 5: synthesized literature survey
-  plan_res.md          # Step 6: four-part implementation plan
-  ml_res.md            # Step 7: implementation report
-  experiment_res.md    # Step 10: full training results
-
-  repos/               # Step 3: cloned reference repositories
-    repo-name-1/
-    repo-name-2/
-
-  papers/              # Step 4: downloaded paper sources
-    2401.12345.tex
-    2401.67890.tex
-
-  notes/               # Step 5: per-paper survey notes
-    paper_001.md
-    paper_002.md
-
-  iterations/          # Steps 8-9: review history
-    judge_v1.md
-    judge_v2.md
-
-  project/             # Step 7: implementation code
-    model/
-    data/
-    training/
-    testing/
-    utils/
-    run.py
-    requirements.txt
-```
-
-## Conventions
-
-### File Existence = Step Completion
-
-The research pipeline uses file existence as the checkpoint mechanism. Before executing any step, check whether its output file already exists. If it does, skip the step.
-
-This enables:
-- **Crash recovery**: resume from the last completed step.
-- **Incremental progress**: re-running the pipeline skips completed work.
-- **Transparency**: a human can inspect progress by listing the directory.
-
-### Naming Rules
-
-- Markdown files (`.md`) for human-readable outputs.
-- JSON files (`.json`) for structured data (task definition).
-- Paper notes use sequential numbering: `paper_001.md`, `paper_002.md`.
-- Review iterations use version numbering: `judge_v1.md`, `judge_v2.md`.
-
-### Immutability
-
-Once a step's output is written, do NOT modify it unless the user explicitly asks. If a step needs to be re-done, delete the output file first, then re-execute.
-
-Exception: `workspace/project/` is mutable during the implement-review-iterate loop (Steps 7-9).
-
-### task.json Schema
-
-```json
-{
-  "idea": "A 1-3 sentence description of the research idea",
-  "references": ["2401.12345", "paper title string"],
-  "domain": "recommendation systems",
-  "date_limit": "2024-01-01"
-}
-```
-
-- `idea` (required): The core research idea to implement.
-- `references` (optional): ArXiv IDs or paper titles as starting points.
-- `domain` (optional): Research domain for focused searching.
-- `date_limit` (optional): Only consider papers published after this date.