scientify 1.2.2 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (50)
  1. package/README.md +38 -14
  2. package/README.zh.md +38 -15
  3. package/dist/index.d.ts.map +1 -1
  4. package/dist/index.js +21 -2
  5. package/dist/index.js.map +1 -1
  6. package/dist/src/services/auto-updater.d.ts +15 -0
  7. package/dist/src/services/auto-updater.d.ts.map +1 -0
  8. package/dist/src/services/auto-updater.js +188 -0
  9. package/dist/src/services/auto-updater.js.map +1 -0
  10. package/dist/src/tools/arxiv-download.d.ts +25 -0
  11. package/dist/src/tools/arxiv-download.d.ts.map +1 -0
  12. package/dist/src/tools/arxiv-download.js +179 -0
  13. package/dist/src/tools/arxiv-download.js.map +1 -0
  14. package/dist/src/tools/{arxiv-tool.d.ts → arxiv-search.d.ts} +11 -8
  15. package/dist/src/tools/arxiv-search.d.ts.map +1 -0
  16. package/dist/src/tools/arxiv-search.js +140 -0
  17. package/dist/src/tools/arxiv-search.js.map +1 -0
  18. package/dist/src/tools/github-search-tool.d.ts +5 -1
  19. package/dist/src/tools/github-search-tool.d.ts.map +1 -1
  20. package/dist/src/tools/github-search-tool.js +10 -30
  21. package/dist/src/tools/github-search-tool.js.map +1 -1
  22. package/dist/src/tools/result.d.ts +37 -0
  23. package/dist/src/tools/result.d.ts.map +1 -0
  24. package/dist/src/tools/result.js +39 -0
  25. package/dist/src/tools/result.js.map +1 -0
  26. package/dist/src/tools/workspace.d.ts +32 -0
  27. package/dist/src/tools/workspace.d.ts.map +1 -0
  28. package/dist/src/tools/workspace.js +69 -0
  29. package/dist/src/tools/workspace.js.map +1 -0
  30. package/openclaw.plugin.json +22 -1
  31. package/package.json +13 -2
  32. package/skills/_shared/workspace-spec.md +139 -0
  33. package/skills/idea-generation/SKILL.md +4 -0
  34. package/skills/install-scientify/SKILL.md +15 -7
  35. package/skills/literature-survey/SKILL.md +86 -212
  36. package/skills/research-experiment/SKILL.md +114 -0
  37. package/skills/research-implement/SKILL.md +166 -0
  38. package/skills/research-pipeline/SKILL.md +104 -188
  39. package/skills/research-plan/SKILL.md +121 -0
  40. package/skills/research-review/SKILL.md +110 -0
  41. package/skills/research-survey/SKILL.md +140 -0
  42. package/skills/write-review-paper/SKILL.md +4 -0
  43. package/dist/src/tools/arxiv-tool.d.ts.map +0 -1
  44. package/dist/src/tools/arxiv-tool.js +0 -258
  45. package/dist/src/tools/arxiv-tool.js.map +0 -1
  46. package/skills/research-pipeline/references/prompts/implement.md +0 -135
  47. package/skills/research-pipeline/references/prompts/plan.md +0 -142
  48. package/skills/research-pipeline/references/prompts/review.md +0 -118
  49. package/skills/research-pipeline/references/prompts/survey.md +0 -105
  50. package/skills/research-pipeline/references/workspace-spec.md +0 -81
package/skills/research-pipeline/references/prompts/plan.md
@@ -1,142 +0,0 @@
- # Implementation Planning Guide
-
- You are creating a detailed, actionable implementation plan. This plan must be specific enough that the implementation step can follow it without ambiguity.
-
- ## Prerequisites
-
- Before planning, you must have:
-
- - `workspace/task.json` — the research idea
- - `workspace/survey_res.md` — the literature survey with theory-to-code mappings
- - `workspace/prepare_res.md` — selected reference repositories
-
- Read ALL of these files thoroughly before writing the plan. Also browse the reference codebases in `workspace/repos/` to understand their structure and reusable components.
-
- ## Plan Structure
-
- The plan has four mandatory sections. Write all four to `workspace/plan_res.md`.
-
- ### 1. Dataset Plan
-
- ```markdown
- ## Dataset Plan
-
- ### Data Source
- - Dataset name and where to obtain it
- - Size and format
- - Any preprocessing requirements
-
- ### Data Loading Pipeline
- 1. **Read**: How to load raw data (file format, library to use)
- 2. **Preprocess**: Transformations, tokenization, normalization, feature extraction
- 3. **DataLoader**: Batch construction, sampling strategy, collate function
-
- ### Data Splits
- - Train/validation/test split ratios
- - Any special handling (e.g., cold-start users, temporal splits)
- ```
-
- Refer to the reference codebases for data loading patterns. Cite specific files: "See `repos/xyz/data/loader.py` for the graph construction approach."
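The read → preprocess → batch pipeline can be sketched in plain Python before committing to a framework. This is a hedged sketch: the JSON-lines file format and the `score` field are illustrative placeholders, not taken from any reference repo.

```python
import json
from typing import Iterator

def read_records(path: str) -> list[dict]:
    # Read: one JSON record per line (format assumed for illustration).
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

def preprocess(record: dict) -> dict:
    # Preprocess: normalize a numeric rating to [0, 1] (placeholder transform).
    record = dict(record)
    record["score"] = record["score"] / 5.0
    return record

def batches(records: list[dict], batch_size: int) -> Iterator[list[dict]]:
    # DataLoader: yield fixed-size batches; the last batch may be smaller.
    for i in range(0, len(records), batch_size):
        yield records[i:i + batch_size]
```

In a real plan, each of these three functions would cite the reference file it adapts.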
-
- ### 2. Model Plan
-
- ```markdown
- ## Model Plan
-
- ### Architecture Overview
- [High-level description of the model architecture]
-
- ### Components (one per atomic definition)
-
- #### [Atomic Definition 1]
- - **Math**: $formula$
- - **Implementation**: Class name, input/output shapes, key methods
- - **Reference**: repos/xyz/model/attention.py, class MultiHeadAttention
- - **Adaptation notes**: [What to change from the reference]
-
- #### [Atomic Definition 2]
- ...
-
- ### Forward Pass
- [Step-by-step description of the forward pass, connecting all components]
-
- ### Parameter Count Estimate
- [Rough estimate to sanity-check the architecture]
- ```
-
- Every atomic definition from `survey_res.md` must appear as a component. If a definition doesn't map to a model component, explain why.
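A parameter-count estimate can be computed from the planned dimensions alone. The sketch below assumes a generic transformer-style encoder (embedding plus per-layer attention and feed-forward weights; biases and layer norms ignored), not any specific reference model:

```python
def estimate_params(vocab: int, d_model: int, n_layers: int, d_ff: int) -> int:
    """Rough parameter count for a transformer-style encoder (sketch only)."""
    embedding = vocab * d_model
    attention = 4 * d_model * d_model   # Q, K, V, and output projections
    feed_forward = 2 * d_model * d_ff   # up- and down-projection
    return embedding + n_layers * (attention + feed_forward)
```

A count wildly off from the reference papers' reported sizes is an early sign the architecture plan is wrong.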
-
- ### 3. Training Plan
-
- ```markdown
- ## Training Plan
-
- ### Loss Function
- - Formula: $L = ...$
- - Components: [list each loss term and its purpose]
- - Reference: repos/xyz/training/loss.py
-
- ### Optimizer
- - Algorithm: [Adam, AdamW, SGD, etc.]
- - Learning rate: [value] with [schedule: cosine, step, warmup, etc.]
- - Weight decay: [value]
-
- ### Hyperparameters
- | Parameter | Value | Rationale |
- |-----------|-------|-----------|
- | Batch size | ... | ... |
- | Hidden dim | ... | ... |
- | Num layers | ... | ... |
- | Dropout | ... | ... |
-
- ### Training Loop
- 1. Forward pass
- 2. Compute loss
- 3. Backward pass
- 4. Gradient clipping (if applicable)
- 5. Optimizer step
- 6. Logging (every N steps)
- 7. Validation (every M epochs)
-
- ### Quick Validation
- - Epochs: 2 (for initial validation)
- - Expected behavior: loss should decrease, no NaN/Inf
-
- ### Full Training
- - Epochs: [value from reference papers]
- - Early stopping: [criteria]
- - Checkpoint: save best model by validation metric
- ```
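The loop steps above (minus periodic validation, omitted for brevity) can be sketched on a dependency-free toy problem. The 1-D quadratic objective and plain SGD update here are stand-ins for the planned model and optimizer, chosen only so the skeleton runs anywhere:

```python
import math

def train(steps: int = 50, lr: float = 0.1, clip: float = 1.0,
          log_every: int = 10) -> float:
    """Minimal training-loop skeleton on a toy objective (w - 2)^2."""
    w = 5.0
    for step in range(1, steps + 1):
        pred = w                             # 1. forward pass (identity model)
        loss = (pred - 2.0) ** 2             # 2. compute loss (squared error)
        grad = 2.0 * (pred - 2.0)            # 3. backward pass (analytic gradient)
        grad = max(-clip, min(clip, grad))   # 4. gradient clipping
        w -= lr * grad                       # 5. optimizer step (plain SGD)
        if not math.isfinite(loss):          # quick-validation guard: no NaN/Inf
            raise RuntimeError(f"non-finite loss at step {step}")
        if step % log_every == 0:            # 6. logging every N steps
            print(f"step {step}: loss={loss:.4f}")
    return w
```

The quick-validation criterion maps directly onto the `isfinite` guard: a two-epoch run should show the printed loss decreasing and never trip it.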
-
- ### 4. Testing Plan
-
- ```markdown
- ## Testing Plan
-
- ### Metrics
- - Primary: [e.g., NDCG@10, BLEU, F1]
- - Secondary: [e.g., Recall@20, Hit Rate]
- - Reference: repos/xyz/evaluation/metrics.py
-
- ### Evaluation Protocol
- 1. Load best checkpoint
- 2. Run inference on test set
- 3. Compute metrics
- 4. Compare against baselines (from papers)
-
- ### Baselines
- | Method | Metric | Value | Source |
- |--------|--------|-------|--------|
- | ... | ... | ... | [paper] |
-
- ### Expected Results
- [Reasonable range for the proposed method based on paper claims]
- ```
-
- ## Quality Criteria
-
- - Every section must reference specific files from `workspace/repos/` where applicable.
- - Hyperparameter values should come from reference papers or standard practice, not guesses.
- - The plan must be implementable end-to-end without additional research.
- - If any information is missing (e.g., dataset not publicly available), flag it explicitly.
- - Do not over-engineer: plan what's needed for a solid implementation, not a production system.
package/skills/research-pipeline/references/prompts/review.md
@@ -1,118 +0,0 @@
- # Code Review Guide
-
- You are reviewing the implementation in `workspace/project/` to verify it correctly implements the research idea. This is a quality gate before full training.
-
- ## Review Process
-
- ### Phase 1: Verify Against Survey
-
- Read `workspace/survey_res.md` and extract the list of atomic definitions. For each atomic definition:
-
- 1. Find the corresponding code in `workspace/project/`.
- 2. Compare the code implementation against the mathematical formula.
- 3. Check: does the code faithfully implement the math? Watch for:
-    - Missing terms in equations.
-    - Incorrect tensor operations (e.g., sum vs mean, wrong axis).
-    - Hardcoded values where parameters should be used.
-    - Simplifications that change the method's behavior.
-
- ### Phase 2: Verify Against Plan
-
- Read `workspace/plan_res.md`. Check each section:
-
- **Dataset Plan:**
- - Is the correct dataset used (not a substitute)?
- - Does the preprocessing match the plan?
- - Is the DataLoader configured correctly (batch size, sampling)?
-
- **Model Plan:**
- - Are all components present?
- - Does the forward pass match the described architecture?
- - Are parameter counts reasonable?
-
- **Training Plan:**
- - Is the loss function correct (all terms present, correct weighting)?
- - Is the optimizer configured as planned?
- - Do the hyperparameters match the plan?
-
- **Testing Plan:**
- - Are the correct metrics implemented?
- - Is the evaluation protocol correct?
-
- ### Phase 3: Code Quality
-
- Check for implementation quality issues:
-
- - **Not a toy**: The implementation should be substantive, not a simplified stub.
- - **Correctness**: No obvious bugs (wrong indices, missing gradients, data leakage).
- - **Completeness**: All imports resolved, all functions implemented (no `pass` or `TODO`).
- - **Runnability**: The code should run end-to-end without errors.
-
- ### Phase 4: Cross-Reference with Codebases
-
- If needed, compare against reference codebases in `workspace/repos/`:
-
- - Are key algorithmic patterns correctly adapted?
- - Were critical implementation details preserved during adaptation?
-
- ## Review Output
-
- Write the review to `workspace/iterations/judge_vN.md` (increment N for each review iteration):
-
- ```markdown
- # Review vN
-
- ## Verdict: PASS / NEEDS_REVISION
-
- ## Atomic Definition Checklist
-
- | Definition | Implemented | Correct | Notes |
- |-----------|-------------|---------|-------|
- | [def 1] | Yes/No | Yes/No | [details] |
- | [def 2] | Yes/No | Yes/No | [details] |
- | ... | ... | ... | ... |
-
- ## Plan Compliance
-
- | Section | Status | Notes |
- |---------|--------|-------|
- | Dataset | OK / Issue | ... |
- | Model | OK / Issue | ... |
- | Training | OK / Issue | ... |
- | Testing | OK / Issue | ... |
-
- ## Issues (if NEEDS_REVISION)
-
- ### Issue 1: [Title]
- - **Location**: `project/model/attention.py`, line ~42
- - **Problem**: [Description of what's wrong]
- - **Expected**: [What the correct implementation should do]
- - **Suggestion**: [Specific fix]
-
- ### Issue 2: [Title]
- ...
-
- ## Summary
- [Brief overall assessment: what's good, what needs work]
- ```
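The increment-N convention can be automated. A small sketch, assuming nothing beyond this guide's `judge_vN.md` naming (the `next_review_path` helper is an illustrative name, not part of the package):

```python
import re
from pathlib import Path

def next_review_path(iterations_dir: str) -> Path:
    """Return the next unused judge_vN.md path in the iterations directory."""
    d = Path(iterations_dir)
    d.mkdir(parents=True, exist_ok=True)
    versions = [
        int(m.group(1))
        for p in d.glob("judge_v*.md")
        if (m := re.fullmatch(r"judge_v(\d+)\.md", p.name))
    ]
    # First review gets v1; later reviews get max existing version + 1.
    return d / f"judge_v{max(versions, default=0) + 1}.md"
```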
-
- ## Verdict Criteria
-
- **PASS** if:
- - All atomic definitions are implemented and correct.
- - All plan sections are satisfied.
- - Code runs end-to-end with decreasing loss.
- - No critical bugs.
-
- **NEEDS_REVISION** if:
- - Any atomic definition is missing or incorrectly implemented.
- - Any plan section has significant gaps.
- - Code has bugs that prevent correct execution.
- - Implementation is a toy/stub rather than a genuine attempt.
-
- ## Iteration Rules
-
- - Each review is independent: re-evaluate everything, not just previously flagged issues.
- - Be specific in suggestions: cite file names, line numbers, and concrete fixes.
- - After 3 iterations of NEEDS_REVISION, escalate to the user with a summary of remaining issues.
- - Never approve code that doesn't run or produces NaN/Inf.
package/skills/research-pipeline/references/prompts/survey.md
@@ -1,105 +0,0 @@
- # Literature Survey Guide
-
- You are performing a literature survey to bridge theory and implementation. Your goal is to extract actionable knowledge from papers and codebases that will directly inform the implementation.
-
- ## Process
-
- ### Phase 1: Decompose the Idea
-
- Before reading any papers, break the research idea (from `task.json`) into **atomic academic definitions**. Each atomic definition must:
-
- - Be a single, self-contained concept (e.g., "multi-head attention", "contrastive loss", "graph convolution").
- - Have clear mathematical foundations.
- - Be implementable as a code module.
- - Be traceable to specific papers.
-
- Write down your list of atomic definitions before proceeding. This ensures systematic coverage.
-
- ### Phase 2: Paper Reading (per paper)
-
- For each paper in `workspace/papers/`:
-
- 1. **Skim first**: Read title, abstract, introduction, and conclusion to understand the paper's scope.
- 2. **Targeted reading**: For each atomic definition relevant to this paper, find:
-    - The formal definition (usually in a "Method" or "Approach" section).
-    - Mathematical formulas (equations, loss functions, update rules).
-    - Key theoretical claims or properties.
- 3. **Search strategically**: Use keyword search within the .tex file. Look for `\begin{equation}`, `\mathcal`, `\text{loss}`, etc.
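Keyword search over the `.tex` source can be scripted. The sketch below handles only plain `\begin{equation}` environments, not `align`, starred variants, or inline math; that narrow scope is an assumption about how the downloaded papers format their equations:

```python
import re

def find_equations(tex_source: str) -> list[str]:
    """Extract the bodies of \\begin{equation}...\\end{equation} blocks."""
    pattern = re.compile(
        r"\\begin\{equation\}(.*?)\\end\{equation\}", re.DOTALL
    )
    return [body.strip() for body in pattern.findall(tex_source)]
```

The same pattern, with the environment name swapped, covers `align` or `gather` if a paper uses those instead.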
-
- ### Phase 3: Code Reading (per repo)
-
- For each reference codebase in `workspace/repos/`:
-
- 1. **Understand structure**: List the directory tree first.
- 2. **Find implementations**: Map each mathematical formula to its code implementation:
-    - Model architecture classes → model definition formulas
-    - Loss function implementations → loss formulas
-    - Data processing pipelines → input/output specifications
- 3. **Document the mapping**: For each formula, note the exact file, class, and function that implements it.
-
- ### Phase 4: Write Notes
-
- For each paper, create `workspace/notes/paper_NNN.md` with this structure:
-
- ```markdown
- # [Paper Title]
- ArXiv: [id] | Authors: [first author et al.]
-
- ## Core Method
- [1-2 paragraph summary of the paper's main contribution]
-
- ## Atomic Definitions Covered
- [List which atomic definitions from Phase 1 this paper addresses]
-
- ## Math Formulas
-
- ### [Definition Name 1]
- $$formula$$
- - Variables: [explain each variable]
- - Context: [when/where this is applied]
-
- ### [Definition Name 2]
- ...
-
- ## Code Implementation
-
- ### [Definition Name 1]
- - **Repo**: repos/[name]
- - **File**: path/to/file.py, class ClassName, method method_name
- - **Key logic**:
- ```python
- # Excerpt of the most relevant 10-30 lines
- ```
- - **Notes**: [any adaptations, simplifications, or deviations from the paper]
-
- ## Key Insights
- - [Insight 1: anything surprising or important for implementation]
- - [Insight 2: ...]
- ```
-
- ### Phase 5: Synthesize
-
- After all papers are surveyed, write `workspace/survey_res.md`:
-
- ```markdown
- # Literature Survey: [Research Idea]
-
- ## Atomic Definitions
- [Complete list with brief descriptions]
-
- ## Theory-to-Code Mapping
- [For each atomic definition: the formula, which papers define it, which repos implement it, and the recommended implementation approach]
-
- ## Implementation Recommendations
- [Which reference implementations to adapt, potential pitfalls, suggested architecture]
-
- ## Open Questions
- [Anything unclear that may need user input]
- ```
-
- ## Quality Criteria
-
- - Every atomic definition must appear in at least one paper note.
- - Every formula must have a corresponding code reference (or be flagged as "no reference found").
- - Do not skip papers. If a paper is not relevant, note why and move on.
- - Err on the side of extracting more detail rather than less. The implementation step depends on this survey.
package/skills/research-pipeline/references/workspace-spec.md
@@ -1,81 +0,0 @@
- # Workspace Directory Specification
-
- All research pipeline artifacts live in a `workspace/` directory. The location is either specified by the user or defaults to the current working directory plus `workspace/`.
-
- ## Directory Layout
-
- ```
- workspace/
-   task.json            # Input: research task definition
-   search_results.md    # Step 2: arxiv + github search results
-   prepare_res.md       # Step 3: selected repos and rationale
-   survey_res.md        # Step 5: synthesized literature survey
-   plan_res.md          # Step 6: four-part implementation plan
-   ml_res.md            # Step 7: implementation report
-   experiment_res.md    # Step 10: full training results
-
-   repos/               # Step 3: cloned reference repositories
-     repo-name-1/
-     repo-name-2/
-
-   papers/              # Step 4: downloaded paper sources
-     2401.12345.tex
-     2401.67890.tex
-
-   notes/               # Step 5: per-paper survey notes
-     paper_001.md
-     paper_002.md
-
-   iterations/          # Steps 8-9: review history
-     judge_v1.md
-     judge_v2.md
-
-   project/             # Step 7: implementation code
-     model/
-     data/
-     training/
-     testing/
-     utils/
-     run.py
-     requirements.txt
- ```
-
- ## Conventions
-
- ### File Existence = Step Completion
-
- The research pipeline uses file existence as the checkpoint mechanism. Before executing any step, check whether its output file already exists. If it does, skip the step.
-
- This enables:
- - **Crash recovery**: resume from the last completed step.
- - **Incremental progress**: re-running the pipeline skips completed work.
- - **Transparency**: a human can inspect progress by listing the directory.
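The file-existence checkpoint can be expressed as a small wrapper. This is a sketch: `run_step` and its callback signature are illustrative, not part of the package's API.

```python
from pathlib import Path
from typing import Callable

def run_step(output: Path, step: Callable[[], str]) -> bool:
    """Run a pipeline step only if its output file is missing.

    `step` produces the text to write; returns True if the step executed,
    False if it was skipped because the checkpoint file already exists.
    """
    if output.exists():
        return False  # already done: skip, which is what enables crash recovery
    output.parent.mkdir(parents=True, exist_ok=True)
    output.write_text(step())
    return True
```

Re-running the whole pipeline through such a wrapper naturally resumes at the first step whose output file is absent.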
53
-
54
- ### Naming Rules
55
-
56
- - Markdown files (`.md`) for human-readable outputs.
57
- - JSON files (`.json`) for structured data (task definition).
58
- - Paper notes use sequential numbering: `paper_001.md`, `paper_002.md`.
59
- - Review iterations use version numbering: `judge_v1.md`, `judge_v2.md`.
60
-
61
- ### Immutability
62
-
63
- Once a step's output is written, do NOT modify it unless the user explicitly asks. If a step needs to be re-done, delete the output file first, then re-execute.
64
-
65
- Exception: `workspace/project/` is mutable during the implement-review-iterate loop (Steps 7-9).
66
-
67
- ### task.json Schema
68
-
69
- ```json
70
- {
71
- "idea": "A 1-3 sentence description of the research idea",
72
- "references": ["2401.12345", "paper title string"],
73
- "domain": "recommendation systems",
74
- "date_limit": "2024-01-01"
75
- }
76
- ```
77
-
78
- - `idea` (required): The core research idea to implement.
79
- - `references` (optional): ArXiv IDs or paper titles as starting points.
80
- - `domain` (optional): Research domain for focused searching.
81
- - `date_limit` (optional): Only consider papers published after this date.
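A loader can enforce this schema before the pipeline starts. The key names come from the schema above; the `load_task` helper and its error messages are illustrative (a stricter version might also validate the `date_limit` date format):

```python
import json

REQUIRED = {"idea"}
OPTIONAL = {"references", "domain", "date_limit"}

def load_task(path: str) -> dict:
    """Load workspace/task.json and sanity-check it against the schema."""
    with open(path) as f:
        task = json.load(f)
    missing = REQUIRED - task.keys()
    if missing:
        raise ValueError(f"task.json missing required keys: {sorted(missing)}")
    unknown = task.keys() - REQUIRED - OPTIONAL
    if unknown:
        raise ValueError(f"task.json has unknown keys: {sorted(unknown)}")
    return task
```

Failing fast here is cheaper than discovering a missing `idea` field several pipeline steps in.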