npm - scientify - Versions diffs - 1.3.0 → 1.4.0 - Mend

scientify 1.3.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (50)

package/README.md +38 -14
package/README.zh.md +38 -15
package/dist/index.d.ts.map +1 -1
package/dist/index.js +21 -2
package/dist/index.js.map +1 -1
package/dist/src/services/auto-updater.d.ts +15 -0
package/dist/src/services/auto-updater.d.ts.map +1 -0
package/dist/src/services/auto-updater.js +188 -0
package/dist/src/services/auto-updater.js.map +1 -0
package/dist/src/tools/arxiv-download.d.ts +25 -0
package/dist/src/tools/arxiv-download.d.ts.map +1 -0
package/dist/src/tools/arxiv-download.js +179 -0
package/dist/src/tools/arxiv-download.js.map +1 -0
package/dist/src/tools/{arxiv-tool.d.ts → arxiv-search.d.ts} +11 -8
package/dist/src/tools/arxiv-search.d.ts.map +1 -0
package/dist/src/tools/arxiv-search.js +140 -0
package/dist/src/tools/arxiv-search.js.map +1 -0
package/dist/src/tools/github-search-tool.d.ts +5 -1
package/dist/src/tools/github-search-tool.d.ts.map +1 -1
package/dist/src/tools/github-search-tool.js +10 -30
package/dist/src/tools/github-search-tool.js.map +1 -1
package/dist/src/tools/result.d.ts +37 -0
package/dist/src/tools/result.d.ts.map +1 -0
package/dist/src/tools/result.js +39 -0
package/dist/src/tools/result.js.map +1 -0
package/dist/src/tools/workspace.d.ts +32 -0
package/dist/src/tools/workspace.d.ts.map +1 -0
package/dist/src/tools/workspace.js +69 -0
package/dist/src/tools/workspace.js.map +1 -0
package/openclaw.plugin.json +22 -1
package/package.json +13 -2
package/skills/_shared/workspace-spec.md +15 -5
package/skills/idea-generation/SKILL.md +2 -0
package/skills/install-scientify/SKILL.md +15 -7
package/skills/literature-survey/SKILL.md +86 -214
package/skills/research-experiment/SKILL.md +114 -0
package/skills/research-implement/SKILL.md +166 -0
package/skills/research-pipeline/SKILL.md +104 -166
package/skills/research-plan/SKILL.md +121 -0
package/skills/research-review/SKILL.md +110 -0
package/skills/research-survey/SKILL.md +140 -0
package/skills/write-review-paper/SKILL.md +2 -0
package/dist/src/tools/arxiv-tool.d.ts.map +0 -1
package/dist/src/tools/arxiv-tool.js +0 -258
package/dist/src/tools/arxiv-tool.js.map +0 -1
package/skills/research-pipeline/references/prompts/implement.md +0 -135
package/skills/research-pipeline/references/prompts/plan.md +0 -142
package/skills/research-pipeline/references/prompts/review.md +0 -118
package/skills/research-pipeline/references/prompts/survey.md +0 -105
package/skills/research-pipeline/references/workspace-spec.md +0 -5

package/skills/research-pipeline/SKILL.md CHANGED

@@ -1,245 +1,183 @@
 ---
 name: research-pipeline
-description: "End-to-end research automation: idea → literature → plan → implement → review → iterate. Use for: implementing a specific research idea, full ML research workflow. NOT for: just exploring literature (use /literature-survey), just generating ideas (use /idea-generation), just writing review (use /write-review-paper)."
+description: "Orchestrates the full research workflow by spawning sub-agents for each phase. Checks workspace state, dispatches tasks, verifies outputs. Use for: end-to-end ML research. Each phase runs in an isolated context via sessions_spawn."
 metadata:
   {
     "openclaw":
       {
         "emoji": "🔬",
-        "requires": { "bins": ["git", "python3"] },
+        "requires": { "bins": ["git", "python3", "uv"] },
       },
   }
 ---
-# Research Pipeline
+# Research Pipeline (Orchestrator)
-Automate an end-to-end ML research workflow: idea → literature → survey → plan → implement → review → iterate.
+**Don't ask permission. Just do it.**
-**Workspace:** See `../_shared/workspace-spec.md` for directory structure. Outputs go to `$WORKSPACE/project/`, `$WORKSPACE/iterations/`.
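+You are the orchestrator. You do not do the research work yourself; instead you:
+1. Check the state of the workspace files
+2. Build the task description for the next step
+3. Dispatch it to a sub-agent with `sessions_spawn`
+4. Wait for completion, then verify the output
+5. Repeat until the workflow is finished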
-**File existence = step completion.** Skip steps whose output already exists.
+**Workspace:** See `../_shared/workspace-spec.md`. Set `$W` to the active project directory.
 ---
-## Step 0: Check Active Project
+## Step 0: Initialization
 ```bash
-cat ~/.openclaw/workspace/projects/.active 2>/dev/null
+ACTIVE=$(cat ~/.openclaw/workspace/projects/.active 2>/dev/null)
 ```
-If active, set `$WORKSPACE = ~/.openclaw/workspace/projects/{project_id}/`.
-If none, create based on research idea in Step 1.
+If there is no active project:
+1. Ask the user: what is the research topic?
+2. Create the project directory
+3. Write `task.json`
----
-## Step 1: Parse Task
-Read `$WORKSPACE/task.json`. If it does not exist, ask the user for:
+Set `$W = ~/.openclaw/workspace/projects/{project-id}`
-- **idea**: A description of the research idea (1-3 sentences).
-- **references** (optional): ArXiv IDs or paper titles as starting points.
-- **domain** (optional): e.g. "recommendation systems", "NLP", "computer vision".
-Write the result to `$WORKSPACE/task.json`:
+---
-```json
-{
-  "idea": "...",
-  "references": ["2401.12345", "..."],
-  "domain": "...",
-  "date_limit": "2024-01-01"
-}
-```
+## Dispatch Loop
-**Output:** `$WORKSPACE/task.json`
+Check each phase in order. **Run only one phase at a time.**
-## Step 2: Search
+### Phase 1: Literature Survey
-Use the `arxiv` tool to search for 5-10 related papers based on the idea and any reference paper titles. Use the `github_search` tool to find related repositories.
+**Check:** Does the `$W/papers/_meta/` directory exist and contain `.json` files?
-Combine results into a markdown report:
+**If missing, spawn:**
 ```
-## ArXiv Papers
-- [title](pdf_url) — arxiv_id — summary of relevance
-## GitHub Repositories
-- [repo_name](url) — stars — language — summary of relevance
+sessions_spawn({
+  task: "Working directory: $W\nRun the /literature-survey skill\n\nResearch topic: {extracted from task.json}\nSearch for, filter, and download relevant papers into $W/papers/",
+  label: "Literature Survey"
+})
 ```
-**Output:** `$WORKSPACE/search_results.md`
-## Step 3: Prepare References
-Read `$WORKSPACE/search_results.md`. Select 3-5 of the most relevant repositories.
+**Verify:** `ls $W/papers/_meta/*.json` lists at least 3 files
-For each selected repo, clone it into `$WORKSPACE/repos/`:
-```bash
-git clone --depth 1 <url> $WORKSPACE/repos/<repo_name>
-```
-Write a summary of selected repos and their relevance to the idea.
-**Output:** `$WORKSPACE/prepare_res.md`
-## Step 4: Download Papers
-For each important paper from Step 2, use the `arxiv` tool with `download: true` and `output_dir: "$WORKSPACE/papers/"` to get .tex source files.
-If download fails for any paper, note the failure and continue. The survey step can work with abstracts alone.
+---
-**Output:** `$WORKSPACE/papers/*.tex` (or `.md` summaries if .tex unavailable)
+### Phase 2: Deep Survey
-## Step 5: Literature Survey
+**Check:** Does `$W/survey_res.md` exist?
-This is the most intellectually demanding step. Read `references/prompts/survey.md` for detailed guidance.
+**If missing, first read the Phase 1 summary, then spawn:**
-For each paper:
+```
+sessions_spawn({
+  task: "Working directory: $W\nRun the /research-survey skill\n\nContext: {N} papers downloaded, covering directions {directions}\nAnalyze the papers in depth, extract the formulas, and write survey_res.md",
+  label: "Deep Survey"
+})
+```
-1. Read the .tex source (or abstract) thoroughly.
-2. Extract: core method, mathematical formulas, key contributions.
-3. Read the corresponding reference codebase in `$WORKSPACE/repos/`.
-4. Map math formulas to code implementations.
-5. Write structured notes to `$WORKSPACE/notes/paper_NNN.md`.
+**Verify:** `$W/survey_res.md` exists and contains the "Core method comparison" (核心方法对比) table
-Each note file should contain:
+---
-```markdown
-# [Paper Title]
+### Phase 3: Implementation Plan
-## Core Method
-...
+**Check:** Does `$W/plan_res.md` exist?
-## Math Formulas
-...
+**If missing, read a summary of survey_res.md, then spawn:**
-## Code Implementation
-File: repos/<repo>/path/to/file.py
-```python
-# relevant code excerpt
 ```
-## Key Insights
-...
+sessions_spawn({
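+  task: "Working directory: $W\nRun the /research-plan skill\n\nContext: the survey found that the core method is {method}; recommended technical route: {route}\nWrite a complete implementation plan to plan_res.md",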
+  label: "Research Plan"
+})
 ```
-After all papers are surveyed, write a synthesis combining all notes.
-**Output:** `$WORKSPACE/notes/paper_*.md` + `$WORKSPACE/survey_res.md`
+**Verify:** `$W/plan_res.md` exists and contains 4 sections (Dataset/Model/Training/Testing)
-## Step 6: Implementation Plan
-Read `references/prompts/plan.md` for detailed guidance.
-Based on `survey_res.md`, `prepare_res.md`, and `task.json`, create a four-part plan:
-1. **Dataset Plan**: data source, loading pipeline, preprocessing, dataloader design.
-2. **Model Plan**: architecture, math formulas to implement, reference code to adapt.
-3. **Training Plan**: loss functions, optimizer, hyperparameters, monitoring.
-4. **Testing Plan**: metrics, evaluation protocol, baselines.
-**Output:** `$WORKSPACE/plan_res.md`
+---
-## Step 7: Implement
+### Phase 4: Implementation
-Read `references/prompts/implement.md` for detailed guidance.
+**Check:** Does `$W/ml_res.md` exist?
-Create a self-contained project in `$WORKSPACE/project/`:
+**If missing, read the key points of plan_res.md, then spawn:**
 ```
-$WORKSPACE/project/
-  model/          # model architecture
-  data/           # data loading and preprocessing
-  training/       # training loop and configs
-  testing/        # evaluation scripts
-  utils/          # shared utilities
-  run.py          # main entry point
-  requirements.txt
+sessions_spawn({
+  task: "Working directory: $W\nRun the /research-implement skill\n\nContext:\n- The plan has {N} components: {list}\n- Dataset: {dataset}\n- Framework: PyTorch\nImplement the code in $W/project/, run a 2-epoch validation, and write ml_res.md",
+  label: "Research Implement"
+})
 ```
-**Critical rules:**
-- Do NOT import directly from `$WORKSPACE/repos/`. Adapt and rewrite code.
-- Implement EVERY component from `plan_res.md`.
-- Use real datasets, not toy data.
-- First run: 2 epochs only (quick validation).
+**Verify:**
+- `$W/project/run.py` exists
+- `$W/ml_res.md` contains a `[RESULT]` line
+- the loss value is not NaN/Inf
-Execute:
-```bash
-cd $WORKSPACE/project && pip install -r requirements.txt && python run.py --epochs 2
-```
+---
-**Note:** GPU support requires external configuration. For GPU-accelerated training, consider using a dedicated ML environment or cloud instance.
+### Phase 5: Review
-**Output:** `$WORKSPACE/project/` (code) + `$WORKSPACE/ml_res.md` (implementation report)
+**Check:** Is the verdict of the latest `judge_v*.md` under `$W/iterations/` PASS?
-## Step 8: Review
+**If there is no PASS, spawn:**
-Read `references/prompts/review.md` for detailed guidance.
+```
+sessions_spawn({
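+  task: "Working directory: $W\nRun the /research-review skill\n\nContext:\n- Implementation report: ml_res.md shows train_loss={value}\n- The plan is in plan_res.md\nReview the code; if changes are needed, iterate on fixes (at most 3 rounds)",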
+  label: "Research Review"
+})
+```
-Review the implementation against:
+**Verify:** the latest `judge_v*.md` contains `verdict: PASS` or `verdict: BLOCKED`
-- Each atomic idea from `survey_res.md`: is the math correctly translated to code?
-- The plan from `plan_res.md`: are all components present?
-- Code quality: no toy implementations, proper error handling, correct data pipeline.
+If BLOCKED → report to the user and wait for instructions
-Write a structured review:
+---
-```markdown
-# Review v1
+### Phase 6: Full Experiment
-## Verdict: PASS / NEEDS_REVISION
+**Check:** Does `$W/experiment_res.md` exist?
-## Checklist
-- [ ] Dataset loading matches plan
-- [ ] Model architecture matches formulas
-- [ ] Loss function correct
-- [ ] Training loop proper
-- [ ] Evaluation metrics correct
+**If missing, spawn:**
-## Issues (if NEEDS_REVISION)
-1. Issue description → suggested fix
-2. ...
+```
+sessions_spawn({
+  task: "Working directory: $W\nRun the /research-experiment skill\n\nContext:\n- Review PASS, code verified\n- plan_res.md specifies the full epoch count\nRun the full training plus ablation experiments and write experiment_res.md",
+  label: "Research Experiment"
+})
 ```
-**Output:** `$WORKSPACE/iterations/judge_v1.md`
-## Step 9: Iterate
-If the review verdict is `NEEDS_REVISION`:
-1. Read `$WORKSPACE/iterations/judge_vN.md` for the latest suggestions.
-2. Fix each issue in `$WORKSPACE/project/`.
-3. Re-run the 2-epoch validation.
-4. Write a new review to `$WORKSPACE/iterations/judge_v(N+1).md`.
-5. Repeat until `PASS` or 3 iterations reached.
+**Verify:** `$W/experiment_res.md` contains a `[RESULT]` line and an ablation table
-If 3 iterations are exhausted without PASS, summarize remaining issues and ask the user for guidance.
+---
-**Output:** `$WORKSPACE/iterations/judge_v*.md` (review history)
+## Completion
-## Step 10: Full Training
+Once every phase has passed verification, print the final summary:
-Once review passes:
+```
+Research workflow complete!
+- Papers: {N} analyzed
+- Code: $W/project/
+- Results: $W/experiment_res.md
+- Review: $W/iterations/ ({N} rounds)
+```
-1. Update epoch count in `run.py` to the full training value.
-2. Execute full training run.
-3. Collect and analyze results.
+---
-**Output:** `$WORKSPACE/experiment_res.md`
+## Context Bridging Rules
-## Batch Processing Rule
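+Before each spawn, the orchestrator must:
+1. **Read** the output file of the previous step
+2. **Summarize** the key information in 2-5 lines (do not copy the full text)
+3. **Write** it into the "Context" part of the spawn task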
-When you need to apply the same LLM operation to more than 10 files (e.g., summarizing all papers), do NOT process them one by one in conversation. Instead, write a script to handle them in batch.
+This gives the sub-agent enough information to get started without polluting its context with the full output of earlier steps.
 ## Recovery
-If the session crashes or context fills up:
-1. List files in `$WORKSPACE/` to see which steps completed.
-2. Read the most recent output file to understand current state.
-3. Resume from the first missing output file.
-Never re-do a step whose output file already exists unless the user explicitly asks.
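+If the orchestrator is interrupted:
+1. Re-run /research-pipeline
+2. The orchestrator automatically checks all files and skips completed phases
+3. It resumes from the first missing output file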
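
The dispatch and recovery rules added in this file boil down to "find the first phase whose output is missing". A minimal Python sketch of that check follows. The function names `next_phase` and `latest_verdict` are hypothetical, the verdict regex is an assumption about how `judge_v*.md` spells its verdict, and the sketch uses the lighter "check" conditions rather than the stricter "verify" ones.

```python
import re
from pathlib import Path
from typing import Optional


def latest_verdict(iterations: Path) -> Optional[str]:
    """Return the verdict found in the highest-numbered judge_v{N}.md, if any."""
    if not iterations.is_dir():
        return None
    judges = []
    for path in iterations.glob("judge_v*.md"):
        match = re.fullmatch(r"judge_v(\d+)\.md", path.name)
        if match:
            judges.append((int(match.group(1)), path))
    if not judges:
        return None
    _, newest = max(judges, key=lambda item: item[0])
    # Assumption: the verdict appears as "verdict: PASS" or "## Verdict: PASS".
    found = re.search(r"verdict:\s*(PASS|NEEDS_REVISION|BLOCKED)\b",
                      newest.read_text(encoding="utf-8"), re.IGNORECASE)
    return found.group(1).upper() if found else None


def next_phase(w: Path) -> Optional[str]:
    """Return the label of the first phase whose output is missing, else None."""
    meta = w / "papers" / "_meta"
    checks = [
        ("Literature Survey", meta.is_dir() and any(meta.glob("*.json"))),
        ("Deep Survey", (w / "survey_res.md").is_file()),
        ("Research Plan", (w / "plan_res.md").is_file()),
        ("Research Implement", (w / "ml_res.md").is_file()),
        ("Research Review", latest_verdict(w / "iterations") == "PASS"),
        ("Research Experiment", (w / "experiment_res.md").is_file()),
    ]
    for label, done in checks:
        if not done:
            return label
    return None
```

Because the function only reads the filesystem, calling it again after an interruption gives the same answer as before the interruption, which is the behaviour the Recovery section describes.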

package/skills/research-plan/SKILL.md ADDED

@@ -0,0 +1,121 @@
+---
+name: research-plan
+description: "Create a structured implementation plan from survey results. Produces dataset/model/training/testing plans. Requires survey_res.md from /research-survey."
+metadata:
+  {
+    "openclaw":
+      {
+        "emoji": "📋",
+      },
+  }
+---
+# Research Plan
+**Don't ask permission. Just do it.**
+**Workspace:** See `../_shared/workspace-spec.md`. Set `$W` to the active project directory.
+## Prerequisites
+| File | Source |
+|------|--------|
+| `$W/task.json` | /research-pipeline or user |
+| `$W/survey_res.md` | /research-survey |
+| `$W/notes/paper_*.md` | /research-survey |
+| `$W/repos/` (optional) | git clone |
+**If `survey_res.md` is missing, STOP:** "Run /research-survey first to complete the deep analysis"
+## Output
+| File | Content |
+|------|---------|
+| `$W/plan_res.md` | Four-part implementation plan |
+---
+## Workflow
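+### Step 1: Read the context
+Read the following files to understand the research goal and technical approach:
+- `$W/task.json`: research goal
+- `$W/survey_res.md`: recommended technical route and core formulas
+- Browse the directory structure of `$W/repos/` (if present)
+### Step 2: Draft the four-part plan
+Write to `$W/plan_res.md`: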
+```markdown
+# Implementation Plan
+## 1. Dataset Plan
+- **Dataset name:** {name}
+- **Source:** {URL or description}
+- **Size:** {samples / size}
+- **Preprocessing steps:**
+  1. {step}
+  2. {step}
+- **DataLoader design:**
+  - batch_size: {value}
+  - input format: {shape}
+  - output format: {shape}
+## 2. Model Plan
+- **Architecture overview:** {1-2 sentences}
+- **Component list:**
+| Component | Formula | Reference code | Input → Output |
+|------|----------|----------|-------------|
+| {component} | $formula$ | `repos/xxx/file.py` | {shape} → {shape} |
+- **Estimated parameter count:** {approximate}
+## 3. Training Plan
+- **Loss function:** {formula + description}
+- **Optimizer:** {Adam/SGD/...}, lr={value}
+- **Scheduler:** {if any}
+- **Training parameters:**
+  - epochs (validation): 2
+  - epochs (full): {value}
+  - batch_size: {value}
+- **Monitored metrics:** {loss, metrics to log}
+## 4. Testing Plan
+- **Evaluation metrics:**
+| Metric | Formula/description | Expected range |
+|--------|-----------|----------|
+| {metric} | {description} | {range} |
+- **Baselines:** {what to compare against}
+- **Ablation experiments (preliminary plan):**
+  1. {ablation 1}
+  2. {ablation 2}
+```
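+### Step 3: Self-check
+Verify that the plan is complete:
+- [ ] Every model component has a corresponding formula
+- [ ] The dataset has a concrete way to obtain it
+- [ ] The loss function has a mathematical definition
+- [ ] Evaluation metrics are clearly defined
+- [ ] Training parameters are reasonable (e.g. not lr=0.1 for Adam)
+Mark any uncertain item in the plan with `⚠️ TODO: {reason}`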
+---
+## Rules
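+1. Every component in the plan must be traceable to a formula or method in survey_res.md
+2. Do not write a "generic" plan: every parameter needs a concrete value or a reasonable estimate
+3. If reference repositories exist, the component table must include reference code paths
+4. plan_res.md counts as complete when all four parts exist and are non-empty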
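
Rule 4 above (all four parts present and non-empty) can be checked mechanically. The sketch below is one possible way to do that in Python; `missing_plan_sections` is a hypothetical helper, and it assumes the plan uses the `## N. <Name> Plan` headings from the template in Step 2.

```python
import re
from typing import List

REQUIRED_SECTIONS = ("Dataset Plan", "Model Plan", "Training Plan", "Testing Plan")


def missing_plan_sections(plan_text: str) -> List[str]:
    """Return the required sections that are absent or have an empty body."""
    # Splitting on level-2 headings yields: preamble, heading, body, heading, body...
    parts = re.split(r"^## +(.+)$", plan_text, flags=re.MULTILINE)
    bodies = {}
    for heading, body in zip(parts[1::2], parts[2::2]):
        bodies[heading.strip()] = body.strip()
    missing = []
    for name in REQUIRED_SECTIONS:
        filled = any(name in heading and body for heading, body in bodies.items())
        if not filled:
            missing.append(name)
    return missing
```

An empty return value means the completion criterion is met; it says nothing about the quality of the plan, which the self-check list in Step 3 still has to cover.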

package/skills/research-review/SKILL.md ADDED

@@ -0,0 +1,110 @@
+---
+name: research-review
+description: "Review ML implementation against plan and survey. Iterates fix-rerun-review up to 3 times. Requires ml_res.md from /research-implement."
+metadata:
+  {
+    "openclaw":
+      {
+        "emoji": "🔍",
+        "requires": { "bins": ["python3", "uv"] },
+      },
+  }
+---
+# Research Review
+**Don't ask permission. Just do it.**
+**Workspace:** See `../_shared/workspace-spec.md`. Set `$W` to the active project directory.
+## Prerequisites
+| File | Source |
+|------|--------|
+| `$W/ml_res.md` | /research-implement |
+| `$W/project/` | /research-implement |
+| `$W/plan_res.md` | /research-plan |
+| `$W/survey_res.md` | /research-survey |
+**If `ml_res.md` is missing, STOP:** "Run /research-implement first to complete the code implementation"
+## Output
+| File | Content |
+|------|---------|
+| `$W/iterations/judge_v{N}.md` | Review report for each round |
+`verdict: PASS` in the final report means the review passed.
+---
+## Workflow
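+### Step 1: Review the code
+Read the following:
+- `$W/plan_res.md`: expectations for each component
+- `$W/survey_res.md`: core formulas
+- `$W/project/`: actual code
+- `$W/ml_res.md`: execution results
+### Step 2: Check item by item
+| Check | Method |
+|--------|------|
+| Data pipeline matches the plan | Compare the plan's Dataset Plan with the `data/` implementation |
+| Model architecture matches the formulas | Compare the survey formulas with the `model/` implementation |
+| Loss function is correct | Compare the plan's Training Plan with `training/loss.py` |
+| Evaluation metrics are correct | Compare the plan's Testing Plan with `testing/` |
+| [RESULT] lines exist | Check where the numbers in ml_res.md come from |
+| Loss is reasonable | Not NaN/Inf, trending downward |
+| No mock data (unless declared) | Search for `# MOCK DATA` comments |
+### Step 3: Write the review report
+Write to `$W/iterations/judge_v1.md`: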
+```markdown
+# Review v1
+## Verdict: PASS / NEEDS_REVISION
+## Checklist
+- [x/✗] Dataset loading matches plan
+- [x/✗] Model architecture matches formulas
+- [x/✗] Loss function correct
+- [x/✗] Training loop proper
+- [x/✗] Evaluation metrics correct
+- [x/✗] Results are from real execution (not fabricated)
+## Issues (if NEEDS_REVISION)
+1. **{issue}**: {description} → **Fix**: {specific fix instruction}
+2. ...
+```
+### Step 4: Iterate (if NEEDS_REVISION)
+Loop at most 3 times:
+1. Read the suggested changes in `judge_v{N}.md`
+2. Modify the code in `$W/project/`
+3. Re-run:
+   ```bash
+   cd $W/project && source .venv/bin/activate && python run.py --epochs 2
+   ```
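+4. Read the execution output and verify the fix
+5. Write `judge_v{N+1}.md`
+6. If PASS → stop; otherwise continue
+### Step 5: Final verdict
+Still NEEDS_REVISION after 3 rounds → list the remaining issues in the last judge file, mark it `verdict: BLOCKED`, and wait for the user to step in.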
+---
+## Rules
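+1. The review must check against the plan item by item, not just whether "the code runs"
+2. Every issue must come with a specific fix instruction (not "please improve")
+3. To verify a fix, the code must be re-run and its output checked
+4. PASS requires that every checklist item passes and the [RESULT] values are reasonable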
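
The control flow of Steps 3 to 5 can be sketched as a bounded loop. This is only one reading of the text: it assumes an initial review (`judge_v1.md`) followed by at most 3 fix rounds, so at most `judge_v4.md` is written. The `review` and `fix` callables are hypothetical stand-ins for the agent's own actions, not part of the package.

```python
from typing import Callable, List, Tuple

MAX_FIX_ROUNDS = 3  # "Loop at most 3 times" in Step 4


def run_review_loop(review: Callable[[int], str],
                    fix: Callable[[int], None]) -> Tuple[str, List[str]]:
    """Drive review -> fix -> re-review until PASS or the round cap is hit.

    review(n) stands for writing judge_v{n}.md and returning its verdict,
    either "PASS" or "NEEDS_REVISION". fix(n) stands for applying the fixes
    listed in judge_v{n}.md and re-running the 2-epoch validation.
    """
    history = [review(1)]
    for round_no in range(1, MAX_FIX_ROUNDS + 1):
        if history[-1] == "PASS":
            return "PASS", history
        fix(round_no)
        history.append(review(round_no + 1))
    # After the last allowed round, anything other than PASS becomes BLOCKED.
    final = "PASS" if history[-1] == "PASS" else "BLOCKED"
    return final, history
```

Returning the full verdict history alongside the final verdict keeps the number of rounds available for the orchestrator's closing summary.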