scientify 1.3.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (50)
  1. package/README.md +38 -14
  2. package/README.zh.md +38 -15
  3. package/dist/index.d.ts.map +1 -1
  4. package/dist/index.js +21 -2
  5. package/dist/index.js.map +1 -1
  6. package/dist/src/services/auto-updater.d.ts +15 -0
  7. package/dist/src/services/auto-updater.d.ts.map +1 -0
  8. package/dist/src/services/auto-updater.js +188 -0
  9. package/dist/src/services/auto-updater.js.map +1 -0
  10. package/dist/src/tools/arxiv-download.d.ts +25 -0
  11. package/dist/src/tools/arxiv-download.d.ts.map +1 -0
  12. package/dist/src/tools/arxiv-download.js +179 -0
  13. package/dist/src/tools/arxiv-download.js.map +1 -0
  14. package/dist/src/tools/{arxiv-tool.d.ts → arxiv-search.d.ts} +11 -8
  15. package/dist/src/tools/arxiv-search.d.ts.map +1 -0
  16. package/dist/src/tools/arxiv-search.js +140 -0
  17. package/dist/src/tools/arxiv-search.js.map +1 -0
  18. package/dist/src/tools/github-search-tool.d.ts +5 -1
  19. package/dist/src/tools/github-search-tool.d.ts.map +1 -1
  20. package/dist/src/tools/github-search-tool.js +10 -30
  21. package/dist/src/tools/github-search-tool.js.map +1 -1
  22. package/dist/src/tools/result.d.ts +37 -0
  23. package/dist/src/tools/result.d.ts.map +1 -0
  24. package/dist/src/tools/result.js +39 -0
  25. package/dist/src/tools/result.js.map +1 -0
  26. package/dist/src/tools/workspace.d.ts +32 -0
  27. package/dist/src/tools/workspace.d.ts.map +1 -0
  28. package/dist/src/tools/workspace.js +69 -0
  29. package/dist/src/tools/workspace.js.map +1 -0
  30. package/openclaw.plugin.json +22 -1
  31. package/package.json +13 -2
  32. package/skills/_shared/workspace-spec.md +15 -5
  33. package/skills/idea-generation/SKILL.md +2 -0
  34. package/skills/install-scientify/SKILL.md +15 -7
  35. package/skills/literature-survey/SKILL.md +86 -214
  36. package/skills/research-experiment/SKILL.md +114 -0
  37. package/skills/research-implement/SKILL.md +166 -0
  38. package/skills/research-pipeline/SKILL.md +104 -166
  39. package/skills/research-plan/SKILL.md +121 -0
  40. package/skills/research-review/SKILL.md +110 -0
  41. package/skills/research-survey/SKILL.md +140 -0
  42. package/skills/write-review-paper/SKILL.md +2 -0
  43. package/dist/src/tools/arxiv-tool.d.ts.map +0 -1
  44. package/dist/src/tools/arxiv-tool.js +0 -258
  45. package/dist/src/tools/arxiv-tool.js.map +0 -1
  46. package/skills/research-pipeline/references/prompts/implement.md +0 -135
  47. package/skills/research-pipeline/references/prompts/plan.md +0 -142
  48. package/skills/research-pipeline/references/prompts/review.md +0 -118
  49. package/skills/research-pipeline/references/prompts/survey.md +0 -105
  50. package/skills/research-pipeline/references/workspace-spec.md +0 -5
@@ -0,0 +1,140 @@
+ ---
+ name: research-survey
+ description: "Deep analysis of downloaded papers: read .tex sources, extract core methods/formulas, map formulas to reference code, write structured notes. Requires papers already downloaded via /literature-survey."
+ metadata:
+   {
+     "openclaw":
+       {
+         "emoji": "📖",
+       },
+   }
+ ---
+
+ # Research Survey (Deep Analysis)
+
+ **Don't ask permission. Just do it.**
+
+ **Workspace:** See `../_shared/workspace-spec.md`. Set `$W` to the active project directory.
+
+ ## Prerequisites
+
+ Read and verify these files exist before starting:
+
+ | File | Source |
+ |------|--------|
+ | `$W/papers/_meta/*.json` | /literature-survey |
+ | `$W/papers/_downloads/` or `$W/papers/{direction}/` | /literature-survey |
+ | `$W/repos/` (optional) | git clone in /literature-survey or manually |
+
+ **If any prerequisite is missing, STOP and report:** "Run /literature-survey first to complete the paper downloads."
+
+ ## Output
+
+ | File | Content |
+ |------|---------|
+ | `$W/notes/paper_{arxiv_id}.md` | Per-paper structured notes |
+ | `$W/survey_res.md` | Synthesis report |
+
+ ---
+
+ ## Workflow
+
+ ### Step 1: Collect the paper list
+
+ ```bash
+ ls $W/papers/_meta/
+ ```
+
+ Read every `.json` metadata file and build the paper list, sorted by score in descending order.
+
+ ### Step 2: Deep analysis, paper by paper
+
+ **For each paper** (high-scoring papers first):
+
+ #### 2.1 Read the .tex source
+
+ Locate the paper's .tex files (under `_downloads/{arxiv_id}/` or `{direction}/{arxiv_id}/`) and focus on:
+ - **Method / Approach** section
+ - **Model Architecture** section
+ - Mathematical formula definitions
+
+ If there is no .tex (PDF only), analyze based on the abstract.
+
+ #### 2.2 Extract the core content
+
+ From the .tex, extract:
+ - **Core method**: a 1-2 paragraph description
+ - **Mathematical formulas**: at least 1 key formula (keep the LaTeX)
+ - **Novelty**: how it differs from other methods in the field
+
+ #### 2.3 Map to code (if repos/ exists)
+
+ If `$W/repos/` contains a relevant repository:
+ - Find the code that implements each formula
+ - Record the file path and line numbers
+
+ #### 2.4 Write the notes
+
+ Write to `$W/notes/paper_{arxiv_id}.md`:
+
+ ```markdown
+ # {Paper Title}
+
+ - **arXiv:** {arxiv_id}
+ - **Core method:** {1-2 sentences}
+
+ ## Key Formula
+
+ $$
+ {key formula in LaTeX}
+ $$
+
+ Meaning: {explanation}
+
+ ## Code Mapping
+
+ File: `repos/{repo}/path/to/file.py:L42-L60`
+ ```python
+ # relevant code excerpt (< 20 lines)
+ ```
+
+ ## Relevance to This Research
+
+ {how to apply it to the current research}
+ ```
+
+ ### Step 3: Synthesis report
+
+ Read all `notes/paper_*.md` and write `$W/survey_res.md`:
+
+ ```markdown
+ # Survey Synthesis
+
+ ## Paper Overview
+ - Papers analyzed: {N}
+ - Directions covered: {list}
+
+ ## Method Comparison
+
+ | Paper | Method | Key formula | Complexity | Strengths |
+ |------|------|----------|--------|------|
+ | ... | ... | ... | ... | ... |
+
+ ## Recommended Technical Approach
+
+ Based on the analysis above, the recommended technical approach is:
+ {recommendation}
+
+ ## Key Formula Summary
+
+ {list all extracted core formulas}
+ ```
+
+ ---
+
+ ## Rules
+
+ 1. Read the .tex source of every paper (when available); never rely on the abstract alone
+ 2. Every note must contain at least 1 mathematical formula
+ 3. If repos/ exists, attempt to map every formula to code
+ 4. survey_res.md must include the method comparison table
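The Step 1 instructions above (list `$W/papers/_meta/`, read every `.json`, sort by score descending) can be sketched in a few lines. This illustrative Python snippet is not part of the package; the `score` field is named in the skill, but the filename-as-arxiv-id fallback is an assumption about the metadata shape:

```python
import json
from pathlib import Path

def collect_papers(meta_dir):
    """Load every per-paper JSON metadata file and sort by score, highest first."""
    papers = []
    for meta_file in sorted(Path(meta_dir).glob("*.json")):
        with open(meta_file, encoding="utf-8") as f:
            meta = json.load(f)
        # Assumption: if the metadata lacks an id, the filename stem is the arXiv id
        meta.setdefault("arxiv_id", meta_file.stem)
        papers.append(meta)
    # Papers without a score sort last
    papers.sort(key=lambda p: p.get("score", float("-inf")), reverse=True)
    return papers
```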
@@ -12,6 +12,8 @@ metadata:
  
  # Literature Review Writing
  
+ **Don't ask permission. Just do it.**
+
  Guide for writing a structured literature review or survey paper from papers you've already collected. This skill helps with reading strategy, note organization, and academic writing.
  
  **Workspace:** See `../_shared/workspace-spec.md` for directory structure. Outputs go to `$WORKSPACE/review/`.
@@ -1 +0,0 @@
- {"version":3,"file":"arxiv-tool.d.ts","sourceRoot":"","sources":["../../../src/tools/arxiv-tool.ts"],"names":[],"mappings":"AAUA,eAAO,MAAM,eAAe;;;;;;;EA+B1B,CAAC;AA2LH,wBAAgB,eAAe;;;;;;;;;;;;2BAOE,MAAM,WAAW,OAAO;;;;EA6ExD"}
@@ -1,258 +0,0 @@
- import { Type } from "@sinclair/typebox";
- import * as fs from "node:fs";
- import * as path from "node:path";
- import * as os from "node:os";
- import * as tar from "tar";
- const ARXIV_API_URL = "https://export.arxiv.org/api/query";
- const DEFAULT_MAX_RESULTS = 10;
- const MAX_RESULTS_LIMIT = 50;
- export const ArxivToolSchema = Type.Object({
-     query: Type.String({ description: "Search query for arXiv papers (e.g. 'graph neural network')." }),
-     max_results: Type.Optional(Type.Number({
-         description: "Maximum number of results to return (1-50). Default: 10.",
-         minimum: 1,
-         maximum: MAX_RESULTS_LIMIT,
-     })),
-     sort_by: Type.Optional(Type.String({
-         description: 'Sort order: "relevance" (default), "lastUpdatedDate", or "submittedDate".',
-     })),
-     date_from: Type.Optional(Type.String({
-         description: "Filter papers submitted after this date (YYYY-MM-DD).",
-     })),
-     download: Type.Optional(Type.Boolean({
-         description: "If true, download .tex source for each paper to output_dir. Default: false.",
-     })),
-     output_dir: Type.Optional(Type.String({
-         description: "Directory to download .tex source files into. Default: workspace/papers/",
-     })),
- });
- const SORT_MAP = {
-     relevance: "relevance",
-     lastupdateddate: "lastUpdatedDate",
-     submitteddate: "submittedDate",
- };
- function readStringParam(params, key, opts) {
-     const value = params[key];
-     if (value === undefined || value === null) {
-         if (opts?.required) {
-             throw new Error(`Missing required parameter: ${key}`);
-         }
-         return undefined;
-     }
-     return String(value);
- }
- function readNumberParam(params, key, opts) {
-     const value = params[key];
-     if (value === undefined || value === null)
-         return undefined;
-     const num = Number(value);
-     if (isNaN(num))
-         return undefined;
-     return opts?.integer ? Math.floor(num) : num;
- }
- /**
-  * Download and extract .tex source from arXiv
-  */
- async function downloadTexSource(arxivId, outputDir) {
-     const paperDir = path.join(outputDir, arxivId);
-     await fs.promises.mkdir(paperDir, { recursive: true });
-     const srcUrl = `https://arxiv.org/src/${arxivId}`;
-     const tarPath = path.join(paperDir, "source.tar.gz");
-     try {
-         // Try to download .tex source
-         const response = await fetch(srcUrl);
-         if (!response.ok) {
-             // Fallback to PDF
-             return await downloadPdfFallback(arxivId, outputDir);
-         }
-         const buffer = Buffer.from(await response.arrayBuffer());
-         await fs.promises.writeFile(tarPath, buffer);
-         // Check if it's actually a tar.gz or just a single file
-         const isTarGz = buffer[0] === 0x1f && buffer[1] === 0x8b;
-         if (isTarGz) {
-             // Extract tar.gz
-             await tar.x({ file: tarPath, cwd: paperDir });
-             await fs.promises.unlink(tarPath); // Remove tar.gz after extraction
-             // Find all .tex files
-             const files = await findTexFiles(paperDir);
-             if (files.length === 0) {
-                 return await downloadPdfFallback(arxivId, outputDir);
-             }
-             return { success: true, format: "tex", files };
-         }
-         else {
-             // Single file (probably .tex directly)
-             const texPath = path.join(paperDir, "main.tex");
-             await fs.promises.rename(tarPath, texPath);
-             return { success: true, format: "tex", files: ["main.tex"] };
-         }
-     }
-     catch (error) {
-         // Fallback to PDF on any error
-         return await downloadPdfFallback(arxivId, outputDir);
-     }
- }
- async function downloadPdfFallback(arxivId, outputDir) {
-     try {
-         const pdfUrl = `https://arxiv.org/pdf/${arxivId}.pdf`;
-         const response = await fetch(pdfUrl);
-         if (!response.ok) {
-             return { success: false, format: "pdf", files: [], error: `PDF download failed: ${response.status}` };
-         }
-         const pdfPath = path.join(outputDir, `${arxivId}.pdf`);
-         const buffer = Buffer.from(await response.arrayBuffer());
-         await fs.promises.writeFile(pdfPath, buffer);
-         return { success: true, format: "pdf", files: [`${arxivId}.pdf`] };
-     }
-     catch (error) {
-         return { success: false, format: "pdf", files: [], error: String(error) };
-     }
- }
- async function findTexFiles(dir) {
-     const files = [];
-     const entries = await fs.promises.readdir(dir, { withFileTypes: true });
-     for (const entry of entries) {
-         const fullPath = path.join(dir, entry.name);
-         if (entry.isDirectory()) {
-             const subFiles = await findTexFiles(fullPath);
-             files.push(...subFiles.map((f) => path.join(entry.name, f)));
-         }
-         else if (entry.name.endsWith(".tex")) {
-             files.push(entry.name);
-         }
-     }
-     return files;
- }
- function buildSearchUrl(query, maxResults, sortBy, dateFrom) {
-     let searchQuery = query;
-     if (dateFrom) {
-         // ArXiv date filter format: submittedDate:[YYYYMMDD0000+TO+*]
-         const dateFormatted = dateFrom.replace(/-/g, "");
-         searchQuery = `${query} AND submittedDate:[${dateFormatted}0000 TO 99991231]`;
-     }
-     const params = new URLSearchParams({
-         search_query: searchQuery,
-         start: "0",
-         max_results: String(maxResults),
-         sortBy,
-         sortOrder: "descending",
-     });
-     return `${ARXIV_API_URL}?${params.toString()}`;
- }
- function parseAtomXml(xml) {
-     const papers = [];
-     const entryRegex = /<entry>([\s\S]*?)<\/entry>/g;
-     let match;
-     while ((match = entryRegex.exec(xml)) !== null) {
-         const entry = match[1];
-         const getTag = (tag) => {
-             const m = entry.match(new RegExp(`<${tag}[^>]*>([\\s\\S]*?)<\\/${tag}>`));
-             return m ? m[1].trim() : "";
-         };
-         const title = getTag("title").replace(/\s+/g, " ");
-         const abstract = getTag("summary").replace(/\s+/g, " ");
-         const published = getTag("published");
-         const updated = getTag("updated");
-         // Extract arXiv ID from <id> tag
-         const idUrl = getTag("id");
-         const arxivId = idUrl.replace("http://arxiv.org/abs/", "").replace(/v\d+$/, "");
-         // Extract authors
-         const authors = [];
-         const authorRegex = /<author>\s*<name>([^<]+)<\/name>/g;
-         let authorMatch;
-         while ((authorMatch = authorRegex.exec(entry)) !== null) {
-             authors.push(authorMatch[1].trim());
-         }
-         // Extract PDF link
-         const pdfMatch = entry.match(/<link[^>]+title="pdf"[^>]+href="([^"]+)"/);
-         const pdfUrl = pdfMatch ? pdfMatch[1] : `https://arxiv.org/pdf/${arxivId}`;
-         // Extract categories
-         const categories = [];
-         const catRegex = /<category[^>]+term="([^"]+)"/g;
-         let catMatch;
-         while ((catMatch = catRegex.exec(entry)) !== null) {
-             categories.push(catMatch[1]);
-         }
-         if (title && arxivId) {
-             papers.push({ title, authors, abstract, arxivId, pdfUrl, published, updated, categories });
-         }
-     }
-     return papers;
- }
- export function createArxivTool() {
-     return {
-         label: "ArXiv",
-         name: "arxiv",
-         description: "Search arXiv.org for academic papers by keyword. Returns titles, authors, abstracts, and IDs. Optionally downloads .tex source files.",
-         parameters: ArxivToolSchema,
-         execute: async (_toolCallId, rawArgs) => {
-             const params = rawArgs;
-             const query = readStringParam(params, "query", { required: true });
-             const maxResults = Math.min(readNumberParam(params, "max_results", { integer: true }) ?? DEFAULT_MAX_RESULTS, MAX_RESULTS_LIMIT);
-             const rawSort = readStringParam(params, "sort_by") ?? "relevance";
-             const sortBy = SORT_MAP[rawSort.toLowerCase()] ?? "relevance";
-             const dateFrom = readStringParam(params, "date_from");
-             const download = params.download === true || params.download === "true";
-             const outputDir = readStringParam(params, "output_dir") ??
-                 path.join(os.homedir(), ".openclaw", "workspace", "papers");
-             const url = buildSearchUrl(query, maxResults, sortBy, dateFrom);
-             let response;
-             try {
-                 response = await fetch(url);
-             }
-             catch (error) {
-                 return {
-                     type: "tool_result",
-                     content: JSON.stringify({
-                         error: "network_error",
-                         message: `Failed to reach arXiv API: ${error instanceof Error ? error.message : String(error)}`,
-                     }),
-                 };
-             }
-             if (!response.ok) {
-                 return {
-                     type: "tool_result",
-                     content: JSON.stringify({
-                         error: "api_error",
-                         message: `arXiv API returned ${response.status}: ${response.statusText}`,
-                     }),
-                 };
-             }
-             const xml = await response.text();
-             const papers = parseAtomXml(xml);
-             // If download requested, download .tex source for each paper
-             let downloads;
-             if (download && papers.length > 0) {
-                 await fs.promises.mkdir(outputDir, { recursive: true });
-                 downloads = [];
-                 for (const paper of papers) {
-                     const result = await downloadTexSource(paper.arxivId, outputDir);
-                     downloads.push({
-                         arxiv_id: paper.arxivId,
-                         format: result.format,
-                         files: result.files,
-                         error: result.error,
-                     });
-                 }
-             }
-             return {
-                 type: "tool_result",
-                 content: JSON.stringify({
-                     query,
-                     total: papers.length,
-                     papers: papers.map((p) => ({
-                         title: p.title,
-                         authors: p.authors,
-                         abstract: p.abstract,
-                         arxiv_id: p.arxivId,
-                         pdf_url: p.pdfUrl,
-                         published: p.published,
-                         categories: p.categories,
-                     })),
-                     ...(downloads && { downloads, output_dir: outputDir }),
-                 }),
-             };
-         },
-     };
- }
- //# sourceMappingURL=arxiv-tool.js.map
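Two details of the deleted `arxiv-tool.js` above are easy to get wrong when reimplementing: the gzip magic-byte check used to distinguish a `tar.gz` archive from a bare `.tex` file, and the `submittedDate` range that `buildSearchUrl` appends for date filtering. A rough Python equivalent, as a sketch rather than the package's own code:

```python
from urllib.parse import urlencode

ARXIV_API_URL = "https://export.arxiv.org/api/query"

def is_gzip(payload):
    # gzip streams start with the magic bytes 0x1f 0x8b,
    # the same check downloadTexSource used on the first two bytes
    return payload[:2] == b"\x1f\x8b"

def build_search_url(query, max_results=10, sort_by="relevance", date_from=None):
    # Mirrors buildSearchUrl: a YYYY-MM-DD filter becomes
    # submittedDate:[YYYYMMDD0000 TO 99991231]
    search_query = query
    if date_from:
        stamp = date_from.replace("-", "")
        search_query = f"{query} AND submittedDate:[{stamp}0000 TO 99991231]"
    params = {
        "search_query": search_query,
        "start": "0",
        "max_results": str(max_results),
        "sortBy": sort_by,
        "sortOrder": "descending",
    }
    return f"{ARXIV_API_URL}?{urlencode(params)}"
```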
@@ -1 +0,0 @@
- {"version":3,"file":"arxiv-tool.js","sourceRoot":"","sources":["../../../src/tools/arxiv-tool.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,IAAI,EAAE,MAAM,mBAAmB,CAAC;AACzC,OAAO,KAAK,EAAE,MAAM,SAAS,CAAC;AAC9B,OAAO,KAAK,IAAI,MAAM,WAAW,CAAC;AAClC,OAAO,KAAK,EAAE,MAAM,SAAS,CAAC;AAC9B,OAAO,KAAK,GAAG,MAAM,KAAK,CAAC;AAE3B,MAAM,aAAa,GAAG,oCAAoC,CAAC;AAC3D,MAAM,mBAAmB,GAAG,EAAE,CAAC;AAC/B,MAAM,iBAAiB,GAAG,EAAE,CAAC;AAE7B,MAAM,CAAC,MAAM,eAAe,GAAG,IAAI,CAAC,MAAM,CAAC;IACzC,KAAK,EAAE,IAAI,CAAC,MAAM,CAAC,EAAE,WAAW,EAAE,8DAA8D,EAAE,CAAC;IACnG,WAAW,EAAE,IAAI,CAAC,QAAQ,CACxB,IAAI,CAAC,MAAM,CAAC;QACV,WAAW,EAAE,0DAA0D;QACvE,OAAO,EAAE,CAAC;QACV,OAAO,EAAE,iBAAiB;KAC3B,CAAC,CACH;IACD,OAAO,EAAE,IAAI,CAAC,QAAQ,CACpB,IAAI,CAAC,MAAM,CAAC;QACV,WAAW,EACT,2EAA2E;KAC9E,CAAC,CACH;IACD,SAAS,EAAE,IAAI,CAAC,QAAQ,CACtB,IAAI,CAAC,MAAM,CAAC;QACV,WAAW,EAAE,uDAAuD;KACrE,CAAC,CACH;IACD,QAAQ,EAAE,IAAI,CAAC,QAAQ,CACrB,IAAI,CAAC,OAAO,CAAC;QACX,WAAW,EACT,6EAA6E;KAChF,CAAC,CACH;IACD,UAAU,EAAE,IAAI,CAAC,QAAQ,CACvB,IAAI,CAAC,MAAM,CAAC;QACV,WAAW,EAAE,0EAA0E;KACxF,CAAC,CACH;CACF,CAAC,CAAC;AAaH,MAAM,QAAQ,GAA2B;IACvC,SAAS,EAAE,WAAW;IACtB,eAAe,EAAE,iBAAiB;IAClC,aAAa,EAAE,eAAe;CAC/B,CAAC;AAEF,SAAS,eAAe,CAAC,MAA+B,EAAE,GAAW,EAAE,IAA6B;IAClG,MAAM,KAAK,GAAG,MAAM,CAAC,GAAG,CAAC,CAAC;IAC1B,IAAI,KAAK,KAAK,SAAS,IAAI,KAAK,KAAK,IAAI,EAAE,CAAC;QAC1C,IAAI,IAAI,EAAE,QAAQ,EAAE,CAAC;YACnB,MAAM,IAAI,KAAK,CAAC,+BAA+B,GAAG,EAAE,CAAC,CAAC;QACxD,CAAC;QACD,OAAO,SAAS,CAAC;IACnB,CAAC;IACD,OAAO,MAAM,CAAC,KAAK,CAAC,CAAC;AACvB,CAAC;AAED,SAAS,eAAe,CAAC,MAA+B,EAAE,GAAW,EAAE,IAA4B;IACjG,MAAM,KAAK,GAAG,MAAM,CAAC,GAAG,CAAC,CAAC;IAC1B,IAAI,KAAK,KAAK,SAAS,IAAI,KAAK,KAAK,IAAI;QAAE,OAAO,SAAS,CAAC;IAC5D,MAAM,GAAG,GAAG,MAAM,CAAC,KAAK,CAAC,CAAC;IAC1B,IAAI,KAAK,CAAC,GAAG,CAAC;QAAE,OAAO,SAAS,CAAC;IACjC,OAAO,IAAI,EAAE,OAAO,CAAC,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC;AAC/C,CAAC;AAED;;GAEG;AACH,KAAK,UAAU,iBAAiB,CAC9B,OAAe,EACf,SAAiB;IAEjB,MAAM,QAAQ,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,OAAO,CAAC,CAAC;IAC/C,MAAM,EAAE,CAAC,QAAQ,CAAC,KAAK,
CAAC,QAAQ,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAC;IAEvD,MAAM,MAAM,GAAG,yBAAyB,OAAO,EAAE,CAAC;IAClD,MAAM,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC,QAAQ,EAAE,eAAe,CAAC,CAAC;IAErD,IAAI,CAAC;QACH,8BAA8B;QAC9B,MAAM,QAAQ,GAAG,MAAM,KAAK,CAAC,MAAM,CAAC,CAAC;QACrC,IAAI,CAAC,QAAQ,CAAC,EAAE,EAAE,CAAC;YACjB,kBAAkB;YAClB,OAAO,MAAM,mBAAmB,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;QACvD,CAAC;QAED,MAAM,MAAM,GAAG,MAAM,CAAC,IAAI,CAAC,MAAM,QAAQ,CAAC,WAAW,EAAE,CAAC,CAAC;QACzD,MAAM,EAAE,CAAC,QAAQ,CAAC,SAAS,CAAC,OAAO,EAAE,MAAM,CAAC,CAAC;QAE7C,wDAAwD;QACxD,MAAM,OAAO,GAAG,MAAM,CAAC,CAAC,CAAC,KAAK,IAAI,IAAI,MAAM,CAAC,CAAC,CAAC,KAAK,IAAI,CAAC;QAEzD,IAAI,OAAO,EAAE,CAAC;YACZ,iBAAiB;YACjB,MAAM,GAAG,CAAC,CAAC,CAAC,EAAE,IAAI,EAAE,OAAO,EAAE,GAAG,EAAE,QAAQ,EAAE,CAAC,CAAC;YAC9C,MAAM,EAAE,CAAC,QAAQ,CAAC,MAAM,CAAC,OAAO,CAAC,CAAC,CAAC,iCAAiC;YAEpE,sBAAsB;YACtB,MAAM,KAAK,GAAG,MAAM,YAAY,CAAC,QAAQ,CAAC,CAAC;YAC3C,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;gBACvB,OAAO,MAAM,mBAAmB,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;YACvD,CAAC;YACD,OAAO,EAAE,OAAO,EAAE,IAAI,EAAE,MAAM,EAAE,KAAK,EAAE,KAAK,EAAE,CAAC;QACjD,CAAC;aAAM,CAAC;YACN,uCAAuC;YACvC,MAAM,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC,QAAQ,EAAE,UAAU,CAAC,CAAC;YAChD,MAAM,EAAE,CAAC,QAAQ,CAAC,MAAM,CAAC,OAAO,EAAE,OAAO,CAAC,CAAC;YAC3C,OAAO,EAAE,OAAO,EAAE,IAAI,EAAE,MAAM,EAAE,KAAK,EAAE,KAAK,EAAE,CAAC,UAAU,CAAC,EAAE,CAAC;QAC/D,CAAC;IACH,CAAC;IAAC,OAAO,KAAK,EAAE,CAAC;QACf,+BAA+B;QAC/B,OAAO,MAAM,mBAAmB,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;IACvD,CAAC;AACH,CAAC;AAED,KAAK,UAAU,mBAAmB,CAChC,OAAe,EACf,SAAiB;IAEjB,IAAI,CAAC;QACH,MAAM,MAAM,GAAG,yBAAyB,OAAO,MAAM,CAAC;QACtD,MAAM,QAAQ,GAAG,MAAM,KAAK,CAAC,MAAM,CAAC,CAAC;QACrC,IAAI,CAAC,QAAQ,CAAC,EAAE,EAAE,CAAC;YACjB,OAAO,EAAE,OAAO,EAAE,KAAK,EAAE,MAAM,EAAE,KAAK,EAAE,KAAK,EAAE,EAAE,EAAE,KAAK,EAAE,wBAAwB,QAAQ,CAAC,MAAM,EAAE,EAAE,CAAC;QACxG,CAAC;QACD,MAAM,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC,SAAS,EAAE,GAAG,OAAO,MAAM,CAAC,CAAC;QACvD,MAAM,MAAM,GAAG,MAAM,CAAC,IAAI,CAAC,MAAM,QAAQ,CAAC,WAAW,EAAE,CAAC,CAAC;QACzD,MAAM,EAAE,CAAC,QAAQ,CAAC,SAAS,CAAC,OAAO,EAAE,MAAM,CAAC,CAAC;QAC7C,OAAO,EAAE,OAAO,EAAE,IAA
I,EAAE,MAAM,EAAE,KAAK,EAAE,KAAK,EAAE,CAAC,GAAG,OAAO,MAAM,CAAC,EAAE,CAAC;IACrE,CAAC;IAAC,OAAO,KAAK,EAAE,CAAC;QACf,OAAO,EAAE,OAAO,EAAE,KAAK,EAAE,MAAM,EAAE,KAAK,EAAE,KAAK,EAAE,EAAE,EAAE,KAAK,EAAE,MAAM,CAAC,KAAK,CAAC,EAAE,CAAC;IAC5E,CAAC;AACH,CAAC;AAED,KAAK,UAAU,YAAY,CAAC,GAAW;IACrC,MAAM,KAAK,GAAa,EAAE,CAAC;IAC3B,MAAM,OAAO,GAAG,MAAM,EAAE,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,EAAE,EAAE,aAAa,EAAE,IAAI,EAAE,CAAC,CAAC;IACxE,KAAK,MAAM,KAAK,IAAI,OAAO,EAAE,CAAC;QAC5B,MAAM,QAAQ,GAAG,IAAI,CAAC,IAAI,CAAC,GAAG,EAAE,KAAK,CAAC,IAAI,CAAC,CAAC;QAC5C,IAAI,KAAK,CAAC,WAAW,EAAE,EAAE,CAAC;YACxB,MAAM,QAAQ,GAAG,MAAM,YAAY,CAAC,QAAQ,CAAC,CAAC;YAC9C,KAAK,CAAC,IAAI,CAAC,GAAG,QAAQ,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,IAAI,CAAC,IAAI,CAAC,KAAK,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC,CAAC,CAAC;QAC/D,CAAC;aAAM,IAAI,KAAK,CAAC,IAAI,CAAC,QAAQ,CAAC,MAAM,CAAC,EAAE,CAAC;YACvC,KAAK,CAAC,IAAI,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;QACzB,CAAC;IACH,CAAC;IACD,OAAO,KAAK,CAAC;AACf,CAAC;AAED,SAAS,cAAc,CAAC,KAAa,EAAE,UAAkB,EAAE,MAAc,EAAE,QAAiB;IAC1F,IAAI,WAAW,GAAG,KAAK,CAAC;IACxB,IAAI,QAAQ,EAAE,CAAC;QACb,8DAA8D;QAC9D,MAAM,aAAa,GAAG,QAAQ,CAAC,OAAO,CAAC,IAAI,EAAE,EAAE,CAAC,CAAC;QACjD,WAAW,GAAG,GAAG,KAAK,uBAAuB,aAAa,mBAAmB,CAAC;IAChF,CAAC;IACD,MAAM,MAAM,GAAG,IAAI,eAAe,CAAC;QACjC,YAAY,EAAE,WAAW;QACzB,KAAK,EAAE,GAAG;QACV,WAAW,EAAE,MAAM,CAAC,UAAU,CAAC;QAC/B,MAAM;QACN,SAAS,EAAE,YAAY;KACxB,CAAC,CAAC;IACH,OAAO,GAAG,aAAa,IAAI,MAAM,CAAC,QAAQ,EAAE,EAAE,CAAC;AACjD,CAAC;AAED,SAAS,YAAY,CAAC,GAAW;IAC/B,MAAM,MAAM,GAAiB,EAAE,CAAC;IAChC,MAAM,UAAU,GAAG,6BAA6B,CAAC;IACjD,IAAI,KAA6B,CAAC;IAElC,OAAO,CAAC,KAAK,GAAG,UAAU,CAAC,IAAI,CAAC,GAAG,CAAC,CAAC,KAAK,IAAI,EAAE,CAAC;QAC/C,MAAM,KAAK,GAAG,KAAK,CAAC,CAAC,CAAC,CAAC;QACvB,MAAM,MAAM,GAAG,CAAC,GAAW,EAAE,EAAE;YAC7B,MAAM,CAAC,GAAG,KAAK,CAAC,KAAK,CAAC,IAAI,MAAM,CAAC,IAAI,GAAG,yBAAyB,GAAG,GAAG,CAAC,CAAC,CAAC;YAC1E,OAAO,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC,EAAE,CAAC;QAC9B,CAAC,CAAC;QAEF,MAAM,KAAK,GAAG,MAAM,CAAC,OAAO,CAAC,CAAC,OAAO,CAAC,MAAM,EAAE,GAAG,CAAC,CAAC;QACnD,MAAM,QAAQ,GAAG,MAAM,CA
AC,SAAS,CAAC,CAAC,OAAO,CAAC,MAAM,EAAE,GAAG,CAAC,CAAC;QACxD,MAAM,SAAS,GAAG,MAAM,CAAC,WAAW,CAAC,CAAC;QACtC,MAAM,OAAO,GAAG,MAAM,CAAC,SAAS,CAAC,CAAC;QAElC,iCAAiC;QACjC,MAAM,KAAK,GAAG,MAAM,CAAC,IAAI,CAAC,CAAC;QAC3B,MAAM,OAAO,GAAG,KAAK,CAAC,OAAO,CAAC,uBAAuB,EAAE,EAAE,CAAC,CAAC,OAAO,CAAC,OAAO,EAAE,EAAE,CAAC,CAAC;QAEhF,kBAAkB;QAClB,MAAM,OAAO,GAAa,EAAE,CAAC;QAC7B,MAAM,WAAW,GAAG,mCAAmC,CAAC;QACxD,IAAI,WAAmC,CAAC;QACxC,OAAO,CAAC,WAAW,GAAG,WAAW,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,KAAK,IAAI,EAAE,CAAC;YACxD,OAAO,CAAC,IAAI,CAAC,WAAW,CAAC,CAAC,CAAC,CAAC,IAAI,EAAE,CAAC,CAAC;QACtC,CAAC;QAED,mBAAmB;QACnB,MAAM,QAAQ,GAAG,KAAK,CAAC,KAAK,CAAC,0CAA0C,CAAC,CAAC;QACzE,MAAM,MAAM,GAAG,QAAQ,CAAC,CAAC,CAAC,QAAQ,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,yBAAyB,OAAO,EAAE,CAAC;QAE3E,qBAAqB;QACrB,MAAM,UAAU,GAAa,EAAE,CAAC;QAChC,MAAM,QAAQ,GAAG,+BAA+B,CAAC;QACjD,IAAI,QAAgC,CAAC;QACrC,OAAO,CAAC,QAAQ,GAAG,QAAQ,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,KAAK,IAAI,EAAE,CAAC;YAClD,UAAU,CAAC,IAAI,CAAC,QAAQ,CAAC,CAAC,CAAC,CAAC,CAAC;QAC/B,CAAC;QAED,IAAI,KAAK,IAAI,OAAO,EAAE,CAAC;YACrB,MAAM,CAAC,IAAI,CAAC,EAAE,KAAK,EAAE,OAAO,EAAE,QAAQ,EAAE,OAAO,EAAE,MAAM,EAAE,SAAS,EAAE,OAAO,EAAE,UAAU,EAAE,CAAC,CAAC;QAC7F,CAAC;IACH,CAAC;IACD,OAAO,MAAM,CAAC;AAChB,CAAC;AAED,MAAM,UAAU,eAAe;IAC7B,OAAO;QACL,KAAK,EAAE,OAAO;QACd,IAAI,EAAE,OAAO;QACb,WAAW,EACT,uIAAuI;QACzI,UAAU,EAAE,eAAe;QAC3B,OAAO,EAAE,KAAK,EAAE,WAAmB,EAAE,OAAgB,EAAE,EAAE;YACvD,MAAM,MAAM,GAAG,OAAkC,CAAC;YAClD,MAAM,KAAK,GAAG,eAAe,CAAC,MAAM,EAAE,OAAO,EAAE,EAAE,QAAQ,EAAE,IAAI,EAAE,CAAE,CAAC;YACpE,MAAM,UAAU,GAAG,IAAI,CAAC,GAAG,CACzB,eAAe,CAAC,MAAM,EAAE,aAAa,EAAE,EAAE,OAAO,EAAE,IAAI,EAAE,CAAC,IAAI,mBAAmB,EAChF,iBAAiB,CAClB,CAAC;YACF,MAAM,OAAO,GAAG,eAAe,CAAC,MAAM,EAAE,SAAS,CAAC,IAAI,WAAW,CAAC;YAClE,MAAM,MAAM,GAAG,QAAQ,CAAC,OAAO,CAAC,WAAW,EAAE,CAAC,IAAI,WAAW,CAAC;YAC9D,MAAM,QAAQ,GAAG,eAAe,CAAC,MAAM,EAAE,WAAW,CAAC,CAAC;YACtD,MAAM,QAAQ,GAAG,MAAM,CAAC,QAAQ,KAAK,IAAI,IAAI,MAAM,CAAC,QAAQ,KAAK,MAAM,CAAC;YACxE,MAAM,SAAS,GAAG,eAAe,CAAC,MAAM,EAAE,YAAY,CAAC;gBACrD,IAAI,CAAC,IAAI,CAAC,EAAE,CAAC,OAAO,EAAE,EAAE,WAAW,EAAE
,WAAW,EAAE,QAAQ,CAAC,CAAC;YAE9D,MAAM,GAAG,GAAG,cAAc,CAAC,KAAK,EAAE,UAAU,EAAE,MAAM,EAAE,QAAQ,CAAC,CAAC;YAEhE,IAAI,QAAkB,CAAC;YACvB,IAAI,CAAC;gBACH,QAAQ,GAAG,MAAM,KAAK,CAAC,GAAG,CAAC,CAAC;YAC9B,CAAC;YAAC,OAAO,KAAK,EAAE,CAAC;gBACf,OAAO;oBACL,IAAI,EAAE,aAAsB;oBAC5B,OAAO,EAAE,IAAI,CAAC,SAAS,CAAC;wBACtB,KAAK,EAAE,eAAe;wBACtB,OAAO,EAAE,8BAA8B,KAAK,YAAY,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,KAAK,CAAC,EAAE;qBAChG,CAAC;iBACH,CAAC;YACJ,CAAC;YAED,IAAI,CAAC,QAAQ,CAAC,EAAE,EAAE,CAAC;gBACjB,OAAO;oBACL,IAAI,EAAE,aAAsB;oBAC5B,OAAO,EAAE,IAAI,CAAC,SAAS,CAAC;wBACtB,KAAK,EAAE,WAAW;wBAClB,OAAO,EAAE,sBAAsB,QAAQ,CAAC,MAAM,KAAK,QAAQ,CAAC,UAAU,EAAE;qBACzE,CAAC;iBACH,CAAC;YACJ,CAAC;YAED,MAAM,GAAG,GAAG,MAAM,QAAQ,CAAC,IAAI,EAAE,CAAC;YAClC,MAAM,MAAM,GAAG,YAAY,CAAC,GAAG,CAAC,CAAC;YAEjC,6DAA6D;YAC7D,IAAI,SAAmG,CAAC;YACxG,IAAI,QAAQ,IAAI,MAAM,CAAC,MAAM,GAAG,CAAC,EAAE,CAAC;gBAClC,MAAM,EAAE,CAAC,QAAQ,CAAC,KAAK,CAAC,SAAS,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAC;gBACxD,SAAS,GAAG,EAAE,CAAC;gBACf,KAAK,MAAM,KAAK,IAAI,MAAM,EAAE,CAAC;oBAC3B,MAAM,MAAM,GAAG,MAAM,iBAAiB,CAAC,KAAK,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;oBACjE,SAAS,CAAC,IAAI,CAAC;wBACb,QAAQ,EAAE,KAAK,CAAC,OAAO;wBACvB,MAAM,EAAE,MAAM,CAAC,MAAM;wBACrB,KAAK,EAAE,MAAM,CAAC,KAAK;wBACnB,KAAK,EAAE,MAAM,CAAC,KAAK;qBACpB,CAAC,CAAC;gBACL,CAAC;YACH,CAAC;YAED,OAAO;gBACL,IAAI,EAAE,aAAsB;gBAC5B,OAAO,EAAE,IAAI,CAAC,SAAS,CAAC;oBACtB,KAAK;oBACL,KAAK,EAAE,MAAM,CAAC,MAAM;oBACpB,MAAM,EAAE,MAAM,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC;wBACzB,KAAK,EAAE,CAAC,CAAC,KAAK;wBACd,OAAO,EAAE,CAAC,CAAC,OAAO;wBAClB,QAAQ,EAAE,CAAC,CAAC,QAAQ;wBACpB,QAAQ,EAAE,CAAC,CAAC,OAAO;wBACnB,OAAO,EAAE,CAAC,CAAC,MAAM;wBACjB,SAAS,EAAE,CAAC,CAAC,SAAS;wBACtB,UAAU,EAAE,CAAC,CAAC,UAAU;qBACzB,CAAC,CAAC;oBACH,GAAG,CAAC,SAAS,IAAI,EAAE,SAAS,EAAE,UAAU,EAAE,SAAS,EAAE,CAAC;iBACvD,CAAC;aACH,CAAC;QACJ,CAAC;KACF,CAAC;AACJ,CAAC"}
@@ -1,135 +0,0 @@
- # Implementation Guide
-
- You are implementing an ML research project based on the plan in `workspace/plan_res.md`. The goal is a self-contained, runnable codebase in `workspace/project/`.
-
- ## Core Principles
-
- ### 1. Self-Contained Project
-
- ALL code must reside within `workspace/project/`. No direct imports from `workspace/repos/`. Reference code should be studied, understood, and rewritten to fit the project's architecture.
-
- When adapting reference code:
-
- - Understand the core logic and algorithm, not just copy the syntax.
- - Rewrite to fit consistent naming conventions and coding style.
- - Document the origin: add a comment like `# Adapted from repos/xyz/model/attention.py`.
- - Include all necessary utility functions — do not rely on external helpers.
-
- ### 2. Follow the Plan Exactly
-
- Implement every component listed in `workspace/plan_res.md`:
-
- - Every atomic definition from the Model Plan becomes a class or module.
- - The dataset pipeline matches the Dataset Plan.
- - The loss function matches the Training Plan formula.
- - The evaluation matches the Testing Plan metrics.
-
- Do not skip components. Do not substitute simpler alternatives. If a component seems wrong, flag it rather than silently changing it.
-
- ### 3. Real Data, Not Toy Data
-
- Use the actual datasets specified in the plan. If the dataset requires downloading, write the download logic. Never substitute with random data or tiny synthetic datasets for the implementation (the quick validation uses real data with 2 epochs, not fake data).
-
- ## Project Structure
-
- ```
- workspace/project/
-   model/
-     __init__.py
-     [component files matching Model Plan]
-   data/
-     __init__.py
-     dataset.py        # Dataset class
-     loader.py         # DataLoader configuration
-     preprocess.py     # Preprocessing logic
-   training/
-     __init__.py
-     trainer.py        # Training loop
-     loss.py           # Loss functions
-   testing/
-     __init__.py
-     evaluator.py      # Evaluation logic
-     metrics.py        # Metric implementations
-   utils/
-     __init__.py
-     [shared utilities]
-   run.py              # Main entry point
-   requirements.txt    # All dependencies with versions
-   README.md           # Brief description of the project
- ```
-
- ## Implementation Sequence
-
- Follow this order to catch issues early:
-
- 1. **requirements.txt**: List all dependencies. Pin major versions.
- 2. **Data pipeline**: Implement dataset loading first. Verify with a small print test.
- 3. **Model architecture**: Implement each component. Verify shapes with dummy input.
- 4. **Loss function**: Implement and verify with dummy predictions.
- 5. **Training loop**: Wire everything together. Include logging.
- 6. **Evaluation**: Implement metrics and test evaluation pipeline.
- 7. **run.py**: Main entry point with argument parsing.
-
- After each step, run a quick sanity check before moving on.
-
- ## Quick Validation Run
-
- The first run uses 2 epochs only:
-
- ```bash
- cd workspace/project
- pip install -r requirements.txt
- python run.py --epochs 2
- ```
-
- Expected outcomes:
- - No import errors or missing dependencies.
- - Loss decreases (even slightly) over 2 epochs.
- - No NaN or Inf in loss or gradients.
- - Evaluation metrics produce reasonable (not necessarily good) numbers.
- - Memory usage stays within limits.
-
- If the run fails, debug and fix before reporting. Common issues:
- - Shape mismatches: print tensor shapes at each step.
- - OOM: reduce batch size or model size for validation.
- - Data loading errors: verify file paths and formats.
-
- ## Debugging Tips
-
- - Add `print(f"tensor.shape = {tensor.shape}")` at critical points during initial debugging.
- - Use `torch.autograd.set_detect_anomaly(True)` to catch gradient issues.
- - If training is unstable, check learning rate and gradient norms.
- - Remove debugging prints before the final version.
-
- ## Implementation Report
-
- After the quick validation succeeds, write `workspace/ml_res.md`:
-
- ```markdown
- # Implementation Report
-
- ## Components Implemented
- - [List each module with brief description]
-
- ## Quick Validation Results
- - Epochs: 2
- - Final training loss: [value]
- - Validation metrics: [values]
- - Runtime: [time]
- - GPU memory: [peak usage]
-
- ## Deviations from Plan
- - [Any changes made and why]
-
- ## Known Issues
- - [Any issues encountered]
- ```
-
- ## Rules
-
- 1. Never import from `workspace/repos/` — adapt and rewrite instead.
- 2. Never use toy/synthetic data — use real datasets from the plan.
- 3. Never skip plan components — implement everything or flag the issue.
- 4. Always validate with 2 epochs before declaring success.
- 5. Always write `requirements.txt` with pinned versions.
- 6. If you cannot resolve an issue after 3 attempts, document the problem and ask the user.
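The "Expected outcomes" list in the removed implementation guide amounts to a mechanical check on the 2-epoch loss history: finite everywhere, and lower at the end than at the start. A minimal sketch of that check (a hypothetical helper, not part of the package):

```python
import math

def sanity_check(loss_history):
    """Validate a quick 2-epoch run the way the removed guide describes:
    no NaN/Inf anywhere, and the final loss below the initial loss."""
    problems = []
    if any(math.isnan(x) or math.isinf(x) for x in loss_history):
        problems.append("NaN/Inf in loss")
    if len(loss_history) >= 2 and loss_history[-1] >= loss_history[0]:
        problems.append("loss did not decrease over the run")
    return problems  # empty list means the validation run passed
```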