scientify 1.2.0 → 1.2.2

@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: idea-generation
3
- description: "Generate innovative research ideas from a topic. SEARCHES arXiv/GitHub automatically, downloads papers, analyzes literature, and outputs 5 novel research ideas with arXiv citations. Use for: 找研究方向, 生成创新点, find research gaps, propose new methods. NOT for summarizing existing papers (use literature-review instead)."
3
+ description: "Generate 5 innovative research ideas from collected papers. Analyzes literature, identifies gaps, proposes novel methods with citations. Use for: 找研究方向, 生成创新点, find research gaps. Requires papers in workspace (run /literature-survey first if needed)."
4
4
  metadata:
5
5
  {
6
6
  "openclaw":
@@ -13,719 +13,193 @@ metadata:
13
13
 
14
14
  # Idea Generation
15
15
 
16
- End-to-end workflow for generating innovative research ideas from a research topic. This skill implements a full research idea generation pipeline:
16
+ Generate innovative research ideas grounded in literature analysis. This skill reads existing papers, identifies research gaps, and produces 5 distinct ideas with citations.
17
17
 
18
- 1. Search papers and code repositories
19
- 2. Select and download references
20
- 3. Analyze literature and codebases
21
- 4. Generate multiple ideas
22
- 5. Select and enhance the best idea
23
- 6. Map to code implementations
18
+ **Core principle:** Ideas MUST be grounded in actual papers, not generated from model knowledge.
24
19
 
25
20
  ---
26
21
 
27
- ## ⚠️ CRITICAL: EXECUTION MODE
22
+ ## Step 1: Check Workspace Resources
28
23
 
29
- **AUTONOMOUS EXECUTION**: Execute ALL steps without asking for user confirmation at each step.
30
- - Do NOT ask "要我继续吗?" or "Should I proceed?"
31
- - You MAY spawn subagents for parallel tasks (e.g., downloading multiple papers)
32
- - Only ask user when there's a genuine ambiguity (e.g., which focus area to choose)
33
- - Checkpoints are for YOUR internal verification, not for asking user
34
-
35
- **Run the entire workflow from Step 1 to Step 8 automatically.**
36
-
37
- ---
38
-
39
- ## ⚠️ CRITICAL: MANDATORY TOOL USAGE
40
-
41
- **DO NOT generate ideas from your own knowledge.** All ideas MUST be grounded in actual literature research.
42
-
43
- ### Blocking Requirements
44
-
45
- 1. **MUST call `arxiv` tool** to search papers - NO EXCEPTIONS
46
- 2. **MUST call `github_search` tool** to find repositories - NO EXCEPTIONS
47
- 3. **MUST write `search_results.md`** BEFORE proceeding to idea generation
48
- 4. **MUST reference specific papers** (with arXiv IDs) in generated ideas
49
- 5. **MUST clone actual repos** before code survey
50
-
51
- ### Anti-Pattern: DO NOT DO THIS
52
-
53
- ❌ User asks about "time series forecasting" → Agent immediately lists methods from memory
54
- ❌ Agent generates ideas without calling any search tools
55
- ❌ Agent skips to idea generation without `search_results.md` existing
56
-
57
- ### Correct Pattern: DO THIS
58
-
59
- ✅ User asks about "time series forecasting" → Agent calls `arxiv` tool with query
60
- ✅ Agent calls `github_search` tool to find implementations
61
- ✅ Agent writes search results to file
62
- ✅ Agent reads downloaded papers before generating ideas
63
- ✅ Ideas reference specific papers by arXiv ID
64
-
65
- ---
66
-
67
- ## Workspace Convention (Project-based)
68
-
69
- **IMPORTANT**: Each research topic uses its own project directory. Agent auto-selects or creates projects.
70
-
71
- ```
72
- ~/.openclaw/workspace/
73
- └── projects/
74
- ├── .active # Current project ID (plain text file)
75
- ├── nlp-summarization/ # Project A
76
- │ ├── project.json # Project metadata
77
- │ ├── task.json # Research task definition
78
- │ ├── search_results.md # Search results
79
- │ ├── prepare_res.md # Selected repos summary
80
- │ ├── papers/ # Downloaded papers
81
- │ ├── repos/ # Cloned repositories
82
- │ └── ideas/ # Generated ideas
83
- ├── image-segmentation/ # Project B
84
- │ └── ...
85
- └── ...
86
- ```
87
-
88
- **All paths are project-relative**: `~/.openclaw/workspace/projects/{project_id}/`
89
-
90
- **File existence = step completion.** Skip steps whose output already exists.
91
-
92
- ---
93
-
94
- ## Step 0: Auto Project Management (REQUIRED)
95
-
96
- **Agent autonomously manages projects. DO NOT ask user for confirmation.**
97
-
98
- ### 0.1 Extract Topic from User Query
99
-
100
- Analyze the user's message to identify the research topic. Examples:
101
- - "帮我调研文本摘要方法" → topic: `text-summarization`
102
- - "推荐系统的深度学习方法" → topic: `rec-deep-learning`
103
- - "transformer attention optimization" → topic: `transformer-attention`
104
-
105
- Convert to kebab-case ID: lowercase, spaces/special chars → hyphens.
106
-
107
- ### 0.2 Check Existing Projects
24
+ First, check what resources already exist:
108
25
 
109
26
  ```bash
110
- ls ~/.openclaw/workspace/projects/ 2>/dev/null | grep -v "^\.active$"
111
- ```
27
+ # Check active project
28
+ cat ~/.openclaw/workspace/projects/.active 2>/dev/null
112
29
 
113
- Read each `project.json` to check if topic matches:
114
- ```bash
115
- cat ~/.openclaw/workspace/projects/*/project.json 2>/dev/null
116
- ```
30
+ # Check papers
31
+ ls ~/.openclaw/workspace/projects/*/papers/ 2>/dev/null | head -20
117
32
 
118
- ### 0.3 Select or Create Project
119
-
120
- **If matching project exists**: Use it, update `.active`
121
- ```bash
122
- echo "{project_id}" > ~/.openclaw/workspace/projects/.active
33
+ # Check survey results
34
+ cat ~/.openclaw/workspace/projects/*/survey/clusters.json 2>/dev/null | head -5
123
35
  ```
124
36
 
125
- **If no match**: Create new project
126
- ```bash
127
- PROJECT_ID="{topic-as-kebab-case}"
128
- mkdir -p ~/.openclaw/workspace/projects/$PROJECT_ID/{papers,repos,ideas}
129
- echo "$PROJECT_ID" > ~/.openclaw/workspace/projects/.active
130
-
131
- # Create project.json
132
- cat > ~/.openclaw/workspace/projects/$PROJECT_ID/project.json << 'EOF'
133
- {
134
- "id": "{project_id}",
135
- "name": "{Human readable name}",
136
- "created": "{ISO date}",
137
- "topics": ["{keyword1}", "{keyword2}"]
138
- }
139
- EOF
140
- ```
141
-
142
- ### 0.4 Set Working Paths
143
-
144
- After project selection, ALL subsequent paths use:
145
- ```
146
- WORKSPACE=~/.openclaw/workspace/projects/{project_id}
147
- $WORKSPACE/task.json
148
- $WORKSPACE/search_results.md
149
- $WORKSPACE/papers/
150
- $WORKSPACE/repos/
151
- $WORKSPACE/ideas/
152
- $WORKSPACE/prepare_res.md
153
- ```
37
+ ### Assess Available Resources
154
38
 
155
- **Log project selection** (inform user briefly):
156
- > 📁 Using project: `{project_id}` ({new/existing})
39
+ | Resource | Location | Status |
40
+ |----------|----------|--------|
41
+ | Papers | `$WORKSPACE/papers/` | Count: ? |
42
+ | Survey clusters | `$WORKSPACE/survey/clusters.json` | Exists: Y/N |
43
+ | Repos | `$WORKSPACE/repos/` | Count: ? |
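
The resource table in the new Step 1 can be filled in programmatically. The following is a minimal sketch, not part of the package: the function name `assess_workspace` is hypothetical, and it assumes one sub-directory per paper under `papers/` (as in `papers/{arxiv_id}/`) and one per cloned repository under `repos/`. The threshold of 5 is taken from the Step 4 prerequisite.

```python
from pathlib import Path


def assess_workspace(workspace: Path) -> dict:
    """Summarize the resources listed in the Step 1 table for one project directory."""
    papers_dir = workspace / "papers"
    repos_dir = workspace / "repos"
    clusters = workspace / "survey" / "clusters.json"

    def count_subdirs(path: Path) -> int:
        # Assumption: one sub-directory per paper (papers/{arxiv_id}/) or cloned repo.
        return sum(1 for p in path.iterdir() if p.is_dir()) if path.is_dir() else 0

    n_papers = count_subdirs(papers_dir)
    return {
        "papers": n_papers,
        "survey_clusters": clusters.is_file(),
        "repos": count_subdirs(repos_dir),
        # Step 4 lists "at least 5 papers" as its prerequisite.
        "enough_papers": n_papers >= 5,
    }
```

Calling it on a project directory that does not exist yet returns zero counts rather than raising, which matches the `2>/dev/null` behavior of the shell checks.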
157
44
 
158
45
  ---
159
46
 
160
- ## Step 1: Parse Task
47
+ ## Step 2: Ask User About Search Strategy
161
48
 
162
- Check `$WORKSPACE/task.json`. If missing, extract from user query:
49
+ Based on workspace state, ask user:
163
50
 
164
- - **domain**: Research domain (e.g., "graph neural networks", "recommendation")
165
- - **focus** (optional): Specific problem or technique
166
- - **date_limit** (optional): Only consider papers before this date
51
+ **If papers exist (≥5):**
52
+ > 📚 Found {N} papers in workspace from previous survey.
53
+ >
54
+ > Options:
55
+ > 1. **Use existing papers** - Generate ideas from current collection
56
+ > 2. **Search more** - Run `/literature-survey` to expand collection
57
+ > 3. **Quick search** - Add 5-10 more papers on specific topic
167
58
 
168
- ```bash
169
- cat $WORKSPACE/task.json 2>/dev/null || echo "No task.json"
170
- ```
171
-
172
- Create task.json:
173
- ```json
174
- {
175
- "domain": "graph neural networks",
176
- "focus": "scalable transformers for node classification",
177
- "date_limit": "2024-01-01",
178
- "created": "2024-XX-XX"
179
- }
180
- ```
181
-
182
- **Output:** `$WORKSPACE/task.json`
59
+ **If no papers:**
60
+ > 📭 No papers found in workspace.
61
+ >
62
+ > To generate grounded ideas, I need literature. Options:
63
+ > 1. **Run /literature-survey** - Comprehensive search (100+ papers, recommended)
64
+ > 2. **Quick search** - Fetch 10-15 papers on your topic now
65
+ > 3. **You provide papers** - Point me to existing PDFs/tex files
183
66
 
184
67
  ---
185
68
 
186
- ## Step 2: Search Papers and Code (MANDATORY)
187
-
188
- **⚠️ BLOCKING: You MUST complete this step before ANY idea generation.**
189
-
190
- ### 2.1 ArXiv Search (REQUIRED)
69
+ ## Step 3: Acquire Resources (if needed)
191
70
 
192
- **You MUST call the `arxiv` tool.** Example:
71
+ ### Option A: Delegate to /literature-survey (Recommended)
193
72
 
73
+ If user wants comprehensive search:
194
74
  ```
195
- Tool: arxiv
196
- Arguments:
197
- query: "text summarization transformer model"
198
- max_results: 10
199
- sort_by: "relevance"
200
- ```
201
-
202
- If `arxiv` tool is not available, use `WebSearch` with `site:arxiv.org`:
203
- ```
204
- Tool: WebSearch
205
- Arguments:
206
- query: "site:arxiv.org text summarization transformer model"
207
- ```
208
-
209
- ### 2.2 GitHub Search (REQUIRED)
210
-
211
- **You MUST call the `github_search` tool or search GitHub.** Example:
212
-
213
- ```
214
- Tool: github_search
215
- Arguments:
216
- query: "text summarization pytorch huggingface"
217
- sort: "stars"
218
- max_results: 20
219
- ```
220
-
221
- If `github_search` tool is not available, use `WebSearch`:
222
- ```
223
- Tool: WebSearch
224
- Arguments:
225
- query: "site:github.com text summarization pytorch stars:>100"
226
- ```
227
-
228
- ### 2.3 CHECKPOINT: Verify Search Completed
75
+ Please run: /literature-survey {topic}
229
76
 
230
- Before proceeding, confirm:
231
- - [ ] Called arxiv/WebSearch for papers
232
- - [ ] Called github_search/WebSearch for repositories
233
- - [ ] Have at least 5 paper results
234
- - [ ] Have at least 5 repository results
77
+ This will:
78
+ - Search 100+ papers systematically
79
+ - Filter by relevance (score ≥4)
80
+ - Cluster into research directions
81
+ - Save to $WORKSPACE/papers/
235
82
 
236
- **If search returns 0 results, try different queries. DO NOT proceed without results.**
237
-
238
- ### 2.4 Compile Results
239
-
240
- Write to `$WORKSPACE/search_results.md`:
241
-
242
- ```markdown
243
- # Search Results
244
-
245
- ## Task
246
- - Domain: {domain}
247
- - Focus: {focus}
248
- - Date: {date}
249
-
250
- ## ArXiv Papers Found
251
-
252
- | # | Title | ArXiv ID | Year | Relevance |
253
- |---|-------|----------|------|-----------|
254
- | 1 | [Title](pdf_url) | 2401.xxxxx | 2024 | [Why relevant] |
255
- | 2 | ... | ... | ... | ... |
256
-
257
- ## GitHub Repositories Found
258
-
259
- | # | Repository | Stars | Language | Relevance |
260
- |---|------------|-------|----------|-----------|
261
- | 1 | [owner/repo](url) | 1.2k | Python | [Why relevant] |
262
- | 2 | ... | ... | ... | ... |
83
+ After survey completes, run /idea-generation again.
263
84
  ```
264
85
 
265
- **Output:** `$WORKSPACE/search_results.md`
266
-
267
- ---
86
+ ### Option B: Quick Search (5-10 papers)
268
87
 
269
- ## Step 3: Prepare - Select Repositories
88
+ For fast iteration, do minimal search:
270
89
 
271
- Read search results and select **at least 5** most valuable repositories.
272
-
273
- Selection criteria:
274
- - Direct implementation of relevant papers
275
- - High code quality (stars, documentation)
276
- - Active maintenance
277
- - Covers key techniques in the domain
278
-
279
- ### 3.1 Clone Selected Repos
280
-
281
- ```bash
282
- mkdir -p $WORKSPACE/repos
283
- cd $WORKSPACE/repos
284
-
285
- # For each selected repo:
286
- git clone --depth 1 https://github.com/owner/repo1.git
287
- git clone --depth 1 https://github.com/owner/repo2.git
288
- # ... at least 5 repos
90
+ 1. **ArXiv search:**
289
91
  ```
290
-
291
- ### 3.2 Document Selection
292
-
293
- Write to `$WORKSPACE/prepare_res.md`:
294
-
295
- ```markdown
296
- # Selected Reference Codebases
297
-
298
- ## Selection Rationale
299
- [Why these repos were chosen]
300
-
301
- ## Repositories
302
-
303
- ### 1. repo1
304
- - **URL**: https://github.com/owner/repo1
305
- - **Paper**: [Associated paper if any]
306
- - **Key Components**:
307
- - `model/` - Model architecture
308
- - `train.py` - Training loop
309
- - **Usage**: [How this will help implement our idea]
310
-
311
- ### 2. repo2
312
- ...
313
-
314
- ## Reference Papers
315
- Based on these repos, the key papers to read are:
316
- 1. [Paper Title 1] - ArXiv: 2401.xxxxx
317
- 2. [Paper Title 2] - ArXiv: 2401.xxxxx
318
- ...
319
- ```
320
-
321
- **Output:** `$WORKSPACE/prepare_res.md` + `$WORKSPACE/repos/`
322
-
323
- ---
324
-
325
- ## Step 4: Download Papers
326
-
327
- For each paper referenced in prepare_res.md, download the source.
328
-
329
- **IMPORTANT: Download .tex source, NOT PDF.** .tex files are much easier for AI to read and extract information from.
330
-
331
- ### 4.1 Download .tex Source (RECOMMENDED - Use arxiv tool)
332
-
333
- Use the `arxiv` tool with `download: true` to automatically download and extract .tex sources:
334
-
335
- ```
336
- Tool: arxiv
92
+ Tool: arxiv_search
337
93
  Arguments:
338
- query: "abstractive summarization long document"
94
+ query: "{user_topic}"
339
95
  max_results: 10
340
- download: true
341
- output_dir: "$WORKSPACE/papers"
342
96
  ```
343
97
 
344
- The tool will:
345
- 1. Search for papers matching your query
346
- 2. Download .tex source from `https://arxiv.org/src/{arxiv_id}`
347
- 3. Extract tar.gz archives automatically
348
- 4. Fall back to PDF if .tex is unavailable
349
- 5. Return a `downloads` array showing what was downloaded
350
-
351
- **Output format:**
352
- ```json
353
- {
354
- "papers": [...],
355
- "downloads": [
356
- {"arxiv_id": "2404.04429", "format": "tex", "files": ["main.tex", "methods.tex"]},
357
- {"arxiv_id": "2308.03664", "format": "pdf", "files": ["2308.03664.pdf"], "error": "tex unavailable"}
358
- ],
359
- "output_dir": "$WORKSPACE/papers"
360
- }
98
+ 2. **Clone 3-5 reference repos:**
99
+ ```bash
100
+ mkdir -p $WORKSPACE/repos
101
+ git clone --depth 1 {repo_url} $WORKSPACE/repos/{name}
361
102
  ```
362
103
 
363
- ### 4.2 Manual Download (Fallback)
364
-
365
- If the arxiv tool is unavailable, use bash:
104
+ 3. **Download paper sources:**
366
105
  ```bash
367
106
  mkdir -p $WORKSPACE/papers/{arxiv_id}
368
- cd $WORKSPACE/papers/{arxiv_id}
369
- curl -L "https://arxiv.org/src/{arxiv_id}" -o source.tar.gz
370
- tar -xzf source.tar.gz 2>/dev/null || mv source.tar.gz main.tex
107
+ curl -L "https://arxiv.org/src/{arxiv_id}" | tar -xz -C $WORKSPACE/papers/{arxiv_id}
371
108
  ```
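
The removed manual-download lines fell back to `mv source.tar.gz main.tex` when `tar` failed, which suggests the source endpoint does not always return a tarball. The new one-liner pipes straight into `tar` and has no such fallback. Below is a hedged sketch of the same fallback in Python. The function name `extract_arxiv_source` and the choice of `main.tex` for the single-file case are assumptions carried over from the removed lines, not something 1.2.2 ships.

```python
import gzip
import io
import tarfile
from pathlib import Path


def extract_arxiv_source(data: bytes, dest: Path) -> list[str]:
    """Unpack bytes fetched from https://arxiv.org/src/{arxiv_id} into dest."""
    dest.mkdir(parents=True, exist_ok=True)
    root = dest.resolve()
    written: list[str] = []
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
            for member in tar.getmembers():
                if not member.isfile():
                    continue  # skip directories, links, devices
                target = (root / member.name).resolve()
                if root not in target.parents:
                    continue  # skip entries that would escape dest
                target.parent.mkdir(parents=True, exist_ok=True)
                target.write_bytes(tar.extractfile(member).read())
                written.append(member.name)
    except tarfile.ReadError:
        # Not a tarball: assume a single TeX file, possibly gzip-compressed.
        try:
            payload = gzip.decompress(data)
        except OSError:
            payload = data
        (root / "main.tex").write_bytes(payload)
        written.append("main.tex")
    return written
```

Members are written one at a time instead of through `extractall`, so entries with paths pointing outside `dest` are skipped rather than trusted.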
372
109
 
373
- ### 4.3 Document Downloads
110
+ ---
374
111
 
375
- Write to `$WORKSPACE/papers/download_log.md`:
112
+ ## Step 4: Analyze Literature
376
113
 
377
- ```markdown
378
- # Downloaded Papers
114
+ **Prerequisites:** At least 5 papers in `$WORKSPACE/papers/`
379
115
 
380
- | ArXiv ID | Title | Format | Status |
381
- |----------|-------|--------|--------|
382
- | 2404.04429 | Physics-Informed ML for Battery... | .tex | ✓ |
383
- | 2308.03664 | Two-stage Early Prediction... | .tex | ✓ |
384
- | 2401.99999 | Some Other Paper | .pdf | ✓ (tex unavailable) |
385
- ```
116
+ ### 4.1 Read Papers
386
117
 
387
- **Output:** `$WORKSPACE/papers/`
118
+ For each paper, extract:
119
+ - Core contribution (1 sentence)
120
+ - Key method/formula
121
+ - Limitations mentioned
122
+ - Future work suggestions
388
123
 
389
- ---
124
+ **Long papers (>50KB):** See `references/reading-long-papers.md`
390
125
 
391
- ## Step 5: Generate Ideas (5 Ideas)
392
-
393
- **⚠️ BLOCKING: DO NOT start this step unless Steps 2-4 are complete.**
394
-
395
- ### Pre-requisite Checkpoint
396
-
397
- Before generating ANY ideas, verify these files exist:
398
- - [ ] `$WORKSPACE/search_results.md` - search results from Step 2
399
- - [ ] `$WORKSPACE/prepare_res.md` - selected repos from Step 3
400
- - [ ] At least 3 papers downloaded in `$WORKSPACE/papers/`
401
-
402
- **If any file is missing, GO BACK and complete the previous steps.**
403
-
404
- This is the core intellectual step. Generate **exactly 5 distinct innovative ideas**.
405
-
406
- **IMPORTANT: Ideas must be grounded in the literature you just read. Each idea MUST:**
407
- - Reference at least 2 specific papers by arXiv ID
408
- - Identify specific limitations from those papers
409
- - Propose improvements based on gaps found in the literature
410
-
411
- ### 5.1 Analyze Literature First (REQUIRED)
412
-
413
- For each paper in `papers/`:
414
- 1. Read thoroughly (especially: abstract, method, experiments, limitations)
415
- 2. Extract: core contribution, math formulas, limitations, future work
416
- 3. Note connections to other papers
417
-
418
- **⚠️ Handling Long Papers (>50KB or >15k tokens):**
419
-
420
- If a .tex file is too long to read in one pass:
421
-
422
- 1. **First pass - Structure scan:**
423
- ```bash
424
- # List all .tex files and their sizes
425
- ls -la $WORKSPACE/papers/{arxiv_id}/
426
- # Check line count
427
- wc -l $WORKSPACE/papers/{arxiv_id}/*.tex
428
- ```
429
-
430
- 2. **Chunked reading strategy:**
431
- - Read `abstract` section first (usually in main.tex, first 200 lines)
432
- - Read `\section{Introduction}` or `\section{Method}` separately
433
- - Read `\section{Experiments}` or `\section{Results}` separately
434
- - Read `\section{Conclusion}` and `\section{Related Work}` last
435
-
436
- Use the Read tool with `offset` and `limit` parameters:
437
- ```
438
- Tool: Read
439
- Arguments:
440
- file_path: "$WORKSPACE/papers/2404.04429/main.tex"
441
- offset: 1
442
- limit: 500 # First 500 lines (abstract + intro)
443
- ```
444
-
445
- Then continue:
446
- ```
447
- Tool: Read
448
- Arguments:
449
- file_path: "$WORKSPACE/papers/2404.04429/main.tex"
450
- offset: 500
451
- limit: 500 # Lines 500-1000 (method section)
452
- ```
453
-
454
- 3. **Priority sections for idea generation:**
455
- | Priority | Section | Why |
456
- |----------|---------|-----|
457
- | 1 | Abstract | Core contribution |
458
- | 2 | Method/Approach | Technical details, formulas |
459
- | 3 | Experiments | What works, what doesn't |
460
- | 4 | Conclusion/Future Work | Limitations, open problems |
461
- | 5 | Related Work | Connections to other papers |
462
-
463
- 4. **Skip if context-limited:**
464
- - Appendix (proofs, supplementary)
465
- - Acknowledgments
466
- - Detailed hyperparameter tables
467
-
468
- For each repo in `repos/`:
469
- 1. Understand structure: `gen_code_tree_structure` equivalent
470
- 2. Identify key implementations
471
- 3. Note reusable components
472
-
473
- ### 5.2 Identify Research Gaps
126
+ ### 4.2 Identify Research Gaps
474
127
 
475
128
  Look for:
476
129
  - Common limitations across papers
477
- - Unexplored combinations of techniques
130
+ - Unexplored technique combinations
478
131
  - Scalability issues
479
132
  - Assumptions that could be relaxed
480
133
 
481
- ### 5.3 Generate Idea 1
134
+ Document gaps in `$WORKSPACE/ideas/gaps.md`:
135
+ ```markdown
136
+ # Research Gaps Identified
482
137
 
483
- Create `$WORKSPACE/ideas/idea_1.md` using the template in `references/idea-template.md`.
138
+ ## Gap 1: [Description]
139
+ - Mentioned in: [paper1], [paper2]
140
+ - Why important: ...
484
141
 
485
- **MUST include (with actual citations from your research):**
486
- - One-line summary
487
- - Challenges addressed
488
- - **Existing methods & limitations (cite specific papers by arXiv ID)**
489
- - Example: "Method A [arXiv:2301.12345] achieves X but fails at Y"
490
- - Example: "Method B [arXiv:2302.67890] proposes Z but has limitation W"
491
- - Motivation (why this gap matters)
492
- - Proposed method (with math formulas)
493
- - **How this improves on cited papers**
494
- - Expected advantages
495
- - Evaluation plan (datasets, baselines from the papers you read)
496
- - Novelty/Feasibility/Impact scores
142
+ ## Gap 2: [Description]
143
+ ...
144
+ ```
145
+
146
+ ---
497
147
 
498
- **❌ REJECTED if:** No arXiv IDs cited, or ideas not connected to searched literature
148
+ ## Step 5: Generate 5 Ideas
499
149
 
500
- ### 5.4 Generate Ideas 2-5
150
+ Create `$WORKSPACE/ideas/idea_1.md` through `idea_5.md` using template in `references/idea-template.md`.
501
151
 
502
- For each subsequent idea, explicitly try a **different strategy**:
152
+ **Requirements:**
153
+ - Each idea cites ≥2 papers by arXiv ID
154
+ - Use different strategies:
503
155
 
504
156
  | Idea | Strategy |
505
157
  |------|----------|
506
- | 1 | Combination - merge techniques from 2+ papers |
507
- | 2 | Simplification - simplify complex method |
508
- | 3 | Generalization - extend to new domain/task |
509
- | 4 | Constraint relaxation - remove limiting assumption |
510
- | 5 | Architecture innovation - novel model design |
158
+ | 1 | Combination - merge 2+ techniques |
159
+ | 2 | Simplification - reduce complexity |
160
+ | 3 | Generalization - extend to new domain |
161
+ | 4 | Constraint relaxation - remove assumption |
162
+ | 5 | Architecture innovation - new design |
511
163
 
512
- Create `idea_2.md`, `idea_3.md`, `idea_4.md`, `idea_5.md`.
513
-
514
- **Output:** `$WORKSPACE/ideas/idea_1.md` through `idea_5.md`
164
+ **❌ REJECTED if:** No arXiv IDs cited, or ideas not grounded in literature
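
The "≥2 papers by arXiv ID" requirement can be checked mechanically before an idea file is accepted. This is an illustrative sketch only: the helper names are hypothetical, and the pattern covers new-style IDs such as `2301.12345` (optionally with a version suffix), so old-style IDs are not counted and a decimal number of the same shape would be a false positive.

```python
import re

# New-style arXiv ID: four digits, a dot, four or five digits, optional version.
ARXIV_ID = re.compile(r"\b(\d{4}\.\d{4,5})(?:v\d+)?\b")


def cited_arxiv_ids(idea_text: str) -> set[str]:
    """Return the distinct arXiv IDs mentioned in an idea file, versions stripped."""
    return set(ARXIV_ID.findall(idea_text))


def passes_citation_rule(idea_text: str, minimum: int = 2) -> bool:
    """True when the idea cites at least `minimum` distinct arXiv IDs."""
    return len(cited_arxiv_ids(idea_text)) >= minimum
```

Counting distinct IDs means that citing the same paper twice, or two versions of it, does not satisfy the rule.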
515
165
 
516
166
  ---
517
167
 
518
168
  ## Step 6: Select and Enhance Best Idea
519
169
 
520
- ### 6.1 Evaluate All Ideas
170
+ ### 6.1 Score All Ideas
521
171
 
522
- Create evaluation matrix:
523
-
524
- ```markdown
525
- # Idea Evaluation
526
-
527
- | Idea | Title | Novelty | Feasibility | Impact | Total |
528
- |------|-------|---------|-------------|--------|-------|
529
- | 1 | ... | 4 | 3 | 4 | 11 |
530
- | 2 | ... | 5 | 4 | 5 | 14 |
531
- | 3 | ... | 3 | 5 | 3 | 11 |
532
- | 4 | ... | 4 | 4 | 4 | 12 |
533
- | 5 | ... | 3 | 3 | 4 | 10 |
534
-
535
- **Selected: Idea 2**
536
-
537
- ## Selection Rationale
538
- [Why this idea is most promising - technical innovation, feasibility, impact]
539
- ```
172
+ | Idea | Novelty | Feasibility | Impact | Total |
173
+ |------|---------|-------------|--------|-------|
174
+ | 1 | /5 | /5 | /5 | /15 |
175
+ | ... | | | | |
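
The scoring table reduces to a sum of three scores out of 5 and a maximum over the totals. A minimal sketch follows, with hypothetical names (`IdeaScore`, `select_best`) and an assumed tie-break rule, since neither version of the skill says how ties are resolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdeaScore:
    idea: int
    novelty: int      # out of 5
    feasibility: int  # out of 5
    impact: int       # out of 5

    @property
    def total(self) -> int:
        # Total column: sum of the three scores, out of 15.
        return self.novelty + self.feasibility + self.impact


def select_best(scores: list[IdeaScore]) -> IdeaScore:
    # Assumption: ties go to the higher feasibility, then the lower idea number.
    return max(scores, key=lambda s: (s.total, s.feasibility, -s.idea))
```

With the example values from the removed evaluation matrix (totals 11, 14, 11, 12, 10), `select_best` picks Idea 2, the same selection the old template showed.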
540
176
 
541
177
  ### 6.2 Enhance Selected Idea
542
178
 
543
- Take the winning idea and create `$WORKSPACE/ideas/selected_idea.md`:
544
-
545
- **Enhancements to add:**
546
- 1. More detailed math formulations (complete loss functions, gradients)
547
- 2. Specific architecture choices (layer sizes, activations)
548
- 3. Hyperparameter recommendations
549
- 4. Implementation roadmap
550
- 5. Potential failure modes and mitigations
551
- 6. Detailed experiment design
552
-
553
- **Output:** `$WORKSPACE/ideas/selected_idea.md`
554
-
555
- ---
556
-
557
- ## Step 7: Code Survey - Map Idea to Implementations
558
-
559
- This step bridges theory and code. For each **atomic concept** in the selected idea, find corresponding implementations in the reference repos.
560
-
561
- ### 7.1 Extract Atomic Concepts
562
-
563
- From selected_idea.md, list all concepts needing implementation:
564
-
565
- ```markdown
566
- ## Atomic Concepts to Implement
567
-
568
- 1. Multi-head Self-Attention
569
- 2. Graph Message Passing
570
- 3. Energy-based Diffusion
571
- 4. Adaptive Diffusivity Function
572
- 5. ...
573
- ```
574
-
575
- ### 7.2 Survey Codebases
576
-
577
- For each concept:
578
-
579
- 1. Search repos for relevant code:
580
- ```bash
581
- grep -r "class.*Attention" $WORKSPACE/repos/
582
- grep -r "def forward" $WORKSPACE/repos/
583
- ```
584
-
585
- 2. Read and understand the implementation
586
-
587
- 3. Document the mapping
588
-
589
- ### 7.3 Create Implementation Report
590
-
591
- Write to `$WORKSPACE/ideas/implementation_report.md`:
592
-
593
- ```markdown
594
- # Implementation Report
595
-
596
- ## Selected Idea Summary
597
- [One paragraph summary]
598
-
599
- ## Concept-to-Code Mapping
600
-
601
- ### Concept 1: Multi-head Self-Attention
602
-
603
- **Math Formula:**
604
- $$
605
- \text{Attention}(Q,K,V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V
606
- $$
607
-
608
- **Reference Implementation:**
609
- - File: `repos/transformer/attention.py`
610
- - Class: `MultiHeadAttention`
611
- - Key code:
612
- ```python
613
- class MultiHeadAttention(nn.Module):
614
- def __init__(self, d_model, n_heads):
615
- self.d_k = d_model // n_heads
616
- self.W_q = nn.Linear(d_model, d_model)
617
- # ...
618
-
619
- def forward(self, x):
620
- Q = self.W_q(x)
621
- # ...
622
- ```
623
-
624
- **Adaptation needed:**
625
- - [What to modify for our idea]
626
-
627
- ---
628
-
629
- ### Concept 2: Graph Message Passing
630
- ...
179
+ Create `$WORKSPACE/ideas/selected_idea.md` with:
180
+ - Detailed math (loss functions, gradients)
181
+ - Architecture choices
182
+ - Hyperparameters
183
+ - Implementation roadmap
631
184
 
632
185
  ---
633
186
 
634
- ## Implementation Roadmap
187
+ ## Step 7: Code Survey
635
188
 
636
- 1. [ ] Start with Concept X (foundational)
637
- 2. [ ] Build Concept Y on top
638
- 3. [ ] Integrate with Concept Z
639
- 4. [ ] Add training loop from repo W
189
+ Map idea concepts to reference implementations.
640
190
 
641
- ## Recommended Starting Point
642
- [Which repo to fork/use as base]
643
- ```
191
+ See `references/code-mapping.md` for template.
644
192
 
645
193
  **Output:** `$WORKSPACE/ideas/implementation_report.md`
646
194
 
647
195
  ---
648
196
 
649
- ## Step 8: Final Summary
197
+ ## Step 8: Summary
650
198
 
651
199
  Create `$WORKSPACE/ideas/summary.md`:
652
-
653
- ```markdown
654
- # Research Idea Generation Report
655
-
656
- ## Task
657
- - Domain: {domain}
658
- - Focus: {focus}
659
- - Date: {date}
660
-
661
- ## Resources Gathered
662
- - Papers analyzed: X
663
- - Repositories cloned: Y
664
- - Key techniques identified: Z
665
-
666
- ## Ideas Generated
667
- 1. **[Idea 1 title]** - Score: 11
668
- 2. **[Idea 2 title]** - Score: 14 ⭐ SELECTED
669
- 3. **[Idea 3 title]** - Score: 11
670
- 4. **[Idea 4 title]** - Score: 12
671
- 5. **[Idea 5 title]** - Score: 10
672
-
673
- ## Selected Idea
674
- **{Title}**
675
-
676
- {One paragraph description}
677
-
678
- ### Key Innovation
679
- {What makes this novel}
680
-
681
- ### Implementation Ready
682
- - Math formulas: ✓ Complete
683
- - Code references: ✓ Mapped
684
- - Evaluation plan: ✓ Defined
685
-
686
- ## Next Steps
687
- 1. Run `/research-pipeline` with `selected_idea.md` as input
688
- 2. Or manually implement following `implementation_report.md`
689
-
690
- ## Files Generated
691
- - `task.json` - Task definition
692
- - `search_results.md` - Search results
693
- - `prepare_res.md` - Selected repos
694
- - `ideas/idea_*.md` - 5 generated ideas
695
- - `ideas/selected_idea.md` - Enhanced best idea
696
- - `ideas/implementation_report.md` - Code mapping
697
- ```
698
-
699
- **Output:** `$WORKSPACE/ideas/summary.md`
700
-
701
- ---
702
-
703
- ## Quality Checklist
704
-
705
- Before completing, verify:
706
-
707
- - [ ] At least 5 repos cloned in `repos/`
708
- - [ ] At least 3 papers downloaded in `papers/`
709
- - [ ] All 5 ideas are substantially different
710
- - [ ] Selected idea has complete math formulations
711
- - [ ] Implementation report covers ALL atomic concepts
712
- - [ ] Each concept has actual code reference (not placeholder)
713
- - [ ] Evaluation plan has specific datasets and metrics
714
-
715
- ---
716
-
717
- ## Integration with Other Skills
718
-
719
- **After idea-generation:**
720
- ```
721
- /research-pipeline → Implement the selected idea
722
- ```
723
-
724
- **To gather more resources:**
725
- ```
726
- /arxiv "specific topic" → Search more papers
727
- /literature-review → Deep dive into papers
728
- ```
200
+ - All 5 ideas with scores
201
+ - Selected idea details
202
+ - Next steps: `/research-pipeline` to implement
729
203
 
730
204
  ---
731
205
 
@@ -733,19 +207,15 @@ Before completing, verify:
733
207
 
734
208
  | User Says | Action |
735
209
  |-----------|--------|
736
- | "Generate research ideas for NLP" | Full workflow (Steps 1-8) |
737
- | "Search papers on X" | Steps 1-2 only |
738
- | "I have papers, generate ideas" | Skip to Step 5 |
739
- | "Enhance this idea: ..." | Skip to Step 6-7 |
740
- | "Map this idea to code" | Step 7 only |
210
+ | "Generate ideas for X" | Check workspace → ask strategy → generate |
211
+ | "I have papers, generate ideas" | Skip to Step 4 |
212
+ | "Enhance idea N" | Jump to Step 6 |
213
+ | "Map to code" | Jump to Step 7 |
741
214
 
742
215
  ---
743
216
 
744
- ## Batch Processing Rule
745
-
746
- If more than 10 papers/repos to analyze:
747
- 1. First pass: Quick scan all (abstract/README only)
748
- 2. Select top 5-7 for deep analysis
749
- 3. Generate ideas from deep analysis
217
+ ## Integration
750
218
 
751
- Do NOT process all resources with full detail - context will overflow.
219
+ - **Before:** `/literature-survey` to collect papers
220
+ - **After:** `/research-pipeline` to implement selected idea
221
+ - **Alternative:** `/write-review-paper` to write survey instead