researchbot 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,138 @@
# Analyze

Perform an in-depth analysis of an academic paper.

## Input

The user provides `$ARGUMENTS` in the format: `<paper_reference> [output_path] [--model=<opus|sonnet>]`

**Paper reference** (required) can be:
- An arXiv ID (e.g. `ARXIV:1706.03762`)
- A DOI (e.g. `DOI:10.18653/v1/N19-1423`)
- A Semantic Scholar URL or ID
- A paper name (e.g. `Attention is All You Need`)

**Output path** (optional) can be:
- A directory path (absolute or relative) where the analysis will be saved as `{paper_slug}.md`
- If not provided, the analysis is printed to the conversation

**Model** (optional):
- `--model=opus` (default) — Use Opus 4.6 for deep, thorough analysis
- `--model=sonnet` — Use Sonnet 4.5 for faster analysis

Examples:
- `/analyze "Attention is All You Need"` — Output to conversation, use opus
- `/analyze ARXIV:1706.03762 workspace/transformers/` — Save to directory, use opus
- `/analyze Mamba /home/user/papers/ --model=sonnet` — Save to absolute path, use sonnet
- `/analyze "LoRA" --model=sonnet` — Output to conversation, use sonnet

## Instructions

### Phase 1: Parse Arguments

1. **Extract the model parameter.** Check if `$ARGUMENTS` contains `--model=opus` or `--model=sonnet`.
   - If `--model=sonnet` is present, use model: "sonnet"
   - Otherwise, default to model: "opus"
   - Remove the `--model=...` flag from the arguments string

2. **Parse paper reference and output path.** From the remaining arguments, extract:
   - `paper_reference`: The paper identifier (everything before the last space, or all of it if no path is provided)
   - `output_path`: Optional directory path (the last token, if it looks like a path)

   To detect whether an output path was provided, check if the last token:
   - Ends with `/` or `\`, OR
   - Contains `/` or `\` path separators, OR
   - Is explicitly a directory (check with bash `test -d`)

   If ambiguous, assume no output path was provided.
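Taken together, the flag extraction and path heuristic above can be sketched in Python. This is a minimal sketch under assumptions: the function name and return shape are illustrative, and the `test -d` directory check is omitted since it needs filesystem access.

```python
import re

def parse_arguments(arguments: str) -> dict:
    """Minimal sketch of Phase 1 argument parsing (names are illustrative)."""
    # Extract the --model flag; default to opus.
    model = "sonnet" if "--model=sonnet" in arguments else "opus"
    rest = re.sub(r"\s*--model=\S+", "", arguments).strip()

    # Treat the last token as an output path only if it contains a path
    # separator. (The real heuristic would also try `test -d` on the token.)
    tokens = rest.rsplit(" ", 1)
    output_path = None
    if len(tokens) == 2 and re.search(r"[/\\]", tokens[1]):
        rest, output_path = tokens

    return {"paper_reference": rest, "output_path": output_path, "model": model}
```

Note that a single-token reference containing slashes (such as a DOI) is never split, because the split only happens when two or more tokens remain after removing the flag.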
### Phase 2: Launch Subagent

Use the Task tool with `subagent_type: "general-purpose"` and the selected model to spawn a subagent with the following prompt:

```
Analyze an academic paper in depth.

PAPER REFERENCE: {paper_reference}
OUTPUT PATH: {output_path or "Print to conversation"}

Instructions:

1. **Resolve and fetch the paper.** Use the `read_paper` MCP tool with the paper reference. This returns JSON with:
   - `cache_path`: Path to the cached markdown file
   - `title`, `authors`, `year`, `venue`, `citation_count`: Paper metadata

   If `read_paper` fails, try using `search_papers` to find the paper and then `read_paper` with the resolved ID.

2. **Read the full paper text.** Use the Read tool to read the paper from `cache_path`. The References section has already been stripped to reduce size.

3. **Analyze the paper carefully.** Then extract the following structured analysis:

   ### Core Contribution
   What is the main contribution of this paper? What problem does it solve, and what is novel about the approach?

   ### Methodology
   Describe the technical approach. What models, algorithms, or frameworks are introduced? Include key equations or formulations if they are central to the contribution.

   ### Key Results
   What are the main experimental results? Include specific numbers, benchmarks, and comparisons to baselines where available.

   ### Limitations
   What limitations do the authors acknowledge? What limitations are apparent but not stated?

   ### Future Work
   What directions for future work do the authors suggest? What open questions remain?

   ### Key References
   List 3-5 of the most important references cited in this paper that a reader should also look at.

4. **Output the analysis.**
   - **If no output path was provided:** Print the structured analysis as markdown to the conversation
   - **If an output path was provided:**
     1. Create a slug from the paper title (lowercase, hyphens, alphanumeric only)
     2. Create the output directory if it doesn't exist (use `mkdir -p`)
     3. Resolve relative paths relative to the current working directory
     4. Write the analysis to `{output_path}/{slug}.md`
     5. Inform the user where the file was saved

Use the output format specified below.
```
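Steps 1-4 of the output branch in the prompt above can be sketched as follows. This is a minimal sketch, assuming the subagent performs the equivalent with its own tools; the function names are hypothetical.

```python
import re
from pathlib import Path

def slugify(title: str) -> str:
    """Paper title -> lowercase, alphanumeric-and-hyphen slug."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def save_analysis(analysis: str, title: str, output_path: str) -> Path:
    """Write the analysis to {output_path}/{slug}.md, creating the directory."""
    out_dir = Path(output_path)                 # relative paths resolve against the CWD
    out_dir.mkdir(parents=True, exist_ok=True)  # equivalent of `mkdir -p`
    target = out_dir / f"{slugify(title)}.md"
    target.write_text(analysis)
    return target
```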
### Phase 3: Report Results

After the subagent completes:
1. If an error occurred, inform the user
2. If the analysis was written to a file, confirm the file location
3. If the analysis was printed to the conversation, the subagent will have already displayed it

## Output Format

```
# Analysis: {Paper Title}

**Authors:** {authors}
**Year:** {year} | **Venue:** {venue} | **Citations:** {count}

## TL;DR
{One-paragraph summary of the paper}

## Core Contribution
{...}

## Methodology
{...}

## Key Results
{...}

## Limitations
{...}

## Future Work
{...}

## Key References
- {ref 1}
- {ref 2}
- ...
```
@@ -0,0 +1,98 @@
# Compare Papers

Compare two academic papers, focusing on problem formulation and methodology.

## Input

The user provides exactly 2 paper references as `$ARGUMENTS`. Each reference can be:
- An arXiv ID (e.g. `ARXIV:1706.03762`)
- An arXiv URL (e.g. `https://arxiv.org/abs/1706.03762`)
- A direct PDF URL (e.g. `https://arxiv.org/pdf/1706.03762.pdf`)
- A DOI (e.g. `DOI:10.18653/v1/N19-1423`)
- A Semantic Scholar URL or ID
- A paper name (e.g. `Attention is All You Need`)

References are separated by spaces or commas. If a paper name contains spaces, the user may quote it.

Examples:
- `/compare Mamba S4`
- `/compare ARXIV:2312.00752 ARXIV:2111.00396`
- `/compare https://arxiv.org/abs/2312.00752 https://arxiv.org/abs/2111.00396`
- `/compare "Attention is All You Need" "Mamba: Linear-Time Sequence Modeling"`

## Instructions

1. **Parse the input.** Extract exactly 2 paper references from `$ARGUMENTS`. If fewer or more than 2 papers are provided, inform the user and ask for clarification.

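One way to sketch the quote-aware split is with Python's `shlex`. The helper name is hypothetical, and a comma with no following space would still need extra handling.

```python
import shlex

def parse_two_references(arguments: str) -> list:
    """Split $ARGUMENTS into paper references, honoring quotes and
    space-separated commas; error unless exactly 2 references remain."""
    tokens = [t.strip(",") for t in shlex.split(arguments)]
    refs = [t for t in tokens if t]
    if len(refs) != 2:
        raise ValueError(f"expected exactly 2 paper references, got {len(refs)}")
    return refs
```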
2. **Resolve and fetch each paper.** For each paper reference:
   - Use the `read_paper` MCP tool, which returns JSON with:
     - `cache_path`: Path to the cached markdown file
     - `title`, `authors`, `year`, `venue`, `citation_count`: Paper metadata
   - If resolution fails, try `search_papers` to find the paper first

3. **Read both papers.** Use the Read tool to read each paper from its `cache_path`. The References section has already been stripped to reduce size. Analyze the full text of each paper before comparing.

4. **Compare on two primary dimensions:**

   **Problem Formulation:**
   - How does each paper frame the problem?
   - What assumptions does each paper make?
   - What are they fundamentally trying to achieve?
   - What's the key difference in how they conceptualize the problem?

   **Methodology:**
   - What technical approach does each paper take?
   - What are the key components or innovations?
   - What trade-offs does each approach make?
   - What's fundamentally different about how they solve the problem?

5. **Check for a focus aspect.** If the user specifies a particular aspect to compare (e.g., "compare on scalability", "focus on experimental setup"), emphasize that aspect in your comparison while still covering the primary dimensions.

6. **Output the comparison directly.** Print the structured comparison as markdown to the conversation. Do NOT write to any file.

## Output Format

Use short names for papers (e.g., "Mamba" instead of the full title) throughout the comparison for readability.

```markdown
# Comparison: {Paper 1 short name} vs {Paper 2 short name}

## Papers

| Paper | Year | Venue | Citations |
|-------|------|-------|-----------|
| {Full Title 1} | {year} | {venue} | {count} |
| {Full Title 2} | {year} | {venue} | {count} |

## Problem Formulation

### {Paper 1 short name}
{How this paper frames the problem. What are they trying to solve? What assumptions do they make? What constraints or requirements do they identify?}

### {Paper 2 short name}
{How this paper frames the problem. What are they trying to solve? What assumptions do they make? What constraints or requirements do they identify?}

### Key Differences in Problem Framing
{What's fundamentally different about how each paper conceptualizes the problem? Do they make different assumptions? Target different constraints? Frame success differently?}

## Methodology

### {Paper 1 short name}
{Technical approach. Key components and how they work. Main innovations or contributions.}

### {Paper 2 short name}
{Technical approach. Key components and how they work. Main innovations or contributions.}

### Key Methodological Differences
{What's fundamentally different about the approaches? What trade-offs does each make? Where would each approach be preferred?}

## Summary

{One paragraph synthesizing the key takeaways. When would you use one approach vs the other? What does each paper contribute that the other doesn't?}
```

## Notes

- Be specific and technical in your comparisons. Avoid vague statements like "both papers address the problem well."
- When comparing methodology, focus on the "why" behind design choices, not just the "what."
- If the papers have different scopes (e.g., one is more theoretical, one more empirical), acknowledge this in your comparison.
@@ -0,0 +1,152 @@
# Expand: Find Related Works

Find and analyze papers solving the same problem as a seed paper.

## Input

The user provides a seed paper reference as `$ARGUMENTS`. This can be:
- An arXiv ID (e.g. `ARXIV:1706.03762`)
- An arXiv URL (e.g. `https://arxiv.org/abs/1706.03762`)
- A direct PDF URL (e.g. `https://arxiv.org/pdf/1706.03762.pdf`)
- A DOI (e.g. `DOI:10.18653/v1/N19-1423`)
- A Semantic Scholar URL or ID
- A paper name (e.g. `Attention is All You Need`)

Examples:
- `/expand Mamba`
- `/expand ARXIV:2312.00752`
- `/expand https://arxiv.org/abs/1706.03762`
- `/expand "Attention is All You Need"`

## Instructions

### Phase 1: Analyze the Seed Paper

1. **Resolve and fetch the seed paper.** Use `read_paper` with the provided reference. This returns JSON with:
   - `cache_path`: Path to the cached markdown file
   - `title`, `authors`, `year`, `venue`, `citation_count`: Paper metadata

   If it fails, use `search_papers` to find the paper first.

2. **Read the full paper text.** Use the Read tool to read the paper from `cache_path`. The References section has already been stripped.

3. **Extract the specific research problem.** Read the paper carefully and identify the **specific problem** being addressed (not the broad topic). The problem should be stated in one sentence.
   - Example: "Efficient sequence modeling with linear complexity" (not "machine learning" or "NLP")

4. **Generate a folder slug.** Create a slug from the research problem (lowercase, hyphens, 3-5 words).
   - Example: `efficient-sequence-modeling` or `low-rank-adaptation-llms`

### Phase 2: Search Exhaustively

Generate 3-5 targeted search queries based on the problem statement, then search using ALL of these methods:

1. **Keyword search.** Use `search_papers` with each query (limit 20 per query)
2. **Citations.** Use `get_citations` on the seed paper (limit 50)
3. **References.** Use `get_references` on the seed paper (limit 50)
4. **Similar papers.** Use `search_similar` on the seed paper (limit 20)

Deduplicate results by paper ID. You should end up with 50-150 candidate papers.
+
52
+ ### Phase 3: Parallel Paper Analysis (Subagents)
53
+
54
+ For each candidate paper, spawn a subagent to analyze it. Run subagents in parallel (batch 10-15 at a time).
55
+
56
+ **Subagent prompt template** (use Task tool with `subagent_type: "general-purpose"` and `model: "sonnet"`):
57
+
58
+ ```
59
+ Analyze whether this paper solves the SAME PROBLEM as the seed paper.
60
+
61
+ SEED PAPER:
62
+ - Title: {seed_title}
63
+ - Problem: {problem_statement}
64
+
65
+ CANDIDATE PAPER ID: {candidate_paper_id}
66
+
67
+ Instructions:
68
+ 1. Use `get_paper` to get the candidate paper's metadata (title, abstract, year)
69
+ 2. Read the abstract carefully
70
+ 3. Determine: Does this paper solve the SAME SPECIFIC PROBLEM as the seed?
71
+ - YES if it addresses the exact same problem (different approach is fine)
72
+ - NO if it's merely related, uses similar methods, or addresses a broader/narrower problem
73
+
74
+ If YES, also extract:
75
+ - approach: One sentence describing how it tackles the problem
76
+ - key_difference: How does it differ from the seed paper's approach?
77
+ - contribution: The main takeaway
78
+
79
+ Return your analysis as JSON:
80
+ {
81
+ "paper_id": "...",
82
+ "title": "...",
83
+ "year": ...,
84
+ "relevant": true/false,
85
+ "reason": "Why it is or isn't solving the same problem",
86
+ "approach": "..." (only if relevant),
87
+ "key_difference": "..." (only if relevant),
88
+ "contribution": "..." (only if relevant)
89
+ }
90
+ ```
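Since the Notes advise skipping papers whose subagents fail or return invalid JSON, collecting the replies can be guarded with a small validator. A sketch only; the helper name and required-field list are assumptions, and real replies may wrap the JSON in extra text.

```python
import json

def parse_subagent_report(raw: str):
    """Return the parsed report dict, or None if the reply is not valid JSON
    or lacks the required fields (the paper is then skipped)."""
    try:
        report = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(report, dict):
        return None
    required = ("paper_id", "title", "relevant", "reason")
    if not all(key in report for key in required):
        return None
    return report
```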
### Phase 4: Synthesize Results

1. **Collect all subagent reports.** Filter to only the relevant papers (those solving the same problem).

2. **Group by approach.** If there are clear categories of approaches, group papers accordingly.

3. **Create the workspace folder.** Create `workspace/{slug}/` if it doesn't exist.

4. **Write the output file.** Write to `workspace/{slug}/related_works.md` using the format below.

## Output Format

Write to `workspace/{slug}/related_works.md`:

```markdown
# Related Works: {Research Problem}

**Seed paper:** {Title} ({year})
**Problem:** {One-sentence problem statement}
**Papers analyzed:** {N} candidates → {M} relevant

## Overview

{Brief synthesis: How many papers address this problem? What are the main approaches? Any clear trends over time?}

## Papers

### {Paper 1 Title} ({year})

**Approach:** {How this paper tackles the problem}
**Key difference from seed:** {What's different about their approach}
**Contribution:** {Main takeaway}

### {Paper 2 Title} ({year})

...

## Approach Categories

{Group papers by their approach if there are 2+ clear categories. Otherwise, omit this section.}

### {Category 1 Name}
- {Paper A}: {brief description}
- {Paper B}: {brief description}

### {Category 2 Name}
- {Paper C}: {brief description}
- {Paper D}: {brief description}

## Summary

{What approaches exist to solve this problem? What trade-offs do they make? What does the seed paper contribute relative to this landscape? What's missing or underexplored?}
```

## Notes

- **Strict relevance filter.** Only include papers that solve the SAME problem. "Related" or "similar methods" is not enough. When in doubt, exclude.
- **Parallel execution.** Use the Task tool to spawn subagents in parallel. Send multiple Task tool calls in a single message.
- **Use the Sonnet model.** Always set `model: "sonnet"` when spawning subagents to balance speed and quality.
- **Handle failures gracefully.** If a subagent fails or returns invalid JSON, skip that paper and continue.
- **Inform the user.** After completing, tell the user where the file was written and give a brief summary (e.g., "Analyzed 127 candidates, found 23 papers solving the same problem").
@@ -0,0 +1,172 @@
# Gaps: Identify Research Gaps

Analyze a related works document to identify gaps, open questions, and promising research directions.

## Input

The user provides a workspace folder path as `$ARGUMENTS`. This folder should contain a `related_works.md` file created by `/expand`.

Examples:
- `/gaps workspace/efficient-sequence-modeling/`
- `/gaps workspace/low-rank-adaptation-llms/`

## Instructions

### Phase 1: Read and Understand the Landscape

1. **Read the related works document.** Use the Read tool to read `$ARGUMENTS/related_works.md`.

2. **Extract key information:**
   - The research problem being addressed
   - The seed paper and its approach
   - All related papers and their approaches
   - The approach categories (if present)
   - The existing summary/synthesis

3. **Build a mental model** of the research landscape:
   - What approaches have been tried?
   - What results have been achieved?
   - What trade-offs do different approaches make?
   - What's the current state of the art?

### Phase 2: Identify Gaps

Analyze the landscape systematically. For each category below, look for what's missing, unclear, or underexplored.

#### Methodological Gaps
- What techniques haven't been tried?
- What limitations do current approaches share?
- What combinations of methods are unexplored?
- What architectural choices are untested?
- Are there approaches from adjacent fields that haven't been applied?

#### Empirical Gaps
- What settings or domains haven't been tested?
- What scales (larger/smaller) are unexplored?
- What datasets or benchmarks are missing?
- Are results robust across different conditions?
- What ablations or analyses are missing?

#### Theoretical Gaps
- What phenomena lack explanation?
- What assumptions are untested or questionable?
- Why do certain approaches work (or not work)?
- What are the fundamental limits?
- What theoretical frameworks are missing?

#### Application Gaps
- What use cases haven't been explored?
- What domains could benefit but haven't been tried?
- What practical constraints haven't been addressed?
- What deployment scenarios are missing?

### Phase 3: Assess Each Gap

For each gap you identify, assess:

1. **Significance** — How important is filling this gap?
   - Would it advance the field substantially?
   - Does it block progress on other fronts?
   - Would it have practical impact?

2. **Tractability** — How feasible is it to address?
   - **High**: Clear path forward, resources exist, could be done with current methods
   - **Medium**: Requires some innovation or significant effort, but achievable
   - **Low**: Fundamental challenges, unclear how to proceed, may require breakthroughs

3. **Evidence** — What from the papers supports this being a gap?
   - Which papers show this limitation?
   - What's been tried vs. what hasn't?

4. **Potential approach** — How might this gap be addressed?
   - What would a solution look like?
   - What would be needed (data, compute, new methods)?

### Phase 4: Rank Opportunities

Identify the **top 3-5 research opportunities** by combining significance and tractability:
- High significance + High tractability = Top opportunity
- High significance + Medium tractability = Strong opportunity
- Medium significance + High tractability = Good opportunity

For each top opportunity, explain why it's promising and what makes it actionable.
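The combination rule above can be sketched as a numeric sort. This is a sketch only, with hypothetical weights and field names, since the real ranking is an editorial judgment rather than a computation.

```python
# Hypothetical numeric weights for the significance/tractability levels.
LEVEL = {"High": 3, "Medium": 2, "Low": 1}

def rank_opportunities(gaps: list, top_n: int = 5) -> list:
    """Order gaps by significance first, then tractability, and keep the top N."""
    ranked = sorted(
        gaps,
        key=lambda g: (LEVEL[g["significance"]], LEVEL[g["tractability"]]),
        reverse=True,
    )
    return ranked[:top_n]
```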
### Phase 5: Write Output

Write the analysis to `$ARGUMENTS/gaps.md` using the format below.

## Output Format

Write to `$ARGUMENTS/gaps.md`:

```markdown
# Research Gaps: {Research Problem}

**Based on:** related_works.md ({N} papers analyzed)
**Seed paper:** {Title}

## Summary

{One-paragraph overview: What's the state of the field? What are the major gaps? What's the most promising direction for new research?}

## Methodological Gaps

### {Gap Title}
**Description:** {What's missing or limited in current approaches}
**Evidence:** {Which papers show this limitation, what has/hasn't been tried}
**Potential approach:** {How this might be addressed}
**Tractability:** High / Medium / Low

### {Gap Title}
...

## Empirical Gaps

### {Gap Title}
**Description:** {What settings, domains, or scales are untested}
**Evidence:** {What's been tested vs. what hasn't}
**Potential approach:** {How this might be addressed}
**Tractability:** High / Medium / Low

### {Gap Title}
...

## Theoretical Gaps

### {Gap Title}
**Description:** {What's not understood or explained}
**Evidence:** {What questions remain open}
**Potential approach:** {How this might be addressed}
**Tractability:** High / Medium / Low

### {Gap Title}
...

## Application Gaps

### {Gap Title}
**Description:** {What use cases or domains are underexplored}
**Evidence:** {What applications haven't been tried}
**Potential approach:** {How this might be addressed}
**Tractability:** High / Medium / Low

### {Gap Title}
...

## Top Opportunities

{Rank the top 3-5 gaps by a combination of significance and tractability. These are the most promising research directions.}

1. **{Gap name}** — {Why this is promising: what makes it significant AND tractable}
2. **{Gap name}** — {Why this is promising}
3. **{Gap name}** — {Why this is promising}
```

## Notes

- **Be specific.** Vague gaps like "needs more research" are not useful. Identify concrete, actionable gaps.
- **Ground in evidence.** Every gap should be supported by what the papers do or don't address.
- **Quality over quantity.** It's better to identify 2-3 significant gaps per category than to list every possible limitation.
- **Omit empty categories.** If there are no meaningful gaps in a category, omit that section entirely.
- **Focus on actionable gaps.** Prioritize gaps that could realistically be addressed by a research project.
- **Inform the user.** After completing, tell the user where the file was written and highlight the top 2-3 opportunities.