deep-research-cc 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 desland01
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,299 @@
1
+ <div align="center">
2
+
3
+ # Deep Research
4
+
5
+ **Academic-grade multi-agent research pipeline for Claude Code.**
6
+
7
+ [![Version](https://img.shields.io/badge/version-1.0.0-green?style=for-the-badge)](VERSION)
8
+ [![License](https://img.shields.io/badge/license-MIT-blue?style=for-the-badge)](LICENSE)
9
+
10
+ <br>
11
+
12
+ ```bash
13
+ npx deep-research-cc
14
+ ```
15
+
16
+ **3-stage pipeline: Research Agents -> Synthesis Agents -> Report Builder**
17
+
18
+ <br>
19
+
20
+ *"Point it at a question, walk away, come back to an academic-grade research report with citations."*
21
+
22
+ <br>
23
+
24
+ [How It Works](#how-it-works) · [Prerequisites](#prerequisites) · [Installation](#installation) · [Usage](#usage) · [Architecture](#architecture)
25
+
26
+ </div>
27
+
28
+ ---
29
+
30
+ ## The Problem
31
+
32
+ Manual research with AI tools is shallow. You search, read, forget context, search again. Sources get lost. Analysis stays surface-level. And if your context window fills up, everything collapses.
33
+
34
+ **Deep Research fixes that.** It decomposes your question into independent research domains, dispatches parallel agents to investigate each one, synthesizes findings through dedicated analysis agents, and produces a comprehensive report with full citations. All three stages write to disk, so nothing is lost to context limits.
35
+
36
+ ---
37
+
38
+ ## How It Works
39
+
40
+ ### Stage 1: Research Agents (parallel)
41
+
42
+ You provide a topic. The pipeline decomposes it into 3-6 independent research domains and launches parallel agents, each investigating one domain using Firecrawl MCP for web search and scraping.
43
+
44
+ Each agent writes a raw research document to `docs/plans/YYYY-MM-DD-{domain}-research.md`.
45
+
46
+ ### Stage 2: Synthesis Agents (parallel)
47
+
48
+ After all research agents complete, synthesis agents launch in parallel. Each synthesis agent reads exactly ONE research file (strict 1:1 mapping) and produces a distilled domain summary with evidence quality assessment, gap analysis, and organized findings.
49
+
50
+ Each agent writes to `docs/plans/YYYY-MM-DD-{domain}-synthesis.md`.
51
+
52
+ ### Stage 3: Report Builder (single agent)
53
+
54
+ A single report builder agent reads ALL synthesis files and produces a comprehensive, externally-shareable research report following an academic template.
55
+
56
+ Final report: `docs/plans/YYYY-MM-DD-{topic}-report.md`
57
+
58
+ ---
59
+
60
+ ## What You Get
61
+
62
+ A structured research report with:
63
+
64
+ - **Executive Summary** with confidence assessment
65
+ - **Methodology** documenting search strategy and source evaluation
66
+ - **Domain Findings** with citations and evidence quality ratings
67
+ - **Cross-Cutting Analysis** identifying themes, contradictions, and risks
68
+ - **Synthesis and Recommendations** with alternatives table
69
+ - **Limitations** and suggested follow-up research
70
+ - **Full References** organized by domain with source type and relevance ratings
71
+
72
+ Every factual claim traces back to a cited source. The report stands alone: someone with no prior context can read and learn from it.
73
+
74
+ ---
75
+
76
+ ## Prerequisites
77
+
78
+ | Requirement | Details |
79
+ |-------------|---------|
80
+ | **Claude Code** | Anthropic's CLI for Claude |
81
+ | **Node.js** | v18+ (for npx installer) |
82
+ | **Firecrawl MCP** | Paid API key from [firecrawl.dev](https://firecrawl.dev) |
83
+
84
+ > **Note:** Firecrawl is a paid service that provides web search and scraping capabilities. Deep Research requires it for all web-based research. Visit [firecrawl.dev/pricing](https://firecrawl.dev/pricing) for current plans.
85
+
86
+ ---
87
+
88
+ ## Installation
89
+
90
+ ### Quick Install (recommended)
91
+
92
+ ```bash
93
+ npx deep-research-cc
94
+ ```
95
+
96
+ The installer copies the skill files into `~/.claude/` and backs up any existing files.
97
+
98
+ <details>
99
+ <summary><strong>Non-interactive install (Docker, CI, scripts)</strong></summary>
100
+
101
+ ```bash
102
+ npx deep-research-cc --global # Install to ~/.claude/
103
+ npx deep-research-cc --local # Install to ./.claude/
104
+ ```
105
+
106
+ Use `--global` (`-g`) or `--local` (`-l`) to skip the interactive prompt.
107
+
108
+ </details>
109
+
110
+ <details>
111
+ <summary><strong>Development installation</strong></summary>
112
+
113
+ Clone the repository and use symlinks for live editing:
114
+
115
+ ```bash
116
+ git clone https://github.com/desland01/deep-research.git ~/deep-research
117
+ cd ~/deep-research
118
+ chmod +x install.sh
119
+ ./install.sh
120
+ ```
121
+
122
+ Changes in `~/deep-research/` are live immediately via symlinks.
123
+
124
+ </details>
125
+
126
+ ---
127
+
128
+ ## Firecrawl MCP Setup
129
+
130
+ Deep Research uses Firecrawl MCP for all web search and scraping. You must configure it before using the skill.
131
+
132
+ ### Step 1: Get an API Key
133
+
134
+ Sign up at [firecrawl.dev](https://firecrawl.dev) and get your API key from the dashboard.
135
+
136
+ ### Step 2: Add Firecrawl MCP to Claude Code
137
+
138
+ ```bash
139
+ claude mcp add firecrawl-mcp -e FIRECRAWL_API_KEY=your-key-here -- npx -y firecrawl-mcp
140
+ ```
141
+
142
+ ### Step 3: Restart Claude Code
143
+
144
+ Restart Claude Code to load both the new skill and the Firecrawl MCP server.
145
+
146
+ ### Verify Setup
147
+
148
+ Start Claude Code and type `/deep-research`. If the skill loads, you're ready.
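+
+ You can also confirm the Firecrawl server is registered from the shell. A minimal check, assuming the `claude mcp list` subcommand is available in your Claude Code version:
+
+ ```bash
+ claude mcp list
+ # Expect firecrawl-mcp to appear among the configured servers
+ ```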
149
+
150
+ ---
151
+
152
+ ## Usage
153
+
154
+ In any Claude Code session:
155
+
156
+ ```
157
+ /deep-research What are the best approaches to real-time voice AI for mobile apps?
158
+ ```
159
+
160
+ The pipeline will:
161
+
162
+ 1. Decompose your question into 3-6 research domains
163
+ 2. Launch parallel research agents (each writes to disk)
164
+ 3. Launch parallel synthesis agents (each reads one research file, writes synthesis)
165
+ 4. Launch a report builder (reads all syntheses, writes final report)
166
+ 5. Report the file location and top-level findings
167
+
168
+ ### Example Output Files
169
+
170
+ For a research topic "voice AI providers for React Native":
171
+
172
+ ```
173
+ docs/plans/
174
+ 2026-02-16-voice-ai-landscape-research.md
175
+ 2026-02-16-voice-ai-pricing-research.md
176
+ 2026-02-16-voice-ai-react-native-research.md
177
+ 2026-02-16-voice-ai-latency-research.md
178
+ 2026-02-16-voice-ai-privacy-research.md
179
+ 2026-02-16-voice-ai-landscape-synthesis.md
180
+ 2026-02-16-voice-ai-pricing-synthesis.md
181
+ 2026-02-16-voice-ai-react-native-synthesis.md
182
+ 2026-02-16-voice-ai-latency-synthesis.md
183
+ 2026-02-16-voice-ai-privacy-synthesis.md
184
+ 2026-02-16-voice-ai-providers-report.md <- Final report
185
+ ```
186
+
187
+ ---
188
+
189
+ ## Architecture
190
+
191
+ ```
192
+ /deep-research [topic]
193
+ |
194
+ v
195
+ Decompose into 3-6 domains
196
+ |
197
+ v
198
+ ┌────┬────┬────┬────┬────┬────┐
199
+ R1 R2 R3 R4 R5 R6 <- Research agents (parallel, background)
200
+ | | | | | | Each writes *-research.md to disk
201
+ └────┴────┴────┴────┴────┴────┘
202
+ |
203
+ v (wait for ALL to complete)
204
+ |
205
+ ┌────┬────┬────┬────┬────┬────┐
206
+ S1 S2 S3 S4 S5 S6 <- Synthesis agents (parallel, background)
207
+ | | | | | | 1:1 with research agents
208
+ └────┴────┴────┴────┴────┴────┘ Each reads one *-research.md, writes *-synthesis.md
209
+ |
210
+ v (wait for ALL to complete)
211
+ |
212
+ [RB] <- Report builder (single agent, background)
213
+ | Reads ALL *-synthesis.md files
214
+ v Writes final *-report.md
215
+ Final Report
216
+ ```
217
+
218
+ ### Agent Limits
219
+
220
+ | Constraint | Value |
221
+ |-----------|-------|
222
+ | Max parallel research agents | 6 |
223
+ | Max parallel synthesis agents | 6 (1:1 with research) |
224
+ | Report builder agents | 1 (always single) |
225
+ | Parent reads raw output | Never |
226
+ | Parent reads final report | Yes (after verification) |
227
+
228
+ ### Why Disk-Based Handoff?
229
+
230
+ Raw research agent output can exceed 600K tokens. If the parent agent reads it via TaskOutput, context gets crushed and downstream agents never launch. Writing to disk files keeps each stage isolated and the parent context clean.
231
+
232
+ ### Anti-Patterns
233
+
234
+ | Don't | Do Instead |
235
+ |-------|-----------|
236
+ | Read TaskOutput from research/synthesis agents | Wait for completion, check files on disk |
237
+ | Assign 2+ research files to one synthesis agent | Strict 1:1 mapping |
238
+ | Skip the report builder for "simple" research | Always run all 3 stages |
239
+ | Omit Firecrawl instructions from research agents | Agents don't inherit MCP context |
240
+ | Omit source URLs | Every claim must trace to a URL |
241
+
242
+ ---
243
+
244
+ ## Troubleshooting
245
+
246
+ **Skill not found after install?**
247
+ - Restart Claude Code to reload skills
248
+ - Verify files exist: `ls ~/.claude/skills/deep-research/`
249
+
250
+ **Research agents failing?**
251
+ - Check Firecrawl MCP is configured: look for `firecrawl-mcp` in your MCP server list
252
+ - Verify your API key is valid at [firecrawl.dev/dashboard](https://firecrawl.dev)
253
+
254
+ **Report missing sections?**
255
+ - Check that all research and synthesis files were created in `docs/plans/`
256
+ - The report builder only runs after ALL synthesis agents complete
257
+
258
+ **Context getting crushed?**
259
+ - This usually means the parent is reading raw agent output. The skill prevents this by design, but if you've modified the protocol, ensure the parent never reads research or synthesis files directly.
260
+
261
+ ---
262
+
263
+ ## Directory Structure
264
+
265
+ ```
266
+ deep-research/
267
+ ├── README.md
268
+ ├── VERSION
269
+ ├── LICENSE
270
+ ├── .gitignore
271
+ ├── package.json
272
+ ├── install.sh # Dev install (symlinks)
273
+ ├── bin/
274
+ │ └── install.js # npx entry point (copy-based)
275
+ └── skills/
276
+ └── deep-research/
277
+ ├── SKILL.md # 3-stage protocol
278
+ ├── academic-report-template.md # Report format template
279
+ └── firecrawl-reference.md # Firecrawl MCP tool reference
280
+ ```
281
+
282
+ ---
283
+
284
+ ## Contributing
285
+
286
+ 1. Clone the repo and run `./install.sh`
287
+ 2. Edit files in `~/deep-research/` (changes are live via symlinks)
288
+ 3. Test with `/deep-research` in Claude Code
289
+ 4. For protocol changes, verify all 3 stages produce expected output files
290
+
291
+ ---
292
+
293
+ <div align="center">
294
+
295
+ **Claude Code is powerful. Deep Research makes it thorough.**
296
+
297
+ *Academic-grade research pipelines, so you can decide with confidence.*
298
+
299
+ </div>
package/VERSION ADDED
@@ -0,0 +1 @@
1
+ 1.0.0
package/bin/install.js ADDED
@@ -0,0 +1,119 @@
1
+ #!/usr/bin/env node
2
+
3
+ const fs = require("fs");
4
+ const path = require("path");
5
+ const os = require("os");
6
+ const readline = require("readline");
7
+
8
+ const VERSION = fs
9
+ .readFileSync(path.join(__dirname, "..", "VERSION"), "utf8")
10
+ .trim();
11
+ const PKG_ROOT = path.join(__dirname, "..");
12
+ const HOME = os.homedir();
13
+
14
+ const TARGETS = {
15
+ global: path.join(HOME, ".claude"),
16
+ local: path.join(process.cwd(), ".claude"),
17
+ };
18
+
19
+ const DIRS_TO_COPY = ["skills/deep-research"];
20
+
21
+ // -- Helpers ----------------------------------------------------------------
22
+
23
+ function copyDirSync(src, dest) {
24
+ fs.mkdirSync(dest, { recursive: true });
25
+ for (const entry of fs.readdirSync(src, { withFileTypes: true })) {
26
+ const srcPath = path.join(src, entry.name);
27
+ const destPath = path.join(dest, entry.name);
28
+ if (entry.isDirectory()) {
29
+ copyDirSync(srcPath, destPath);
30
+ } else {
31
+ fs.copyFileSync(srcPath, destPath);
32
+ }
33
+ }
34
+ }
35
+
36
+ function backupIfExists(target) {
37
+ if (fs.existsSync(target)) {
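+ // Build a UTC timestamp like "20260216123456" (YYYYMMDDHHMMSS) for the backup name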
38
+ const timestamp = new Date()
39
+ .toISOString()
40
+ .replace(/[-:T]/g, "")
41
+ .slice(0, 14);
42
+ const backup = `${target}.backup.${timestamp}`;
43
+ console.log(` Backing up existing -> ${path.basename(backup)}`);
44
+ fs.renameSync(target, backup);
45
+ }
46
+ }
47
+
48
+ function install(claudeDir) {
49
+ console.log(`\nInstalling to ${claudeDir}/\n`);
50
+
51
+ for (const rel of DIRS_TO_COPY) {
52
+ const src = path.join(PKG_ROOT, rel);
53
+ const dest = path.join(claudeDir, rel);
54
+
55
+ // Ensure parent dir exists
56
+ fs.mkdirSync(path.dirname(dest), { recursive: true });
57
+
58
+ backupIfExists(dest);
59
+ copyDirSync(src, dest);
60
+ console.log(` Copied: ${rel}`);
61
+ }
62
+
63
+ console.log(`
64
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
65
+ Install complete.
66
+
67
+ Skill installed:
68
+ /deep-research [topic] Run 3-stage academic research pipeline
69
+
70
+ Firecrawl MCP Setup (required):
71
+ Deep Research uses Firecrawl for web search and scraping.
72
+ You need a Firecrawl API key from https://firecrawl.dev
73
+
74
+ Add Firecrawl MCP to your project:
75
+ claude mcp add firecrawl-mcp -e FIRECRAWL_API_KEY=your-key -- npx -y firecrawl-mcp
76
+
77
+ Then restart Claude Code to load the new skill and MCP server.
78
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━`);
79
+ }
80
+
81
+ // -- Main -------------------------------------------------------------------
82
+
83
+ console.log(`
84
+ Deep Research Installer v${VERSION}
85
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━`);
86
+
87
+ const args = process.argv.slice(2);
88
+
89
+ if (args.includes("--global") || args.includes("-g")) {
90
+ install(TARGETS.global);
91
+ } else if (args.includes("--local") || args.includes("-l")) {
92
+ install(TARGETS.local);
93
+ } else {
94
+ // Interactive prompt
95
+ const rl = readline.createInterface({
96
+ input: process.stdin,
97
+ output: process.stdout,
98
+ });
99
+
100
+ console.log(`
101
+ Where would you like to install?
102
+
103
+ 1) Global (~/.claude/) Available in all projects
104
+ 2) Local (./.claude/) This project only
105
+ `);
106
+
107
+ rl.question("Choose [1/2]: ", (answer) => {
108
+ rl.close();
109
+ const choice = answer.trim();
110
+ if (choice === "1" || choice.toLowerCase() === "global") {
111
+ install(TARGETS.global);
112
+ } else if (choice === "2" || choice.toLowerCase() === "local") {
113
+ install(TARGETS.local);
114
+ } else {
115
+ console.log("Invalid choice. Use --global or --local flag, or enter 1 or 2.");
116
+ process.exit(1);
117
+ }
118
+ });
119
+ }
package/package.json ADDED
@@ -0,0 +1,28 @@
1
+ {
2
+ "name": "deep-research-cc",
3
+ "version": "1.0.0",
4
+ "description": "Academic-grade multi-agent research pipeline for Claude Code — 3-stage deep research that writes itself.",
5
+ "bin": {
6
+ "deep-research-cc": "bin/install.js"
7
+ },
8
+ "files": [
9
+ "bin/",
10
+ "skills/",
11
+ "VERSION"
12
+ ],
13
+ "keywords": [
14
+ "claude",
15
+ "claude-code",
16
+ "research",
17
+ "deep-research",
18
+ "firecrawl",
19
+ "academic",
20
+ "multi-agent"
21
+ ],
22
+ "author": "desland01",
23
+ "license": "MIT",
24
+ "repository": {
25
+ "type": "git",
26
+ "url": "git+https://github.com/desland01/deep-research.git"
27
+ }
28
+ }
package/skills/deep-research/SKILL.md ADDED
@@ -0,0 +1,229 @@
1
+ ---
2
+ name: deep-research
3
+ description: "Use when evaluating technologies, making architectural decisions, comparing options across multiple dimensions, or any research task requiring 5+ web searches across multiple domains. Triggers on phrases like 'research this', 'figure out the best way', 'compare options for', or 'what are our choices'. NOT for simple factual lookups, codebase exploration, or tasks where you already have enough context."
4
+ argument-hint: [topic or question]
5
+ ---
6
+
7
+ # Deep Research Protocol (3-Stage Pipeline)
8
+
9
+ Autonomous multi-agent research workflow producing academic-level reports. Three stages: Research, Synthesis, Report. Each stage writes to disk. Parent never touches raw output from any stage.
10
+
11
+ ## Why This Exists
12
+
13
+ Raw research agent output can exceed 600K tokens. If the parent reads it (via TaskOutput or inline), context gets crushed and downstream agents never launch. This protocol keeps the parent context clean by using disk files as the handoff layer between all three stages.
14
+
15
+ ## The Rule
16
+
17
+ **Parent (you) must NEVER:**
18
+ - Read TaskOutput from research agents or synthesis agents
19
+ - Paste findings from any agent into another agent's prompt
20
+ - Read raw JSONL output files
21
+ - Read research or synthesis markdown files directly
22
+
23
+ **Parent (you) CAN:**
24
+ - Check completion status via task notifications (automatic)
25
+ - Read the final report on disk after Stage 3 completes
26
+ - Verify files exist using Glob
27
+ - Grep for structural markers (headers, citations) without reading full content
28
+
29
+ ## Protocol
30
+
31
+ ### Step 1: Decompose
32
+
33
+ Split the research question into 3-6 independent domains. Each domain becomes one research agent. Domains should be non-overlapping but collectively exhaustive.
34
+
35
+ Example decomposition for "Best voice AI provider for our app":
36
+ - Domain 1: Provider landscape and feature comparison
37
+ - Domain 2: Pricing models and cost projections
38
+ - Domain 3: React Native integration complexity
39
+ - Domain 4: Latency, reliability, and production readiness
40
+ - Domain 5: Privacy, compliance, and data handling
41
+
42
+ ### Step 2: Dispatch Research Agents (parallel, background)
43
+
44
+ Launch all research agents in parallel with `run_in_background: true`.
45
+
46
+ Each research agent writes to: `docs/plans/YYYY-MM-DD-{domain-slug}-research.md`
47
+
48
+ Use the Research Agent Prompt Template below. Every prompt MUST include:
49
+ - Numbered questions to answer (3-5 per agent)
50
+ - Firecrawl MCP tool instructions (agents do not inherit MCP context)
51
+ - Suggested search queries (3-5 per agent)
52
+ - The exact output file path
53
+ - Instructions to include source URLs for every claim
54
+
55
+ ### Step 3: Wait for Research Completion
56
+
57
+ Do nothing. Task notifications arrive automatically when agents complete. Do NOT poll, do NOT read output files. Wait until all research agents report completion.
58
+
59
+ ### Step 4: Dispatch Synthesis Agents (parallel, background)
60
+
61
+ After ALL research agents complete, launch synthesis agents with `run_in_background: true`.
62
+
63
+ **One synthesis agent per research agent.** This is a strict 1:1 mapping. Each synthesis agent reads exactly ONE research file and produces ONE synthesis file. Never assign multiple research files to a single synthesis agent.
64
+
65
+ Each synthesis agent reads from: `docs/plans/YYYY-MM-DD-{domain-slug}-research.md`
66
+ Each synthesis agent writes to: `docs/plans/YYYY-MM-DD-{domain-slug}-synthesis.md`
67
+
68
+ Use the Synthesis Agent Prompt Template below. Every prompt MUST include:
69
+ - The exact research file path to READ
70
+ - The exact synthesis file path to WRITE
71
+ - Instructions to distill, organize, cite, and assess source quality
72
+ - Instructions to flag gaps and contradictions
73
+
74
+ **Why 1:1?** Each domain deserves dedicated attention. Combining domains in synthesis loses nuance and produces shallow analysis.
75
+
76
+ ### Step 5: Dispatch Report Builder (single agent, background)
77
+
78
+ After ALL synthesis agents complete, launch ONE report builder agent with `run_in_background: true`.
79
+
80
+ The report builder reads ALL synthesis files and produces a single comprehensive report.
81
+
82
+ Report builder reads from: All `docs/plans/YYYY-MM-DD-*-synthesis.md` files
83
+ Report builder writes to: `docs/plans/YYYY-MM-DD-{overall-topic}-report.md`
84
+ Report builder references: `~/.claude/skills/deep-research/academic-report-template.md`
85
+
86
+ Use the Report Builder Prompt Template below. The report builder focuses on cross-cutting analysis, theme identification, and actionable recommendations. It does NOT re-analyze raw research. It works exclusively from the synthesized domain summaries.
87
+
88
+ ### Step 6: Verify and Report
89
+
90
+ After the report builder completes:
91
+
92
+ 1. Verify deliverables exist on disk (Glob for the report file)
93
+ 2. Grep for structural markers: `## Executive Summary`, `## References`, `## 2. Methodology`
94
+ 3. Grep for citation presence (source URLs in References section)
95
+ 4. Report to user: what was researched, where the report lives, key top-level findings (2-3 sentences max)
96
+
97
+ The user reads the report on disk. Do NOT paste report content into the chat.
98
+
99
+ ## Agent Limits
100
+
101
+ | Constraint | Value |
102
+ |-----------|-------|
103
+ | Max parallel research agents | 6 |
104
+ | Max parallel synthesis agents | 6 (1:1 with research agents) |
105
+ | Report builder agents | 1 (always single) |
106
+ | Research agent output | `docs/plans/YYYY-MM-DD-{domain}-research.md` |
107
+ | Synthesis agent output | `docs/plans/YYYY-MM-DD-{domain}-synthesis.md` |
108
+ | Report builder output | `docs/plans/YYYY-MM-DD-{topic}-report.md` |
109
+ | Parent reads raw output | NEVER (research or synthesis) |
110
+ | Parent reads final report | YES (after Step 6 verification) |
111
+
112
+ ## Key Principles
113
+
114
+ | Principle | Rationale |
115
+ |-----------|-----------|
116
+ | Disk files as handoff layer | Prevents context flooding between stages |
117
+ | 1:1 synthesis mapping | Each domain gets dedicated analytical attention |
118
+ | Report builder works from syntheses only | Cross-cutting analysis, not domain re-reading |
119
+ | Academic rigor | Every claim needs a citation, methodology documented, limitations acknowledged |
120
+ | Background execution for all agents | Parent stays responsive, agents run in parallel |
121
+ | Firecrawl MCP for web research | Consistent tool usage, agents don't inherit MCP context |
122
+ | All findings include source URLs | Traceability and verifiability of claims |
123
+ | Report should be externally shareable | Quality bar: someone outside the team can read and learn from it |
124
+
125
+ ## Template: Research Agent Prompt
126
+
127
+ ```
128
+ You are a technical researcher. Your job is to thoroughly investigate [DOMAIN].
129
+
130
+ Questions to answer:
131
+ 1. [Question 1]
132
+ 2. [Question 2]
133
+ 3. [Question 3]
134
+ 4. [Question 4 - optional]
135
+ 5. [Question 5 - optional]
136
+
137
+ Research method:
138
+ - Use Firecrawl MCP tools for all web search and URL scraping
139
+ - `mcp__firecrawl-mcp__firecrawl_search` for web search (use `query` param)
140
+ - `mcp__firecrawl-mcp__firecrawl_scrape` for reading specific URLs (use `url` param)
141
+ - Reference `~/.claude/skills/deep-research/firecrawl-reference.md` for tool patterns and params
142
+ - Suggested searches: "[query 1]", "[query 2]", "[query 3]"
143
+
144
+ Output requirements:
145
+ - Write ALL findings to `docs/plans/YYYY-MM-DD-{domain-slug}-research.md` using the Write tool
146
+ - Structure with clear H2/H3 headers and tables where data is comparable
147
+ - Include source URLs for EVERY factual claim (inline or footnote style)
148
+ - Include direct quotes from official documentation where relevant
149
+ - Note contradictions between sources
150
+ - Never use em dashes (use commas, periods, or restructure sentences)
151
+ - End with a "Sources" section listing all URLs consulted with brief descriptions
152
+ ```
153
+
154
+ ## Template: Synthesis Agent Prompt
155
+
156
+ ```
157
+ You are an analytical synthesizer. Your job is to distill raw research into a focused, well-organized domain summary.
158
+
159
+ Input: Read the research document at `docs/plans/YYYY-MM-DD-{domain-slug}-research.md` using the Read tool.
160
+
161
+ Output: Write your synthesis to `docs/plans/YYYY-MM-DD-{domain-slug}-synthesis.md` using the Write tool.
162
+
163
+ Synthesis requirements:
164
+ 1. **Key Findings** (bulleted, 5-10 items): The most important facts and conclusions from the research
165
+ 2. **Thematic Organization**: Group findings by theme, not by source. Use H2 headers for each theme.
166
+ 3. **Evidence Quality Assessment**: For each major claim, note:
167
+ - Number of independent sources confirming it
168
+ - Source credibility (official docs, peer-reviewed, blog post, forum, marketing material)
169
+ - Recency (when was this information published or last updated?)
170
+ 4. **Gaps and Contradictions**: What questions remain unanswered? Where do sources disagree?
171
+ 5. **Citations**: Preserve all source URLs from the research document. Every claim must trace back to a source.
172
+ 6. **Domain-Specific Recommendations**: Based on the evidence, what are the clear takeaways for this domain?
173
+
174
+ Style rules:
175
+ - Never use em dashes (use commas, periods, or restructure sentences)
176
+ - Tables for comparisons and structured data
177
+ - Concise paragraphs (3-4 sentences max)
178
+ - No filler, no hedging language, no sycophancy
179
+ - Write for a technical audience that wants actionable information
180
+ ```
181
+
182
+ ## Template: Report Builder Prompt
183
+
184
+ ```
185
+ You are an academic research report writer. Your job is to produce a comprehensive, externally-shareable research report from multiple domain syntheses.
186
+
187
+ Input: Read ALL synthesis files matching `docs/plans/YYYY-MM-DD-*-synthesis.md` using the Read tool.
188
+ Template: Read the report template at `~/.claude/skills/deep-research/academic-report-template.md` using the Read tool.
189
+
190
+ Output: Write the final report to `docs/plans/YYYY-MM-DD-{overall-topic}-report.md` using the Write tool.
191
+
192
+ Report structure (follow the academic report template):
193
+ 1. **Executive Summary** (200-300 words): Core question, key findings, primary recommendation, confidence level
194
+ 2. **Introduction**: Problem statement, scope and boundaries, numbered research questions
195
+ 3. **Methodology**: Search strategy, source evaluation criteria, source distribution by type, limitations
196
+ 4. **Domain Findings**: One major section per synthesis file. Key findings with citations, evidence quality, notable gaps.
197
+ 5. **Cross-Cutting Analysis**: Common themes across domains, contradictions, dependencies, risk assessment matrix
198
+ 6. **Synthesis and Recommendations**: Primary recommendation with rationale, alternatives table, implementation considerations, decision criteria for when to revisit
199
+ 7. **Limitations and Future Research**: Exclusions, thin evidence areas, suggested follow-up research
200
+ 8. **References**: All source URLs organized by domain, with title, access date, type, and relevance rating
201
+
202
+ Quality standards:
203
+ - Every factual claim must have a citation (URL or source reference)
204
+ - Distinguish between established facts, expert opinions, and emerging trends
205
+ - Quantify where possible (costs, latency numbers, adoption percentages)
206
+ - Acknowledge uncertainty explicitly rather than presenting speculation as fact
207
+ - The report should stand alone: a reader with no prior context should understand it fully
208
+
209
+ Style rules:
210
+ - Never use em dashes (use commas, periods, or restructure sentences)
211
+ - Professional, direct tone. No filler or hedging.
212
+ - Tables for all comparative data
213
+ - Use H2 for major sections, H3 for subsections
214
+ - Target length: 2,000-5,000 words depending on complexity
215
+ ```
216
+
217
+ ## Anti-Patterns
218
+
219
+ | Do NOT do this | Do this instead |
220
+ |---------------|-----------------|
221
+ | Read TaskOutput from any agent | Wait for task notifications, then check disk |
222
+ | Paste research into synthesis prompt | Tell synthesis agent which file to Read from disk |
223
+ | Assign 2+ research files to one synthesis agent | Maintain strict 1:1 mapping |
224
+ | Skip the report builder for "simple" research | Always run all 3 stages. The report builder adds cross-cutting analysis. |
225
+ | Read the research or synthesis files yourself | Only read the final report after Step 6 |
226
+ | Launch synthesis before ALL research completes | Wait for every research agent to finish |
227
+ | Launch report builder before ALL synthesis completes | Wait for every synthesis agent to finish |
228
+ | Omit Firecrawl instructions from research agents | Agents do not inherit MCP context. Always spell it out. |
229
+ | Omit source URLs | Every claim must trace to a URL |
package/skills/deep-research/academic-report-template.md ADDED
@@ -0,0 +1,435 @@
1
+ # [TOPIC]
2
+
3
+ **Research Date:** [DATE]
4
+ **Domains Covered:** [DOMAIN_1], [DOMAIN_2], [DOMAIN_3], ... [DOMAIN_N]
5
+ **Total Sources Consulted:** [SOURCE_COUNT]
6
+ **Report Version:** 1.0
7
+
8
+ ---
9
+
10
+ ## Executive Summary
11
+
12
+ <!--
13
+ Write 200-300 words. State the core question, the key findings, the primary recommendation,
14
+ and an honest confidence assessment. This section should stand alone: a reader who only reads
15
+ this section should understand what was researched, what was found, and what to do next.
16
+ -->
17
+
18
+ **Core Question:** [STATE_THE_CENTRAL_RESEARCH_QUESTION]
19
+
20
+ **Key Findings:**
21
+
22
+ - [FINDING_1: One sentence summarizing the most important discovery]
23
+ - [FINDING_2: One sentence on the second most important finding]
24
+ - [FINDING_3: One sentence on the third finding]
25
+ - [FINDING_4: Optional fourth finding]
26
+ - [FINDING_5: Optional fifth finding]
27
+
28
+ **Primary Recommendation:** [ONE_PARAGRAPH_RECOMMENDATION]
29
+
30
+ **Confidence Level:** [HIGH / MODERATE / LOW]
31
+
32
+ <!--
33
+ Confidence criteria:
34
+ - HIGH: Multiple independent sources agree, official documentation confirms, tested/verified examples exist
35
+ - MODERATE: Several sources agree but some gaps remain, limited independent verification
36
+ - LOW: Few sources available, conflicting information, rapidly changing landscape, or heavy reliance on unofficial sources
37
+ -->
38
+
39
+ **Confidence Rationale:** [ONE_SENTENCE_EXPLAINING_WHY_THIS_CONFIDENCE_LEVEL]
40
+
41
+ ---
42
+
43
+ ## 1. Introduction
44
+
45
+ ### 1.1 Problem Statement and Motivation
46
+
47
+ <!--
48
+ Why does this research matter? What decision, project, or initiative prompted it?
49
+ Be specific about the practical context driving this investigation.
50
+ -->
51
+
52
+ [PROBLEM_STATEMENT]
53
+
54
+ ### 1.2 Scope and Boundaries
55
+
56
+ **In scope:**
57
+
58
+ - [INCLUDED_TOPIC_1]
59
+ - [INCLUDED_TOPIC_2]
60
+ - [INCLUDED_TOPIC_3]
61
+
62
+ **Explicitly excluded:**
63
+
64
+ - [EXCLUDED_TOPIC_1: brief reason for exclusion]
65
+ - [EXCLUDED_TOPIC_2: brief reason for exclusion]
66
+
67
+ ### 1.3 Research Questions
68
+
69
+ <!--
70
+ Number each question. These should be specific and answerable, not vague.
71
+ Good: "What is the p95 latency of Provider X's streaming API under concurrent load?"
72
+ Bad: "Is Provider X fast?"
73
+ -->
74
+
75
+ 1. [RESEARCH_QUESTION_1]
76
+ 2. [RESEARCH_QUESTION_2]
77
+ 3. [RESEARCH_QUESTION_3]
78
+ 4. [RESEARCH_QUESTION_N]
79
+
80
+ ---
81
+
82
+ ## 2. Methodology
83
+
84
+ ### 2.1 Search Strategy
85
+
86
+ <!--
87
+ Document how the research was conducted so the reader can assess rigor and reproduce it.
88
+ -->
89
+
90
+ | Parameter | Details |
91
+ |-----------|---------|
92
+ | Search tools | [e.g., Firecrawl MCP search, Firecrawl scrape, GitHub API] |
93
+ | Date range | [e.g., Sources published after January 2025] |
94
+ | Query terms | [List primary search queries used] |
95
+ | Languages | [e.g., English only] |
96
+ | Geographic scope | [e.g., Global, US-focused, etc.] |
97
+
98
+ ### 2.2 Source Evaluation Criteria
99
+
100
+ Sources were evaluated on the following dimensions:
101
+
102
+ | Criterion | Weight | Description |
103
+ |-----------|--------|-------------|
104
+ | Authority | High | Official docs, peer-reviewed papers, recognized experts |
105
+ | Recency | High | Published within [TIME_WINDOW], reflecting current state |
106
+ | Specificity | Medium | Addresses the exact question, not tangential topics |
107
+ | Independence | Medium | Not authored by a vendor about their own product |
108
+ | Reproducibility | Medium | Claims backed by code samples, benchmarks, or verifiable data |
109
+
110
+ ### 2.3 Source Distribution
111
+
112
+ | Source Type | Count | Notes |
113
+ |-------------|-------|-------|
114
+ | Official documentation | [N] | |
115
+ | Academic/research papers | [N] | |
116
+ | Technical blog posts | [N] | |
117
+ | GitHub repositories | [N] | |
118
+ | Conference talks/videos | [N] | |
119
+ | Community discussions | [N] | [e.g., Stack Overflow, Discord, forums] |
120
+ | Vendor/marketing materials | [N] | [treated with appropriate skepticism] |
121
+ | **Total** | **[SOURCE_COUNT]** | |
122
+
123
+ ### 2.4 Limitations of This Research
124
+
125
+ <!--
126
+ Be honest. Every research method has blind spots. Acknowledging them increases credibility.
127
+ -->
128
+
129
+ - [LIMITATION_1: e.g., "No hands-on benchmarking was performed; latency figures are from third-party reports"]
130
+ - [LIMITATION_2: e.g., "Pricing information may be outdated; last verified on [DATE]"]
131
+ - [LIMITATION_3: e.g., "Limited non-English sources consulted"]
132
+
133
+ ---
134
+
135
+ ## 3. Domain Findings
136
+
137
+ <!--
138
+ Create one subsection (3.1, 3.2, etc.) per research domain.
139
+ Each domain was investigated by a separate research agent with a focused scope.
140
+ -->
141
+
142
+ ### 3.1 [DOMAIN_1_TITLE]
143
+
144
+ #### Key Findings
145
+
146
+ <!--
147
+ Each finding should include a citation. Use the format [Source Title](URL) inline,
148
+ or use numbered references like [1] that map to the References section.
149
+ Prefer inline links for readability when the report will be consumed as markdown.
150
+ -->
151
+
152
+ - [FINDING]: [CITATION]
153
+ - [FINDING]: [CITATION]
154
+ - [FINDING]: [CITATION]
155
+
156
+ #### Evidence Quality
157
+
158
+ <!--
159
+ Rate the overall evidence quality for this domain and explain why.
160
+ -->
161
+
162
+ | Rating | Justification |
163
+ |--------|--------------|
164
+ | **[STRONG / MODERATE / WEAK]** | [Brief explanation, e.g., "Multiple official sources and independent benchmarks confirm these findings"] |
165
+
166
+ #### Notable Gaps
167
+
168
+ <!--
169
+ What questions in this domain could not be fully answered? What data was missing?
170
+ -->
171
+
172
+ - [GAP_1]
173
+ - [GAP_2]
174
+
175
+ #### Comparison Table
176
+
177
+ <!--
178
+ Include when the domain involves evaluating multiple options, tools, or approaches.
179
+ Omit this subsection if not applicable.
180
+ -->
181
+
182
+ | Criterion | Option A | Option B | Option C |
183
+ |-----------|----------|----------|----------|
184
+ | [CRITERION_1] | [VALUE] | [VALUE] | [VALUE] |
185
+ | [CRITERION_2] | [VALUE] | [VALUE] | [VALUE] |
186
+ | [CRITERION_3] | [VALUE] | [VALUE] | [VALUE] |
187
+ | **Overall** | [RATING] | [RATING] | [RATING] |
188
+
189
+ ---
190
+
191
+ ### 3.2 [DOMAIN_2_TITLE]
192
+
193
+ #### Key Findings
194
+
195
+ - [FINDING]: [CITATION]
196
+ - [FINDING]: [CITATION]
197
+
198
+ #### Evidence Quality
199
+
200
+ | Rating | Justification |
201
+ |--------|--------------|
202
+ | **[STRONG / MODERATE / WEAK]** | [Explanation] |
203
+
204
+ #### Notable Gaps
205
+
206
+ - [GAP_1]
207
+
208
+ ---
209
+
210
+ ### 3.N [DOMAIN_N_TITLE]
211
+
212
+ <!--
213
+ Repeat the domain section structure for each research domain.
214
+ Typical reports have 3-6 domain sections.
215
+ -->
216
+
217
+ #### Key Findings
218
+
219
+ - [FINDING]: [CITATION]
220
+
221
+ #### Evidence Quality
222
+
223
+ | Rating | Justification |
224
+ |--------|--------------|
225
+ | **[STRONG / MODERATE / WEAK]** | [Explanation] |
226
+
227
+ #### Notable Gaps
228
+
229
+ - [GAP_1]
230
+
231
+ ---
232
+
233
+ ## 4. Cross-Cutting Analysis
234
+
235
+ <!--
236
+ This is where the report goes beyond summarizing individual domains and performs synthesis.
237
+ Look for patterns, contradictions, and interactions across domains.
238
+ -->
239
+
240
+ ### 4.1 Common Themes
241
+
242
+ <!--
243
+ Identify findings or patterns that appeared independently in multiple domains.
244
+ These carry higher confidence because they were corroborated from different angles.
245
+ -->
246
+
247
+ | Theme | Domains Where Observed | Confidence |
248
+ |-------|----------------------|------------|
249
+ | [THEME_1] | [DOMAIN_A], [DOMAIN_B] | [HIGH/MODERATE/LOW] |
250
+ | [THEME_2] | [DOMAIN_A], [DOMAIN_C] | [HIGH/MODERATE/LOW] |
251
+
252
+ ### 4.2 Contradictions and Tensions
253
+
254
+ <!--
255
+ Where did different domains or sources disagree? Do not paper over disagreements.
256
+ Present both sides and, if possible, explain why they diverge.
257
+ -->
258
+
259
+ - **[TENSION_1]:** [Domain X suggests A, while Domain Y suggests B. This may be because...]
260
+ - **[TENSION_2]:** [Description of the contradiction and possible explanations]
261
+
262
+ ### 4.3 Dependencies and Interactions
263
+
264
+ <!--
265
+ How do decisions in one domain constrain or enable decisions in another?
266
+ Example: "Choosing Provider X for the API layer constrains the authentication options to OAuth 2.0 only."
267
+ -->
268
+
269
+ - [DEPENDENCY_1]
270
+ - [DEPENDENCY_2]
271
+
272
+ ### 4.4 Risk Assessment
273
+
274
+ <!--
275
+ Identify the key risks surfaced by the research. Rate each on likelihood and impact.
276
+ -->
277
+
278
+ | Risk | Likelihood | Impact | Mitigation |
279
+ |------|-----------|--------|------------|
280
+ | [RISK_1] | [HIGH/MED/LOW] | [HIGH/MED/LOW] | [Brief mitigation strategy] |
281
+ | [RISK_2] | [HIGH/MED/LOW] | [HIGH/MED/LOW] | [Brief mitigation strategy] |
282
+ | [RISK_3] | [HIGH/MED/LOW] | [HIGH/MED/LOW] | [Brief mitigation strategy] |
283
+
284
+ ---
285
+
286
+ ## 5. Synthesis and Recommendations
287
+
288
+ ### 5.1 Primary Recommendation
289
+
290
+ <!--
291
+ State the recommendation clearly and then explain why.
292
+ The rationale should reference specific findings from the domain sections.
293
+ A reader should be able to trace the recommendation back to evidence.
294
+ -->
295
+
296
+ **Recommendation:** [CLEAR_STATEMENT_OF_WHAT_TO_DO]
297
+
298
+ **Rationale:** [2-3 paragraphs explaining why, referencing specific findings from Sections 3 and 4]
299
+
300
+ ### 5.2 Alternative Approaches
301
+
302
+ <!--
303
+ Present the alternatives that were considered. Be fair to each.
304
+ The reader may have different constraints than those assumed in the primary recommendation.
305
+ -->
306
+
307
+ | Approach | Strengths | Weaknesses | Best When |
308
+ |----------|-----------|------------|-----------|
309
+ | [PRIMARY: recommended] | [STRENGTHS] | [WEAKNESSES] | [CONDITIONS] |
310
+ | [ALTERNATIVE_1] | [STRENGTHS] | [WEAKNESSES] | [CONDITIONS] |
311
+ | [ALTERNATIVE_2] | [STRENGTHS] | [WEAKNESSES] | [CONDITIONS] |
312
+
313
+ ### 5.3 Implementation Considerations
314
+
315
+ <!--
316
+ Practical notes for acting on the recommendation. Not a full implementation plan,
317
+ but enough to understand the effort, prerequisites, and sequencing.
318
+ -->
319
+
320
+ - **Prerequisites:** [What must be in place before starting]
321
+ - **Estimated effort:** [Rough sizing, e.g., "2-3 sprint cycles for a team of 2"]
322
+ - **Key decisions to make first:** [Decisions that block implementation]
323
+ - **Suggested sequencing:** [What to do first, second, third]
324
+
325
+ ### 5.4 Decision Criteria
326
+
327
+ <!--
328
+ Under what circumstances would the recommendation change? This is critical for making
329
+ the report durable. If conditions shift, the reader knows when to re-evaluate.
330
+ -->
331
+
332
+ The primary recommendation should be revisited if any of the following occur:
333
+
334
+ - [CONDITION_1: e.g., "Provider X raises pricing above $Y/month"]
335
+ - [CONDITION_2: e.g., "A competing solution achieves feature parity with lower latency"]
336
+ - [CONDITION_3: e.g., "Requirements change to include [SPECIFIC_CAPABILITY]"]
337
+
338
+ ---
339
+
340
+ ## 6. Limitations and Future Research
341
+
342
+ ### 6.1 What This Report Does Not Cover
343
+
344
+ <!--
345
+ Restate scope exclusions and add any areas that turned out to be relevant
346
+ but could not be adequately researched within the current scope.
347
+ -->
348
+
349
+ - [EXCLUSION_1]
350
+ - [EXCLUSION_2]
351
+
352
+ ### 6.2 Areas of Thin or Conflicting Evidence
353
+
354
+ <!--
355
+ Where was the evidence insufficient to draw strong conclusions?
356
+ Be specific about what data would resolve the uncertainty.
357
+ -->
358
+
359
+ - **[AREA_1]:** [What is uncertain and what evidence would help]
360
+ - **[AREA_2]:** [What is uncertain and what evidence would help]
361
+
362
+ ### 6.3 Suggested Follow-Up Research
363
+
364
+ <!--
365
+ Concrete next steps for research, not vague suggestions.
366
+ Include specific questions and, where possible, suggested approaches.
367
+ -->
368
+
369
+ 1. **[FOLLOW_UP_1]:** [Specific question to answer, suggested method]
370
+ 2. **[FOLLOW_UP_2]:** [Specific question to answer, suggested method]
371
+ 3. **[FOLLOW_UP_3]:** [Specific question to answer, suggested method]
372
+
373
+ ---
374
+
375
+ ## 7. References
376
+
377
+ <!--
378
+ Organize references by domain for easy navigation.
379
+ Every claim in the report should trace back to a reference here.
380
+
381
+ Reference format:
382
+ - [N] **[Source Title]** | [URL] | Accessed [DATE] | Type: [official docs / paper / blog / repo / discussion / vendor] | Relevance: [HIGH/MEDIUM/LOW]
383
+ -->
384
+
385
+ ### [DOMAIN_1_TITLE]
386
+
387
+ - [1] **[Source Title]** | [URL] | Accessed [DATE] | Type: [SOURCE_TYPE] | Relevance: [HIGH/MEDIUM/LOW]
388
+ - [2] **[Source Title]** | [URL] | Accessed [DATE] | Type: [SOURCE_TYPE] | Relevance: [HIGH/MEDIUM/LOW]
389
+
390
+ ### [DOMAIN_2_TITLE]
391
+
392
+ - [3] **[Source Title]** | [URL] | Accessed [DATE] | Type: [SOURCE_TYPE] | Relevance: [HIGH/MEDIUM/LOW]
393
+ - [4] **[Source Title]** | [URL] | Accessed [DATE] | Type: [SOURCE_TYPE] | Relevance: [HIGH/MEDIUM/LOW]
394
+
395
+ ### [DOMAIN_N_TITLE]
396
+
397
+ - [N] **[Source Title]** | [URL] | Accessed [DATE] | Type: [SOURCE_TYPE] | Relevance: [HIGH/MEDIUM/LOW]
398
+
399
+ ---
400
+
401
+ ## Appendices
402
+
403
+ <!--
404
+ Include appendices only when there is substantial supplementary data that would
405
+ disrupt the flow of the main report. Each appendix should be self-contained.
406
+ Omit this entire section if no appendices are needed.
407
+ -->
408
+
409
+ ### Appendix A: [TITLE]
410
+
411
+ <!-- Example: Detailed Feature Comparison Matrix -->
412
+
413
+ | Feature | Option A | Option B | Option C | Option D |
414
+ |---------|----------|----------|----------|----------|
415
+ | [FEATURE_1] | [DETAIL] | [DETAIL] | [DETAIL] | [DETAIL] |
416
+ | [FEATURE_2] | [DETAIL] | [DETAIL] | [DETAIL] | [DETAIL] |
417
+
418
+ ### Appendix B: [TITLE]
419
+
420
+ <!-- Example: Pricing and Cost Analysis -->
421
+
422
+ | Tier | Monthly Cost | Included Usage | Overage Rate | Notes |
423
+ |------|-------------|---------------|-------------|-------|
424
+ | [TIER_1] | [COST] | [USAGE] | [RATE] | [NOTES] |
425
+ | [TIER_2] | [COST] | [USAGE] | [RATE] | [NOTES] |
426
+
427
+ ### Appendix C: [TITLE]
428
+
429
+ <!-- Example: Technical Specifications or Configuration Details -->
430
+
431
+ [SUPPLEMENTARY_CONTENT]
432
+
433
+ ---
434
+
435
+ *Report generated via Deep Research Protocol. [AGENT_COUNT] research agents across [DOMAIN_COUNT] domains, synthesized into a unified analysis.*
package/skills/deep-research/firecrawl-reference.md ADDED
@@ -0,0 +1,220 @@
1
+ # Firecrawl MCP Reference
2
+
3
+ ## Layer 1: Quick Decision Table
4
+
5
+ Use this to pick the right Firecrawl tool instantly.
6
+
7
+ | I need to... | Tool | Key params | Credits |
8
+ |---|---|---|---|
9
+ | Search the web for info | `mcp__firecrawl-mcp__firecrawl_search` | `query`, `limit` | 2 per 10 results |
10
+ | Get content from a known URL | `mcp__firecrawl-mcp__firecrawl_scrape` | `url`, `formats: ["markdown"]` | 1 per page |
11
+ | Scrape multiple known URLs | `mcp__firecrawl-mcp__firecrawl_batch_scrape` | `urls`, `options` | 1 per page |
12
+ | Discover all URLs on a site | `mcp__firecrawl-mcp__firecrawl_map` | `url`, `search` | 1 |
13
+ | Crawl a site section | `mcp__firecrawl-mcp__firecrawl_crawl` | `url`, `limit`, `maxDiscoveryDepth` | 1 per page |
14
+ | Extract structured JSON from pages | `mcp__firecrawl-mcp__firecrawl_extract` | `urls`, `prompt`, `schema` | varies |
15
+
16
+ **Default for web search**: Always use `firecrawl_search` instead of `WebSearch`.
17
+ **Default for reading a URL**: Always use `firecrawl_scrape` instead of `WebFetch`.
18
+
19
+ ---
20
+
21
+ ## Layer 2: Common Patterns
22
+
23
+ ### Web Search (most common)
24
+ ```json
25
+ {
26
+ "query": "your search terms",
27
+ "limit": 5,
28
+ "sources": [{"type": "web"}]
29
+ }
30
+ ```
31
+
32
+ To also scrape the result pages:
33
+ ```json
34
+ {
35
+ "query": "your search terms",
36
+ "limit": 3,
37
+ "scrapeOptions": {
38
+ "formats": ["markdown"],
39
+ "onlyMainContent": true
40
+ }
41
+ }
42
+ ```
43
+
44
+ **Search sources**: `web` (default), `news`, `images`. Limit applies per source type.
45
+
46
+ **Search categories** (filter by site type):
47
+ - `github`: GitHub repos, code, issues
48
+ - `research`: arXiv, Nature, IEEE, PubMed
49
+ - `pdf`: PDF documents
50
+
51
+ **Time filters** (via `tbs` param):
52
+ - `qdr:h` past hour, `qdr:d` past 24h, `qdr:w` past week, `qdr:m` past month, `qdr:y` past year
53
+
54
+ ### Single Page Scrape
55
+ ```json
56
+ {
57
+ "url": "https://example.com",
58
+ "formats": ["markdown"],
59
+ "onlyMainContent": true,
60
+ "maxAge": 172800000
61
+ }
62
+ ```
63
+
64
+ **Performance tip**: `maxAge: 172800000` (2-day cache) makes scrapes up to 5x faster. Use `maxAge: 0` for fresh content.
65
+
66
+ ### Site Discovery (Map then Scrape)
67
+ ```json
68
+ {
69
+ "url": "https://docs.example.com",
70
+ "search": "api reference",
71
+ "limit": 50
72
+ }
73
+ ```
74
+ Use `map` to find URLs, then `scrape` or `batch_scrape` to get content.
75
+
76
+ ### Structured Data Extraction
77
+ ```json
78
+ {
79
+ "urls": ["https://example.com/pricing"],
80
+ "prompt": "Extract all pricing plans with name, price, and features",
81
+ "schema": {
82
+ "type": "object",
83
+ "properties": {
84
+ "plans": {
85
+ "type": "array",
86
+ "items": {
87
+ "type": "object",
88
+ "properties": {
89
+ "name": {"type": "string"},
90
+ "price": {"type": "number"},
91
+ "features": {"type": "array", "items": {"type": "string"}}
92
+ }
93
+ }
94
+ }
95
+ }
96
+ }
97
+ }
98
+ ```
99
+
100
+ ---
101
+
102
+ ## Layer 3: Full API Details
103
+
104
+ ### Scrape Formats
105
+ | Format | Description |
106
+ |--------|-------------|
107
+ | `markdown` | Clean markdown, ideal for LLM consumption |
108
+ | `html` | Parsed HTML |
109
+ | `rawHtml` | Unmodified HTML |
110
+ | `screenshot` | Page screenshot (supports `fullPage`, `quality`, `viewport`) |
111
+ | `links` | All links on the page |
112
+ | `summary` | AI-generated summary |
113
+ | `json` | Structured extraction with schema/prompt |
114
+ | `branding` | Brand identity extraction (colors, fonts, typography) |
115
+ | `changeTracking` | Diff against previous scrape |
116
+
117
+ ### Scrape Actions (interact before scraping)
118
+ Available action types: `wait`, `click`, `screenshot`, `write`, `press`, `scroll`, `scrape`, `executeJavascript`, `generatePDF`
119
+
120
+ Example: Login then scrape
121
+ ```json
122
+ {
123
+ "url": "https://example.com/login",
124
+ "formats": ["markdown"],
125
+ "actions": [
126
+ {"type": "write", "text": "user@example.com"},
127
+ {"type": "press", "key": "Tab"},
128
+ {"type": "write", "text": "password"},
129
+ {"type": "click", "selector": "button[type='submit']"},
130
+ {"type": "wait", "milliseconds": 1500}
131
+ ]
132
+ }
133
+ ```
134
+
135
+ ### Location and Language
136
+ ```json
137
+ {
138
+ "url": "https://example.com",
139
+ "location": {
140
+ "country": "US",
141
+ "languages": ["en"]
142
+ }
143
+ }
144
+ ```
145
+
146
+ ### Crawl Options
147
+ | Param | Description | Default |
148
+ |-------|-------------|---------|
149
+ | `maxDiscoveryDepth` | How deep to follow links | - |
150
+ | `limit` | Max pages to crawl | - |
151
+ | `allowExternalLinks` | Follow links to other domains | false |
152
+ | `deduplicateSimilarURLs` | Skip near-duplicate URLs | false |
153
+ | `includePaths` | Only crawl URLs matching these patterns | - |
154
+ | `excludePaths` | Skip URLs matching these patterns | - |
155
+
156
+ **Warning**: Crawl responses can be very large. Use `map` + `batch_scrape` for better control.
157
+
158
+ ### Map Options
159
+ | Param | Description |
160
+ |-------|-------------|
161
+ | `search` | Filter URLs by search term |
162
+ | `sitemap` | `include`, `skip`, or `only` |
163
+ | `includeSubdomains` | Include subdomains |
164
+ | `limit` | Max URLs to return (up to 100k) |
165
+ | `ignoreQueryParameters` | Deduplicate by path |
166
+
167
+ ### Cost Reference
168
+ | Operation | Credits |
169
+ |-----------|---------|
170
+ | Search (10 results) | 2 |
171
+ | Basic scrape | 1 per page |
172
+ | PDF parsing | 1 per PDF page |
173
+ | Enhanced proxy | +4 per page |
174
+ | JSON mode (structured extraction) | +4 per page |
175
+ | Map | 1 |
176
+
177
+ ### Caching
178
+ - Default `maxAge`: 172800000ms (2 days)
179
+ - Set `maxAge: 0` for always-fresh
180
+ - Set `storeInCache: false` to skip caching
181
+ - `changeTracking` format bypasses cache
182
+
183
+ ### Rate Limits and Retries
184
+ - Built-in exponential backoff on rate limits
185
+ - Configurable via env vars:
186
+ - `FIRECRAWL_RETRY_MAX_ATTEMPTS` (default: 3)
187
+ - `FIRECRAWL_RETRY_INITIAL_DELAY` (default: 1000ms)
188
+ - `FIRECRAWL_RETRY_MAX_DELAY` (default: 10000ms)
189
+ - `FIRECRAWL_RETRY_BACKOFF_FACTOR` (default: 2)
190
+
191
+ ### Agent Feature (New)
192
+ Autonomous web data gathering. Describe what you need, it searches, navigates, and extracts.
193
+ ```json
194
+ {
195
+ "prompt": "Find the pricing plans for Notion"
196
+ }
197
+ ```
198
+ Not available via MCP tools yet, API-only at `POST /v2/agent`.
199
+
200
+ ### Async Operations
201
+ - `crawl` and `batch_scrape` are async. They return an operation ID.
202
+ - Check status with `firecrawl_check_crawl_status` or `firecrawl_check_batch_status`.
203
+ - Results expire after 24 hours.
204
+
205
+ ---
206
+
207
+ ## When NOT to Use Firecrawl
208
+
209
+ | Situation | Use Instead |
210
+ |-----------|-------------|
211
+ | Searching local files | `Grep`, `Glob` |
212
+ | Reading a file on disk | `Read` |
213
+ | GitHub-specific operations (PRs, issues) | `gh` CLI or GitHub MCP tools |
214
+ | Authenticated service (Google Docs, Jira) | Service-specific MCP tool |
215
+
216
+ ## Setup
217
+ - Package: `firecrawl-mcp` (npm, stdio transport)
218
+ - Config: `~/.claude.json` under mcpServers
219
+ - API key: `FIRECRAWL_API_KEY` env var
220
+ - Docs: https://docs.firecrawl.dev