deep-research-cc 1.0.0
- package/LICENSE +21 -0
- package/README.md +299 -0
- package/VERSION +1 -0
- package/bin/install.js +119 -0
- package/package.json +28 -0
- package/skills/deep-research/SKILL.md +229 -0
- package/skills/deep-research/academic-report-template.md +435 -0
- package/skills/deep-research/firecrawl-reference.md +220 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 desland01

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,299 @@
<div align="center">

# Deep Research

**Academic-grade multi-agent research pipeline for Claude Code.**

[](VERSION)
[](LICENSE)

<br>

```bash
npx deep-research-cc
```

**3-stage pipeline: Research Agents -> Synthesis Agents -> Report Builder**

<br>

*"Point it at a question, walk away, come back to an academic-grade research report with citations."*

<br>

[How It Works](#how-it-works) · [Prerequisites](#prerequisites) · [Installation](#installation) · [Usage](#usage) · [Architecture](#architecture)

</div>

---

## The Problem

Manual research with AI tools is shallow. You search, read, forget context, search again. Sources get lost. Analysis stays surface-level. And if your context window fills up, everything collapses.

**Deep Research fixes that.** It decomposes your question into independent research domains, dispatches parallel agents to investigate each one, synthesizes findings through dedicated analysis agents, and produces a comprehensive report with full citations. All three stages write to disk, so nothing is lost to context limits.

---

## How It Works

### Stage 1: Research Agents (parallel)

You provide a topic. The pipeline decomposes it into 3-6 independent research domains and launches parallel agents, each investigating one domain using Firecrawl MCP for web search and scraping.

Each agent writes a raw research document to `docs/plans/YYYY-MM-DD-{domain}-research.md`.

### Stage 2: Synthesis Agents (parallel)

After all research agents complete, synthesis agents launch in parallel. Each synthesis agent reads exactly ONE research file (strict 1:1 mapping) and produces a distilled domain summary with evidence quality assessment, gap analysis, and organized findings.

Each agent writes to `docs/plans/YYYY-MM-DD-{domain}-synthesis.md`.

### Stage 3: Report Builder (single agent)

A single report builder agent reads ALL synthesis files and produces a comprehensive, externally shareable research report following an academic template.

Final report: `docs/plans/YYYY-MM-DD-{topic}-report.md`

---

## What You Get

A structured research report with:

- **Executive Summary** with confidence assessment
- **Methodology** documenting search strategy and source evaluation
- **Domain Findings** with citations and evidence quality ratings
- **Cross-Cutting Analysis** identifying themes, contradictions, and risks
- **Synthesis and Recommendations** with alternatives table
- **Limitations** and suggested follow-up research
- **Full References** organized by domain with source type and relevance ratings

Every factual claim traces back to a cited source. The report stands alone: someone with no prior context can read and learn from it.

---

## Prerequisites

| Requirement | Details |
|-------------|---------|
| **Claude Code** | Anthropic's CLI for Claude |
| **Node.js** | v18+ (for the npx installer) |
| **Firecrawl MCP** | Paid API key from [firecrawl.dev](https://firecrawl.dev) |

> **Note:** Firecrawl is a paid service that provides web search and scraping capabilities. Deep Research requires it for all web-based research. Visit [firecrawl.dev/pricing](https://firecrawl.dev/pricing) for current plans.

---

## Installation

### Quick Install (recommended)

```bash
npx deep-research-cc
```

The installer copies the skill files into `~/.claude/` and backs up any existing files.

<details>
<summary><strong>Non-interactive install (Docker, CI, scripts)</strong></summary>

```bash
npx deep-research-cc --global   # Install to ~/.claude/
npx deep-research-cc --local    # Install to ./.claude/
```

Use `--global` (`-g`) or `--local` (`-l`) to skip the interactive prompt.

</details>

<details>
<summary><strong>Development installation</strong></summary>

Clone the repository and use symlinks for live editing:

```bash
git clone https://github.com/desland01/deep-research.git ~/deep-research
cd ~/deep-research
chmod +x install.sh
./install.sh
```

Changes in `~/deep-research/` are live immediately via symlinks.

</details>

---

## Firecrawl MCP Setup

Deep Research uses Firecrawl MCP for all web search and scraping. You must configure it before using the skill.

### Step 1: Get an API Key

Sign up at [firecrawl.dev](https://firecrawl.dev) and get your API key from the dashboard.

### Step 2: Add Firecrawl MCP to Claude Code

```bash
claude mcp add firecrawl-mcp -e FIRECRAWL_API_KEY=your-key-here -- npx -y firecrawl-mcp
```

### Step 3: Restart Claude Code

Restart Claude Code to load both the new skill and the Firecrawl MCP server.

### Verify Setup

Start Claude Code and type `/deep-research`. If the skill loads, you're ready.

---

## Usage

In any Claude Code session:

```
/deep-research What are the best approaches to real-time voice AI for mobile apps?
```

The pipeline will:

1. Decompose your question into 3-6 research domains
2. Launch parallel research agents (each writes to disk)
3. Launch parallel synthesis agents (each reads one research file, writes a synthesis)
4. Launch a report builder (reads all syntheses, writes the final report)
5. Report the file location and top-level findings

### Example Output Files

For the research topic "voice AI providers for React Native":

```
docs/plans/
  2026-02-16-voice-ai-landscape-research.md
  2026-02-16-voice-ai-pricing-research.md
  2026-02-16-voice-ai-react-native-research.md
  2026-02-16-voice-ai-latency-research.md
  2026-02-16-voice-ai-privacy-research.md
  2026-02-16-voice-ai-landscape-synthesis.md
  2026-02-16-voice-ai-pricing-synthesis.md
  2026-02-16-voice-ai-react-native-synthesis.md
  2026-02-16-voice-ai-latency-synthesis.md
  2026-02-16-voice-ai-privacy-synthesis.md
  2026-02-16-voice-ai-providers-report.md   <- Final report
```

---

## Architecture

```
 /deep-research [topic]
           |
           v
 Decompose into 3-6 domains
           |
           v
 ┌────┬────┬────┬────┬────┬────┐
  R1   R2   R3   R4   R5   R6      <- Research agents (parallel, background)
  |    |    |    |    |    |          Each writes *-research.md to disk
 └────┴────┴────┴────┴────┴────┘
           |
           v  (wait for ALL to complete)
           |
 ┌────┬────┬────┬────┬────┬────┐
  S1   S2   S3   S4   S5   S6      <- Synthesis agents (parallel, background)
  |    |    |    |    |    |          1:1 with research agents
 └────┴────┴────┴────┴────┴────┘      Each reads one *-research.md, writes *-synthesis.md
           |
           v  (wait for ALL to complete)
           |
          [RB]                     <- Report builder (single agent, background)
           |                          Reads ALL *-synthesis.md files
           v                          Writes final *-report.md
      Final Report
```

### Agent Limits

| Constraint | Value |
|-----------|-------|
| Max parallel research agents | 6 |
| Max parallel synthesis agents | 6 (1:1 with research) |
| Report builder agents | 1 (always single) |
| Parent reads raw output | Never |
| Parent reads final report | Yes (after verification) |

### Why Disk-Based Handoff?

Raw research agent output can exceed 600K tokens. If the parent agent reads it via TaskOutput, context gets crushed and downstream agents never launch. Writing to disk files keeps each stage isolated and the parent context clean.

### Anti-Patterns

| Don't | Do Instead |
|-------|-----------|
| Read TaskOutput from research/synthesis agents | Wait for completion, check files on disk |
| Assign 2+ research files to one synthesis agent | Strict 1:1 mapping |
| Skip the report builder for "simple" research | Always run all 3 stages |
| Omit Firecrawl instructions from research agents | Agents don't inherit MCP context |
| Omit source URLs | Every claim must trace to a URL |

---

## Troubleshooting

**Skill not found after install?**
- Restart Claude Code to reload skills
- Verify files exist: `ls ~/.claude/skills/deep-research/`

**Research agents failing?**
- Check that Firecrawl MCP is configured: look for `firecrawl-mcp` in your MCP server list
- Verify your API key is valid at [firecrawl.dev/dashboard](https://firecrawl.dev)
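
One quick check from the terminal, assuming the `claude` CLI is on your PATH:

```bash
claude mcp list   # firecrawl-mcp should appear in the output
```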

**Report missing sections?**
- Check that all research and synthesis files were created in `docs/plans/`
- The report builder only runs after ALL synthesis agents complete

**Context getting crushed?**
- This usually means the parent is reading raw agent output. The skill prevents this by design, but if you've modified the protocol, ensure the parent never reads research or synthesis files directly.

---

## Directory Structure

```
deep-research/
├── README.md
├── VERSION
├── LICENSE
├── .gitignore
├── package.json
├── install.sh                          # Dev install (symlinks)
├── bin/
│   └── install.js                      # npx entry point (copy-based)
└── skills/
    └── deep-research/
        ├── SKILL.md                    # 3-stage protocol
        ├── academic-report-template.md # Report format template
        └── firecrawl-reference.md      # Firecrawl MCP tool reference
```

---

## Contributing

1. Clone the repo and run `./install.sh`
2. Edit files in `~/deep-research/` (changes are live via symlinks)
3. Test with `/deep-research` in Claude Code
4. For protocol changes, verify all 3 stages produce expected output files

---

<div align="center">

**Claude Code is powerful. Deep Research makes it thorough.**

*Academic-grade research pipelines, so you can decide with confidence.*

</div>
package/VERSION
ADDED
@@ -0,0 +1 @@
1.0.0
package/bin/install.js
ADDED
@@ -0,0 +1,119 @@
#!/usr/bin/env node

const fs = require("fs");
const path = require("path");
const os = require("os");
const readline = require("readline");

const VERSION = fs
  .readFileSync(path.join(__dirname, "..", "VERSION"), "utf8")
  .trim();
const PKG_ROOT = path.join(__dirname, "..");
const HOME = os.homedir();

const TARGETS = {
  global: path.join(HOME, ".claude"),
  local: path.join(process.cwd(), ".claude"),
};

const DIRS_TO_COPY = ["skills/deep-research"];

// -- Helpers ----------------------------------------------------------------

function copyDirSync(src, dest) {
  fs.mkdirSync(dest, { recursive: true });
  for (const entry of fs.readdirSync(src, { withFileTypes: true })) {
    const srcPath = path.join(src, entry.name);
    const destPath = path.join(dest, entry.name);
    if (entry.isDirectory()) {
      copyDirSync(srcPath, destPath);
    } else {
      fs.copyFileSync(srcPath, destPath);
    }
  }
}

function backupIfExists(target) {
  if (fs.existsSync(target)) {
    const timestamp = new Date()
      .toISOString()
      .replace(/[-:T]/g, "")
      .slice(0, 14);
    const backup = `${target}.backup.${timestamp}`;
    console.log(`  Backing up existing -> ${path.basename(backup)}`);
    fs.renameSync(target, backup);
  }
}

function install(claudeDir) {
  console.log(`\nInstalling to ${claudeDir}/\n`);

  for (const rel of DIRS_TO_COPY) {
    const src = path.join(PKG_ROOT, rel);
    const dest = path.join(claudeDir, rel);

    // Ensure parent dir exists
    fs.mkdirSync(path.dirname(dest), { recursive: true });

    backupIfExists(dest);
    copyDirSync(src, dest);
    console.log(`  Copied: ${rel}`);
  }

  console.log(`
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Install complete.

Skill installed:
  /deep-research [topic]   Run 3-stage academic research pipeline

Firecrawl MCP Setup (required):
  Deep Research uses Firecrawl for web search and scraping.
  You need a Firecrawl API key from https://firecrawl.dev

  Add Firecrawl MCP to your project:
    claude mcp add firecrawl-mcp -e FIRECRAWL_API_KEY=your-key -- npx -y firecrawl-mcp

  Then restart Claude Code to load the new skill and MCP server.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━`);
}

// -- Main -------------------------------------------------------------------

console.log(`
Deep Research Installer v${VERSION}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━`);

const args = process.argv.slice(2);

if (args.includes("--global") || args.includes("-g")) {
  install(TARGETS.global);
} else if (args.includes("--local") || args.includes("-l")) {
  install(TARGETS.local);
} else {
  // Interactive prompt
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });

  console.log(`
Where would you like to install?

  1) Global (~/.claude/)   Available in all projects
  2) Local  (./.claude/)   This project only
`);

  rl.question("Choose [1/2]: ", (answer) => {
    rl.close();
    const choice = answer.trim();
    if (choice === "1" || choice.toLowerCase() === "global") {
      install(TARGETS.global);
    } else if (choice === "2" || choice.toLowerCase() === "local") {
      install(TARGETS.local);
    } else {
      console.log("Invalid choice. Use --global or --local flag, or enter 1 or 2.");
      process.exit(1);
    }
  });
}
package/package.json
ADDED
@@ -0,0 +1,28 @@
{
  "name": "deep-research-cc",
  "version": "1.0.0",
  "description": "Academic-grade multi-agent research pipeline for Claude Code — 3-stage deep research that writes itself.",
  "bin": {
    "deep-research-cc": "bin/install.js"
  },
  "files": [
    "bin/",
    "skills/",
    "VERSION"
  ],
  "keywords": [
    "claude",
    "claude-code",
    "research",
    "deep-research",
    "firecrawl",
    "academic",
    "multi-agent"
  ],
  "author": "desland01",
  "license": "MIT",
  "repository": {
    "type": "git",
    "url": "git+https://github.com/desland01/deep-research.git"
  }
}
package/skills/deep-research/SKILL.md
ADDED
@@ -0,0 +1,229 @@
---
name: deep-research
description: "Use when evaluating technologies, making architectural decisions, comparing options across multiple dimensions, or any research task requiring 5+ web searches across multiple domains. Triggers on phrases like 'research this', 'figure out the best way', 'compare options for', or 'what are our choices'. NOT for simple factual lookups, codebase exploration, or tasks where you already have enough context."
argument-hint: [topic or question]
---

# Deep Research Protocol (3-Stage Pipeline)

Autonomous multi-agent research workflow producing academic-level reports. Three stages: Research, Synthesis, Report. Each stage writes to disk. The parent never touches raw output from any stage.

## Why This Exists

Raw research agent output can exceed 600K tokens. If the parent reads it (via TaskOutput or inline), context gets crushed and downstream agents never launch. This protocol keeps the parent context clean by using disk files as the handoff layer between all three stages.

## The Rule

**Parent (you) must NEVER:**
- Read TaskOutput from research agents or synthesis agents
- Paste findings from any agent into another agent's prompt
- Read raw JSONL output files
- Read research or synthesis markdown files directly

**Parent (you) CAN:**
- Check completion status via task notifications (automatic)
- Read the final report on disk after Stage 3 completes
- Verify files exist using Glob
- Grep for structural markers (headers, citations) without reading full content

## Protocol

### Step 1: Decompose

Split the research question into 3-6 independent domains. Each domain becomes one research agent. Domains should be non-overlapping but collectively exhaustive.

Example decomposition for "Best voice AI provider for our app":
- Domain 1: Provider landscape and feature comparison
- Domain 2: Pricing models and cost projections
- Domain 3: React Native integration complexity
- Domain 4: Latency, reliability, and production readiness
- Domain 5: Privacy, compliance, and data handling

### Step 2: Dispatch Research Agents (parallel, background)

Launch all research agents in parallel with `run_in_background: true` (see the sketch after this list).

Each research agent writes to: `docs/plans/YYYY-MM-DD-{domain-slug}-research.md`

Use the Research Agent Prompt Template below. Every prompt MUST include:
- Numbered questions to answer (3-5 per agent)
- Firecrawl MCP tool instructions (agents do not inherit MCP context)
- Suggested search queries (3-5 per agent)
- The exact output file path
- Instructions to include source URLs for every claim
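
A minimal sketch of one dispatch call. The `run_in_background` flag comes from this protocol; the other Task tool parameter names shown here are assumptions and may differ across Claude Code versions:

```
Task(
  description: "Research: pricing models and cost projections",   # assumed param
  subagent_type: "general-purpose",                                # assumed param
  run_in_background: true,
  prompt: <Research Agent Prompt Template below, with the domain,
           questions, queries, and output path filled in>
)
```

Issue one such call per domain in a single turn so the agents actually run in parallel.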

### Step 3: Wait for Research Completion

Do nothing. Task notifications arrive automatically when agents complete. Do NOT poll, do NOT read output files. Wait until all research agents report completion.

### Step 4: Dispatch Synthesis Agents (parallel, background)

After ALL research agents complete, launch synthesis agents with `run_in_background: true`.

**One synthesis agent per research agent.** This is a strict 1:1 mapping. Each synthesis agent reads exactly ONE research file and produces ONE synthesis file. Never assign multiple research files to a single synthesis agent.

Each synthesis agent reads from: `docs/plans/YYYY-MM-DD-{domain-slug}-research.md`
Each synthesis agent writes to: `docs/plans/YYYY-MM-DD-{domain-slug}-synthesis.md`

Use the Synthesis Agent Prompt Template below. Every prompt MUST include:
- The exact research file path to READ
- The exact synthesis file path to WRITE
- Instructions to distill, organize, cite, and assess source quality
- Instructions to flag gaps and contradictions

**Why 1:1?** Each domain deserves dedicated attention. Combining domains in synthesis loses nuance and produces shallow analysis.

### Step 5: Dispatch Report Builder (single agent, background)

After ALL synthesis agents complete, launch ONE report builder agent with `run_in_background: true`.

The report builder reads ALL synthesis files and produces a single comprehensive report.

Report builder reads from: all `docs/plans/YYYY-MM-DD-*-synthesis.md` files
Report builder writes to: `docs/plans/YYYY-MM-DD-{overall-topic}-report.md`
Report builder references: `~/.claude/skills/deep-research/academic-report-template.md`

Use the Report Builder Prompt Template below. The report builder focuses on cross-cutting analysis, theme identification, and actionable recommendations. It does NOT re-analyze raw research. It works exclusively from the synthesized domain summaries.

### Step 6: Verify and Report

After the report builder completes (see the sketch after this list for shell equivalents):

1. Verify deliverables exist on disk (Glob for the report file)
2. Grep for structural markers: `## Executive Summary`, `## References`, `## 2. Methodology`
3. Grep for citation presence (source URLs in the References section)
4. Report to user: what was researched, where the report lives, key top-level findings (2-3 sentences max)

The user reads the report on disk. Do NOT paste report content into the chat.
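
A sketch of the Step 6 checks, written as shell equivalents of the Glob/Grep tool calls; the report filename is the example from this package's README:

```bash
# 1. The report file exists
ls docs/plans/2026-02-16-voice-ai-providers-report.md

# 2. Structural markers are present
grep -n "## Executive Summary" docs/plans/2026-02-16-voice-ai-providers-report.md
grep -n "## 2. Methodology" docs/plans/2026-02-16-voice-ai-providers-report.md

# 3. Citations are present (source URLs in the References section)
grep -c "https://" docs/plans/2026-02-16-voice-ai-providers-report.md
```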

## Agent Limits

| Constraint | Value |
|-----------|-------|
| Max parallel research agents | 6 |
| Max parallel synthesis agents | 6 (1:1 with research agents) |
| Report builder agents | 1 (always single) |
| Research agent output | `docs/plans/YYYY-MM-DD-{domain}-research.md` |
| Synthesis agent output | `docs/plans/YYYY-MM-DD-{domain}-synthesis.md` |
| Report builder output | `docs/plans/YYYY-MM-DD-{topic}-report.md` |
| Parent reads raw output | NEVER (research or synthesis) |
| Parent reads final report | YES (after Step 6 verification) |

## Key Principles

| Principle | Rationale |
|-----------|-----------|
| Disk files as handoff layer | Prevents context flooding between stages |
| 1:1 synthesis mapping | Each domain gets dedicated analytical attention |
| Report builder works from syntheses only | Cross-cutting analysis, not domain re-reading |
| Academic rigor | Every claim needs a citation, methodology documented, limitations acknowledged |
| Background execution for all agents | Parent stays responsive, agents run in parallel |
| Firecrawl MCP for web research | Consistent tool usage, agents don't inherit MCP context |
| All findings include source URLs | Traceability and verifiability of claims |
| Report should be externally shareable | Quality bar: someone outside the team can read and learn from it |

## Template: Research Agent Prompt

```
You are a technical researcher. Your job is to thoroughly investigate [DOMAIN].

Questions to answer:
1. [Question 1]
2. [Question 2]
3. [Question 3]
4. [Question 4 - optional]
5. [Question 5 - optional]

Research method:
- Use Firecrawl MCP tools for all web search and URL scraping
- `mcp__firecrawl-mcp__firecrawl_search` for web search (use `query` param)
- `mcp__firecrawl-mcp__firecrawl_scrape` for reading specific URLs (use `url` param)
- Reference `~/.claude/skills/deep-research/firecrawl-reference.md` for tool patterns and params
- Suggested searches: "[query 1]", "[query 2]", "[query 3]"

Output requirements:
- Write ALL findings to `docs/plans/YYYY-MM-DD-{domain-slug}-research.md` using the Write tool
- Structure with clear H2/H3 headers and tables where data is comparable
- Include source URLs for EVERY factual claim (inline or footnote style)
- Include direct quotes from official documentation where relevant
- Note contradictions between sources
- Never use em dashes (use commas, periods, or restructure sentences)
- End with a "Sources" section listing all URLs consulted with brief descriptions
```

## Template: Synthesis Agent Prompt

```
You are an analytical synthesizer. Your job is to distill raw research into a focused, well-organized domain summary.

Input: Read the research document at `docs/plans/YYYY-MM-DD-{domain-slug}-research.md` using the Read tool.

Output: Write your synthesis to `docs/plans/YYYY-MM-DD-{domain-slug}-synthesis.md` using the Write tool.

Synthesis requirements:
1. **Key Findings** (bulleted, 5-10 items): The most important facts and conclusions from the research
2. **Thematic Organization**: Group findings by theme, not by source. Use H2 headers for each theme.
3. **Evidence Quality Assessment**: For each major claim, note:
   - Number of independent sources confirming it
   - Source credibility (official docs, peer-reviewed, blog post, forum, marketing material)
   - Recency (when was this information published or last updated?)
4. **Gaps and Contradictions**: What questions remain unanswered? Where do sources disagree?
5. **Citations**: Preserve all source URLs from the research document. Every claim must trace back to a source.
6. **Domain-Specific Recommendations**: Based on the evidence, what are the clear takeaways for this domain?

Style rules:
- Never use em dashes (use commas, periods, or restructure sentences)
- Tables for comparisons and structured data
- Concise paragraphs (3-4 sentences max)
- No filler, no hedging language, no sycophancy
- Write for a technical audience that wants actionable information
```

## Template: Report Builder Prompt

```
You are an academic research report writer. Your job is to produce a comprehensive, externally shareable research report from multiple domain syntheses.

Input: Read ALL synthesis files matching `docs/plans/YYYY-MM-DD-*-synthesis.md` using the Read tool.
Template: Read the report template at `~/.claude/skills/deep-research/academic-report-template.md` using the Read tool.

Output: Write the final report to `docs/plans/YYYY-MM-DD-{overall-topic}-report.md` using the Write tool.

Report structure (follow the academic report template):
1. **Executive Summary** (200-300 words): Core question, key findings, primary recommendation, confidence level
2. **Introduction**: Problem statement, scope and boundaries, numbered research questions
3. **Methodology**: Search strategy, source evaluation criteria, source distribution by type, limitations
4. **Domain Findings**: One major section per synthesis file. Key findings with citations, evidence quality, notable gaps.
5. **Cross-Cutting Analysis**: Common themes across domains, contradictions, dependencies, risk assessment matrix
6. **Synthesis and Recommendations**: Primary recommendation with rationale, alternatives table, implementation considerations, decision criteria for when to revisit
7. **Limitations and Future Research**: Exclusions, thin evidence areas, suggested follow-up research
8. **References**: All source URLs organized by domain, with title, access date, type, and relevance rating

Quality standards:
- Every factual claim must have a citation (URL or source reference)
- Distinguish between established facts, expert opinions, and emerging trends
- Quantify where possible (costs, latency numbers, adoption percentages)
- Acknowledge uncertainty explicitly rather than presenting speculation as fact
- The report should stand alone: a reader with no prior context should understand it fully

Style rules:
- Never use em dashes (use commas, periods, or restructure sentences)
- Professional, direct tone. No filler or hedging.
- Tables for all comparative data
- Use H2 for major sections, H3 for subsections
- Target length: 2,000-5,000 words depending on complexity
```

## Anti-Patterns

| Do NOT do this | Do this instead |
|---------------|-----------------|
| Read TaskOutput from any agent | Wait for task notifications, then check disk |
| Paste research into a synthesis prompt | Tell the synthesis agent which file to Read from disk |
| Assign 2+ research files to one synthesis agent | Maintain strict 1:1 mapping |
| Skip the report builder for "simple" research | Always run all 3 stages. The report builder adds cross-cutting analysis. |
| Read the research or synthesis files yourself | Only read the final report after Step 6 |
| Launch synthesis before ALL research completes | Wait for every research agent to finish |
| Launch the report builder before ALL synthesis completes | Wait for every synthesis agent to finish |
| Omit Firecrawl instructions from research agents | Agents do not inherit MCP context. Always spell it out. |
| Omit source URLs | Every claim must trace to a URL |
package/skills/deep-research/academic-report-template.md
ADDED
@@ -0,0 +1,435 @@
# [TOPIC]

**Research Date:** [DATE]
**Domains Covered:** [DOMAIN_1], [DOMAIN_2], [DOMAIN_3], ... [DOMAIN_N]
**Total Sources Consulted:** [SOURCE_COUNT]
**Report Version:** 1.0

---

## Executive Summary

<!--
Write 200-300 words. State the core question, the key findings, the primary recommendation,
and an honest confidence assessment. This section should stand alone: a reader who only reads
this section should understand what was researched, what was found, and what to do next.
-->

**Core Question:** [STATE_THE_CENTRAL_RESEARCH_QUESTION]

**Key Findings:**

- [FINDING_1: One sentence summarizing the most important discovery]
- [FINDING_2: One sentence on the second most important finding]
- [FINDING_3: One sentence on the third finding]
- [FINDING_4: Optional fourth finding]
- [FINDING_5: Optional fifth finding]

**Primary Recommendation:** [ONE_PARAGRAPH_RECOMMENDATION]

**Confidence Level:** [HIGH / MODERATE / LOW]

<!--
Confidence criteria:
- HIGH: Multiple independent sources agree, official documentation confirms, tested/verified examples exist
- MODERATE: Several sources agree but some gaps remain, limited independent verification
- LOW: Few sources available, conflicting information, rapidly changing landscape, or heavy reliance on unofficial sources
-->

**Confidence Rationale:** [ONE_SENTENCE_EXPLAINING_WHY_THIS_CONFIDENCE_LEVEL]

---

## 1. Introduction

### 1.1 Problem Statement and Motivation

<!--
Why does this research matter? What decision, project, or initiative prompted it?
Be specific about the practical context driving this investigation.
-->

[PROBLEM_STATEMENT]

### 1.2 Scope and Boundaries

**In scope:**

- [INCLUDED_TOPIC_1]
- [INCLUDED_TOPIC_2]
- [INCLUDED_TOPIC_3]

**Explicitly excluded:**

- [EXCLUDED_TOPIC_1: brief reason for exclusion]
- [EXCLUDED_TOPIC_2: brief reason for exclusion]

### 1.3 Research Questions

<!--
Number each question. These should be specific and answerable, not vague.
Good: "What is the p95 latency of Provider X's streaming API under concurrent load?"
Bad: "Is Provider X fast?"
-->

1. [RESEARCH_QUESTION_1]
2. [RESEARCH_QUESTION_2]
3. [RESEARCH_QUESTION_3]
4. [RESEARCH_QUESTION_N]

---

## 2. Methodology

### 2.1 Search Strategy

<!--
Document how the research was conducted so the reader can assess rigor and reproduce it.
-->

| Parameter | Details |
|-----------|---------|
| Search tools | [e.g., Firecrawl MCP search, Firecrawl scrape, GitHub API] |
| Date range | [e.g., Sources published after January 2025] |
| Query terms | [List primary search queries used] |
| Languages | [e.g., English only] |
| Geographic scope | [e.g., Global, US-focused, etc.] |

### 2.2 Source Evaluation Criteria

Sources were evaluated on the following dimensions:

| Criterion | Weight | Description |
|-----------|--------|-------------|
| Authority | High | Official docs, peer-reviewed papers, recognized experts |
| Recency | High | Published within [TIME_WINDOW], reflecting current state |
| Specificity | Medium | Addresses the exact question, not tangential topics |
| Independence | Medium | Not authored by a vendor about their own product |
| Reproducibility | Medium | Claims backed by code samples, benchmarks, or verifiable data |

### 2.3 Source Distribution

| Source Type | Count | Notes |
|-------------|-------|-------|
| Official documentation | [N] | |
| Academic/research papers | [N] | |
| Technical blog posts | [N] | |
| GitHub repositories | [N] | |
| Conference talks/videos | [N] | |
| Community discussions | [N] | [e.g., Stack Overflow, Discord, forums] |
| Vendor/marketing materials | [N] | [treated with appropriate skepticism] |
| **Total** | **[SOURCE_COUNT]** | |

### 2.4 Limitations of This Research

<!--
Be honest. Every research method has blind spots. Acknowledging them increases credibility.
-->

- [LIMITATION_1: e.g., "No hands-on benchmarking was performed; latency figures are from third-party reports"]
- [LIMITATION_2: e.g., "Pricing information may be outdated; last verified on [DATE]"]
- [LIMITATION_3: e.g., "Limited non-English sources consulted"]

---

## 3. Domain Findings

<!--
Create one subsection (3.1, 3.2, etc.) per research domain.
Each domain was investigated by a separate research agent with a focused scope.
-->

### 3.1 [DOMAIN_1_TITLE]

#### Key Findings

<!--
Each finding should include a citation. Use the format [Source Title](URL) inline,
or use numbered references like [1] that map to the References section.
Prefer inline links for readability when the report will be consumed as markdown.
-->

- [FINDING]: [CITATION]
- [FINDING]: [CITATION]
- [FINDING]: [CITATION]

#### Evidence Quality

<!--
Rate the overall evidence quality for this domain and explain why.
-->

| Rating | Justification |
|--------|--------------|
| **[STRONG / MODERATE / WEAK]** | [Brief explanation, e.g., "Multiple official sources and independent benchmarks confirm these findings"] |

#### Notable Gaps

<!--
What questions in this domain could not be fully answered? What data was missing?
-->

- [GAP_1]
- [GAP_2]

#### Comparison Table

<!--
Include when the domain involves evaluating multiple options, tools, or approaches.
Omit this subsection if not applicable.
-->

| Criterion | Option A | Option B | Option C |
|-----------|----------|----------|----------|
| [CRITERION_1] | [VALUE] | [VALUE] | [VALUE] |
| [CRITERION_2] | [VALUE] | [VALUE] | [VALUE] |
| [CRITERION_3] | [VALUE] | [VALUE] | [VALUE] |
| **Overall** | [RATING] | [RATING] | [RATING] |

---

### 3.2 [DOMAIN_2_TITLE]

#### Key Findings

- [FINDING]: [CITATION]
- [FINDING]: [CITATION]

#### Evidence Quality

| Rating | Justification |
|--------|--------------|
| **[STRONG / MODERATE / WEAK]** | [Explanation] |

#### Notable Gaps

- [GAP_1]

---

### 3.N [DOMAIN_N_TITLE]

<!--
Repeat the domain section structure for each research domain.
Typical reports have 3-6 domain sections.
-->

#### Key Findings

- [FINDING]: [CITATION]

#### Evidence Quality

| Rating | Justification |
|--------|--------------|
| **[STRONG / MODERATE / WEAK]** | [Explanation] |

#### Notable Gaps

- [GAP_1]

---

## 4. Cross-Cutting Analysis

<!--
This is where the report goes beyond summarizing individual domains and performs synthesis.
Look for patterns, contradictions, and interactions across domains.
-->

### 4.1 Common Themes

<!--
Identify findings or patterns that appeared independently in multiple domains.
These carry higher confidence because they were corroborated from different angles.
-->

| Theme | Domains Where Observed | Confidence |
|-------|----------------------|------------|
| [THEME_1] | [DOMAIN_A], [DOMAIN_B] | [HIGH/MODERATE/LOW] |
| [THEME_2] | [DOMAIN_A], [DOMAIN_C] | [HIGH/MODERATE/LOW] |

### 4.2 Contradictions and Tensions

<!--
Where did different domains or sources disagree? Do not paper over disagreements.
Present both sides and, if possible, explain why they diverge.
-->

- **[TENSION_1]:** [Domain X suggests A, while Domain Y suggests B. This may be because...]
- **[TENSION_2]:** [Description of the contradiction and possible explanations]

### 4.3 Dependencies and Interactions

<!--
How do decisions in one domain constrain or enable decisions in another?
Example: "Choosing Provider X for the API layer constrains the authentication options to OAuth 2.0 only."
-->

- [DEPENDENCY_1]
- [DEPENDENCY_2]

### 4.4 Risk Assessment

<!--
Identify the key risks surfaced by the research. Rate each on likelihood and impact.
-->

| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| [RISK_1] | [HIGH/MED/LOW] | [HIGH/MED/LOW] | [Brief mitigation strategy] |
| [RISK_2] | [HIGH/MED/LOW] | [HIGH/MED/LOW] | [Brief mitigation strategy] |
| [RISK_3] | [HIGH/MED/LOW] | [HIGH/MED/LOW] | [Brief mitigation strategy] |

---

## 5. Synthesis and Recommendations

### 5.1 Primary Recommendation

<!--
State the recommendation clearly and then explain why.
The rationale should reference specific findings from the domain sections.
A reader should be able to trace the recommendation back to evidence.
-->

**Recommendation:** [CLEAR_STATEMENT_OF_WHAT_TO_DO]

**Rationale:** [2-3 paragraphs explaining why, referencing specific findings from Sections 3 and 4]

### 5.2 Alternative Approaches

<!--
Present the alternatives that were considered. Be fair to each.
The reader may have different constraints than those assumed in the primary recommendation.
-->

| Approach | Strengths | Weaknesses | Best When |
|----------|-----------|------------|-----------|
| [PRIMARY: recommended] | [STRENGTHS] | [WEAKNESSES] | [CONDITIONS] |
| [ALTERNATIVE_1] | [STRENGTHS] | [WEAKNESSES] | [CONDITIONS] |
| [ALTERNATIVE_2] | [STRENGTHS] | [WEAKNESSES] | [CONDITIONS] |

### 5.3 Implementation Considerations

<!--
Practical notes for acting on the recommendation. Not a full implementation plan,
but enough to understand the effort, prerequisites, and sequencing.
-->

- **Prerequisites:** [What must be in place before starting]
- **Estimated effort:** [Rough sizing, e.g., "2-3 sprint cycles for a team of 2"]
- **Key decisions to make first:** [Decisions that block implementation]
- **Suggested sequencing:** [What to do first, second, third]

### 5.4 Decision Criteria

<!--
Under what circumstances would the recommendation change? This is critical for making
the report durable. If conditions shift, the reader knows when to re-evaluate.
-->

The primary recommendation should be revisited if any of the following occur:

- [CONDITION_1: e.g., "Provider X raises pricing above $Y/month"]
- [CONDITION_2: e.g., "A competing solution achieves feature parity with lower latency"]
- [CONDITION_3: e.g., "Requirements change to include [SPECIFIC_CAPABILITY]"]

---

## 6. Limitations and Future Research

### 6.1 What This Report Does Not Cover

<!--
Restate scope exclusions and add any areas that turned out to be relevant
but could not be adequately researched within the current scope.
-->

- [EXCLUSION_1]
- [EXCLUSION_2]

### 6.2 Areas of Thin or Conflicting Evidence

<!--
Where was the evidence insufficient to draw strong conclusions?
Be specific about what data would resolve the uncertainty.
-->

- **[AREA_1]:** [What is uncertain and what evidence would help]
- **[AREA_2]:** [What is uncertain and what evidence would help]

### 6.3 Suggested Follow-Up Research

<!--
Concrete next steps for research, not vague suggestions.
Include specific questions and, where possible, suggested approaches.
-->

1. **[FOLLOW_UP_1]:** [Specific question to answer, suggested method]
2. **[FOLLOW_UP_2]:** [Specific question to answer, suggested method]
3. **[FOLLOW_UP_3]:** [Specific question to answer, suggested method]

---

## 7. References

<!--
Organize references by domain for easy navigation.
Every claim in the report should trace back to a reference here.

Reference format:
- [N] **Title** | URL | Accessed [DATE] | Type: [official docs / paper / blog / repo / discussion / vendor] | Relevance: [HIGH/MEDIUM/LOW]
-->

### [DOMAIN_1_TITLE]

- [1] **[Source Title]** | [URL] | Accessed [DATE] | Type: [SOURCE_TYPE] | Relevance: [HIGH/MEDIUM/LOW]
- [2] **[Source Title]** | [URL] | Accessed [DATE] | Type: [SOURCE_TYPE] | Relevance: [HIGH/MEDIUM/LOW]

### [DOMAIN_2_TITLE]

- [3] **[Source Title]** | [URL] | Accessed [DATE] | Type: [SOURCE_TYPE] | Relevance: [HIGH/MEDIUM/LOW]
- [4] **[Source Title]** | [URL] | Accessed [DATE] | Type: [SOURCE_TYPE] | Relevance: [HIGH/MEDIUM/LOW]

### [DOMAIN_N_TITLE]

- [N] **[Source Title]** | [URL] | Accessed [DATE] | Type: [SOURCE_TYPE] | Relevance: [HIGH/MEDIUM/LOW]

---

## Appendices

<!--
Include appendices only when there is substantial supplementary data that would
disrupt the flow of the main report. Each appendix should be self-contained.
Omit this entire section if no appendices are needed.
-->

### Appendix A: [TITLE]

<!-- Example: Detailed Feature Comparison Matrix -->

| Feature | Option A | Option B | Option C | Option D |
|---------|----------|----------|----------|----------|
| [FEATURE_1] | [DETAIL] | [DETAIL] | [DETAIL] | [DETAIL] |
| [FEATURE_2] | [DETAIL] | [DETAIL] | [DETAIL] | [DETAIL] |

### Appendix B: [TITLE]

<!-- Example: Pricing and Cost Analysis -->

| Tier | Monthly Cost | Included Usage | Overage Rate | Notes |
|------|-------------|---------------|-------------|-------|
| [TIER_1] | [COST] | [USAGE] | [RATE] | [NOTES] |
| [TIER_2] | [COST] | [USAGE] | [RATE] | [NOTES] |

### Appendix C: [TITLE]

<!-- Example: Technical Specifications or Configuration Details -->

[SUPPLEMENTARY_CONTENT]

---

*Report generated via Deep Research Protocol. [AGENT_COUNT] research agents across [DOMAIN_COUNT] domains, synthesized into a unified analysis.*
package/skills/deep-research/firecrawl-reference.md
ADDED
@@ -0,0 +1,220 @@
# Firecrawl MCP Reference

## Layer 1: Quick Decision Table

Use this to pick the right Firecrawl tool instantly.

| I need to... | Tool | Key params | Credits |
|---|---|---|---|
| Search the web for info | `mcp__firecrawl-mcp__firecrawl_search` | `query`, `limit` | 2 per 10 results |
| Get content from a known URL | `mcp__firecrawl-mcp__firecrawl_scrape` | `url`, `formats: ["markdown"]` | 1 per page |
| Scrape multiple known URLs | `mcp__firecrawl-mcp__firecrawl_batch_scrape` | `urls`, `options` | 1 per page |
| Discover all URLs on a site | `mcp__firecrawl-mcp__firecrawl_map` | `url`, `search` | 1 |
| Crawl a site section | `mcp__firecrawl-mcp__firecrawl_crawl` | `url`, `limit`, `maxDiscoveryDepth` | 1 per page |
| Extract structured JSON from pages | `mcp__firecrawl-mcp__firecrawl_extract` | `urls`, `prompt`, `schema` | varies |

**Default for web search**: Always use `firecrawl_search` instead of `WebSearch`.
**Default for reading a URL**: Always use `firecrawl_scrape` instead of `WebFetch`.

---

## Layer 2: Common Patterns

### Web Search (most common)
```json
{
  "query": "your search terms",
  "limit": 5,
  "sources": [{"type": "web"}]
}
```

To also scrape the result pages:
```json
{
  "query": "your search terms",
  "limit": 3,
  "scrapeOptions": {
    "formats": ["markdown"],
    "onlyMainContent": true
  }
}
```

**Search sources**: `web` (default), `news`, `images`. The limit applies per source type.

**Search categories** (filter by site type; see the combined example below):
- `github`: GitHub repos, code, issues
- `research`: arXiv, Nature, IEEE, PubMed
- `pdf`: PDF documents

**Time filters** (via the `tbs` param):
- `qdr:h` past hour, `qdr:d` past 24h, `qdr:w` past week, `qdr:m` past month, `qdr:y` past year
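
A combined example. The `tbs` value comes from the list above; passing categories as a `categories` array is an assumption, since this reference names the values but not the exact parameter:

```json
{
  "query": "react native voice sdk benchmarks",
  "limit": 5,
  "tbs": "qdr:m",
  "categories": ["github"]
}
```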

### Single Page Scrape
```json
{
  "url": "https://example.com",
  "formats": ["markdown"],
  "onlyMainContent": true,
  "maxAge": 172800000
}
```

**Performance tip**: `maxAge: 172800000` (a 2-day cache) makes scrapes up to 5x faster. Use `maxAge: 0` for fresh content.

### Site Discovery (Map then Scrape)
```json
{
  "url": "https://docs.example.com",
  "search": "api reference",
  "limit": 50
}
```
Use `map` to find URLs, then `scrape` or `batch_scrape` to get the content (a batch sketch follows).
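
A sketch of the follow-up `firecrawl_batch_scrape` call, assuming `options` accepts the same fields as a single-page scrape; the decision table above names only `urls` and `options`:

```json
{
  "urls": [
    "https://docs.example.com/api/reference/auth",
    "https://docs.example.com/api/reference/errors"
  ],
  "options": {
    "formats": ["markdown"],
    "onlyMainContent": true
  }
}
```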

### Structured Data Extraction
```json
{
  "urls": ["https://example.com/pricing"],
  "prompt": "Extract all pricing plans with name, price, and features",
  "schema": {
    "type": "object",
    "properties": {
      "plans": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "price": {"type": "number"},
            "features": {"type": "array", "items": {"type": "string"}}
          }
        }
      }
    }
  }
}
```

---

## Layer 3: Full API Details

### Scrape Formats
| Format | Description |
|--------|-------------|
| `markdown` | Clean markdown, ideal for LLM consumption |
| `html` | Parsed HTML |
| `rawHtml` | Unmodified HTML |
| `screenshot` | Page screenshot (supports `fullPage`, `quality`, `viewport`) |
| `links` | All links on the page |
| `summary` | AI-generated summary |
| `json` | Structured extraction with schema/prompt |
| `branding` | Brand identity extraction (colors, fonts, typography) |
| `changeTracking` | Diff against the previous scrape |

### Scrape Actions (interact before scraping)
Available action types: `wait`, `click`, `screenshot`, `write`, `press`, `scroll`, `scrape`, `executeJavascript`, `generatePDF`

Example: Log in, then scrape
```json
{
  "url": "https://example.com/login",
  "formats": ["markdown"],
  "actions": [
    {"type": "write", "text": "user@example.com"},
    {"type": "press", "key": "Tab"},
    {"type": "write", "text": "password"},
    {"type": "click", "selector": "button[type='submit']"},
    {"type": "wait", "milliseconds": 1500}
  ]
}
```

### Location and Language
```json
{
  "url": "https://example.com",
  "location": {
    "country": "US",
    "languages": ["en"]
  }
}
```

### Crawl Options
| Param | Description | Default |
|-------|-------------|---------|
| `maxDiscoveryDepth` | How deep to follow links | - |
| `limit` | Max pages to crawl | - |
| `allowExternalLinks` | Follow links to other domains | false |
| `deduplicateSimilarURLs` | Skip near-duplicate URLs | false |
| `includePaths` | Only crawl URLs matching these patterns | - |
| `excludePaths` | Skip URLs matching these patterns | - |

**Warning**: Crawl responses can be very large. Use `map` + `batch_scrape` for better control. If you do crawl, keep it bounded, as in the sketch below.
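
A bounded-crawl sketch using the options above. Treating the path entries as regex-style patterns is an assumption this reference does not spell out:

```json
{
  "url": "https://docs.example.com",
  "limit": 20,
  "maxDiscoveryDepth": 2,
  "includePaths": ["/api/.*"],
  "excludePaths": ["/blog/.*"]
}
```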

### Map Options
| Param | Description |
|-------|-------------|
| `search` | Filter URLs by search term |
| `sitemap` | `include`, `skip`, or `only` |
| `includeSubdomains` | Include subdomains |
| `limit` | Max URLs to return (up to 100k) |
| `ignoreQueryParameters` | Deduplicate by path |

### Cost Reference
| Operation | Credits |
|-----------|---------|
| Search (10 results) | 2 |
| Basic scrape | 1 per page |
| PDF parsing | 1 per PDF page |
| Enhanced proxy | +4 per page |
| JSON mode (structured extraction) | +4 per page |
| Map | 1 |

### Caching
- Default `maxAge`: 172800000ms (2 days)
- Set `maxAge: 0` for always-fresh content
- Set `storeInCache: false` to skip caching
- The `changeTracking` format bypasses the cache

### Rate Limits and Retries
- Built-in exponential backoff on rate limits
- Configurable via env vars (see the sketch below):
  - `FIRECRAWL_RETRY_MAX_ATTEMPTS` (default: 3)
  - `FIRECRAWL_RETRY_INITIAL_DELAY` (default: 1000ms)
  - `FIRECRAWL_RETRY_MAX_DELAY` (default: 10000ms)
  - `FIRECRAWL_RETRY_BACKOFF_FACTOR` (default: 2)
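
A sketch of overriding the retry defaults. Exporting these in the shell that launches Claude Code, so the MCP server process inherits them, is an assumption; they can also be passed with `-e` flags to `claude mcp add`:

```bash
export FIRECRAWL_RETRY_MAX_ATTEMPTS=5      # default: 3
export FIRECRAWL_RETRY_INITIAL_DELAY=2000  # ms, default: 1000
export FIRECRAWL_RETRY_MAX_DELAY=30000     # ms, default: 10000
export FIRECRAWL_RETRY_BACKOFF_FACTOR=2    # default: 2
```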

### Agent Feature (New)
Autonomous web data gathering. Describe what you need; it searches, navigates, and extracts.
```json
{
  "prompt": "Find the pricing plans for Notion"
}
```
Not available via MCP tools yet; API-only at `POST /v2/agent`.

### Async Operations
- `crawl` and `batch_scrape` are async. They return an operation ID.
- Check status with `firecrawl_check_crawl_status` or `firecrawl_check_batch_status`.
- Results expire after 24 hours.
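
A sketch of a status check. Passing the returned operation ID as `id` is an assumption; this reference names the status tools but not their parameters:

```json
{
  "id": "<operation ID returned by firecrawl_crawl>"
}
```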

---

## When NOT to Use Firecrawl

| Situation | Use Instead |
|-----------|-------------|
| Searching local files | `Grep`, `Glob` |
| Reading a file on disk | `Read` |
| GitHub-specific operations (PRs, issues) | `gh` CLI or GitHub MCP tools |
| Authenticated services (Google Docs, Jira) | Service-specific MCP tool |

## Setup
- Package: `firecrawl-mcp` (npm, stdio transport)
- Config: `~/.claude.json` under `mcpServers`
- API key: `FIRECRAWL_API_KEY` env var
- Docs: https://docs.firecrawl.dev
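
For reference, a sketch of roughly what the `claude mcp add` command from the README writes into `~/.claude.json`; the exact shape may vary by Claude Code version:

```json
{
  "mcpServers": {
    "firecrawl-mcp": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": { "FIRECRAWL_API_KEY": "your-key-here" }
    }
  }
}
```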