firecrawl-mcp 1.7.2 → 1.8.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +1 -1
- package/README.md +90 -14
- package/dist/index.js +123 -231
- package/dist/index.test.js +1 -1
- package/package.json +7 -5
- package/dist/jest.setup.js +0 -58
- package/dist/src/index.js +0 -1053
- package/dist/src/index.test.js +0 -225
package/LICENSE
CHANGED
package/README.md
CHANGED
@@ -2,7 +2,9 @@
 
 A Model Context Protocol (MCP) server implementation that integrates with [Firecrawl](https://github.com/mendableai/firecrawl) for web scraping capabilities.
 
-Big thanks to [@vrknetha](https://github.com/vrknetha), [@cawstudios](https://caw.tech) for the initial implementation!
+> Big thanks to [@vrknetha](https://github.com/vrknetha), [@cawstudios](https://caw.tech) for the initial implementation!
+>
+> You can also play around with [our MCP Server on MCP.so's playground](https://mcp.so/playground?server=firecrawl-mcp-server) or on [Klavis AI](https://www.klavis.ai/mcp-servers). Thanks to MCP.so and Klavis AI for hosting and [@gstarwd](https://github.com/gstarwd) and [@xiangkaiz](https://github.com/xiangkaiz) for integrating our server.
 
 ## Features
 
@@ -11,10 +13,10 @@ Big thanks to [@vrknetha](https://github.com/vrknetha), [@cawstudios](https://ca
 - URL discovery and crawling
 - Web search with content extraction
 - Automatic retries with exponential backoff
--
+- Efficient batch processing with built-in rate limiting
 - Credit usage monitoring for cloud API
 - Comprehensive logging system
-- Support for cloud and self-hosted
+- Support for cloud and self-hosted Firecrawl instances
 - Mobile/Desktop viewport support
 - Smart content filtering with tag inclusion/exclusion
 
@@ -36,22 +38,44 @@ npm install -g firecrawl-mcp
 
 Configuring Cursor 🖥️
 Note: Requires Cursor version 0.45.6+
+For the most up-to-date configuration instructions, please refer to the official Cursor documentation on configuring MCP servers:
+[Cursor MCP Server Configuration Guide](https://docs.cursor.com/context/model-context-protocol#configuring-mcp-servers)
 
-To configure
+To configure Firecrawl MCP in Cursor **v0.45.6**
 
 1. Open Cursor Settings
-2. Go to Features > MCP Servers
+2. Go to Features > MCP Servers
 3. Click "+ Add New MCP Server"
 4. Enter the following:
    - Name: "firecrawl-mcp" (or your preferred name)
    - Type: "command"
   - Command: `env FIRECRAWL_API_KEY=your-api-key npx -y firecrawl-mcp`
 
+To configure Firecrawl MCP in Cursor **v0.48.6**
+
+1. Open Cursor Settings
+2. Go to Features > MCP Servers
+3. Click "+ Add new global MCP server"
+4. Enter the following code:
+
+```json
+{
+  "mcpServers": {
+    "firecrawl-mcp": {
+      "command": "npx",
+      "args": ["-y", "firecrawl-mcp"],
+      "env": {
+        "FIRECRAWL_API_KEY": "YOUR-API-KEY"
+      }
+    }
+  }
+}
+```
+
 > If you are using Windows and are running into issues, try `cmd /c "set FIRECRAWL_API_KEY=your-api-key && npx -y firecrawl-mcp"`
 
-Replace `your-api-key` with your
+Replace `your-api-key` with your Firecrawl API key. If you don't have one yet, you can create an account and get it from https://www.firecrawl.dev/app/api-keys
 
-After adding, refresh the MCP server list to see the new tools. The Composer Agent will automatically use
+After adding, refresh the MCP server list to see the new tools. The Composer Agent will automatically use Firecrawl MCP when appropriate, but you can explicitly request it by describing your web scraping needs. Access the Composer via Command+L (Mac), select "Agent" next to the submit button, and enter your query.
 
 ### Running on Windsurf
 
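Aside: the global-MCP-server config added in this hunk is easy to break with a stray comma or a misplaced brace when pasting by hand. A minimal sketch of a shape check before pasting into Cursor; the expected key names (`mcpServers`, `command`, `args`, `env`) come from the README diff above, but the checker itself is our own illustration, not part of the package:

```python
import json

# The config block added for Cursor v0.48.6, as shown in the README diff.
CONFIG = """
{
  "mcpServers": {
    "firecrawl-mcp": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": { "FIRECRAWL_API_KEY": "YOUR-API-KEY" }
    }
  }
}
"""

def check_mcp_config(text: str) -> list:
    """Return a list of problems; an empty list means the shape looks right."""
    cfg = json.loads(text)  # raises ValueError on malformed JSON
    servers = cfg.get("mcpServers")
    if not isinstance(servers, dict) or not servers:
        return ["missing or empty 'mcpServers' object"]
    problems = []
    for name, server in servers.items():
        if "command" not in server:
            problems.append(f"{name}: missing 'command'")
        if not isinstance(server.get("args", []), list):
            problems.append(f"{name}: 'args' must be a list")
    return problems

print(check_mcp_config(CONFIG))  # → []
```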
@@ -64,17 +88,16 @@ Add this to your `./codeium/windsurf/model_config.json`:
       "command": "npx",
       "args": ["-y", "firecrawl-mcp"],
       "env": {
-        "FIRECRAWL_API_KEY": "
+        "FIRECRAWL_API_KEY": "YOUR_API_KEY"
       }
     }
   }
 }
 ```
 
-
 ### Installing via Smithery (Legacy)
 
-To install
+To install Firecrawl for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@mendableai/mcp-server-firecrawl):
 
 ```bash
 npx -y @smithery/cli install @mendableai/mcp-server-firecrawl --client claude
@@ -86,7 +109,7 @@ npx -y @smithery/cli install @mendableai/mcp-server-firecrawl --client claude
 
 #### Required for Cloud API
 
-- `FIRECRAWL_API_KEY`: Your
+- `FIRECRAWL_API_KEY`: Your Firecrawl API key
   - Required when using cloud API (default)
   - Optional when using self-hosted instance with `FIRECRAWL_API_URL`
 - `FIRECRAWL_API_URL` (Optional): Custom API endpoint for self-hosted instances
@@ -206,7 +229,7 @@ These configurations control:
 
 ### Rate Limiting and Batch Processing
 
-The server utilizes
+The server utilizes Firecrawl's built-in rate limiting and batch processing capabilities:
 
 - Automatic rate limit handling with exponential backoff
 - Efficient parallel processing for batch operations
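"Exponential backoff" here is the standard retry pattern: the delay doubles on each failed attempt, up to a cap. A minimal sketch of that schedule; the base, cap, and function name are illustrative assumptions, not the package's actual settings (and real implementations usually add jitter, omitted here for determinism):

```python
def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff schedule: delay doubles per attempt, capped.
    Values are illustrative; firecrawl-mcp's own parameters may differ."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]

print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```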
@@ -372,7 +395,60 @@ Example response:
 - `enableWebSearch`: Enable web search for additional context
 - `includeSubdomains`: Include subdomains in extraction
 
-When using a self-hosted instance, the extraction will use your configured LLM. For cloud API, it uses
+When using a self-hosted instance, the extraction will use your configured LLM. For cloud API, it uses Firecrawl's managed LLM service.
+
+### 7. Deep Research Tool (firecrawl_deep_research)
+
+Conduct deep web research on a query using intelligent crawling, search, and LLM analysis.
+
+```json
+{
+  "name": "firecrawl_deep_research",
+  "arguments": {
+    "query": "how does carbon capture technology work?",
+    "maxDepth": 3,
+    "timeLimit": 120,
+    "maxUrls": 50
+  }
+}
+```
+
+Arguments:
+
+- query (string, required): The research question or topic to explore.
+- maxDepth (number, optional): Maximum recursive depth for crawling/search (default: 3).
+- timeLimit (number, optional): Time limit in seconds for the research session (default: 120).
+- maxUrls (number, optional): Maximum number of URLs to analyze (default: 50).
+
+Returns:
+
+- Final analysis generated by an LLM based on research. (data.finalAnalysis)
+- May also include structured activities and sources used in the research process.
+
+### 8. Generate LLMs.txt Tool (firecrawl_generate_llmstxt)
+
+Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.
+
+```json
+{
+  "name": "firecrawl_generate_llmstxt",
+  "arguments": {
+    "url": "https://example.com",
+    "maxUrls": 20,
+    "showFullText": true
+  }
+}
+```
+
+Arguments:
+
+- url (string, required): The base URL of the website to analyze.
+- maxUrls (number, optional): Max number of URLs to include (default: 10).
+- showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.
+
+Returns:
+
+- Generated llms.txt file contents and optionally the llms-full.txt (data.llmstxt and/or data.llmsfulltxt)
 
 ## Logging System
 
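The new tool sections in this hunk document per-argument defaults (maxDepth 3, timeLimit 120, maxUrls 50 for deep research). A sketch of building such a tool-call payload with those defaults filled in; the builder function is our own illustration, not an API the package exposes:

```python
def deep_research_request(query: str, **options) -> dict:
    """Build a firecrawl_deep_research tool call, filling in the
    defaults the README documents. Helper is illustrative only."""
    args = {"maxDepth": 3, "timeLimit": 120, "maxUrls": 50}
    args.update(options)       # caller overrides win
    args["query"] = query      # the one required argument
    return {"name": "firecrawl_deep_research", "arguments": args}

req = deep_research_request("how does carbon capture technology work?")
print(req["arguments"]["timeLimit"])  # 120
```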
@@ -387,7 +463,7 @@ The server includes comprehensive logging:
 Example log messages:
 
 ```
-[INFO]
+[INFO] Firecrawl MCP Server initialized successfully
 [INFO] Starting scrape for URL: https://example.com
 [INFO] Batch operation queued with ID: batch_1
 [WARNING] Credit usage has reached warning threshold