@browserless.io/mcp 1.6.0

Files changed (81)
  1. package/LICENSE +557 -0
  2. package/README.md +280 -0
  3. package/bin/cli.js +2 -0
  4. package/build/src/@types/types.d.ts +538 -0
  5. package/build/src/config.d.ts +3 -0
  6. package/build/src/config.js +42 -0
  7. package/build/src/index.d.ts +4 -0
  8. package/build/src/index.js +153 -0
  9. package/build/src/lib/account-resolver.d.ts +17 -0
  10. package/build/src/lib/account-resolver.js +78 -0
  11. package/build/src/lib/agent-client.d.ts +58 -0
  12. package/build/src/lib/agent-client.js +530 -0
  13. package/build/src/lib/agent-format.d.ts +35 -0
  14. package/build/src/lib/agent-format.js +155 -0
  15. package/build/src/lib/amplitude.d.ts +11 -0
  16. package/build/src/lib/amplitude.js +65 -0
  17. package/build/src/lib/analytics.d.ts +18 -0
  18. package/build/src/lib/analytics.js +79 -0
  19. package/build/src/lib/api-client.d.ts +17 -0
  20. package/build/src/lib/api-client.js +357 -0
  21. package/build/src/lib/bounded-event-store.d.ts +22 -0
  22. package/build/src/lib/bounded-event-store.js +69 -0
  23. package/build/src/lib/cache.d.ts +12 -0
  24. package/build/src/lib/cache.js +49 -0
  25. package/build/src/lib/define-tool.d.ts +71 -0
  26. package/build/src/lib/define-tool.js +71 -0
  27. package/build/src/lib/error-classifier.d.ts +4 -0
  28. package/build/src/lib/error-classifier.js +125 -0
  29. package/build/src/lib/redis-oauth-proxy.d.ts +13 -0
  30. package/build/src/lib/redis-oauth-proxy.js +214 -0
  31. package/build/src/lib/retry.d.ts +2 -0
  32. package/build/src/lib/retry.js +19 -0
  33. package/build/src/lib/schema-fields.d.ts +10 -0
  34. package/build/src/lib/schema-fields.js +27 -0
  35. package/build/src/lib/supabase-token-patch.d.ts +6 -0
  36. package/build/src/lib/supabase-token-patch.js +33 -0
  37. package/build/src/lib/utils.d.ts +27 -0
  38. package/build/src/lib/utils.js +67 -0
  39. package/build/src/prompts/extract-content.d.ts +2 -0
  40. package/build/src/prompts/extract-content.js +33 -0
  41. package/build/src/prompts/scrape-url.d.ts +2 -0
  42. package/build/src/prompts/scrape-url.js +36 -0
  43. package/build/src/resources/api-docs.d.ts +3 -0
  44. package/build/src/resources/api-docs.js +54 -0
  45. package/build/src/resources/status.d.ts +3 -0
  46. package/build/src/resources/status.js +30 -0
  47. package/build/src/skills/autonomous-login.md +95 -0
  48. package/build/src/skills/captchas.md +48 -0
  49. package/build/src/skills/cookie-consent.md +50 -0
  50. package/build/src/skills/dynamic-content.md +72 -0
  51. package/build/src/skills/index.d.ts +9 -0
  52. package/build/src/skills/index.js +221 -0
  53. package/build/src/skills/modals.md +56 -0
  54. package/build/src/skills/screenshots.md +53 -0
  55. package/build/src/skills/shadow-dom.md +64 -0
  56. package/build/src/skills/snapshot-misses.md +67 -0
  57. package/build/src/skills/system-prompt.d.ts +2 -0
  58. package/build/src/skills/system-prompt.js +128 -0
  59. package/build/src/skills/tabs.md +77 -0
  60. package/build/src/tools/agent.d.ts +15 -0
  61. package/build/src/tools/agent.js +299 -0
  62. package/build/src/tools/crawl.d.ts +75 -0
  63. package/build/src/tools/crawl.js +426 -0
  64. package/build/src/tools/download.d.ts +11 -0
  65. package/build/src/tools/download.js +92 -0
  66. package/build/src/tools/export.d.ts +28 -0
  67. package/build/src/tools/export.js +129 -0
  68. package/build/src/tools/function.d.ts +24 -0
  69. package/build/src/tools/function.js +144 -0
  70. package/build/src/tools/map.d.ts +23 -0
  71. package/build/src/tools/map.js +129 -0
  72. package/build/src/tools/performance.d.ts +25 -0
  73. package/build/src/tools/performance.js +103 -0
  74. package/build/src/tools/schemas.d.ts +466 -0
  75. package/build/src/tools/schemas.js +487 -0
  76. package/build/src/tools/search.d.ts +67 -0
  77. package/build/src/tools/search.js +184 -0
  78. package/build/src/tools/smartscraper.d.ts +42 -0
  79. package/build/src/tools/smartscraper.js +136 -0
  80. package/package.json +111 -0
  81. package/patches/mcp-proxy+6.4.0.patch +31 -0
package/README.md ADDED
@@ -0,0 +1,280 @@
1
+ # Browserless MCP Server
2
+
3
+ <div align="center">
4
+
5
+ [![MCP Badge](https://lobehub.com/badge/mcp/browserless-browserless-mcp?style=plastic)](https://lobehub.com/mcp/browserless-browserless-mcp)
6
+
7
+ </div>
8
+
9
+ MCP (Model Context Protocol) server for [Browserless.io](https://browserless.io) — expose the Browserless smart scraper API to LLM clients like Claude Desktop, Cursor, VS Code, and Windsurf.
10
+
11
+ ## Quick Start
12
+
13
+ Get an API token from [browserless.io](https://browserless.io) (free tier available), then point your MCP client at the hosted server:
14
+
15
+ ```json
16
+ {
17
+ "mcpServers": {
18
+ "browserless": {
19
+ "url": "https://mcp.browserless.io/mcp?token=your-token-here"
20
+ }
21
+ }
22
+ }
23
+ ```
24
+
25
+ No local install — see [Configuration](#configuration) for per-client snippets.
26
+
27
+ ## Tools
28
+
29
+ | Tool | Description |
30
+ | -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
31
+ | `browserless_smartscraper` | Scrape any webpage using cascading strategies (HTTP fetch, proxy, headless browser, captcha solving). Returns content in requested formats: `markdown`, `html`, `screenshot`, `pdf`, `links`. |
32
+ | `browserless_search` | Search the web using Browserless and optionally scrape each result. Supports web, news, and image search with geo-targeting and time filters. |
33
+ | `browserless_map` | Discover and map all URLs on a website. Crawls via sitemaps and link extraction. Returns URLs with optional titles and descriptions. Useful for site audits and content discovery. |
34
+ | `browserless_crawl` | Crawl a website and scrape every discovered page. Supports depth control, path filtering, sitemap strategies, and configurable scrape options. Returns scraped content and metadata for each page. |
35
+ | `browserless_performance` | Run Lighthouse audits on any URL. Returns scores and metrics for accessibility, best practices, performance, PWA, and SEO. Optionally filter by category or supply performance budgets. |
36
+ | `browserless_function` | Execute custom Puppeteer JavaScript on the Browserless cloud. The function receives a `page` object and optional `context`; return `{ data, type }` to control the payload and Content-Type. |
37
+ | `browserless_download` | Run custom Puppeteer code and return the file Chrome downloads during execution (e.g. after clicking a download link). The downloaded file is streamed back to the caller. |
38
+ | `browserless_export` | Export a webpage via the Browserless `/export` API. Fetches the URL and returns its native content (HTML, PDF, image, etc.) with automatic content-type detection. |
39
+ | `browserless_agent` | Drive a persistent browser session via a ReAct loop: snapshot the page, plan, batch interactions (click, type, scroll, evaluate, etc.), and re-snapshot. Uses ref-based selectors derived from snapshots, supports multi-tab workflows, screenshots, captcha solving, and live URLs. |
40
+ | `browserless_skill` | Load an on-demand recipe for a non-trivial page mechanic (shadow DOM, cookie consent, modals, captchas, dynamic content, snapshot misses, screenshots, tabs). Companion to `browserless_agent`. |
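
As an illustration of the `browserless_function` row above, here is a minimal sketch of a function that takes `page` and an optional `context` and returns `{ data, type }`. The destructured argument shape, the `PageLike` interface, and the `context.url` field are assumptions made for this example; only `page`, `context`, `data`, and `type` are named in the README, and how the code is passed to the tool is not shown there.

```typescript
// Structural subset of Puppeteer's Page, declared here so the sketch is
// self-contained. Only goto() and title() are used.
interface PageLike {
  goto(url: string): Promise<unknown>;
  title(): Promise<string>;
}

// Assumed argument shape: one object carrying `page` and optional `context`.
interface FunctionArgs {
  page: PageLike;
  context?: { url?: string };
}

// Navigates to a URL and returns the page title as a JSON payload.
async function getTitle({ page, context }: FunctionArgs) {
  const url = context?.url ?? "https://example.com";
  await page.goto(url);
  const title = await page.title();
  // `data` is the payload; `type` is the Content-Type of the response.
  return { data: { url, title }, type: "application/json" };
}
```

Typing `page` structurally keeps the function testable with a stub object instead of a real browser.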
41
+
42
+ ## Skills
43
+
44
+ The server ships with a built-in library of **Skills** — on-demand recipes the agent can load to handle tricky page mechanics. Skills auto-inject into `browserless_agent` responses when their triggers fire (e.g. the agent hits a cookie banner), and can also be loaded manually via the `browserless_skill` tool.
45
+
46
+ | Skill | Source | Purpose |
47
+ | ----------------- | -------------------------------------------------------------- | --------------------------------------------------------------------------------- |
48
+ | `shadow-dom` | [src/skills/shadow-dom.md](src/skills/shadow-dom.md) | Deep selectors and iframe targeting through shadow roots. |
49
+ | `cookie-consent` | [src/skills/cookie-consent.md](src/skills/cookie-consent.md) | Vendor-specific dismiss recipes (OneTrust, Cookiebot, Didomi, TrustArc, etc.). |
50
+ | `modals` | [src/skills/modals.md](src/skills/modals.md) | Closing dialogs, alertdialogs, and overlay close-button heuristics. |
51
+ | `captchas` | [src/skills/captchas.md](src/skills/captchas.md) | Using the `solve` command, response semantics, and escalation paths (Cloud only). |
52
+ | `dynamic-content` | [src/skills/dynamic-content.md](src/skills/dynamic-content.md) | Choosing the right `wait*` method for async/AJAX/SPA content. |
53
+ | `snapshot-misses` | [src/skills/snapshot-misses.md](src/skills/snapshot-misses.md) | Handling truncated/empty snapshots and image-rendered content. |
54
+ | `screenshots` | [src/skills/screenshots.md](src/skills/screenshots.md) | When to screenshot vs. snapshot, scope and format choices. |
55
+ | `tabs` | [src/skills/tabs.md](src/skills/tabs.md) | Multi-tab workflows and peek-without-switching via `targetId`. |
56
+
57
+ Load a skill explicitly:
58
+
59
+ ```jsonc
60
+ {
61
+ "method": "tools/call",
62
+ "params": {
63
+ "name": "browserless_skill",
64
+ "arguments": { "id": "cookie-consent" },
65
+ },
66
+ }
67
+ ```
68
+
69
+ ### Residential proxy (`browserless_agent`)
70
+
71
+ Pass a top-level `proxy` object on `browserless_agent` to route the session through residential IPs. Use this when the target site blocks traffic from datacenter IPs.
72
+
73
+ ```jsonc
74
+ {
75
+ "method": "tools/call",
76
+ "params": {
77
+ "name": "browserless_agent",
78
+ "arguments": {
79
+ "method": "goto",
80
+ "params": { "url": "https://example.com" },
81
+ "proxy": {
82
+ "proxy": "residential",
83
+ "proxyCountry": "us",
84
+ "proxySticky": true,
85
+ },
86
+ },
87
+ },
88
+ }
89
+ ```
90
+
91
+ | Field | Notes |
92
+ | --------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
93
+ | `proxy` | `"residential"` — only value supported today. |
94
+ | `proxyCountry` | ISO-2 country code (`"us"`, `"de"`). Auto-normalized to lowercase. Non-letter values are rejected. |
95
+ | `proxyState` | US state name with whitespace replaced by underscores (`"new_york"`). Paid-plan gated — non-eligible tokens get a 401. |
96
+ | `proxyCity` | City target. Paid/enterprise plan gated — non-eligible tokens get a 401. |
97
+ | `proxySticky` | Stable IP while the underlying WebSocket stays open. Reconnects (idle drop, network blip, browser crash) allocate a new sticky id and new IP. |
98
+ | `proxyLocaleMatch` | Match `navigator` locale to the proxy IP country. |
99
+ | `proxyPreset` | Named preset (e.g. `"px_amazon01"`). Available presets are plan-dependent — ask Browserless support for your list. |
100
+ | `externalProxyServer` | Bring-your-own upstream, e.g. `http://user:pass@host:port`. Must be `http://` or `https://`. |
101
+
102
+ > **Note:** `proxyCountry` / `proxyState` / `proxyCity` / `proxySticky` / `proxyLocaleMatch` / `proxyPreset` require either `proxy: "residential"` or `externalProxyServer` to be set. The MCP rejects any request that sets one of these fields without `proxy` or `externalProxyServer` at validation time; without that check, the API would silently ignore them.
103
+
104
+ The `proxy` object is read once at session creation. To change it, call `close` and start a new session — the agent client keys sessions on the proxy fingerprint, so passing a different config will land on a fresh WebSocket.
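
The rules in the proxy table and note above can be sketched as a small validator. This is an illustration under stated assumptions, not the package's own validation code: `ProxyOptions`, `normalizeProxy`, and `proxyFingerprint` are hypothetical names, and the fingerprint format (sorted key/value pairs serialized as JSON) is a guess at one way sessions could be keyed on the proxy config.

```typescript
// Hypothetical shape; field names follow the README table.
interface ProxyOptions {
  proxy?: "residential";
  proxyCountry?: string;
  proxyState?: string;
  proxyCity?: string;
  proxySticky?: boolean;
  proxyLocaleMatch?: boolean;
  proxyPreset?: string;
  externalProxyServer?: string;
}

// Fields that only make sense when a proxy upstream is configured.
const DEPENDENT_FIELDS = [
  "proxyCountry", "proxyState", "proxyCity",
  "proxySticky", "proxyLocaleMatch", "proxyPreset",
] as const;

function normalizeProxy(input: ProxyOptions): ProxyOptions {
  const out: ProxyOptions = { ...input };

  // Bring-your-own upstream must be http:// or https://.
  if (out.externalProxyServer !== undefined &&
      !/^https?:\/\//i.test(out.externalProxyServer)) {
    throw new Error("externalProxyServer must start with http:// or https://");
  }

  // Reject dependent fields when neither upstream option is set.
  const hasUpstream =
    out.proxy === "residential" || out.externalProxyServer !== undefined;
  const orphaned = DEPENDENT_FIELDS.filter((key) => out[key] !== undefined);
  if (!hasUpstream && orphaned.length > 0) {
    throw new Error(`${orphaned.join(", ")} require proxy or externalProxyServer`);
  }

  // Two-letter country code, normalized to lowercase.
  if (out.proxyCountry !== undefined) {
    if (!/^[A-Za-z]{2}$/.test(out.proxyCountry)) {
      throw new Error("proxyCountry must be a two-letter code");
    }
    out.proxyCountry = out.proxyCountry.toLowerCase();
  }

  // State name with whitespace replaced by underscores.
  if (out.proxyState !== undefined) {
    out.proxyState = out.proxyState.trim().replace(/\s+/g, "_");
  }
  return out;
}

// Assumed fingerprint: order-independent serialization of the set fields.
function proxyFingerprint(options: ProxyOptions): string {
  const entries = Object.entries(options)
    .filter(([, value]) => value !== undefined)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return JSON.stringify(entries);
}
```

Two configs with the same fingerprint would map to the same session in this sketch; any differing field produces a different key, which matches the README's advice to `close` and start a new session when the proxy config changes.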
105
+
106
+ ## Configuration
107
+
108
+ The server is hosted at `https://mcp.browserless.io/mcp`. Authenticate via headers (preferred) or a `?token=` query parameter.
109
+
110
+ **Using headers** (recommended for clients that support them):
111
+
112
+ ```json
113
+ {
114
+ "mcpServers": {
115
+ "browserless": {
116
+ "url": "https://mcp.browserless.io/mcp",
117
+ "headers": {
118
+ "Authorization": "Bearer your-token-here"
119
+ }
120
+ }
121
+ }
122
+ }
123
+ ```
124
+
125
+ **Using URL query parameters** (for clients like Claude.ai custom connectors that only accept a URL):
126
+
127
+ ```text
128
+ https://mcp.browserless.io/mcp?token=your-token-here
129
+ ```
130
+
131
+ To connect to a specific Browserless regional endpoint, add the `x-browserless-api-url` header or the `browserlessUrl` query parameter:
132
+
133
+ ```json
134
+ {
135
+ "mcpServers": {
136
+ "browserless": {
137
+ "url": "https://mcp.browserless.io/mcp",
138
+ "headers": {
139
+ "Authorization": "Bearer your-token-here",
140
+ "x-browserless-api-url": "https://production-lon.browserless.io"
141
+ }
142
+ }
143
+ }
144
+ }
145
+ ```
146
+
147
+ ```text
148
+ https://mcp.browserless.io/mcp?token=your-token-here&browserlessUrl=https://production-lon.browserless.io
149
+ ```
150
+
151
+ When both headers and query parameters are present, headers take precedence.
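
The precedence rule above can be sketched as follows. `resolveAuth` and `ResolvedAuth` are hypothetical names, and the sketch assumes header keys arrive lowercased (as Node's HTTP server delivers them); only the `Authorization` and `x-browserless-api-url` headers and the `token` and `browserlessUrl` query parameters come from the README.

```typescript
// Hypothetical result shape for this sketch.
interface ResolvedAuth {
  token?: string;
  apiUrl?: string;
}

// Resolves credentials from a request, preferring headers over query parameters.
function resolveAuth(
  requestUrl: string,
  headers: Record<string, string | undefined>,
): ResolvedAuth {
  const query = new URL(requestUrl).searchParams;

  // Extract the token from "Authorization: Bearer <token>", if present.
  const bearer = headers["authorization"]?.match(/^Bearer\s+(.+)$/i)?.[1];

  return {
    token: bearer ?? query.get("token") ?? undefined,
    apiUrl:
      headers["x-browserless-api-url"] ?? query.get("browserlessUrl") ?? undefined,
  };
}
```

Each value is resolved independently here, so a request could take its token from a header and its regional endpoint from the query string. Whether the hosted server mixes sources this way is not stated in the README.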
152
+
153
+ ### Claude Desktop
154
+
155
+ Add to your `claude_desktop_config.json`:
156
+
157
+ ```json
158
+ {
159
+ "mcpServers": {
160
+ "browserless": {
161
+ "url": "https://mcp.browserless.io/mcp?token=your-token-here"
162
+ }
163
+ }
164
+ }
165
+ ```
166
+
167
+ ### Cursor
168
+
169
+ Add to your Cursor MCP settings:
170
+
171
+ ```json
172
+ {
173
+ "mcpServers": {
174
+ "browserless": {
175
+ "url": "https://mcp.browserless.io/mcp?token=your-token-here"
176
+ }
177
+ }
178
+ }
179
+ ```
180
+
181
+ ### VS Code
182
+
183
+ Add to your VS Code settings (`settings.json`):
184
+
185
+ ```json
186
+ {
187
+ "mcp": {
188
+ "servers": {
189
+ "browserless": {
190
+ "url": "https://mcp.browserless.io/mcp",
191
+ "headers": {
192
+ "Authorization": "Bearer your-token-here"
193
+ }
194
+ }
195
+ }
196
+ }
197
+ }
198
+ ```
199
+
200
+ ### Windsurf
201
+
202
+ Add to your Windsurf MCP configuration:
203
+
204
+ ```json
205
+ {
206
+ "mcpServers": {
207
+ "browserless": {
208
+ "url": "https://mcp.browserless.io/mcp?token=your-token-here"
209
+ }
210
+ }
211
+ }
212
+ ```
213
+
214
+ ## Self-Hosting
215
+
216
+ The server can also be run locally — useful for air-gapped deployments or pointing at a self-hosted Browserless instance. Clone this repo and build the Docker image:
217
+
218
+ ```bash
219
+ docker build -f docker/Dockerfile -t browserless-mcp .
220
+
221
+ docker run \
222
+ -e BROWSERLESS_TOKEN=your-token \
223
+ -e BROWSERLESS_API_URL=https://your-browserless-instance.example.com \
224
+ -p 8080:8080 \
225
+ browserless-mcp
226
+ ```
227
+
228
+ Then point your MCP client at `http://localhost:8080/mcp` using the same header/query-parameter auth as above.
229
+
230
+ ### Self-hosted environment variables
231
+
232
+ | Variable | Required | Default | Description |
233
+ | ------------------------- | -------- | --------------------------------------- | -------------------------------------------------- |
234
+ | `BROWSERLESS_TOKEN` | Yes | — | Your Browserless API token |
235
+ | `BROWSERLESS_API_URL` | No | `https://production-sfo.browserless.io` | API endpoint (for self-hosted Browserless) |
236
+ | `TRANSPORT` | No | `stdio` | Transport type: `stdio` or `httpStream` |
237
+ | `PORT` | No | `8080` | HTTP server port (only for `httpStream` transport) |
238
+ | `BROWSERLESS_TIMEOUT` | No | `30000` | Request timeout in milliseconds |
239
+ | `BROWSERLESS_MAX_RETRIES` | No | `3` | Max retry attempts for failed requests |
240
+ | `BROWSERLESS_CACHE_TTL` | No | `60000` | Cache TTL in milliseconds (0 to disable) |
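
For reference, the variables and defaults in the table above could be read as in the sketch below. `ServerConfig`, `loadConfig`, and `intFromEnv` are hypothetical names chosen for this example; the package's own `config.js` is not shown in this diff excerpt and may behave differently, for instance in how it treats invalid values.

```typescript
// Hypothetical config shape; defaults are taken from the README table.
interface ServerConfig {
  token: string;
  apiUrl: string;
  transport: "stdio" | "httpStream";
  port: number;
  timeoutMs: number;
  maxRetries: number;
  cacheTtlMs: number; // 0 disables the cache
}

type Env = Record<string, string | undefined>;

// Parses a non-negative integer variable, falling back when it is unset.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer`);
  }
  return value;
}

function loadConfig(env: Env): ServerConfig {
  const token = env["BROWSERLESS_TOKEN"];
  if (!token) throw new Error("BROWSERLESS_TOKEN is required");

  const transport = env["TRANSPORT"] ?? "stdio";
  if (transport !== "stdio" && transport !== "httpStream") {
    throw new Error("TRANSPORT must be stdio or httpStream");
  }

  return {
    token,
    apiUrl: env["BROWSERLESS_API_URL"] ?? "https://production-sfo.browserless.io",
    transport,
    port: intFromEnv(env, "PORT", 8080),
    timeoutMs: intFromEnv(env, "BROWSERLESS_TIMEOUT", 30000),
    maxRetries: intFromEnv(env, "BROWSERLESS_MAX_RETRIES", 3),
    cacheTtlMs: intFromEnv(env, "BROWSERLESS_CACHE_TTL", 60000),
  };
}
```

In a Node process this would typically be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test with a plain object.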
241
+
242
+ ## MCP Resources
243
+
244
+ | Resource URI | Description |
245
+ | ------------------------ | ------------------------------- |
246
+ | `browserless://api-docs` | Smart scraper API documentation |
247
+ | `browserless://status` | Live service health status |
248
+
249
+ ## MCP Prompts
250
+
251
+ | Prompt | Description |
252
+ | ----------------- | ------------------------------------------- |
253
+ | `scrape-url` | Scrape a webpage and summarize its content |
254
+ | `extract-content` | Extract specific information from a webpage |
255
+
256
+ ## Development
257
+
258
+ ```bash
259
+ npm install
260
+ npm run build
261
+ npm test
262
+ npm run coverage
263
+ ```
264
+
265
+ ### Tests
266
+
267
+ The test suite uses [Mocha](https://mochajs.org/) with [Chai](https://www.chaijs.com/) and [Sinon](https://sinonjs.org/). Specs live in `test/` (`test/lib/`, `test/tools/`, `test/prompts/`, `test/resources/`, `test/integration/`) and run against the compiled output in `build/`.
268
+
269
+ - `npm test` — compiles TypeScript and runs every `*.spec.js` under `build/test/`. No external services or `BROWSERLESS_TOKEN` are required; the API client is stubbed.
270
+ - `npm run coverage` — runs the suite under [c8](https://github.com/bcoe/c8) with the thresholds configured in `package.json` (lines ≥ 80%, branches ≥ 70%, functions ≥ 80%).
271
+
272
+ Tests run automatically on every pull request via the [Test workflow](.github/workflows/test.yml) on Node 24. PRs must keep the suite green before they can merge.
273
+
274
+ ## API Token
275
+
276
+ Get your API token at [browserless.io](https://browserless.io). The token authenticates all requests to the Browserless API.
277
+
278
+ ## License
279
+
280
+ SSPL-1.0
package/bin/cli.js ADDED
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env node
2
+ import '../build/src/index.js';