@mindstone/mcp-server-browser-automation 0.1.7

package/LICENSE ADDED
@@ -0,0 +1,97 @@
1
+ # Functional Source License, Version 1.1, MIT Future License
2
+
3
+ ## Abbreviation
4
+
5
+ FSL-1.1-MIT
6
+
7
+ ## Notice
8
+
9
+ Copyright 2026 Mindstone Learning Limited
10
+
11
+ ## Terms and Conditions
12
+
13
+ ### Licensor ("We")
14
+
15
+ The party offering the Software under these Terms and Conditions.
16
+
17
+ **Licensor**: Mindstone Learning Limited
18
+
19
+ ### The Software
20
+
21
+ The "Software" is each version of the software that we make available under
22
+ these Terms and Conditions, as indicated by our inclusion of these Terms and
23
+ Conditions with the Software.
24
+
25
+ **Software**: Browser Automation MCP Server
26
+
27
+ ### License Grant
28
+
29
+ Subject to your compliance with this License Grant and the Patents,
30
+ Redistribution and Trademark clauses below, we hereby grant you the right to
31
+ use, copy, modify, create derivative works, publicly perform, publicly display
32
+ and redistribute the Software for any Permitted Purpose identified below.
33
+
34
+ ### Permitted Purpose
35
+
36
+ A Permitted Purpose is any purpose other than a Competing Use. A "Competing
37
+ Use" means making the Software available to third parties as a commercial
38
+ hosted service that directly competes with any product or service provided by
39
+ the Licensor.
40
+
41
+ ### Patents
42
+
43
+ To the extent your use for a Permitted Purpose would necessarily infringe our
44
+ patents, the license grant above includes a license under our patents. If you
45
+ make a claim against any party that the Software infringes or contributes to
46
+ the infringement of any patent, then your patent license to the Software ends
47
+ immediately.
48
+
49
+ ### Redistribution
50
+
51
+ The Terms and Conditions apply to all copies, modifications and derivatives of
52
+ the Software.
53
+
54
+ If you redistribute any copies, modifications or derivatives of the Software,
55
+ you must include a copy of or a link to these Terms and Conditions and not
56
+ remove any copyright notices provided in or with the Software.
57
+
58
+ ### Disclaimer
59
+
60
+ THE SOFTWARE IS PROVIDED "AS IS" AND WITHOUT WARRANTIES OF ANY KIND, EXPRESS OR
61
+ IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR
62
+ PURPOSE, MERCHANTABILITY, TITLE OR NON-INFRINGEMENT.
63
+
64
+ IN NO EVENT WILL WE HAVE ANY LIABILITY TO YOU ARISING OUT OF OR RELATED TO THE
65
+ SOFTWARE, INCLUDING INDIRECT, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES, OF
66
+ ANY CHARACTER INCLUDING DAMAGES FOR LOSS OF GOODWILL, LOST PROFITS, LOST SALES
67
+ OR BUSINESS, WORK STOPPAGE, COMPUTER FAILURE OR MALFUNCTION, LOST CONTENT,
68
+ DATA OR DATA USE, BREACH OF DUTY OF GOOD FAITH, OR ANY AND ALL OTHER DAMAGES
69
+ OR LOSSES OF ANY KIND OR NATURE WHATSOEVER (WHETHER DIRECT, INDIRECT, SPECIAL,
70
+ COLLATERAL, INCIDENTAL, CONSEQUENTIAL OR OTHERWISE) ARISING OUT OF OR IN
71
+ CONNECTION WITH THE SOFTWARE OR THIS LICENSE, EVEN IF SUCH PARTY SHALL HAVE
72
+ BEEN INFORMED OF THE POSSIBILITY OF SUCH DAMAGES.
73
+
74
+ ### Trademark
75
+
76
+ Except for displaying the License Details and identifying us as the origin of
77
+ the Software, you have no right under these Terms and Conditions to use our
78
+ trademarks, trade names, service marks or product names.
79
+
80
+ ## Change Date
81
+
82
+ Four years from the date the Software is made available under these Terms and
83
+ Conditions: **2030-04-08**
84
+
85
+ ## Change License
86
+
87
+ MIT License
88
+
89
+ ## License Details
90
+
91
+ | Parameter | Value |
92
+ |---|---|
93
+ | Licensor | Mindstone Learning Limited |
94
+ | Software | Browser Automation MCP Server |
95
+ | Use Limitation | Competing Use |
96
+ | Change Date | 2030-04-08 |
97
+ | Change License | MIT |
package/README.md ADDED
@@ -0,0 +1,134 @@
1
+ # Browser Automation MCP Server
2
+
3
+ Headless browser control via accessibility snapshots — navigate pages, fill forms, click elements, take screenshots, and manage tabs using the [agent-browser](https://www.npmjs.com/package/agent-browser) CLI.
4
+
5
+ ## Installation
6
+
7
+ ```bash
8
+ npx -y @mindstone/mcp-server-browser-automation
9
+ ```
10
+
11
+ Or install globally:
12
+
13
+ ```bash
14
+ npm install -g @mindstone/mcp-server-browser-automation
15
+ mcp-server-browser-automation
16
+ ```
17
+
18
+ ## Requirements
19
+
20
+ This server requires the `agent-browser` CLI binary to control the browser.
21
+
22
+ ### Binary Resolution
23
+
24
+ 1. **PATH lookup** (preferred): If `agent-browser` is on your PATH, it is used directly.
25
+ 2. **npx fallback**: If the binary is not found, the server automatically falls back to `npx -y agent-browser@0.26.0` (the version pinned as `NPX_FALLBACK_VERSION` in `browser-client.js`).
26
+
27
+ ### Installing agent-browser
28
+
29
+ ```bash
30
+ npm install -g agent-browser
31
+ ```
32
+
33
+ Or let the npx fallback handle it automatically (slower on first use due to download).
34
+
35
+ ## Configuration
36
+
37
+ No API keys or credentials are required. The server communicates with the browser via the agent-browser CLI.
38
+
39
+ | Variable | Required | Description |
40
+ |---|---|---|
41
+ | `AGENT_BROWSER_SESSION_NAME` | No | Session name for browser persistence (default: `mcp`) |
42
+ | `BROWSER_AUTOMATION_ALLOW_EVAL` | No | Set to `1` to register the `browser_evaluate` tool. Off by default. See [Security considerations](#security-considerations). |
43
+
44
+ ### MCP Host Configuration
45
+
46
+ ```json
47
+ {
48
+ "mcpServers": {
49
+ "browser-automation": {
50
+ "command": "npx",
51
+ "args": ["-y", "@mindstone/mcp-server-browser-automation"]
52
+ }
53
+ }
54
+ }
55
+ ```
56
+
57
+ ## Available Tools (17 by default; +1 when `BROWSER_AUTOMATION_ALLOW_EVAL=1`)
58
+
59
+ ### Navigation
60
+ - **browser_navigate** — Navigate to a URL
61
+ - **browser_back** — Navigate back in browser history
62
+ - **browser_forward** — Navigate forward in browser history
63
+ - **browser_wait** — Wait for an element to appear or a specified time
64
+
65
+ ### Observation
66
+ - **browser_snapshot** — Get the page accessibility tree with interactive element references
67
+ - **browser_screenshot** — Take a screenshot of the current page
68
+ - **browser_get_page_info** — Get the current page URL and title
69
+
70
+ ### Interaction
71
+ - **browser_click** — Click an element using @ref or CSS selector
72
+ - **browser_fill** — Clear a field and fill it with text
73
+ - **browser_type** — Type text character by character (real keystrokes)
74
+ - **browser_press_key** — Press a keyboard key
75
+ - **browser_scroll** — Scroll the page in a direction
76
+ - **browser_select** — Select an option from a dropdown
77
+ - **browser_hover** — Hover over an element
78
+ - **browser_evaluate** — Execute JavaScript in the page context (gated; see [Security considerations](#security-considerations))
79
+
80
+ ### Session Management
81
+ - **browser_tabs** — List open tabs or switch to a tab
82
+ - **browser_close** — Close the browser session
83
+ - **browser_authenticate** — Open a visible browser for manual login
84
+
85
+ ## Workflow
86
+
87
+ The typical workflow uses accessibility snapshots for reliable element targeting:
88
+
89
+ 1. `browser_navigate` → open a page
90
+ 2. `browser_snapshot` → see interactive elements with @ref IDs
91
+ 3. `browser_click` / `browser_fill` → interact using @ref references
92
+ 4. `browser_screenshot` → visual verification
93
+
94
+ ## Security considerations
95
+
96
+ Browser automation has a large attack surface: the agent-browser CLI controls a real headless browser that loads URLs you pass it, runs page-side JavaScript, and persists cookies and session state across runs. Read this section before deploying.
97
+
98
+ ### `browser_evaluate` is gated behind `BROWSER_AUTOMATION_ALLOW_EVAL`
99
+
100
+ `browser_evaluate` lets the model execute arbitrary JavaScript inside the page context — the security equivalent of giving the model a shell on whatever site it has just navigated to. To prevent prompt-injected content from doing this silently, the tool is **only registered when the host explicitly opts in**:
101
+
102
+ ```bash
103
+ BROWSER_AUTOMATION_ALLOW_EVAL=1 mcp-server-browser-automation
104
+ ```
105
+
106
+ Without this env var, `browser_evaluate` is **not** in the tools list at all — the LLM cannot even see it. When enabled, the tool is annotated `destructiveHint: true` so MCP hosts can (and should) require explicit user confirmation before each invocation.
107
+
108
+ ### URL scheme allow-list
109
+
110
+ `browser_navigate` and `browser_authenticate` accept only `http:` and `https:` URLs (plus the special `about:blank`). Every other scheme is refused before the underlying `agent-browser` CLI is invoked, including:
111
+
112
+ - `file:` — would let pages read local filesystem paths
113
+ - `chrome:` and `chrome-extension:` — internal browser pages and installed extensions
114
+ - `javascript:` — equivalent to `eval()` against the current document
115
+ - `data:` — inlined attacker-controlled HTML/JS payloads
116
+ - `view-source:` — defeats the same-origin policy on rendered content
117
+ - `about:` — privileged internal pages (`about:config`, `about:cache`, `about:debugging`, …); only `about:blank` is permitted
118
+
119
+ ### Cookie and session persistence
120
+
121
+ The connector tells `agent-browser` to use a **named, persistent session** via `AGENT_BROWSER_SESSION_NAME` (default value: `mcp`). All cookies, `localStorage` data, and any logins performed via `browser_authenticate` are stored on disk under that session name and reused across runs. Anyone who can read the session storage — the local user, other tools running as the same user, or backups — can also use those logged-in sessions.
122
+
123
+ To override the session name (for example, to keep separate profiles per project) set `AGENT_BROWSER_SESSION_NAME` explicitly in the host's MCP server config. To wipe state, close the browser via `browser_close` and remove the session directory managed by `agent-browser`.
124
+
125
+ ### Recommended deployment posture
126
+
127
+ - **Run the connector against a separate browser profile** — a dedicated `AGENT_BROWSER_SESSION_NAME` per MCP host. Do not reuse your daily browser profile: the connector reads and overwrites cookies in whichever profile it is pointed at, and a malicious page can ride the existing session of any site you are logged into.
128
+ - **Leave `browser_evaluate` disabled** unless the host implements user confirmation for every call. The default (off) is the safe choice.
129
+ - **Require host confirmation** for `browser_authenticate` and any flow that may navigate to authenticated sites — otherwise prompt injection in fetched content can drive the browser at sites the user is logged into.
130
+ - **Treat returned page content as untrusted** — accessibility snapshots, screenshots, and JavaScript-evaluation outputs come from arbitrary websites and may contain prompt-injection attempts.
131
+
132
+ ## License
133
+
134
+ FSL-1.1-MIT
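The URL scheme rule from the README's security section can be sketched as follows. This is a minimal illustration only: the package's actual check is `validateUrlScheme` in `utils.ts`, whose implementation is not shown in this diff, so the name `isAllowedUrl` and the parsing approach (the WHATWG `URL` class) are assumptions.

```javascript
// Hypothetical helper, NOT the package's validateUrlScheme.
// Assumption: parsing with the WHATWG URL class is an acceptable way to
// read the scheme; the real implementation may differ.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']);

function isAllowedUrl(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl); // throws on input that is not an absolute URL
  } catch {
    return false;
  }
  // about:blank is the single special case the README permits.
  if (parsed.href === 'about:blank') return true;
  // Everything else must be http: or https:.
  return ALLOWED_PROTOCOLS.has(parsed.protocol);
}

for (const url of ['https://example.com', 'about:blank', 'about:config', 'file:///etc/passwd']) {
  console.log(url, isAllowedUrl(url));
}
```

Checking membership in a small allowed set, rather than enumerating forbidden schemes, means an unfamiliar scheme is refused by default.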
package/dist/browser-client.d.ts ADDED
@@ -0,0 +1,33 @@
1
+ export interface ExecResult {
2
+ stdout: string;
3
+ stderr: string;
4
+ }
5
+ export interface ExecOptions {
6
+ timeoutMs?: number;
7
+ headed?: boolean;
8
+ }
9
+ /**
10
+ * Execute an agent-browser CLI command.
11
+ *
12
+ * Argument shape: `agent-browser <command> [args] [options]`. The CLI parses
13
+ * the FIRST positional as the command, so flags like `--headed` MUST come
14
+ * AFTER the command — putting them first makes the CLI report
15
+ * "Unknown command: --headed" and exit 1.
16
+ *
17
+ * Visibility default is HEADED — users see the browser window so they can
18
+ * watch what the agent is doing (the trust-by-transparency choice). Hosts
19
+ * that want quiet operation set `AGENT_BROWSER_SHOW_WINDOW=false`. Callers
20
+ * can override per-call with `options.headed`. There is no `--headless` flag
21
+ * on the CLI — passing one would be a CLI error — so headless is the absence
22
+ * of `--headed`.
23
+ *
24
+ * Falls back to `npx -y agent-browser@<NPX_FALLBACK_VERSION>` if the binary is
25
+ * not on PATH. Uses execFile (no shell) to prevent command injection.
26
+ */
27
+ export declare function execAgentBrowser(args: string[], options?: ExecOptions): Promise<ExecResult>;
28
+ /**
29
+ * Reset the resolved binary cache.
30
+ * Primarily used for testing to reset state between test runs.
31
+ */
32
+ export declare function resetBinaryCache(): void;
33
+ //# sourceMappingURL=browser-client.d.ts.map
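The visibility and argument-ordering rules in the doc comment above can be sketched in isolation. The `resolveHeaded` logic follows the description in this package; `buildArgs` is an illustrative name for the injection step and is not an export of the package.

```javascript
// Explicit option wins; otherwise AGENT_BROWSER_SHOW_WINDOW decides;
// anything other than 'false' / '0' (including unset) means headed.
function resolveHeaded(optionHeaded, env) {
  if (optionHeaded !== undefined) return optionHeaded;
  const raw = env.AGENT_BROWSER_SHOW_WINDOW?.trim().toLowerCase();
  return !(raw === 'false' || raw === '0');
}

// Illustrative helper: place --headed AFTER the command name, because the
// CLI treats the first positional argument as the command.
function buildArgs(args, options = {}, env = {}) {
  if (resolveHeaded(options.headed, env) && args.length > 0) {
    return [args[0], '--headed', ...args.slice(1)];
  }
  return [...args]; // headless is simply the absence of --headed
}

console.log(buildArgs(['open', 'https://example.com']));
// [ 'open', '--headed', 'https://example.com' ]
console.log(buildArgs(['open', 'https://example.com'], {}, { AGENT_BROWSER_SHOW_WINDOW: 'false' }));
// [ 'open', 'https://example.com' ]
```

A per-call `options.headed` overrides the environment in both directions, which is how a caller can force a visible window even when the host has requested quiet operation.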
package/dist/browser-client.js ADDED
@@ -0,0 +1,138 @@
1
+ import { execFile } from 'node:child_process';
2
+ import { promisify } from 'node:util';
3
+ import { ConnectorError, DEFAULT_TIMEOUT_MS, SESSION_NAME } from './types.js';
4
+ const execFileAsync = promisify(execFile);
5
+ let resolvedBinary = null;
6
+ function resolveAgentBrowser() {
7
+ if (resolvedBinary)
8
+ return resolvedBinary;
9
+ // Default to the binary name — execFile will search PATH.
10
+ // If not found (ENOENT), the caller falls back to npx.
11
+ resolvedBinary = 'agent-browser';
12
+ return resolvedBinary;
13
+ }
14
+ function buildEnv() {
15
+ const env = { ...process.env };
16
+ // Always use session persistence
17
+ if (!env.AGENT_BROWSER_SESSION_NAME) {
18
+ env.AGENT_BROWSER_SESSION_NAME = SESSION_NAME;
19
+ }
20
+ return env;
21
+ }
22
+ /**
23
+ * Resolve whether the browser window should be visible for this invocation.
24
+ *
25
+ * Resolution order (highest precedence first):
26
+ * 1. Explicit `options.headed` from the caller (true → headed, false → headless).
27
+ * Used by `browser_authenticate` and any future caller that wants to
28
+ * override the user's preference for a specific operation.
29
+ * 2. The `AGENT_BROWSER_SHOW_WINDOW` env var, set by the host application
30
+ * from the user's connector setupField:
31
+ * - 'false' / '0' → headless (work out of sight)
32
+ * - 'true' / '1' / unset → headed (visible window)
33
+ *
34
+ * The visible default is deliberate: showing the browser builds user trust by
35
+ * letting them watch what the agent is doing. Hosts (or power users) who
36
+ * prefer the quieter behaviour can opt out by setting the env var to 'false'.
37
+ */
38
+ function resolveHeaded(optionHeaded, env) {
39
+ if (optionHeaded !== undefined)
40
+ return optionHeaded;
41
+ const raw = env.AGENT_BROWSER_SHOW_WINDOW?.trim().toLowerCase();
42
+ if (raw === 'false' || raw === '0')
43
+ return false;
44
+ return true;
45
+ }
46
+ /**
47
+ * Pinned version of agent-browser used by the npx fallback.
48
+ *
49
+ * Why pinned: keeps fallback behavior reproducible. Bump when verified against
50
+ * a newer release. Do not use `latest` — npx caches by spec, and an unpinned
51
+ * spec produces flaky behavior across machines.
52
+ */
53
+ const NPX_FALLBACK_VERSION = '0.26.0';
54
+ /**
55
+ * Execute an agent-browser CLI command.
56
+ *
57
+ * Argument shape: `agent-browser <command> [args] [options]`. The CLI parses
58
+ * the FIRST positional as the command, so flags like `--headed` MUST come
59
+ * AFTER the command — putting them first makes the CLI report
60
+ * "Unknown command: --headed" and exit 1.
61
+ *
62
+ * Visibility default is HEADED — users see the browser window so they can
63
+ * watch what the agent is doing (the trust-by-transparency choice). Hosts
64
+ * that want quiet operation set `AGENT_BROWSER_SHOW_WINDOW=false`. Callers
65
+ * can override per-call with `options.headed`. There is no `--headless` flag
66
+ * on the CLI — passing one would be a CLI error — so headless is the absence
67
+ * of `--headed`.
68
+ *
69
+ * Falls back to `npx -y agent-browser@<NPX_FALLBACK_VERSION>` if the binary is
70
+ * not on PATH. Uses execFile (no shell) to prevent command injection.
71
+ */
72
+ export async function execAgentBrowser(args, options) {
73
+ const timeoutMs = options?.timeoutMs ?? DEFAULT_TIMEOUT_MS;
74
+ const env = buildEnv();
75
+ // Inject --headed AFTER the command (positional index 1). The CLI parses
76
+ // the first positional as the command name, so flags must follow it.
77
+ // Headless is the absence of --headed; the CLI has no --headless flag.
78
+ if (resolveHeaded(options?.headed, env) && args.length > 0) {
79
+ args = [args[0], '--headed', ...args.slice(1)];
80
+ }
81
+ const binary = resolveAgentBrowser();
82
+ try {
83
+ // execFile is safe against command injection (no shell interpretation)
84
+ const result = await execFileAsync(binary, args, {
85
+ env,
86
+ timeout: timeoutMs,
87
+ maxBuffer: 10 * 1024 * 1024, // 10MB for large snapshots
88
+ });
89
+ return { stdout: result.stdout, stderr: result.stderr ?? '' };
90
+ }
91
+ catch (error) {
92
+ const err = error;
93
+ // Binary not found on PATH — try npx fallback (pulls a pinned version
94
+ // from the npm cache / registry).
95
+ if (err.code === 'ENOENT') {
96
+ try {
97
+ const npxResult = await execFileAsync('npx', ['-y', `agent-browser@${NPX_FALLBACK_VERSION}`, ...args], {
98
+ env,
99
+ timeout: timeoutMs + 15_000, // extra time for npx install
100
+ maxBuffer: 10 * 1024 * 1024,
101
+ });
102
+ return { stdout: npxResult.stdout, stderr: npxResult.stderr ?? '' };
103
+ }
104
+ catch (npxError) {
105
+ const npxErr = npxError;
106
+ // Distinguish: npx itself missing (true binary-not-found) vs
107
+ // agent-browser ran but returned non-zero (CLI error surfaced via npx).
108
+ if (npxErr.code === 'ENOENT') {
109
+ throw new ConnectorError(`agent-browser binary not found on PATH and npx is also unavailable: ${npxErr.message ?? String(npxErr)}`, 'BINARY_NOT_FOUND', 'Install agent-browser: npm install -g agent-browser\n' +
110
+ 'Or ensure npx is available on PATH.');
111
+ }
112
+ // npx ran but the underlying CLI exited non-zero — propagate as CLI_ERROR
113
+ // with the actual stderr for diagnosis.
114
+ const npxStderr = npxErr.stderr?.trim() ?? '';
115
+ const npxStdout = npxErr.stdout?.trim() ?? '';
116
+ throw new ConnectorError(npxStderr || npxStdout || npxErr.message || String(npxError), 'CLI_ERROR', 'The agent-browser CLI command failed (via npx fallback). ' +
117
+ 'Check the error details above. ' +
118
+ 'For best performance, install agent-browser globally: npm install -g agent-browser');
119
+ }
120
+ }
121
+ // Timeout (child process killed) or output larger than maxBuffer; both cases are reported as TIMEOUT
122
+ if (err.code === 'ERR_CHILD_PROCESS_STDIO_MAXBUFFER' || err.killed) {
123
+ throw new ConnectorError(`Command timed out after ${timeoutMs}ms: agent-browser ${args.join(' ')}`, 'TIMEOUT', 'The browser operation took too long. Try a simpler action or increase the timeout.');
124
+ }
125
+ // Other errors — include stderr for diagnostics
126
+ const stderr = err.stderr?.trim() ?? '';
127
+ const stdout = err.stdout?.trim() ?? '';
128
+ throw new ConnectorError(stderr || stdout || err.message || String(error), 'CLI_ERROR', 'The agent-browser CLI command failed. Check that agent-browser is installed and the browser session is active.');
129
+ }
130
+ }
131
+ /**
132
+ * Reset the resolved binary cache.
133
+ * Primarily used for testing to reset state between test runs.
134
+ */
135
+ export function resetBinaryCache() {
136
+ resolvedBinary = null;
137
+ }
138
+ //# sourceMappingURL=browser-client.js.map
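The error branches in `execAgentBrowser` report an output overflow and a timeout under the same `TIMEOUT` code. A caller that wants to tell them apart could classify the raw `execFile` error along these lines. This is a sketch under assumptions: `classifyExecError` and the `OUTPUT_TOO_LARGE` label are hypothetical and do not exist in the package, and `killed` is treated as a timeout only because the source above does the same.

```javascript
// Hypothetical classifier for errors thrown by a promisified execFile call.
// Only BINARY_NOT_FOUND, TIMEOUT and CLI_ERROR are codes used by the package;
// OUTPUT_TOO_LARGE is invented here for illustration.
function classifyExecError(err) {
  if (err.code === 'ENOENT') return 'BINARY_NOT_FOUND';
  // Test the maxBuffer code before `killed`: Node typically kills the child
  // on overflow as well, so `killed` alone cannot separate the two cases.
  if (err.code === 'ERR_CHILD_PROCESS_STDIO_MAXBUFFER') return 'OUTPUT_TOO_LARGE';
  if (err.killed) return 'TIMEOUT';
  return 'CLI_ERROR';
}

console.log(classifyExecError({ code: 'ENOENT' }));
console.log(classifyExecError({ code: 'ERR_CHILD_PROCESS_STDIO_MAXBUFFER', killed: true }));
console.log(classifyExecError({ killed: true, signal: 'SIGTERM' }));
console.log(classifyExecError({ code: 1, stderr: 'Unknown command: --headed' }));
```

Separating the two cases would let the hint text suggest a smaller snapshot for an overflow instead of a longer timeout.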
package/dist/index.d.ts ADDED
@@ -0,0 +1,17 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * Browser Automation MCP Server
4
+ *
5
+ * Provides headless browser automation via the agent-browser CLI.
6
+ * Uses accessibility snapshots (@ref pointers) instead of fragile CSS selectors.
7
+ * Sessions persist automatically between invocations.
8
+ *
9
+ * Requirements:
10
+ * - agent-browser CLI binary on PATH, or npx available for fallback
11
+ *
12
+ * Environment variables:
13
+ * - AGENT_BROWSER_SESSION_NAME: Session name for persistence (default: "mcp")
14
+ * - MCP_DISABLE_GRACEFUL_FS=1: Disable the graceful-fs EMFILE mitigation patch
15
+ */
16
+ import './installGracefulFs.js';
17
+ //# sourceMappingURL=index.d.ts.map
package/dist/index.js ADDED
@@ -0,0 +1,31 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * Browser Automation MCP Server
4
+ *
5
+ * Provides headless browser automation via the agent-browser CLI.
6
+ * Uses accessibility snapshots (@ref pointers) instead of fragile CSS selectors.
7
+ * Sessions persist automatically between invocations.
8
+ *
9
+ * Requirements:
10
+ * - agent-browser CLI binary on PATH, or npx available for fallback
11
+ *
12
+ * Environment variables:
13
+ * - AGENT_BROWSER_SESSION_NAME: Session name for persistence (default: "mcp")
14
+ * - MCP_DISABLE_GRACEFUL_FS=1: Disable the graceful-fs EMFILE mitigation patch
15
+ */
16
+ // MUST be the very first import — installs the graceful-fs EMFILE mitigation
17
+ // before any other module touches node:fs.
18
+ import './installGracefulFs.js';
19
+ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
20
+ import { createServer } from './server.js';
21
+ async function main() {
22
+ const server = createServer();
23
+ const transport = new StdioServerTransport();
24
+ await server.connect(transport);
25
+ console.error('Browser Automation MCP server running on stdio');
26
+ }
27
+ main().catch((error) => {
28
+ console.error('Fatal error:', error);
29
+ process.exit(1);
30
+ });
31
+ //# sourceMappingURL=index.js.map
package/dist/installGracefulFs.d.ts ADDED
@@ -0,0 +1,20 @@
1
+ /**
2
+ * Boot-time graceful-fs install (leaf module).
3
+ *
4
+ * The browser-automation MCP server runs as a Node child process spawned by
5
+ * its host (e.g. via `npx`). It has its own `fs` surface and needs its own
6
+ * `graceful-fs.gracefulify(fs)` call to mitigate EMFILE / ENFILE bursts —
7
+ * notably on Windows where the default file descriptor / handle ceiling is
8
+ * tight and long-running browser-automation sessions can exhaust it.
9
+ *
10
+ * Imported as the very first statement of `index.ts` so the patch is
11
+ * installed before any other module touches `node:fs`.
12
+ *
13
+ * Kill switch: set `MCP_DISABLE_GRACEFUL_FS=1` to disable the patch.
14
+ *
15
+ * Failure handling: stash on `globalThis.__MCP_BOOTSTRAP_ERROR__` so future
16
+ * observability hooks can surface it. With `MCP_DEBUG_BOOTSTRAP=1` the
17
+ * failure also logs to stderr.
18
+ */
19
+ export {};
20
+ //# sourceMappingURL=installGracefulFs.d.ts.map
package/dist/installGracefulFs.js ADDED
@@ -0,0 +1,45 @@
1
+ /**
2
+ * Boot-time graceful-fs install (leaf module).
3
+ *
4
+ * The browser-automation MCP server runs as a Node child process spawned by
5
+ * its host (e.g. via `npx`). It has its own `fs` surface and needs its own
6
+ * `graceful-fs.gracefulify(fs)` call to mitigate EMFILE / ENFILE bursts —
7
+ * notably on Windows where the default file descriptor / handle ceiling is
8
+ * tight and long-running browser-automation sessions can exhaust it.
9
+ *
10
+ * Imported as the very first statement of `index.ts` so the patch is
11
+ * installed before any other module touches `node:fs`.
12
+ *
13
+ * Kill switch: set `MCP_DISABLE_GRACEFUL_FS=1` to disable the patch.
14
+ *
15
+ * Failure handling: stash on `globalThis.__MCP_BOOTSTRAP_ERROR__` so future
16
+ * observability hooks can surface it. With `MCP_DEBUG_BOOTSTRAP=1` the
17
+ * failure also logs to stderr.
18
+ */
19
+ import { createRequire } from 'node:module';
20
+ if (process.env.MCP_DISABLE_GRACEFUL_FS !== '1') {
21
+ try {
22
+ // CommonJS interop — graceful-fs is a CJS package.
23
+ const requireFn = createRequire(import.meta.url);
24
+ const gracefulFs = requireFn('graceful-fs');
25
+ const fs = requireFn('node:fs');
26
+ gracefulFs.gracefulify(fs); // idempotent
27
+ }
28
+ catch (e) {
29
+ const g = globalThis;
30
+ g.__MCP_BOOTSTRAP_ERROR__ = {
31
+ kind: 'graceful_fs_leaf_install_failed',
32
+ error: {
33
+ name: e?.name,
34
+ message: e?.message,
35
+ stack: e?.stack,
36
+ },
37
+ at: Date.now(),
38
+ };
39
+ if (process.env.MCP_DEBUG_BOOTSTRAP === '1') {
40
+ // eslint-disable-next-line no-console
41
+ console.warn('[installGracefulFs] failed:', e);
42
+ }
43
+ }
44
+ }
45
+ //# sourceMappingURL=installGracefulFs.js.map
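The failure record stashed on `globalThis.__MCP_BOOTSTRAP_ERROR__` is meant for later observability hooks. One way such a hook might read it is sketched below. `describeBootstrapError` is a hypothetical helper, not part of the package; it only assumes the record shape written by the module above (`kind`, `error`, `at`).

```javascript
// Hypothetical reader for the record written by installGracefulFs.js.
// Accepts the global object as a parameter so it can be exercised with a stub.
function describeBootstrapError(g = globalThis) {
  const record = g.__MCP_BOOTSTRAP_ERROR__;
  if (!record) return null; // the patch installed cleanly or was disabled
  const when = new Date(record.at).toISOString();
  const message = record.error?.message ?? 'unknown error';
  return `${record.kind} at ${when}: ${message}`;
}

// Stubbed example rather than a real failure.
console.log(describeBootstrapError({
  __MCP_BOOTSTRAP_ERROR__: {
    kind: 'graceful_fs_leaf_install_failed',
    error: { name: 'Error', message: "Cannot find module 'graceful-fs'" },
    at: Date.now(),
  },
}));
```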
package/dist/server.d.ts ADDED
@@ -0,0 +1,3 @@
1
+ import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
2
+ export declare function createServer(): McpServer;
3
+ //# sourceMappingURL=server.d.ts.map
package/dist/server.js ADDED
@@ -0,0 +1,15 @@
1
+ import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
2
+ import { SERVER_NAME, SERVER_VERSION } from './types.js';
3
+ import { registerNavigationTools, registerInteractionTools, registerObservationTools, registerSessionTools, } from './tools/index.js';
4
+ export function createServer() {
5
+ const server = new McpServer({
6
+ name: SERVER_NAME,
7
+ version: SERVER_VERSION,
8
+ });
9
+ registerNavigationTools(server);
10
+ registerInteractionTools(server);
11
+ registerObservationTools(server);
12
+ registerSessionTools(server);
13
+ return server;
14
+ }
15
+ //# sourceMappingURL=server.js.map
package/dist/tools/index.d.ts ADDED
@@ -0,0 +1,5 @@
1
+ export { registerNavigationTools } from './navigation.js';
2
+ export { registerInteractionTools } from './interaction.js';
3
+ export { registerObservationTools } from './observation.js';
4
+ export { registerSessionTools } from './session.js';
5
+ //# sourceMappingURL=index.d.ts.map
package/dist/tools/index.js ADDED
@@ -0,0 +1,5 @@
1
+ export { registerNavigationTools } from './navigation.js';
2
+ export { registerInteractionTools } from './interaction.js';
3
+ export { registerObservationTools } from './observation.js';
4
+ export { registerSessionTools } from './session.js';
5
+ //# sourceMappingURL=index.js.map
package/dist/tools/interaction.d.ts ADDED
@@ -0,0 +1,3 @@
1
+ import type { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
2
+ export declare function registerInteractionTools(server: McpServer): void;
3
+ //# sourceMappingURL=interaction.d.ts.map
package/dist/tools/interaction.js ADDED
@@ -0,0 +1,149 @@
1
+ import { z } from 'zod';
2
+ import { execAgentBrowser } from '../browser-client.js';
3
+ import { withErrorHandling } from '../utils.js';
4
+ export function registerInteractionTools(server) {
5
+ server.registerTool('browser_click', {
6
+ description: `Click an element. Use @ref from browser_snapshot (preferred) or a CSS selector.
7
+
8
+ WORKFLOW: browser_snapshot → find @ref → browser_click @ref`,
9
+ inputSchema: {
10
+ ref: z.string().describe('Element ref from snapshot (e.g., "@e2") or CSS selector'),
11
+ },
12
+ annotations: {
13
+ readOnlyHint: false,
14
+ destructiveHint: false,
15
+ idempotentHint: false,
16
+ openWorldHint: true,
17
+ },
18
+ }, withErrorHandling(async (args) => {
19
+ await execAgentBrowser(['click', args.ref]);
20
+ return JSON.stringify({ ok: true, message: `Clicked: ${args.ref}` });
21
+ }));
22
+ server.registerTool('browser_fill', {
23
+ description: `Clear a field and fill it with text. Use @ref from browser_snapshot.
24
+
25
+ WORKFLOW: browser_snapshot → find input @ref → browser_fill`,
26
+ inputSchema: {
27
+ ref: z.string().describe('Element ref (e.g., "@e3") or CSS selector'),
28
+ value: z.string().describe('Text to fill'),
29
+ },
30
+ annotations: {
31
+ readOnlyHint: false,
32
+ destructiveHint: false,
33
+ idempotentHint: false,
34
+ openWorldHint: true,
35
+ },
36
+ }, withErrorHandling(async (args) => {
37
+ await execAgentBrowser(['fill', args.ref, args.value]);
38
+ return JSON.stringify({ ok: true, message: `Filled ${args.ref} with ${args.value.length} characters` });
39
+ }));
40
+ server.registerTool('browser_type', {
41
+ description: 'Type text character by character (simulates real keystrokes). Useful for search boxes and autocompletes that respond to individual key events.',
42
+ inputSchema: {
43
+ ref: z.string().describe('Element ref or CSS selector'),
44
+ text: z.string().describe('Text to type'),
45
+ },
46
+ annotations: {
47
+ readOnlyHint: false,
48
+ destructiveHint: false,
49
+ idempotentHint: false,
50
+ openWorldHint: true,
51
+ },
52
+ }, withErrorHandling(async (args) => {
53
+ await execAgentBrowser(['type', args.ref, args.text]);
54
+ return JSON.stringify({ ok: true, message: `Typed ${args.text.length} characters into ${args.ref}` });
55
+ }));
56
+ server.registerTool('browser_press_key', {
57
+ description: 'Press a keyboard key. Common keys: Enter, Tab, Escape, Backspace, ArrowDown, ArrowUp.',
58
+ inputSchema: {
59
+ key: z.string().describe('Key to press (e.g., "Enter", "Tab", "Escape")'),
60
+ },
61
+ annotations: {
62
+ readOnlyHint: false,
63
+ destructiveHint: false,
64
+ idempotentHint: false,
65
+ openWorldHint: true,
66
+ },
67
+ }, withErrorHandling(async (args) => {
68
+ await execAgentBrowser(['press', args.key]);
69
+ return JSON.stringify({ ok: true, message: `Pressed key: ${args.key}` });
70
+ }));
71
+ server.registerTool('browser_scroll', {
72
+ description: 'Scroll the page in a direction.',
73
+ inputSchema: {
74
+ direction: z.enum(['up', 'down', 'left', 'right']).describe('Scroll direction'),
75
+ amount: z.number().optional().default(500).describe('Pixels to scroll (default: 500)'),
76
+ },
77
+ annotations: {
78
+ readOnlyHint: false,
79
+ destructiveHint: false,
80
+ idempotentHint: false,
81
+ openWorldHint: true,
82
+ },
83
+ }, withErrorHandling(async (args) => {
84
+ const px = args.amount ?? 500;
85
+ await execAgentBrowser(['scroll', args.direction, String(px)]);
86
+ return JSON.stringify({ ok: true, message: `Scrolled ${args.direction} ${px}px` });
87
+ }));
88
+ server.registerTool('browser_select', {
89
+ description: 'Select an option from a dropdown.',
90
+ inputSchema: {
91
+ ref: z.string().describe('Element ref or CSS selector for the <select>'),
92
+ value: z.string().describe('Option value or visible text to select'),
93
+ },
94
+ annotations: {
95
+ readOnlyHint: false,
96
+ destructiveHint: false,
97
+ idempotentHint: false,
98
+ openWorldHint: true,
99
+ },
100
+ }, withErrorHandling(async (args) => {
101
+ await execAgentBrowser(['select', args.ref, args.value]);
102
+ return JSON.stringify({ ok: true, message: `Selected "${args.value}" in ${args.ref}` });
103
+ }));
104
+ server.registerTool('browser_hover', {
105
+ description: 'Hover over an element (triggers hover menus/tooltips).',
106
+ inputSchema: {
107
+ ref: z.string().describe('Element ref or CSS selector'),
108
+ },
109
+ annotations: {
110
+ readOnlyHint: true,
111
+ destructiveHint: false,
112
+ idempotentHint: true,
113
+ openWorldHint: true,
114
+ },
115
+ }, withErrorHandling(async (args) => {
116
+ await execAgentBrowser(['hover', args.ref]);
117
+ return JSON.stringify({ ok: true, message: `Hovering over ${args.ref}` });
118
+ }));
119
+ // M3.12 — `browser_evaluate` lets the model run arbitrary JavaScript inside
120
+ // the page context, which is the security equivalent of giving it a shell
121
+ // on whatever site it has just navigated to. To prevent prompt-injected
122
+ // content from doing this silently, the tool is registered ONLY when the
123
+ // host explicitly opts in via `BROWSER_AUTOMATION_ALLOW_EVAL=1`. Without
124
+ // that env var the tool is not in the tools list at all (the LLM cannot
125
+ // even see it). When enabled it carries `destructiveHint: true` so MCP
126
+ // hosts can require explicit user confirmation before each invocation.
127
+ if (process.env.BROWSER_AUTOMATION_ALLOW_EVAL === '1') {
128
+ server.registerTool('browser_evaluate', // eslint-disable-line @typescript-eslint/quotes
129
+ {
130
+ description: 'Execute JavaScript in the page context and return the result. ' +
131
+ 'DESTRUCTIVE: this is equivalent to running arbitrary code with the privileges of the current page; ' +
132
+ 'hosts SHOULD require user confirmation before each call. ' +
133
+ 'Only registered when BROWSER_AUTOMATION_ALLOW_EVAL=1 is set.',
134
+ inputSchema: {
135
+ script: z.string().describe('JavaScript code to execute'),
136
+ },
137
+ annotations: {
138
+ readOnlyHint: false,
139
+ destructiveHint: true,
140
+ idempotentHint: false,
141
+ openWorldHint: true,
142
+ },
143
+ }, withErrorHandling(async (args) => {
144
+ const result = await execAgentBrowser(['eval', args.script]);
145
+ return JSON.stringify({ ok: true, result: result.stdout.trim() });
146
+ }));
147
+ }
148
+ }
149
+ //# sourceMappingURL=interaction.js.map
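The opt-in gate for `browser_evaluate` can be illustrated in isolation. The sketch below is a minimal reconstruction under stated assumptions: `createFakeServer` and `registerEvalToolIfAllowed` are hypothetical names invented here, and the fake server only records tool names rather than speaking MCP. It shows the property the comment above relies on: unless the variable is exactly `'1'`, the tool never enters the tool list.

```javascript
// Hypothetical stand-in for an MCP server: it only records registered tools.
function createFakeServer() {
  const tools = new Map();
  return {
    registerTool(name, config, handler) {
      tools.set(name, { config, handler });
    },
    toolNames() {
      return [...tools.keys()];
    },
  };
}

// Register the eval tool only when the opt-in variable is exactly '1',
// mirroring the strict comparison used in the diff.
function registerEvalToolIfAllowed(server, env) {
  if (env.BROWSER_AUTOMATION_ALLOW_EVAL !== '1') return false;
  server.registerTool(
    'browser_evaluate',
    { annotations: { readOnlyHint: false, destructiveHint: true } },
    async () => 'handler body omitted in this sketch',
  );
  return true;
}
```

Because the comparison is strict, values such as `'true'` or `'yes'` leave the tool unregistered in this sketch, which matches the `=== '1'` check above.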
@@ -0,0 +1,3 @@
1
+ import type { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
2
+ export declare function registerNavigationTools(server: McpServer): void;
3
+ //# sourceMappingURL=navigation.d.ts.map
@@ -0,0 +1,79 @@
1
+ import { z } from 'zod';
2
+ import { execAgentBrowser } from '../browser-client.js';
3
+ import { validateUrlScheme, withErrorHandling } from '../utils.js';
4
+ // URL scheme allow-list (enforced by `validateUrlScheme` in utils.ts):
5
+ // only http: and https: are permitted; about:blank is special-cased.
6
+ // Refused: file:, chrome:, chrome-extension:, javascript:, data:,
7
+ // view-source:, and about: URLs other than about:blank.
8
+ export function registerNavigationTools(server) {
9
+ server.registerTool('browser_navigate', {
10
+ description: `Navigate to a URL. Opens the browser if not already running.
11
+
12
+ Only http: and https: URLs are accepted (plus the special about:blank). Other URL schemes (file:, chrome:, chrome-extension:, javascript:, data:, view-source:, about:*) are refused.
13
+
14
+ IMPORTANT: After navigating, call browser_snapshot to see the page content before interacting.`,
15
+ inputSchema: {
16
+ url: z.string().describe('URL to navigate to (http://, https://, or about:blank)'),
17
+ },
18
+ annotations: {
19
+ readOnlyHint: false,
20
+ destructiveHint: false,
21
+ idempotentHint: false,
22
+ openWorldHint: true,
23
+ },
24
+ }, withErrorHandling(async (args) => {
25
+ validateUrlScheme(args.url);
26
+ await execAgentBrowser(['open', args.url]);
27
+ const titleResult = await execAgentBrowser(['get', 'title']).catch(() => ({ stdout: '', stderr: '' }));
28
+ return JSON.stringify({
29
+ ok: true,
30
+ message: `Navigated to ${args.url}`,
31
+ title: titleResult.stdout.trim(),
32
+ hint: 'Call browser_snapshot to see page elements before interacting.',
33
+ });
34
+ }));
35
+ server.registerTool('browser_back', {
36
+ description: 'Navigate back in browser history.',
37
+ inputSchema: {},
38
+ annotations: {
39
+ readOnlyHint: false,
40
+ destructiveHint: false,
41
+ idempotentHint: false,
42
+ openWorldHint: true,
43
+ },
44
+ }, withErrorHandling(async () => {
45
+ await execAgentBrowser(['back']);
46
+ return JSON.stringify({ ok: true, message: 'Navigated back' });
47
+ }));
48
+ server.registerTool('browser_forward', {
49
+ description: 'Navigate forward in browser history.',
50
+ inputSchema: {},
51
+ annotations: {
52
+ readOnlyHint: false,
53
+ destructiveHint: false,
54
+ idempotentHint: false,
55
+ openWorldHint: true,
56
+ },
57
+ }, withErrorHandling(async () => {
58
+ await execAgentBrowser(['forward']);
59
+ return JSON.stringify({ ok: true, message: 'Navigated forward' });
60
+ }));
61
+ server.registerTool('browser_wait', {
62
+ description: 'Wait for an element to appear or for a specified time.',
63
+ inputSchema: {
64
+ selector: z.string().describe('CSS selector to wait for, or milliseconds (e.g., "2000")'),
65
+ timeout: z.number().optional().default(10000).describe('Max wait time in ms (default: 10000)'),
66
+ },
67
+ annotations: {
68
+ readOnlyHint: true,
69
+ destructiveHint: false,
70
+ idempotentHint: true,
71
+ openWorldHint: true,
72
+ },
73
+ }, withErrorHandling(async (args) => {
74
+ const timeoutMs = args.timeout ?? 10_000;
75
+ await execAgentBrowser(['wait', args.selector], { timeoutMs: timeoutMs + 2000 });
76
+ return JSON.stringify({ ok: true, message: `Wait completed for: ${args.selector}` });
77
+ }));
78
+ }
79
+ //# sourceMappingURL=navigation.js.map
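In the `browser_wait` handler above, only `args.selector` is forwarded to the CLI; the `timeout` input feeds the process-level timeout passed to `execAgentBrowser`, padded by 2000 ms. A minimal sketch of that arithmetic follows. `WAIT_GRACE_MS` and `waitExecOptions` are hypothetical names introduced for illustration, not identifiers from the package.

```javascript
// Hypothetical name for the literal 2000 that the handler adds.
const WAIT_GRACE_MS = 2000;

// Build the options object the handler hands to execAgentBrowser:
// the caller's timeout (default 10000 ms) plus the grace period.
function waitExecOptions(timeout) {
  const timeoutMs = timeout ?? 10_000;
  return { timeoutMs: timeoutMs + WAIT_GRACE_MS };
}
```

With the default, the child process would get 12000 ms before the wrapper gives up; a caller-supplied 500 ms becomes 2500 ms.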
@@ -0,0 +1,3 @@
1
+ import type { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
2
+ export declare function registerObservationTools(server: McpServer): void;
3
+ //# sourceMappingURL=observation.d.ts.map
@@ -0,0 +1,81 @@
1
+ import { z } from 'zod';
2
+ import { execAgentBrowser } from '../browser-client.js';
3
+ import { withErrorHandling, withErrorHandlingRaw } from '../utils.js';
4
+ import { SNAPSHOT_TIMEOUT_MS, SCREENSHOT_TIMEOUT_MS } from '../types.js';
5
+ export function registerObservationTools(server) {
6
+ server.registerTool('browser_snapshot', {
7
+ description: `Get the page accessibility tree with interactive element references.
8
+
9
+ THIS IS YOUR PRIMARY DISCOVERY TOOL. Always call this before clicking, filling, or interacting with the page.
10
+
11
+ Returns element refs like @e1, @e2 that you use with browser_click, browser_fill, etc.
12
+ By default only interactive elements are returned (the CLI's -i flag), keeping output focused; set full to true to include all elements.`,
13
+ inputSchema: {
14
+ full: z.boolean().optional().default(false).describe('If true, show all elements (not just interactive). Default: false.'),
15
+ },
16
+ annotations: {
17
+ readOnlyHint: true,
18
+ destructiveHint: false,
19
+ idempotentHint: true,
20
+ openWorldHint: true,
21
+ },
22
+ }, withErrorHandling(async (args) => {
23
+ const cliArgs = args.full ? ['snapshot'] : ['snapshot', '-i'];
24
+ const result = await execAgentBrowser(cliArgs, { timeoutMs: SNAPSHOT_TIMEOUT_MS });
25
+ return JSON.stringify({ ok: true, snapshot: result.stdout });
26
+ }));
27
+ server.registerTool('browser_screenshot', {
28
+ description: 'Take a screenshot of the current page. Returns an image.',
29
+ inputSchema: {
30
+ full_page: z.boolean().optional().default(false).describe('Capture full scrollable page'),
31
+ annotate: z.boolean().optional().default(false).describe('Add numbered element labels to the screenshot'),
32
+ },
33
+ annotations: {
34
+ readOnlyHint: true,
35
+ destructiveHint: false,
36
+ idempotentHint: true,
37
+ openWorldHint: true,
38
+ },
39
+ }, withErrorHandlingRaw(async (args) => {
40
+ const cliArgs = ['screenshot'];
41
+ if (args.full_page)
42
+ cliArgs.push('--full');
43
+ if (args.annotate)
44
+ cliArgs.push('--annotate');
45
+ cliArgs.push('-'); // output to stdout
46
+ const result = await execAgentBrowser(cliArgs, { timeoutMs: SCREENSHOT_TIMEOUT_MS });
47
+ const data = result.stdout.trim();
48
+ // agent-browser outputs base64 PNG when piped to stdout
49
+ if (data.length > 100) {
50
+ return {
51
+ content: [{
52
+ type: 'image',
53
+ data,
54
+ mimeType: 'image/png',
55
+ }],
56
+ };
57
+ }
58
+ return {
59
+ content: [{ type: 'text', text: JSON.stringify({ ok: true, message: 'Screenshot taken', note: data }) }],
60
+ };
61
+ }));
62
+ server.registerTool('browser_get_page_info', {
63
+ description: 'Get the current page URL and title.',
64
+ inputSchema: {},
65
+ annotations: {
66
+ readOnlyHint: true,
67
+ destructiveHint: false,
68
+ idempotentHint: true,
69
+ openWorldHint: true,
70
+ },
71
+ }, withErrorHandling(async () => {
72
+ const urlResult = await execAgentBrowser(['get', 'url']);
73
+ const titleResult = await execAgentBrowser(['get', 'title']);
74
+ return JSON.stringify({
75
+ ok: true,
76
+ url: urlResult.stdout.trim(),
77
+ title: titleResult.stdout.trim(),
78
+ });
79
+ }));
80
+ }
81
+ //# sourceMappingURL=observation.js.map
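The screenshot handler above decides between an image result and a text result purely on the length of the trimmed output. The sketch below isolates that branch so it can be exercised without a browser. `toScreenshotResult` is a hypothetical helper name; the threshold of 100 characters and the result shapes are taken from the diff.

```javascript
// Hypothetical helper mirroring the branch in the screenshot handler:
// long output is treated as base64 PNG data, short output as a note.
function toScreenshotResult(stdout) {
  const data = stdout.trim();
  if (data.length > 100) {
    return { content: [{ type: 'image', data, mimeType: 'image/png' }] };
  }
  const text = JSON.stringify({ ok: true, message: 'Screenshot taken', note: data });
  return { content: [{ type: 'text', text }] };
}
```

The length check is a heuristic: neither this sketch nor the handler verifies that the long string is valid base64, so any sufficiently long output would be returned as an image.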
@@ -0,0 +1,3 @@
1
+ import type { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
2
+ export declare function registerSessionTools(server: McpServer): void;
3
+ //# sourceMappingURL=session.d.ts.map
@@ -0,0 +1,69 @@
1
+ import { z } from 'zod';
2
+ import { execAgentBrowser } from '../browser-client.js';
3
+ import { validateUrlScheme, withErrorHandling } from '../utils.js';
4
+ // URL scheme allow-list applied to browser_authenticate (enforced by
5
+ // `validateUrlScheme` in utils.ts): only http: and https: URLs are
6
+ // permitted; about:blank is special-cased. Refused: file:, chrome:,
7
+ // chrome-extension:, javascript:, data:, view-source:, and about: URLs
8
+ // other than about:blank.
9
+ export function registerSessionTools(server) {
10
+ server.registerTool('browser_tabs', {
11
+ description: 'List open tabs, open or close a tab, or switch to a tab by number.',
12
+ inputSchema: {
13
+ action: z.enum(['list', 'new', 'close']).optional().describe('Tab action. Omit to list tabs.'),
14
+ tab_number: z.number().optional().describe('Tab number to switch to (from tab list)'),
15
+ },
16
+ annotations: {
17
+ readOnlyHint: false,
18
+ destructiveHint: false,
19
+ idempotentHint: false,
20
+ openWorldHint: true,
21
+ },
22
+ }, withErrorHandling(async (args) => {
23
+ if (args.tab_number !== undefined) {
24
+ await execAgentBrowser(['tab', String(args.tab_number)]);
25
+ return JSON.stringify({ ok: true, message: `Switched to tab ${args.tab_number}` });
26
+ }
27
+ const cliAction = args.action ?? 'list';
28
+ const result = await execAgentBrowser(['tab', cliAction]);
29
+ return JSON.stringify({ ok: true, tabs: result.stdout.trim() });
30
+ }));
31
+ server.registerTool('browser_close', {
32
+ description: 'Close the browser session. Sessions are saved automatically.',
33
+ inputSchema: {},
34
+ annotations: {
35
+ readOnlyHint: false,
36
+ destructiveHint: true,
37
+ idempotentHint: true,
38
+ openWorldHint: false,
39
+ },
40
+ }, withErrorHandling(async () => {
41
+ await execAgentBrowser(['close']);
42
+ return JSON.stringify({ ok: true, message: 'Browser session closed. Sessions are saved automatically.' });
43
+ }));
44
+ server.registerTool('browser_authenticate', {
45
+ description: `Open a visible browser window so the user can log in manually. The session is saved automatically.
46
+
47
+ WHEN TO USE: "I need to access LinkedIn", "Log me into WhatsApp", etc.
48
+ Tell the user to close the browser when done logging in, or call browser_close.`,
49
+ inputSchema: {
50
+ url: z.string().describe('Website URL to open for login (http://, https://, or about:blank)'),
51
+ },
52
+ annotations: {
53
+ readOnlyHint: false,
54
+ destructiveHint: false,
55
+ idempotentHint: false,
56
+ openWorldHint: true,
57
+ },
58
+ }, withErrorHandling(async (args) => {
59
+ validateUrlScheme(args.url);
60
+ await execAgentBrowser(['open', args.url], { headed: true });
61
+ return JSON.stringify({
62
+ ok: true,
63
+ url: args.url,
64
+ message: `Browser opened to ${args.url} in visible mode. The user should log in manually. Their session will be saved automatically when the browser is closed.`,
65
+ next_step: 'Tell the user to log in and close the browser when done, or call browser_close.',
66
+ });
67
+ }));
68
+ }
69
+ //# sourceMappingURL=session.js.map
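The `browser_tabs` handler above gives `tab_number` precedence over `action`, and falls back to `list` when neither is supplied. A minimal sketch of that dispatch, with `tabCliArgs` as a hypothetical helper name:

```javascript
// Hypothetical helper: compute the CLI argument list the handler would use.
// A tab number wins over any action; a missing action defaults to 'list'.
function tabCliArgs(args) {
  if (args.tab_number !== undefined) {
    return ['tab', String(args.tab_number)];
  }
  return ['tab', args.action ?? 'list'];
}
```

One consequence of this ordering is that a call carrying both `tab_number` and `action: 'close'` switches tabs and ignores the action.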
@@ -0,0 +1,14 @@
1
+ export declare const SERVER_NAME = "browser-automation-mcp-server";
2
+ /** Server version reported on MCP `initialize`. Read from package.json so
3
+ * it cannot drift from the published npm version. */
4
+ export declare const SERVER_VERSION: string;
5
+ export declare const DEFAULT_TIMEOUT_MS = 30000;
6
+ export declare const SNAPSHOT_TIMEOUT_MS = 15000;
7
+ export declare const SCREENSHOT_TIMEOUT_MS = 15000;
8
+ export declare const SESSION_NAME = "mcp";
9
+ export declare class ConnectorError extends Error {
10
+ readonly code: string;
11
+ readonly resolution: string;
12
+ constructor(message: string, code: string, resolution: string);
13
+ }
14
+ //# sourceMappingURL=types.d.ts.map
package/dist/types.js ADDED
@@ -0,0 +1,22 @@
1
+ import { createRequire } from 'node:module';
2
+ const require = createRequire(import.meta.url);
3
+ const pkg = require('../package.json');
4
+ export const SERVER_NAME = 'browser-automation-mcp-server';
5
+ /** Server version reported on MCP `initialize`. Read from package.json so
6
+ * it cannot drift from the published npm version. */
7
+ export const SERVER_VERSION = pkg.version;
8
+ export const DEFAULT_TIMEOUT_MS = 30_000;
9
+ export const SNAPSHOT_TIMEOUT_MS = 15_000;
10
+ export const SCREENSHOT_TIMEOUT_MS = 15_000;
11
+ export const SESSION_NAME = 'mcp';
12
+ export class ConnectorError extends Error {
13
+ code;
14
+ resolution;
15
+ constructor(message, code, resolution) {
16
+ super(message);
17
+ this.code = code;
18
+ this.resolution = resolution;
19
+ this.name = 'ConnectorError';
20
+ }
21
+ }
22
+ //# sourceMappingURL=types.js.map
@@ -0,0 +1,25 @@
1
+ import type { CallToolResult } from '@modelcontextprotocol/sdk/types.js';
2
+ /**
3
+ * Validate the URL scheme of a user-supplied URL before forwarding it to the
4
+ * agent-browser CLI. Throws a `ConnectorError` with a human-readable message
5
+ * (and a stable code suitable for tool error responses) when the scheme is
6
+ * not on the allow-list.
7
+ */
8
+ export declare function validateUrlScheme(url: string): void;
9
+ type ToolHandler<T> = (args: T, extra: unknown) => Promise<CallToolResult>;
10
+ /**
11
+ * Wraps a tool handler with standard error handling.
12
+ *
13
+ * - On success: returns the string result as a text content block.
14
+ * - On ConnectorError: returns a structured JSON error with code and resolution.
15
+ * - On any other error: returns a JSON error carrying the error's message.
16
+ *
17
+ * Secrets are never exposed in error messages.
18
+ */
19
+ export declare function withErrorHandling<T>(fn: (args: T, extra: unknown) => Promise<string>): ToolHandler<T>;
20
+ /**
21
+ * Wraps a tool handler that returns a CallToolResult directly (e.g. for image responses).
22
+ */
23
+ export declare function withErrorHandlingRaw<T>(fn: (args: T, extra: unknown) => Promise<CallToolResult>): ToolHandler<T>;
24
+ export {};
25
+ //# sourceMappingURL=utils.d.ts.map
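The three outcomes described in the `withErrorHandling` doc comment can be sketched as a small self-contained wrapper. This is a simplified illustration, not the packaged implementation: `wrapHandler` and `failing` are hypothetical names, and the `ConnectorError` class is a local copy so the block runs on its own.

```javascript
// Local copy of the error shape, so the sketch is self-contained.
class ConnectorError extends Error {
  constructor(message, code, resolution) {
    super(message);
    this.name = 'ConnectorError';
    this.code = code;
    this.resolution = resolution;
  }
}

// Simplified wrapper: string results become a text block; thrown errors
// become a JSON text block flagged with isError.
function wrapHandler(fn) {
  return async (args, extra) => {
    try {
      return { content: [{ type: 'text', text: await fn(args, extra) }] };
    } catch (error) {
      const body = error instanceof ConnectorError
        ? { ok: false, error: error.message, code: error.code, resolution: error.resolution }
        : { ok: false, error: error instanceof Error ? error.message : String(error) };
      return { content: [{ type: 'text', text: JSON.stringify(body) }], isError: true };
    }
  };
}

// Hypothetical handler that always fails with a structured error.
const failing = wrapHandler(async () => {
  throw new ConnectorError(
    'URL scheme not allowed: file:',
    'URL_SCHEME_REJECTED',
    'Pass an http: or https: URL.',
  );
});
```

Awaiting `failing({}, undefined)` resolves rather than rejects: the caller always receives a result object, and distinguishes failure through `isError` and the `code` field in the JSON body.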
package/dist/utils.js ADDED
@@ -0,0 +1,129 @@
1
+ import { ConnectorError } from './types.js';
2
+ /**
3
+ * URL scheme allow-list (default-deny) for browser_navigate / browser_authenticate.
4
+ *
5
+ * Only `http:` and `https:` are accepted. The pseudo-URL `about:blank` is
6
+ * special-cased and permitted (it's the only safe `about:` page — no local
7
+ * data, no chrome internals). All other schemes are refused before the
8
+ * underlying agent-browser CLI is invoked, so the agent cannot:
9
+ * - read local files via `file:` URLs,
10
+ * - access browser internals via `chrome:` / `chrome-extension:` URLs,
11
+ * - execute page-side JavaScript via `javascript:` URLs,
12
+ * - render attacker-controlled inline payloads via `data:` URLs,
13
+ * - bypass the same-origin policy via `view-source:` URLs,
14
+ * - touch privileged `about:` pages (about:config, about:cache, …).
15
+ */
16
+ const BLOCKED_URL_SCHEMES = new Set([
17
+ 'file:',
18
+ 'chrome:',
19
+ 'chrome-extension:',
20
+ 'javascript:',
21
+ 'data:',
22
+ 'view-source:',
23
+ ]);
24
+ /**
25
+ * Validate the URL scheme of a user-supplied URL before forwarding it to the
26
+ * agent-browser CLI. Throws a `ConnectorError` with a human-readable message
27
+ * (and a stable code suitable for tool error responses) when the scheme is
28
+ * not on the allow-list.
29
+ */
30
+ export function validateUrlScheme(url) {
31
+ // Special case: `about:blank` is the only `about:` URL we accept. We match
32
+ // it textually to sidestep any quirks in URL parsing for opaque schemes.
33
+ if (url.toLowerCase() === 'about:blank')
34
+ return;
35
+ let parsed;
36
+ try {
37
+ parsed = new URL(url);
38
+ }
39
+ catch {
40
+ throw new ConnectorError(`URL scheme not allowed: invalid URL ${JSON.stringify(url)}`, 'URL_SCHEME_REJECTED', 'Pass a valid http: or https: URL (or about:blank). Only http and https schemes are permitted.');
41
+ }
42
+ const proto = parsed.protocol.toLowerCase();
43
+ if (proto === 'http:' || proto === 'https:')
44
+ return;
45
+ if (proto === 'about:') {
46
+ // We already returned above for about:blank — anything else here is
47
+ // about:config / about:cache / about:debugging etc.
48
+ throw new ConnectorError(`URL scheme not allowed: ${proto} (only about:blank is permitted, got ${url})`, 'URL_SCHEME_REJECTED', 'Only http://, https://, and about:blank URLs are accepted by the browser-automation connector.');
49
+ }
50
+ // Default-deny: explicit deny-list match OR unknown scheme — both rejected.
51
+ // The deny-list is enumerated for documentation; the protocol check above
52
+ // is what actually enforces the policy.
53
+ void BLOCKED_URL_SCHEMES; // referenced so the documentation-only constant is not flagged as unused
54
+ throw new ConnectorError(`URL scheme not allowed: ${proto}. Only http: and https: schemes are permitted (about:blank also allowed).`, 'URL_SCHEME_REJECTED', 'Pass an http://, https://, or about:blank URL. Schemes like file:, chrome:, chrome-extension:, javascript:, data:, view-source:, and about: (other than about:blank) are refused.');
55
+ }
56
+ /**
57
+ * Wraps a tool handler with standard error handling.
58
+ *
59
+ * - On success: returns the string result as a text content block.
60
+ * - On ConnectorError: returns a structured JSON error with code and resolution.
61
+ * - On any other error: returns a JSON error carrying the error's message.
62
+ *
63
+ * Secrets are never exposed in error messages.
64
+ */
65
+ export function withErrorHandling(fn) {
66
+ return async (args, extra) => {
67
+ try {
68
+ const result = await fn(args, extra);
69
+ return { content: [{ type: 'text', text: result }] };
70
+ }
71
+ catch (error) {
72
+ if (error instanceof ConnectorError) {
73
+ return {
74
+ content: [
75
+ {
76
+ type: 'text',
77
+ text: JSON.stringify({
78
+ ok: false,
79
+ error: error.message,
80
+ code: error.code,
81
+ resolution: error.resolution,
82
+ }),
83
+ },
84
+ ],
85
+ isError: true,
86
+ };
87
+ }
88
+ const errorMessage = error instanceof Error ? error.message : String(error);
89
+ return {
90
+ content: [{ type: 'text', text: JSON.stringify({ ok: false, error: errorMessage }) }],
91
+ isError: true,
92
+ };
93
+ }
94
+ };
95
+ }
96
+ /**
97
+ * Wraps a tool handler that returns a CallToolResult directly (e.g. for image responses).
98
+ */
99
+ export function withErrorHandlingRaw(fn) {
100
+ return async (args, extra) => {
101
+ try {
102
+ return await fn(args, extra);
103
+ }
104
+ catch (error) {
105
+ if (error instanceof ConnectorError) {
106
+ return {
107
+ content: [
108
+ {
109
+ type: 'text',
110
+ text: JSON.stringify({
111
+ ok: false,
112
+ error: error.message,
113
+ code: error.code,
114
+ resolution: error.resolution,
115
+ }),
116
+ },
117
+ ],
118
+ isError: true,
119
+ };
120
+ }
121
+ const errorMessage = error instanceof Error ? error.message : String(error);
122
+ return {
123
+ content: [{ type: 'text', text: JSON.stringify({ ok: false, error: errorMessage }) }],
124
+ isError: true,
125
+ };
126
+ }
127
+ };
128
+ }
129
+ //# sourceMappingURL=utils.js.map
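The scheme policy in `validateUrlScheme` reduces to three checks: a textual match for `about:blank`, a parse with the WHATWG `URL` constructor, and a comparison of the parsed protocol against `http:` and `https:`. The sketch below is a boolean variant for experimentation. `isAllowedUrl` is a hypothetical name; the packaged function throws a `ConnectorError` where this one returns `false`.

```javascript
// Hypothetical boolean variant of the scheme check.
function isAllowedUrl(url) {
  // about:blank is matched textually, case-insensitively.
  if (url.toLowerCase() === 'about:blank') return true;
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return false; // not parseable as an absolute URL
  }
  // The URL parser lowercases the scheme, so a direct comparison suffices.
  return parsed.protocol === 'http:' || parsed.protocol === 'https:';
}

// Sample inputs covering accepted and refused cases.
const sampleUrls = [
  'https://example.com',
  'HTTP://example.com',
  'about:blank',
  'about:config',
  'file:///etc/passwd',
  'javascript:alert(1)',
  'example.com',
];
const accepted = sampleUrls.filter(isAllowedUrl);
```

Only the first three samples pass. The bare `example.com` is refused because it has no scheme and cannot be parsed as an absolute URL, which corresponds to the `invalid URL` branch in the code above.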
package/package.json ADDED
@@ -0,0 +1,55 @@
1
+ {
2
+ "name": "@mindstone/mcp-server-browser-automation",
3
+ "version": "0.1.7",
4
+ "mcpName": "io.github.mindstone/mcp-server-browser-automation",
5
+ "description": "Browser automation MCP server \u2014 visible-by-default browser control via accessibility snapshots, navigation, form filling, screenshots, and tab management. Set AGENT_BROWSER_SHOW_WINDOW=false to run quietly.",
6
+ "license": "FSL-1.1-MIT",
7
+ "type": "module",
8
+ "bin": {
9
+ "mcp-server-browser-automation": "dist/index.js"
10
+ },
11
+ "files": [
12
+ "dist",
13
+ "!dist/**/*.map"
14
+ ],
15
+ "repository": {
16
+ "type": "git",
17
+ "url": "https://github.com/mindstone/mcp-servers.git",
18
+ "directory": "connectors/browser-automation"
19
+ },
20
+ "homepage": "https://github.com/mindstone/mcp-servers/tree/main/connectors/browser-automation",
21
+ "publishConfig": {
22
+ "access": "public"
23
+ },
24
+ "scripts": {
25
+ "build": "tsc && shx chmod +x dist/index.js",
26
+ "prepare": "npm run build",
27
+ "watch": "tsc --watch",
28
+ "start": "node dist/index.js",
29
+ "test": "vitest run",
30
+ "test:watch": "vitest",
31
+ "test:coverage": "vitest run --coverage"
32
+ },
33
+ "dependencies": {
34
+ "@modelcontextprotocol/sdk": "^1.26.0",
35
+ "graceful-fs": "^4.2.11",
36
+ "zod": "^3.23.0"
37
+ },
38
+ "devDependencies": {
39
+ "@mindstone/mcp-test-harness": "file:../../test-harness",
40
+ "@types/node": "^22",
41
+ "@vitest/coverage-v8": "^4.1.3",
42
+ "msw": "^2.13.2",
43
+ "shx": "^0.3.4",
44
+ "typescript": "^5.8.2",
45
+ "vitest": "^4.1.3"
46
+ },
47
+ "engines": {
48
+ "node": ">=20"
49
+ },
50
+ "overrides": {
51
+ "fast-uri": "^3.1.2",
52
+ "hono": "^4.12.18",
53
+ "ip-address": "^10.2.0"
54
+ }
55
+ }