barebrowse 0.10.1 → 0.11.0

@@ -0,0 +1,26 @@
+ name: Publish to npm
+
+ # Trusted publishing via OIDC — no NPM_TOKEN needed.
+ # Configure the trusted publisher at npmjs.com first (see repo notes).
+ on:
+   workflow_dispatch: # manual "Run workflow" button
+   release:
+     types: [published] # also publishes when you cut a GitHub release
+
+ permissions:
+   contents: read
+   id-token: write # required: lets npm mint OIDC credentials
+
+ jobs:
+   publish:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-node@v4
+         with:
+           node-version: 22
+           registry-url: 'https://registry.npmjs.org'
+       - name: Upgrade npm (trusted publishing needs >= 11.5.1)
+         run: npm install -g npm@latest
+       - name: Publish
+         run: npm publish
package/CHANGELOG.md CHANGED
@@ -1,5 +1,69 @@
1
1
  # Changelog
2
2
 
3
+ ## 0.11.0
4
+
5
+ ### Security hardening — audit findings fixed, safe-by-default
6
+
7
+ A full security audit of the library + CLI daemon + MCP server. Eight
8
+ findings were reproduced with live PoCs, fixed, and locked in with 14 new
9
+ regression tests (143 → 157 passing). Two new opt-in controls; two new
10
+ defaults that change behavior (see **Breaking** below).
11
+
12
+ - **Daemon authentication (was: unauthenticated `eval` over loopback).**
13
+ The CLI daemon's HTTP server bound to `127.0.0.1` but had no auth — and
14
+ loopback is shared across local users, so any local process could POST
15
+ `/command` (including `eval` = arbitrary JS in the authenticated browser).
16
+ Now every daemon mints a 32-byte random token at startup, written into
17
+ `session.json` (mode `0600`) and required on `/command` via the
18
+ `x-barebrowse-token` header (constant-time compare). `session-client.js`
19
+ reads and sends it transparently — no caller change. `GET /status` stays
20
+ open as a liveness ping returning only `{ ok, pid }`.
21
+ - **Artifact permissions.** The session dir is now created `0700` and all
22
+ daemon artifacts (`session.json`, snapshots, screenshots, PDFs, console /
23
+ network / dialog logs) plus `page.saveState()` output are written `0600`.
24
+ `saveState` holds cookies + localStorage (session tokens), so this stops a
25
+ multi-user host from reading another user's credentials off disk.
26
+ - **Navigation scheme guard (new module `src/url-guard.js`).** `goto()` /
27
+ `browse()` now reject local-resource and browser-internal schemes
28
+ (`file:`, `view-source:`, `chrome:`, `chrome-extension:`, `filesystem:`,
29
+ `devtools:`, …) by default — closing a confirmed local-file-read /
30
+ directory-listing vector for a prompt-injected agent. `http`/`https`/
31
+ `data`/`blob`/`about` stay allowed (`data:` is opaque-origin and the
32
+ test-fixture mechanism — not a read/SSRF vector). Override with
33
+ `{ allowLocalUrls: true }`.
34
+ - **SSRF guard (opt-in `blockPrivateNetwork`).** When set, `goto()`/
35
+ `browse()` refuse loopback / RFC-1918 / link-local / cloud-metadata
36
+ (`169.254.169.254`) / `*.internal` hosts. Off by default so localhost
37
+ dev-server browsing keeps working. Exposed as `--block-private-network`.
38
+ - **Upload sandbox (opt-in `uploadDir`).** The audit confirmed `upload()` would
39
+ attach any absolute path to a file input (exfil vector under prompt
40
+ injection). When `uploadDir` is set, every path must resolve (symlinks
41
+ included, via `realpath`) inside it. Default unrestricted — nothing breaks
42
+ unless you opt in. Exposed as `--upload-dir=DIR`. Both new opts pass
43
+ through `connect()` → MCP / bareagent / CLI daemon uniformly.
44
+ - **Cookie injection scoped precisely (was: over-broad substring match).**
45
+ `authenticate()` matched `host_key LIKE '%domain%'`, so browsing
46
+ `apple.com` injected cookies for `apple.com.evil.org` / `notapple.com`,
47
+ and `mybank.co.uk` (→ `co.uk`) pulled every `*.co.uk` cookie. The LIKE
48
+ query is now only a coarse pre-filter; a precise RFC-6265
49
+ `cookieDomainMatch()` decides what actually gets injected (parent-domain
50
+ cookies like `.google.com` still apply to `mail.google.com`).
51
+ - **Hardening:** browser discovery uses `execFileSync('which', [name])`
52
+ (no shell) instead of an interpolated `execSync` string; the cleanup
53
+ busy-wait drops a `sleep` subprocess for `Atomics.wait`. Added
54
+ `.gitignore` (was missing — `.barebrowse/` state/snapshots could be
55
+ accidentally committed). Pinned `wearehere` to exact `1.0.0`.
56
+ - **Tests:** 157 total (14 new) — `test/unit/url-guard.test.js` (19
57
+ assertions over scheme/private-host policy), `cookieDomainMatch` cases in
58
+ `test/unit/auth.test.js`, daemon token + `0600` perms in
59
+ `test/integration/cli.test.js`.
60
+
61
+ **Breaking:** (1) `file:`/`chrome:`/etc. navigation now throws by default —
62
+ pass `allowLocalUrls: true` to restore. (2) The CLI daemon now requires the
63
+ token; this is transparent via the bundled `session-client`, but any
64
+ third-party client hitting the daemon's HTTP API directly must send
65
+ `x-barebrowse-token` from `session.json`.
66
+
3
67
  ## 0.10.1
4
68
 
5
69
  ### Blocklist long-tail additions + legacy-Chrome warn + switchTab attach-mode test
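The scheme policy in the 0.11.0 entry above can be illustrated with a short sketch. This is not the code in `src/url-guard.js`. `BLOCKED_SCHEMES` and `isSchemeAllowed` are hypothetical names, and the blocked set contains only the schemes the changelog names explicitly (the entry's list ends in an ellipsis, so the real module may block more).

```javascript
// Hypothetical sketch of a default-deny check for local-resource schemes.
// Only the schemes named in the changelog are listed here.
const BLOCKED_SCHEMES = new Set([
  'file:', 'view-source:', 'chrome:', 'chrome-extension:', 'filesystem:', 'devtools:',
]);

function isSchemeAllowed(url, { allowLocalUrls = false } = {}) {
  let protocol;
  try {
    protocol = new URL(url).protocol; // the URL parser lowercases the scheme
  } catch {
    return false; // unparseable input is rejected rather than guessed at
  }
  if (allowLocalUrls) return true;
  return !BLOCKED_SCHEMES.has(protocol);
}

console.log(isSchemeAllowed('https://example.com'));                          // true
console.log(isSchemeAllowed('file:///etc/passwd'));                           // false
console.log(isSchemeAllowed('file:///etc/passwd', { allowLocalUrls: true })); // true
```

Checking the parsed `protocol` rather than a string prefix means mixed-case input such as `CHROME://settings` is handled the same way as the lowercase form.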
package/README.md CHANGED
@@ -134,6 +134,17 @@ No clone profile, no fresh cookies — the agent sees what you see.
134
134
 
135
135
  Cookie consent walls (29 languages, with real mouse click fallback for stubborn CMPs), login walls (cookie extraction from your browsers), bot detection (ARIA node count heuristic + stealth patches + automatic headed fallback — snapshot shows `[BOT CHALLENGE DETECTED]` warning when blocked), permission prompts, SPA navigation, JS dialogs, off-screen elements, pre-filled inputs, ARIA noise, and profile locking. The agent doesn't think about any of it.
136
136
 
137
+ ## Safe by default (v0.11.0)
138
+
139
+ barebrowse hands an autonomous — and therefore prompt-injectable — agent an *authenticated* browser, so the defaults are calibrated for that threat:
140
+
141
+ - **Local-resource schemes blocked.** `file:`, `view-source:`, `chrome:`, etc. are rejected by default (a confirmed local-file-read vector); `http`/`https`/`data` stay allowed. Override with `allowLocalUrls: true`.
142
+ - **Cookie injection scoped** to a precise RFC-6265 domain match — browsing one site can't pull look-alike or unrelated cookies into the session.
143
+ - **CLI daemon authenticated** with a per-session token (loopback alone isn't an authorization boundary); snapshots and saved state are written owner-only (`0600`).
144
+ - **Opt-in hardening** for stricter deployments: `blockPrivateNetwork` (SSRF guard for loopback/RFC-1918/cloud-metadata) and `uploadDir` (confine `upload()` to one directory). Both available on the library, MCP, bareagent, and CLI (`--block-private-network`, `--upload-dir`).
145
+
146
+ See `barebrowse.context.md` and the PRD's "Security Model & Safe Defaults" for the full rationale.
147
+
137
148
  ## What the agent sees
138
149
 
139
150
  Raw ARIA output from a page is noisy -- decorative wrappers, hidden elements, structural junk. The pruning pipeline (ported from [mcprune](https://github.com/hamr0/mcprune)) strips it down to what matters.
@@ -1,7 +1,7 @@
1
1
  # barebrowse -- Integration Guide
2
2
 
3
3
  > For AI assistants and developers wiring barebrowse into a project.
4
- > v0.9.1 | Node.js >= 22 | 0 required deps | Apache-2.0
4
+ > v0.11.0 | Node.js >= 22 | 0 required deps | Apache-2.0
5
5
 
6
6
  ## What this is
7
7
 
@@ -95,6 +95,9 @@ const snapshot = await browse('https://example.com', {
95
95
  - `downloadPath: '/abs/dir'` — Where downloads land. Default: per-session `mkdtemp` under `/tmp/barebrowse-dl-*` that gets removed on `close()`. Caller-supplied paths are not cleaned up — caller owns the lifecycle.
96
96
  - `blockAds: true|false` — CDP-level URL blocking of 128 common ad/tracker patterns (Google ads/analytics, FB/Amazon/MS/Adobe ad+analytics, Segment/Amplitude/Mixpanel/Heap/PostHog, Hotjar/FullStory/LogRocket, Criteo/Taboola/Outbrain, the consumer-pixel cluster, AppNexus/Rubicon/PubMatic supply, marketing automation; v0.10.1 added AppsFlyer/Branch/Adjust, Cloudflare Web Analytics, Matomo Cloud). Default `true` for launched browsers, `false` in attach mode (would affect any tab in the user's running browser). Explicit `true` in attach mode is honored and follows the session across `switchTab()` (regression-tested). Shrinks ARIA snapshots and speeds page loads. On legacy Chromium lacking `Network.setBlockedURLs` a one-time `console.warn` surfaces the fallback.
97
97
  - `blockUrls: ['*://foo.com/*', ...]` — Extra glob patterns (CDP `Network.setBlockedURLs` format) to block in addition to the default. Merged with the default unless `blockAds: false`.
98
+ - `allowLocalUrls: true|false` — (v0.11.0) Default `false`: navigation to local-resource / browser-internal schemes (`file:`, `view-source:`, `chrome:`, `filesystem:`, `devtools:`, …) is **blocked** to stop a prompt-injected agent reading local files. `http`/`https`/`data`/`blob`/`about` are always allowed. Set `true` to permit local schemes.
99
+ - `blockPrivateNetwork: true|false` — (v0.11.0) Default `false`. When `true`, `goto()`/`browse()` refuse loopback / RFC-1918 / link-local / cloud-metadata (`169.254.169.254`) / `*.internal` hosts (SSRF guard). Off by default so localhost dev-server browsing works. Hostname-based — does not catch DNS names that resolve to private IPs.
100
+ - `uploadDir: '/abs/dir'` — (v0.11.0) Default unset (no restriction). When set, `upload()` rejects any file that does not resolve (symlinks included, via `realpath`) inside this directory — sandboxes the agent's file-upload capability.
98
101
 
99
102
  ## Snapshot format
100
103
 
@@ -226,7 +229,7 @@ barebrowse save-state # → .barebrowse/state-<timestamp>.json
226
229
  barebrowse close # Kill daemon + browser
227
230
  ```
228
231
 
229
- **Open flags:** `--mode=headless|headed|hybrid`, `--port=N` (attach to running browser), `--proxy=URL`, `--viewport=WxH`, `--storage-state=FILE`, `--download-path=DIR` (v0.9.0), `--no-cookies`, `--browser=firefox|chromium`, `--timeout=N`
232
+ **Open flags:** `--mode=headless|headed|hybrid`, `--port=N` (attach to running browser), `--proxy=URL`, `--viewport=WxH`, `--storage-state=FILE`, `--download-path=DIR` (v0.9.0), `--no-cookies`, `--browser=firefox|chromium`, `--timeout=N`, `--block-private-network` (SSRF guard, v0.11.0), `--upload-dir=DIR` (upload sandbox, v0.11.0)
230
233
 
231
234
  Session lifecycle: `open` spawns a background daemon holding a `connect()` session. Subsequent commands POST to the daemon over HTTP (localhost). `close` shuts everything down. JS dialogs (alert/confirm/prompt) are auto-dismissed and logged.
232
235
 
@@ -355,6 +358,10 @@ Useful for agent threshold decisions: "skip sites above score 40", "warn if term
355
358
 
356
359
  14. **`eval` MCP tool is opt-in.** Set `BAREBROWSE_MCP_EVAL=1` to register it. Default off because `Runtime.evaluate` in an authenticated session can read cookies/localStorage, post on the user's behalf, hit any same-origin endpoint. CLI/connect()/daemon all keep `eval` because the developer is the caller; MCP gates it because the agent acts with less judgment.
357
360
 
361
+ 15. **The CLI daemon requires a per-session token (v0.11.0).** `open` mints a 32-byte random token, writes it into `.barebrowse/session.json` (mode `0600`) and requires it on `POST /command` via the `x-barebrowse-token` header (loopback is shared across local users, so binding to `127.0.0.1` alone isn't an authorization boundary). The bundled `session-client` sends it automatically — no change for CLI users. A third-party client hitting the daemon HTTP API directly must read the token from `session.json` and send it. `GET /status` stays open (liveness only). The session dir is `0700`; snapshots, `saveState`, and logs are written `0600`.
362
+
363
+ 16. **Navigation is scheme-guarded by default (v0.11.0).** `file:`/`chrome:`/etc. throw unless `allowLocalUrls: true`; `blockPrivateNetwork` and `uploadDir` add opt-in SSRF and upload-sandbox controls. All four are exposed identically on the library, MCP/bareagent (via `connect` opts), and the CLI (`--block-private-network`, `--upload-dir=DIR`; the scheme guard and token are always on).
364
+
358
365
  ## Constraints
359
366
 
360
367
  - **Node >= 22** -- built-in WebSocket, built-in SQLite
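To make the `blockPrivateNetwork` host classes concrete, here is a minimal hostname-only sketch. It is an illustration under stated assumptions, not the library's implementation: `isPrivateHost` is a hypothetical name, and the check covers only IPv4 literals, `localhost`, `::1`, and `*.internal`.

```javascript
// Hypothetical hostname-only check. It does not resolve DNS, so a public name
// that points at a private address is not caught (the limitation the guide
// notes for blockPrivateNetwork).
function isPrivateHost(hostname) {
  // URL.hostname keeps the brackets around IPv6 literals, so strip them first.
  const host = String(hostname).toLowerCase().replace(/^\[|\]$/g, '');
  if (host === 'localhost' || host.endsWith('.localhost')) return true;
  if (host.endsWith('.internal')) return true;
  if (host === '::1') return true; // IPv6 loopback

  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false; // not an IPv4 literal
  const a = Number(m[1]);
  const b = Number(m[2]);
  if (a === 127) return true;                       // loopback 127.0.0.0/8
  if (a === 10) return true;                        // RFC 1918 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true; // RFC 1918 172.16.0.0/12
  if (a === 192 && b === 168) return true;          // RFC 1918 192.168.0.0/16
  if (a === 169 && b === 254) return true;          // link-local, incl. 169.254.169.254
  return false;
}

const metadata = new URL('http://169.254.169.254/latest/meta-data/');
console.log(isPrivateHost(metadata.hostname)); // true
console.log(isPrivateHost('example.com'));     // false
```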
package/cli.js CHANGED
@@ -119,6 +119,8 @@ async function cmdOpen() {
119
119
  downloadPath: parseFlag('--download-path'),
120
120
  blockAds: hasFlag('--no-block-ads') ? false : undefined,
121
121
  blockUrls: parseFlagAll('--block-urls'),
122
+ blockPrivateNetwork: hasFlag('--block-private-network') || undefined,
123
+ uploadDir: parseFlag('--upload-dir') ? resolve(parseFlag('--upload-dir')) : undefined,
122
124
  };
123
125
 
124
126
  try {
@@ -222,6 +224,8 @@ async function runDaemonInternal() {
222
224
  downloadPath: parseFlag('--download-path'),
223
225
  blockAds: hasFlag('--no-block-ads') ? false : undefined,
224
226
  blockUrls: parseFlagAll('--block-urls'),
227
+ blockPrivateNetwork: hasFlag('--block-private-network') || undefined,
228
+ uploadDir: parseFlag('--upload-dir'),
225
229
  };
226
230
  const outputDir = parseFlag('--output-dir') || resolve('.barebrowse');
227
231
  const url = parseFlag('--url');
@@ -489,6 +493,10 @@ Session:
489
493
  Default: enabled in owned-browser modes, disabled in attach mode.
490
494
  --block-urls=PATTERN Extra URL glob to block (repeatable, e.g. --block-urls='*://*.foo.com/*').
491
495
  Use the =VALUE form when the pattern could be mistaken for a flag.
496
+ --block-private-network SSRF guard: refuse to navigate to loopback / RFC-1918 / link-local /
497
+ cloud-metadata hosts. Off by default so localhost browsing works.
498
+ --upload-dir=DIR Sandbox uploads: reject files outside DIR (symlinks resolved).
499
+ Default: no restriction. (file:/chrome: schemes are always blocked.)
492
500
 
493
501
  Navigation:
494
502
  barebrowse goto <url> Navigate to URL
@@ -39,6 +39,10 @@ All output files go to `.barebrowse/` in the current directory. Read them with t
39
39
  - `--proxy=URL` — HTTP/SOCKS proxy server
40
40
  - `--viewport=WxH` — Viewport size (e.g. 1280x720)
41
41
  - `--storage-state=FILE` — Load cookies/localStorage from JSON file
42
+ - `--block-private-network` — SSRF guard: refuse loopback / RFC-1918 / link-local / cloud-metadata hosts (v0.11.0)
43
+ - `--upload-dir=DIR` — Sandbox uploads to DIR; reject files outside it (v0.11.0)
44
+
45
+ > Security (v0.11.0): `file:`/`chrome:`/etc. navigation is blocked by default, and the daemon requires a per-session token (handled transparently by the CLI). Snapshots and saved state are written owner-only (`0600`).
42
46
 
43
47
  ### Navigation
44
48
 
@@ -38,6 +38,10 @@ All output files go to `.barebrowse/` in the current directory. Read them with t
38
38
  - `--proxy=URL` — HTTP/SOCKS proxy server
39
39
  - `--viewport=WxH` — Viewport size (e.g. 1280x720)
40
40
  - `--storage-state=FILE` — Load cookies/localStorage from JSON file
41
+ - `--block-private-network` — SSRF guard: refuse loopback / RFC-1918 / link-local / cloud-metadata hosts (v0.11.0)
42
+ - `--upload-dir=DIR` — Sandbox uploads to DIR; reject files outside it (v0.11.0)
43
+
44
+ > Security (v0.11.0): `file:`/`chrome:`/etc. navigation is blocked by default, and the daemon requires a per-session token (handled transparently by the CLI). Snapshots and saved state are written owner-only (`0600`).
41
45
 
42
46
  ### Navigation
43
47
 
package/package.json CHANGED
@@ -1,7 +1,13 @@
1
1
  {
2
2
  "name": "barebrowse",
3
- "version": "0.10.1",
3
+ "version": "0.11.0",
4
4
  "description": "Authenticated web browsing for autonomous agents via CDP. URL in, pruned ARIA snapshot out.",
5
+ "repository": {
6
+ "type": "git",
7
+ "url": "git+https://github.com/hamr0/barebrowse.git"
8
+ },
9
+ "homepage": "https://github.com/hamr0/barebrowse#readme",
10
+ "bugs": "https://github.com/hamr0/barebrowse/issues",
5
11
  "type": "module",
6
12
  "main": "src/index.js",
7
13
  "exports": {
@@ -29,7 +35,7 @@
29
35
  "headless"
30
36
  ],
31
37
  "optionalDependencies": {
32
- "wearehere": "^1.0.0"
38
+ "wearehere": "1.0.0"
33
39
  },
34
40
  "license": "Apache-2.0"
35
41
  }
package/src/auth.js CHANGED
@@ -268,6 +268,22 @@ export async function injectCookies(session, cookies) {
268
268
  }
269
269
  }
270
270
 
271
+ /**
272
+ * RFC 6265 domain-match: does `host` belong to a cookie declared for
273
+ * `cookieDomain`? Leading dot on the cookie domain is ignored (host-only
274
+ * vs domain cookies are matched the same here, intentionally — we want
275
+ * parent-domain cookies like .google.com to apply to mail.google.com).
276
+ * @param {string} host - target hostname (e.g. 'mail.google.com')
277
+ * @param {string} cookieDomain - cookie's host_key (e.g. '.google.com')
278
+ * @returns {boolean}
279
+ */
280
+ export function cookieDomainMatch(host, cookieDomain) {
281
+ const h = String(host).toLowerCase();
282
+ const d = String(cookieDomain).toLowerCase().replace(/^\./, '');
283
+ if (!d) return false;
284
+ return h === d || h.endsWith('.' + d);
285
+ }
286
+
271
287
  /**
272
288
  * Extract cookies for a URL and inject them into a CDP session.
273
289
  * Convenience function combining extractCookies + injectCookies.
@@ -276,12 +292,18 @@ export async function injectCookies(session, cookies) {
276
292
  * @param {object} [opts] - Options passed to extractCookies
277
293
  */
278
294
  export async function authenticate(session, url, opts = {}) {
279
- // Strip to registrable domain so mail.google.com → google.com
280
- // This ensures parent-domain cookies (.google.com) are included
281
- const hostname = new URL(url).hostname.replace(/^www\./, '');
282
- const parts = hostname.split('.');
283
- const domain = parts.length > 2 ? parts.slice(-2).join('.') : hostname;
284
- const cookies = extractCookies({ ...opts, domain });
295
+ const fullHost = new URL(url).hostname.toLowerCase();
296
+ // Coarse SQL pre-filter: strip to a registrable-ish domain so the LIKE query
297
+ // returns a superset (incl. parent-domain cookies). slice(-2) is a cheap
298
+ // heuristic: it over-selects for multi-part eTLDs (co.uk) and as a substring
299
+ // match, so the precise RFC-6265 domain-match below is what actually decides
300
+ // which cookies get injected. Without it, browsing apple.com would inject
301
+ // cookies for apple.com.evil.org and every *.co.uk site (verified).
302
+ const noWww = fullHost.replace(/^www\./, '');
303
+ const parts = noWww.split('.');
304
+ const coarseDomain = parts.length > 2 ? parts.slice(-2).join('.') : noWww;
305
+ const candidates = extractCookies({ ...opts, domain: coarseDomain });
306
+ const cookies = candidates.filter((c) => cookieDomainMatch(fullHost, c.domain));
285
307
  if (cookies.length > 0) {
286
308
  await injectCookies(session, cookies);
287
309
  }
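The effect of the new filter is easiest to see on the hosts named in the changelog. The function body below is copied from the `cookieDomainMatch` hunk above (minus the `export` keyword so the snippet runs standalone); the example cookie domains are illustrative.

```javascript
// Copied from the auth.js hunk: exact match, or host is a dot-separated
// subdomain of the cookie domain. A leading dot on the cookie domain is ignored.
function cookieDomainMatch(host, cookieDomain) {
  const h = String(host).toLowerCase();
  const d = String(cookieDomain).toLowerCase().replace(/^\./, '');
  if (!d) return false;
  return h === d || h.endsWith('.' + d);
}

// Candidates the coarse LIKE '%apple.com%' pre-filter would return:
console.log(cookieDomainMatch('apple.com', 'apple.com'));          // true
console.log(cookieDomainMatch('apple.com', 'apple.com.evil.org')); // false
console.log(cookieDomainMatch('apple.com', 'notapple.com'));       // false

// Parent-domain cookies still apply to subdomains:
console.log(cookieDomainMatch('mail.google.com', '.google.com'));  // true

// Sibling sites under a multi-part eTLD no longer leak into each other:
console.log(cookieDomainMatch('mybank.co.uk', '.otherbank.co.uk')); // false
```

The `'.' + d` prefix in the suffix test is what separates `notapple.com` from `apple.com`: a bare `endsWith(d)` would accept both.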
package/src/chromium.js CHANGED
@@ -5,9 +5,14 @@
5
5
  * Modes: headless (launch new, no UI), headed (launch new, visible window).
6
6
  */
7
7
 
8
- import { execSync, spawn } from 'node:child_process';
8
+ import { execFileSync, spawn } from 'node:child_process';
9
9
  import { existsSync, rmSync } from 'node:fs';
10
10
 
11
+ /** Block the current thread for `ms` without spawning a process. */
12
+ function sleepSync(ms) {
13
+ Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
14
+ }
15
+
11
16
  // Track launched browsers so we can clean them up if the parent crashes.
12
17
  // Registered exit handlers (one-time) iterate this set on shutdown.
13
18
  const activeBrowsers = new Set();
@@ -29,7 +34,7 @@ function reapAllSync() {
29
34
  for (const b of toReap) {
30
35
  for (let i = 0; i < 20; i++) {
31
36
  try { process.kill(b.process.pid, 0); } catch { break; }
32
- try { execSync('sleep 0.05'); } catch {}
37
+ sleepSync(50);
33
38
  }
34
39
  if (b.ownedProfileDir) {
35
40
  try { rmSync(b.ownedProfileDir, { recursive: true, force: true }); } catch {}
@@ -84,8 +89,11 @@ export function findBrowser() {
84
89
  if (existsSync(candidate)) return candidate;
85
90
  continue;
86
91
  }
87
- // Relative name — check via which
88
- const path = execSync(`which ${candidate} 2>/dev/null`, { encoding: 'utf8' }).trim();
92
+ // Relative name — resolve via `which` (execFile: no shell, no injection)
93
+ const path = execFileSync('which', [candidate], {
94
+ encoding: 'utf8',
95
+ stdio: ['ignore', 'pipe', 'ignore'],
96
+ }).trim();
89
97
  if (path) return path;
90
98
  } catch {
91
99
  // Not found, try next
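The `sleepSync` helper added in this file replaces a `sleep` subprocess with `Atomics.wait`. The standalone sketch below uses the same technique and additionally returns the wait result so the behaviour is observable; the 50 ms figure matches the call in `reapAllSync`, and the variable names are illustrative.

```javascript
// Block the current thread on a private SharedArrayBuffer slot that nothing
// else will ever notify, so the wait always runs to its timeout.
function sleepSync(ms) {
  const slot = new Int32Array(new SharedArrayBuffer(4));
  // The slot holds 0 and we wait while it equals 0, so Atomics.wait
  // returns 'timed-out' after roughly `ms` milliseconds.
  return Atomics.wait(slot, 0, 0, ms);
}

const started = Date.now();
const outcome = sleepSync(50);
const elapsed = Date.now() - started;
console.log(outcome, elapsed);
```

Because nothing is spawned, this also works inside synchronous exit handlers, where starting a child process per poll iteration would be comparatively expensive.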
package/src/daemon.js CHANGED
@@ -8,9 +8,25 @@
8
8
  import { createServer } from 'node:http';
9
9
  import { spawn } from 'node:child_process';
10
10
  import { writeFileSync, mkdirSync, existsSync, readFileSync, unlinkSync } from 'node:fs';
11
+ import { randomBytes, timingSafeEqual } from 'node:crypto';
11
12
  import { join, resolve } from 'node:path';
12
13
  import { connect } from './index.js';
13
14
 
15
+ /** Owner-only file write helper — daemon artifacts can hold authenticated content. */
16
+ function writeFilePrivate(path, data) {
17
+ writeFileSync(path, data, { mode: 0o600 });
18
+ }
19
+
20
+ /** Constant-time token compare; false on any length/format mismatch. */
21
+ function tokenMatches(expected, got) {
22
+ if (typeof got !== 'string' || got.length !== expected.length) return false;
23
+ try {
24
+ return timingSafeEqual(Buffer.from(got), Buffer.from(expected));
25
+ } catch {
26
+ return false;
27
+ }
28
+ }
29
+
14
30
  const SESSION_FILE = 'session.json';
15
31
 
16
32
  /**
@@ -19,7 +35,7 @@ const SESSION_FILE = 'session.json';
19
35
  */
20
36
  export async function startDaemon(opts, outputDir, initialUrl) {
21
37
  const absDir = resolve(outputDir);
22
- mkdirSync(absDir, { recursive: true });
38
+ mkdirSync(absDir, { recursive: true, mode: 0o700 });
23
39
 
24
40
  // Clean stale session
25
41
  const sessionPath = join(absDir, SESSION_FILE);
@@ -44,6 +60,8 @@ export async function startDaemon(opts, outputDir, initialUrl) {
44
60
  if (Array.isArray(opts.blockUrls)) {
45
61
  for (const p of opts.blockUrls) args.push('--block-urls', p);
46
62
  }
63
+ if (opts.blockPrivateNetwork) args.push('--block-private-network');
64
+ if (opts.uploadDir) args.push('--upload-dir', opts.uploadDir);
47
65
 
48
66
  const child = spawn(process.execPath, args, {
49
67
  detached: true,
@@ -75,7 +93,13 @@ export async function startDaemon(opts, outputDir, initialUrl) {
75
93
  */
76
94
  export async function runDaemon(opts, outputDir, initialUrl) {
77
95
  const absDir = resolve(outputDir);
78
- mkdirSync(absDir, { recursive: true });
96
+ mkdirSync(absDir, { recursive: true, mode: 0o700 });
97
+
98
+ // Per-session auth token. The daemon binds to loopback, but loopback is
99
+ // shared across local users — without a token any local user/process could
100
+ // POST /command and drive the authenticated browser (incl. `eval`). The
101
+ // token is written into session.json (mode 0600) so only the owner reads it.
102
+ const authToken = randomBytes(32).toString('hex');
79
103
 
80
104
  // Connect to browser
81
105
  const page = await connect({
@@ -88,6 +112,8 @@ export async function runDaemon(opts, outputDir, initialUrl) {
88
112
  downloadPath: opts.downloadPath,
89
113
  blockAds: opts.blockAds,
90
114
  blockUrls: opts.blockUrls,
115
+ blockPrivateNetwork: opts.blockPrivateNetwork,
116
+ uploadDir: opts.uploadDir,
91
117
  });
92
118
 
93
119
  // Console log capture
@@ -161,7 +187,7 @@ export async function runDaemon(opts, outputDir, initialUrl) {
161
187
  const text = await page.snapshot({ mode: pruneMode });
162
188
  const ts = new Date().toISOString().replace(/[:.]/g, '-');
163
189
  const file = join(absDir, `page-${ts}.yml`);
164
- writeFileSync(file, text);
190
+ writeFilePrivate(file, text);
165
191
  return { ok: true, file };
166
192
  },
167
193
 
@@ -170,7 +196,7 @@ export async function runDaemon(opts, outputDir, initialUrl) {
170
196
  const ts = new Date().toISOString().replace(/[:.]/g, '-');
171
197
  const ext = format || 'png';
172
198
  const file = join(absDir, `screenshot-${ts}.${ext}`);
173
- writeFileSync(file, Buffer.from(data, 'base64'));
199
+ writeFilePrivate(file, Buffer.from(data, 'base64'));
174
200
  return { ok: true, file };
175
201
  },
176
202
 
@@ -244,7 +270,7 @@ export async function runDaemon(opts, outputDir, initialUrl) {
244
270
  const data = await page.pdf({ landscape });
245
271
  const ts = new Date().toISOString().replace(/[:.]/g, '-');
246
272
  const file = join(absDir, `page-${ts}.pdf`);
247
- writeFileSync(file, Buffer.from(data, 'base64'));
273
+ writeFilePrivate(file, Buffer.from(data, 'base64'));
248
274
  return { ok: true, file };
249
275
  },
250
276
 
@@ -273,7 +299,7 @@ export async function runDaemon(opts, outputDir, initialUrl) {
273
299
  async 'dialog-log'() {
274
300
  const ts = new Date().toISOString().replace(/[:.]/g, '-');
275
301
  const file = join(absDir, `dialogs-${ts}.json`);
276
- writeFileSync(file, JSON.stringify(page.dialogLog, null, 2));
302
+ writeFilePrivate(file, JSON.stringify(page.dialogLog, null, 2));
277
303
  return { ok: true, file, count: page.dialogLog.length };
278
304
  },
279
305
 
@@ -304,7 +330,7 @@ export async function runDaemon(opts, outputDir, initialUrl) {
304
330
  if (level) logs = logs.filter((l) => l.type === level);
305
331
  const ts = new Date().toISOString().replace(/[:.]/g, '-');
306
332
  const file = join(absDir, `console-${ts}.json`);
307
- writeFileSync(file, JSON.stringify(logs, null, 2));
333
+ writeFilePrivate(file, JSON.stringify(logs, null, 2));
308
334
  if (clear) consoleLogs.length = 0;
309
335
  return { ok: true, file, count: logs.length };
310
336
  },
@@ -314,7 +340,7 @@ export async function runDaemon(opts, outputDir, initialUrl) {
314
340
  if (failed) logs = logs.filter((l) => l.status === 0 || l.status >= 400);
315
341
  const ts = new Date().toISOString().replace(/[:.]/g, '-');
316
342
  const file = join(absDir, `network-${ts}.json`);
317
- writeFileSync(file, JSON.stringify(logs, null, 2));
343
+ writeFilePrivate(file, JSON.stringify(logs, null, 2));
318
344
  return { ok: true, file, count: logs.length };
319
345
  },
320
346
 
@@ -346,6 +372,14 @@ export async function runDaemon(opts, outputDir, initialUrl) {
346
372
  return;
347
373
  }
348
374
 
375
+ // Require the per-session token. Rejects any local process that hasn't
376
+ // read session.json (which is owner-only). Constant-time compare.
377
+ if (!tokenMatches(authToken, req.headers['x-barebrowse-token'])) {
378
+ res.writeHead(401, { 'Content-Type': 'application/json' });
379
+ res.end(JSON.stringify({ ok: false, error: 'Unauthorized: missing or invalid token' }));
380
+ return;
381
+ }
382
+
349
383
  let body = '';
350
384
  for await (const chunk of req) body += chunk;
351
385
 
@@ -388,11 +422,13 @@ export async function runDaemon(opts, outputDir, initialUrl) {
388
422
 
389
423
  const port = server.address().port;
390
424
 
391
- // Write session.json so parent/clients can find us
425
+ // Write session.json so parent/clients can find us. Owner-only: it carries
426
+ // the auth token that gates /command.
392
427
  const sessionPath = join(absDir, SESSION_FILE);
393
- writeFileSync(sessionPath, JSON.stringify({
428
+ writeFilePrivate(sessionPath, JSON.stringify({
394
429
  port,
395
430
  pid: process.pid,
431
+ token: authToken,
396
432
  startedAt: new Date().toISOString(),
397
433
  }));
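The token scheme in this file can be exercised in isolation. The sketch below mints a token the way `runDaemon` does and reuses the `tokenMatches` logic from the hunk above; the inputs in the usage lines are made up for illustration.

```javascript
import { randomBytes, timingSafeEqual } from 'node:crypto';

// 32 random bytes, hex-encoded, as in runDaemon: a 64-character string.
const authToken = randomBytes(32).toString('hex');

// Same logic as tokenMatches in the diff: reject non-strings and length
// mismatches up front, then compare in constant time.
function tokenMatches(expected, got) {
  if (typeof got !== 'string' || got.length !== expected.length) return false;
  try {
    return timingSafeEqual(Buffer.from(got), Buffer.from(expected));
  } catch {
    return false; // timingSafeEqual throws when byte lengths differ
  }
}

console.log(authToken.length);                        // 64
console.log(tokenMatches(authToken, authToken));      // true
console.log(tokenMatches(authToken, undefined));      // false (header missing)
console.log(tokenMatches(authToken, 'x'.repeat(64))); // false
```

The `try`/`catch` matters because the string-length check compares characters while `timingSafeEqual` compares bytes: a 64-character string containing multi-byte characters passes the first check and would otherwise throw in the second.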
398
434
 
package/src/index.js CHANGED
@@ -18,7 +18,9 @@ import { dismissConsent } from './consent.js';
18
18
  import { applyStealth } from './stealth.js';
19
19
  import { DEFAULT_BLOCKLIST } from './blocklist.js';
20
20
  import { waitForNetworkIdle } from './network-idle.js';
21
+ import { assertNavigable, assertUploadAllowed } from './url-guard.js';
21
22
  import { join as pathJoin } from 'node:path';
23
+ import { chmodSync } from 'node:fs';
22
24
 
23
25
  /**
24
26
  * Browse a URL and return an ARIA snapshot.
@@ -41,6 +43,10 @@ export async function browse(url, opts = {}) {
41
43
  const mode = opts.mode || 'headless';
42
44
  const timeout = opts.timeout || 30000;
43
45
 
46
+ // Reject local-resource schemes (and optionally private hosts) before we
47
+ // spend a browser launch on a URL we won't navigate to.
48
+ assertNavigable(url, { allowLocalUrls: opts.allowLocalUrls, blockPrivateNetwork: opts.blockPrivateNetwork });
49
+
44
50
  let browser = null;
45
51
  let cdp = null;
46
52
  // Forward caller-supplied launch knobs (binary, userDataDir, proxy) into
@@ -154,6 +160,15 @@ export async function browse(url, opts = {}) {
154
160
  * attached to and follows the session across switchTab() until close.
155
161
  * @param {string[]} [opts.blockUrls] - Extra URL glob patterns to block,
156
162
  * merged with the default unless blockAds is false.
163
+ * @param {boolean} [opts.allowLocalUrls=false] - Permit navigation to local-
164
+ * resource schemes (file:, view-source:, chrome:, …). Blocked by default
165
+ * because a prompt-injected agent could use them to read local files.
166
+ * @param {boolean} [opts.blockPrivateNetwork=false] - Reject navigation to
167
+ * loopback / RFC-1918 / link-local / cloud-metadata hosts (SSRF guard).
168
+ * Off by default so localhost dev-server browsing keeps working.
169
+ * @param {string} [opts.uploadDir] - When set, upload() rejects any file that
170
+ * does not resolve (symlinks included) inside this directory. Sandboxes the
171
+ * agent's file-upload capability. Default: no restriction.
157
172
  * @returns {Promise<object>} Page handle with goto, snapshot, close
158
173
  */
159
174
  export async function connect(opts = {}) {
@@ -164,6 +179,11 @@ export async function connect(opts = {}) {
164
179
  // Forward caller-supplied launch knobs into every launch() below,
165
180
  // including hybrid-fallback re-launches inside goto().
166
181
  const launchOpts = { proxy: opts.proxy, binary: opts.binary, userDataDir: opts.userDataDir };
182
+ // Navigation safety policy, applied on every goto()/createTab().goto().
183
+ const urlGuard = { allowLocalUrls: opts.allowLocalUrls, blockPrivateNetwork: opts.blockPrivateNetwork };
184
+ // Optional upload sandbox: when set, upload() rejects files outside this dir.
185
+ // assertUploadAllowed resolves it (realpath) at check time.
186
+ const uploadDir = opts.uploadDir || null;
167
187
 
168
188
  if (attachMode) {
169
189
  // Reuse the user's running browser — do not launch, do not own the
@@ -312,6 +332,7 @@ export async function connect(opts = {}) {
312
332
 
313
333
  return {
314
334
  async goto(url, timeout = 30000) {
335
+ assertNavigable(url, urlGuard);
315
336
  // Refs from the previous page are about to become invalid — clear
316
337
  // before navigating so a stale click(ref) errors clearly instead of
317
338
  // silently resolving to whatever backendNodeId happens to still be in
@@ -467,6 +488,10 @@ export async function connect(opts = {}) {
467
488
  async upload(ref, files) {
468
489
  const entry = refMap.get(ref);
469
490
  if (!entry) throw new Error(`No element found for ref "${ref}"`);
491
+ // Upload sandbox: when uploadDir is set, every path must resolve
492
+ // (symlinks included, via realpath) inside it. Stops a prompt-injected
493
+ // agent from attaching ~/.ssh/id_rsa or other arbitrary local files.
494
+ assertUploadAllowed(files, uploadDir);
470
495
  await cdpUpload(entry.session, entry.backendNodeId, files);
471
496
  },
472
497
 
@@ -535,7 +560,10 @@ export async function connect(opts = {}) {
535
560
  });
536
561
  const state = { cookies, localStorage: JSON.parse(result.value || '{}') };
537
562
  const { writeFileSync } = await import('node:fs');
538
- writeFileSync(filePath, JSON.stringify(state, null, 2));
563
+ // State holds cookies + localStorage (session tokens) — write owner-only
564
+ // so a multi-user host can't read another user's credentials off disk.
565
+ writeFileSync(filePath, JSON.stringify(state, null, 2), { mode: 0o600 });
566
+ try { chmodSync(filePath, 0o600); } catch { /* best effort if pre-existing */ }
539
567
  },
540
568
 
541
569
  get botBlocked() { return botBlocked; },
@@ -590,6 +618,7 @@ export async function connect(opts = {}) {
590
618
  let tabBotBlocked = false;
591
619
  return {
592
620
  async goto(url, timeout = 30000) {
621
+ assertNavigable(url, urlGuard);
593
622
  await navigate(tab, url, timeout);
594
623
  if (opts.consent !== false) {
595
624
  await dismissConsent(tab.session);
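
The upload sandbox that `upload()` now calls into comes down to a path-containment test on already-resolved paths. The sketch below isolates that comparison. `isInsideDir` is a hypothetical helper name, not part of barebrowse, and it assumes both paths were already canonicalized (for example via realpath) and use `/` as the separator. It shows why the separator is appended before the prefix check: without it, a sibling such as `/srv/uploads-evil` would pass as being inside `/srv/uploads`.

```javascript
// Hypothetical helper, not part of barebrowse: containment test on two
// already-canonicalized absolute paths (POSIX separator assumed).
function isInsideDir(realPath, baseReal, sep = '/') {
  // The base directory itself counts as inside; anything else must start
  // with "<base><sep>" so that look-alike sibling directories are rejected.
  return realPath === baseReal || realPath.startsWith(baseReal + sep);
}

console.log(isInsideDir('/srv/uploads/report.pdf', '/srv/uploads')); // true
console.log(isInsideDir('/srv/uploads-evil/key', '/srv/uploads'));   // false
console.log(isInsideDir('/home/user/.ssh/id_rsa', '/srv/uploads'));  // false
```

A plain `startsWith(baseReal)` would accept the second path, which is the case the appended separator guards against.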
@@ -53,7 +53,11 @@ export async function sendCommand(command, args, outputDir) {
53
53
  try {
54
54
  res = await fetch(`http://127.0.0.1:${session.port}/command`, {
55
55
  method: 'POST',
56
- headers: { 'Content-Type': 'application/json' },
56
+ headers: {
57
+ 'Content-Type': 'application/json',
58
+ // Authenticate to the daemon with the per-session token from session.json.
59
+ ...(session.token ? { 'x-barebrowse-token': session.token } : {}),
60
+ },
57
61
  body: JSON.stringify({ command, args }),
58
62
  signal: AbortSignal.timeout(60000),
59
63
  });
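
The header change in `sendCommand` relies on a conditional object spread, so a session file without a token still produces a valid request. A minimal sketch of that pattern in isolation follows. `buildHeaders` is a hypothetical name used only for illustration, and the `session` shape (an object with an optional `token` string) is assumed from the hunk above.

```javascript
// Hypothetical helper mirroring the header construction in the hunk above.
function buildHeaders(session) {
  return {
    'Content-Type': 'application/json',
    // Spreading an empty object adds no keys, so a session without a token
    // sends no auth header at all instead of an "undefined" value.
    ...(session.token ? { 'x-barebrowse-token': session.token } : {}),
  };
}

console.log(buildHeaders({ port: 9222, token: 'abc123' }));
console.log(buildHeaders({ port: 9222 }));
```

The first call includes the `x-barebrowse-token` key and the second omits it. The port value here is an arbitrary placeholder.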
@@ -0,0 +1,138 @@
1
+ /**
2
+ * url-guard.js — Navigation safety checks for goto()/browse().
3
+ *
4
+ * Closes two confirmed vectors for an autonomous (and therefore
5
+ * prompt-injectable) agent:
6
+ * 1. Local-resource schemes (file:, view-source:, chrome:, …) that let a
7
+ * page-sourced instruction read local files or browser internals.
8
+ * 2. Optional private-network blocking (loopback, RFC-1918, link-local,
9
+ * cloud-metadata) to stop SSRF to internal services.
10
+ *
11
+ * Scheme blocking is on by default; private-network blocking is opt-in
12
+ * (blockPrivateNetwork) so localhost dev-server browsing keeps working.
13
+ *
14
+ * Limitation: private-network checks match the URL hostname only. A public
15
+ * DNS name that resolves to a private IP (DNS rebinding) is NOT caught here —
16
+ * that needs connection-time IP inspection. Documented, not silently assumed.
17
+ */
18
+
19
+ import { realpathSync } from 'node:fs';
20
+ import { resolve, sep } from 'node:path';
21
+
22
+ // Schemes safe to navigate to. Everything else is treated as a local-resource
23
+ // or browser-internal scheme and blocked unless allowLocalUrls is set.
24
+ // data:/blob:/about: stay allowed: opaque origins, no file:// or cross-origin
25
+ // read, and data: is the library's test-fixture mechanism.
26
+ const ALLOWED_SCHEMES = new Set(['http:', 'https:', 'data:', 'blob:', 'about:']);
27
+
28
+ /**
29
+ * @param {string} host - hostname (no brackets for IPv6)
30
+ * @returns {boolean} true if it names a private/loopback/link-local/internal host
31
+ */
32
+ function isPrivateHost(host) {
33
+ const h = host.toLowerCase().replace(/^\[|\]$/g, ''); // strip IPv6 brackets
34
+
35
+ // Internal hostnames
36
+ if (h === 'localhost' || h.endsWith('.localhost')) return true;
37
+ if (h.endsWith('.local') || h.endsWith('.internal')) return true;
38
+ if (h === 'metadata.google.internal') return true;
39
+
40
+ // IPv4 (incl. ranges)
41
+ const v4 = h.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
42
+ if (v4) {
43
+ const [a, b] = [Number(v4[1]), Number(v4[2])];
44
+ if (a === 127) return true; // loopback 127.0.0.0/8
45
+ if (a === 10) return true; // 10.0.0.0/8
46
+ if (a === 0) return true; // 0.0.0.0/8
47
+ if (a === 169 && b === 254) return true; // link-local / cloud metadata
48
+ if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
49
+ if (a === 192 && b === 168) return true; // 192.168.0.0/16
50
+ return false;
51
+ }
52
+
53
+ // IPv6 — gated on the host actually being an IPv6 literal (contains a
54
+ // colon). Without this gate, ordinary hostnames like "fcbarcelona.com" or
55
+ // "fdic.gov" would match the fc00::/7 ULA prefix check and be wrongly blocked.
56
+ if (h.includes(':')) {
57
+ if (h === '::1' || h === '::') return true; // loopback / unspecified
58
+ if (/^fe[89ab][0-9a-f]:/.test(h)) return true; // link-local fe80::/10 (fe80 through febf)
59
+ if (h.startsWith('fc') || h.startsWith('fd')) return true; // fc00::/7 ULA
60
+ // IPv4-mapped IPv6, dotted form only (e.g. ::ffff:127.0.0.1); new URL() hostnames use hex (::ffff:7f00:1) and are not matched
61
+ const mapped = h.match(/::ffff:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})$/);
62
+ if (mapped) return isPrivateHost(mapped[1]);
63
+ return false;
64
+ }
65
+
66
+ return false;
67
+ }
68
+
69
+ /**
70
+ * Throw if `url` is unsafe to navigate to under the given policy.
71
+ * @param {string} url
72
+ * @param {object} [opts]
73
+ * @param {boolean} [opts.allowLocalUrls=false] - permit file:/chrome:/etc.
74
+ * @param {boolean} [opts.blockPrivateNetwork=false] - reject loopback/RFC-1918/metadata.
75
+ */
76
+ export function assertNavigable(url, opts = {}) {
77
+ let parsed;
78
+ try {
79
+ parsed = new URL(url);
80
+ } catch {
81
+ throw new Error(`Refusing to navigate: not a valid URL (${String(url).slice(0, 80)})`);
82
+ }
83
+
84
+ if (!opts.allowLocalUrls && !ALLOWED_SCHEMES.has(parsed.protocol)) {
85
+ throw new Error(
86
+ `Refusing to navigate to "${parsed.protocol}" URL — local-resource and ` +
87
+ `browser-internal schemes are blocked (reads local files / browser state). ` +
88
+ `Pass { allowLocalUrls: true } to override.`
89
+ );
90
+ }
91
+
92
+ if (
93
+ opts.blockPrivateNetwork &&
94
+ (parsed.protocol === 'http:' || parsed.protocol === 'https:') &&
95
+ parsed.hostname &&
96
+ isPrivateHost(parsed.hostname)
97
+ ) {
98
+ throw new Error(
99
+ `Refusing to navigate to private/internal host "${parsed.hostname}" — ` +
100
+ `blockPrivateNetwork is enabled (SSRF guard). ` +
101
+ `Unset it to allow localhost / internal browsing.`
102
+ );
103
+ }
104
+ }
105
+
106
+ /**
107
+ * Throw if any file in `files` resolves outside `uploadDir`. Both the base
108
+ * dir and each file are resolved through realpath, so symlinks (in either the
109
+ * base path — e.g. macOS /tmp → /private/tmp — or the file) can't be used to
110
+ * escape the sandbox or to false-reject a legitimate file.
111
+ * No-op when `uploadDir` is falsy (no restriction configured).
112
+ * @param {string|string[]} files
113
+ * @param {string|null} uploadDir
114
+ */
115
+ export function assertUploadAllowed(files, uploadDir) {
116
+ if (!uploadDir) return;
117
+ let baseReal;
118
+ try {
119
+ baseReal = realpathSync(resolve(uploadDir));
120
+ } catch {
121
+ throw new Error(`upload: uploadDir does not exist or is unreadable (${uploadDir})`);
122
+ }
123
+ const list = Array.isArray(files) ? files : [files];
124
+ for (const f of list) {
125
+ let real;
126
+ try {
127
+ real = realpathSync(resolve(String(f)));
128
+ } catch {
129
+ throw new Error(`upload: cannot resolve "${f}" (must exist inside uploadDir)`);
130
+ }
131
+ if (real !== baseReal && !real.startsWith(baseReal + sep)) {
132
+ throw new Error(`upload: "${f}" is outside the allowed uploadDir (${uploadDir})`);
133
+ }
134
+ }
135
+ }
136
+
137
+ // Exported for unit tests.
138
+ export { isPrivateHost };
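
One case worth checking against the IPv4-mapped branch of `isPrivateHost`: the WHATWG URL parser behind `new URL()` serializes an IPv4-mapped literal in hexadecimal form, so `parsed.hostname` for `http://[::ffff:127.0.0.1]/` is `[::ffff:7f00:1]` rather than the dotted form the regex expects. The sketch below is one possible way to recover the embedded IPv4 address from that form. `mappedToIPv4` is a hypothetical helper, not part of url-guard.js, and it only handles the compressed `::ffff:hhhh:hhhh` spelling that the URL serializer produces.

```javascript
// Hypothetical helper: decode "::ffff:7f00:1" (the hex form produced by the
// WHATWG URL serializer) into dotted IPv4 "127.0.0.1". Returns null for
// anything that is not an IPv4-mapped literal in that form.
function mappedToIPv4(host) {
  const h = host.toLowerCase().replace(/^\[|\]$/g, ''); // strip IPv6 brackets
  const m = h.match(/^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/);
  if (!m) return null;
  const hi = parseInt(m[1], 16); // upper 16 bits of the IPv4 address
  const lo = parseInt(m[2], 16); // lower 16 bits of the IPv4 address
  return [hi >> 8, hi & 0xff, lo >> 8, lo & 0xff].join('.');
}

const { hostname } = new URL('http://[::ffff:127.0.0.1]/');
console.log(hostname);               // [::ffff:7f00:1]
console.log(mappedToIPv4(hostname)); // 127.0.0.1
```

The dotted result could then be passed through the existing IPv4 range checks, so that mapped loopback and RFC-1918 addresses are classified the same way as their plain IPv4 spellings.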