barebrowse 0.6.1 → 0.7.1

package/CHANGELOG.md CHANGED
@@ -1,5 +1,52 @@
  # Changelog
 
+ ## 0.7.1
+
+ Fix: timeout now triggers auto-retry instead of bypassing it.
+
+ ### Bug fix (`mcp-server.js`)
+ - **Root cause:** The 30s timeout was a `Promise.race` *outside* `withRetry()`. When a page timed out, the race rejected immediately — `withRetry` never got a chance to reset the session and retry. Timeouts also didn't match `isCdpDead()`, so even if they did reach `withRetry`, they wouldn't be retried.
+ - **Fix:** Moved per-attempt timeout *inside* `withRetry()`. Each attempt gets its own 30s deadline. On timeout or CDP death, the session resets and a fresh attempt runs. The outer `Promise.race` is removed entirely.
+ - `isCdpDead()` renamed to `isTransient()` — now also matches timeout errors (`"Timeout waiting for CDP event"`, `"timed out"`)
+ - Non-transient errors (validation, unknown tool) are still not retried
+
+ ### Tests
+ - 11 new unit tests in `test/unit/mcp.test.js`: `isTransient` detection (CDP death, timeouts, non-transient), `withRetry` behavior (success, CDP retry, timeout retry, no-retry for validation, double-failure, no-timeout mode)
+ - 80 total tests (39 unit + 41 integration)
+
+ ## 0.7.0
+
+ MCP resilience: timeouts, auto-retry, LLM-friendly scroll, and click fallback for hidden elements.
+
+ ### Timeouts (`mcp-server.js`)
+ - All MCP tool calls now have a hard timeout: 30s for session tools, 60s for `browse` and `assess`
+ - Returns a structured error (`Tool "X" timed out after Ns`) instead of hanging silently
+ - Previously: a hung browser or slow page caused `[Tool result missing due to internal error]` — opaque and unrecoverable
+
+ ### Auto-retry (`mcp-server.js`)
+ - `withRetry()` wrapper on all session tools (goto, snapshot, click, type, press, scroll, back, forward, drag, upload, pdf)
+ - On transient CDP failure (WebSocket closed, target/session closed), resets the session and retries once automatically
+ - Non-CDP errors (validation, unknown tool) are not retried
+
+ ### LLM-friendly scroll (`mcp-server.js`, `src/bareagent.js`)
+ - Scroll tool now accepts `direction: "up"/"down"` in addition to numeric `deltaY`
+ - LLMs naturally say `scroll(direction: "down")` — this now works instead of crashing with `deltaX/deltaY expected for mouseWheel event`
+ - `"down"` → `deltaY: 900`, `"up"` → `deltaY: -900`. Numeric `deltaY` still works and takes precedence.
+ - Clear validation error if neither `direction` nor `deltaY` is provided
+
+ ### Click JS fallback (`src/interact.js`)
+ - Click now falls back to JS `element.click()` when `DOM.scrollIntoViewIfNeeded` fails with "Node does not have a layout object"
+ - This error occurs on elements that exist in the ARIA tree but have no visual layout (display:none, zero-size, collapsed sections, detached nodes)
+ - Resolves the node via `DOM.requestNode` → `DOM.resolveNode` → `Runtime.callFunctionOn`
+ - Other click errors still throw normally
+
+ ### Docs
+ - Updated barebrowse.context.md, README.md, prd.md with resilience features
+ - MCP server version string updated to 0.7.0
+
+ ### Tests
+ - 71/71 passing — no test changes needed
+
  ## 0.6.1
 
  Headed fallback is now a per-navigation escape hatch, not a permanent mode switch. Graceful degradation when headed is unavailable.
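Reviewer note: the root cause described in the 0.7.1 entry can be reproduced with a small sketch. Every name below (`withDeadline`, `retryOnce`, `makeFlaky`, `outer`, `inner`) is a hypothetical stand-in, not code from the package. `makeFlaky` simulates a page that hangs on the first call and answers on the second. Unlike the 0.7.0 code, `retryOnce` here already treats timeouts as retryable, so the only difference between the two shapes is where the deadline sits.

```javascript
// Hypothetical stand-ins for illustration; not the package's code.
const isTimeout = (err) => /timed out/.test(err.message || '');

// Reject with a "timed out" error if `promise` does not settle within `ms`.
function withDeadline(promise, ms) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Run `fn`; on a timeout error, run it exactly once more.
async function retryOnce(fn) {
  try {
    return await fn();
  } catch (err) {
    if (!isTimeout(err)) throw err;
    return await fn();
  }
}

// Simulated page: the first call never settles, later calls resolve at once.
function makeFlaky() {
  let calls = 0;
  return () => (++calls === 1 ? new Promise(() => {}) : Promise.resolve('ok'));
}

// 0.7.0 shape: the deadline wraps the retry, so the retry never sees the timeout.
const outer = (fn, ms) => withDeadline(retryOnce(fn), ms);

// 0.7.1 shape: each attempt has its own deadline, so a timeout reaches the retry.
const inner = (fn, ms) => retryOnce(() => withDeadline(fn(), ms));
```

With a flaky function and a short deadline, `outer` rejects with the timeout error while `inner` resolves on its second attempt, which matches the behaviour change the changelog describes.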
package/README.md CHANGED
@@ -87,7 +87,7 @@ Or manually add to your config (`claude_desktop_config.json`, `.cursor/mcp.json`
  }
  ```
 
- 12 tools: `browse`, `goto`, `snapshot`, `click`, `type`, `press`, `scroll`, `back`, `forward`, `drag`, `upload`, `pdf`. Plus `assess` (privacy scan) if [wearehere](https://github.com/hamr0/wearehere) is installed. Session runs in hybrid mode with automatic cookie injection.
+ 12 tools: `browse`, `goto`, `snapshot`, `click`, `type`, `press`, `scroll`, `back`, `forward`, `drag`, `upload`, `pdf`. Plus `assess` (privacy scan) if [wearehere](https://github.com/hamr0/wearehere) is installed. Session runs in hybrid mode with automatic cookie injection. All tools have timeouts (30s/60s) and auto-retry on transient failures.
 
  ### 3. Library -- for agentic automation
 
@@ -129,10 +129,10 @@ Everything the agent can do through barebrowse:
  | **Navigate** | Load a URL, wait for page load, auto-dismiss consent |
  | **Back / Forward** | Browser history navigation |
  | **Snapshot** | Pruned ARIA tree with `[ref=N]` markers. Two modes: `act` (buttons, links, inputs) and `read` (full text). 40-90% token reduction. |
- | **Click** | Scroll into view + mouse click at element center |
+ | **Click** | Scroll into view + mouse click at element center, JS fallback for hidden elements |
  | **Type** | Focus + insert text, with option to clear existing content first |
  | **Press** | Special keys: Enter, Tab, Escape, Backspace, Delete, arrows, Space |
- | **Scroll** | Mouse wheel up or down |
+ | **Scroll** | Mouse wheel up or down (accepts direction or pixels) |
  | **Hover** | Move mouse to element center (triggers tooltips, hover states) |
  | **Select** | Set dropdown value (native select or custom dropdown) |
  | **Drag** | Drag one element to another (Kanban boards, sliders) |
@@ -1,7 +1,7 @@
  # barebrowse -- Integration Guide
 
  > For AI assistants and developers wiring barebrowse into a project.
- > v0.6.1 | Node.js >= 22 | 0 required deps | MIT
+ > v0.7.1 | Node.js >= 22 | 0 required deps | MIT
 
  ## What this is
 
@@ -59,7 +59,7 @@ const snapshot = await browse('https://example.com', {
  | `click(ref)` | ref: string | void | Scroll into view + mouse press+release at center |
  | `type(ref, text, opts?)` | ref: string, text: string, opts: { clear?, keyEvents? } | void | Focus + insert text. `clear: true` replaces existing. |
  | `press(key)` | key: string | void | Special key: Enter, Tab, Escape, Backspace, Delete, arrows, Home, End, PageUp, PageDown, Space |
- | `scroll(deltaY)` | deltaY: number | void | Mouse wheel. Positive = down, negative = up. |
+ | `scroll(deltaY)` | deltaY: number | void | Mouse wheel. Positive = down, negative = up. MCP/bareagent also accept `direction: "up"/"down"`. |
  | `hover(ref)` | ref: string | void | Move mouse to element center |
  | `select(ref, value)` | ref: string, value: string | void | Set `<select>` value or click custom dropdown option |
  | `drag(fromRef, toRef)` | fromRef: string, toRef: string | void | Drag from one element to another |
@@ -149,7 +149,7 @@ barebrowse can inject cookies from the user's real browser sessions, bypassing l
  | Media autoplay blocked | `--autoplay-policy=no-user-gesture-required` | Both |
  | Login walls | Cookie extraction from Firefox/Chromium + CDP injection | Both |
  | Pre-filled form inputs | `type({ clear: true })` selects all + deletes first | Both |
- | Off-screen elements | `DOM.scrollIntoViewIfNeeded` before every click | Both |
+ | Off-screen elements | `DOM.scrollIntoViewIfNeeded` before every click, JS `.click()` fallback for no-layout elements | Both |
  | Form submission | `press('Enter')` triggers onsubmit | Both |
  | SPA navigation | `waitForNavigation()` uses loadEventFired + frameNavigated | Both |
  | Bot detection | ARIA node count (<50 = bot-blocked) + text heuristics. `botBlocked` flag set after every `goto()`. Hybrid fallback switches to headed. Snapshot shows `[BOT CHALLENGE DETECTED]` warning. | Hybrid |
@@ -241,7 +241,7 @@ Action tools return `'ok'` -- the agent calls `snapshot` explicitly to observe.
 
  Session runs in hybrid mode (headless with automatic headed fallback on bot detection). `goto` injects cookies from the user's browser before navigation for authenticated access.
 
- Session tools share a singleton page, lazy-created on first use. Assess tries headless first; if bot-blocked (score ≤5 with all zeros), retries with a separate headed session. Tabs dismissed for consent and closed after every scan. Max 3 concurrent. Browser OOM/crash auto-recovers (session resets, server stays alive).
+ Session tools share a singleton page, lazy-created on first use. All session tools have auto-retry on transient failures (browser crash, WebSocket close, navigation timeout) — each attempt gets its own 30s deadline, session resets between attempts, retries once automatically. Scroll accepts `direction: "up"/"down"` in addition to numeric `deltaY`. Click falls back to JS `.click()` when elements have no layout. `browse` has a 60s timeout (no retry stateless). Assess tries headless first; if bot-blocked, retries headed. Browser OOM/crash auto-recovers (session resets, server stays alive).
 
  ## Architecture
 
package/mcp-server.js CHANGED
@@ -20,9 +20,38 @@ try {
  } catch {}
 
 
- function isCdpDead(err) {
+ function isTransient(err) {
    const m = err.message || '';
-   return m.includes('WebSocket') || m.includes('Target closed') || m.includes('Session closed') || m.includes('CDP');
+   return m.includes('WebSocket') || m.includes('Target closed') || m.includes('Session closed')
+     || m.includes('CDP') || m.includes('Timeout waiting for CDP event') || m.includes('timed out');
+ }
+
+ /**
+  * Retry-once wrapper with per-attempt timeout.
+  * On transient failure (CDP death OR timeout), resets session and retries once.
+  * @param {Function} fn - async function to execute
+  * @param {number} timeoutMs - per-attempt timeout in ms
+  */
+ async function withRetry(fn, timeoutMs) {
+   async function attempt() {
+     if (!timeoutMs) return await fn();
+     let timer;
+     const result = await Promise.race([
+       fn(),
+       new Promise((_, rej) => { timer = setTimeout(() => rej(new Error(`timed out after ${timeoutMs / 1000}s`)), timeoutMs); }),
+     ]);
+     clearTimeout(timer);
+     return result;
+   }
+
+   try {
+     return await attempt();
+   } catch (err) {
+     if (!isTransient(err)) throw err;
+     // Transient failure — reset session and retry once
+     _page = null;
+     return await attempt();
+   }
  }
 
  const MAX_CHARS_DEFAULT = 30000;
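Reviewer note: in `attempt()` above, `clearTimeout(timer)` runs only when the race resolves. If `fn()` rejects first, the timer appears to stay armed until it fires; that is probably harmless in a long-running server, since the race has already settled, but it does keep a pending timer around for up to `timeoutMs`. A possible variation, sketched here under the hypothetical name `raceWithTimeout` and not taken from the package, clears the timer on every path.

```javascript
// Hypothetical variation on attempt(); not the package's code.
// Races fn() against a deadline and always clears the timer afterwards.
async function raceWithTimeout(fn, timeoutMs) {
  if (!timeoutMs) return fn();
  let timer;
  try {
    return await Promise.race([
      fn(),
      new Promise((_, reject) => {
        timer = setTimeout(
          () => reject(new Error(`timed out after ${timeoutMs / 1000}s`)),
          timeoutMs,
        );
      }),
    ]);
  } finally {
    clearTimeout(timer); // runs on success, on fn() failure, and on timeout
  }
}
```

The error message format is kept identical to the one in the diff so that an `isTransient`-style check on `"timed out"` would still match.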
@@ -139,13 +168,13 @@ const TOOLS = [
    },
    {
      name: 'scroll',
-     description: 'Scroll the page. Positive deltaY scrolls down, negative scrolls up. Returns ok.',
+     description: 'Scroll the page up or down. Pass direction ("up"/"down") or a numeric deltaY. Returns ok.',
      inputSchema: {
        type: 'object',
        properties: {
-         deltaY: { type: 'number', description: 'Pixels to scroll (positive=down, negative=up)' },
+         direction: { type: 'string', enum: ['up', 'down'], description: 'Scroll direction "up" or "down" (scrolls ~3 screen-heights)' },
+         deltaY: { type: 'number', description: 'Pixels to scroll (positive=down, negative=up). Overrides direction if both given.' },
        },
-       required: ['deltaY'],
      },
    },
    {
@@ -214,7 +243,12 @@ if (assessFn) {
  async function handleToolCall(name, args) {
    switch (name) {
      case 'browse': {
-       const text = await browse(args.url, { mode: args.mode });
+       let timer;
+       const text = await Promise.race([
+         browse(args.url, { mode: args.mode }),
+         new Promise((_, rej) => { timer = setTimeout(() => rej(new Error('browse timed out after 60s')), 60000); }),
+       ]);
+       clearTimeout(timer);
        const limit = args.maxChars ?? MAX_CHARS_DEFAULT;
        if (text.length > limit) {
          const file = saveSnapshot(text);
@@ -222,13 +256,13 @@ async function handleToolCall(name, args) {
        }
        return text;
      }
-     case 'goto': {
+     case 'goto': return withRetry(async () => {
        const page = await getPage();
        try { await page.injectCookies(args.url); } catch {}
        await page.goto(args.url);
        return 'ok';
-     }
-     case 'snapshot': {
+     }, 30000);
+     case 'snapshot': return withRetry(async () => {
        const page = await getPage();
        const text = await page.snapshot();
        const limit = args.maxChars ?? MAX_CHARS_DEFAULT;
@@ -237,51 +271,58 @@ async function handleToolCall(name, args) {
          return `Snapshot (${text.length} chars) saved to ${file}`;
        }
        return text;
-     }
-     case 'click': {
+     }, 30000);
+     case 'click': return withRetry(async () => {
        const page = await getPage();
        await page.click(args.ref);
        return 'ok';
-     }
-     case 'type': {
+     }, 30000);
+     case 'type': return withRetry(async () => {
        const page = await getPage();
        await page.type(args.ref, args.text, { clear: args.clear });
        return 'ok';
-     }
-     case 'press': {
+     }, 30000);
+     case 'press': return withRetry(async () => {
        const page = await getPage();
        await page.press(args.key);
        return 'ok';
-     }
-     case 'scroll': {
+     }, 30000);
+     case 'scroll': return withRetry(async () => {
        const page = await getPage();
-       await page.scroll(args.deltaY);
+       let dy = args.deltaY;
+       if (dy == null && args.direction) {
+         dy = args.direction === 'up' ? -900 : 900;
+       }
+       if (dy == null || typeof dy !== 'number') {
+         throw new Error('scroll requires "direction" ("up"/"down") or numeric "deltaY"');
+       }
+       await page.scroll(dy);
        return 'ok';
-     }
-     case 'back': {
+     }, 30000);
+     case 'back': return withRetry(async () => {
        const page = await getPage();
        await page.goBack();
        return 'ok';
-     }
-     case 'forward': {
+     }, 30000);
+     case 'forward': return withRetry(async () => {
        const page = await getPage();
        await page.goForward();
        return 'ok';
-     }
-     case 'drag': {
+     }, 30000);
+     case 'drag': return withRetry(async () => {
        const page = await getPage();
        await page.drag(args.fromRef, args.toRef);
        return 'ok';
-     }
-     case 'upload': {
+     }, 30000);
+     case 'upload': return withRetry(async () => {
        const page = await getPage();
        await page.upload(args.ref, args.files);
        return 'ok';
-     }
-     case 'pdf': {
+     }, 30000);
+     case 'pdf': return withRetry(async () => {
        const page = await getPage();
        return await page.pdf({ landscape: args.landscape });
-     }
+     }, 30000);
      case 'assess': {
        if (!assessFn) throw new Error('wearehere is not installed. Run: npm install wearehere');
        const releaseSlot = await acquireAssessSlot();
@@ -323,7 +364,7 @@ async function handleToolCall(name, args) {
          } catch (err) {
            clearTimeout(timer);
            await tab.close().catch(() => {});
-           if (isCdpDead(err)) _page = null;
+           if (isTransient(err)) _page = null;
            throw err;
          }
        } finally {
@@ -350,7 +391,7 @@ async function handleMessage(msg) {
      return jsonrpcResponse(id, {
        protocolVersion: '2024-11-05',
        capabilities: { tools: {} },
-       serverInfo: { name: 'barebrowse', version: '0.6.0' },
+       serverInfo: { name: 'barebrowse', version: '0.7.1' },
      });
    }
 
@@ -370,6 +411,7 @@ async function handleMessage(msg) {
        content: [{ type: 'text', text: typeof result === 'string' ? result : JSON.stringify(result) }],
      });
    } catch (err) {
+     if (isTransient(err)) _page = null;
      return jsonrpcResponse(id, {
        content: [{ type: 'text', text: `Error: ${err.message}` }],
        isError: true,
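Reviewer note: `isTransient()` classifies errors purely by substring, so it can help to run the matching logic from the diff against a few messages. The function body below is copied from the hunk above; the first sample message is a hypothetical wording, the other three are strings that appear in this diff.

```javascript
// Copy of the matching logic shown in the diff, exercised against sample messages.
function isTransient(err) {
  const m = err.message || '';
  return m.includes('WebSocket') || m.includes('Target closed') || m.includes('Session closed')
    || m.includes('CDP') || m.includes('Timeout waiting for CDP event') || m.includes('timed out');
}

const samples = [
  'WebSocket is not open',                 // hypothetical message text
  'Timeout waiting for CDP event',         // already matched by the 'CDP' check
  'browse timed out after 60s',            // message thrown by the browse case
  'scroll requires "direction" ("up"/"down") or numeric "deltaY"',
];
const verdicts = samples.map((message) => isTransient(new Error(message)));
console.log(verdicts); // [ true, true, true, false ]
```

Two observations follow from this. The `'Timeout waiting for CDP event'` check is redundant, because that string already contains `'CDP'`. And since the `browse` timeout message contains `'timed out'`, the new `if (isTransient(err)) _page = null;` line in the `handleMessage` catch would, assuming that catch wraps every tool call, also reset the shared session page after a stateless `browse` timeout.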
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "barebrowse",
-   "version": "0.6.1",
+   "version": "0.7.1",
    "description": "Authenticated web browsing for autonomous agents via CDP. URL in, pruned ARIA snapshot out.",
    "type": "module",
    "main": "src/index.js",
  "main": "src/index.js",
package/src/bareagent.js CHANGED
@@ -116,15 +116,20 @@ export function createBrowseTools(opts = {}) {
    },
    {
      name: 'scroll',
-     description: 'Scroll the page. Returns the updated snapshot.',
+     description: 'Scroll the page up or down. Pass direction ("up"/"down") or a numeric deltaY. Returns the updated snapshot.',
      parameters: {
        type: 'object',
        properties: {
-         deltaY: { type: 'number', description: 'Pixels to scroll (positive=down, negative=up)' },
+         direction: { type: 'string', enum: ['up', 'down'], description: 'Scroll direction "up" or "down" (scrolls ~3 screen-heights)' },
+         deltaY: { type: 'number', description: 'Pixels to scroll (positive=down, negative=up). Overrides direction if both given.' },
        },
-       required: ['deltaY'],
      },
-     execute: async ({ deltaY }) => actionAndSnapshot((page) => page.scroll(deltaY)),
+     execute: async ({ direction, deltaY }) => {
+       let dy = deltaY;
+       if (dy == null && direction) dy = direction === 'up' ? -900 : 900;
+       if (dy == null || typeof dy !== 'number') throw new Error('scroll requires "direction" or numeric "deltaY"');
+       return actionAndSnapshot((page) => page.scroll(dy));
+     },
    },
    {
      name: 'select',
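Reviewer note: the same argument handling now lives in both `mcp-server.js` and `src/bareagent.js`. A minimal sketch of that logic as one function, under the hypothetical names `resolveDeltaY` and `SCROLL_STEP` (neither exists in the package), makes the precedence rules easy to check in isolation.

```javascript
// Hypothetical helper mirroring the argument handling in both scroll handlers.
const SCROLL_STEP = 900; // pixels per direction-based scroll, per the diff

function resolveDeltaY({ direction, deltaY } = {}) {
  let dy = deltaY;
  // direction is only consulted when deltaY is absent (null or undefined)
  if (dy == null && direction) dy = direction === 'up' ? -SCROLL_STEP : SCROLL_STEP;
  if (dy == null || typeof dy !== 'number') {
    throw new Error('scroll requires "direction" ("up"/"down") or numeric "deltaY"');
  }
  return dy;
}

console.log(resolveDeltaY({ direction: 'down' }));             // 900
console.log(resolveDeltaY({ direction: 'up' }));               // -900
console.log(resolveDeltaY({ direction: 'up', deltaY: 120 }));  // 120 (deltaY wins)
```

One consequence of the ternary is that any truthy `direction` other than `'up'` resolves to `+900`; the `enum` in the tool schema is the only thing that rules out values such as `'left'`, so this relies on the caller validating arguments against the schema.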
package/src/interact.js CHANGED
@@ -46,13 +46,27 @@ async function getCenter(session, backendNodeId) {
   * @param {number} backendNodeId - Backend DOM node ID
   */
  export async function click(session, backendNodeId) {
-   const { x, y } = await getCenter(session, backendNodeId);
-   await session.send('Input.dispatchMouseEvent', {
-     type: 'mousePressed', x, y, button: 'left', clickCount: 1,
-   });
-   await session.send('Input.dispatchMouseEvent', {
-     type: 'mouseReleased', x, y, button: 'left', clickCount: 1,
-   });
+   try {
+     const { x, y } = await getCenter(session, backendNodeId);
+     await session.send('Input.dispatchMouseEvent', {
+       type: 'mousePressed', x, y, button: 'left', clickCount: 1,
+     });
+     await session.send('Input.dispatchMouseEvent', {
+       type: 'mouseReleased', x, y, button: 'left', clickCount: 1,
+     });
+   } catch (err) {
+     // Element has no layout (display:none, zero-size, detached) — fall back to JS click
+     if (err.message && err.message.includes('layout object')) {
+       const { nodeId } = await session.send('DOM.requestNode', { backendNodeId });
+       const { object } = await session.send('DOM.resolveNode', { nodeId });
+       await session.send('Runtime.callFunctionOn', {
+         objectId: object.objectId,
+         functionDeclaration: 'function() { this.click(); }',
+       });
+     } else {
+       throw err;
+     }
+   }
  }
 
  /**
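Reviewer note: the fallback path can be exercised without a browser by passing in a fake session. The sketch below uses hypothetical names (`clickWithFallback`, `makeFakeSession`, `noLayout`) and is not the package's code. It also makes one assumption worth verifying against the Chrome DevTools Protocol reference: as far as I can tell, `DOM.resolveNode` accepts a `backendNodeId` directly, whereas `DOM.requestNode` is documented as taking an `objectId`, so the sketch skips the `DOM.requestNode` step that the diff uses.

```javascript
// Sketch of the fallback path against a fake session; names are hypothetical.
// Assumption: DOM.resolveNode accepts backendNodeId directly (my reading of
// the CDP DOM domain docs), so this sketch skips the DOM.requestNode step.
async function clickWithFallback(session, backendNodeId, mouseClick) {
  try {
    await mouseClick(session, backendNodeId);
    return 'mouse';
  } catch (err) {
    // Only the "no layout object" failure triggers the JS fallback.
    if (!(err.message && err.message.includes('layout object'))) throw err;
    const { object } = await session.send('DOM.resolveNode', { backendNodeId });
    await session.send('Runtime.callFunctionOn', {
      objectId: object.objectId,
      functionDeclaration: 'function() { this.click(); }',
    });
    return 'js';
  }
}

// Fake session that records CDP method names instead of talking to a browser.
function makeFakeSession() {
  const calls = [];
  return {
    calls,
    async send(method) {
      calls.push(method);
      if (method === 'DOM.resolveNode') return { object: { objectId: 'obj-1' } };
      return {};
    },
  };
}

// Stand-in for the mouse path failing on an element without layout.
const noLayout = async () => { throw new Error('Node does not have a layout object'); };
```

Calling `clickWithFallback(makeFakeSession(), 42, noLayout)` resolves to `'js'` and records `DOM.resolveNode` followed by `Runtime.callFunctionOn`; any other error from the mouse path is rethrown unchanged, matching the "other click errors still throw normally" line in the changelog.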
@@ -1,68 +0,0 @@
- # Design: Harden assess tool
-
- ## Architecture
-
- ### Current flow (broken)
- ```
- assess(url) → connect({ mode: 'hybrid' }) ← NEW browser, no cookies
-             → assessFn(page, url)
-             → page.close() ← kills browser
- ```
-
- ### New flow
- ```
- assess(url) → getPage() ← reuse session browser
-             → page.createTab() ← new tab in same browser
-             → tab.injectCookies(url) ← cookie injection
-             → assessFn(tab, url) ← assess uses tab
-             → tab.close() ← close tab only
-             timeout guard wraps entire flow
-             retry wraps entire flow
- ```
-
- ## Key design decisions
-
- ### Why a new tab, not the session page?
- wearehere's `assess()` calls:
- - `session.send('Page.addScriptToEvaluateOnNewDocument', ...)` — injects fingerprint detection scripts
- - `networkSession.on('Network.requestWillBeSent', ...)` — monitors all network traffic
-
- These would pollute the session page. A separate tab has its own CDP session with isolated Page/Network domains.
-
- ### createTab() page-like interface
- wearehere expects a page object with: `goto()`, `cdp` (raw session), `waitForNetworkIdle()`. createTab() returns exactly this interface:
-
- ```javascript
- {
-   goto(url, timeout) // navigate tab
-   cdp // raw CDP session for this tab
-   waitForNetworkIdle(opts) // reuses existing waitForNetworkIdle()
-   injectCookies(url) // cookie injection for this tab
-   close() // close tab, NOT the browser
- }
- ```
-
- ### Retry strategy
- ```
- attempt 1: assess with 45s timeout
-   fail → wait 2s
- attempt 2: assess with 45s timeout (if browser crashed, reset _page first)
-   fail → return { error: "assessment_failed", ... }
- ```
-
- ### Timeout implementation
- ```javascript
- const result = await Promise.race([
-   doAssess(page, url, opts),
-   new Promise((_, reject) => setTimeout(() => reject(new Error('timeout')), 45000))
- ]);
- ```
- On timeout, the tab is closed in a finally block.
-
- ## Files changed
- | File | Change |
- |------|--------|
- | `src/index.js` | Add `createTab()` and tab's `close()` to connect() return object |
- | `mcp-server.js` | Rewrite assess case: use getPage().createTab(), add retry + timeout |
- | `README.md` | Update assess description, add self-healing mention |
- | `barebrowse.context.md` | Update assess section, document createTab() |
@@ -1,71 +0,0 @@
- # Plan: Harden assess tool — session reuse + self-healing
-
- ## Plan ID
- `harden-assess`
-
- ## Summary
- Make the assess tool reuse the MCP session's browser instance (cookies, headed fallback) instead of spawning throwaway browsers, add retry logic for transient failures, and add a timeout guard so no single assessment can hang forever.
-
- ## Problem Statement
- The `assess` tool in `mcp-server.js` calls `connect({ mode: 'hybrid' })` directly — creating a fresh headless browser per call with no cookies, no session state, and no headed fallback. Every other MCP tool uses the `getPage()` singleton. This causes:
-
- 1. **No cookies** — assess browses as a stranger, getting blocked by consent walls and bot detection that cookies would bypass
- 2. **No headed fallback reuse** — if the singleton already fell back to headed mode, assess still starts fresh headless and hits the same blocks
- 3. **No retry** — any failure (browser crash, navigation timeout, CDP disconnect) kills the assessment with no recovery
- 4. **No timeout guard** — if `wearehere`'s `assess()` hangs (e.g. network idle never resolves), the MCP call blocks indefinitely
-
- However, assess can't simply use the shared `_page` singleton directly because `wearehere` injects init scripts (`addScriptToEvaluateOnNewDocument`) and network listeners that would pollute the session page for subsequent calls. The solution is to create a **second page tab** within the same browser instance.
-
- ## Proposed Solution
- 1. **Session-aware page creation** — Instead of `connect()`, assess opens a new CDP tab within the existing browser. This shares the browser process (same cookies, same headed/headless state) but isolates assess's script injections to its own tab.
- 2. **Retry with backoff** — Wrap the assess call in a retry loop (max 2 attempts, 2s backoff). On browser crash, reconnect the singleton.
- 3. **Timeout guard** — Wrap each assess call in `Promise.race` with a 45s hard deadline. If exceeded, return an error result (not hang).
-
- ## Benefits
- - Assess gets cookies and headed fallback for free — no separate browser instance
- - Failed assessments auto-retry instead of dying
- - Hanging assessments time out gracefully instead of blocking the MCP server forever
- - Eliminates the 10+ second cold-start per assessment (browser launch)
-
- ## Scope
- ### In Scope
- - Modify `mcp-server.js` assess handler to create tabs within existing browser
- - Add `createTab()` / `closeTab()` helper to `src/index.js` connect() page handle
- - Add retry wrapper in mcp-server.js
- - Add timeout guard in mcp-server.js
- - Update docs: README.md, barebrowse.context.md, CLAUDE.md
-
- ### Out of Scope
- - Changing wearehere's internal logic
- - Adding retry/self-healing to other MCP tools (future work)
- - Batch/queue mode for multiple assessments
- - Changing the assess tool's MCP interface (same inputs/outputs)
-
- ## Dependencies
- - `wearehere` package (assess function signature unchanged)
- - `src/index.js` connect() API (adding createTab/closeTab methods)
- - `src/cdp.js` (Target.createTarget / closeTarget already available)
-
- ## Implementation Strategy
- Phase 1: Add tab management to connect() page handle (createTab, closeTab)
- Phase 2: Rewrite assess handler to use session tab + retry + timeout
- Phase 3: Update documentation
-
- ## Risks and Mitigations
- | Risk | Impact | Mitigation |
- |------|--------|------------|
- | Init script injection leaks across tabs | Pollutes session page | Each tab gets its own Page domain; addScriptToEvaluateOnNewDocument is per-target |
- | Browser crash during assess kills session too | Session page lost | getPage() already handles reconnection lazily (set _page = null, next call recreates) |
- | wearehere expects full page handle, not raw CDP session | API mismatch | createTab() returns a page-like object with goto, cdp, waitForNetworkIdle — same interface |
-
- ## Success Criteria
- - [ ] `assess` reuses the session browser (no separate `connect()` call)
- - [ ] `assess` inherits cookies from the session
- - [ ] `assess` works when session is in headed mode (hybrid fallback already triggered)
- - [ ] Failed assessment retries once before returning error
- - [ ] Assessment hanging > 45s returns timeout error, doesn't block server
- - [ ] All existing tests pass
- - [ ] Documentation updated
-
- ## Open Questions
- 1. Should createTab() also inject cookies? — **Recommendation**: Yes, call `authenticate()` for the target URL before navigation, same as `goto` does.
@@ -1,38 +0,0 @@
- # PRD: Harden assess tool
-
- ## Overview
- The `assess` MCP tool must reuse the session browser, retry on failure, and time out gracefully.
-
- ## Requirements
-
- ### R1: Session-aware tab creation
- The assess tool MUST create a new browser tab within the existing MCP session browser instead of spawning a separate browser via `connect()`. The tab MUST:
- - Share the same browser process (inheriting headless/headed state)
- - Have access to the browser's cookie jar
- - Isolate its CDP domains (Page, Network, DOM) from the session page
- - Be closed after each assessment completes or fails
-
- ### R2: Cookie injection
- Before navigating, the assess tab MUST inject cookies from the user's browser for the target URL, using the same `authenticate()` mechanism as `goto`.
-
- ### R3: Retry on failure
- If an assessment fails (navigation timeout, CDP error, browser crash), the tool MUST:
- - Retry once after a 2-second delay
- - If the browser crashed, reset the session singleton (`_page = null`) so getPage() reconnects
- - If retry also fails, return a structured error result (not throw)
-
- ### R4: Timeout guard
- Each assessment MUST have a hard timeout of 45 seconds. If exceeded:
- - The tab is force-closed
- - A structured error result is returned: `{ site, url, error: "timeout", scanned_at }`
- - The session page is NOT affected
-
- ### R5: Backwards compatibility
- - The assess tool's MCP interface (inputs/outputs) MUST NOT change
- - Successful assessments return the same JSON format as before
- - The tool appears/disappears based on wearehere availability (unchanged)
-
- ## Non-functional
- - No new dependencies
- - No changes to wearehere package
- - createTab()/closeTab() exposed on connect() page handle for library users too