barebrowse 0.1.0 → 0.2.1
- package/.mcp.json +8 -0
- package/CHANGELOG.md +100 -0
- package/CLAUDE.md +22 -0
- package/README.md +123 -43
- package/barebrowse.context.md +261 -0
- package/cli.js +156 -0
- package/docs/blueprint.md +361 -0
- package/docs/testing.md +202 -0
- package/mcp-server.js +216 -0
- package/package.json +22 -9
- package/src/aria.js +69 -0
- package/src/auth.js +279 -0
- package/src/bareagent.js +161 -0
- package/src/cdp.js +148 -0
- package/src/chromium.js +148 -0
- package/src/consent.js +210 -0
- package/src/index.js +186 -10
- package/src/interact.js +208 -0
- package/src/prune.js +472 -0
- package/src/stealth.js +51 -0
package/.mcp.json
ADDED
package/CHANGELOG.md
ADDED
|
@@ -0,0 +1,100 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
## 0.2.1
|
|
4
|
+
|
|
5
|
+
- README rewritten: no code blocks, full obstacle course table with mode column, two usage paths (MCP vs framework), mcprune credited, measured token savings, context.md as code reference
|
|
6
|
+
- MCP auto-installer: `npx barebrowse install` detects Claude Desktop, Cursor, Claude Code and writes config
|
|
7
|
+
- MCP config uses `npx barebrowse mcp` instead of local file paths (works for npm consumers)
|
|
8
|
+
- CLI help updated with install command
|
|
9
|
+
|
|
10
|
+
## 0.2.0
|
|
11
|
+
|
|
12
|
+
Agent integration release. MCP server, bareagent adapter, and interaction features that make barebrowse usable as a standalone tool or embedded browsing layer.
|
|
13
|
+
|
|
14
|
+
### New: MCP server
|
|
15
|
+
- Raw JSON-RPC 2.0 over stdio, zero SDK dependencies
|
|
16
|
+
- 7 tools: `browse`, `goto`, `snapshot`, `click`, `type`, `press`, `scroll`
|
|
17
|
+
- Singleton session page, lazy-created on first session tool call
|
|
18
|
+
- `npx barebrowse mcp` to start, `npx barebrowse install` to auto-configure
|
|
19
|
+
|
|
20
|
+
### New: MCP auto-installer
|
|
21
|
+
- `npx barebrowse install` detects Claude Desktop, Cursor, and Claude Code
|
|
22
|
+
- Writes MCP config automatically -- no manual JSON editing
|
|
23
|
+
- Reports status for each detected client
|
|
24
|
+
|
|
25
|
+
### New: bareagent tool adapter
|
|
26
|
+
- `import { createBrowseTools } from 'barebrowse/bareagent'`
|
|
27
|
+
- Returns `{ tools, close }` with 9 bareagent-compatible tools
|
|
28
|
+
- Action tools auto-return fresh snapshot after each action (300ms settle)
|
|
29
|
+
- Tools: browse, goto, snapshot, click, type, press, scroll, select, screenshot
|
|
30
|
+
|
|
31
|
+
### New: stealth patches
|
|
32
|
+
- `src/stealth.js` -- anti-detection for headless mode
|
|
33
|
+
- Uses `Page.addScriptToEvaluateOnNewDocument` (runs before page scripts)
|
|
34
|
+
- Patches: `navigator.webdriver`, `navigator.plugins`, `navigator.languages`, `window.chrome`, `Permissions.prototype.query`
|
|
35
|
+
- Auto-applied in headless mode
|
|
36
|
+
|
|
37
|
+
### New: interactions
|
|
38
|
+
- `page.hover(ref)` -- mouse move to element center, triggers hover styles/tooltips
|
|
39
|
+
- `page.select(ref, value)` -- native `<select>` (set value + change event) or custom dropdown (click + find option)
|
|
40
|
+
- `page.screenshot(opts)` -- `Page.captureScreenshot`, returns base64 (png/jpeg/webp)
|
|
41
|
+
|
|
42
|
+
### New: wait strategies
|
|
43
|
+
- `page.waitForNetworkIdle(opts)` -- resolve when no pending requests for N ms (default 500)
|
|
44
|
+
- `page.waitForNavigation()` now SPA-aware -- falls back gracefully when no `loadEventFired` fires
|
|
45
|
+
|
|
46
|
+
### New: hybrid mode
|
|
47
|
+
- `mode: 'hybrid'` in `browse()` -- tries headless, detects challenge pages (Cloudflare, etc.), falls back to headed
|
|
48
|
+
- Challenge detection via ARIA tree heuristic ("Just a moment", "Checking your browser", etc.)
|
|
49
|
+
|
|
50
|
+
### New: CLI
|
|
51
|
+
- `npx barebrowse mcp` -- start MCP server
|
|
52
|
+
- `npx barebrowse install` -- auto-configure MCP clients
|
|
53
|
+
- `npx barebrowse browse <url>` -- one-shot browse, print snapshot to stdout
|
|
54
|
+
|
|
55
|
+
### New: documentation
|
|
56
|
+
- `README.md` -- complete guide: idea, token savings, modes, library vs MCP, bareagent wiring
|
|
57
|
+
- `barebrowse.context.md` -- LLM-consumable integration guide for AI assistants
|
|
58
|
+
- `docs/testing.md` -- test pyramid, all 54 tests documented, CI guidance
|
|
59
|
+
- `docs/blueprint.md` -- updated with full 10-step pipeline, module table, integration sections
|
|
60
|
+
|
|
61
|
+
### Changed
|
|
62
|
+
- `package.json` -- subpath exports (`./bareagent`), `bin` entry, keywords
|
|
63
|
+
- `src/index.js` -- stealth auto-applied in headless via `createPage()`, `type()` param renamed to avoid shadowing
|
|
64
|
+
- `src/interact.js` -- `getCenter()` reused by new `hover()` function
|
|
65
|
+
|
|
66
|
+
### Tests
|
|
67
|
+
- 54 tests passing (was 47 in 0.1.0)
|
|
68
|
+
- All existing tests unchanged and passing
|
|
69
|
+
|
|
70
|
+
---
|
|
71
|
+
|
|
72
|
+
## 0.1.0
|
|
73
|
+
|
|
74
|
+
Initial release. CDP-direct browsing with ARIA snapshots.
|
|
75
|
+
|
|
76
|
+
### Core
|
|
77
|
+
- `browse(url, opts)` -- one-shot: URL in, pruned ARIA snapshot out
|
|
78
|
+
- `connect(opts)` -- session: navigate, interact, observe across pages
|
|
79
|
+
- Three modes: headless (default), headed (connect to running browser), hybrid (planned)
|
|
80
|
+
- Zero required dependencies, vanilla JS, ES modules, Node >= 22
|
|
81
|
+
|
|
82
|
+
### Modules
|
|
83
|
+
- `src/cdp.js` -- WebSocket CDP client with flattened session support
|
|
84
|
+
- `src/chromium.js` -- find/launch any installed Chromium browser
|
|
85
|
+
- `src/aria.js` -- format ARIA tree as YAML-like text
|
|
86
|
+
- `src/auth.js` -- cookie extraction from Firefox (SQLite) and Chromium (AES + keyring)
|
|
87
|
+
- `src/prune.js` -- 9-step ARIA pruning pipeline (47-95% token reduction)
|
|
88
|
+
- `src/interact.js` -- click, type (with clear), press (14 special keys), scroll
|
|
89
|
+
- `src/consent.js` -- auto-dismiss cookie consent dialogs (7 languages, 16+ sites tested)
|
|
90
|
+
|
|
91
|
+
### Features
|
|
92
|
+
- Cookie injection from Firefox/Chromium into headless CDP sessions
|
|
93
|
+
- Permission suppression (notifications, geolocation, camera, mic) via launch flags + CDP
|
|
94
|
+
- Cookie consent auto-dismiss across EN, NL, DE, FR, ES, IT, PT
|
|
95
|
+
- `waitForNavigation()` for post-click page loads
|
|
96
|
+
- Unique temp dirs per headless instance to avoid profile locking
|
|
97
|
+
|
|
98
|
+
### Tests
|
|
99
|
+
- 47 tests across 5 files (unit: prune, auth, cdp; integration: browse, interact)
|
|
100
|
+
- Real-site testing: Google, Wikipedia, GitHub, DuckDuckGo, YouTube, HN, Reddit
|
package/CLAUDE.md
ADDED
|
@@ -0,0 +1,22 @@
|
|
|
1
|
+
## Dev Rules
|
|
2
|
+
|
|
3
|
+
**POC first.** Always validate logic with a ~15min proof-of-concept before building. Cover happy path + common edges. POC works → design properly → build with tests. Never ship the POC.
|
|
4
|
+
|
|
5
|
+
**Build incrementally.** Break work into small independent modules. One piece at a time, each must work on its own before integrating.
|
|
6
|
+
|
|
7
|
+
**Dependency hierarchy — follow strictly:** vanilla language → standard library → external (only when stdlib can't do it in <100 lines). External deps must be maintained, lightweight, and widely adopted. Exception: always use vetted libraries for security-critical code (crypto, auth, sanitization).
|
|
8
|
+
|
|
9
|
+
**Lightweight over complex.** Fewer moving parts, fewer deps, less config. Simple > clever. Readable > elegant.
|
|
10
|
+
|
|
11
|
+
**Open-source only.** No vendor lock-in. Every line of code must have a purpose — no speculative code, no premature abstractions.
|
|
12
|
+
|
|
13
|
+
## Project Specifics
|
|
14
|
+
|
|
15
|
+
- **Language:** Vanilla JavaScript, ES modules, no build step
|
|
16
|
+
- **Runtime:** Node.js >= 22 (built-in WebSocket, sqlite)
|
|
17
|
+
- **Protocol:** CDP (Chrome DevTools Protocol) direct — no Playwright
|
|
18
|
+
- **Browser:** Any installed Chromium-based browser (chromium, chrome, brave, edge)
|
|
19
|
+
- **Key files:** `src/index.js` (API), `src/cdp.js` (CDP client), `src/chromium.js` (browser launch), `src/aria.js` (ARIA formatting)
|
|
20
|
+
- **Docs:** `docs/prd.md` (decisions + rationale), `docs/poc-plan.md` (phases + DoD)
|
|
21
|
+
|
|
22
|
+
For full development and testing standards, see `.claude/memory/AGENT_RULES.md`.
|
package/README.md
CHANGED
|
@@ -1,69 +1,149 @@
|
|
|
1
|
-
|
|
1
|
+
```
|
|
2
|
+
~~~~~~~~~~~~~~~~~~~~
|
|
3
|
+
~~~ .---------. ~~~
|
|
4
|
+
~~~ | · clear | ~~~
|
|
5
|
+
~~~ | · focus | ~~~
|
|
6
|
+
~~~ '---------' ~~~
|
|
7
|
+
~~~~~~~~~~~~~~~~~~~~
|
|
8
|
+
|
|
9
|
+
barebrowse
|
|
10
|
+
```
|
|
11
|
+
|
|
12
|
+
> Your agent browses like you do -- same browser, same logins, same cookies.
|
|
13
|
+
> Prunes pages down to what matters. 40-90% fewer tokens, zero wasted context.
|
|
14
|
+
|
|
15
|
+
---
|
|
2
16
|
|
|
3
|
-
|
|
4
|
-
No Playwright, no bundled browser, no build step.
|
|
17
|
+
## What this is
|
|
5
18
|
|
|
6
|
-
|
|
19
|
+
barebrowse is agentic browsing stripped to the bone. It gives your AI agent eyes and hands on the web -- navigate any page, see what's there, click buttons, fill forms, scroll, and move on. It uses your installed Chromium browser (Chrome, Brave, Edge -- whatever you have), reuses your existing login sessions, and handles all the friction automatically: cookie consent walls, permission prompts, bot detection, GDPR dialogs.
|
|
7
20
|
|
|
8
|
-
|
|
21
|
+
Instead of dumping raw DOM or taking screenshots, barebrowse returns a **pruned ARIA snapshot** -- a compact semantic view of what's on the page and what the agent can interact with. Buttons, links, inputs, headings -- labeled with `[ref=N]` markers the agent uses to act. The pruning pipeline is ported from [mcprune](https://github.com/nickvdyck/mcprune) and cuts 40-90% of tokens compared to raw page output. Every token your agent reads is meaningful.
|
|
9
22
|
|
|
10
|
-
|
|
11
|
-
import { browse, connect } from 'barebrowse';
|
|
23
|
+
No Playwright. No bundled browser. No 200MB download. No broken dependencies. Zero deps. Just CDP over a WebSocket to whatever Chromium you already have.
|
|
12
24
|
|
|
13
|
-
|
|
14
|
-
const snapshot = await browse('https://any-page.com');
|
|
25
|
+
## Install
|
|
15
26
|
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
await page.goto('https://any-page.com');
|
|
19
|
-
console.log(await page.snapshot());
|
|
20
|
-
await page.click('8'); // ref from snapshot
|
|
21
|
-
await page.type('3', 'hello');
|
|
22
|
-
await page.scroll(500);
|
|
23
|
-
await page.close();
|
|
27
|
+
```
|
|
28
|
+
npm install barebrowse
|
|
24
29
|
```
|
|
25
30
|
|
|
26
|
-
|
|
31
|
+
Requires Node.js >= 22 and any installed Chromium-based browser.
|
|
27
32
|
|
|
28
|
-
|
|
29
|
-
- **ARIA snapshots** — semantic, token-efficient output for LLMs
|
|
30
|
-
- **Built-in pruning** — 47-95% token reduction via 9-step pipeline
|
|
31
|
-
- **Cookie extraction** — authenticated browsing from your existing sessions (Chromium + Firefox)
|
|
32
|
-
- **Interactions** — click, type, scroll via CDP Input domain
|
|
33
|
-
- **Three modes** — headless (default), headed (connect to running browser), hybrid (planned)
|
|
33
|
+
## Two ways to use it
|
|
34
34
|
|
|
35
|
-
|
|
35
|
+
### 1. MCP server -- for Claude Desktop, Cursor, Claude Code
|
|
36
36
|
|
|
37
37
|
```
|
|
38
|
-
|
|
39
|
-
|
|
40
|
-
→ auth.js (extract cookies → inject via CDP)
|
|
41
|
-
→ Page.navigate
|
|
42
|
-
→ aria.js (Accessibility.getFullAXTree → nested tree)
|
|
43
|
-
→ prune.js (9-step role-based pruning)
|
|
44
|
-
→ interact.js (click/type/scroll via Input domain)
|
|
45
|
-
→ agent-ready snapshot
|
|
38
|
+
npm install -g barebrowse
|
|
39
|
+
npx barebrowse install
|
|
46
40
|
```
|
|
47
41
|
|
|
48
|
-
|
|
42
|
+
That's it. `install` auto-detects your MCP client and writes the config. No manual JSON editing. Restart your client and you have 7 browsing tools: `browse`, `goto`, `snapshot`, `click`, `type`, `press`, `scroll`.
|
|
49
43
|
|
|
50
|
-
|
|
44
|
+
### 2. Framework -- for agentic automation
|
|
45
|
+
|
|
46
|
+
Import barebrowse in your agent code. One-shot reads, interactive sessions, full observe-think-act loops. Works with any LLM orchestration library. Ships with a ready-made adapter for [bareagent](https://www.npmjs.com/package/bare-agent) (9 tools, auto-snapshot after every action).
|
|
47
|
+
|
|
48
|
+
For code examples, API reference, and wiring instructions, see **[barebrowse.context.md](barebrowse.context.md)** -- the full integration guide.
|
|
49
|
+
|
|
50
|
+
## Three modes
|
|
51
|
+
|
|
52
|
+
| Mode | What happens | Best for |
|
|
53
|
+
|------|-------------|----------|
|
|
54
|
+
| **Headless** (default) | Launches a fresh Chromium, no UI | Fast automation, scraping, reading pages |
|
|
55
|
+
| **Headed** | Connects to your running browser on CDP port | Bot-detected sites, visual debugging, CAPTCHAs |
|
|
56
|
+
| **Hybrid** | Tries headless first, falls back to headed if blocked | General-purpose agent browsing |
|
|
57
|
+
|
|
58
|
+
## What it handles automatically
|
|
59
|
+
|
|
60
|
+
This is the obstacle course your agent doesn't have to think about:
|
|
61
|
+
|
|
62
|
+
| Obstacle | How it's handled | Mode |
|
|
63
|
+
|----------|-----------------|------|
|
|
64
|
+
| **Cookie consent walls** (GDPR) | ARIA tree scan + jsClick accept button, 7 languages (EN, NL, DE, FR, ES, IT, PT) | Both |
|
|
65
|
+
| **Consent in dialog role** | Detect `dialog`/`alertdialog` with consent hints, click accept inside | Both |
|
|
66
|
+
| **Consent outside dialog** (BBC SourcePoint) | Fallback global button scan when dialog has no accept button | Both |
|
|
67
|
+
| **Consent behind iframe overlay** | JS click via DOM.resolveNode bypasses z-index/overlay issues | Both |
|
|
68
|
+
| **Permission prompts** (location, camera, mic) | Launch flags + CDP Browser.setPermission auto-deny | Both |
|
|
69
|
+
| **Media autoplay blocked** | Autoplay policy flag on launch | Both |
|
|
70
|
+
| **Login walls** | Cookie extraction from Firefox/Chromium, injected via CDP | Both |
|
|
71
|
+
| **Pre-filled form inputs** | Select-all + delete before typing | Both |
|
|
72
|
+
| **Off-screen elements** | Scrolled into view before every click | Both |
|
|
73
|
+
| **Form submission** | Enter key triggers onsubmit | Both |
|
|
74
|
+
| **Tab between fields** | Tab key moves focus correctly | Both |
|
|
75
|
+
| **SPA navigation** (YouTube, GitHub) | SPA-aware wait: frameNavigated + loadEventFired | Both |
|
|
76
|
+
| **Bot detection** (Google, Reddit) | Stealth patches (headless) + headed fallback with real cookies | Both |
|
|
77
|
+
| **navigator.webdriver leak** | Patched before page scripts run: webdriver, plugins, languages, chrome object | Headless |
|
|
78
|
+
| **Profile locking** | Unique temp dir per headless instance | Headless |
|
|
79
|
+
| **ARIA noise** | 9-step pruning pipeline (ported from mcprune): wrapper collapse, noise removal, landmark promotion | Both |
|
|
51
80
|
|
|
52
|
-
|
|
53
|
-
- Any installed Chromium-based browser (Chrome, Chromium, Brave, Edge, Vivaldi, Arc, Opera)
|
|
81
|
+
## What the agent sees
|
|
54
82
|
|
|
55
|
-
|
|
83
|
+
Raw ARIA output from a page is noisy -- decorative wrappers, hidden elements, structural junk. The pruning pipeline (ported from [mcprune](https://github.com/nickvdyck/mcprune)) strips it down to what matters.
|
|
84
|
+
|
|
85
|
+
| Page | Raw | Pruned | Reduction |
|
|
86
|
+
|------|-----|--------|-----------|
|
|
87
|
+
| example.com | 377 chars | 45 chars | 88% |
|
|
88
|
+
| Hacker News | 51,726 chars | 27,197 chars | 47% |
|
|
89
|
+
| Wikipedia (article) | 109,479 chars | 40,566 chars | 63% |
|
|
90
|
+
| DuckDuckGo | 42,254 chars | 5,407 chars | 87% |
|
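The Reduction column in the table above can be reproduced from the Raw and Pruned character counts. The sketch below is only an arithmetic check: `reductionPercent` is a hypothetical helper, not part of barebrowse, and character counts are a proxy for tokens rather than an exact token count.

```javascript
// Hypothetical helper (not part of barebrowse): percentage of characters
// removed by pruning, rounded to the nearest whole percent.
function reductionPercent(rawChars, prunedChars) {
  if (rawChars <= 0) throw new RangeError('rawChars must be positive');
  return Math.round((1 - prunedChars / rawChars) * 100);
}

// Raw and pruned sizes copied from the table above.
const rows = [
  ['example.com', 377, 45],
  ['Hacker News', 51726, 27197],
  ['Wikipedia (article)', 109479, 40566],
  ['DuckDuckGo', 42254, 5407],
];

// Pair each page with its computed reduction.
const reductions = rows.map(([page, raw, pruned]) => [page, reductionPercent(raw, pruned)]);
```

Rounding to whole percentages gives 88, 47, 63, and 87 for the four rows, matching the published column.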
|
91
|
+
|
|
92
|
+
Two pruning modes: **act** (default) keeps interactive elements and visible labels -- for clicking, typing, navigating. **read** keeps all text content -- for reading articles and extracting information.
|
|
93
|
+
|
|
94
|
+
## Actions
|
|
95
|
+
|
|
96
|
+
Everything the agent can do through barebrowse:
|
|
97
|
+
|
|
98
|
+
| Action | What it does |
|
|
99
|
+
|--------|-------------|
|
|
100
|
+
| **Navigate** | Load a URL, wait for page load, auto-dismiss consent |
|
|
101
|
+
| **Snapshot** | Pruned ARIA tree with `[ref=N]` markers (40-90% token reduction) |
|
|
102
|
+
| **Click** | Scroll into view + mouse click at element center |
|
|
103
|
+
| **Type** | Focus + insert text, with option to clear existing content first |
|
|
104
|
+
| **Press** | Special keys: Enter, Tab, Escape, Backspace, Delete, arrows, Space |
|
|
105
|
+
| **Scroll** | Mouse wheel up or down |
|
|
106
|
+
| **Hover** | Move mouse to element center (triggers tooltips, hover states) |
|
|
107
|
+
| **Select** | Set dropdown value (native select or custom dropdown) |
|
|
108
|
+
| **Screenshot** | Page capture as base64 PNG/JPEG/WebP |
|
|
109
|
+
| **Wait for navigation** | SPA-aware: works for full page loads and pushState |
|
|
110
|
+
| **Wait for network idle** | Resolve when no pending requests for 500ms |
|
|
111
|
+
| **Inject cookies** | Extract from Firefox/Chromium and inject via CDP |
|
|
112
|
+
| **Raw CDP** | Escape hatch for any Chrome DevTools Protocol command |
|
|
113
|
+
|
|
114
|
+
## Tested against
|
|
115
|
+
|
|
116
|
+
16+ sites across 8 countries, all consent dialogs dismissed, all interactions working:
|
|
117
|
+
|
|
118
|
+
Google, YouTube, BBC, Wikipedia, GitHub, DuckDuckGo, Hacker News, Amazon DE, The Guardian, Spiegel, Le Monde, El Pais, Corriere, NOS, Bild, Nu.nl, Booking, NYT, Stack Overflow, CNN, Reddit
|
|
119
|
+
|
|
120
|
+
## Context file
|
|
121
|
+
|
|
122
|
+
**[barebrowse.context.md](barebrowse.context.md)** is the full integration guide. Feed it to an AI assistant or read it yourself -- it covers the complete API, snapshot format, interaction loop, auth options, bareagent wiring, MCP setup, and gotchas. Everything you need to wire barebrowse into a project.
|
|
123
|
+
|
|
124
|
+
## How it works
|
|
56
125
|
|
|
57
126
|
```
|
|
58
|
-
|
|
59
|
-
|
|
127
|
+
URL -> find/launch browser (chromium.js)
|
|
128
|
+
-> WebSocket CDP connection (cdp.js)
|
|
129
|
+
-> stealth patches before page scripts (stealth.js, headless only)
|
|
130
|
+
-> suppress all permission prompts (Browser.setPermission)
|
|
131
|
+
-> extract + inject cookies from your browser (auth.js)
|
|
132
|
+
-> navigate to URL, wait for load
|
|
133
|
+
-> detect + dismiss cookie consent dialogs (consent.js)
|
|
134
|
+
-> get full ARIA accessibility tree (aria.js)
|
|
135
|
+
-> 9-step pruning pipeline from mcprune (prune.js)
|
|
136
|
+
-> dispatch real input events: click/type/scroll (interact.js)
|
|
137
|
+
-> agent-ready snapshot with [ref=N] markers
|
|
60
138
|
```
|
|
61
139
|
|
|
62
|
-
|
|
140
|
+
11 modules, 2,400 lines, zero dependencies.
|
|
63
141
|
|
|
64
|
-
##
|
|
142
|
+
## Requirements
|
|
65
143
|
|
|
66
|
-
|
|
144
|
+
- Node.js >= 22 (built-in WebSocket, built-in SQLite)
|
|
145
|
+
- Any Chromium-based browser installed (Chrome, Chromium, Brave, Edge, Vivaldi)
|
|
146
|
+
- Linux tested (Fedora/KDE). macOS/Windows cookie paths exist but untested.
|
|
67
147
|
|
|
68
148
|
## License
|
|
69
149
|
|
package/barebrowse.context.md
ADDED
|
@@ -0,0 +1,261 @@
|
|
|
1
|
+
# barebrowse -- Integration Guide
|
|
2
|
+
|
|
3
|
+
> For AI assistants and developers wiring barebrowse into a project.
|
|
4
|
+
> v0.2.1 | Node.js >= 22 | 0 required deps | MIT
|
|
5
|
+
|
|
6
|
+
## What this is
|
|
7
|
+
|
|
8
|
+
barebrowse is a CDP-direct browsing library for autonomous agents (~1,800 lines). URL in, pruned ARIA snapshot out. It launches the user's installed Chromium browser, navigates, handles consent/permissions/cookies, and returns a token-efficient ARIA tree with `[ref=N]` markers for interaction.
|
|
9
|
+
|
|
10
|
+
No Playwright. No bundled browser. No build step. Vanilla JS, ES modules.
|
|
11
|
+
|
|
12
|
+
```
|
|
13
|
+
npm install barebrowse
|
|
14
|
+
```
|
|
15
|
+
|
|
16
|
+
Two entry points:
|
|
17
|
+
- `import { browse } from 'barebrowse'` -- one-shot: URL in, snapshot out
|
|
18
|
+
- `import { connect } from 'barebrowse'` -- session: navigate, interact, observe
|
|
19
|
+
|
|
20
|
+
## Which mode do I need?
|
|
21
|
+
|
|
22
|
+
| Mode | What it does | When to use |
|
|
23
|
+
|---|---|---|
|
|
24
|
+
| `headless` (default) | Launches a fresh Chromium, no UI | Scraping, reading, fast automation |
|
|
25
|
+
| `headed` | Connects to user's running browser on CDP port | Bot-detected sites, debugging, visual tasks |
|
|
26
|
+
| `hybrid` | Tries headless first, falls back to headed if blocked | General-purpose agent browsing |
|
|
27
|
+
|
|
28
|
+
Headed mode requires the browser to be launched with `--remote-debugging-port=9222`.
|
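The changelog describes hybrid mode's challenge detection as an ARIA-tree heuristic that looks for phrases such as "Just a moment" and "Checking your browser". A minimal sketch of that idea follows. It is not the library's actual code: `looksLikeChallenge`, `nextMode`, and the phrase list are hypothetical names chosen for illustration.

```javascript
// Phrase list taken from the changelog examples; real challenge pages may
// use other wording, so treat this list as an assumption.
const CHALLENGE_PHRASES = ['just a moment', 'checking your browser'];

// Returns true when the snapshot text contains any known challenge phrase.
function looksLikeChallenge(snapshot) {
  const text = snapshot.toLowerCase();
  return CHALLENGE_PHRASES.some((phrase) => text.includes(phrase));
}

// Hypothetical policy helper: a headless attempt that hits a challenge page
// should be retried in headed mode; any other case keeps the current mode.
function nextMode(currentMode, snapshot) {
  if (currentMode === 'headless' && looksLikeChallenge(snapshot)) return 'headed';
  return currentMode;
}
```

A plain substring match is cheap but can misfire on ordinary pages that happen to contain these phrases, which is one reason to treat the result as a hint rather than proof of blocking.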
|
29
|
+
|
|
30
|
+
## Minimal usage: one-shot browse
|
|
31
|
+
|
|
32
|
+
```javascript
|
|
33
|
+
import { browse } from 'barebrowse';
|
|
34
|
+
|
|
35
|
+
// Defaults: headless, cookies injected, pruned, consent dismissed
|
|
36
|
+
const snapshot = await browse('https://example.com');
|
|
37
|
+
|
|
38
|
+
// All options
|
|
39
|
+
const fullSnapshot = await browse('https://example.com', {
|
|
40
|
+
mode: 'headless', // 'headless' | 'headed' | 'hybrid'
|
|
41
|
+
cookies: true, // inject user's browser cookies
|
|
42
|
+
browser: 'firefox', // cookie source: 'firefox' | 'chromium' (auto-detected)
|
|
43
|
+
prune: true, // apply ARIA pruning (47-95% token reduction)
|
|
44
|
+
pruneMode: 'act', // 'act' (interactive elements) | 'read' (all content)
|
|
45
|
+
consent: true, // auto-dismiss cookie consent dialogs
|
|
46
|
+
timeout: 30000, // navigation timeout in ms
|
|
47
|
+
port: 9222, // CDP port for headed/hybrid mode
|
|
48
|
+
});
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
## connect() API
|
|
52
|
+
|
|
53
|
+
`connect(opts)` returns a page handle for interactive sessions. Same opts as `browse()` for mode/port.
|
|
54
|
+
|
|
55
|
+
| Method | Args | Returns | Notes |
|
|
56
|
+
|---|---|---|---|
|
|
57
|
+
| `goto(url, timeout?)` | url: string, timeout: number (default 30000) | void | Navigate + wait for load + dismiss consent |
|
|
58
|
+
| `snapshot(pruneOpts?)` | false or { mode: 'act'\|'read' } | string | ARIA tree with `[ref=N]` markers. Pass `false` for raw. |
|
|
59
|
+
| `click(ref)` | ref: string | void | Scroll into view + mouse press+release at center |
|
|
60
|
+
| `type(ref, text, opts?)` | ref: string, text: string, opts: { clear?, keyEvents? } | void | Focus + insert text. `clear: true` replaces existing. |
|
|
61
|
+
| `press(key)` | key: string | void | Special key: Enter, Tab, Escape, Backspace, Delete, arrows, Home, End, PageUp, PageDown, Space |
|
|
62
|
+
| `scroll(deltaY)` | deltaY: number | void | Mouse wheel. Positive = down, negative = up. |
|
|
63
|
+
| `hover(ref)` | ref: string | void | Move mouse to element center |
|
|
64
|
+
| `select(ref, value)` | ref: string, value: string | void | Set `<select>` value or click custom dropdown option |
|
|
65
|
+
| `screenshot(opts?)` | { format?: 'png'\|'jpeg'\|'webp', quality?: number } | string (base64) | Page screenshot |
|
|
66
|
+
| `waitForNavigation(timeout?)` | timeout: number (default 30000) | void | Wait for page load or frame navigation |
|
|
67
|
+
| `waitForNetworkIdle(opts?)` | { timeout?: number, idle?: number } | void | Wait until no pending requests for `idle` ms (default 500) |
|
|
68
|
+
| `injectCookies(url, opts?)` | url: string, { browser?: string } | void | Extract cookies from user's browser and inject via CDP |
|
|
69
|
+
| `cdp` | -- | object | Raw CDP session for escape hatch: `page.cdp.send(method, params)` |
|
|
70
|
+
| `close()` | -- | void | Close page, disconnect CDP, kill browser (if headless) |
|
|
71
|
+
|
|
72
|
+
## Snapshot format
|
|
73
|
+
|
|
74
|
+
The snapshot is a YAML-like ARIA tree. Each line is one node:
|
|
75
|
+
|
|
76
|
+
```
|
|
77
|
+
- WebArea "Example Domain" [ref=1]
|
|
78
|
+
- heading "Example Domain" [level=1] [ref=3]
|
|
79
|
+
- paragraph [ref=5]
|
|
80
|
+
- StaticText "This domain is for use in illustrative examples." [ref=6]
|
|
81
|
+
- link "More information..." [ref=8]
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
Key rules:
|
|
85
|
+
- `[ref=N]` markers appear on interactive and named elements
|
|
86
|
+
- Refs are **ephemeral** -- they change on every `snapshot()` call
|
|
87
|
+
- Always call `snapshot()` to get fresh refs before interacting
|
|
88
|
+
- `click(ref)` / `type(ref, text)` / `hover(ref)` / `select(ref, value)` use these ref strings
|
|
89
|
+
- Pruning removes noise (~47-95% token reduction) while keeping all interactive elements
|
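Because actions take ref strings, agent code often needs to pull refs out of the snapshot text. The sketch below parses the line format shown above. `parseSnapshot` and `findRef` are hypothetical helpers, not part of the barebrowse API, and the regular expressions assume the `- role "name" [ref=N]` layout from the example.

```javascript
// Hypothetical helper: turn snapshot text into { role, name, ref } records.
// Lines without a [ref=N] marker are skipped.
function parseSnapshot(snapshot) {
  const nodes = [];
  for (const line of snapshot.split('\n')) {
    const ref = line.match(/\[ref=(\d+)\]/);
    if (!ref) continue;
    // Role is the first word after the dash; the quoted name is optional.
    const head = line.match(/^\s*-\s+(\S+)(?:\s+"([^"]*)")?/);
    if (!head) continue;
    nodes.push({ role: head[1], name: head[2] ?? '', ref: ref[1] });
  }
  return nodes;
}

// Hypothetical helper: look up the ref for an exact role and name match.
// Returns null when nothing matches.
function findRef(snapshot, role, name) {
  const hit = parseSnapshot(snapshot).find((n) => n.role === role && n.name === name);
  return hit ? hit.ref : null;
}
```

Refs are kept as strings because `click(ref)` and the other actions take string refs. Since refs are ephemeral, a lookup like this should run against the most recent snapshot only.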
|
90
|
+
|
|
91
|
+
## Interaction loop: observe, think, act
|
|
92
|
+
|
|
93
|
+
```javascript
|
|
94
|
+
import { connect } from 'barebrowse';
|
|
95
|
+
|
|
96
|
+
const page = await connect();
|
|
97
|
+
await page.goto('https://example.com');
|
|
98
|
+
|
|
99
|
+
// 1. Observe
|
|
100
|
+
let snap = await page.snapshot();
|
|
101
|
+
|
|
102
|
+
// 2. Think (LLM decides what to do based on snapshot)
|
|
103
|
+
// 3. Act
|
|
104
|
+
await page.click('8'); // click the "More information..." link
|
|
105
|
+
await page.waitForNavigation();
|
|
106
|
+
|
|
107
|
+
// 4. Observe again (refs are now different)
|
|
108
|
+
snap = await page.snapshot();
|
|
109
|
+
|
|
110
|
+
// ... repeat until goal is achieved
|
|
111
|
+
|
|
112
|
+
await page.close();
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
## Auth / cookie options
|
|
116
|
+
|
|
117
|
+
barebrowse can inject cookies from the user's real browser sessions, bypassing login walls.
|
|
118
|
+
|
|
119
|
+
| Source | How | Notes |
|
|
120
|
+
|---|---|---|
|
|
121
|
+
| Firefox (default) | SQLite `cookies.sqlite`, plaintext | Works on Linux. Auto-detected default profile. |
|
|
122
|
+
| Chromium | SQLite `Cookies` + AES decryption via keyring | Linux: KWallet or GNOME Keyring. Profile must not be locked. |
|
|
123
|
+
| Manual | `page.injectCookies(url, { browser: 'firefox' })` | Explicit injection on connect() sessions |
|
|
124
|
+
| Disabled | `{ cookies: false }` | No cookie injection |
|
|
125
|
+
|
|
126
|
+
`browse()` auto-injects cookies before navigation. `connect()` exposes `injectCookies()` for manual control.
|
|
127
|
+
|
|
128
|
+
## Obstacle course -- what barebrowse handles automatically
|
|
129
|
+
|
|
130
|
+
| Obstacle | How | Mode |
|
|
131
|
+
|---|---|---|
|
|
132
|
+
| Cookie consent (GDPR) | ARIA scan + jsClick accept button, 7 languages | Both |
|
|
133
|
+
| Consent behind iframes | JS `.click()` via DOM.resolveNode bypasses overlays | Both |
|
|
134
|
+
| Permission prompts | Launch flags + CDP Browser.setPermission auto-deny | Both |
|
|
135
|
+
| Media autoplay blocked | `--autoplay-policy=no-user-gesture-required` | Both |
|
|
136
|
+
| Login walls | Cookie extraction from Firefox/Chromium + CDP injection | Both |
|
|
137
|
+
| Pre-filled form inputs | `type({ clear: true })` selects all + deletes first | Both |
|
|
138
|
+
| Off-screen elements | `DOM.scrollIntoViewIfNeeded` before every click | Both |
|
|
139
|
+
| Form submission | `press('Enter')` triggers onsubmit | Both |
|
|
140
|
+
| SPA navigation | `waitForNavigation()` uses loadEventFired + frameNavigated | Both |
|
|
141
|
+
| Bot detection | Headed mode with real cookies bypasses most checks | Headed |
|
|
142
|
+
| `navigator.webdriver` | Stealth patches in headless (webdriver, plugins, chrome obj) | Headless |
|
|
143
|
+
| Profile locking | Unique temp dir per headless instance | Headless |
|
|
144
|
+
| ARIA noise | 9-step pruning: wrapper collapse, noise removal, landmark promotion | Both |
|
|
145
|
+
|
|
146
|
+
## bareagent wiring
|
|
147
|
+
|
|
148
|
+
barebrowse provides a tool adapter for bareagent's Loop:
|
|
149
|
+
|
|
150
|
+
```javascript
|
|
151
|
+
import { Loop } from 'bare-agent';
|
|
152
|
+
import { Anthropic } from 'bare-agent/providers';
|
|
153
|
+
import { createBrowseTools } from 'barebrowse/bareagent';
|
|
154
|
+
|
|
155
|
+
const provider = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
|
|
156
|
+
const loop = new Loop({ provider });
|
|
157
|
+
|
|
158
|
+
const { tools, close } = createBrowseTools();
|
|
159
|
+
|
|
160
|
+
try {
|
|
161
|
+
const result = await loop.run(
|
|
162
|
+
[{ role: 'user', content: 'Search for "barebrowse" on DuckDuckGo and tell me the first result' }],
|
|
163
|
+
tools
|
|
164
|
+
);
|
|
165
|
+
console.log(result.text);
|
|
166
|
+
} finally {
|
|
167
|
+
await close();
|
|
168
|
+
}
|
|
169
|
+
```
|
|
170
|
+
|
|
171
|
+
`createBrowseTools(opts)` returns:
|
|
172
|
+
- `tools` -- array of bareagent-compatible tool objects (browse, goto, snapshot, click, type, press, scroll, select, screenshot)
|
|
173
|
+
- `close()` -- cleanup function, call when done
|
|
174
|
+
|
|
175
|
+
Action tools (click, type, press, scroll, goto) auto-return a fresh snapshot so the LLM always sees the result. 300ms settle delay after actions for DOM updates.
|
|
176
|
+
|
|
177
|
+
## MCP wrapper
|
|
178
|
+
|
|
179
|
+
barebrowse ships an MCP server for direct use with Claude Desktop, Cursor, or any MCP client.
|
|
180
|
+
|
|
181
|
+
```bash
|
|
182
|
+
npm install barebrowse # or npm install -g barebrowse
|
|
183
|
+
```
|
|
184
|
+
|
|
185
|
+
Add to your MCP client config (`.mcp.json`, `claude_desktop_config.json`, etc.):
|
|
186
|
+
```json
|
|
187
|
+
{
|
|
188
|
+
"mcpServers": {
|
|
189
|
+
"barebrowse": {
|
|
190
|
+
"command": "npx",
|
|
191
|
+
"args": ["barebrowse", "mcp"]
|
|
192
|
+
}
|
|
193
|
+
}
|
|
194
|
+
}
|
|
195
|
+
```
|
|
196
|
+
|
|
197
|
+
7 tools exposed: `browse` (one-shot), `goto`, `snapshot`, `click`, `type`, `press`, `scroll`.
|
|
198
|
+
|
|
199
|
+
Action tools return `'ok'` -- the agent calls `snapshot` explicitly to observe. This avoids double-token output since MCP tool calls are cheap to chain.
|
|
200
|
+
|
|
201
|
+
Session tools (goto, snapshot, click, type, press, scroll) share a singleton page, lazy-created on first use.
|
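Since the server speaks raw JSON-RPC 2.0 over stdio, a client that does not use an MCP SDK has to frame requests itself. The sketch below builds the `tools/call` requests a client might send for a goto, snapshot, click sequence. The helper names are hypothetical, and the argument names (`url`, `ref`) are assumptions; the server's `tools/list` response is the authoritative source for each tool's input schema.

```javascript
let nextId = 1;

// Build a JSON-RPC 2.0 request object with an auto-incrementing id.
function rpcRequest(method, params) {
  return { jsonrpc: '2.0', id: nextId++, method, params };
}

// MCP's tools/call method takes the tool name and an arguments object.
function callTool(name, args) {
  return rpcRequest('tools/call', { name, arguments: args });
}

// The stdio transport sends each message as a single line of JSON.
function frame(message) {
  return JSON.stringify(message) + '\n';
}

// Assumed argument names (url, ref): verify against tools/list before use.
const messages = [
  callTool('goto', { url: 'https://example.com' }),
  callTool('snapshot', {}),
  callTool('click', { ref: '8' }),
];
```

Each framed message would be written to the server's stdin after the usual MCP `initialize` handshake. Because action tools return `'ok'`, a `snapshot` call is interleaved to observe the page before clicking.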
|
202
|
+
|
|
203
|
+
## Architecture
|
|
204
|
+
|
|
205
|
+
```
|
|
206
|
+
URL -> chromium.js (find/launch browser, permission flags)
|
|
207
|
+
-> cdp.js (WebSocket CDP client)
|
|
208
|
+
-> stealth.js (navigator.webdriver patches, headless only)
|
|
209
|
+
-> Browser.setPermission (suppress prompts)
|
|
210
|
+
-> auth.js (extract cookies -> inject via CDP)
|
|
211
|
+
-> Page.navigate
|
|
212
|
+
-> consent.js (detect + dismiss cookie dialogs)
|
|
213
|
+
-> aria.js (Accessibility.getFullAXTree -> nested tree)
|
|
214
|
+
-> prune.js (9-step role-based pruning)
|
|
215
|
+
-> interact.js (click/type/scroll/hover/select via Input domain)
|
|
216
|
+
-> agent-ready snapshot
|
|
217
|
+
```
|
|
218
|
+
|
|
219
|
+
| Module | Lines | Purpose |
|
|
220
|
+
|---|---|---|
|
|
221
|
+
| `src/index.js` | ~370 | Public API: `browse()`, `connect()`, screenshot, network idle, hybrid |
|
|
222
|
+
| `src/cdp.js` | 148 | WebSocket CDP client, flattened sessions |
|
|
223
|
+
| `src/chromium.js` | 148 | Find/launch Chromium browsers, permission-suppressing flags |
|
|
224
|
+
| `src/aria.js` | 69 | Format ARIA tree as text |
|
|
225
|
+
| `src/auth.js` | 279 | Cookie extraction (Chromium AES + keyring, Firefox), CDP injection |
|
|
226
|
+
| `src/prune.js` | 472 | ARIA pruning pipeline (ported from mcprune) |
|
|
227
|
+
| `src/interact.js` | ~170 | Click, type, press, scroll, hover, select |
|
|
228
|
+
| `src/consent.js` | 200 | Auto-dismiss cookie consent dialogs across languages |
|
|
229
|
+
| `src/stealth.js` | ~40 | Navigator patches for headless anti-detection |
|
|
230
|
+
| `src/bareagent.js` | ~120 | Tool adapter for bareagent Loop |
|
|
231
|
+
| `mcp-server.js` | ~170 | MCP server (JSON-RPC over stdio) |
|
|
232
|
+
|
|
233
|
+
## Gotchas
|
|
234
|
+
|
|
235
|
+
1. **Refs are ephemeral.** Every `snapshot()` call generates new refs. Always snapshot before interacting. Never cache refs across snapshots.
|
|
236
|
+
|
|
237
|
+
2. **SPA navigation has no loadEventFired.** For single-page apps (React, YouTube, GitHub), use `waitForNetworkIdle()` or a timed wait after click instead of `waitForNavigation()`.
|
|
238
|
+
|
|
239
|
+
3. **Pruning modes matter.** `act` mode (default) keeps interactive elements + visible labels. `read` mode keeps all text content. Use `read` for content extraction, `act` for form filling and navigation.
|
|
240
|
+
|
|
241
|
+
4. **Headed mode requires manual browser launch.** Start your browser with `--remote-debugging-port=9222`. barebrowse connects to it -- it does not launch it.
|
|
242
|
+
|
|
243
|
+
5. **Cookie extraction needs unlocked profile.** Chromium cookies are AES-encrypted with a keyring key. If Chromium is running, the profile may be locked. Firefox cookies are plaintext and always accessible.
|
|
244
|
+
|
|
245
|
+
6. **Hybrid mode kills headless and reconnects.** If headless is bot-blocked, hybrid mode kills the headless browser and connects to a headed browser on port 9222. It does not launch that browser, so the headed browser must already be running.
|
|
246
|
+
|
|
247
|
+
7. **One page per connect().** Each `connect()` call creates one page. For multiple tabs, call `connect()` multiple times.
|
|
248
|
+
|
|
249
|
+
8. **Consent dismiss is best-effort.** It handles 16+ tested sites across 7 languages but novel consent implementations may need manual handling. Disable with `{ consent: false }`.
|
|
250
|
+
|
|
251
|
+
9. **Screenshot returns base64.** Write to file with `fs.writeFileSync('shot.png', Buffer.from(base64, 'base64'))` or pass directly to a vision model.
|
|
252
|
+
|
|
253
|
+
10. **Chromium-only.** CDP limits us to Chrome, Chromium, Edge, Brave, Vivaldi (~80% desktop share). Firefox support via WebDriver BiDi is not yet implemented.
|
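For gotcha 9, it can help to decode the base64 string and sanity-check the bytes before writing them to disk. The sketch below uses only Node's global `Buffer`. `decodeScreenshot` and `isPng` are hypothetical helpers, not part of barebrowse, and the check applies only when the screenshot format is PNG.

```javascript
// Every PNG file starts with this fixed 8-byte signature.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Decode a base64 string (as returned by screenshot()) into raw bytes.
function decodeScreenshot(base64) {
  return Buffer.from(base64, 'base64');
}

// Compare the leading bytes against the PNG signature.
function isPng(buffer) {
  return buffer.length >= PNG_SIGNATURE.length &&
    PNG_SIGNATURE.every((byte, i) => buffer[i] === byte);
}
```

The decoded buffer can be passed straight to `fs.writeFileSync('shot.png', buffer)`. For JPEG or WebP output the signature check would need different magic bytes.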
|
254
|
+
|
|
255
|
+
## Constraints
|
|
256
|
+
|
|
257
|
+
- **Node >= 22** -- built-in WebSocket, built-in SQLite
|
|
258
|
+
- **Chromium-only** -- CDP protocol
|
|
259
|
+
- **Linux first** -- tested on Fedora/KDE, macOS/Windows cookie paths exist but untested
|
|
260
|
+
- **Not a server** -- library that agents import. Wrap as MCP (included) or HTTP if needed.
|
|
261
|
+
- **Zero required deps** -- everything uses Node stdlib
|