barebrowse 0.2.0 → 0.2.2
- package/CHANGELOG.md +7 -0
- package/README.md +100 -179
- package/mcp-server.js +1 -1
- package/package.json +1 -1
- package/.claude/memory/AGENT_RULES.md +0 -251
- package/.claude/settings.local.json +0 -30
- package/.claude/stash/barebrowse-research-2026-02-22.md +0 -49
- package/.claude/stash/phase3-interactions-complete.md +0 -69
- package/.claude/stash/phase3-prep.md +0 -88
- package/docs/poc-plan.md +0 -230
- package/docs/prd.md +0 -284
- package/examples/headed-demo.js +0 -157
- package/examples/yt-demo.js +0 -137
- package/test/integration/browse.test.js +0 -108
- package/test/integration/interact.test.js +0 -514
- package/test/unit/auth.test.js +0 -66
- package/test/unit/cdp.test.js +0 -110
- package/test/unit/prune.test.js +0 -292
@@ -1,49 +0,0 @@
-# Stash: barebrowse Research & Repo Creation
-**Timestamp:** 2026-02-22
-**Session focus:** Research steipete repos, design shared browsing layer
-
-## Key Decisions
-
-1. **Researched steipete/sweetlink** — daemon + WebSocket + CDP architecture for driving real browser tabs. Not headless — opposite approach. Good ideas, overcomplicated for our needs.
-
-2. **Researched steipete/sweet-cookie** — TS library extracting cookies from real browser profiles (Chrome/FF/Safari). Direct fit for mcprune's Cloudflare/auth-wall problem.
-
-3. **Identified the parallel** — mcprune (headless DOM reading) and Multis (interactive browsing) share the same underlying need: get a browser context that loads any URL, authenticated as the user, without being blocked.
-
-4. **Designed `barebrowse`** — a unified browsing layer with 3 modes:
-   - `headless` — Playwright + cookie injection + stealth (mcprune)
-   - `headed` — CDP connect to running Chrome (Multis)
-   - `hybrid` — try headless, auto-fallback to headed
-
-5. **Key simplification over sweetlink** — skip daemon entirely, use CDP directly. Playwright wraps CDP already. No WebSocket bridge, no in-page runtime injection.
-
-6. **Named it `barebrowse`** — follows bare- ecosystem (bareagent, barebrowse, mcprune, multis)
-
-## Work Completed
-
-- Created repo: `~/PycharmProjects/barebrowse/` (git init, main branch)
-- Moved research doc: `mcprune/docs/peter-repos.md` → `barebrowse/docs/prd.md`
-- PRD includes: architecture diagrams, API sketches, 4-phase POC plan, repos to study, package structure
-
-## Repos to Borrow From
-
-| Repo | Take |
-|---|---|
-| steipete/sweet-cookie | Cookie extraction (TS) |
-| steipete/sweetlink | Selector discovery, click patterns, CDP dual-channel |
-| steipete/canvas | Stealth/anti-detection config |
-
-## Next Steps
-
-- [ ] Start barebrowse POC Phase 1 (headless + cookies)
-- [ ] Wire mcprune to use barebrowse instead of direct Playwright
-- [ ] Design Multis headed-mode integration
-
-## Bare Ecosystem Map
-
-```
-bareagent → agent orchestration
-barebrowse → browser access layer (NEW)
-mcprune → DOM snapshot pruning
-multis → personal assistant
-```
@@ -1,69 +0,0 @@
-# Phase 3 Interactions — Complete
-
-**Date:** 2026-02-22
-**Session:** Real-world interaction testing + fixes
-
-## What Was Done
-
-### Fixes to `src/interact.js`
-1. **scrollIntoView before click** — `DOM.scrollIntoViewIfNeeded` in `getCenter()` before `DOM.getBoxModel`
-2. **`press(key)` function** — 14-key KEY_MAP (Enter, Tab, Escape, Backspace, Delete, arrows, Home/End, PageUp/Down, Space). Enter has `text: '\r'`, Tab has `text: '\t'` for proper form submission.
-3. **`type({ clear: true })`** — Ctrl+A select-all + Backspace before typing to replace pre-filled content
-4. **Key event text field** — Without `text: '\r'` on Enter keyDown, form `onsubmit` never fires
-
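Editorial sketch (not part of the deleted file): the key-map fix described above could look roughly like the following. This is a minimal illustration under stated assumptions, not the actual `src/interact.js`. The three entries and the helper name `keyEvents` are hypothetical; the parameter names (`type`, `key`, `code`, `windowsVirtualKeyCode`, `text`) are those of CDP `Input.dispatchKeyEvent`.

```javascript
// Hypothetical subset of a key map; the deleted notes mention 14 keys in total.
const KEY_MAP = {
  Enter:  { key: 'Enter',  code: 'Enter',  windowsVirtualKeyCode: 13, text: '\r' },
  Tab:    { key: 'Tab',    code: 'Tab',    windowsVirtualKeyCode: 9,  text: '\t' },
  Escape: { key: 'Escape', code: 'Escape', windowsVirtualKeyCode: 27 },
};

// Build the keyDown/keyUp parameter objects for CDP Input.dispatchKeyEvent.
function keyEvents(name) {
  const def = KEY_MAP[name];
  if (!def) throw new Error(`Unknown key: ${name}`);
  const { text, ...base } = def;
  // Only keyDown carries `text`; per the notes, Enter without it never submits a form.
  const down = text === undefined
    ? { type: 'keyDown', ...base }
    : { type: 'keyDown', ...base, text };
  return [down, { type: 'keyUp', ...base }];
}
```

Each returned object would then be passed to the CDP session as the params of one `Input.dispatchKeyEvent` call, keyDown first.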
-### Additions to `src/index.js`
-- `page.press(key)` — wired to `cdpPress`
-- `page.waitForNavigation(timeout)` — `Page.loadEventFired` promise
-
-### Tests: 54/54 passing
-- `test/integration/interact.test.js` (15 tests, 8 suites):
-  - Round 1: data: URL fixture (7 tests — click, type, clear, offscreen scroll, Enter submit, unknown key error, link navigation)
-  - Round 2: Google Search (consent handling, type+Enter+nav — bot-blocked but flow works)
-  - Round 3: Wikipedia (article link click + navigation)
-  - Round 4: GitHub (SPA navigation with settle time)
-  - Round 5: DuckDuckGo (search + results — headless-friendly)
-  - Round 6: Hacker News (story link click + navigation)
-  - Round 7: Reddit/old.reddit.com (with fallback to www.reddit.com)
-  - Round 8: Firefox cookie injection (extractCookies → injectCookies → Network.getAllCookies verification)
-
-### `examples/headed-demo.js`
-10-step demo: Wikipedia → click link → DuckDuckGo → type query → Enter → results. Requires `chromium-browser --remote-debugging-port=9222`.
-
-### YouTube Demo (ad-hoc, in /tmp)
-Proved the full loop: Firefox cookies extracted → injected into headed Chromium → YouTube consent bypassed → searched "Family Portrait Pink" → clicked and played the video. User watched it happen live.
-
-## Key Findings
-
-### Real-world breakage patterns
-- **Cookie consent walls** block everything (Google, YouTube). Need locale-aware button matching or cookie injection to bypass.
-- **Bot detection** (Google, Reddit) blocks headless. Headed mode with real cookies is the fix.
-- **ARIA roles vary wildly** — DuckDuckGo search box is `LabelText` not `textbox`, Google's is `combobox` with `[expanded=false]` between role and ref.
-- **`Page.loadEventFired` doesn't fire** for data: URL → data: URL navigation or SPA transitions.
-- **`Page.frameNavigated` fires too early** — page DOM not ready yet, snapshots return empty.
-- **Example.com link text changed** from "More information..." to "Learn more" — real sites change.
-- **IANA page about example domains** contains "Example Domain" text — can't assert absence.
-- **Act mode prunes links** on simple pages like example.com — need browse mode to find clickable links.
-
-### Firefox → Chromium cookie flow (proven)
-```
-Firefox cookies.sqlite (plaintext) → extractCookies({ browser: 'firefox', domain })
-  → injectCookies(session, cookies) via Network.setCookie
-  → headless/headed Chromium uses user's sessions
-```
-Works for YouTube (8 cookies), Google (35 cookies), GitHub, etc.
-
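Editorial sketch (not part of the deleted file): the middle step of the flow above, turning one extracted Firefox cookie into parameters for CDP `Network.setCookie`, might look like this. The row shape (`host`, `isSecure`, `isHttpOnly`, `expiry`) assumes the column names of Firefox's `moz_cookies` table, and `expiry` is assumed to be in seconds since the Unix epoch; the helper name `toSetCookieParams` is hypothetical and is not taken from the package's `auth.js`.

```javascript
// Map one cookie row (assumed Firefox moz_cookies column names) to
// the parameter object of CDP Network.setCookie.
function toSetCookieParams(row) {
  return {
    name: row.name,
    value: row.value,
    domain: row.host,
    path: row.path || '/',
    secure: Boolean(row.isSecure),
    httpOnly: Boolean(row.isHttpOnly),
    expires: row.expiry, // assumption: seconds since the Unix epoch
  };
}
```

An injection step would then send one `Network.setCookie` command per mapped row before navigating.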
-## Docs Updated
-- `docs/poc-plan.md` — Phase 3 DoD checkboxes checked, test section expanded
-- `docs/blueprint.md` — interactions section updated, "Real-world testing" marked DONE, wait strategies partially done
-- `docs/prd.md` — wait strategies and cookie auth decision updated
-
-## Commits
-1. `6017113` — feat(interact): add scrollIntoView, press(), clear, waitForNavigation
-2. `41bed12` — test: add real-site rounds + headed demo, update docs
-
-## What's Next (not started)
-- Phase 4: Hybrid mode (headless → headed fallback)
-- Stealth patches (`navigator.webdriver`, etc.)
-- More wait strategies (network idle, element presence)
-- Shopping/checkout flows, login forms, dropdowns
-- Screenshot capture for visual verification
@@ -1,88 +0,0 @@
-# Stash: barebrowse Phase 3 Prep
-**Timestamp:** 2026-02-22
-**Session focus:** Built Phase 1 + Phase 2 of barebrowse POC
-
-## What barebrowse Is
-Vanilla JS library — CDP-direct browsing for autonomous agents. URL in → pruned ARIA snapshot out. No Playwright, no bundled browser. Uses user's installed Chromium.
-
-## Completed Work
-
-### Phase 1 — CDP + ARIA Foundation (DONE)
-- `src/chromium.js` (142 lines) — Find/launch any Chromium browser, parse CDP WebSocket URL from stderr
-- `src/cdp.js` (148 lines) — Vanilla WebSocket CDP client with flattened session support (sessionId at top level)
-- `src/aria.js` (69 lines) — Format ARIA tree as YAML-like text, skip InlineTextBox/LineBreak noise
-- `src/index.js` (223 lines) — `browse(url)` and `connect(opts)` public API
-
-### Phase 2 — Auth + Prune (DONE)
-- `src/auth.js` (279 lines) — Cookie extraction from Chromium (AES-128-CBC + KWallet/GNOME keyring) and Firefox (plaintext). Injection via CDP `Network.setCookie`. Uses `node:sqlite` with `immutable=1` URI to read live DBs.
-- `src/prune.js` (472 lines) — Full port of mcprune's 9-step pruning pipeline adapted for CDP ARIA tree format. Two modes: `act` (actions only) and `browse` (keeps content).
-
-### Tests — 39/39 passing
-- `test/unit/prune.test.js` — 16 tests (pruning logic, pure function)
-- `test/unit/auth.test.js` — 7 tests (cookie extraction from Firefox)
-- `test/unit/cdp.test.js` — 5 tests (CDP client, browser launch, ARIA tree)
-- `test/integration/browse.test.js` — 11 tests (end-to-end pipeline)
-
-### Docs
-- `docs/prd.md` — Comprehensive PRD with all decisions and rationale
-- `docs/poc-plan.md` — 4-phase plan with DoD, test instructions, repo study reference
-- `CLAUDE.md` — Project dev rules stub
-
-## Key Architecture Decisions (settled)
-- CDP direct, not Playwright (no 200MB download)
-- ARIA tree, not DOM (semantic, token-efficient)
-- Pruning built-in from mcprune (not optional)
-- Three modes: headless/headed/hybrid (one flag, same CDP code)
-- Chromium-only (CDP constraint, Firefox later via BiDi)
-- Vanilla JS, ES modules, Node >= 22, zero required deps
-- sweet-cookie not on npm — wrote our own auth.js
-- Unique temp dirs per browser instance (`/tmp/barebrowse-{pid}-{timestamp}`)
-
-## Test Results
-- HN: 51,932 chars raw → 27,312 pruned (47% reduction on minimal site)
-- example.com: browse mode keeps paragraphs, act mode keeps only heading
-- Cookie extraction: 181 Firefox cookies across 54 domains
-- GitHub cookies found but `logged_in=no` (user not logged in via Firefox)
-
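Editorial note (not part of the deleted file): the 47% figure in the Test Results above follows from the two character counts. The helper name `reductionPercent` is made up for this illustration.

```javascript
// Percentage of characters removed by pruning.
function reductionPercent(rawChars, prunedChars) {
  return (1 - prunedChars / rawChars) * 100;
}

// Hacker News figures quoted in the notes: 51,932 raw -> 27,312 pruned.
const hnReduction = reductionPercent(51932, 27312);
console.log(hnReduction.toFixed(1)); // prints "47.4"
```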
-## System Details
-- OS: Fedora Linux, KDE Plasma, Wayland
-- Node: 22.22.0 (built-in WebSocket, experimental node:sqlite)
-- Browser: Firefox default, Chromium installed via `sudo dnf install chromium`
-- Chromium binary: `/usr/bin/chromium-browser`
-- KWallet running with Chromium Safe Storage key available
-
-## What's Next — Phase 3 (Headed + Interaction)
-
-**Goal:** Connect to user's running browser, click/type on pages.
-
-**Files to create/update:**
-- Update `src/chromium.js` — `connect()` mode for running browser on debug port
-- `src/interact.js` — Click (`Input.dispatchMouseEvent`), type (`Input.dispatchKeyEvent`), scroll. Resolve ARIA nodeId to DOM coordinates.
-- Update `src/index.js` — Add click/type to connect() page handle
-
-**Key challenge:** Mapping ARIA nodeId to screen coordinates for click targeting. CDP `DOM.getBoxModel` + `Accessibility.getFullAXTree` node→backendDOMNodeId mapping needed.
-
-**Prerequisite:** User launches browser with `--remote-debugging-port=9222`
-
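Editorial sketch (not part of the deleted file): for the key challenge named above, one plausible route is AX node → `backendDOMNodeId` → `DOM.getBoxModel({ backendNodeId })` → center of the returned `model.content` quad → `Input.dispatchMouseEvent`. The helper below covers only the geometry step. It assumes the CDP Quad layout of eight numbers (four x/y corner pairs); the name `quadCenter` is hypothetical and not taken from the package.

```javascript
// Center point of a CDP Quad: [x1, y1, x2, y2, x3, y3, x4, y4].
function quadCenter(quad) {
  if (!Array.isArray(quad) || quad.length !== 8) {
    throw new Error('Expected a quad of 8 numbers');
  }
  let x = 0;
  let y = 0;
  for (let i = 0; i < 8; i += 2) {
    x += quad[i];
    y += quad[i + 1];
  }
  // Average of the four corners; also correct for rotated or skewed boxes.
  return { x: x / 4, y: y / 4 };
}
```

The resulting `{ x, y }` would serve as the coordinates for a `mousePressed` / `mouseReleased` pair.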
-## Phase 4 — Hybrid + bareagent Integration
-- `src/stealth.js` — Anti-detection patches via `Runtime.evaluate`
-- Hybrid mode: try headless, detect CF/403, fall back to headed
-- Wire as bareagent tool functions
-
-## Repos Studied
-- steipete/sweet-cookie — concept only (not on npm, wrote our own)
-- steipete/sweetlink — CDP-direct concept, skip daemon/WS bloat
-- steipete/canvas — stealth patterns noted for Phase 4
-- mcprune (own) — pruning logic fully ported
-
-## File Inventory
-```
-src/index.js      223 lines — Public API
-src/chromium.js   142 lines — Browser find/launch
-src/cdp.js        148 lines — CDP WebSocket client
-src/aria.js        69 lines — ARIA tree formatter
-src/auth.js       279 lines — Cookie extraction + injection
-src/prune.js      472 lines — ARIA tree pruning
-                 1333 total source
-                  576 total tests
-```
package/docs/poc-plan.md
DELETED
@@ -1,230 +0,0 @@
-# barebrowse — POC Plan
-
-**Date:** 2026-02-22
-**Goal:** Prove that an autonomous agent can get an authenticated, pruned ARIA snapshot of any web page via CDP — no Playwright, no bundled browser.
-
----
-
-## Repo Structure
-
-```
-barebrowse/
-├── src/
-│   ├── index.js          # Public API: browse(), connect()
-│   ├── chromium.js       # Find/launch/connect to Chromium browsers
-│   ├── cdp.js            # Vanilla WebSocket CDP client
-│   ├── aria.js           # Accessibility.getFullAXTree → structured tree
-│   ├── auth.js           # Cookie extraction + CDP Network.setCookie injection
-│   ├── prune.js          # ARIA tree pruning (ported from mcprune)
-│   ├── interact.js       # Click, type, scroll via CDP Input domain
-│   ├── consent.js        # Auto-dismiss cookie consent dialogs
-│   └── stealth.js        # Anti-detection patches via Runtime.evaluate (Phase 4)
-├── test/
-│   ├── integration/
-│   │   ├── browse.test.js    # End-to-end: URL → pruned snapshot
-│   │   ├── auth.test.js      # Cookie injection → authenticated page
-│   │   └── interact.test.js  # Click/type on a live page
-│   └── unit/
-│       ├── cdp.test.js       # CDP client message handling
-│       ├── aria.test.js      # ARIA tree formatting
-│       └── prune.test.js     # Pruning logic on sample trees
-├── docs/
-│   ├── prd.md            # Product requirements (comprehensive)
-│   └── poc-plan.md       # This file
-├── package.json
-└── CLAUDE.md
-```
-
-**No build step.** Vanilla JS, ES modules, runs directly with Node.js >= 22.
-
----
-
-## Phases
-
-### Phase 1 — CDP + ARIA Foundation
-
-**Prove:** Get an ARIA tree from any page via CDP, no Playwright.
-
-**Files:**
-- `src/chromium.js` — Find installed Chromium browsers on the system (Chrome, Chromium, Brave, Edge). Launch headless with `--headless=new --remote-debugging-port=<port>`. Parse CDP WebSocket URL from stderr output.
-- `src/cdp.js` — Vanilla WebSocket client that speaks CDP. Send JSON commands, receive responses and events. Handle command IDs, promises, event subscriptions. ~100 lines.
-- `src/aria.js` — Call `Accessibility.getFullAXTree` via CDP. Transform the raw CDP response (flat array of AXNodes with parentId references) into a nested tree structure. Format as readable output.
-- `src/index.js` — Wire chromium → cdp → aria into `browse(url)` function. Minimal, just the pipeline.
-
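Editorial sketch (not part of the deleted file): the "command IDs, promises, event subscriptions" described for `src/cdp.js` above can be illustrated without a real socket. This is a minimal, transport-agnostic sketch under stated assumptions, not the package's implementation. The class name `CdpClient`, its method names, and the injected `sendRaw` callback are hypothetical; the message shape (`id`, `method`, `params`, optional top-level `sessionId`) follows the CDP wire format.

```javascript
class CdpClient {
  constructor(sendRaw) {
    this.sendRaw = sendRaw;     // e.g. (text) => webSocket.send(text)
    this.nextId = 1;
    this.pending = new Map();   // command id -> { resolve, reject }
    this.listeners = new Map(); // event method -> array of handlers
  }

  // Send a command; the promise settles when the matching response arrives.
  send(method, params = {}, sessionId) {
    const id = this.nextId++;
    const msg = { id, method, params };
    if (sessionId) msg.sessionId = sessionId; // flattened session mode
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.sendRaw(JSON.stringify(msg));
    });
  }

  // Subscribe to a CDP event such as 'Page.loadEventFired'.
  on(method, handler) {
    const list = this.listeners.get(method) || [];
    list.push(handler);
    this.listeners.set(method, list);
  }

  // Feed every incoming WebSocket text frame into this method.
  handleMessage(text) {
    const msg = JSON.parse(text);
    if (msg.id !== undefined) {
      const entry = this.pending.get(msg.id);
      if (!entry) return;
      this.pending.delete(msg.id);
      if (msg.error) entry.reject(new Error(msg.error.message));
      else entry.resolve(msg.result);
    } else if (msg.method) {
      for (const handler of this.listeners.get(msg.method) || []) {
        handler(msg.params, msg.sessionId);
      }
    }
  }
}
```

Keeping the transport as an injected function makes the id bookkeeping testable without launching a browser.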
-**Test:**
-```bash
-node -e "import { browse } from './src/index.js'; console.log(await browse('https://example.com'))"
-```
-
-**DoD:**
-- [x] `chromium.js` finds and launches at least one Chromium browser on Fedora Linux
-- [x] `cdp.js` connects via WebSocket, sends commands, receives responses
-- [x] `aria.js` returns a structured ARIA tree for any public page
-- [x] `browse(url)` works end-to-end with zero external dependencies
-- [x] Headless Chrome process is cleaned up on close
-
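Editorial sketch (not part of the deleted file): the flat-to-nested transform described for `src/aria.js` in Phase 1 could be done in two passes over the node array. This is an illustration only. It assumes each AXNode carries `nodeId`, an optional `parentId`, and `role` / `name` objects with a `value` field, as in the CDP Accessibility domain; the function name `buildTree` and the output shape are hypothetical.

```javascript
// Turn a flat AXNode array (with parentId references) into a nested tree.
function buildTree(nodes) {
  const byId = new Map();
  // Pass 1: create one output item per node.
  for (const n of nodes) {
    byId.set(n.nodeId, {
      id: n.nodeId,
      role: n.role?.value ?? 'unknown',
      name: n.name?.value ?? '',
      children: [],
    });
  }
  // Pass 2: attach each item to its parent; the first parentless node is the root.
  let root = null;
  for (const n of nodes) {
    const item = byId.get(n.nodeId);
    const parent = n.parentId !== undefined ? byId.get(n.parentId) : undefined;
    if (parent) parent.children.push(item);
    else if (!root) root = item;
  }
  return root;
}
```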
-### Phase 2 — Auth + Prune
-
-**Prove:** Authenticated, pruned ARIA snapshot of a Cloudflare-protected page.
-
-**Files:**
-- `src/auth.js` — Extract cookies from user's browser profile (use sweet-cookie or implement minimal extraction from Chrome's Cookies SQLite DB + Linux keyring decryption via `secret-tool`). Inject via CDP `Network.setCookie` before navigation.
-- `src/prune.js` — Port mcprune's pruning logic as a pure function. Input: raw ARIA tree. Output: pruned ARIA tree. Role-based: keep landmarks + interactive elements, drop noise/structural wrappers.
-- Update `src/index.js` — Add cookie injection and pruning to the `browse()` pipeline.
-
-**Test:**
-```bash
-# Should return authenticated content, not a login wall or CF challenge
-node -e "import { browse } from './src/index.js'; console.log(await browse('https://some-cf-protected-site.com'))"
-```
-
-**DoD:**
-- [x] `auth.js` extracts cookies from Firefox profile on Linux (also supports Chromium when installed)
-- [x] Cookies injected via CDP before navigation
-- [ ] CF-protected page returns real content, not challenge page (needs active session to test)
-- [x] `prune.js` reduces ARIA tree by 47%+ on HN (minimal site — heavier sites will see 70%+)
-- [x] Pruned output preserves all interactive elements and landmarks
-- [x] `browse(url)` returns pruned, authenticated snapshot by default
-
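Editorial sketch (not part of the deleted file): the role-based rule stated for `src/prune.js` in Phase 2 (keep landmarks and interactive elements, drop structural wrappers) can be shown as a small pure function. This is a deliberately simplified illustration, not mcprune's 9-step pipeline. The role sets below are assumptions chosen for the example, and the function name `prune` and node shape (`role`, `name`, `children`) are hypothetical.

```javascript
// Assumed role sets for illustration only.
const LANDMARKS = new Set(['main', 'navigation', 'banner', 'search', 'form']);
const INTERACTIVE = new Set(['link', 'button', 'textbox', 'checkbox', 'combobox']);

// Returns an array: the kept node, or its surviving descendants hoisted upward.
function prune(node) {
  const kids = node.children.flatMap(prune);
  if (LANDMARKS.has(node.role) || INTERACTIVE.has(node.role)) {
    return [{ ...node, children: kids }];
  }
  return kids; // wrapper or noise node: drop it, keep what survived below
}
```

Hoisting surviving children keeps interactive elements reachable even when every wrapper around them is removed.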
-### Phase 3 — Headed Mode + Interaction
-
-**Prove:** Connect to user's running browser and interact with a logged-in page.
-
-**Files:**
-- Update `src/chromium.js` — Add `connect()` mode: connect to an already-running browser's debug port instead of launching a new one. Detect running browsers with debug ports.
-- `src/interact.js` — Click (`Input.dispatchMouseEvent`), type (`Input.dispatchKeyEvent`), scroll. Resolve ARIA node IDs to DOM coordinates for click targets.
-- Update `src/index.js` — Add `connect()` export for long-lived sessions. Add `mode: 'headed'` option.
-
-**Prerequisite:** User must launch their browser with `--remote-debugging-port=9222` flag.
-
-**Test:**
-```bash
-# User has Chrome open with debug port, logged into GitHub
-node -e "
-import { connect } from './src/index.js';
-const page = await connect({ mode: 'headed' });
-await page.goto('https://github.com/notifications');
-console.log(await page.snapshot());
-"
-```
-
-**DoD:**
-- [x] `connect()` attaches to a running Chromium browser via CDP
-- [x] Same ARIA + prune pipeline works on headed browser
-- [x] `click()` and `type()` send real input events via CDP
-- [x] `press()` sends special keys (Enter, Tab, Escape, arrows) — triggers form submit
-- [x] `scrollIntoView` before click ensures off-screen elements are reachable
-- [x] `type({ clear: true })` replaces pre-filled input content
-- [x] `waitForNavigation()` waits for page load after link clicks
-- [x] Interactions tested against real sites: Wikipedia, GitHub, Google, Hacker News, DuckDuckGo, YouTube
-- [x] Browser stays open after barebrowse disconnects
-- [x] Cookie injection via `page.injectCookies()` for headed mode (Firefox → Chromium)
-- [x] Permission prompts suppressed via launch flags + CDP `Browser.setPermission`
-- [x] Cookie consent dialogs auto-dismissed across 16+ sites in 7 languages
-- [x] YouTube end-to-end: Firefox cookies → search → click → video playback in headed mode
-
-### Phase 4 — Hybrid + bareagent Integration
-
-**Prove:** Agent autonomously browses the web using barebrowse tools.
-
-**Files:**
-- Update `src/chromium.js` — Add `mode: 'hybrid'`. Try headless first. If navigation returns a CF challenge or 403, automatically retry in headed mode.
-- `src/stealth.js` — Basic anti-detection: patch `navigator.webdriver`, `navigator.plugins`, `window.chrome`. Applied via `Runtime.evaluate` on new page.
-- Update `src/index.js` — Final API surface: `browse()`, `connect()`.
-
-**Test:**
-```js
-import { Loop } from 'bare-agent';
-import { browse } from './src/index.js';
-
-const tools = [
-  { name: 'browse', execute: ({ url }) => browse(url) },
-];
-
-const loop = new Loop({ provider });
-await loop.run([
-  { role: 'user', content: 'Go to hacker news and tell me the top 3 stories' }
-], tools);
-```
-
-**DoD:**
-- [ ] Hybrid mode automatically falls back when headless is blocked
-- [ ] Stealth patches reduce headless detection on common sites
-- [ ] bareagent can use `browse()` as a tool in its think/act/observe loop
-- [ ] Agent successfully completes a multi-page research task autonomously
-
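Editorial sketch (not part of the deleted file): the Phase 4 hybrid rule ("try headless first; on a CF challenge or 403, retry in headed mode") reduces to a small control-flow wrapper. Everything here is an assumption for illustration: the result shape `{ status, text }`, the block-detection heuristic, and the injected `headless` / `headed` functions are hypothetical and do not reflect the package's API.

```javascript
// Hypothetical heuristic: HTTP 403 or text that resembles a challenge page.
function looksBlocked(result) {
  return result.status === 403
    || /just a moment|checking your browser/i.test(result.text);
}

// Try the headless fetcher first; fall back to the headed one when blocked.
async function browseHybrid(url, { headless, headed }) {
  const first = await headless(url);
  if (!looksBlocked(first)) return { mode: 'headless', ...first };
  const second = await headed(url);
  return { mode: 'headed', ...second };
}
```

Passing both fetchers in as parameters keeps the fallback decision separate from how each browser is launched or attached.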
----
-
-## Definition of Done — Full POC
-
-The POC is complete when ALL of these are true:
-
-1. **`browse(url)` works end-to-end** — URL in, pruned ARIA snapshot out, authenticated as the user
-2. **Zero heavy deps** — no Playwright, no Puppeteer. Only deps: `ws` (WebSocket client, if Node's built-in isn't sufficient) and optionally `sweet-cookie`
-3. **Three modes work** — headless (default), headed (connect to running browser), hybrid (auto-fallback)
-4. **Works on Fedora Linux** — finds Chrome/Chromium/Brave, launches headless, connects headed
-5. **Token-efficient output** — pruned ARIA tree is 70%+ smaller than raw tree
-6. **Clean process management** — headless browser spawned and killed cleanly, no orphan processes
-7. **Under 1,000 lines total** for core src/ (excluding tests)
-8. **Documented** — PRD captures all decisions, this file captures all phases
-
-## What the POC is NOT
-
-- Not production-ready. No error recovery, no retry logic, no edge case handling beyond happy path.
-- Not cross-platform tested. Linux first (Fedora). macOS/Windows later.
-- Not an MCP server. That's a future wrapper.
-- Not a published npm package. Local development only.
-
----
-
-## Running Tests
-
-```bash
-# All tests (47+ tests)
-node --test test/unit/*.test.js test/integration/*.test.js
-
-# Unit tests only (fast, no network)
-node --test test/unit/prune.test.js  # 16 tests — pruning logic
-node --test test/unit/auth.test.js  # 7 tests — cookie extraction (2 fail when Chromium locked)
-node --test test/unit/cdp.test.js  # 5 tests — CDP client + browser launch
-
-# Integration tests (needs network + Chromium)
-node --test test/integration/browse.test.js  # 11 tests — end-to-end pipeline
-node --test test/integration/interact.test.js  # 15 tests — interactions on real sites
-
-# Quick smoke test
-node -e "import { browse } from './src/index.js'; console.log(await browse('https://example.com'))"
-
-# Headed mode demos (requires: chromium-browser --remote-debugging-port=9222)
-node examples/headed-demo.js  # Wikipedia → DuckDuckGo search
-node examples/yt-demo.js  # YouTube: Firefox cookies → search → play video
-```
-
----
-
-## Repos Studied — What We Borrowed vs Built
-
-| steipete repo | What we studied | What we used | Why not more |
-|---|---|---|---|
-| **sweet-cookie** | Cookie extraction (SQLite + keyring) | **Concept only** — wrote `auth.js` ourselves | Not on npm (different package). Our version is simpler, tailored, vanilla JS |
-| **sweetlink** | CDP dual-channel, selector discovery, daemon | **CDP-direct concept only** | Daemon + WebSocket bridge + in-page runtime = bloat. CDP direct is 100 lines vs ~2,000 |
-| **canvas** | Stealth/anti-detection patterns | **Noted for Phase 4** `stealth.js` | Not needed yet — headless + real cookies handles most cases |
-| **mcprune (own)** | ARIA pruning pipeline | **Full port** — `prune.js` (472 lines) | Proven code, adapted node format from Playwright YAML to CDP tree objects |
-
-### What to explore in later phases
-
-- **Selector discovery** (sweetlink) — crawl ARIA tree, score interactive elements, rank action targets. Phase 3/4.
-- **Stealth patches** (canvas) — `navigator.webdriver`, plugins, chrome object spoofing. Phase 4.
-- **In-page JS execution** (sweetlink) — `Runtime.evaluate` for complex interactions. Phase 3.
-- **Screenshot + visual grounding** — `Page.captureScreenshot` for multimodal agents. Post-POC.
-
----
-
-## Dev Rules (from AGENT_RULES.md)
-
-- **Vanilla JS only.** No TypeScript, no build step, no transpilation.
-- **Dependency hierarchy:** vanilla → stdlib → external. Write it yourself if <50 lines.
-- **Simple > clever.** Readable code a junior can follow.
-- **POC first.** Validate logic before designing. Never ship the POC — rewrite it.
-- **Test behavior, not implementation.** Integration tests over unit tests.
-- **No speculative code.** Every line must have a purpose.