barebrowse 0.2.2 → 0.3.1

@@ -0,0 +1,284 @@
1
+ # barebrowse — Product Requirements Document
2
+
3
+ **Version:** 1.0
4
+ **Date:** 2026-02-22
5
+ **Status:** POC
6
+
7
+ ---
8
+
9
+ ## What barebrowse is
10
+
11
+ A standalone vanilla JavaScript library that gives autonomous agents authenticated access to the web through the user's own Chromium browser. One package, one import, three modes.
12
+
13
+ ```js
14
+ import { browse } from 'barebrowse';
15
+ const snapshot = await browse('https://any-page.com');
16
+ ```
17
+
18
+ barebrowse handles: finding the browser, connecting via CDP, injecting cookies, navigating, extracting the ARIA accessibility tree, and pruning it down to what an agent actually needs. The output is a clean, token-efficient snapshot of any web page — authenticated as the real user.
19
+
20
+ ## What barebrowse is NOT
21
+
22
+ - **Not a framework.** No plugin system, no config files, no lifecycle hooks.
23
+ - **Not an MCP server.** But trivially wrappable as one (~30 lines).
24
+ - **Not Playwright.** No bundled browser, no cross-engine abstraction, no 200MB download.
25
+ - **Not an agent.** No LLM, no planning, no orchestration — that's bareagent's job.
26
+ - **Not a scraper.** It browses as the user, not as a bot harvesting data.
27
+
28
+ ---
29
+
30
+ ## The Problem
31
+
32
+ Every AI agent that needs to read or interact with the web hits the same walls:
33
+
34
+ 1. **Cloudflare / bot detection** — headless browsers get blocked
35
+ 2. **Authentication** — sites require login, OAuth, session cookies
36
+ 3. **Token bloat** — raw DOM is 100K+ tokens; agents need ~5K
37
+ 4. **Two consumers, same need** — research agents (read pages) and personal assistants (click/type) both need an authenticated browser, but existing tools force you to choose one path
38
+
39
+ Existing solutions (Playwright MCP, sweetlink, open-operator, browser-use) are either too heavy, too opinionated, or solve only half the problem.
40
+
41
+ ## The Insight
42
+
43
+ The user already has a browser. It's already logged in. It already passes Cloudflare. Instead of fighting the web with headless stealth tricks, **use what's already there**.
44
+
45
+ CDP (Chrome DevTools Protocol) lets us connect to any Chromium-based browser — the same one the user browses with daily. We get their cookies, their sessions, their anti-detection posture, for free.
46
+
47
+ ---
48
+
49
+ ## Core Architecture
50
+
51
+ ### CDP-Direct (Why No Playwright)
52
+
53
+ **Decision:** Use CDP over WebSocket directly. No Playwright dependency.
54
+
55
+ **Why:**
56
+ - Playwright downloads a bundled Chromium (~200MB). barebrowse uses the browser already installed on the user's machine.
57
+ - Playwright abstracts CDP, but we need CDP directly for all three modes (headless, headed, hybrid) against the user's real browser.
58
+ - For Chromium, every Playwright API call ultimately translates into one or more CDP commands. The abstraction adds weight without adding capability for our use case.
59
+ - CDP gives us everything: `Accessibility.getFullAXTree`, `Page.navigate`, `Runtime.evaluate`, `Input.dispatch*Event`, `Network.setCookie`, `Page.captureScreenshot`.
60
+ - The CDP WebSocket client is ~100 lines of vanilla JS. Playwright is ~50,000.
61
+
62
+ **What we lose:** Cross-engine support (Firefox, WebKit). CDP only works with Chromium-family browsers (Chrome, Chromium, Edge, Brave, Vivaldi, Arc, Opera). This covers ~80% of desktop browsers. Firefox support could come later via WebDriver BiDi.
63
+
64
+ **What we gain:** Zero heavy deps, uses the user's real browser, same code path for headless/headed/hybrid, drastically simpler codebase.
65
+
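To give a feel for why a CDP client can stay small, here is a minimal sketch of the request/response bookkeeping it needs. CDP commands are JSON objects with an `id`, a `method`, and `params`, and replies echo the `id`. The `createCdpClient` name and the injected `sendRaw` transport are assumptions made for illustration; this is not barebrowse's actual `src/cdp.js`.

```js
// Hypothetical sketch: CDP request/response matching over any text transport.
// `sendRaw` stands in for WebSocket.send; it is injected so the logic can be
// exercised without a browser.
function createCdpClient(sendRaw) {
  let nextId = 1;
  const pending = new Map();   // id -> { resolve, reject }
  const listeners = new Map(); // event method -> [callback]

  return {
    // Send a command; the promise settles when the reply with the same id arrives.
    send(method, params = {}, sessionId) {
      const id = nextId++;
      const message = { id, method, params };
      if (sessionId) message.sessionId = sessionId; // routes to one attached target
      return new Promise((resolve, reject) => {
        pending.set(id, { resolve, reject });
        sendRaw(JSON.stringify(message));
      });
    },
    // Subscribe to a CDP event such as 'Page.loadEventFired'.
    on(method, callback) {
      if (!listeners.has(method)) listeners.set(method, []);
      listeners.get(method).push(callback);
    },
    // Feed every incoming WebSocket text frame into this function.
    handleMessage(text) {
      const msg = JSON.parse(text);
      if (msg.id !== undefined && pending.has(msg.id)) {
        const { resolve, reject } = pending.get(msg.id);
        pending.delete(msg.id);
        if (msg.error) reject(new Error(msg.error.message));
        else resolve(msg.result);
      } else if (msg.method) {
        for (const cb of listeners.get(msg.method) || []) cb(msg.params, msg.sessionId);
      }
    },
  };
}
```

In a real client, `sendRaw` would be a WebSocket's `send` and `handleMessage` would be wired to its message event. Timeouts, reconnects, and browser discovery are left out of this sketch.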
66
+ ### ARIA-First (Why Not DOM)
67
+
68
+ **Decision:** Use `Accessibility.getFullAXTree` (ARIA/accessibility tree) as the primary page representation, not DOM.
69
+
70
+ **Why:**
71
+ - The accessibility tree is the semantic structure of the page — roles, names, states, interactive elements. It's what screen readers see. It's also what agents need.
72
+ - DOM is bloated: wrapper divs, styling, tracking pixels, ad scripts. An agent doesn't need any of that.
73
+ - mcprune already proved this: ARIA snapshots pruned by role achieve 75-95% token reduction on typical pages while preserving all actionable information.
74
+ - CDP's `Accessibility.getFullAXTree` returns the tree directly. No parsing HTML, no building a DOM tree, no traversing nodes.
75
+ - ARIA refs map directly to CDP interaction targets — the agent reads a button in the tree and can click it via the same CDP connection.
76
+
77
+ **The pipeline:** CDP connect → authenticate → navigate → ARIA tree → prune → agent gets clean snapshot.
78
+
79
+ ### Three Modes (Why All Three)
80
+
81
+ **Decision:** Headless, headed, and hybrid — not as separate packages or optional features, but as a single flag on the same API.
82
+
83
+ **Why they're not bloat:** The CDP conversation is identical regardless of mode. The only difference is how you get a browser process with a debug port. It's one code path with a different entry point:
84
+
85
+ ```
86
+ headless: spawn chromium --headless=new --remote-debugging-port=N
87
+ headed: connect to user's already-running browser on debug port
88
+ hybrid: try headless → detect failure → fall back to headed
89
+ ```
90
+
91
+ After connection, every CDP command is the same. Three modes = ~20 extra lines in `chromium.js`, not three implementations.
92
+
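The hybrid entry point can be sketched as a small wrapper around the other two modes. The function name `browseHybrid` and the injected `browseWith` and `looksBlocked` helpers are hypothetical stand-ins, not barebrowse's API; how a blocked page is actually detected is not specified here.

```js
// Hypothetical sketch of hybrid mode. `browseWith(mode, url)` resolves to a
// snapshot string; `looksBlocked(snapshot)` decides whether that snapshot is
// a bot-challenge page. Both are injected so the control flow is testable.
async function browseHybrid(url, { browseWith, looksBlocked }) {
  try {
    const snapshot = await browseWith('headless', url);
    if (!looksBlocked(snapshot)) return { mode: 'headless', snapshot };
  } catch (err) {
    // Assumption: a headless failure is treated like a block, so we fall
    // through to headed mode instead of rethrowing.
  }
  const snapshot = await browseWith('headed', url);
  return { mode: 'headed', snapshot };
}
```

Returning the mode alongside the snapshot lets a caller remember which sites needed the headed fallback.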
93
+ **When to use each:**
94
+
95
+ | Mode | Use case | Example |
96
+ |---|---|---|
97
+ | `headless` | Agent research, background tasks, CI | "Read this article and summarize it" |
98
+ | `headed` | Personal assistant, interactive tasks, auth flows | "Book me a flight on this page" |
99
+ | `hybrid` | Recommended for autonomous agents | Try headless; if Cloudflare-blocked, fall back to headed |
100
+
101
+ **Headless is the default.** Most agent tasks are "go read this page." Headed is the escape hatch for when headless fails or the task requires user-visible interaction.
102
+
103
+ ### Cookie Authentication
104
+
105
+ **Decision:** Extract cookies from the user's browser profile and inject via CDP `Network.setCookie`.
106
+
107
+ **Why:**
108
+ - The user's browser has active sessions for every site they use. We reuse those sessions instead of building new auth flows.
109
+ - sweet-cookie (npm package) already extracts cookies from Chrome/Firefox/Safari SQLite databases with OS keychain decryption. We use it or vendor the relevant parts.
110
+ - For headed mode, cookies are already present in the browser — no extraction needed.
111
+ - For headless mode, we extract from the user's profile and inject into the headless instance.
112
+
113
+ **Limitation:** Cookies expire. This works for existing sessions, not new logins. For sites requiring fresh auth, headed mode with user interaction is the fallback.
114
+
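As a rough illustration of the injection step, the sketch below maps extracted cookie rows onto the parameter object that CDP's `Network.setCookie` accepts (`name`, `value`, `domain`, `path`, `secure`, `httpOnly`, `expires`). The input row shape and both function names are assumptions for this example; the real extraction code may use different field names.

```js
// Hypothetical input row: { name, value, host, path, expiry, isSecure, isHttpOnly }.
// `expiry` is assumed to be seconds since the Unix epoch; 0 or missing means
// a session cookie.
function toSetCookieParams(row) {
  const params = {
    name: row.name,
    value: row.value,
    domain: row.host,
    path: row.path || '/',
    secure: Boolean(row.isSecure),
    httpOnly: Boolean(row.isHttpOnly),
  };
  if (row.expiry) params.expires = row.expiry; // CDP expects seconds since epoch
  return params;
}

// Skip cookies that have already expired, then convert the rest.
function liveCookies(rows, nowSeconds = Date.now() / 1000) {
  return rows
    .filter((row) => !row.expiry || row.expiry > nowSeconds)
    .map(toSetCookieParams);
}
```

Each resulting object would be sent as the `params` of one `Network.setCookie` command before navigating.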
115
+ ### Pruning (Absorbed from mcprune)
116
+
117
+ **Decision:** Port mcprune's role-based ARIA tree pruning into barebrowse as a built-in step, not an optional module.
118
+
119
+ **Why:**
120
+ - Pruning is not optional for agent consumption. A raw ARIA tree is still too large for most LLM context windows. Pruning is part of the pipeline, not an afterthought.
121
+ - mcprune's pruning logic is a pure function: takes an ARIA tree, returns a smaller ARIA tree. No browser dependency, no Playwright coupling. It's ~300 lines of role-based tree surgery.
122
+ - By absorbing it, barebrowse becomes a complete "URL in, agent-ready snapshot out" solution. No second package needed.
123
+
124
+ **What we port from mcprune:**
125
+ - Role taxonomy (landmarks, interactive, structural, noise)
126
+ - Landmark extraction (main, nav, banner, etc.)
127
+ - Noise removal (ads, tracking, legal boilerplate)
128
+ - Interactive element preservation (buttons, links, inputs)
129
+ - Wrapper collapsing (nested generics, empty groups)
130
+ - Context-aware filtering (search relevance, dedup)
131
+
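A heavily reduced sketch of role-based pruning as a pure function follows. The three role sets and the simplified node shape (`role`, `name`, `children`) are assumptions chosen for illustration; mcprune's actual taxonomy and rules are more extensive.

```js
// Hypothetical role sets; the real taxonomy is larger.
const INTERACTIVE = new Set(['button', 'link', 'textbox', 'combobox', 'checkbox']);
const WRAPPER = new Set(['generic', 'group', 'none']);
const NOISE = new Set(['contentinfo', 'complementary']);

// Pure function: returns an array of pruned nodes, because a removed node
// yields zero nodes and a collapsed wrapper yields its children.
// The input tree is never mutated.
function prune(node) {
  if (NOISE.has(node.role)) return [];
  const children = (node.children || []).flatMap(prune);
  // Interactive elements are always preserved.
  if (INTERACTIVE.has(node.role)) return [{ ...node, children }];
  // Unnamed wrappers are collapsed: their children are hoisted to the parent.
  if (WRAPPER.has(node.role) && !node.name) return children;
  // Anything left with no name and no surviving children carries no information.
  if (!node.name && children.length === 0) return [];
  return [{ ...node, children }];
}
```

Because the function only takes and returns plain objects, it can be unit-tested without a browser, which matches the "pure function" property described above.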
132
+ **What stays in mcprune:** The Playwright MCP proxy architecture. mcprune can continue to exist as a Playwright-based MCP server for users who want that path. But for barebrowse consumers, pruning is built in.
133
+
134
+ ---
135
+
136
+ ## API Design
137
+
138
+ ### Public API
139
+
140
+ ```js
141
+ import { browse, connect } from 'barebrowse';
142
+
143
+ // One-shot: URL in, pruned ARIA snapshot out
144
+ const tree = await browse('https://example.com');
145
+
146
+ // With options
147
+ const treeWithOptions = await browse('https://example.com', {
148
+ mode: 'hybrid', // 'headless' (default) | 'headed' | 'hybrid'
149
+ cookies: true, // inject user's cookies (default: true)
150
+ prune: true, // apply ARIA pruning (default: true)
151
+ browser: 'chrome', // which browser profile for cookies
152
+ timeout: 30000, // navigation timeout ms
153
+ });
154
+
155
+ // Long-lived session for interaction
156
+ const page = await connect({ mode: 'headed' });
157
+ await page.goto('https://amazon.com/cart');
158
+ await page.click('[data-action="checkout"]');
159
+ await page.type('#gift-message', 'Happy birthday!');
160
+ const snapshot = await page.snapshot(); // ARIA + prune
161
+ await page.close();
162
+ ```
163
+
164
+ ### Design Principles
165
+
166
+ 1. **One package, one import.** No picking pieces. `browse()` does everything. Power users get `connect()` for long-lived sessions.
167
+ 2. **Batteries included.** Cookies, ARIA, pruning — all happen inside by default. Disable with flags if you want raw access.
168
+ 3. **Escape hatches.** `connect()` returns an object with the raw CDP connection accessible. If you need something we don't wrap, you can send CDP commands directly.
169
+ 4. **Progressive complexity.** `browse(url)` for 90% of use cases. Options object for the rest. `connect()` for interactive sessions.
170
+
171
+ ---
172
+
173
+ ## The bare- Ecosystem
174
+
175
+ ```
176
+ bareagent = the brain (orchestration, planning, memory, retries, tool loop)
177
+ barebrowse = the eyes + hands (browse, read, interact with the web)
178
+ ```
179
+
180
+ **Integration with bareagent:**
181
+
182
+ ```js
183
+ import { Loop } from 'bare-agent';
184
+ import { browse } from 'barebrowse';
185
+
186
+ const tools = [
187
+ { name: 'browse', execute: ({ url }) => browse(url) },
188
+ ];
189
+
190
+ const loop = new Loop({ provider });
191
+ await loop.run([{ role: 'user', content: 'Find the cheapest flight to Tokyo' }], tools);
192
+ ```
193
+
194
+ bareagent handles the think/act/observe loop. barebrowse handles "see the web and act on it." Neither is opinionated about the other. Tools are plain functions.
195
+
196
+ **Integration with multis:**
197
+
198
+ multis (personal assistant) uses barebrowse in headed mode for interactive tasks. The multis proxy is already running, providing a desktop session. barebrowse connects to the user's Chrome and drives it on behalf of the assistant.
199
+
200
+ **MCP server wrapper (future):**
201
+
202
+ barebrowse is not an MCP server, but wrapping it as one is ~30 lines. This would replace Playwright MCP + mcprune proxy with a single, lighter MCP server.
203
+
204
+ ---
205
+
206
+ ## Decisions Log — Why We Chose Each
207
+
208
+ This section exists so we don't re-debate settled decisions.
209
+
210
+ | Decision | Choice | Why | Alternative considered | Why not |
211
+ |---|---|---|---|---|
212
+ | Browser protocol | CDP direct | Uses user's browser, ~100 lines, all 3 modes | Playwright | 200MB download, bundles its own Chromium, abstracts what we need raw |
213
+ | Page representation | ARIA tree | Semantic, token-efficient, what agents need | DOM/HTML | Bloated, noisy, needs heavy parsing |
214
+ | Pruning | Built-in | Agents always need pruned output | Optional/separate | Two deps for one job, pruning isn't optional |
215
+ | Cookie auth | Own auth.js + CDP inject | User's existing sessions (Firefox or Chromium), cross-browser injection into headless Chromium | OAuth/credential storage | Complex, security liability, reinventing what the browser already solved |
216
+ | Three modes | One flag | Same CDP code, ~20 lines difference | Separate packages | Same code, artificial separation |
217
+ | Chromium only | CDP constraint | ~80% browser share, user's real browser | Cross-browser (Playwright) | Requires Playwright, loses "use your own browser" benefit |
218
+ | Anti-detection | Runtime.evaluate patches | Minimal stealth for headless mode | Full stealth framework | Over-engineering; headless + real cookies handles 90% |
219
+ | Daemon/server | None | CDP is direct, no intermediary needed | sweetlink daemon pattern | Unnecessary complexity for local agent→browser |
220
+ | Framework | None (vanilla JS) | Matches bare- philosophy, zero deps | Express/Fastify wrapper | Not a server, not needed |
221
+ | Language | Vanilla JavaScript | Node.js ecosystem, same as bareagent, CDP libs available | TypeScript | Added build step, not needed for POC; can add types later |
222
+ | Naming | chromium.js | Covers all Chromium-family browsers, not just Chrome | chrome.js | Too specific; Brave/Edge/Arc are also targets |
223
+ | mcprune integration | Absorb pruning logic | One package does it all, mcprune pruning is a pure function | Keep separate | Agents shouldn't need two packages to browse |
224
+ | openclaw lesson | Single bridge protocol | One CDP connection vs many API integrations | Direct multi-API | openclaw proved this fails — bloat, maintenance, fragility |
225
+
226
+ ---
227
+
228
+ ## Future Features (Post-POC)
229
+
230
+ ### Near-term
231
+ - **Screenshot capture** — `Page.captureScreenshot` via CDP. Useful for visual verification and multimodal agents.
232
+ - **Network interception** — `Network.requestWillBeSent` / `Network.responseReceived` for monitoring page loads. Detect redirects, blocked resources, API calls.
233
+ - **Wait strategies** — `waitForNavigation()` done (Page.loadEventFired). Still needed: network idle, element presence polling.
234
+ - **Tab management** — Multiple pages in one browser session. CDP `Target.createTarget` / `Target.attachToTarget`.
235
+ - **MCP server wrapper** — Expose browse/click/type as MCP tools. Replaces Playwright MCP + mcprune combo.
236
+
237
+ ### Medium-term
238
+ - **Firefox support** — Via WebDriver BiDi protocol (cross-browser standard, still maturing). Second protocol adapter alongside CDP.
239
+ - **Cookie sync** — In hybrid mode, extract fresh cookies from headed session and cache for future headless use. Self-refreshing auth.
240
+ - **Selector discovery** — Port sweetlink's `discoverSelectors` — crawl ARIA tree, score interactive elements, return ranked action targets.
241
+ - **Form understanding** — Detect forms in ARIA tree, map fields to semantic purposes, enable agents to fill forms intelligently.
242
+ - **Proxy/Tor support** — Route headless browser through proxy for geo-restricted content.
243
+
244
+ ### Long-term
245
+ - **Profile management** — Multiple browser profiles for different identities/accounts.
246
+ - **Session recording/replay** — Record browsing sessions as CDP commands, replay for testing.
247
+ - **Visual grounding** — Combine ARIA tree with screenshot regions for multimodal agents.
248
+ - **Agent memory integration** — Remember visited pages, cache snapshots, track which sites need headed mode.
249
+
250
+ ---
251
+
252
+ ## Repos Studied — What We Borrowed and Why
253
+
254
+ | Repo | What we took | What we skipped |
255
+ |---|---|---|
256
+ | **steipete/sweet-cookie** | Cookie extraction from browser profiles, OS keychain decryption | Nothing — clean, focused library |
257
+ | **steipete/sweetlink** | CDP dual-channel concept, selector discovery scoring, click/command patterns | Daemon architecture, WebSocket bridge, in-page runtime injection, HMAC auth |
258
+ | **steipete/canvas** | Stealth/anti-detection config patterns | Go implementation (we're JS) |
259
+ | **nichochar/open-operator** | AI agent web automation patterns | Full framework, too opinionated |
260
+ | **AntlerClaw/playwright-mcp** | How to expose browser as MCP tools | Playwright dependency |
261
+ | **AntlerClaw/mcp-browser-use** | MCP-native browser patterns | Heavy deps |
262
+ | **AitchKay/chromancer** | Accessibility tree extraction approach | Different stack |
263
+ | **mcprune (own)** | ARIA pruning logic — role taxonomy, landmark extraction, noise removal, wrapper collapsing | Playwright dependency, MCP proxy architecture |
264
+ | **openclaw (own)** | Lesson learned: multi-API direct integration = bloat. Use a single bridge protocol | Everything — the architecture was the cautionary tale |
265
+
266
+ ### The openclaw lesson
267
+
268
+ openclaw tried to integrate 10+ messaging APIs directly — each with its own auth, format, quirks. It became a maintenance nightmare. multis solved the same problem by using Beeper/Matrix as a single bridge.
269
+
270
+ barebrowse applies the same lesson: instead of integrating Playwright + Puppeteer + WebDriver + stealth plugins + cookie libraries + proxy managers, we use **one protocol (CDP) to one browser (the user's)**. Everything else is unnecessary.
271
+
272
+ ---
273
+
274
+ ## Success Criteria
275
+
276
+ barebrowse succeeds when:
277
+
278
+ 1. `browse(url)` returns a pruned ARIA snapshot of any page, authenticated as the user
279
+ 2. Zero heavy dependencies — no Playwright, no Puppeteer, no bundled browser
280
+ 3. Works with any installed Chromium-based browser
281
+ 4. Headless for research, headed for interaction, hybrid for autonomous agents
282
+ 5. Plugs into bareagent as plain tool functions
283
+ 6. Total source under 1,000 lines for core functionality
284
+ 7. An agent using barebrowse + bareagent can autonomously research the web and act on pages
@@ -0,0 +1,16 @@
1
+ # Bug Log
2
+
3
+ Track bugs: symptom, root cause, fix, regression test.
4
+
5
+ ---
6
+
7
+ *No bugs logged yet. When one is found, add an entry:*
8
+
9
+ ```
10
+ ## [date] Short description
11
+
12
+ **Symptom:** What the user/test observed
13
+ **Root cause:** Why it happened
14
+ **Fix:** What was changed (file:line)
15
+ **Regression test:** Which test prevents recurrence
16
+ ```
@@ -0,0 +1,32 @@
1
+ # Decisions Log
2
+
3
+ Settled decisions. Don't re-debate these -- see the "Why" and "Why not" columns for the rationale.
4
+
5
+ ## Founding decisions (v0.1.0)
6
+
7
+ | # | Decision | Choice | Why | Alternative | Why not |
8
+ |---|----------|--------|-----|-------------|---------|
9
+ | 1 | Browser protocol | CDP direct | Uses user's browser, ~100 lines, all 3 modes | Playwright | 200MB download, bundles its own Chromium, abstracts what we need raw |
10
+ | 2 | Page representation | ARIA tree | Semantic, token-efficient, what agents need | DOM/HTML | Bloated, noisy, needs heavy parsing |
11
+ | 3 | Pruning | Built-in | Agents always need pruned output | Optional/separate | Two deps for one job, pruning isn't optional |
12
+ | 4 | Cookie auth | Own auth.js + CDP inject | User's existing sessions (Firefox or Chromium), cross-browser injection | OAuth/credential storage | Complex, security liability, reinventing what the browser already solved |
13
+ | 5 | Three modes | One flag | Same CDP code, ~20 lines difference | Separate packages | Same code, artificial separation |
14
+ | 6 | Chromium only | CDP constraint | ~80% browser share, user's real browser | Cross-browser (Playwright) | Requires Playwright, loses "use your own browser" benefit |
15
+ | 7 | Framework | None (vanilla JS) | Matches bare- philosophy, zero deps | Express/Fastify wrapper | Not a server, not needed |
16
+ | 8 | Language | Vanilla JavaScript | Node.js ecosystem, same as bareagent, CDP libs available | TypeScript | Added build step, not needed; can add types later |
17
+ | 9 | mcprune integration | Absorb pruning logic | One package does it all, mcprune pruning is a pure function | Keep separate | Agents shouldn't need two packages to browse |
18
+ | 10 | Daemon/server | None | CDP is direct, no intermediary needed | sweetlink daemon pattern | Unnecessary complexity for local agent-to-browser |
19
+
20
+ ## v0.2.0 decisions
21
+
22
+ | # | Decision | Choice | Why | Alternative | Why not |
23
+ |---|----------|--------|-----|-------------|---------|
24
+ | 11 | Anti-detection | Runtime.evaluate patches | Minimal stealth for headless mode | Full stealth framework | Over-engineering; headless + real cookies handles 90% |
25
+ | 12 | sweet-cookie | Wrote own auth.js | sweet-cookie not on npm (different package). Our version is simpler, tailored, vanilla JS | Use sweet-cookie | Not available as npm package |
26
+ | 13 | MCP server | Raw JSON-RPC, no SDK | Zero deps, ~200 lines. SDK adds weight without capability for stdio | @modelcontextprotocol/sdk | Unnecessary dependency for simple JSON-RPC |
27
+ | 14 | bareagent adapter | Action tools auto-return snapshot | LLM always sees result without extra tool call. 300ms settle for DOM updates | Return 'ok' like MCP | Different tradeoff -- bareagent tool calls are expensive (LLM round-trip) |
28
+ | 15 | MCP action tools | Return 'ok', agent calls snapshot | MCP tool calls are cheap to chain. Avoids double-token output | Auto-return snapshot | Would bloat every action response |
29
+
30
+ ---
31
+
32
+ *Add new decisions below this line. Include date, context, and rationale.*
@@ -0,0 +1,54 @@
1
+ # Implementation Log
2
+
3
+ Chronological record of what changed and why. For detailed changelogs, see `/CHANGELOG.md`.
4
+
5
+ ---
6
+
7
+ ## v0.2.1 (2026-02-22)
8
+
9
+ - README rewritten: no code blocks, obstacle table, two usage paths (MCP vs framework)
10
+ - MCP auto-installer: `npx barebrowse install` detects Claude Desktop, Cursor, Claude Code
11
+ - MCP config uses `npx` instead of local file paths
12
+
13
+ ## v0.2.0 (2026-02-22)
14
+
15
+ Major release: agent integration layer.
16
+
17
+ **New modules:**
18
+ - `mcp-server.js` -- JSON-RPC 2.0 over stdio, 7 tools, singleton session
19
+ - `src/bareagent.js` -- tool adapter for bareagent Loop, 9 tools, auto-snapshot
20
+ - `src/stealth.js` -- navigator patches for headless anti-detection
21
+ - `cli.js` -- `npx barebrowse mcp|install|browse`
22
+
23
+ **New features:**
24
+ - Hybrid mode (try headless, fallback to headed on bot detection)
25
+ - `page.hover(ref)`, `page.select(ref, value)`, `page.screenshot(opts)`
26
+ - `page.waitForNetworkIdle(opts)` -- resolve when no pending requests
27
+ - SPA-aware `waitForNavigation()`
28
+
29
+ **Docs:**
30
+ - `barebrowse.context.md` -- LLM integration guide
31
+ - `docs/testing.md` -- test pyramid, all 54 tests
32
+ - `docs/blueprint.md` -- full pipeline, module table
33
+
34
+ **Tests:** 54 passing (was 47)
35
+
36
+ ## v0.1.0 (2026-02-22)
37
+
38
+ Initial release. CDP-direct browsing with ARIA snapshots.
39
+
40
+ **Core modules (8):**
41
+ - `src/index.js` -- `browse()`, `connect()` API
42
+ - `src/cdp.js` -- WebSocket CDP client
43
+ - `src/chromium.js` -- browser discovery and launch
44
+ - `src/aria.js` -- ARIA tree formatting
45
+ - `src/auth.js` -- cookie extraction (Firefox SQLite, Chromium AES + keyring)
46
+ - `src/prune.js` -- 9-step pruning pipeline (ported from mcprune)
47
+ - `src/interact.js` -- click, type, press, scroll
48
+ - `src/consent.js` -- cookie consent auto-dismiss (7 languages, 16+ sites)
49
+
50
+ **Tests:** 47 passing across 5 files
51
+
52
+ ---
53
+
54
+ *Add new entries at the top. Include version, date, and what changed.*
@@ -0,0 +1,35 @@
1
+ # Insights
2
+
3
+ Lessons learned, patterns discovered, things to remember.
4
+
5
+ ---
6
+
7
+ ## The openclaw lesson
8
+
9
+ openclaw tried to integrate 10+ messaging APIs directly -- each with its own auth, format, quirks. It became a maintenance nightmare. multis solved the same problem by using Beeper/Matrix as a single bridge.
10
+
11
+ barebrowse applies the same lesson: instead of integrating Playwright + Puppeteer + WebDriver + stealth plugins + cookie libraries + proxy managers, we use **one protocol (CDP) to one browser (the user's)**. Everything else is unnecessary.
12
+
13
+ **Takeaway:** When possible, find a single bridge protocol instead of N direct integrations.
14
+
15
+ ## Repos studied -- what we took and what we skipped
16
+
17
+ | Repo | What we took | What we skipped | Why |
18
+ |------|-------------|-----------------|-----|
19
+ | **steipete/sweet-cookie** | Cookie extraction concept (SQLite + keyring) | Nothing | Not on npm. Wrote our own auth.js -- simpler, tailored, vanilla JS |
20
+ | **steipete/sweetlink** | CDP-direct concept | Daemon, WebSocket bridge, in-page runtime, HMAC auth | CDP direct is 100 lines vs ~2,000 |
21
+ | **steipete/canvas** | Stealth/anti-detection patterns | Go implementation | Noted for stealth.js |
22
+ | **mcprune (own)** | Full pruning pipeline port | Playwright dependency, MCP proxy | prune.js is 472 lines, adapted from Playwright YAML to CDP tree |
23
+ | **openclaw (own)** | Cautionary tale | Everything | Multi-API direct integration = bloat |
24
+
25
+ ## Key technical insights
26
+
27
+ - **ARIA tree > DOM** for agent consumption. Semantic, compact, interactive elements are first-class. Token reduction of 47-95% is real.
28
+ - **Cookie consent is solvable** with ARIA tree scanning + a button text corpus in 7 languages. Dialog role detection + global fallback covers >95% of sites.
29
+ - **Headed mode is the ultimate fallback.** When stealth fails, when cookies expire, when CAPTCHAs appear -- connecting to the user's real browser session handles it.
30
+ - **CDP flattened sessions** are the way to go. One WebSocket, multiple targets. The `sessionId` field on each message routes commands to the right tab.
31
+ - **`Page.addScriptToEvaluateOnNewDocument`** runs before any page scripts -- perfect for stealth patches without race conditions.
32
+
33
+ ---
34
+
35
+ *Add new insights as they emerge. These should be durable lessons, not session notes.*
@@ -0,0 +1,123 @@
1
+ # Validation Log
2
+
3
+ What's been tested against the real world. Updated when new sites or features are validated.
4
+
5
+ ---
6
+
7
+ ## Test suite (64 tests, 6 files)
8
+
9
+ | File | Tests | Type | What it covers |
10
+ |------|-------|------|----------------|
11
+ | `test/unit/prune.test.js` | 16 | Unit | 9-step pruning pipeline in isolation |
12
+ | `test/unit/auth.test.js` | 7 | Unit | Cookie extraction from Firefox/Chromium |
13
+ | `test/unit/cdp.test.js` | 5 | Unit | Browser discovery, launch, CDP client, sessions |
14
+ | `test/integration/browse.test.js` | 11 | Integration | Full `browse()` and `connect()` pipeline |
15
+ | `test/integration/cli.test.js` | 10 | Integration | CLI session lifecycle: open/snapshot/goto/click/eval/console/network/close |
16
+ | `test/integration/interact.test.js` | 15 | E2E | Real interactions on `data:` URL fixtures + live sites |
17
+
18
+ Run all: `node --test test/unit/*.test.js test/integration/*.test.js`
19
+
20
+ ## Site validation matrix
21
+
22
+ Tested across 16+ sites, 8 countries, 7 languages.
23
+
24
+ | Site | Consent | Cookies | Interactions | Notes |
25
+ |------|---------|---------|-------------|-------|
26
+ | google.com | NL dialog dismissed | Firefox injection | Search (combobox + Enter) | Bot-blocks headless |
27
+ | youtube.com | Bypassed via cookies | Firefox injection | Search + video playback | Full e2e demo, SPA nav |
28
+ | bbc.com | SourcePoint dismissed | -- | -- | Button outside dialog |
29
+ | wikipedia.org | -- | -- | Link click + navigation | Clean, no consent |
30
+ | github.com | -- | -- | SPA navigation | Needs settle time |
31
+ | duckduckgo.com | -- | -- | Search + results | Headless-friendly |
32
+ | news.ycombinator.com | -- | -- | Story link click | Clean, simple DOM |
33
+ | amazon.de | Banner dismissed | -- | -- | |
34
+ | theguardian.com | CMP dismissed | -- | -- | |
35
+ | spiegel.de | CMP dismissed | -- | -- | German |
36
+ | lemonde.fr | CMP dismissed | -- | -- | French |
37
+ | elpais.com | CMP dismissed | -- | -- | Spanish |
38
+ | corriere.it | CMP dismissed | -- | -- | Italian |
39
+ | nos.nl | CMP dismissed | -- | -- | Dutch |
40
+ | bild.de | CMP dismissed | -- | -- | German |
41
+ | nu.nl | CMP dismissed | -- | -- | Dutch |
42
+ | booking.com | Banner dismissed | -- | -- | |
43
+ | nytimes.com | -- | -- | -- | No consent wall |
44
+ | stackoverflow.com | Footer link only | -- | -- | Not blocking |
45
+ | cnn.com | -- | -- | -- | No consent wall |
46
+ | reddit.com | -- | -- | Fallback to old.reddit | Bot-blocks headless |
47
+
48
+ ## Token reduction measurements
49
+
50
+ | Page | Raw ARIA | Pruned | Reduction |
51
+ |------|----------|--------|-----------|
52
+ | example.com | 377 chars | 45 chars | 88% |
53
+ | Hacker News | 51,726 chars | 27,197 chars | 47% |
54
+ | Wikipedia (article) | 109,479 chars | 40,566 chars | 63% |
55
+ | DuckDuckGo | 42,254 chars | 5,407 chars | 87% |
56
+
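The Reduction column can be recomputed from the two character counts in each row. The snippet below is only a worked check of the table's arithmetic; the `reductionPercent` helper is a name invented for this example. Note that the table measures characters, which only approximates token counts.

```js
// Figures copied from the table above (character counts).
const measurements = [
  { page: 'example.com', raw: 377, pruned: 45 },
  { page: 'Hacker News', raw: 51726, pruned: 27197 },
  { page: 'Wikipedia (article)', raw: 109479, pruned: 40566 },
  { page: 'DuckDuckGo', raw: 42254, pruned: 5407 },
];

// Reduction = share of the raw snapshot that pruning removed, as a whole percentage.
function reductionPercent(raw, pruned) {
  return Math.round((1 - pruned / raw) * 100);
}

const reductions = measurements.map((m) => reductionPercent(m.raw, m.pruned));
// → [88, 47, 63, 87]
```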
57
+ ---
58
+
59
+ ## CLI manual validation (v0.3.0)
60
+
61
+ Full end-to-end validation of every CLI command against real websites.
62
+
63
+ ### Session lifecycle
64
+
65
+ | Command | Result |
66
+ |---------|--------|
67
+ | `barebrowse open https://example.com` | Session started, pid+port printed, session.json created |
68
+ | `barebrowse status` | Shows running pid, port, start time |
69
+ | `barebrowse close` | "Session closed", session.json removed, daemon exited |
70
+ | `status` after close | "No session found", exit code 1 |
71
+ | `click 5` with no session | "No active session. Run `barebrowse open` first.", exit 1 |
72
+ | double `open` | "Session already running. Use `barebrowse close` first.", exit 1 |
73
+
74
+ ### Navigation + snapshots (example.com, HN)
75
+
76
+ | Command | Result |
77
+ |---------|--------|
78
+ | `snapshot` (example.com) | `.barebrowse/page-*.yml` created, clean formatting |
79
+ | `snapshot --mode=read` | Read mode includes paragraphs, each node on own line |
80
+ | `goto https://news.ycombinator.com` | "ok" |
81
+ | `snapshot` (HN) | Clean ARIA tree with refs, proper newline separation |
82
+ | `screenshot` | Valid 780x493 PNG file |
83
+
84
+ ### Interactions (DuckDuckGo search)
85
+
86
+ | Command | Result |
87
+ |---------|--------|
88
+ | `type 12 barebrowse npm` | "ok", multi-word text correctly joined |
89
+ | `press Enter` | "ok", search submitted |
90
+ | `wait-idle` | "ok", waited for network settle |
91
+ | `eval "document.title"` | `"barebrowse npm at DuckDuckGo"` |
92
+ | `snapshot` | Search results page, clean formatting with refs |
93
+ | `fill 2583 hello world` | "ok", cleared search box + typed new text |
94
+ | `hover 2402` | "ok" |
95
+ | `scroll 300` | "ok" |
96
+
97
+ ### Debugging commands
98
+
99
+ | Command | Result |
100
+ |---------|--------|
101
+ | `eval "1 + 1"` | `2` |
102
+ | `eval "document.location.href"` | `"https://news.ycombinator.com/news"` |
103
+ | `eval "console.log('test'); console.error('err')"` | `ok` (undefined return) |
104
+ | `console-logs` | `.json (2 entries)` — log + error captured with types and timestamps |
105
+ | `network-log` | `.json (15 entries)` — all requests with URL, method, status |
106
+ | `network-log --failed` | `.json (1 entries)` — filtered to failed/4xx+ only |
107
+
108
+ ### Legacy + install commands
109
+
110
+ | Command | Result |
111
+ |---------|--------|
112
+ | `browse https://example.com` | One-shot snapshot to stdout |
113
+ | `install` | "No MCP clients detected" + Claude Code hint |
114
+ | `install --skill` | SKILL.md copied to `~/.config/claude/skills/barebrowse/` |
115
+ | (no args) | Clean help output with all commands |
116
+
117
+ ### Bug found and fixed during validation
118
+
119
+ **`src/aria.js` line 23**: ignored nodes joined children with `''` instead of `'\n'`, causing sibling subtrees to concatenate on one line (e.g. `[ref=15]- _promote`). Fixed to `.filter(Boolean).join('\n')`. All 64 tests pass with the fix.
120
+
121
+ ---
122
+
123
+ *Add new validation entries when testing against new sites or features.*
@@ -0,0 +1,31 @@
1
+ # Definition of Done
2
+
3
+ A feature or change is "done" when ALL of these are true.
4
+
5
+ ## Code
6
+
7
+ - [ ] Works end-to-end (not just the happy path)
8
+ - [ ] No heavy dependencies added (vanilla -> stdlib -> external hierarchy respected)
9
+ - [ ] Under reasonable line count -- no bloat
10
+ - [ ] Clean process management -- no orphan browser processes
11
+ - [ ] No security vulnerabilities introduced (command injection, XSS, etc.)
12
+
13
+ ## Tests
14
+
15
+ - [ ] Existing tests still pass: `node --test test/unit/*.test.js test/integration/*.test.js`
16
+ - [ ] New behavior has test coverage (integration preferred over unit)
17
+ - [ ] Bug fixes include a regression test that fails before the fix
18
+
19
+ ## Documentation
20
+
21
+ - [ ] `docs/00-context/system-state.md` updated if architecture changed
22
+ - [ ] `docs/03-logs/decisions-log.md` updated if a design decision was made
23
+ - [ ] `barebrowse.context.md` updated if public API changed
24
+ - [ ] `CHANGELOG.md` updated with what changed
25
+
26
+ ## Not required (avoid over-engineering)
27
+
28
+ - 100% code coverage
29
+ - TypeScript types
30
+ - Cross-platform testing (Linux first, others later)
31
+ - Performance benchmarks (unless performance is the feature)
@@ -0,0 +1,68 @@
1
+ # Development Workflow
2
+
3
+ ## Dev rules
4
+
5
+ **POC first.** Always validate logic with a ~15-minute proof-of-concept before building. Cover the happy path plus common edge cases. POC works -> design properly -> build with tests. Never ship the POC.
6
+
7
+ **Build incrementally.** Break work into small independent modules. One piece at a time, each must work on its own before integrating.
8
+
9
+ **Dependency hierarchy -- follow strictly:**
10
+ 1. Vanilla language -- write it yourself if <50 lines and not security-critical
11
+ 2. Standard library -- `node:test`, `node:fs`, `node:crypto`, `node:sqlite`
12
+ 3. External -- only when stdlib can't do it in <100 lines. Must be maintained, lightweight, widely adopted
13
+
14
+ **Exception:** Always use vetted libraries for security-critical code (crypto, auth, sanitization).
15
+
16
+ **Lightweight over complex.** Fewer moving parts, fewer deps, less config. Simple > clever. Readable > elegant.
17
+
18
+ **Open-source only.** No vendor lock-in. Every line of code must have a purpose -- no speculative code, no premature abstractions.
19
+
20
+ ## Language and runtime
21
+
22
+ - Vanilla JavaScript, ES modules, no build step
23
+ - Node.js >= 22 (built-in WebSocket; built-in SQLite via `node:sqlite`, available from 22.5.0)
24
+ - No TypeScript -- can add types later if needed
25
+
26
+ ## Running tests
27
+
28
+ ```bash
29
+ # All tests (unit + integration)
30
+ node --test test/unit/*.test.js test/integration/*.test.js
31
+
32
+ # Unit only (fast, no network)
33
+ node --test test/unit/prune.test.js
34
+ node --test test/unit/auth.test.js
35
+ node --test test/unit/cdp.test.js
36
+
37
+ # Integration (needs Chromium + network)
38
+ node --test test/integration/browse.test.js
39
+ node --test test/integration/interact.test.js
40
+
41
+ # Quick smoke test
42
+ node --input-type=module -e "import { browse } from './src/index.js'; console.log(await browse('https://example.com'))"
43
+ ```
44
+
45
+ ## Testing standards
46
+
47
+ - **Test behavior, not implementation.** Call the public API, assert on observable output.
48
+ - **Integration tests are the sweet spot.** Real components working together.
49
+ - **No test framework deps.** `node:test` and `node:assert/strict` only.
50
+ - **Always `page.close()` in a `finally` block** to avoid leaked browser processes.
51
+ - **Use `data:` URL fixtures** for deterministic tests (no network dependency).
52
+ - **Real-site tests** go in `interact.test.js`, grouped by site.
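The `finally` and `data:` URL rules above can be sketched as two small helpers. Both `dataUrl` and `withPage` are hypothetical names, and `openPage` stands in for whatever call returns a page object with a `close()` method; none of this is barebrowse's own API.

```js
// Build a deterministic fixture URL from inline HTML (no network needed).
function dataUrl(html) {
  return 'data:text/html,' + encodeURIComponent(html);
}

// Hypothetical wrapper: `openPage` is any async function that resolves to an
// object with a close() method.
async function withPage(openPage, fn) {
  const page = await openPage();
  try {
    return await fn(page);
  } finally {
    // Runs whether fn resolved or threw, so the browser is not leaked.
    await page.close();
  }
}

console.log(dataUrl('<button>Go</button>'));
```

Wrapping each test body this way keeps the cleanup in one place, so a failing assertion cannot skip `page.close()`.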
53
+
54
+ See `docs/04-process/testing.md` for the full test guide.
55
+
56
+ ## Git workflow
57
+
58
+ - Main branch: `main`
59
+ - Commit messages: conventional (`fix:`, `feat:`, `chore:`, `docs:`, `release:`)
60
+ - No force pushes to main
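The commit prefix convention above could be checked mechanically, for example from a commit hook. This is an illustrative sketch only; `hasConventionalPrefix` is a hypothetical helper and nothing here implies the repository ships such a check.

```js
// Prefixes taken from the list above; the check itself is a hypothetical sketch.
const PREFIXES = ['fix', 'feat', 'chore', 'docs', 'release'];

function hasConventionalPrefix(message) {
  return PREFIXES.some((p) => message.startsWith(`${p}:`));
}

console.log(hasConventionalPrefix('fix: join ignored-node children with newlines'));
console.log(hasConventionalPrefix('update readme'));
```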
61
+
62
+ ## Environment
63
+
64
+ - OS: Fedora Linux, KDE Plasma, Wayland
65
+ - Node: 22.22.0
66
+ - Browser: `/usr/bin/chromium-browser`
67
+ - Default browser: Firefox (cookies extracted from `~/.mozilla/firefox/*.default-release/cookies.sqlite`)
68
+ - KWallet has Chromium Safe Storage key