barebrowse 0.2.1 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31)

package/.claude/memory/AGENT_RULES.md +251 -0
package/.claude/settings.local.json +37 -0
package/.claude/skills/barebrowse/SKILL.md +107 -0
package/.claude/stash/barebrowse-research-2026-02-22.md +49 -0
package/.claude/stash/phase3-interactions-complete.md +69 -0
package/.claude/stash/phase3-prep.md +88 -0
package/.claude/stash/phase4-complete-2026-02-22.md +61 -0
package/CHANGELOG.md +53 -0
package/CLAUDE.md +4 -2
package/README.md +54 -7
package/barebrowse.context.md +27 -8
package/cli.js +289 -48
package/docs/00-context/assumptions.md +38 -0
package/docs/{blueprint.md → 00-context/system-state.md} +30 -5
package/docs/00-context/vision.md +52 -0
package/docs/01-product/prd.md +284 -0
package/docs/03-logs/bug-log.md +16 -0
package/docs/03-logs/decisions-log.md +32 -0
package/docs/03-logs/implementation-log.md +54 -0
package/docs/03-logs/insights.md +35 -0
package/docs/03-logs/validation-log.md +123 -0
package/docs/04-process/definition-of-done.md +31 -0
package/docs/04-process/dev-workflow.md +68 -0
package/docs/{testing.md → 04-process/testing.md} +21 -2
package/docs/README.md +55 -0
package/docs/archive/poc-plan.md +230 -0
package/mcp-server.js +1 -1
package/package.json +1 -1
package/src/aria.js +1 -1
package/src/daemon.js +321 -0
package/src/session-client.js +70 -0

package/docs/00-context/assumptions.md ADDED

@@ -0,0 +1,38 @@
+# barebrowse -- Assumptions & Constraints
+## Hard constraints
+| Constraint | Detail |
+|-----------|--------|
+| **Chromium-only** | CDP protocol. Covers Chrome, Chromium, Edge, Brave, Vivaldi, Arc, Opera (~80% desktop share). Firefox later via WebDriver BiDi. |
+| **Node >= 22** | Built-in WebSocket (`globalThis.WebSocket`), built-in SQLite (`node:sqlite`). No polyfills. |
+| **Linux first** | Tested on Fedora/KDE/Wayland. macOS/Windows cookie extraction paths exist in auth.js but are untested. |
+| **Zero required deps** | Everything uses Node stdlib. Vanilla JS, ES modules, no build step. |
+| **Not a server** | Library that agents import. MCP wrapper included, HTTP wrapper is DIY. |
+## Assumptions
+- **User has Chromium installed.** At least one of: chromium-browser, google-chrome, brave-browser, microsoft-edge. `chromium.js` searches common paths.
+- **Cookie extraction needs unlocked profile.** Chromium cookies are AES-encrypted with a keyring key (KWallet on KDE, GNOME Keyring on GNOME). Firefox cookies are plaintext SQLite and always accessible.
+- **Headed mode requires manual browser launch.** User must start their browser with `--remote-debugging-port=9222`. barebrowse connects to it -- does not launch it.
+- **Hybrid fallback needs a running headed browser.** If headless is bot-blocked, hybrid kills headless and connects to headed on port 9222. That browser must already be running.
+- **Cookies expire.** Cookie injection works for existing sessions, not new logins. For sites requiring fresh auth, headed mode with user interaction is the fallback.
+- **One page per connect().** Each `connect()` call creates one page. For multiple tabs, call `connect()` multiple times.
+## Known limitations
+| Limitation | Impact | Workaround |
+|-----------|--------|------------|
+| No Firefox/WebKit support | ~20% of desktop users can't use native browser | Use Chromium as the automation target, Firefox as cookie source |
+| No file upload | Can't interact with file inputs | Not yet implemented (`Input.setFiles` via CDP) |
+| No drag and drop | Can't use drag-based UIs | Not yet implemented |
+| No cross-origin iframes | Content inside iframes invisible to ARIA tree | Frame tree traversal via CDP (medium effort) |
+| No CAPTCHAs | Cannot solve challenge pages | Headed mode lets user solve manually |
+| Canvas/WebGL opaque | No ARIA representation | Needs screenshot + vision model |
+| macOS/Windows untested | Cookie paths exist but may not work | Linux-only for now |
+## Risks
+- **CDP is not a stable API.** Chrome team can change it across versions. Mitigation: we use well-established domains (Accessibility, Input, Page, Network, DOM) that rarely break.
+- **Cookie consent patterns evolve.** New consent frameworks may not be detected by `consent.js`. Mitigation: best-effort, opt-out with `{ consent: false }`.
+- **Stealth patches are an arms race.** Bot detection evolves. Mitigation: headed mode with real browser profile is the ultimate fallback.
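Reviewer note on the hunk above: the headed and hybrid assumptions both depend on a browser already listening on port 9222. A minimal sketch of how a caller might check that precondition follows, using the DevTools HTTP discovery endpoint `/json/version` that Chromium exposes when started with `--remote-debugging-port`. The function names (`versionUrl`, `extractWsEndpoint`, `probeDebugPort`) and the 1000 ms timeout are assumptions for illustration; barebrowse's own connection code is not part of this diff.

```javascript
// Sketch only: these helpers are hypothetical and not taken from barebrowse.

// Build the DevTools HTTP discovery URL for a given debug port.
function versionUrl(port = 9222, host = '127.0.0.1') {
  return `http://${host}:${port}/json/version`;
}

// Pull the browser-level WebSocket endpoint out of a /json/version payload.
// Returns null when the payload does not carry a usable ws:// or wss:// URL.
function extractWsEndpoint(payload) {
  const url = payload && payload.webSocketDebuggerUrl;
  return typeof url === 'string' && url.startsWith('ws') ? url : null;
}

// Resolve to the WebSocket URL if a browser is listening, otherwise null.
// Relies on the global fetch and AbortSignal.timeout available in Node >= 22.
async function probeDebugPort(port = 9222, timeoutMs = 1000) {
  try {
    const res = await fetch(versionUrl(port), {
      signal: AbortSignal.timeout(timeoutMs),
    });
    if (!res.ok) return null;
    return extractWsEndpoint(await res.json());
  } catch {
    return null; // connection refused, timeout, or a non-JSON response
  }
}
```

A caller could run `await probeDebugPort()` before choosing headed mode and report a clear "start your browser with --remote-debugging-port=9222" message when it returns null.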

package/docs/{blueprint.md → 00-context/system-state.md} RENAMED

@@ -163,7 +163,7 @@ Every action returns a **pruned ARIA snapshot** -- the agent's view of the page
 ### Module table
-Eleven modules, 2,396 lines, zero required dependencies.
+Thirteen modules, zero required dependencies.
 | Module | Lines | Purpose |
 |---|---|---|
@@ -177,6 +177,8 @@ Eleven modules, 2,396 lines, zero required dependencies.
 | `src/consent.js` | 210 | Auto-dismiss cookie consent dialogs, 7 languages |
 | `src/stealth.js` | 51 | Navigator patches for headless anti-detection |
 | `src/bareagent.js` | 161 | Tool adapter for bareagent Loop |
+| `src/daemon.js` | ~230 | Background HTTP server holding connect() session for CLI mode |
+| `src/session-client.js` | ~60 | HTTP client to daemon (sendCommand, readSession, isAlive) |
 | `mcp-server.js` | 216 | MCP server (JSON-RPC 2.0 over stdio) |
 ---
@@ -254,11 +256,12 @@ Anti-detection for headless mode via `Page.addScriptToEvaluateOnNewDocument` (ru
 - `Permissions.prototype.query` -> notifications return 'prompt'
 - Applied automatically in headless mode
-### Tests -- 47+ passing
+### Tests -- 64 passing
 - 16 unit tests (pruning logic)
 - 7 unit tests (cookie extraction -- 2 skip when Chromium profile locked)
 - 5 unit tests (CDP client + browser launch)
 - 11 integration tests (end-to-end browse pipeline)
+- 10 integration tests (CLI session lifecycle: open/snapshot/goto/click/eval/console/network/close)
 - 15 integration tests (real-world interactions: data: URL fixture + live sites)
 ---
@@ -302,6 +305,26 @@ Raw JSON-RPC 2.0 over stdio. Zero SDK dependencies. `npm install barebrowse` the
 Action tools return `'ok'` -- agent calls `snapshot` explicitly (MCP tool calls are cheap to chain).
 Session tools share a singleton page, lazy-created on first use.
+### CLI session -- for coding agents + human devs
+Shell commands that output to disk. Coding agents (Claude Code, Copilot, Cursor) read output files with their file tools -- no tokens wasted in tool responses.
+```bash
+barebrowse open https://example.com    # Start daemon + navigate
+barebrowse snapshot                    # → .barebrowse/page-*.yml
+barebrowse click 8                     # Click element
+barebrowse console-logs                # → .barebrowse/console-*.json
+barebrowse close                       # Kill daemon + browser
+```
+Architecture: `open` spawns a detached child process running an HTTP server on a random localhost port. Session state stored in `.barebrowse/session.json`. Subsequent commands POST to the daemon. `close` sends shutdown, daemon calls `page.close()` + `process.exit(0)`.
+Full commands: open, close, status, goto, snapshot, screenshot, click, type, fill, press, scroll, hover, select, eval, wait-idle, console-logs, network-log.
+Self-sufficiency features (console/network capture, eval) let agents debug without guessing -- they see JS errors and failed requests directly.
+SKILL.md (`.claude/skills/barebrowse/SKILL.md`) teaches Claude Code the CLI commands. Install with `barebrowse install --skill`.
 ---
 ## Ecosystem
@@ -339,10 +362,12 @@ barebrowse/
 │   ├── interact.js    # Click, type, press, scroll, hover, select
 │   ├── consent.js     # Auto-dismiss cookie consent dialogs
 │   ├── stealth.js     # Navigator patches for headless anti-detection
-│   └── bareagent.js   # Tool adapter for bareagent Loop
+│   ├── bareagent.js   # Tool adapter for bareagent Loop
+│   ├── daemon.js      # Background HTTP server for CLI session
+│   └── session-client.js  # HTTP client to daemon
 ├── test/
 │   ├── unit/          # prune, auth, cdp tests
-│   └── integration/   # browse + interact tests (real sites)
+│   └── integration/   # browse, interact, cli tests
 ├── examples/
 │   ├── headed-demo.js # Interactive demo: Wikipedia → DuckDuckGo
 │   └── yt-demo.js     # YouTube demo: Firefox cookies → search → play video
@@ -352,7 +377,7 @@ barebrowse/
 │   ├── blueprint.md   # This file
 │   └── testing.md     # Test guide: pyramid, all 54 tests, CI strategy
 ├── mcp-server.js      # MCP server (JSON-RPC 2.0 over stdio)
-├── cli.js             # CLI entry: `npx barebrowse mcp` or `npx barebrowse browse <url>`
+├── cli.js             # CLI entry: session commands, MCP, browse, install
 ├── .mcp.json          # MCP server config for Claude Desktop / Cursor
 ├── barebrowse.context.md  # LLM-consumable integration guide
 ├── package.json
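Reviewer note on the CLI session hunk above: the described architecture (a background HTTP server on a random localhost port, with later commands sent as POST requests) can be sketched roughly as below. This is not the code in `src/daemon.js` or `src/session-client.js`, whose bodies are not shown in this excerpt. The route shape (`POST /<command>`), the JSON envelope, and the helper names `startDaemon`, `buildSession`, and `commandUrl` are assumptions; `sendCommand` borrows its name from the module table, but its signature here is a guess.

```javascript
// Hypothetical sketch of the daemon pattern; not barebrowse's implementation.

// Session record that a CLI might persist (the docs mention
// .barebrowse/session.json) so later commands can find the daemon.
function buildSession(port, pid) {
  return { port, pid, startedAt: new Date().toISOString() };
}

// URL a client would POST a named command to.
function commandUrl(session, name) {
  return `http://127.0.0.1:${session.port}/${encodeURIComponent(name)}`;
}

// Start an HTTP server on a random localhost port. `handlers` maps a command
// name (for example "snapshot") to an async function taking parsed JSON args.
async function startDaemon(handlers) {
  const http = await import('node:http');
  const server = http.createServer(async (req, res) => {
    const name = decodeURIComponent((req.url || '/').slice(1));
    const handler = handlers[name];
    if (req.method !== 'POST' || typeof handler !== 'function') {
      res.writeHead(404).end();
      return;
    }
    req.setEncoding('utf8');
    let body = '';
    for await (const chunk of req) body += chunk;
    try {
      const result = await handler(body ? JSON.parse(body) : {});
      res.writeHead(200, { 'content-type': 'application/json' });
      res.end(JSON.stringify({ ok: true, result }));
    } catch (err) {
      res.writeHead(500, { 'content-type': 'application/json' });
      res.end(JSON.stringify({ ok: false, error: String(err) }));
    }
  });
  // Port 0 asks the OS for any free port; binding to 127.0.0.1 keeps it local.
  await new Promise((resolve) => server.listen(0, '127.0.0.1', resolve));
  return { server, session: buildSession(server.address().port, process.pid) };
}

// Client side: POST a command and return the decoded JSON reply.
async function sendCommand(session, name, args = {}) {
  const res = await fetch(commandUrl(session, name), {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(args),
  });
  return res.json();
}
```

Binding to port 0 avoids collisions between concurrent sessions, which is presumably why the docs describe a random port plus a session file rather than a fixed port.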

package/docs/00-context/vision.md ADDED

@@ -0,0 +1,52 @@
+# barebrowse -- Vision
+## What it is
+A standalone vanilla JavaScript library that gives autonomous agents authenticated access to the web through the user's own Chromium browser. One package, one import, three modes.
+```js
+import { browse } from 'barebrowse';
+const snapshot = await browse('https://any-page.com');
+```
+barebrowse handles: finding the browser, connecting via CDP, injecting cookies, navigating, extracting the ARIA accessibility tree, and pruning it down to what an agent actually needs. The output is a clean, token-efficient snapshot of any web page -- authenticated as the real user.
+## What it is NOT
+- **Not a framework.** No plugin system, no config files, no lifecycle hooks.
+- **Not Playwright.** No bundled browser, no cross-engine abstraction, no 200MB download.
+- **Not an agent.** No LLM, no planning, no orchestration -- that's bareagent's job.
+- **Not a scraper.** It browses as the user, not as a bot harvesting data.
+## The core insight
+The user already has a browser. It's already logged in. It already passes Cloudflare. Instead of fighting the web with headless stealth tricks, **use what's already there**.
+CDP (Chrome DevTools Protocol) lets us connect to any Chromium-based browser -- the same one the user browses with daily. We get their cookies, their sessions, their anti-detection posture, for free.
+## The problem it solves
+Every AI agent that needs to read or interact with the web hits the same walls:
+1. **Cloudflare / bot detection** -- headless browsers get blocked
+2. **Authentication** -- sites require login, OAuth, session cookies
+3. **Token bloat** -- raw DOM is 100K+ tokens; agents need ~5K
+4. **Two consumers, same need** -- research agents (read pages) and personal assistants (click/type) both need an authenticated browser, but existing tools force you to choose one path
+## The bare- ecosystem
+```
+bareagent  = the brain  (orchestration, LLM loop, memory, retries)
+barebrowse = the eyes + hands  (browse, read, interact with the web)
+```
+barebrowse is a library. bareagent imports it as a capability. barebrowse doesn't know about bareagent. bareagent doesn't know about CDP. Clean boundary. Each ships and tests independently.
+## Success criteria
+1. `browse(url)` returns a pruned ARIA snapshot of any page, authenticated as the user
+2. Zero heavy dependencies -- no Playwright, no Puppeteer, no bundled browser
+3. Works with any installed Chromium-based browser
+4. Headless for research, headed for interaction, hybrid for autonomous agents
+5. Plugs into bareagent as plain tool functions
+6. An agent using barebrowse + bareagent can autonomously research the web and act on pages

package/docs/01-product/prd.md ADDED

@@ -0,0 +1,284 @@
+# barebrowse — Product Requirements Document
+**Version:** 1.0
+**Date:** 2026-02-22
+**Status:** POC
+---
+## What barebrowse is
+A standalone vanilla JavaScript library that gives autonomous agents authenticated access to the web through the user's own Chromium browser. One package, one import, three modes.
+```js
+import { browse } from 'barebrowse';
+const snapshot = await browse('https://any-page.com');
+```
+barebrowse handles: finding the browser, connecting via CDP, injecting cookies, navigating, extracting the ARIA accessibility tree, and pruning it down to what an agent actually needs. The output is a clean, token-efficient snapshot of any web page — authenticated as the real user.
+## What barebrowse is NOT
+- **Not a framework.** No plugin system, no config files, no lifecycle hooks.
+- **Not an MCP server.** But trivially wrappable as one (~30 lines).
+- **Not Playwright.** No bundled browser, no cross-engine abstraction, no 200MB download.
+- **Not an agent.** No LLM, no planning, no orchestration — that's bareagent's job.
+- **Not a scraper.** It browses as the user, not as a bot harvesting data.
+---
+## The Problem
+Every AI agent that needs to read or interact with the web hits the same walls:
+1. **Cloudflare / bot detection** — headless browsers get blocked
+2. **Authentication** — sites require login, OAuth, session cookies
+3. **Token bloat** — raw DOM is 100K+ tokens; agents need ~5K
+4. **Two consumers, same need** — research agents (read pages) and personal assistants (click/type) both need an authenticated browser, but existing tools force you to choose one path
+Existing solutions (Playwright MCP, sweetlink, open-operator, browser-use) are either too heavy, too opinionated, or solve only half the problem.
+## The Insight
+The user already has a browser. It's already logged in. It already passes Cloudflare. Instead of fighting the web with headless stealth tricks, **use what's already there**.
+CDP (Chrome DevTools Protocol) lets us connect to any Chromium-based browser — the same one the user browses with daily. We get their cookies, their sessions, their anti-detection posture, for free.
+---
+## Core Architecture
+### CDP-Direct (Why No Playwright)
+**Decision:** Use CDP over WebSocket directly. No Playwright dependency.
+**Why:**
+- Playwright downloads a bundled Chromium (~200MB). barebrowse uses the browser already installed on the user's machine.
+- Playwright abstracts CDP, but we need CDP directly for all three modes (headless, headed, hybrid) against the user's real browser.
+- Every Playwright API call maps 1:1 to a CDP method. The abstraction adds weight without adding capability for our use case.
+- CDP gives us everything: `Accessibility.getFullAXTree`, `Page.navigate`, `Runtime.evaluate`, `Input.dispatch*Event`, `Network.setCookie`, `Page.captureScreenshot`.
+- The CDP WebSocket client is ~100 lines of vanilla JS. Playwright is ~50,000.
+**What we lose:** Cross-engine support (Firefox, WebKit). CDP only works with Chromium-family browsers (Chrome, Chromium, Edge, Brave, Vivaldi, Arc, Opera). This covers ~80% of desktop browsers. Firefox support could come later via WebDriver BiDi.
+**What we gain:** Zero heavy deps, uses the user's real browser, same code path for headless/headed/hybrid, drastically simpler codebase.
+### ARIA-First (Why Not DOM)
+**Decision:** Use `Accessibility.getFullAXTree` (ARIA/accessibility tree) as the primary page representation, not DOM.
+**Why:**
+- The accessibility tree is the semantic structure of the page — roles, names, states, interactive elements. It's what screen readers see. It's also what agents need.
+- DOM is bloated: wrapper divs, styling, tracking pixels, ad scripts. An agent doesn't need any of that.
+- mcprune already proved this: ARIA snapshots pruned by role achieve 75-95% token reduction on typical pages while preserving all actionable information.
+- CDP's `Accessibility.getFullAXTree` returns the tree directly. No parsing HTML, no building a DOM tree, no traversing nodes.
+- ARIA refs map directly to CDP interaction targets — the agent reads a button in the tree and can click it via the same CDP connection.
+**The pipeline:** CDP connect → authenticate → navigate → ARIA tree → prune → agent gets clean snapshot.
+### Three Modes (Why All Three)
+**Decision:** Headless, headed, and hybrid — not as separate packages or optional features, but as a single flag on the same API.
+**Why they're not bloat:** The CDP conversation is identical regardless of mode. The only difference is how you get a browser process with a debug port. It's one code path with a different entry point:
+```
+headless: spawn chromium --headless=new --remote-debugging-port=N
+headed:   connect to user's already-running browser on debug port
+hybrid:   try headless → detect failure → fall back to headed
+```
+After connection, every CDP command is the same. Three modes = ~20 extra lines in `chromium.js`, not three implementations.
+**When to use each:**
+| Mode | Use case | Example |
+|---|---|---|
+| `headless` | Agent research, background tasks, CI | "Read this article and summarize it" |
+| `headed` | Personal assistant, interactive tasks, auth flows | "Book me a flight on this page" |
+| `hybrid` | Default for autonomous agents | Try headless; if CF-blocked, fall back to headed |
+**Headless is the default.** Most agent tasks are "go read this page." Headed is the escape hatch for when headless fails or the task requires user-visible interaction.
+### Cookie Authentication
+**Decision:** Extract cookies from the user's browser profile and inject via CDP `Network.setCookie`.
+**Why:**
+- The user's browser has active sessions for every site they use. We reuse those sessions instead of building new auth flows.
+- sweet-cookie (npm package) already extracts cookies from Chrome/Firefox/Safari SQLite databases with OS keychain decryption. We use it or vendor the relevant parts.
+- For headed mode, cookies are already present in the browser — no extraction needed.
+- For headless mode, we extract from the user's profile and inject into the headless instance.
+**Limitation:** Cookies expire. This works for existing sessions, not new logins. For sites requiring fresh auth, headed mode with user interaction is the fallback.
+### Pruning (Absorbed from mcprune)
+**Decision:** Port mcprune's role-based ARIA tree pruning into barebrowse as a built-in step, not an optional module.
+**Why:**
+- Pruning is not optional for agent consumption. A raw ARIA tree is still too large for most LLM context windows. Pruning is part of the pipeline, not an afterthought.
+- mcprune's pruning logic is a pure function: takes an ARIA tree, returns a smaller ARIA tree. No browser dependency, no Playwright coupling. It's ~300 lines of role-based tree surgery.
+- By absorbing it, barebrowse becomes a complete "URL in, agent-ready snapshot out" solution. No second package needed.
+**What we port from mcprune:**
+- Role taxonomy (landmarks, interactive, structural, noise)
+- Landmark extraction (main, nav, banner, etc.)
+- Noise removal (ads, tracking, legal boilerplate)
+- Interactive element preservation (buttons, links, inputs)
+- Wrapper collapsing (nested generics, empty groups)
+- Context-aware filtering (search relevance, dedup)
+**What stays in mcprune:** The Playwright MCP proxy architecture. mcprune can continue to exist as a Playwright-based MCP server for users who want that path. But for barebrowse consumers, pruning is built in.
+---
+## API Design
+### Public API
+```js
+import { browse, connect } from 'barebrowse';
+// One-shot: URL in, pruned ARIA snapshot out
+const tree = await browse('https://example.com');
+// With options
+const tree = await browse('https://example.com', {
+  mode: 'hybrid',        // 'headless' (default) | 'headed' | 'hybrid'
+  cookies: true,          // inject user's cookies (default: true)
+  prune: true,            // apply ARIA pruning (default: true)
+  browser: 'chrome',      // which browser profile for cookies
+  timeout: 30000,         // navigation timeout ms
+});
+// Long-lived session for interaction
+const page = await connect({ mode: 'headed' });
+await page.goto('https://amazon.com/cart');
+await page.click('[data-action="checkout"]');
+await page.type('#gift-message', 'Happy birthday!');
+const tree = await page.snapshot();  // ARIA + prune
+await page.close();
+```
+### Design Principles
+1. **One package, one import.** No picking pieces. `browse()` does everything. Power users get `connect()` for long-lived sessions.
+2. **Batteries included.** Cookies, ARIA, pruning — all happen inside by default. Disable with flags if you want raw access.
+3. **Escape hatches.** `connect()` returns an object with the raw CDP connection accessible. If you need something we don't wrap, you can send CDP commands directly.
+4. **Progressive complexity.** `browse(url)` for 90% of use cases. Options object for the rest. `connect()` for interactive sessions.
+---
+## The bare- Ecosystem
+```
+bareagent   = the brain  (orchestration, planning, memory, retries, tool loop)
+barebrowse  = the eyes + hands  (browse, read, interact with the web)
+```
+**Integration with bareagent:**
+```js
+import { Loop } from 'bare-agent';
+import { browse } from 'barebrowse';
+const tools = [
+  { name: 'browse', execute: ({ url }) => browse(url) },
+];
+const loop = new Loop({ provider });
+await loop.run([{ role: 'user', content: 'Find the cheapest flight to Tokyo' }], tools);
+```
+bareagent handles the think/act/observe loop. barebrowse handles "see the web and act on it." Neither is opinionated about the other. Tools are plain functions.
+**Integration with multis:**
+multis (personal assistant) uses barebrowse in headed mode for interactive tasks. The multis proxy is already running, providing a desktop session. barebrowse connects to the user's Chrome and drives it on behalf of the assistant.
+**MCP server wrapper (future):**
+barebrowse is not an MCP server, but wrapping it as one is ~30 lines. This would replace Playwright MCP + mcprune proxy with a single, lighter MCP server.
+---
+## Decisions Log — Why We Chose Each
+This section exists so we don't re-debate settled decisions.
+| Decision | Choice | Why | Alternative considered | Why not |
+|---|---|---|---|---|
+| Browser protocol | CDP direct | Uses user's browser, ~100 lines, all 3 modes | Playwright | 200MB download, bundles its own Chromium, abstracts what we need raw |
+| Page representation | ARIA tree | Semantic, token-efficient, what agents need | DOM/HTML | Bloated, noisy, needs heavy parsing |
+| Pruning | Built-in | Agents always need pruned output | Optional/separate | Two deps for one job, pruning isn't optional |
+| Cookie auth | Own auth.js + CDP inject | User's existing sessions (Firefox or Chromium), cross-browser injection into headless Chromium | OAuth/credential storage | Complex, security liability, reinventing what the browser already solved |
+| Three modes | One flag | Same CDP code, ~20 lines difference | Separate packages | Same code, artificial separation |
+| Chromium only | CDP constraint | ~80% browser share, user's real browser | Cross-browser (Playwright) | Requires Playwright, loses "use your own browser" benefit |
+| Anti-detection | Runtime.evaluate patches | Minimal stealth for headless mode | Full stealth framework | Over-engineering; headless + real cookies handles 90% |
+| Daemon/server | None | CDP is direct, no intermediary needed | sweetlink daemon pattern | Unnecessary complexity for local agent→browser |
+| Framework | None (vanilla JS) | Matches bare- philosophy, zero deps | Express/Fastify wrapper | Not a server, not needed |
+| Language | Vanilla JavaScript | Node.js ecosystem, same as bareagent, CDP libs available | TypeScript | Added build step, not needed for POC; can add types later |
+| Naming | chromium.js | Covers all Chromium-family browsers, not just Chrome | chrome.js | Too specific; Brave/Edge/Arc are also targets |
+| mcprune integration | Absorb pruning logic | One package does it all, mcprune pruning is a pure function | Keep separate | Agents shouldn't need two packages to browse |
+| openclaw lesson | Single bridge protocol | One CDP connection vs many API integrations | Direct multi-API | openclaw proved this fails — bloat, maintenance, fragility |
+---
+## Future Features (Post-POC)
+### Near-term
+- **Screenshot capture** — `Page.captureScreenshot` via CDP. Useful for visual verification and multimodal agents.
+- **Network interception** — `Network.requestWillBeSent` / `Network.responseReceived` for monitoring page loads. Detect redirects, blocked resources, API calls.
+- **Wait strategies** — `waitForNavigation()` done (Page.loadEventFired). Still needed: network idle, element presence polling.
+- **Tab management** — Multiple pages in one browser session. CDP `Target.createTarget` / `Target.attachToTarget`.
+- **MCP server wrapper** — Expose browse/click/type as MCP tools. Replaces Playwright MCP + mcprune combo.
+### Medium-term
+- **Firefox support** — Via WebDriver BiDi protocol (cross-browser standard, still maturing). Second protocol adapter alongside CDP.
+- **Cookie sync** — In hybrid mode, extract fresh cookies from headed session and cache for future headless use. Self-refreshing auth.
+- **Selector discovery** — Port sweetlink's `discoverSelectors` — crawl ARIA tree, score interactive elements, return ranked action targets.
+- **Form understanding** — Detect forms in ARIA tree, map fields to semantic purposes, enable agents to fill forms intelligently.
+- **Proxy/Tor support** — Route headless browser through proxy for geo-restricted content.
+### Long-term
+- **Profile management** — Multiple browser profiles for different identities/accounts.
+- **Session recording/replay** — Record browsing sessions as CDP commands, replay for testing.
+- **Visual grounding** — Combine ARIA tree with screenshot regions for multimodal agents.
+- **Agent memory integration** — Remember visited pages, cache snapshots, track which sites need headed mode.
+---
+## Repos Studied — What We Borrowed and Why
+| Repo | What we took | What we skipped |
+|---|---|---|
+| **steipete/sweet-cookie** | Cookie extraction from browser profiles, OS keychain decryption | Nothing — clean, focused library |
+| **steipete/sweetlink** | CDP dual-channel concept, selector discovery scoring, click/command patterns | Daemon architecture, WebSocket bridge, in-page runtime injection, HMAC auth |
+| **steipete/canvas** | Stealth/anti-detection config patterns | Go implementation (we're JS) |
+| **nichochar/open-operator** | AI agent web automation patterns | Full framework, too opinionated |
+| **AntlerClaw/playwright-mcp** | How to expose browser as MCP tools | Playwright dependency |
+| **AntlerClaw/mcp-browser-use** | MCP-native browser patterns | Heavy deps |
+| **AitchKay/chromancer** | Accessibility tree extraction approach | Different stack |
+| **mcprune (own)** | ARIA pruning logic — role taxonomy, landmark extraction, noise removal, wrapper collapsing | Playwright dependency, MCP proxy architecture |
+| **openclaw (own)** | Lesson learned: multi-API direct integration = bloat. Use a single bridge protocol | Everything — the architecture was the cautionary tale |
+### The openclaw lesson
+openclaw tried to integrate 10+ messaging APIs directly — each with its own auth, format, quirks. It became a maintenance nightmare. multis solved the same problem by using Beeper/Matrix as a single bridge.
+barebrowse applies the same lesson: instead of integrating Playwright + Puppeteer + WebDriver + stealth plugins + cookie libraries + proxy managers, we use **one protocol (CDP) to one browser (the user's)**. Everything else is unnecessary.
+---
+## Success Criteria
+barebrowse succeeds when:
+1. `browse(url)` returns a pruned ARIA snapshot of any page, authenticated as the user
+2. Zero heavy dependencies — no Playwright, no Puppeteer, no bundled browser
+3. Works with any installed Chromium-based browser
+4. Headless for research, headed for interaction, hybrid for autonomous agents
+5. Plugs into bareagent as plain tool functions
+6. Total source under 1,000 lines for core functionality
+7. An agent using barebrowse + bareagent can autonomously research the web and act on pages
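Reviewer note on the PRD above: the pruning section describes a pure function that takes an ARIA tree and returns a smaller one, preserving interactive elements and collapsing wrappers. A toy version of that idea is sketched below. The node shape (`{ role, name, children }`), the two role sets, and the two rules are illustrative assumptions; they are not mcprune's role taxonomy and not the pipeline in `src/prune.js`.

```javascript
// Illustrative only: a two-rule pruner, far simpler than the real pipeline.

// Roles an agent can act on; always kept. (Assumed subset for the example.)
const INTERACTIVE = new Set(['button', 'link', 'textbox', 'checkbox', 'combobox']);

// Roles that carry no meaning on their own when unnamed. (Assumed subset.)
const WRAPPERS = new Set(['generic', 'group', 'none']);

// Returns the list of nodes that should replace `node` in its parent:
//   []         the node is dropped
//   [node']    the node is kept, with pruned children
//   children   the node is an unnamed wrapper, so its children are hoisted
function prune(node) {
  const children = (node.children || []).flatMap((child) => prune(child));
  if (WRAPPERS.has(node.role) && !node.name) {
    return children; // collapse the wrapper
  }
  if (!INTERACTIVE.has(node.role) && !node.name && children.length === 0) {
    return []; // empty, unnamed, non-interactive leaf
  }
  return [{ role: node.role, name: node.name || '', children }];
}

// Example input: a main landmark with a wrapped button and two empty nodes.
const tree = {
  role: 'main',
  name: '',
  children: [
    { role: 'generic', children: [{ role: 'button', name: 'Buy' }] },
    { role: 'generic', children: [] },
    { role: 'paragraph', name: '' },
  ],
};
const pruned = prune(tree);
```

Because the function touches no browser state, it can be unit-tested on plain objects, which matches the PRD's argument for absorbing the pruning step as a pure function.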

package/docs/03-logs/bug-log.md ADDED

@@ -0,0 +1,16 @@
+# Bug Log
+Track bugs: symptom, root cause, fix, regression test.
+---
+*No bugs logged yet. When one is found, add an entry:*
+```
+## [date] Short description
+**Symptom:** What the user/test observed
+**Root cause:** Why it happened
+**Fix:** What was changed (file:line)
+**Regression test:** Which test prevents recurrence
+```

package/docs/03-logs/decisions-log.md ADDED

@@ -0,0 +1,32 @@
+# Decisions Log
+Settled decisions. Don't re-debate these -- see rationale column.
+## Founding decisions (v0.1.0)
+| # | Decision | Choice | Why | Alternative | Why not |
+|---|----------|--------|-----|-------------|---------|
+| 1 | Browser protocol | CDP direct | Uses user's browser, ~100 lines, all 3 modes | Playwright | 200MB download, bundles its own Chromium, abstracts what we need raw |
+| 2 | Page representation | ARIA tree | Semantic, token-efficient, what agents need | DOM/HTML | Bloated, noisy, needs heavy parsing |
+| 3 | Pruning | Built-in | Agents always need pruned output | Optional/separate | Two deps for one job, pruning isn't optional |
+| 4 | Cookie auth | Own auth.js + CDP inject | User's existing sessions (Firefox or Chromium), cross-browser injection | OAuth/credential storage | Complex, security liability, reinventing what the browser already solved |
+| 5 | Three modes | One flag | Same CDP code, ~20 lines difference | Separate packages | Same code, artificial separation |
+| 6 | Chromium only | CDP constraint | ~80% browser share, user's real browser | Cross-browser (Playwright) | Requires Playwright, loses "use your own browser" benefit |
+| 7 | Framework | None (vanilla JS) | Matches bare- philosophy, zero deps | Express/Fastify wrapper | Not a server, not needed |
+| 8 | Language | Vanilla JavaScript | Node.js ecosystem, same as bareagent, CDP libs available | TypeScript | Added build step, not needed; can add types later |
+| 9 | mcprune integration | Absorb pruning logic | One package does it all, mcprune pruning is a pure function | Keep separate | Agents shouldn't need two packages to browse |
+| 10 | Daemon/server | None | CDP is direct, no intermediary needed | sweetlink daemon pattern | Unnecessary complexity for local agent-to-browser |
+## v0.2.0 decisions
+| # | Decision | Choice | Why | Alternative | Why not |
+|---|----------|--------|-----|-------------|---------|
+| 11 | Anti-detection | Runtime.evaluate patches | Minimal stealth for headless mode | Full stealth framework | Over-engineering; headless + real cookies handles 90% |
+| 12 | sweet-cookie | Wrote own auth.js | sweet-cookie not on npm (different package). Our version is simpler, tailored, vanilla JS | Use sweet-cookie | Not available as npm package |
+| 13 | MCP server | Raw JSON-RPC, no SDK | Zero deps, ~200 lines. SDK adds weight without capability for stdio | @modelcontextprotocol/sdk | Unnecessary dependency for simple JSON-RPC |
+| 14 | bareagent adapter | Action tools auto-return snapshot | LLM always sees result without extra tool call. 300ms settle for DOM updates | Return 'ok' like MCP | Different tradeoff -- bareagent tool calls are expensive (LLM round-trip) |
+| 15 | MCP action tools | Return 'ok', agent calls snapshot | MCP tool calls are cheap to chain. Avoids double-token output | Auto-return snapshot | Would bloat every action response |
+---
+*Add new decisions below this line. Include date, context, and rationale.*
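Reviewer note on decisions 1 and 13 above: both rest on the claim that a raw request/response client is small. The core of such a client is matching replies to requests by `id`, which CDP and JSON-RPC 2.0 both use. The sketch below shows only that correlation step, with the transport injected so it runs without a browser. The class name `RpcClient` and its methods are hypothetical; this is not the code in `src/cdp.js` or `mcp-server.js`.

```javascript
// Hypothetical sketch of id-based request/response correlation.

class RpcClient {
  // `sendRaw` is any function that ships a string to the peer, for example
  // (s) => ws.send(s) with Node 22's global WebSocket.
  constructor(sendRaw) {
    this.sendRaw = sendRaw;
    this.nextId = 1;
    this.pending = new Map(); // id -> { resolve, reject }
  }

  // Send a command such as 'Page.navigate' and resolve with its result.
  send(method, params = {}) {
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.sendRaw(JSON.stringify({ id, method, params }));
    });
  }

  // Feed every incoming message here. Returns true when the message was a
  // reply to a pending request, false for events or unknown ids.
  handleMessage(raw) {
    const msg = JSON.parse(raw);
    if (msg.id === undefined) return false; // an event, not a reply
    const entry = this.pending.get(msg.id);
    if (!entry) return false;
    this.pending.delete(msg.id);
    if (msg.error) entry.reject(new Error(msg.error.message));
    else entry.resolve(msg.result);
    return true;
  }
}

// Usage with a fake transport that just records outgoing frames.
const sentFrames = [];
const client = new RpcClient((frame) => sentFrames.push(frame));
const navigation = client.send('Page.navigate', { url: 'https://example.com' });
```

Wiring this to a real browser would mean passing the WebSocket's `send` as the transport and calling `handleMessage` from its `message` event; event dispatch, timeouts, and session ids are left out here.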

package/docs/03-logs/implementation-log.md ADDED

@@ -0,0 +1,54 @@
+# Implementation Log
+Chronological record of what changed and why. For detailed changelogs, see `/CHANGELOG.md`.
+---
+## v0.2.1 (2026-02-22)
+- README rewritten: no code blocks, obstacle table, two usage paths (MCP vs framework)
+- MCP auto-installer: `npx barebrowse install` detects Claude Desktop, Cursor, Claude Code
+- MCP config uses `npx` instead of local file paths
+## v0.2.0 (2026-02-22)
+Major release: agent integration layer.
+**New modules:**
+- `mcp-server.js` -- JSON-RPC 2.0 over stdio, 7 tools, singleton session
+- `src/bareagent.js` -- tool adapter for bareagent Loop, 9 tools, auto-snapshot
+- `src/stealth.js` -- navigator patches for headless anti-detection
+- `cli.js` -- `npx barebrowse mcp|install|browse`
+**New features:**
+- Hybrid mode (try headless, fallback to headed on bot detection)
+- `page.hover(ref)`, `page.select(ref, value)`, `page.screenshot(opts)`
+- `page.waitForNetworkIdle(opts)` -- resolve when no pending requests
+- SPA-aware `waitForNavigation()`
+**Docs:**
+- `barebrowse.context.md` -- LLM integration guide
+- `docs/testing.md` -- test pyramid, all 54 tests
+- `docs/blueprint.md` -- full pipeline, module table
+**Tests:** 54 passing (was 47)
+## v0.1.0 (2026-02-22)
+Initial release. CDP-direct browsing with ARIA snapshots.
+**Core modules (8):**
+- `src/index.js` -- `browse()`, `connect()` API
+- `src/cdp.js` -- WebSocket CDP client
+- `src/chromium.js` -- browser discovery and launch
+- `src/aria.js` -- ARIA tree formatting
+- `src/auth.js` -- cookie extraction (Firefox SQLite, Chromium AES + keyring)
+- `src/prune.js` -- 9-step pruning pipeline (ported from mcprune)
+- `src/interact.js` -- click, type, press, scroll
+- `src/consent.js` -- cookie consent auto-dismiss (7 languages, 16+ sites)
+**Tests:** 47 passing across 5 files
+---
+*Add new entries at the top. Include version, date, and what changed.*

package/docs/03-logs/insights.md ADDED

@@ -0,0 +1,35 @@
+# Insights
+Lessons learned, patterns discovered, things to remember.
+---
+## The openclaw lesson
+openclaw tried to integrate 10+ messaging APIs directly -- each with its own auth, format, quirks. It became a maintenance nightmare. multis solved the same problem by using Beeper/Matrix as a single bridge.
+barebrowse applies the same lesson: instead of integrating Playwright + Puppeteer + WebDriver + stealth plugins + cookie libraries + proxy managers, we use **one protocol (CDP) to one browser (the user's)**. Everything else is unnecessary.
+**Takeaway:** When possible, find a single bridge protocol instead of N direct integrations.
+## Repos studied -- what we took and what we skipped
+| Repo | What we took | What we skipped | Why |
+|------|-------------|-----------------|-----|
+| **steipete/sweet-cookie** | Cookie extraction concept (SQLite + keyring) | Nothing | Not on npm. Wrote our own auth.js -- simpler, tailored, vanilla JS |
+| **steipete/sweetlink** | CDP-direct concept | Daemon, WebSocket bridge, in-page runtime, HMAC auth | CDP direct is 100 lines vs ~2,000 |
+| **steipete/canvas** | Stealth/anti-detection patterns | Go implementation | Noted for stealth.js |
+| **mcprune (own)** | Full pruning pipeline port | Playwright dependency, MCP proxy | prune.js is 472 lines, adapted from Playwright YAML to CDP tree |
+| **openclaw (own)** | Cautionary tale | Everything | Multi-API direct integration = bloat |
+## Key technical insights
+- **ARIA tree > DOM** for agent consumption. Semantic, compact, interactive elements are first-class. Token reduction of 47-95% is real.
+- **Cookie consent is solvable** with ARIA tree scanning + a button text corpus in 7 languages. Dialog role detection + global fallback covers >95% of sites.
+- **Headed mode is the ultimate fallback.** When stealth fails, when cookies expire, when CAPTCHAs appear -- connecting to the user's real browser session handles it.
+- **CDP flattened sessions** are the way to go. One WebSocket, multiple targets. The session ID header routes commands to the right tab.
+- **`Page.addScriptToEvaluateOnNewDocument`** runs before any page scripts -- perfect for stealth patches without race conditions.
+---
+*Add new insights as they emerge. These should be durable lessons, not session notes.*
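
The flattened-session insight above can be sketched as a small message router. This is an illustration under stated assumptions, not the package's `src/cdp.js`: `createCdpRouter`, `sendRaw`, and the fake transport are hypothetical. The CDP parts it relies on are that `Target.attachToTarget` with `flatten: true` yields a `sessionId`, that the `sessionId` travels as a top-level field on each message, and that `Page.addScriptToEvaluateOnNewDocument` takes a `source` string.

```javascript
// Hypothetical router: one transport, many targets, routed by sessionId.
function createCdpRouter(sendRaw) {
  let nextId = 1;
  const pending = new Map(); // command id -> { resolve, reject }
  const listeners = new Map(); // "sessionId:method" -> event handler

  return {
    // Send a command. With a sessionId it is addressed to that tab;
    // without one it goes to the browser-level target.
    send(method, params = {}, sessionId) {
      const id = nextId++;
      const message = { id, method, params };
      if (sessionId) message.sessionId = sessionId;
      return new Promise((resolve, reject) => {
        pending.set(id, { resolve, reject });
        sendRaw(JSON.stringify(message));
      });
    },

    // Register a handler for one event on one session.
    on(sessionId, method, handler) {
      listeners.set(`${sessionId ?? ''}:${method}`, handler);
    },

    // Feed every incoming WebSocket frame through here.
    receive(raw) {
      const msg = JSON.parse(raw);
      if (msg.id !== undefined) {
        // Response to a command we sent earlier.
        const entry = pending.get(msg.id);
        if (!entry) return;
        pending.delete(msg.id);
        if (msg.error) entry.reject(new Error(msg.error.message));
        else entry.resolve(msg.result);
      } else {
        // Event: route by session and method name.
        const handler = listeners.get(`${msg.sessionId ?? ''}:${msg.method}`);
        if (handler) handler(msg.params);
      }
    },
  };
}

// Usage with a fake transport that records outgoing frames.
const sentFrames = [];
const router = createCdpRouter((frame) => sentFrames.push(JSON.parse(frame)));

// Attach in flat mode, then install a stealth patch on that tab only.
router.send('Target.attachToTarget', { targetId: 'TARGET-1', flatten: true });
router.send(
  'Page.addScriptToEvaluateOnNewDocument',
  { source: "Object.defineProperty(navigator, 'webdriver', { get: () => undefined });" },
  'SESSION-1'
);
```

Because routing happens on fields inside each message, a second tab needs only another `sessionId`, not a second WebSocket connection.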