barebrowse 0.2.2 → 0.3.1

@@ -0,0 +1,107 @@
+ ---
+ name: barebrowse
+ description: Browser automation using the user's real browser with real cookies. Handles consent walls, login sessions, and bot detection automatically.
+ allowed-tools: Bash(barebrowse:*)
+ ---
+
+ # barebrowse CLI — Browser Automation for Agents
+
+ Browse any URL using the user's real browser with real cookies. Returns pruned ARIA snapshots (40-90% smaller than raw) with `[ref=N]` markers for interaction. Handles cookie consent, login sessions, and bot detection automatically.
+
+ ## Quick Start
+
+ ```bash
+ barebrowse open https://example.com # Start session + navigate
+ barebrowse snapshot # Get ARIA snapshot → .barebrowse/page-*.yml
+ barebrowse click 8 # Click element with ref=8
+ barebrowse snapshot # See result
+ barebrowse close # End session
+ ```
+
+ All output files go to `.barebrowse/` in the current directory. Read them with the Read tool when needed.
+
+ ## Commands
+
+ ### Session Lifecycle
+
+ | Command | Description |
+ |---------|-------------|
+ | `barebrowse open [url] [flags]` | Start browser session. Optionally navigate to URL. |
+ | `barebrowse close` | Close session and kill browser. |
+ | `barebrowse status` | Check if session is running. |
+
+ **Open flags:**
+ - `--mode=headless|headed|hybrid` — Browser mode (default: headless)
+ - `--no-cookies` — Skip cookie injection
+ - `--browser=firefox|chromium` — Cookie source
+ - `--prune-mode=act|read` — Default pruning mode
+ - `--timeout=N` — Navigation timeout in ms
+
+ ### Navigation
+
+ | Command | Output |
+ |---------|--------|
+ | `barebrowse goto <url>` | Navigates, waits for load, dismisses consent. Prints "ok". |
+ | `barebrowse snapshot` | ARIA snapshot → `.barebrowse/page-<timestamp>.yml` |
+ | `barebrowse snapshot --mode=read` | Read mode: keeps all text (for content extraction) |
+ | `barebrowse screenshot` | Screenshot → `.barebrowse/screenshot-<timestamp>.png` |
+
+ ### Interaction
+
+ | Command | Description |
+ |---------|-------------|
+ | `barebrowse click <ref>` | Click element (scrolls into view first) |
+ | `barebrowse type <ref> <text>` | Type text into element |
+ | `barebrowse fill <ref> <text>` | Clear existing content + type new text |
+ | `barebrowse press <key>` | Press key: Enter, Tab, Escape, Backspace, Delete, arrows, Space |
+ | `barebrowse scroll <deltaY>` | Scroll page (positive=down, negative=up) |
+ | `barebrowse hover <ref>` | Hover over element (triggers tooltips) |
+ | `barebrowse select <ref> <value>` | Select dropdown option |
+
+ ### Debugging
+
+ | Command | Output |
+ |---------|--------|
+ | `barebrowse eval <expression>` | Evaluate JS in page, print result |
+ | `barebrowse wait-idle` | Wait for network idle (no requests for 500ms) |
+ | `barebrowse console-logs` | Console logs → `.barebrowse/console-<timestamp>.json` |
+ | `barebrowse network-log` | Network log → `.barebrowse/network-<timestamp>.json` |
+ | `barebrowse network-log --failed` | Only failed/4xx/5xx requests |
+
+ ## Snapshot Format
+
+ The snapshot is a YAML-like ARIA tree. Each line is one node:
+
+ ```
+ - WebArea "Example Domain" [ref=1]
+ - heading "Example Domain" [level=1] [ref=3]
+ - paragraph [ref=5]
+ - StaticText "This domain is for use in illustrative examples." [ref=6]
+ - link "More information..." [ref=8]
+ ```
+
+ - `[ref=N]` — Use this number with click, type, fill, hover, select
+ - Refs change on every snapshot — always take a fresh snapshot before interacting
+ - **act mode** (default): interactive elements + labels — for clicking, typing, navigating
+ - **read mode**: all text content — for reading articles, extracting data
+
+ ## Workflow Pattern
+
+ 1. `barebrowse open <url>` — start session
+ 2. `barebrowse snapshot` — observe page (read the .yml file)
+ 3. Decide action based on snapshot content
+ 4. `barebrowse click/type/fill <ref>`, `press <key>`, or `scroll <deltaY>` — act
+ 5. `barebrowse snapshot` — observe result (refs are now different!)
+ 6. Repeat steps 3-5 until the goal is achieved
+ 7. `barebrowse close` — clean up
+
+ ## Tips
+
+ - **Always snapshot before interacting** — refs are ephemeral and change every time
+ - **Use `fill` instead of `type`** when replacing existing text in input fields
+ - **Use `--mode=read`** for snapshot when you need to extract article content or data
+ - **Check `console-logs`** when page behavior seems wrong — JS errors show up there
+ - **Check `network-log --failed`** to debug missing content or broken API calls
+ - **Use `eval`** as an escape hatch when the ARIA tree doesn't show what you need
+ - **One session per project** — `.barebrowse/` is project-scoped
+ - For bot-detected sites, use `--mode=headed` (requires a browser started with `--remote-debugging-port=9222`)
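Reviewer note: the snapshot format described in the hunk above can be consumed mechanically. The sketch below is not part of the package. `parseSnapshotLine` and `findRef` are hypothetical helpers, and the sketch assumes every node line has the shape of the five-line example (`- role "name" [key=value] [ref=N]`).

```javascript
// Hypothetical helper, not part of barebrowse: parses one snapshot line
// of the form `- role "name" [key=value] [ref=N]` into a plain object.
function parseSnapshotLine(line) {
  const match = /^\s*-\s+(\S+)(?:\s+"((?:[^"\\]|\\.)*)")?/.exec(line);
  if (!match) return null;
  const node = { role: match[1], name: match[2] ?? null, attrs: {}, ref: null };
  // Collect every [key=value] marker that follows the role and name.
  const rest = line.slice(match[0].length);
  for (const [, key, value] of rest.matchAll(/\[([\w-]+)=([^\]]*)\]/g)) {
    if (key === 'ref') node.ref = Number(value);
    else node.attrs[key] = value;
  }
  return node;
}

// Returns the ref of the first node with the given role and name, or null.
function findRef(snapshotText, role, name) {
  for (const line of snapshotText.split('\n')) {
    const node = parseSnapshotLine(line);
    if (node && node.role === role && node.name === name) return node.ref;
  }
  return null;
}

const sample = [
  '- WebArea "Example Domain" [ref=1]',
  '- heading "Example Domain" [level=1] [ref=3]',
  '- paragraph [ref=5]',
  '- StaticText "This domain is for use in illustrative examples." [ref=6]',
  '- link "More information..." [ref=8]',
].join('\n');

console.log(findRef(sample, 'link', 'More information...')); // 8
```

Because refs change on every snapshot, a ref found this way is only meaningful for the snapshot file it was read from and should be passed to `barebrowse click` before taking another snapshot.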
package/CHANGELOG.md CHANGED
@@ -1,5 +1,62 @@
  # Changelog
 
+ ## 0.3.1
+
+ - Fix `.npmignore`: exclude `.claude/memory/`, `.claude/stash/`, `.claude/settings.local.json` from package (leaked in 0.3.0)
+
+ ## 0.3.0
+
+ CLI session mode. Shell commands that output to disk — coding agents read files when needed instead of getting full snapshots in every tool response. ~4x more token-efficient than MCP for multi-step browsing flows.
+
+ ### New: CLI session commands
+ - `barebrowse open [url] [flags]` — spawn background daemon holding a `connect()` session
+ - `barebrowse close` / `status` — session lifecycle
+ - `barebrowse goto <url>` — navigate
+ - `barebrowse snapshot [--mode=act|read]` — ARIA snapshot → `.barebrowse/page-*.yml`
+ - `barebrowse screenshot [--format]` — screenshot → `.barebrowse/screenshot-*.png`
+ - `barebrowse click/type/fill/press/scroll/hover/select` — all interactions from connect() API
+ - Open flags: `--mode`, `--port`, `--no-cookies`, `--browser`, `--timeout`, `--prune-mode`, `--no-consent`
+
+ ### New: agent self-sufficiency
+ - `barebrowse eval <expression>` — run JS in page context via `Runtime.evaluate`
+ - `barebrowse console-logs [--level --clear]` — dump captured console logs → `.barebrowse/console-*.json`
+ - `barebrowse network-log [--failed]` — dump network requests → `.barebrowse/network-*.json`
+ - `barebrowse wait-idle [--timeout]` — wait for network idle
+
+ ### New: daemon architecture (`src/daemon.js` + `src/session-client.js`)
+ - Background HTTP server on random localhost port, holding a `connect()` session
+ - Spawned as detached child process, communicates via `session.json`
+ - Console capture via `Runtime.consoleAPICalled`
+ - Network capture via `Network.requestWillBeSent` / `responseReceived` / `loadingFailed`
+ - Graceful shutdown on `close` command or SIGTERM
+
+ ### New: SKILL.md for Claude Code
+ - `.claude/skills/barebrowse/SKILL.md` — skill definition + full CLI command reference
+ - `barebrowse install --skill` — copies SKILL.md to `~/.config/claude/skills/barebrowse/`
+
+ ### Fixed: MCP setup instructions
+ - README now has per-client instructions: Claude Code (`claude mcp add`), Claude Desktop/Cursor (`npx barebrowse install`), VS Code (`.vscode/mcp.json`)
+ - `install` command no longer writes `.mcp.json` for Claude Code — prints `claude mcp add` hint instead
+
+ ### Fixed: ARIA tree formatting (`src/aria.js`)
+ - Ignored nodes joined children with empty string instead of newline, causing sibling subtrees to concatenate on one line
+ - Fixed to `.filter(Boolean).join('\n')`
+
+ ### Changed
+ - `cli.js` — expanded from 3 commands to full dispatch table (20+ commands)
+ - `barebrowse.context.md` — added CLI as third integration path, updated MCP setup
+ - `README.md` — "Two ways" → "Three ways", added CLI section
+
+ ### Docs
+ - `docs/04-process/testing.md` — updated to 64 tests, added CLI test section
+ - `docs/00-context/system-state.md` — added daemon/session-client to module table, CLI to integrations
+ - `docs/03-logs/validation-log.md` — full CLI manual validation results
+
+ ### Tests
+ - 64 tests passing (was 54 in 0.2.x)
+ - New: `test/integration/cli.test.js` (10 tests) — full open → snapshot → goto → click → eval → console → network → close cycle
+ - All existing 54 tests unchanged and passing
+
  ## 0.2.1
 
  - README rewritten: no code blocks, full obstacle course table with mode column, two usage paths (MCP vs framework), mcprune credited, measured token savings, context.md as code reference
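Reviewer note: the ARIA formatting fix listed under 0.3.0 is easier to see with a toy renderer. The following is a minimal sketch, not the code in `src/aria.js`. The node shape (`ignored`, `label`, `children`) and the `render` function are assumptions made for illustration. It contrasts joining an ignored node's children with an empty string against `.filter(Boolean).join('\n')`.

```javascript
// Toy model of an ARIA tree renderer; the node shape is hypothetical.
// `joinIgnored` decides how an ignored node stitches its children's output together.
function render(node, joinIgnored, depth = 0) {
  if (node.ignored) {
    // Ignored nodes print nothing themselves and pass their children through.
    return joinIgnored(node.children.map((child) => render(child, joinIgnored, depth)));
  }
  const own = `${'  '.repeat(depth)}- ${node.label}`;
  const kids = node.children.map((child) => render(child, joinIgnored, depth + 1));
  return [own, ...kids].filter(Boolean).join('\n');
}

// Behaviour described as the bug: sibling subtrees end up on one line.
const buggyJoin = (parts) => parts.join('');
// Behaviour described as the fix: drop empty results, one subtree per line.
const fixedJoin = (parts) => parts.filter(Boolean).join('\n');

const tree = {
  label: 'WebArea',
  children: [{
    ignored: true,
    children: [
      { label: 'heading "A"', children: [] },
      { ignored: true, children: [] }, // renders to an empty string
      { label: 'link "B"', children: [] },
    ],
  }],
};

console.log(render(tree, buggyJoin));
console.log(render(tree, fixedJoin));
```

In this model the `filter(Boolean)` step matters as much as the newline: without it, an ignored node with no visible children would contribute a blank line to the output.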
package/CLAUDE.md CHANGED
@@ -12,11 +12,13 @@
 
  ## Project Specifics
 
+ - **What:** Vanilla JS library — CDP-direct browsing for autonomous agents. URL in, pruned ARIA snapshot out.
  - **Language:** Vanilla JavaScript, ES modules, no build step
  - **Runtime:** Node.js >= 22 (built-in WebSocket, sqlite)
  - **Protocol:** CDP (Chrome DevTools Protocol) direct — no Playwright
  - **Browser:** Any installed Chromium-based browser (chromium, chrome, brave, edge)
- - **Key files:** `src/index.js` (API), `src/cdp.js` (CDP client), `src/chromium.js` (browser launch), `src/aria.js` (ARIA formatting)
- - **Docs:** `docs/prd.md` (decisions + rationale), `docs/poc-plan.md` (phases + DoD)
+ - **Modules:** 11 files in `src/`, ~2,400 lines, zero required deps
+ - **Tests:** 54 passing; run with `node --test test/unit/*.test.js test/integration/*.test.js`
+ - **Docs:** `docs/README.md` (navigation guide to all documentation)
 
  For full development and testing standards, see `.claude/memory/AGENT_RULES.md`.
package/README.md CHANGED
@@ -30,18 +30,65 @@ npm install barebrowse
 
  Requires Node.js >= 22 and any installed Chromium-based browser.
 
- ## Two ways to use it
+ ## Three ways to use it
 
- ### 1. MCP server -- for Claude Desktop, Cursor, Claude Code
+ ### 1. CLI session -- for coding agents and quick testing
 
+ ```bash
+ barebrowse open https://example.com # Start session + navigate
+ barebrowse snapshot # ARIA snapshot → .barebrowse/page-*.yml
+ barebrowse click 8 # Click element
+ barebrowse close # End session
  ```
- npm install -g barebrowse
+
+ Outputs go to `.barebrowse/` as files -- agents read them with their file tools, no token waste in tool responses. Install the skill for Claude Code:
+
+ ```bash
+ barebrowse install --skill
+ # or: claude mcp add barebrowse -- npx barebrowse mcp
+ ```
+
+ Full command reference: [.claude/skills/barebrowse/SKILL.md](.claude/skills/barebrowse/SKILL.md)
+
+ ### 2. MCP server -- for Claude Desktop, Cursor, and other MCP clients
+
+ **Claude Code:**
+ ```bash
+ claude mcp add barebrowse -- npx barebrowse mcp
+ ```
+
+ **Claude Desktop / Cursor:**
+ ```bash
  npx barebrowse install
  ```
 
- That's it. `install` auto-detects your MCP client and writes the config. No manual JSON editing. Restart your client and you have 7 browsing tools: `browse`, `goto`, `snapshot`, `click`, `type`, `press`, `scroll`.
+ Or manually add to your config (`claude_desktop_config.json`, `.cursor/mcp.json`):
+ ```json
+ {
+   "mcpServers": {
+     "barebrowse": {
+       "command": "npx",
+       "args": ["barebrowse", "mcp"]
+     }
+   }
+ }
+ ```
+
+ **VS Code (`.vscode/mcp.json`):**
+ ```json
+ {
+   "servers": {
+     "barebrowse": {
+       "command": "npx",
+       "args": ["barebrowse", "mcp"]
+     }
+   }
+ }
+ ```
+
+ 7 tools: `browse`, `goto`, `snapshot`, `click`, `type`, `press`, `scroll`.
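Reviewer note: the two manual configs above differ only in their top-level key (`mcpServers` for Claude Desktop and Cursor, `servers` for VS Code). As a sketch of that difference, the helper below generates either shape. `mcpConfigFor` and the client identifiers it accepts are hypothetical and are not something the package exports.

```javascript
// Hypothetical helper: builds the manual MCP config shown in the README hunk.
// Claude Desktop and Cursor use `mcpServers`; VS Code's `.vscode/mcp.json` uses `servers`.
function mcpConfigFor(client) {
  const entry = { barebrowse: { command: 'npx', args: ['barebrowse', 'mcp'] } };
  switch (client) {
    case 'claude-desktop':
    case 'cursor':
      return { mcpServers: entry };
    case 'vscode':
      return { servers: entry };
    default:
      throw new Error(`Unknown MCP client: ${client}`);
  }
}

// Print the VS Code variant as formatted JSON.
console.log(JSON.stringify(mcpConfigFor('vscode'), null, 2));
```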
 
- ### 2. Framework -- for agentic automation
+ ### 3. Library -- for agentic automation
 
  Import barebrowse in your agent code. One-shot reads, interactive sessions, full observe-think-act loops. Works with any LLM orchestration library. Ships with a ready-made adapter for [bareagent](https://www.npmjs.com/package/bare-agent) (9 tools, auto-snapshot after every action).
 
@@ -1,7 +1,7 @@
  # barebrowse -- Integration Guide
 
  > For AI assistants and developers wiring barebrowse into a project.
- > v0.1.0 | Node.js >= 22 | 0 required deps | MIT
+ > v0.3.0 | Node.js >= 22 | 0 required deps | MIT
 
  ## What this is
 
@@ -13,9 +13,10 @@ No Playwright. No bundled browser. No build step. Vanilla JS, ES modules.
  npm install barebrowse
  ```
 
- Two entry points:
- - `import { browse } from 'barebrowse'` -- one-shot: URL in, snapshot out
- - `import { connect } from 'barebrowse'` -- session: navigate, interact, observe
+ Three integration paths:
+ 1. **Library:** `import { browse, connect } from 'barebrowse'` -- one-shot or interactive session
+ 2. **MCP server:** `barebrowse mcp` -- JSON-RPC over stdio for Claude Desktop, Cursor, etc.
+ 3. **CLI session:** `barebrowse open` / `click` / `snapshot` / `close` -- shell commands, outputs to disk
 
  ## Which mode do I need?
 
@@ -174,15 +175,33 @@ try {
 
  Action tools (click, type, press, scroll, goto) auto-return a fresh snapshot so the LLM always sees the result. 300ms settle delay after actions for DOM updates.
 
- ## MCP wrapper
+ ## CLI session mode
 
- barebrowse ships an MCP server for direct use with Claude Desktop, Cursor, or any MCP client.
+ For coding agents (Claude Code, Copilot, Cursor) and quick interactive testing. Commands output files to `.barebrowse/` -- agents read them with file tools, avoiding token waste in tool responses.
 
  ```bash
- npm install barebrowse # or npm install -g barebrowse
+ barebrowse open https://example.com # Start daemon + navigate
+ barebrowse snapshot # → .barebrowse/page-<timestamp>.yml
+ barebrowse click 8 # Click element ref=8
+ barebrowse type 12 hello world # Type into element ref=12
+ barebrowse screenshot # → .barebrowse/screenshot-<timestamp>.png
+ barebrowse console-logs # → .barebrowse/console-<timestamp>.json
+ barebrowse close # Kill daemon + browser
  ```
 
- Add to your MCP client config (`.mcp.json`, `claude_desktop_config.json`, etc.):
+ Session lifecycle: `open` spawns a background daemon holding a `connect()` session. Subsequent commands POST to the daemon over HTTP (localhost). `close` shuts everything down.
+
+ Full command reference: `.claude/skills/barebrowse/SKILL.md`
+
+ ## MCP wrapper
+
+ barebrowse ships an MCP server for direct use with Claude Desktop, Cursor, or any MCP client.
+
+ **Claude Code:** `claude mcp add barebrowse -- npx barebrowse mcp`
+
+ **Claude Desktop / Cursor:** `npx barebrowse install` (auto-detects and writes config)
+
+ **Manual config** (`claude_desktop_config.json`, `.cursor/mcp.json`):
  ```json
  {
    "mcpServers": {