npm - barebrowse - Versions diffs - 0.2.2 → 0.3.1 - Mend

barebrowse 0.2.2 → 0.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (24) hide show

package/.claude/skills/barebrowse/SKILL.md +107 -0
package/CHANGELOG.md +57 -0
package/CLAUDE.md +4 -2
package/README.md +52 -5
package/barebrowse.context.md +27 -8
package/cli.js +289 -48
package/docs/00-context/assumptions.md +38 -0
package/docs/{blueprint.md → 00-context/system-state.md} +30 -5
package/docs/00-context/vision.md +52 -0
package/docs/01-product/prd.md +284 -0
package/docs/03-logs/bug-log.md +16 -0
package/docs/03-logs/decisions-log.md +32 -0
package/docs/03-logs/implementation-log.md +54 -0
package/docs/03-logs/insights.md +35 -0
package/docs/03-logs/validation-log.md +123 -0
package/docs/04-process/definition-of-done.md +31 -0
package/docs/04-process/dev-workflow.md +68 -0
package/docs/{testing.md → 04-process/testing.md} +21 -2
package/docs/README.md +55 -0
package/docs/archive/poc-plan.md +230 -0
package/package.json +1 -1
package/src/aria.js +1 -1
package/src/daemon.js +321 -0
package/src/session-client.js +70 -0

package/.claude/skills/barebrowse/SKILL.md ADDED Viewed

@@ -0,0 +1,107 @@
+---
+name: barebrowse
+description: Browser automation using the user's real browser with real cookies. Handles consent walls, login sessions, and bot detection automatically.
+allowed-tools: Bash(barebrowse:*)
+---
+# barebrowse CLI — Browser Automation for Agents
+Browse any URL using the user's real browser with real cookies. Returns pruned ARIA snapshots (40-90% smaller than raw) with `[ref=N]` markers for interaction. Handles cookie consent, login sessions, and bot detection automatically.
+## Quick Start
+```bash
+barebrowse open https://example.com    # Start session + navigate
+barebrowse snapshot                    # Get ARIA snapshot → .barebrowse/page-*.yml
+barebrowse click 8                     # Click element with ref=8
+barebrowse snapshot                    # See result
+barebrowse close                       # End session
+```
+All output files go to `.barebrowse/` in the current directory. Read them with the Read tool when needed.
+## Commands
+### Session Lifecycle
+| Command | Description |
+|---------|-------------|
+| `barebrowse open [url] [flags]` | Start browser session. Optionally navigate to URL. |
+| `barebrowse close` | Close session and kill browser. |
+| `barebrowse status` | Check if session is running. |
+**Open flags:**
+- `--mode=headless|headed|hybrid` — Browser mode (default: headless)
+- `--no-cookies` — Skip cookie injection
+- `--browser=firefox|chromium` — Cookie source
+- `--prune-mode=act|read` — Default pruning mode
+- `--timeout=N` — Navigation timeout in ms
+### Navigation
+| Command | Output |
+|---------|--------|
+| `barebrowse goto <url>` | Navigates, waits for load, dismisses consent. Prints "ok". |
+| `barebrowse snapshot` | ARIA snapshot → `.barebrowse/page-<timestamp>.yml` |
+| `barebrowse snapshot --mode=read` | Read mode: keeps all text (for content extraction) |
+| `barebrowse screenshot` | Screenshot → `.barebrowse/screenshot-<timestamp>.png` |
+### Interaction
+| Command | Description |
+|---------|-------------|
+| `barebrowse click <ref>` | Click element (scrolls into view first) |
+| `barebrowse type <ref> <text>` | Type text into element |
+| `barebrowse fill <ref> <text>` | Clear existing content + type new text |
+| `barebrowse press <key>` | Press key: Enter, Tab, Escape, Backspace, Delete, arrows, Space |
+| `barebrowse scroll <deltaY>` | Scroll page (positive=down, negative=up) |
+| `barebrowse hover <ref>` | Hover over element (triggers tooltips) |
+| `barebrowse select <ref> <value>` | Select dropdown option |
+### Debugging
+| Command | Output |
+|---------|--------|
+| `barebrowse eval <expression>` | Evaluate JS in page, print result |
+| `barebrowse wait-idle` | Wait for network idle (no requests for 500ms) |
+| `barebrowse console-logs` | Console logs → `.barebrowse/console-<timestamp>.json` |
+| `barebrowse network-log` | Network log → `.barebrowse/network-<timestamp>.json` |
+| `barebrowse network-log --failed` | Only failed/4xx/5xx requests |
+## Snapshot Format
+The snapshot is a YAML-like ARIA tree. Each line is one node:
+```
+- WebArea "Example Domain" [ref=1]
+  - heading "Example Domain" [level=1] [ref=3]
+  - paragraph [ref=5]
+    - StaticText "This domain is for use in illustrative examples." [ref=6]
+  - link "More information..." [ref=8]
+```
+- `[ref=N]` — Use this number with click, type, fill, hover, select
+- Refs change on every snapshot — always take a fresh snapshot before interacting
+- **act mode** (default): interactive elements + labels — for clicking, typing, navigating
+- **read mode**: all text content — for reading articles, extracting data
+## Workflow Pattern
+1. `barebrowse open <url>` — start session
+2. `barebrowse snapshot` — observe page (read the .yml file)
+3. Decide action based on snapshot content
+4. `barebrowse click/type/fill/press/scroll <ref>` — act
+5. `barebrowse snapshot` — observe result (refs are now different!)
+6. Repeat 3-5 until goal achieved
+7. `barebrowse close` — clean up
+## Tips
+- **Always snapshot before interacting** — refs are ephemeral and change every time
+- **Use `fill` instead of `type`** when replacing existing text in input fields
+- **Use `--mode=read`** for snapshot when you need to extract article content or data
+- **Check `console-logs`** when page behavior seems wrong — JS errors show up there
+- **Check `network-log --failed`** to debug missing content or broken API calls
+- **Use `eval`** as an escape hatch when ARIA tree doesn't show what you need
+- **One session per project** — `.barebrowse/` is project-scoped
+- For bot-detected sites, use `--mode=headed` (requires browser with `--remote-debugging-port=9222`)

package/CHANGELOG.md CHANGED Viewed

@@ -1,5 +1,62 @@
 # Changelog
+## 0.3.1
+- Fix `.npmignore`: exclude `.claude/memory/`, `.claude/stash/`, `.claude/settings.local.json` from package (leaked in 0.3.0)
+## 0.3.0
+CLI session mode. Shell commands that output to disk — coding agents read files when needed instead of getting full snapshots in every tool response. ~4x more token-efficient than MCP for multi-step browsing flows.
+### New: CLI session commands
+- `barebrowse open [url] [flags]` — spawn background daemon holding a `connect()` session
+- `barebrowse close` / `status` — session lifecycle
+- `barebrowse goto <url>` — navigate
+- `barebrowse snapshot [--mode=act|read]` — ARIA snapshot → `.barebrowse/page-*.yml`
+- `barebrowse screenshot [--format]` — screenshot → `.barebrowse/screenshot-*.png`
+- `barebrowse click/type/fill/press/scroll/hover/select` — all interactions from connect() API
+- Open flags: `--mode`, `--port`, `--no-cookies`, `--browser`, `--timeout`, `--prune-mode`, `--no-consent`
+### New: agent self-sufficiency
+- `barebrowse eval <expression>` — run JS in page context via `Runtime.evaluate`
+- `barebrowse console-logs [--level --clear]` — dump captured console logs → `.barebrowse/console-*.json`
+- `barebrowse network-log [--failed]` — dump network requests → `.barebrowse/network-*.json`
+- `barebrowse wait-idle [--timeout]` — wait for network idle
+### New: daemon architecture (`src/daemon.js` + `src/session-client.js`)
+- Background HTTP server on random localhost port, holding a `connect()` session
+- Spawned as detached child process, communicates via `session.json`
+- Console capture via `Runtime.consoleAPICalled`
+- Network capture via `Network.requestWillBeSent` / `responseReceived` / `loadingFailed`
+- Graceful shutdown on `close` command or SIGTERM
+### New: SKILL.md for Claude Code
+- `.claude/skills/barebrowse/SKILL.md` — skill definition + full CLI command reference
+- `barebrowse install --skill` — copies SKILL.md to `~/.config/claude/skills/barebrowse/`
+### Fixed: MCP setup instructions
+- README now has per-client instructions: Claude Code (`claude mcp add`), Claude Desktop/Cursor (`npx barebrowse install`), VS Code (`.vscode/mcp.json`)
+- `install` command no longer writes `.mcp.json` for Claude Code — prints `claude mcp add` hint instead
+### Fixed: ARIA tree formatting (`src/aria.js`)
+- Ignored nodes joined children with empty string instead of newline, causing sibling subtrees to concatenate on one line
+- Fixed to `.filter(Boolean).join('\n')`
+### Changed
+- `cli.js` — expanded from 3 commands to full dispatch table (20+ commands)
+- `barebrowse.context.md` — added CLI as third integration path, updated MCP setup
+- `README.md` — "Two ways" → "Three ways", added CLI section
+### Docs
+- `docs/04-process/testing.md` — updated to 64 tests, added CLI test section
+- `docs/00-context/system-state.md` — added daemon/session-client to module table, CLI to integrations
+- `docs/03-logs/validation-log.md` — full CLI manual validation results
+### Tests
+- 64 tests passing (was 54 in 0.2.x)
+- New: `test/integration/cli.test.js` (10 tests) — full open → snapshot → goto → click → eval → console → network → close cycle
+- All existing 54 tests unchanged and passing
 ## 0.2.1
 - README rewritten: no code blocks, full obstacle course table with mode column, two usage paths (MCP vs framework), mcprune credited, measured token savings, context.md as code reference

package/CLAUDE.md CHANGED Viewed

@@ -12,11 +12,13 @@
 ## Project Specifics
+- **What:** Vanilla JS library — CDP-direct browsing for autonomous agents. URL in, pruned ARIA snapshot out.
 - **Language:** Vanilla JavaScript, ES modules, no build step
 - **Runtime:** Node.js >= 22 (built-in WebSocket, sqlite)
 - **Protocol:** CDP (Chrome DevTools Protocol) direct — no Playwright
 - **Browser:** Any installed Chromium-based browser (chromium, chrome, brave, edge)
-- **Key files:** `src/index.js` (API), `src/cdp.js` (CDP client), `src/chromium.js` (browser launch), `src/aria.js` (ARIA formatting)
-- **Docs:** `docs/prd.md` (decisions + rationale), `docs/poc-plan.md` (phases + DoD)
+- **Modules:** 11 files in `src/`, ~2,400 lines, zero required deps
+- **Tests:** 54 passing — run with `node --test test/unit/*.test.js test/integration/*.test.js`
+- **Docs:** `docs/README.md` (navigation guide to all documentation)
 For full development and testing standards, see `.claude/memory/AGENT_RULES.md`.

package/README.md CHANGED Viewed

@@ -30,18 +30,65 @@ npm install barebrowse
 Requires Node.js >= 22 and any installed Chromium-based browser.
-## Two ways to use it
+## Three ways to use it
-### 1. MCP server -- for Claude Desktop, Cursor, Claude Code
+### 1. CLI session -- for coding agents and quick testing
+```bash
+barebrowse open https://example.com    # Start session + navigate
+barebrowse snapshot                    # ARIA snapshot → .barebrowse/page-*.yml
+barebrowse click 8                     # Click element
+barebrowse close                       # End session
 ```
-npm install -g barebrowse
+Outputs go to `.barebrowse/` as files -- agents read them with their file tools, no token waste in tool responses. Install the skill for Claude Code:
+```bash
+barebrowse install --skill
+# or: claude mcp add barebrowse -- npx barebrowse mcp
+```
+Full command reference: [.claude/skills/barebrowse/SKILL.md](.claude/skills/barebrowse/SKILL.md)
+### 2. MCP server -- for Claude Desktop, Cursor, and other MCP clients
+**Claude Code:**
+```bash
+claude mcp add barebrowse -- npx barebrowse mcp
+```
+**Claude Desktop / Cursor:**
+```bash
 npx barebrowse install
 ```
-That's it. `install` auto-detects your MCP client and writes the config. No manual JSON editing. Restart your client and you have 7 browsing tools: `browse`, `goto`, `snapshot`, `click`, `type`, `press`, `scroll`.
+Or manually add to your config (`claude_desktop_config.json`, `.cursor/mcp.json`):
+```json
+{
+  "mcpServers": {
+    "barebrowse": {
+      "command": "npx",
+      "args": ["barebrowse", "mcp"]
+    }
+  }
+}
+```
+**VS Code (`.vscode/mcp.json`):**
+```json
+{
+  "servers": {
+    "barebrowse": {
+      "command": "npx",
+      "args": ["barebrowse", "mcp"]
+    }
+  }
+}
+```
+7 tools: `browse`, `goto`, `snapshot`, `click`, `type`, `press`, `scroll`.
-### 2. Framework -- for agentic automation
+### 3. Library -- for agentic automation
 Import barebrowse in your agent code. One-shot reads, interactive sessions, full observe-think-act loops. Works with any LLM orchestration library. Ships with a ready-made adapter for [bareagent](https://www.npmjs.com/package/bare-agent) (9 tools, auto-snapshot after every action).

package/barebrowse.context.md CHANGED Viewed

@@ -1,7 +1,7 @@
 # barebrowse -- Integration Guide
 > For AI assistants and developers wiring barebrowse into a project.
-> v0.1.0 | Node.js >= 22 | 0 required deps | MIT
+> v0.3.0 | Node.js >= 22 | 0 required deps | MIT
 ## What this is
@@ -13,9 +13,10 @@ No Playwright. No bundled browser. No build step. Vanilla JS, ES modules.
 npm install barebrowse
 ```
-Two entry points:
-- `import { browse } from 'barebrowse'` -- one-shot: URL in, snapshot out
-- `import { connect } from 'barebrowse'` -- session: navigate, interact, observe
+Three integration paths:
+1. **Library:** `import { browse, connect } from 'barebrowse'` -- one-shot or interactive session
+2. **MCP server:** `barebrowse mcp` -- JSON-RPC over stdio for Claude Desktop, Cursor, etc.
+3. **CLI session:** `barebrowse open` / `click` / `snapshot` / `close` -- shell commands, outputs to disk
 ## Which mode do I need?
@@ -174,15 +175,33 @@ try {
 Action tools (click, type, press, scroll, goto) auto-return a fresh snapshot so the LLM always sees the result. 300ms settle delay after actions for DOM updates.
-## MCP wrapper
+## CLI session mode
-barebrowse ships an MCP server for direct use with Claude Desktop, Cursor, or any MCP client.
+For coding agents (Claude Code, Copilot, Cursor) and quick interactive testing. Commands output files to `.barebrowse/` -- agents read them with file tools, avoiding token waste in tool responses.
 ```bash
-npm install barebrowse       # or npm install -g barebrowse
+barebrowse open https://example.com    # Start daemon + navigate
+barebrowse snapshot                    # → .barebrowse/page-<timestamp>.yml
+barebrowse click 8                     # Click element ref=8
+barebrowse type 12 hello world         # Type into element ref=12
+barebrowse screenshot                  # → .barebrowse/screenshot-<timestamp>.png
+barebrowse console-logs                # → .barebrowse/console-<timestamp>.json
+barebrowse close                       # Kill daemon + browser
 ```
-Add to your MCP client config (`.mcp.json`, `claude_desktop_config.json`, etc.):
+Session lifecycle: `open` spawns a background daemon holding a `connect()` session. Subsequent commands POST to the daemon over HTTP (localhost). `close` shuts everything down.
+Full command reference: `.claude/skills/barebrowse/SKILL.md`
+## MCP wrapper
+barebrowse ships an MCP server for direct use with Claude Desktop, Cursor, or any MCP client.
+**Claude Code:** `claude mcp add barebrowse -- npx barebrowse mcp`
+**Claude Desktop / Cursor:** `npx barebrowse install` (auto-detects and writes config)
+**Manual config** (`claude_desktop_config.json`, `.cursor/mcp.json`):
 ```json
 {
   "mcpServers": {