browserclaw 0.3.2 → 0.3.3
- package/README.md +19 -14
- package/dist/index.cjs +1 -1
- package/dist/index.cjs.map +1 -1
- package/dist/index.js +1 -1
- package/dist/index.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -41,30 +41,35 @@ The snapshot + ref pattern means:
 
 The AI browser automation space is moving fast. Here's how browserclaw compares to the major alternatives.
 
-| | [browserclaw](https://github.com/idan-rubin/browserclaw) | [browser-use](https://github.com/browser-use/browser-use) | [Stagehand](https://github.com/browserbase/stagehand) | [
-
-| Ref → exact element, no guessing | :white_check_mark: | :heavy_minus_sign: | :x: | :
-| No vision model in the loop | :white_check_mark: | :heavy_minus_sign: | :white_check_mark: | :
-| Survives redesigns (semantic, not pixel) | :white_check_mark: | :heavy_minus_sign: | :white_check_mark: | :
-| Fill 10 form fields in one call | :white_check_mark: | :x: | :x: | :x: |
-| Interact with cross-origin iframes | :white_check_mark: | :white_check_mark: | :x: | :x: |
-| Playwright engine (auto-wait, locators) | :white_check_mark: | :x: | :white_check_mark: | :white_check_mark: |
-| Embeddable in your own agent loop | :white_check_mark: | :x: | :heavy_minus_sign: | :x: |
+| | [browserclaw](https://github.com/idan-rubin/browserclaw) | [browser-use](https://github.com/browser-use/browser-use) | [Stagehand](https://github.com/browserbase/stagehand) | [Playwright MCP](https://github.com/microsoft/playwright-mcp) |
+|:---|:---:|:---:|:---:|:---:|
+| Ref → exact element, no guessing | :white_check_mark: | :heavy_minus_sign: | :x: | :white_check_mark: |
+| No vision model in the loop | :white_check_mark: | :heavy_minus_sign: | :white_check_mark: | :white_check_mark: |
+| Survives redesigns (semantic, not pixel) | :white_check_mark: | :heavy_minus_sign: | :white_check_mark: | :white_check_mark: |
+| Fill 10 form fields in one call | :white_check_mark: | :x: | :x: | :x: |
+| Interact with cross-origin iframes | :white_check_mark: | :white_check_mark: | :x: | :x: |
+| Playwright engine (auto-wait, locators) | :white_check_mark: | :x: | :white_check_mark: | :white_check_mark: |
+| Embeddable in your own JS/TS agent loop | :white_check_mark: | :x: | :heavy_minus_sign: | :x: |
 
 :white_check_mark: = Yes  :heavy_minus_sign: = Partial  :x: = No
 
 **browserclaw is the only tool that checks every box.** It combines the precision of accessibility snapshots with Playwright's battle-tested engine, batch operations, cross-origin iframe access, and zero framework lock-in — in a single embeddable library.
 
+### The key distinction: browser tool vs. AI agent
+
+Most tools in this space are **AI agents that happen to control a browser**. They own the intelligence layer: they take a task, call an LLM, decide what actions to take, and execute them. That's a complete agent.
+
+browserclaw is different. It's a **browser tool** — just the eyes and hands. It takes a snapshot and returns refs. It executes actions on refs. The LLM, the reasoning, the task planning — that all lives in your code, in your agent, wherever you want it. browserclaw doesn't have opinions about any of that.
+
+This distinction matters if you're building an agent platform, a product with its own AI layer, or anything where you need to control the intelligence loop. You can't compose an agent-first tool into a system that already has an agent. You end up with two brains fighting over who's in charge.
+
 ### How each tool works under the hood
 
-- **browserclaw** — Accessibility snapshot with numbered refs → Playwright locator (`aria-ref` in default mode, `getByRole()` in role mode). One ref, one element. No vision model, no LLM in the targeting loop.
-- **browser-use** —
+- **browserclaw** — Accessibility snapshot with numbered refs → Playwright locator (`aria-ref` in default mode, `getByRole()` in role mode). One ref, one element. No vision model, no LLM in the targeting loop. You bring the brain.
+- **browser-use** — A complete AI agent: takes a task, calls an LLM, decides actions, executes them. The LLM loop is inside the library. Great for standalone automation scripts; incompatible with platforms that already own the agent loop. Python-only.
 - **Stagehand** — Accessibility tree + natural language primitives (`page.act("click login")`). Convenient, but the LLM re-interprets which element to target on every single call — non-deterministic by design.
-- **Skyvern** — Vision-first. Screenshots sent to a Vision LLM that guesses coordinates. Multi-agent architecture (Planner/Actor/Validator) adds self-correction, but at significant cost and latency.
 - **Playwright MCP** — Same snapshot philosophy as browserclaw, but locked to the MCP protocol. Great for chat-based agents, but not embeddable as a library — you can't compose it into your own agent loop or call it from application code.
 
-**Also in the space:** [LaVague](https://github.com/lavague-ai/LaVague) (generates Selenium code via RAG on HTML), [AgentQL](https://github.com/tinyfish-io/agentql) (semantic query language for the DOM), [Vercel agent-browser](https://github.com/vercel-labs/agent-browser) (element refs like `@e1` — a similar ref-based approach).
-
 ### Why this matters for repeated complex UI tasks
 
 When you're running the same multi-step workflow hundreds of times — filling forms, navigating dashboards, processing queues — the differences compound:
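Reviewer note: the "browser tool vs. AI agent" section added to the README describes a loop in which the tool only takes snapshots and acts on refs, while the caller decides what to do. The sketch below illustrates that division of labor in TypeScript. Every name in it (`BrowserTool`, `Snapshot`, `Planner`, `FakeBrowser`, `runLoop`) is hypothetical and chosen for illustration; none of it is browserclaw's actual API, and the in-memory `FakeBrowser` stands in for a real browser so the example runs on its own.

```typescript
// Hypothetical shapes for illustration only; NOT browserclaw's actual API.
interface Snapshot {
  // Maps a ref (e.g. "e2") to a readable description of the element.
  refs: Record<string, string>;
}

interface BrowserTool {
  snapshot(): Snapshot;
  click(ref: string): void;
}

// The "brain" lives in caller code: given a goal and a snapshot,
// return the ref to act on, or null to stop.
type Planner = (goal: string, snap: Snapshot) => string | null;

// In-memory stand-in for a real browser so the sketch runs without one.
class FakeBrowser implements BrowserTool {
  readonly clicked: string[] = [];
  private page: "login" | "dashboard" = "login";

  snapshot(): Snapshot {
    return this.page === "login"
      ? { refs: { e1: 'textbox "Email"', e2: 'button "Sign in"' } }
      : { refs: { e1: 'heading "Dashboard"' } };
  }

  click(ref: string): void {
    // A ref is only valid against the current snapshot.
    if (!(ref in this.snapshot().refs)) throw new Error(`unknown ref: ${ref}`);
    this.clicked.push(ref);
    if (this.page === "login" && ref === "e2") this.page = "dashboard";
  }
}

// Caller-owned loop: snapshot, decide, act, repeat until the planner stops.
function runLoop(tool: BrowserTool, plan: Planner, goal: string, maxSteps = 5): number {
  let steps = 0;
  while (steps < maxSteps) {
    const ref = plan(goal, tool.snapshot());
    if (ref === null) break;
    tool.click(ref);
    steps++;
  }
  return steps;
}

// A trivial rule-based planner standing in for an LLM call.
const signInPlanner: Planner = (_goal, snap) => {
  const hit = Object.entries(snap.refs).find(([, desc]) => desc.includes("Sign in"));
  return hit ? hit[0] : null;
};

const browser = new FakeBrowser();
const stepsTaken = runLoop(browser, signInPlanner, "sign in");
```

The planner here is a plain function, so it could be swapped for an LLM call without touching the tool side. That separation is the point the README section makes.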
package/dist/index.cjs
CHANGED
@@ -751,7 +751,7 @@ async function getPageForTargetId(opts) {
   const found = await findPageByTargetId(browser, opts.targetId, opts.cdpUrl);
   if (!found) {
     if (pages.length === 1) return first;
-    throw new Error(
+    throw new Error("tab not found");
   }
   return found;
 }
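Reviewer note: the hunk in `getPageForTargetId` keeps a fallback in which a single open page is returned even when the target id does not match, and otherwise throws `"tab not found"`. The sketch below restates that control flow in a self-contained form. It rests on assumptions: `PageLike` and `pickPage` are hypothetical names, and the real function is async and resolves the target through `findPageByTargetId`, which this example replaces with a synchronous array lookup.

```typescript
// Hypothetical page shape; the real code works with browser pages, not plain objects.
interface PageLike {
  targetId: string;
}

// Synchronous restatement of the lookup-with-fallback flow shown in the hunk.
function pickPage(pages: PageLike[], targetId: string): PageLike {
  const found = pages.find((p) => p.targetId === targetId);
  if (found) return found;

  // Fallback: with exactly one open page, return it even without an id match.
  const [first] = pages;
  if (pages.length === 1 && first) return first;

  // Zero pages, or several pages with none matching: nothing sensible to return.
  throw new Error("tab not found");
}

// One page, no id match: the fallback returns the only page.
const only = pickPage([{ targetId: "A" }], "missing");

// Two pages, no id match: the lookup is ambiguous, so it throws.
let message = "";
try {
  pickPage([{ targetId: "A" }, { targetId: "B" }], "missing");
} catch (err) {
  message = err instanceof Error ? err.message : String(err);
}
```

Callers that match on the error text would see the fixed string `"tab not found"` in 0.3.3; the 0.3.2 message is truncated in this diff, so whether it differed in content cannot be confirmed from here.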