crawlio-browser 1.5.5 → 1.5.8
- package/README.md +384 -56
- package/dist/mcp-server/{chunk-A4EQCKHH.js → chunk-YEKQAHYW.js} +1 -1
- package/dist/mcp-server/index.js +888 -23
- package/dist/mcp-server/{init-77AO6DDJ.js → init-ZLXCKEQB.js} +2 -2
- package/package.json +2 -4
- package/skills/browser-automation/SKILL.md +15 -3
- package/skills/web-research/SKILL.md +1 -0
- package/.claude-plugin/plugin.json +0 -10
- package/assets/AppIcon.icns +0 -0
package/README.md
CHANGED
|
@@ -1,47 +1,53 @@
|
|
|
1
1
|
# Crawlio Agent
|
|
2
2
|
|
|
3
|
-
MCP server that gives AI full control of a live Chrome browser via CDP. 89 browser tools + framework-aware intelligence — captures what static crawlers can't see.
|
|
4
|
-
|
|
5
3
|
[](https://www.npmjs.com/package/crawlio-browser)
|
|
6
4
|
[](LICENSE)
|
|
7
5
|
|
|
6
|
+
## [Documentation](https://docs.crawlio.app/browser-agent/overview) | [API Reference](https://docs.crawlio.app/browser-agent/tools) | [Chrome Extension](https://www.crawlio.app/browser-agent)
|
|
7
|
+
|
|
8
|
+
MCP server that gives AI full control of a live Chrome browser via CDP. 100 tools (93 browser + 3 extraction + 3 recording + 1 compiler) with framework-aware intelligence, typed evidence infrastructure, and confidence-tracked findings — captures what static crawlers can't see.
|
|
9
|
+
|
|
10
|
+
> **Note:** This repo supersedes [`crawlio-browser-mcp`](https://github.com/AshDevFr/crawlio-browser-mcp). All development now happens here.
|
|
11
|
+
|
|
8
12
|
## When to use Crawlio Agent
|
|
9
13
|
|
|
10
14
|
Use Crawlio Agent when your AI needs to interact with a **real browser** — SPAs, authenticated pages, dynamic content, JS-rendered frameworks. Unlike headless browser tools, Crawlio Agent connects to **your actual Chrome** via a lightweight extension, giving the AI access to your logged-in sessions, cookies, and full browser state.
|
|
11
15
|
|
|
12
|
-
**Crawlio Agent vs
|
|
16
|
+
**Crawlio Agent vs headless browser tools:** Headless tools launch a separate browser process. Crawlio Agent connects to your existing Chrome — no separate browser, no login flows, full access to your tabs and sessions.
|
|
13
17
|
|
|
14
18
|
## Quick Start
|
|
15
19
|
|
|
16
|
-
1. Install the [Chrome Extension](https://crawlio.app/agent)
|
|
17
|
-
2. Run
|
|
20
|
+
1. Install the [Chrome Extension](https://www.crawlio.app/browser-agent)
|
|
21
|
+
2. Run the init wizard:
|
|
18
22
|
```bash
|
|
19
|
-
npx crawlio-browser
|
|
23
|
+
npx crawlio-browser init
|
|
20
24
|
```
|
|
21
25
|
|
|
22
|
-
That's it. Auto-detects and configures Claude Code, Cursor, VS Code, Codex,
|
|
23
|
-
Claude Desktop, and 9 more MCP clients. Starts a persistent background server.
|
|
26
|
+
That's it. Auto-detects and configures 14 MCP clients: Claude Code, Cursor, VS Code, Codex, Gemini CLI, Claude Desktop, ChatGPT Desktop, Windsurf, Cline, Zed, Goose, OpenCode, MCPorter, and Cline CLI.
|
|
24
27
|
|
|
25
|
-
###
|
|
26
|
-
|
|
27
|
-
If you prefer manual configuration or use `add-mcp`:
|
|
28
|
+
### Init wizard options
|
|
28
29
|
|
|
29
30
|
```bash
|
|
30
|
-
#
|
|
31
|
-
npx crawlio-browser --
|
|
32
|
-
|
|
33
|
-
#
|
|
34
|
-
npx
|
|
31
|
+
npx crawlio-browser init # Default: code mode, stdio transport
|
|
32
|
+
npx crawlio-browser init --full # Full mode (100 individual tools)
|
|
33
|
+
npx crawlio-browser init --portal # Portal mode (persistent HTTP server)
|
|
34
|
+
npx crawlio-browser init --cloudflare # Add Cloudflare MCP (89 tools, no wrangler)
|
|
35
|
+
npx crawlio-browser init --dry-run # Show what would happen
|
|
36
|
+
npx crawlio-browser init --yes # Skip prompts (CI / scripted installs)
|
|
37
|
+
npx crawlio-browser init -a claude # Target specific MCP client
|
|
35
38
|
```
|
|
36
39
|
|
|
37
|
-
###
|
|
40
|
+
### Transport Modes
|
|
38
41
|
|
|
39
|
-
|
|
40
|
-
|
|
41
|
-
|
|
42
|
+
| Mode | Command / URL | Protocol | Best For |
|
|
43
|
+
|------|--------------|----------|----------|
|
|
44
|
+
| **stdio** | `npx crawlio-browser` | JSON-RPC over stdin/stdout | Claude Desktop, Cursor, Windsurf — client manages process lifecycle |
|
|
45
|
+
| **Portal (HTTP)** | `POST http://127.0.0.1:3001/mcp` | MCP Streamable HTTP | Claude Code, ChatGPT Desktop — server survives session restarts |
|
|
46
|
+
| **Portal (SSE)** | `GET /sse` + `POST /message` | Server-Sent Events | Legacy clients needing SSE transport |
|
|
47
|
+
|
|
48
|
+
Portal mode is recommended for Claude Code — the server persists across context compaction and session restarts. On macOS, `--portal` installs a launchd agent for auto-start on login.
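For clients configured by hand, a first request to the Portal endpoint might look roughly like the sketch below. It only builds the request and does not send it. The URL comes from the table above; the `protocolVersion` string, the `clientInfo` values, and the helper name `buildInitializeRequest` are illustrative assumptions, not taken from this package.

```javascript
// Hypothetical helper: builds (but does not send) an MCP `initialize`
// request for the Portal endpoint listed in the Transport Modes table.
function buildInitializeRequest(url = 'http://127.0.0.1:3001/mcp') {
  const body = {
    jsonrpc: '2.0',
    id: 1,
    method: 'initialize',
    params: {
      protocolVersion: '2025-03-26', // assumption: use the version your client speaks
      capabilities: {},
      clientInfo: { name: 'example-client', version: '0.0.1' }, // placeholder values
    },
  };
  return {
    url,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        // A Streamable HTTP server may answer with plain JSON or an SSE stream.
        Accept: 'application/json, text/event-stream',
      },
      body: JSON.stringify(body),
    },
  };
}

const request = buildInitializeRequest();
console.log(request.options.method, request.url);
```

Passing `request.url` and `request.options` to `fetch` would send the request, assuming the portal server is already running locally.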
|
|
42
49
|
|
|
43
|
-
|
|
44
|
-
`/crawlio:observe`, `/crawlio:finding`, `/crawlio:extract-and-export`
|
|
50
|
+
### Manual setup (any client)
|
|
45
51
|
|
|
46
52
|
<details>
|
|
47
53
|
<summary><b>Per-client manual config</b></summary>
|
|
@@ -84,57 +90,136 @@ URL: `http://127.0.0.1:3001/mcp` | Type: Streamable HTTP
|
|
|
84
90
|
## How It Works
|
|
85
91
|
|
|
86
92
|
```
|
|
87
|
-
AI Client (stdio)
|
|
88
|
-
|
|
93
|
+
AI Client (stdio/http) --> MCP Server (Node.js) --> Chrome Extension (MV3)
|
|
94
|
+
crawlio-browser WebSocket -> CDP
|
|
89
95
|
```
|
|
90
96
|
|
|
91
|
-
The MCP server communicates with the Chrome extension via WebSocket. The extension controls the browser through Chrome DevTools Protocol
|
|
97
|
+
The MCP server communicates with the Chrome extension via WebSocket. The extension controls the browser through Chrome DevTools Protocol (CDP).
|
|
92
98
|
|
|
93
|
-
##
|
|
99
|
+
## Capabilities
|
|
100
|
+
|
|
101
|
+
### Framework-Aware Intelligence
|
|
94
102
|
|
|
95
|
-
|
|
103
|
+
Every `execute` call probes the browser for framework signatures and injects a shape-shifting `smart` object with framework-native accessors. React state, Vue reactivity, Next.js routing, Shopify cart data — 17 framework namespaces across 4 tiers, detected at runtime and rebuilt on every navigation. The AI doesn't query a generic DOM; it queries the framework's own data structures.
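The idea of probing for signatures and then shaping the `smart` object can be sketched as follows. This is a simplified illustration, not the package's detection code: the signature table uses a few commonly seen page globals as assumptions, and `detectFrameworks` and `buildSmart` are hypothetical names.

```javascript
// Illustrative signature table. The property names are common page globals,
// used here as assumptions; the package's real probes are not shown in this diff.
const SIGNATURES = [
  { name: 'nextjs', test: (w) => '__NEXT_DATA__' in w },
  { name: 'shopify', test: (w) => typeof w.Shopify === 'object' && w.Shopify !== null },
  { name: 'react', test: (w) => '__REACT_DEVTOOLS_GLOBAL_HOOK__' in w },
  { name: 'vue', test: (w) => '__VUE__' in w },
];

// Returns the names of every signature that matches the window-like object.
function detectFrameworks(win) {
  return SIGNATURES.filter((sig) => sig.test(win)).map((sig) => sig.name);
}

// Builds a `smart`-like object that carries a namespace only for detected frameworks.
function buildSmart(win) {
  const smart = { detected: detectFrameworks(win) };
  for (const name of smart.detected) {
    smart[name] = { detected: true }; // a real namespace would expose accessors here
  }
  return smart;
}

// A plain object stands in for `window` so the sketch runs outside a browser.
const fakeWindow = { __NEXT_DATA__: { page: '/' }, __REACT_DEVTOOLS_GLOBAL_HOOK__: {} };
console.log(buildSmart(fakeWindow).detected);
```

Rebuilding the object after each navigation would amount to calling `buildSmart` again against the new page.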
|
|
96
104
|
|
|
97
|
-
-
|
|
98
|
-
|
|
99
|
-
-
|
|
100
|
-
|
|
105
|
+
### Evidence-Based Analysis
|
|
106
|
+
|
|
107
|
+
Method Mode adds higher-order methods and a typed evidence system on top of Code Mode. `smart.extractPage()` runs 7 parallel operations in a single call — page capture, performance metrics, security state, font detection, meta extraction, accessibility audit, and mobile-readiness check. Failed operations produce typed `CoverageGap` records instead of silent `null`s. Findings created with `smart.finding()` get their confidence automatically adjusted when supporting data is missing. The result: structured, auditable research output with gap tracking and confidence propagation.
|
|
108
|
+
|
|
109
|
+
### Session Recording & Replay
|
|
110
|
+
|
|
111
|
+
Record browser interactions as structured data, then compile them into reusable SKILL.md automations. 12 interaction tools are automatically intercepted during recording — clicks, typing, navigation, scrolling — each capturing args, result, timing, and page URL. One `compileRecording()` call converts the session into a deterministic automation script.
|
|
112
|
+
|
|
113
|
+
### Auto-Settling & Actionability
|
|
114
|
+
|
|
115
|
+
Every mutative action (`click`, `type`, `navigate`, `select_option`) runs actionability checks before acting — polling visibility, dimensions, enabled state, and overlay detection. After the action, a progressive backoff settle delay (`[0, 20, 100, 100, 500]ms`) waits for DOM mutations to quiesce. The AI doesn't need manual `sleep()` calls between actions.
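A rough sketch of polling with that backoff schedule is shown below. Only the delay values come from the text above; how the schedule is consumed (reusing the last delay, the `timeoutMs` budget) and the name `pollUntilActionable` are assumptions made for illustration.

```javascript
const BACKOFF_MS = [0, 20, 100, 100, 500]; // schedule quoted in the README

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls `check` until it returns true, waiting per BACKOFF_MS before each attempt.
// Once the schedule is exhausted, the last delay is reused until `timeoutMs` is spent.
async function pollUntilActionable(check, { timeoutMs = 5000, sleep = defaultSleep } = {}) {
  let waited = 0;
  for (let attempt = 0; ; attempt++) {
    const delay = BACKOFF_MS[Math.min(attempt, BACKOFF_MS.length - 1)];
    if (waited + delay > timeoutMs) return { ok: false, attempts: attempt, waited };
    await sleep(delay);
    waited += delay;
    if (await check()) return { ok: true, attempts: attempt + 1, waited };
  }
}

// Demo: the stand-in check becomes true on the third poll.
let polls = 0;
pollUntilActionable(() => ++polls >= 3).then((outcome) => console.log(outcome));
```

In a real page, `check` would be the combined visibility, dimension, enabled-state, and overlay test described above.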
|
|
116
|
+
|
|
117
|
+
## Architecture: JIT Context Runtime
|
|
118
|
+
|
|
119
|
+
The JIT Context MCP Runtime is a layered execution architecture where each layer absorbs a category of complexity that would otherwise fall on the model. The model sees three tools and a clean SDK. Everything beneath that surface is the runtime absorbing reality.
|
|
101
120
|
|
|
102
121
|
```
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
|
|
122
|
+
┌───────────────────────────────────┐
|
|
123
|
+
│ AI Model (LLM) │
|
|
124
|
+
│ Writes code, reads errors, loops │
|
|
125
|
+
└───────────────┬───────────────────┘
|
|
126
|
+
│ 3 tools: search, execute, connect_tab
|
|
127
|
+
▼
|
|
128
|
+
┌─────────────────────────────────────────────────────────────────┐
|
|
129
|
+
│ JIT Context MCP Runtime │
|
|
130
|
+
│ │
|
|
131
|
+
│ ┌────────────────────────────────────────────────────────────┐ │
|
|
132
|
+
│ │ METHOD MODE │ │
|
|
133
|
+
│ │ Behavioral protocol + higher-order methods │ │
|
|
134
|
+
│ │ scrollCapture · waitForIdle · extractPage · comparePages │ │
|
|
135
|
+
│ │ detectTables · extractTable · waitForNetworkIdle · │ │
|
|
136
|
+
│ │ extractData │ │
|
|
137
|
+
│ │ │ │
|
|
138
|
+
│ │ ↳ Absorbs: behavioral variance, ad-hoc composition, │ │
|
|
139
|
+
│ │ inconsistent output shapes, data extraction patterns │ │
|
|
140
|
+
│ ├────────────────────────────────────────────────────────────┤ │
|
|
141
|
+
│ │ POLYMORPHIC CONTEXT │ │
|
|
142
|
+
│ │ 17 framework namespaces, injected Just-In-Time │ │
|
|
143
|
+
│ │ react · vue · angular · nextjs · shopify · ... │ │
|
|
144
|
+
│ │ │ │
|
|
145
|
+
│ │ ↳ Absorbs: framework opacity, minified code, │ │
|
|
146
|
+
│ │ devtools hook complexity │ │
|
|
147
|
+
│ ├────────────────────────────────────────────────────────────┤ │
|
|
148
|
+
│ │ ACTIONABILITY ENGINE │ │
|
|
149
|
+
│ │ 7 core smart methods with built-in resilience │ │
|
|
150
|
+
│ │ click · type · navigate · waitFor · evaluate · │ │
|
|
151
|
+
│ │ snapshot · screenshot │ │
|
|
152
|
+
│ │ │ │
|
|
153
|
+
│ │ ↳ Absorbs: DOM timing, hydration delays, CSS animations, │ │
|
|
154
|
+
│ │ disabled states, overlapping elements │ │
|
|
155
|
+
│ ├────────────────────────────────────────────────────────────┤ │
|
|
156
|
+
│ │ TETHERED IPC BRIDGE │ │
|
|
157
|
+
│ │ WebSocket ↔ Chrome extension, message queue, │ │
|
|
158
|
+
│ │ heartbeat, auto-reconnect, stale detection │ │
|
|
159
|
+
│ │ │ │
|
|
160
|
+
│ │ ↳ Absorbs: connection drops, tab refreshes, │ │
|
|
161
|
+
│ │ port conflicts, extension lifecycle │ │
|
|
162
|
+
│ ├────────────────────────────────────────────────────────────┤ │
|
|
163
|
+
│ │ 133 RAW COMMANDS (bridge.send) │ │
|
|
164
|
+
│ │ CDP-level browser control via Chrome extension │ │
|
|
165
|
+
│ └────────────────────────────────────────────────────────────┘ │
|
|
166
|
+
└─────────────────────────────────────────────────────────────────┘
|
|
167
|
+
│
|
|
168
|
+
▼
|
|
169
|
+
┌───────────────────────────────────┐
|
|
170
|
+
│ Live Chrome Browser │
|
|
171
|
+
│ Persistent session, real DOM, │
|
|
172
|
+
│ framework runtime, user state │
|
|
173
|
+
└───────────────────────────────────┘
|
|
109
174
|
```
|
|
110
175
|
|
|
111
|
-
|
|
176
|
+
### What Each Layer Absorbs
|
|
112
177
|
|
|
113
|
-
|
|
178
|
+
| Layer | Without It | With It |
|
|
179
|
+
|-------|-----------|---------|
|
|
180
|
+
| **Tethered IPC Bridge** | Script crashes on tab refresh, pending commands lost on reconnect, port conflicts on startup | Resilient WebSocket with message queue (100-msg capacity), heartbeat stale detection (15s intervals), auto-reconnect with drain |
|
|
181
|
+
| **Actionability Engine** | `click('#btn')` fires before the button renders, during CSS transitions, or while an overlay covers it | Progressive polling (exists → has dimensions → visible → not disabled → not obscured) with `[0, 20, 100, 100, 500]ms` backoff |
|
|
182
|
+
| **Polymorphic Context** | Model sees minified `<div>` elements; reading React state requires knowing exact hook paths, renderer maps, and fiber root API | Runtime probes live JS environment, detects 17 frameworks, injects namespace methods (`smart.react.getVersion()`, `smart.nextjs.getData()`) |
|
|
183
|
+
| **Method Mode** | Model composes primitives ad-hoc — inconsistent scroll loops, missed edge cases, varying return shapes | 8 tested methods encode correct patterns; behavioral protocol constrains workflow |
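The message queue in the Tethered IPC Bridge row can be pictured as a bounded outbox that drains on reconnect. The sketch below is a minimal illustration under stated assumptions: the 100-message capacity is from the table, while the `Outbox` class, its method names, and the choice to reject messages on overflow are hypothetical.

```javascript
// Minimal bounded outbox: buffers messages while the socket is down and
// flushes them in order once it is back. Rejecting on overflow is an
// assumption of this sketch, not documented behavior.
class Outbox {
  constructor(capacity = 100) {
    this.capacity = capacity;
    this.pending = [];
    this.connected = false;
    this.transport = null;
  }

  // Sends immediately when connected, otherwise queues (or refuses when full).
  send(message) {
    if (this.connected) {
      this.transport(message);
      return 'sent';
    }
    if (this.pending.length >= this.capacity) return 'rejected';
    this.pending.push(message);
    return 'queued';
  }

  // Called on (re)connect: stores the transport and drains the backlog in FIFO order.
  onConnect(transport) {
    this.transport = transport;
    this.connected = true;
    while (this.pending.length > 0) transport(this.pending.shift());
  }

  onDisconnect() {
    this.connected = false;
  }
}

// Demo: two messages queued while offline are delivered in order on connect.
const outbox = new Outbox();
const delivered = [];
outbox.send({ type: 'first' });
outbox.send({ type: 'second' });
outbox.onConnect((msg) => delivered.push(msg.type));
console.log(delivered);
```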
|
|
114
184
|
|
|
115
|
-
###
|
|
185
|
+
### Execution Lifecycle
|
|
116
186
|
|
|
117
|
-
|
|
187
|
+
1. **Discovery** — Model calls `search("page capture performance")` and gets documentation for relevant commands
|
|
188
|
+
2. **Framework Detection** — Runtime probes the live DOM, detects active frameworks, constructs polymorphic `smart` object with appropriate namespaces
|
|
189
|
+
3. **Scope Assembly** — Model's code is compiled into an async function with injected parameters: `bridge` (133 commands), `crawlio` (HTTP client), `sleep`, `TIMEOUTS`, `smart` (7 core + 8 higher-order + up to 17 framework namespaces), `compileRecording`
|
|
190
|
+
4. **Execution** — Method Mode methods compose the lower layers: `extractPage()` fires 7 parallel `bridge.send()` calls; `click()` runs the actionability engine; `react.getVersion()` evaluates framework-specific expressions
|
|
191
|
+
5. **Error Recovery (Agentic REPL)** — On failure, the browser stays in the exact state that produced the error. The model reads the structured error, adjusts, and calls `execute` again. Framework cache persists — no re-detection unless URL changed
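Step 3 above, compiling the model's code into an async function with injected parameters, can be approximated with the standard `AsyncFunction` constructor. This is a sketch of the general technique, not the package's implementation; `runInScope` and the stub scope are hypothetical, and this approach is not a security sandbox on its own.

```javascript
// The AsyncFunction constructor is not a global; it is reachable via any async function.
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

// Compiles source text into an async function whose parameters are the
// injected scope names, then calls it with the matching values.
function runInScope(source, scope) {
  const names = Object.keys(scope);
  const fn = new AsyncFunction(...names, source);
  return fn(...names.map((name) => scope[name]));
}

// Stand-in scope: these stubs only mimic the shape of the injected objects.
const stubScope = {
  bridge: { send: async (cmd) => ({ ok: true, type: cmd.type }) },
  sleep: (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  smart: {},
};

runInScope(
  "const r = await bridge.send({ type: 'take_screenshot' }); return r.type;",
  stubScope
).then((result) => console.log(result));
```

Because the scope object is rebuilt per call, a retry after an error can receive the same live objects, which matches the "adjust and continue" loop described in step 5.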
|
|
118
192
|
|
|
119
|
-
###
|
|
193
|
+
### Design Principles
|
|
120
194
|
|
|
121
|
-
|
|
195
|
+
1. **Absorb complexity downward** — Every category of difficulty (connection management, DOM timing, framework detection, multi-step composition) is handled by the layer best equipped for it. The model only encounters the clean interface at the top.
|
|
196
|
+
2. **Shape the SDK to the target** — The polymorphic context system detects what the page is and reshapes available methods to match. The model writes against a stable interface; the runtime adapts underneath.
|
|
197
|
+
3. **Preserve state across cycles** — The tethered architecture means the model can fail, learn, and retry against the same live environment — transforming error handling from "restart from scratch" into "adjust and continue."
|
|
122
198
|
|
|
123
|
-
|
|
124
|
-
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
|
|
128
|
-
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
|
|
199
|
+
### How It Compares
|
|
200
|
+
|
|
201
|
+
| Dimension | Standard MCP | Cloudflare Code Mode | JIT Context Runtime |
|
|
202
|
+
|-----------|-------------|---------------------|---------------------|
|
|
203
|
+
| **Tools in context** | 50-100+ schemas | 2 (`search`, `execute`) | 3 (`search`, `execute`, `connect_tab`) |
|
|
204
|
+
| **Execution environment** | N/A (tool calls) | V8 isolate (stateless) | Local async sandbox (stateful, tethered to live browser) |
|
|
205
|
+
| **DOM access** | Via individual tool calls | None | Live, persistent, framework-aware |
|
|
206
|
+
| **Framework awareness** | None | None | 17 namespaces, injected JIT |
|
|
207
|
+
| **Action resilience** | Model must handle timing | N/A (no DOM) | Built-in actionability polling + settle delays |
|
|
208
|
+
| **Error recovery** | Re-call individual tool | Re-create isolate | Re-execute against same live state (Agentic REPL) |
|
|
209
|
+
| **Multi-step patterns** | Model improvises | Model writes loops | 8 tested higher-order methods + behavioral protocol |
|
|
210
|
+
|
|
211
|
+
[Read the full architecture guide →](https://docs.crawlio.app/browser-agent/overview)
|
|
212
|
+
|
|
213
|
+
## Two Modes
|
|
214
|
+
|
|
215
|
+
### Code Mode (3 tools) — default
|
|
216
|
+
|
|
217
|
+
Collapses 100 tools into 3 high-level tools with ~95% schema token reduction:
|
|
133
218
|
|
|
134
219
|
| Tool | Description |
|
|
135
220
|
|------|-------------|
|
|
136
221
|
| `search` | Discover available commands by keyword |
|
|
137
|
-
| `execute` | Run async JS with `bridge`, `crawlio`, `smart`, and `
|
|
222
|
+
| `execute` | Run async JS with `bridge`, `crawlio`, `smart`, `sleep`, and `compileRecording` in scope |
|
|
138
223
|
| `connect_tab` | Connect to a browser tab |
|
|
139
224
|
|
|
140
225
|
```javascript
|
|
@@ -145,6 +230,14 @@ const screenshot = await bridge.send({ type: 'take_screenshot' }, 10000);
|
|
|
145
230
|
return screenshot;
|
|
146
231
|
```
|
|
147
232
|
|
|
233
|
+
### Full Mode (100 tools)
|
|
234
|
+
|
|
235
|
+
Every tool exposed directly to the LLM. Enable with `--full`:
|
|
236
|
+
|
|
237
|
+
```bash
|
|
238
|
+
npx crawlio-browser init --full
|
|
239
|
+
```
|
|
240
|
+
|
|
148
241
|
## Smart Object
|
|
149
242
|
|
|
150
243
|
In Code Mode, the `smart` object provides framework-aware helpers with auto-waiting and actionability checks.
|
|
@@ -159,6 +252,35 @@ In Code Mode, the `smart` object provides framework-aware helpers with auto-wait
|
|
|
159
252
|
| `smart.navigate(url, opts?)` | Navigate with 1000ms settle |
|
|
160
253
|
| `smart.waitFor(selector, timeout?)` | Poll until element is actionable |
|
|
161
254
|
| `smart.snapshot()` | Accessibility tree snapshot |
|
|
255
|
+
| `smart.screenshot()` | Full-page screenshot (base64 PNG) |
|
|
256
|
+
|
|
257
|
+
### Higher-Order Methods
|
|
258
|
+
|
|
259
|
+
| Method | Description |
|
|
260
|
+
|--------|-------------|
|
|
261
|
+
| `smart.scrollCapture(opts?)` | Scroll to bottom, capturing screenshots at each position. Handles stuck-scroll detection, bottom detection, section capping, and scroll reset. |
|
|
262
|
+
| `smart.waitForIdle(timeout?)` | MutationObserver-based idle detection: waits for a 500ms quiet window. The timeout is hard-capped at 15s. Replaces blind `sleep()` calls. |
|
|
263
|
+
| `smart.extractPage(opts?)` | 7 parallel operations in one call — page capture, performance, security, fonts, meta, accessibility, mobile-readiness. Returns typed `PageEvidence` with `CoverageGap[]` for anything that failed. |
|
|
264
|
+
| `smart.comparePages(urlA, urlB)` | Navigates to both URLs, runs `extractPage()` on each, returns a `ComparisonScaffold` with 11 dimensions, shared/missing fields, and comparable metrics. |
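The quiet-window logic behind `smart.waitForIdle()` can be sketched as a small state tracker. The 500ms window and 15s cap are taken from the table; the `createIdleTracker` name and its timestamp-driven interface are assumptions chosen so the sketch runs without a browser.

```javascript
const QUIET_MS = 500; // quiet window from the table above
const MAX_TIMEOUT_MS = 15000; // hard cap from the table above

// Pure state tracker: feed it mutation timestamps, ask it for a verdict.
// In a page, a MutationObserver callback would call `onMutation(Date.now())`.
function createIdleTracker(startedAt, timeoutMs = MAX_TIMEOUT_MS) {
  const deadline = startedAt + Math.min(timeoutMs, MAX_TIMEOUT_MS);
  let lastMutationAt = startedAt;
  return {
    onMutation(now) {
      lastMutationAt = now;
    },
    // 'idle' once QUIET_MS has passed without mutations, 'timeout' past the deadline.
    status(now) {
      if (now - lastMutationAt >= QUIET_MS) return 'idle';
      if (now >= deadline) return 'timeout';
      return 'waiting';
    },
  };
}

// Demo with synthetic timestamps (milliseconds since start).
const tracker = createIdleTracker(0);
tracker.onMutation(300);
console.log(tracker.status(700), tracker.status(800));
```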
|
|
265
|
+
|
|
266
|
+
### Typed Evidence
|
|
267
|
+
|
|
268
|
+
Methods for structured analysis findings with confidence propagation:
|
|
269
|
+
|
|
270
|
+
| Method | Description |
|
|
271
|
+
|--------|-------------|
|
|
272
|
+
| `smart.finding(data)` | Create a validated `Finding` with claim, evidence, sourceUrl, confidence, and method. Rejects malformed input with specific errors. |
|
|
273
|
+
| `smart.findings()` | Get all session-accumulated findings (returns a copy) |
|
|
274
|
+
| `smart.clearFindings()` | Reset session findings and coverage gaps |
|
|
275
|
+
|
|
276
|
+
When a finding's `dimension` matches an active coverage gap, confidence is automatically capped:
|
|
277
|
+
|
|
278
|
+
| Input Confidence | Active Gap | Output |
|
|
279
|
+
|-----------------|------------|--------|
|
|
280
|
+
| `high` | `reducesConfidence: true` | `medium` + `confidenceCapped: true` |
|
|
281
|
+
| `medium` | `reducesConfidence: true` | `low` + `confidenceCapped: true` |
|
|
282
|
+
| `low` | any | `low` (floor) |
|
|
283
|
+
| any | no matching gap | unchanged |
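The capping table can be expressed as a small pure function. This is a sketch of the rule as tabulated, with `applyGapCap` as a hypothetical name; the field names (`dimension`, `reducesConfidence`, `confidenceCapped`) are the ones used in the table and the gap example.

```javascript
const LEVELS = ['low', 'medium', 'high'];

// Applies the capping table: a matching gap with reducesConfidence lowers
// the confidence by one step, never below 'low'.
function applyGapCap(finding, gaps) {
  const hit = gaps.some(
    (gap) => gap.dimension === finding.dimension && gap.reducesConfidence === true
  );
  const index = LEVELS.indexOf(finding.confidence);
  if (!hit || index <= 0) return { ...finding }; // no matching gap, or already at the floor
  return { ...finding, confidence: LEVELS[index - 1], confidenceCapped: true };
}

const activeGaps = [{ dimension: 'performance', reducesConfidence: true }];
console.log(applyGapCap({ dimension: 'performance', confidence: 'high' }, activeGaps));
```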
|
|
162
284
|
|
|
163
285
|
### Framework Namespaces
|
|
164
286
|
|
|
@@ -301,9 +423,191 @@ When a framework is detected, the smart object exposes framework-specific helper
|
|
|
301
423
|
|
|
302
424
|
</details>
|
|
303
425
|
|
|
426
|
+
## Method Mode
|
|
427
|
+
|
|
428
|
+
Method Mode is a domain layer built on top of Code Mode. It adds higher-order methods, a typed evidence system, and a behavioral protocol to the `execute` sandbox — without changing the tool surface. The model still sees three tools. The same `smart` object. The same 133-command catalog underneath. What changes is what happens *inside* `execute`.
|
|
429
|
+
|
|
430
|
+
### The Maturity Ladder
|
|
431
|
+
|
|
432
|
+
| Layer | Optimizes For | Behavioral Variance | Evidence Quality |
|
|
433
|
+
|-------|---------------|---------------------|-----------------|
|
|
434
|
+
| **Raw MCP** (100 tools) | Completeness | High — flat tool list, no composition guidance | None — unstructured text |
|
|
435
|
+
| **Code Mode** (3 tools) | Token efficiency | Medium — right primitives, ad-hoc composition | None — model-defined shapes |
|
|
436
|
+
| **Method Mode v1** (+ 8 methods + protocol) | Consistency | Low — proper methods, protocol constraints | Convention — `{ finding, evidence, url }` |
|
|
437
|
+
| **Method Mode v2** (+ typed evidence + gaps + confidence) | Correctness | Minimal — typed schemas, tool-enforced findings | Structural — typed records, gap tracking, confidence propagation |
|
|
438
|
+
|
|
439
|
+
### Architecture
|
|
440
|
+
|
|
441
|
+
```
|
|
442
|
+
┌────────────────────────────────────────────────────────────┐
|
|
443
|
+
│ execute sandbox │
|
|
444
|
+
│ │
|
|
445
|
+
│ ┌──────────────────────────────────────────────────────┐ │
|
|
446
|
+
│ │ Behavioral Protocol (web-research skill) │ │
|
|
447
|
+
│ │ Acquire → Normalize → Analyze │ │
|
|
448
|
+
│ ├──────────────────────────────────────────────────────┤ │
|
|
449
|
+
│ │ Evidence Infrastructure │ │
|
|
450
|
+
│ │ finding() · findings() · clearFindings() │ │
|
|
451
|
+
│ │ Typed records · Coverage gaps · Confidence prop. │ │
|
|
452
|
+
│ ├──────────────────────────────────────────────────────┤ │
|
|
453
|
+
│ │ Higher-Order Methods [8] │ │
|
|
454
|
+
│ │ scrollCapture · waitForIdle · extractPage · │ │
|
|
455
|
+
│ │ comparePages · detectTables · extractTable · │ │
|
|
456
|
+
│ │ waitForNetworkIdle · extractData │ │
|
|
457
|
+
│ ├──────────────────────────────────────────────────────┤ │
|
|
458
|
+
│ │ Smart Core [7 methods] │ │
|
|
459
|
+
│ │ evaluate · click · type · navigate · waitFor · │ │
|
|
460
|
+
│ │ snapshot · screenshot │ │
|
|
461
|
+
│ ├──────────────────────────────────────────────────────┤ │
|
|
462
|
+
│ │ Framework Namespaces [up to 17, injected JIT] │ │
|
|
463
|
+
│ │ react · vue · angular · nextjs · shopify · ... │ │
|
|
464
|
+
│ ├──────────────────────────────────────────────────────┤ │
|
|
465
|
+
│ │ bridge.send() — 133 raw commands │ │
|
|
466
|
+
│ └──────────────────────────────────────────────────────┘ │
|
|
467
|
+
└────────────────────────────────────────────────────────────┘
|
|
468
|
+
```
|
|
469
|
+
|
|
470
|
+
Each layer up encodes more domain knowledge. `bridge.send({ type: "capture_page" })` captures a page. `smart.extractPage()` captures a page and, in parallel, runs performance metrics, security state, font detection, meta extraction, accessibility analysis, and mobile-readiness checks: seven operations in one call, with graceful failure on supplementary data and typed gaps for anything that fails.
|
|
471
|
+
|
|
472
|
+
### Evidence Infrastructure
|
|
473
|
+
|
|
474
|
+
**Coverage Gaps** — When supplementary operations in `extractPage()` fail, they don't silently return `null`. A typed gap is recorded with the dimension, reason, impact, and whether it reduces confidence on related findings:
|
|
475
|
+
|
|
476
|
+
```javascript
|
|
477
|
+
// Example gap from a failed performance metrics call
|
|
478
|
+
{ dimension: "performance", reason: "CDP domain disabled", impact: "method-failed", reducesConfidence: true }
|
|
479
|
+
```
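One way to turn failed parallel operations into gap records of that shape is `Promise.allSettled`. The sketch below assumes that approach; `collectEvidence` is a hypothetical name, and treating every failure as confidence-reducing is a simplification.

```javascript
// Runs named operations in parallel; fulfilled ones land in `data`,
// rejected ones become gap records shaped like the example above.
async function collectEvidence(operations) {
  const names = Object.keys(operations);
  // The async wrapper turns synchronous throws into rejections as well.
  const settled = await Promise.allSettled(names.map(async (name) => operations[name]()));
  const data = {};
  const gaps = [];
  settled.forEach((outcome, i) => {
    if (outcome.status === 'fulfilled') {
      data[names[i]] = outcome.value;
    } else {
      gaps.push({
        dimension: names[i],
        reason: String(outcome.reason?.message ?? outcome.reason),
        impact: 'method-failed',
        reducesConfidence: true, // assumption: every failure reduces confidence
      });
    }
  });
  return { data, gaps };
}

// Demo with stand-in operations; real ones would call into the browser.
collectEvidence({
  meta: async () => ({ title: 'Example' }),
  performance: async () => {
    throw new Error('CDP domain disabled');
  },
}).then(({ data, gaps }) => console.log(Object.keys(data), gaps));
```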
|
|
480
|
+
|
|
481
|
+
**Tool-Enforced Findings** — `smart.finding()` validates every field at the tool level. The model cannot produce a finding without meeting the schema — it either returns a valid `Finding` or gets a clear error. Findings accumulate across `execute` calls within a session via `smart.findings()`.
|
|
482
|
+
|
|
483
|
+
**Session Aggregation** — Findings and coverage gaps persist across `execute` calls. A model can make findings across multiple calls, then retrieve the full set with `smart.findings()`. Reset with `smart.clearFindings()`.
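The validate-then-accumulate behavior described above can be sketched as a small store. The required fields follow the `smart.finding(data)` description; the exact validation rules, error wording, and the `createFindingStore` name are assumptions for illustration.

```javascript
const CONFIDENCE_LEVELS = ['low', 'medium', 'high'];

// Hypothetical session store: validates on the way in, hands out copies on the way out.
function createFindingStore() {
  const stored = [];
  return {
    finding(data) {
      const errors = [];
      for (const field of ['claim', 'sourceUrl', 'method']) {
        if (typeof data?.[field] !== 'string' || data[field].trim() === '') {
          errors.push(`${field} must be a non-empty string`);
        }
      }
      if (!Array.isArray(data?.evidence) || data.evidence.length === 0) {
        errors.push('evidence must be a non-empty array');
      }
      if (!CONFIDENCE_LEVELS.includes(data?.confidence)) {
        errors.push("confidence must be 'low', 'medium', or 'high'");
      }
      if (errors.length > 0) throw new Error(`Invalid finding: ${errors.join('; ')}`);
      const record = { ...data, evidence: [...data.evidence] };
      stored.push(record);
      return record;
    },
    // Returns copies so callers cannot mutate the session state.
    findings() {
      return stored.map((f) => ({ ...f, evidence: [...f.evidence] }));
    },
    clearFindings() {
      stored.length = 0;
    },
  };
}

const store = createFindingStore();
store.finding({
  claim: 'Page sets a canonical URL',
  evidence: ['link rel=canonical present'],
  sourceUrl: 'https://example.com',
  confidence: 'medium',
  method: 'manual check',
});
console.log(store.findings().length);
```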
|
|
484
|
+
|
|
485
|
+
### End-to-End Example: Competitive Audit
|
|
486
|
+
|
|
487
|
+
```javascript
|
|
488
|
+
// 1. Extract and compare both sites (scaffold + gaps included)
|
|
489
|
+
const comparison = await smart.comparePages(
|
|
490
|
+
'https://acme.com',
|
|
491
|
+
'https://rival.com'
|
|
492
|
+
);
|
|
493
|
+
|
|
494
|
+
// 2. Make findings — confidence auto-adjusts based on data availability
|
|
495
|
+
smart.finding({
|
|
496
|
+
claim: 'Rival loads 2.3x faster on Largest Contentful Paint',
|
|
497
|
+
evidence: [
|
|
498
|
+
`Acme LCP: ${comparison.siteA.performance?.webVitals?.lcp}ms`,
|
|
499
|
+
`Rival LCP: ${comparison.siteB.performance?.webVitals?.lcp}ms`,
|
|
500
|
+
],
|
|
501
|
+
sourceUrl: 'https://acme.com',
|
|
502
|
+
confidence: 'high',
|
|
503
|
+
method: 'comparePages + extractPage performance metrics',
|
|
504
|
+
dimension: 'performance', // if perf data failed, confidence caps to "medium"
|
|
505
|
+
});
|
|
506
|
+
|
|
507
|
+
smart.finding({
|
|
508
|
+
claim: 'Acme has 12 images without alt text; Rival has 0',
|
|
509
|
+
evidence: [
|
|
510
|
+
`Acme imagesWithoutAlt: ${comparison.siteA.accessibility?.imagesWithoutAlt}`,
|
|
511
|
+
`Rival imagesWithoutAlt: ${comparison.siteB.accessibility?.imagesWithoutAlt}`,
|
|
512
|
+
],
|
|
513
|
+
sourceUrl: 'https://acme.com',
|
|
514
|
+
confidence: 'high',
|
|
515
|
+
method: 'comparePages + extractPage accessibility summary',
|
|
516
|
+
dimension: 'accessibility',
|
|
517
|
+
});
|
|
518
|
+
|
|
519
|
+
// 3. Capture visual evidence
|
|
520
|
+
await smart.navigate('https://acme.com');
|
|
521
|
+
await smart.waitForIdle();
|
|
522
|
+
const acmeVisuals = await smart.scrollCapture({ maxSections: 5 });
|
|
523
|
+
|
|
524
|
+
// 4. Return accumulated session findings + visual evidence
|
|
525
|
+
return {
|
|
526
|
+
findings: smart.findings(),
|
|
527
|
+
scaffold: comparison.scaffold,
|
|
528
|
+
gaps: { acme: comparison.siteA.gaps, rival: comparison.siteB.gaps },
|
|
529
|
+
visualEvidence: { acme: acmeVisuals.sectionCount + ' sections captured' },
|
|
530
|
+
};
|
|
531
|
+
```
|
|
532
|
+
|
|
533
|
+
## Examples
|
|
534
|
+
|
|
535
|
+
#### Navigate, extract, and analyze
|
|
536
|
+
|
|
537
|
+
```javascript
|
|
538
|
+
// Extract structured page evidence from the connected tab (connect with `connect_tab` first)
|
|
539
|
+
const page = await smart.extractPage();
|
|
540
|
+
const finding = smart.finding({
|
|
541
|
+
claim: `Site uses ${page.capture.framework?.name || 'no detected framework'}`,
|
|
542
|
+
evidence: [`Framework: ${JSON.stringify(page.capture.framework)}`],
|
|
543
|
+
sourceUrl: page.meta?.canonical || 'active tab',
|
|
544
|
+
confidence: 'high',
|
|
545
|
+
method: 'extractPage framework detection',
|
|
546
|
+
});
|
|
547
|
+
return { page: page.meta, finding };
|
|
548
|
+
```
|
|
549
|
+
|
|
550
|
+
#### Mobile emulation + screenshot
|
|
551
|
+
|
|
552
|
+
```javascript
|
|
553
|
+
// Emulate iPhone and capture
|
|
554
|
+
await bridge.send({ type: 'emulate_device', device: 'iPhone 14' }, 10000);
|
|
555
|
+
await smart.navigate('https://example.com');
|
|
556
|
+
await smart.waitForIdle();
|
|
557
|
+
const screenshot = await smart.screenshot();
|
|
558
|
+
return screenshot;
|
|
559
|
+
```
|
|
560
|
+
|
|
561
|
+
#### Record and compile automation
|
|
562
|
+
|
|
563
|
+
```javascript
|
|
564
|
+
// Record a browser session, then compile to reusable skill
|
|
565
|
+
await bridge.send({ type: 'start_recording' }, 10000);
|
|
566
|
+
await smart.navigate('https://example.com');
|
|
567
|
+
await smart.click('button.submit');
|
|
568
|
+
await smart.type('#email', 'test@example.com');
|
|
569
|
+
const session = await bridge.send({ type: 'stop_recording' }, 10000);
|
|
570
|
+
return compileRecording(session.session, 'signup-flow');
|
|
571
|
+
```
|
|
572
|
+
|
|
573
|
+
#### Intercept and mock network
|
|
574
|
+
|
|
575
|
+
```javascript
|
|
576
|
+
// Block analytics, mock API response
|
|
577
|
+
await bridge.send({
|
|
578
|
+
type: 'browser_intercept',
|
|
579
|
+
pattern: '*analytics*',
|
|
580
|
+
action: 'block'
|
|
581
|
+
}, 10000);
|
|
582
|
+
await bridge.send({
|
|
583
|
+
type: 'browser_intercept',
|
|
584
|
+
pattern: '*/api/user',
|
|
585
|
+
action: 'mock',
|
|
586
|
+
body: JSON.stringify({ name: 'Test User' }),
|
|
587
|
+
statusCode: 200
|
|
588
|
+
}, 10000);
|
|
589
|
+
await smart.navigate('https://example.com');
|
|
590
|
+
return await smart.snapshot();
|
|
591
|
+
```
|
|
592
|
+
|
|
593
|
+
## Session Recording
|
|
594
|
+
|
|
595
|
+
Record browser sessions as structured data, then compile them into reusable automation skills. 12 interaction tools are automatically intercepted during recording (click, type, navigate, scroll, etc.), capturing args, result, timing, and page URL.
|
|
596
|
+
|
|
597
|
+
```javascript
|
|
598
|
+
// In code mode: record, interact, compile
|
|
599
|
+
await bridge.send({ type: 'start_recording' }, 10000);
|
|
600
|
+
// ... interact with the page ...
|
|
601
|
+
const session = await bridge.send({ type: 'stop_recording' }, 10000);
|
|
602
|
+
const skill = compileRecording(session.session, 'my-automation');
|
|
603
|
+
return skill;
|
|
604
|
+
```
|
|
605
|
+
|
|
606
|
+
In full mode, recording is available as 4 individual tools: `start_recording`, `stop_recording`, `get_recording_status`, and `compile_recording`.
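Conceptually, intercepting a tool during recording means wrapping it so each call logs its args, result, timing, and page URL. The sketch below illustrates that wrapper pattern only; `createRecorder` and the entry field names (`durationMs`, `pageUrl`, `interactions`) are hypothetical and not the package's session format.

```javascript
// Wraps tool functions so each call is logged while recording is active.
function createRecorder(getPageUrl, now = Date.now) {
  let entries = null; // null means "not recording"
  return {
    start() {
      entries = [];
    },
    stop() {
      const session = { interactions: entries ?? [] };
      entries = null;
      return session;
    },
    // Returns a function with the same call shape as `fn`, plus logging.
    wrap(tool, fn) {
      return async (args) => {
        const startedAt = now();
        const result = await fn(args);
        if (entries) {
          entries.push({ tool, args, result, durationMs: now() - startedAt, pageUrl: getPageUrl() });
        }
        return result;
      };
    },
  };
}

// Demo with a stand-in click tool and a fixed page URL.
const recorder = createRecorder(() => 'https://example.com');
const click = recorder.wrap('click', async (args) => ({ clicked: args.selector }));
recorder.start();
click({ selector: 'button.submit' }).then(() => console.log(recorder.stop()));
```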
|
|
607
|
+
|
|
304
608
|
## Auto-Settling
|
|
305
609
|
|
|
306
|
-
Mutative tools (`browser_click`, `browser_type`, `browser_navigate`, `browser_select_option`) use
|
|
610
|
+
Mutative tools (`browser_click`, `browser_type`, `browser_navigate`, `browser_select_option`) use actionability checks:
|
|
307
611
|
|
|
308
612
|
1. **Pre-flight**: Polls element visibility, stability, and enabled state before acting
|
|
309
613
|
2. **Action**: Dispatches the CDP command
|
|
@@ -326,6 +630,9 @@ Multi-framework detection returns a **primary** framework (meta-framework takes
|
|
|
326
630
|
|
|
327
631
|
## Tools Reference
|
|
328
632
|
|
|
633
|
+
<details>
|
|
634
|
+
<summary><b>All 100 tools</b> — Connection, Capture, Navigation, Network, Storage, Emulation, and more</summary>
|
|
635
|
+
|
|
329
636
|
### Connection & Status
|
|
330
637
|
|
|
331
638
|
| Tool | Description |
|
|
@@ -485,6 +792,15 @@ Multi-framework detection returns a **primary** framework (meta-framework takes
|
|
|
485
792
|
| `show_layout_shifts` | Visualize CLS regions |
|
|
486
793
|
| `show_paint_rects` | Visualize paint/repaint areas |
|
|
487
794
|
|
|
795
|
+
### Session Recording
|
|
796
|
+
|
|
797
|
+
| Tool | Description |
|
|
798
|
+
|------|-------------|
|
|
799
|
+
| `start_recording` | Begin recording browser session |
|
|
800
|
+
| `stop_recording` | Stop recording and return session data |
|
|
801
|
+
| `get_recording_status` | Check recording state |
|
|
802
|
+
| `compile_recording` | Compile session into SKILL.md automation |
|
|
803
|
+
|
|
488
804
|
### Crawlio App Integration
|
|
489
805
|
|
|
490
806
|
> Optional — requires [Crawlio.app](https://crawlio.app) running locally.
|
|
@@ -497,12 +813,24 @@ Multi-framework detection returns a **primary** framework (meta-framework takes
|
|
|
497
813
|
| `get_crawled_urls` | Get crawled URLs with status and pagination |
|
|
498
814
|
| `enrich_url` | Navigate + capture + submit enrichment in one call |
|
|
499
815
|
|
|
816
|
+
</details>
|
|
817
|
+
|
|
500
818
|
## Requirements
|
|
501
819
|
|
|
502
820
|
- **Node.js** >= 18
|
|
503
|
-
- **Chrome** (or Chromium) with the [Crawlio Agent extension](https://crawlio.app/agent) installed
|
|
821
|
+
- **Chrome** (or Chromium) with the [Crawlio Agent extension](https://www.crawlio.app/browser-agent) installed
|
|
504
822
|
- **Crawlio.app** (optional) — for site crawling and enrichment
|
|
505
823
|
|
|
824
|
+
## Resources
|
|
825
|
+
|
|
826
|
+
- [Documentation](https://docs.crawlio.app/browser-agent/overview)
|
|
827
|
+
- [API Reference](https://docs.crawlio.app/browser-agent/tools)
|
|
828
|
+
- [Product Page](https://www.crawlio.app/browser-agent)
|
|
829
|
+
- [Chrome Extension](https://www.crawlio.app/browser-agent)
|
|
830
|
+
- [npm Package](https://www.npmjs.com/package/crawlio-browser)
|
|
831
|
+
- [Changelog](https://github.com/Crawlio-app/crawlio-browser-agent/releases)
|
|
832
|
+
- [Previous repo](https://github.com/AshDevFr/crawlio-browser-mcp) — this project supersedes `crawlio-browser-mcp`
|
|
833
|
+
|
|
506
834
|
## License
|
|
507
835
|
|
|
508
836
|
MIT
|