@mastra/mcp-docs-server 1.1.22-alpha.12 → 1.1.22-alpha.13
- package/.docs/docs/browser/agent-browser.md +75 -0
- package/.docs/docs/browser/overview.md +136 -0
- package/.docs/docs/browser/stagehand.md +128 -0
- package/.docs/reference/browser/agent-browser.md +374 -0
- package/.docs/reference/browser/mastra-browser.md +284 -0
- package/.docs/reference/browser/stagehand-browser.md +290 -0
- package/.docs/reference/index.md +3 -0
- package/CHANGELOG.md +7 -0
- package/package.json +3 -3
|
@@ -0,0 +1,75 @@
|
|
|
1
|
+
# AgentBrowser
|
|
2
|
+
|
|
3
|
+
The `@mastra/agent-browser` package provides browser automation using Playwright with accessibility-first element targeting. Elements are identified by refs from the page's accessibility tree, making interactions reliable across different page layouts.
|
|
4
|
+
|
|
5
|
+
## When to use AgentBrowser
|
|
6
|
+
|
|
7
|
+
Use AgentBrowser when you need:
|
|
8
|
+
|
|
9
|
+
- Reliable element targeting through accessibility refs
|
|
10
|
+
- Fine-grained control over browser actions
|
|
11
|
+
- Playwright's robust automation capabilities
|
|
12
|
+
- Support for keyboard shortcuts and complex interactions
|
|
13
|
+
|
|
14
|
+
## Quickstart
|
|
15
|
+
|
|
16
|
+
Install the package:
|
|
17
|
+
|
|
18
|
+
**npm**:
|
|
19
|
+
|
|
20
|
+
```bash
|
|
21
|
+
npm install @mastra/agent-browser
|
|
22
|
+
```
|
|
23
|
+
|
|
24
|
+
**pnpm**:
|
|
25
|
+
|
|
26
|
+
```bash
|
|
27
|
+
pnpm add @mastra/agent-browser
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
**Yarn**:
|
|
31
|
+
|
|
32
|
+
```bash
|
|
33
|
+
yarn add @mastra/agent-browser
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
**Bun**:
|
|
37
|
+
|
|
38
|
+
```bash
|
|
39
|
+
bun add @mastra/agent-browser
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
Create a browser instance and assign it to an agent:
|
|
43
|
+
|
|
44
|
+
```typescript
|
|
45
|
+
import { Agent } from '@mastra/core/agent'
|
|
46
|
+
import { AgentBrowser } from '@mastra/agent-browser'
|
|
47
|
+
|
|
48
|
+
const browser = new AgentBrowser({
|
|
49
|
+
headless: false,
|
|
50
|
+
})
|
|
51
|
+
|
|
52
|
+
export const browserAgent = new Agent({
|
|
53
|
+
id: 'browser-agent',
|
|
54
|
+
model: 'openai/gpt-5.4',
|
|
55
|
+
browser,
|
|
56
|
+
instructions: `You are a web automation assistant.
|
|
57
|
+
|
|
58
|
+
When interacting with pages:
|
|
59
|
+
1. Use browser_snapshot to get the current page state and element refs
|
|
60
|
+
2. Use the refs (like @e1, @e2) to target elements for clicks and typing
|
|
61
|
+
3. After actions, take another snapshot to verify the result`,
|
|
62
|
+
})
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
## Element refs
|
|
66
|
+
|
|
67
|
+
AgentBrowser uses accessibility tree refs to identify elements. When an agent calls `browser_snapshot`, it receives a text representation of the page with refs like `@e1`, `@e2`, etc. The agent then uses these refs with other tools to interact with elements.
|
|
68
|
+
|
|
69
|
+
> **Note:** See [AgentBrowser reference](https://mastra.ai/reference/browser/agent-browser) for all configuration options and tool details.
|
|
70
|
+
|
|
71
|
+
## Related
|
|
72
|
+
|
|
73
|
+
- [Browser overview](https://mastra.ai/docs/browser/overview)
|
|
74
|
+
- [Stagehand](https://mastra.ai/docs/browser/stagehand)
|
|
75
|
+
- [AgentBrowser reference](https://mastra.ai/reference/browser/agent-browser)
|
|
@@ -0,0 +1,136 @@
|
|
|
1
|
+
# Browser overview
|
|
2
|
+
|
|
3
|
+
Browser support enables agents to navigate websites, interact with page elements, fill forms, and extract data. Mastra provides browser capabilities through SDK providers that wrap popular browser automation libraries.
|
|
4
|
+
|
|
5
|
+
Mastra supports two browser SDK providers:
|
|
6
|
+
|
|
7
|
+
- [**AgentBrowser**](https://mastra.ai/docs/browser/agent-browser): A Playwright-based provider with accessibility-first element targeting. Best for general web automation and scraping.
|
|
8
|
+
- [**Stagehand**](https://mastra.ai/docs/browser/stagehand): A Browserbase provider with AI-powered element detection. Best for complex interactions that benefit from natural language selectors.
|
|
9
|
+
|
|
10
|
+
## When to use browser
|
|
11
|
+
|
|
12
|
+
Use browser when your agent needs to:
|
|
13
|
+
|
|
14
|
+
- Navigate websites and interact with page elements
|
|
15
|
+
- Fill out forms and submit data
|
|
16
|
+
- Extract structured data from web pages
|
|
17
|
+
- Automate multi-step web workflows
|
|
18
|
+
- Take actions that require a real browser (JavaScript rendering, authentication flows)
|
|
19
|
+
|
|
20
|
+
## How it works
|
|
21
|
+
|
|
22
|
+
When you assign a browser to an agent, Mastra includes the provider's tools in the agent's toolset. The agent uses these tools to control the browser: navigating to URLs, selecting elements, typing text, and reading page content.
|
|
23
|
+
|
|
24
|
+
Each provider offers a different set of tools optimized for its approach.
|
|
25
|
+
|
|
26
|
+
## Quickstart
|
|
27
|
+
|
|
28
|
+
Install your provider of choice. This example uses the AgentBrowser provider.
|
|
29
|
+
|
|
30
|
+
**npm**:
|
|
31
|
+
|
|
32
|
+
```bash
|
|
33
|
+
npm install @mastra/agent-browser
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
**pnpm**:
|
|
37
|
+
|
|
38
|
+
```bash
|
|
39
|
+
pnpm add @mastra/agent-browser
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
**Yarn**:
|
|
43
|
+
|
|
44
|
+
```bash
|
|
45
|
+
yarn add @mastra/agent-browser
|
|
46
|
+
```
|
|
47
|
+
|
|
48
|
+
**Bun**:
|
|
49
|
+
|
|
50
|
+
```bash
|
|
51
|
+
bun add @mastra/agent-browser
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
Create a new browser instance:
|
|
55
|
+
|
|
56
|
+
```typescript
|
|
57
|
+
import { AgentBrowser } from '@mastra/agent-browser'
|
|
58
|
+
|
|
59
|
+
export const browser = new AgentBrowser({
|
|
60
|
+
headless: false,
|
|
61
|
+
})
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
Assign the browser to an agent:
|
|
65
|
+
|
|
66
|
+
```typescript
|
|
67
|
+
import { Agent } from '@mastra/core/agent'
|
|
68
|
+
import { browser } from '../browsers'
|
|
69
|
+
|
|
70
|
+
export const webAgent = new Agent({
|
|
71
|
+
id: 'web-agent',
|
|
72
|
+
model: 'openai/gpt-5.4',
|
|
73
|
+
browser,
|
|
74
|
+
instructions:
|
|
75
|
+
'You are a web automation assistant. Use browser tools to navigate websites and complete tasks.',
|
|
76
|
+
})
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
The agent automatically receives all browser tools from the provider.
|
|
80
|
+
|
|
81
|
+
## Cloud providers
|
|
82
|
+
|
|
83
|
+
Both SDK providers support connecting to cloud browser services instead of launching a local browser.
|
|
84
|
+
|
|
85
|
+
### Browserbase (Stagehand native)
|
|
86
|
+
|
|
87
|
+
Stagehand has native Browserbase integration:
|
|
88
|
+
|
|
89
|
+
```typescript
|
|
90
|
+
import { StagehandBrowser } from '@mastra/stagehand'
|
|
91
|
+
|
|
92
|
+
const browser = new StagehandBrowser({
|
|
93
|
+
env: 'BROWSERBASE',
|
|
94
|
+
apiKey: process.env.BROWSERBASE_API_KEY,
|
|
95
|
+
projectId: process.env.BROWSERBASE_PROJECT_ID,
|
|
96
|
+
})
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
### CDP URL (any provider)
|
|
100
|
+
|
|
101
|
+
Connect to any browser exposing a Chrome DevTools Protocol (CDP) endpoint:
|
|
102
|
+
|
|
103
|
+
```typescript
|
|
104
|
+
import { AgentBrowser } from '@mastra/agent-browser'
|
|
105
|
+
|
|
106
|
+
const browser = new AgentBrowser({
|
|
107
|
+
cdpUrl: process.env.BROWSER_CDP_URL,
|
|
108
|
+
headless: true,
|
|
109
|
+
})
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
This works with any [CDP-compatible](https://chromedevtools.github.io/devtools-protocol/) browser service.
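
The reference documents `cdpUrl` as either a string or a sync/async function that returns one. Here is a minimal sketch of how such a value could be normalized before connecting. `CdpUrlSource` and `resolveCdpUrl` are hypothetical names used for illustration only; they are not part of the Mastra API, and the protocol check is an assumption.

```typescript
// Hypothetical helper. Mirrors the documented `cdpUrl` type:
// string | (() => string | Promise<string>)
type CdpUrlSource = string | (() => string | Promise<string>)

async function resolveCdpUrl(source: CdpUrlSource): Promise<string> {
  // Call the provider function if one was given; `await` covers sync and async returns.
  const url = typeof source === 'function' ? await source() : source
  // Assumption: only WebSocket and HTTP endpoints are meaningful here.
  if (!/^(wss?|https?):\/\//.test(url)) {
    throw new Error(`Unsupported CDP URL: ${url}`)
  }
  return url
}
```

A provider function may be useful when a cloud service issues a fresh session URL for each connection, since the URL is then computed at connect time rather than at module load.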
|
|
113
|
+
|
|
114
|
+
## Screencast
|
|
115
|
+
|
|
116
|
+
Browser providers stream a live video feed of the browser to the Mastra Studio UI. This lets you watch the agent interact with pages in real time.
|
|
117
|
+
|
|
118
|
+
Screencast is enabled by default and can be configured:
|
|
119
|
+
|
|
120
|
+
```typescript
|
|
121
|
+
const browser = new AgentBrowser({
|
|
122
|
+
screencast: {
|
|
123
|
+
enabled: true,
|
|
124
|
+
format: 'jpeg',
|
|
125
|
+
quality: 80,
|
|
126
|
+
maxWidth: 1280,
|
|
127
|
+
maxHeight: 720,
|
|
128
|
+
},
|
|
129
|
+
})
|
|
130
|
+
```
|
|
131
|
+
|
|
132
|
+
## Next steps
|
|
133
|
+
|
|
134
|
+
- [AgentBrowser](https://mastra.ai/docs/browser/agent-browser)
|
|
135
|
+
- [Stagehand](https://mastra.ai/docs/browser/stagehand)
|
|
136
|
+
- [MastraBrowser reference](https://mastra.ai/reference/browser/mastra-browser)
|
|
@@ -0,0 +1,128 @@
|
|
|
1
|
+
# Stagehand
|
|
2
|
+
|
|
3
|
+
The `@mastra/stagehand` package provides browser automation using the [Stagehand SDK](https://docs.browserbase.com/stagehand/introduction) from Browserbase. Stagehand uses AI to understand page context and locate elements, enabling natural language descriptions instead of explicit selectors.
|
|
4
|
+
|
|
5
|
+
## When to use Stagehand
|
|
6
|
+
|
|
7
|
+
Use Stagehand when you need:
|
|
8
|
+
|
|
9
|
+
- Natural language element targeting ("select the login button")
|
|
10
|
+
- AI-powered data extraction from pages
|
|
11
|
+
- Native Browserbase cloud integration
|
|
12
|
+
- Simpler tool interface for common actions
|
|
13
|
+
|
|
14
|
+
## Quickstart
|
|
15
|
+
|
|
16
|
+
Install the package:
|
|
17
|
+
|
|
18
|
+
**npm**:
|
|
19
|
+
|
|
20
|
+
```bash
|
|
21
|
+
npm install @mastra/stagehand
|
|
22
|
+
```
|
|
23
|
+
|
|
24
|
+
**pnpm**:
|
|
25
|
+
|
|
26
|
+
```bash
|
|
27
|
+
pnpm add @mastra/stagehand
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
**Yarn**:
|
|
31
|
+
|
|
32
|
+
```bash
|
|
33
|
+
yarn add @mastra/stagehand
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
**Bun**:
|
|
37
|
+
|
|
38
|
+
```bash
|
|
39
|
+
bun add @mastra/stagehand
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
Create a browser instance and assign it to an agent:
|
|
43
|
+
|
|
44
|
+
```typescript
|
|
45
|
+
import { Agent } from '@mastra/core/agent'
|
|
46
|
+
import { StagehandBrowser } from '@mastra/stagehand'
|
|
47
|
+
|
|
48
|
+
const browser = new StagehandBrowser({
|
|
49
|
+
headless: false,
|
|
50
|
+
model: 'openai/gpt-5.4',
|
|
51
|
+
})
|
|
52
|
+
|
|
53
|
+
export const stagehandAgent = new Agent({
|
|
54
|
+
id: 'stagehand-agent',
|
|
55
|
+
model: 'openai/gpt-5.4',
|
|
56
|
+
browser,
|
|
57
|
+
instructions: `You are a web automation assistant.
|
|
58
|
+
|
|
59
|
+
Use stagehand tools to interact with pages:
|
|
60
|
+
- stagehand_navigate to go to URLs
|
|
61
|
+
- stagehand_act to perform actions described in natural language
|
|
62
|
+
- stagehand_extract to get structured data from the page
|
|
63
|
+
- stagehand_observe to find available actions on the page`,
|
|
64
|
+
})
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
## Natural language actions
|
|
68
|
+
|
|
69
|
+
When the agent uses the `stagehand_act` tool, it accepts natural language descriptions of actions:
|
|
70
|
+
|
|
71
|
+
- "Press the Sign In button"
|
|
72
|
+
- "Type 'user@example.com' in the email field"
|
|
73
|
+
- "Select 'United States' from the country dropdown"
|
|
74
|
+
|
|
75
|
+
Stagehand's AI interprets the action and finds the appropriate element on the page.
|
|
76
|
+
|
|
77
|
+
## Data extraction
|
|
78
|
+
|
|
79
|
+
When the agent uses the `stagehand_extract` tool, it can pull structured data from pages.
|
|
80
|
+
|
|
81
|
+
Example instruction: "Extract the product name, price, and availability"
|
|
82
|
+
|
|
83
|
+
The tool returns structured data based on page content:
|
|
84
|
+
|
|
85
|
+
```json
|
|
86
|
+
{
|
|
87
|
+
"name": "Widget Pro",
|
|
88
|
+
"price": "$29.99",
|
|
89
|
+
"availability": "In Stock"
|
|
90
|
+
}
|
|
91
|
+
```
|
|
92
|
+
|
|
93
|
+
## Observing actions
|
|
94
|
+
|
|
95
|
+
When the agent uses the `stagehand_observe` tool, it analyzes the current page and returns possible actions.
|
|
96
|
+
|
|
97
|
+
Example instruction: "What actions can I take on this login form?"
|
|
98
|
+
|
|
99
|
+
The tool returns a list of available actions:
|
|
100
|
+
|
|
101
|
+
```json
|
|
102
|
+
[
|
|
103
|
+
{ "action": "Press 'Sign In' button", "element": "button" },
|
|
104
|
+
{ "action": "Type in 'Email' field", "element": "input" },
|
|
105
|
+
{ "action": "Open 'Forgot Password' link", "element": "a" }
|
|
106
|
+
]
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
## Browserbase
|
|
110
|
+
|
|
111
|
+
Stagehand has native Browserbase integration for cloud browser infrastructure:
|
|
112
|
+
|
|
113
|
+
```typescript
|
|
114
|
+
const browser = new StagehandBrowser({
|
|
115
|
+
env: 'BROWSERBASE',
|
|
116
|
+
apiKey: process.env.BROWSERBASE_API_KEY,
|
|
117
|
+
projectId: process.env.BROWSERBASE_PROJECT_ID,
|
|
118
|
+
model: 'openai/gpt-5.4',
|
|
119
|
+
})
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
> **Note:** See [StagehandBrowser reference](https://mastra.ai/reference/browser/stagehand-browser) for all configuration options.
|
|
123
|
+
|
|
124
|
+
## Related
|
|
125
|
+
|
|
126
|
+
- [Browser overview](https://mastra.ai/docs/browser/overview)
|
|
127
|
+
- [AgentBrowser](https://mastra.ai/docs/browser/agent-browser)
|
|
128
|
+
- [StagehandBrowser reference](https://mastra.ai/reference/browser/stagehand-browser)
|
|
@@ -0,0 +1,374 @@
|
|
|
1
|
+
# AgentBrowser class
|
|
2
|
+
|
|
3
|
+
The `AgentBrowser` class provides deterministic browser automation using the [agent-browser](https://github.com/vercel-labs/agent-browser) library. It uses accessibility tree snapshots and element refs (e.g., `@e5`) for precise, reproducible interactions.
|
|
4
|
+
|
|
5
|
+
Use `AgentBrowser` when you need reliable, deterministic browser automation. For AI-powered interactions using natural language, see [`StagehandBrowser`](https://mastra.ai/reference/browser/stagehand-browser).
|
|
6
|
+
|
|
7
|
+
## Usage example
|
|
8
|
+
|
|
9
|
+
```typescript
|
|
10
|
+
import { Agent } from '@mastra/core/agent'
|
|
11
|
+
import { AgentBrowser } from '@mastra/agent-browser'
|
|
12
|
+
|
|
13
|
+
const browser = new AgentBrowser({
|
|
14
|
+
headless: true,
|
|
15
|
+
viewport: { width: 1280, height: 720 },
|
|
16
|
+
scope: 'thread',
|
|
17
|
+
})
|
|
18
|
+
|
|
19
|
+
export const browserAgent = new Agent({
|
|
20
|
+
name: 'browser-agent',
|
|
21
|
+
instructions: `You can browse the web. Use browser_snapshot to see the page structure,
|
|
22
|
+
then interact with elements using their refs (e.g., @e5).`,
|
|
23
|
+
model: 'openai/gpt-5.4',
|
|
24
|
+
browser,
|
|
25
|
+
})
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
## Constructor parameters
|
|
29
|
+
|
|
30
|
+
**headless** (`boolean`): Whether to run the browser in headless mode (no visible UI). (Default: `true`)
|
|
31
|
+
|
|
32
|
+
**viewport** (`{ width: number; height: number }`): Browser viewport dimensions. (Default: `{ width: 1280, height: 720 }`)
|
|
33
|
+
|
|
34
|
+
**timeout** (`number`): Default timeout in milliseconds for browser operations. (Default: `30000`)
|
|
35
|
+
|
|
36
|
+
**cdpUrl** (`string | (() => string | Promise<string>)`): CDP WebSocket URL for connecting to an existing browser. Useful for cloud browser providers.
|
|
37
|
+
|
|
38
|
+
**scope** (`'shared' | 'thread'`): Browser instance scope. 'shared' shares one browser across all threads. 'thread' gives each thread its own browser. (Default: `'thread' (or 'shared' when cdpUrl is provided)`)
|
|
39
|
+
|
|
40
|
+
**onLaunch** (`(args: { browser: MastraBrowser }) => void | Promise<void>`): Callback invoked after the browser is ready.
|
|
41
|
+
|
|
42
|
+
**onClose** (`(args: { browser: MastraBrowser }) => void | Promise<void>`): Callback invoked before the browser closes.
|
|
43
|
+
|
|
44
|
+
**screencast** (`ScreencastOptions`): Configuration for streaming browser frames to Studio.
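
To picture the difference between the two `scope` values described above, the following sketch models a registry that hands out one instance to every thread (`'shared'`) or one instance per thread (`'thread'`). It is an illustrative model under stated assumptions, not Mastra's implementation; `SessionRegistry` and its methods are hypothetical.

```typescript
// Illustrative model only; not how Mastra manages browsers internally.
type Scope = 'shared' | 'thread'

class SessionRegistry<S> {
  private readonly scope: Scope
  private readonly create: () => S
  private shared: S | undefined = undefined
  private readonly perThread = new Map<string, S>()

  constructor(scope: Scope, create: () => S) {
    this.scope = scope
    this.create = create
  }

  // Return the session that the given thread would use.
  get(threadId: string): S {
    if (this.scope === 'shared') {
      // One instance serves every thread, created lazily.
      if (this.shared === undefined) this.shared = this.create()
      return this.shared
    }
    // One instance per thread, created on first use.
    let session = this.perThread.get(threadId)
    if (session === undefined) {
      session = this.create()
      this.perThread.set(threadId, session)
    }
    return session
  }
}
```

With `'thread'` scope, two threads never see each other's pages or cookies in this model; with `'shared'` scope, they operate on the same instance.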
|
|
45
|
+
|
|
46
|
+
## Tools
|
|
47
|
+
|
|
48
|
+
`AgentBrowser` provides 15 deterministic tools for browser automation. All tools that interact with elements use refs from the accessibility tree snapshot.
|
|
49
|
+
|
|
50
|
+
### Core tools
|
|
51
|
+
|
|
52
|
+
| Tool | Description |
|
|
53
|
+
| ------------------ | ------------------------------------------------- |
|
|
54
|
+
| `browser_goto` | Navigate to a URL |
|
|
55
|
+
| `browser_snapshot` | Get accessibility tree snapshot with element refs |
|
|
56
|
+
| `browser_click` | Click an element by ref |
|
|
57
|
+
| `browser_type` | Type text into an element |
|
|
58
|
+
| `browser_press` | Press keyboard keys |
|
|
59
|
+
| `browser_select` | Select option from dropdown |
|
|
60
|
+
| `browser_scroll` | Scroll the page or element |
|
|
61
|
+
| `browser_close` | Close the browser |
|
|
62
|
+
|
|
63
|
+
### Extended tools
|
|
64
|
+
|
|
65
|
+
| Tool | Description |
|
|
66
|
+
| ------------------ | ----------------------------------------------- |
|
|
67
|
+
| `browser_hover` | Hover over an element |
|
|
68
|
+
| `browser_back` | Go back in browser history |
|
|
69
|
+
| `browser_dialog` | Handle browser dialogs (alert, confirm, prompt) |
|
|
70
|
+
| `browser_wait` | Wait for element state changes |
|
|
71
|
+
| `browser_tabs` | Manage browser tabs (list, new, switch, close) |
|
|
72
|
+
| `browser_drag` | Drag and drop elements |
|
|
73
|
+
| `browser_evaluate` | Execute JavaScript in the page (escape hatch) |
|
|
74
|
+
|
|
75
|
+
## Tool reference
|
|
76
|
+
|
|
77
|
+
### `browser_goto`
|
|
78
|
+
|
|
79
|
+
Navigate to a URL.
|
|
80
|
+
|
|
81
|
+
```text
|
|
82
|
+
// Tool input
|
|
83
|
+
{
|
|
84
|
+
"url": "https://example.com",
|
|
85
|
+
"waitUntil": "domcontentloaded",
|
|
86
|
+
"timeout": 30000
|
|
87
|
+
}
|
|
88
|
+
```
|
|
89
|
+
|
|
90
|
+
| Parameter | Type | Description |
|
|
91
|
+
| ----------- | ----------------------------------------------- | ----------------------------------------------- |
|
|
92
|
+
| `url` | `string` | URL to navigate to |
|
|
93
|
+
| `waitUntil` | `"load" \| "domcontentloaded" \| "networkidle"` | When to consider navigation complete (optional) |
|
|
94
|
+
| `timeout` | `number` | Navigation timeout in ms (optional) |
|
|
95
|
+
|
|
96
|
+
### `browser_snapshot`
|
|
97
|
+
|
|
98
|
+
Get an accessibility tree snapshot of the page. Returns element refs like `@e5` that you use with other tools.
|
|
99
|
+
|
|
100
|
+
```text
|
|
101
|
+
// Tool input
|
|
102
|
+
{
|
|
103
|
+
"interactiveOnly": true,
|
|
104
|
+
"maxDepth": 10
|
|
105
|
+
}
|
|
106
|
+
```
|
|
107
|
+
|
|
108
|
+
| Parameter | Type | Description |
|
|
109
|
+
| ----------------- | --------- | -------------------------------------------- |
|
|
110
|
+
| `interactiveOnly` | `boolean` | Only include interactive elements (optional) |
|
|
111
|
+
| `maxDepth` | `number` | Maximum tree depth (optional) |
|
|
112
|
+
|
|
113
|
+
**Example output:**
|
|
114
|
+
|
|
115
|
+
```text
|
|
116
|
+
[document] Example Page
|
|
117
|
+
[banner]
|
|
118
|
+
[link @e1] Home
|
|
119
|
+
[link @e2] About
|
|
120
|
+
[main]
|
|
121
|
+
[heading @e3] Welcome
|
|
122
|
+
[textbox @e4] Search...
|
|
123
|
+
[button @e5] Submit
|
|
124
|
+
```
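
If you need to work with a snapshot programmatically (for example in a test), the refs can be pulled out of the text. This is a minimal sketch that assumes every line looks like the example output above (`[role @eN] name`); the real snapshot format may carry more detail, and `parseSnapshotRefs` is a hypothetical helper, not a Mastra API.

```typescript
interface SnapshotElement {
  ref: string // e.g. "@e5"
  role: string // e.g. "button"
  name: string // accessible name; may be empty
}

// Extract every element that carries a ref from snapshot text.
function parseSnapshotRefs(snapshot: string): SnapshotElement[] {
  const pattern = /^\s*\[([\w-]+) (@e\d+)\]\s*(.*)$/
  const elements: SnapshotElement[] = []
  for (const line of snapshot.split('\n')) {
    const match = pattern.exec(line)
    // Lines without a ref (such as "[banner]") are structural and skipped.
    if (match) {
      elements.push({ role: match[1], ref: match[2], name: match[3].trim() })
    }
  }
  return elements
}
```

Looking up an element by role and name, then passing its `ref` to a tool such as `browser_click`, follows the same snapshot-then-act pattern the agent uses.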
|
|
125
|
+
|
|
126
|
+
### `browser_click`
|
|
127
|
+
|
|
128
|
+
Click an element using its ref from the snapshot.
|
|
129
|
+
|
|
130
|
+
```text
|
|
131
|
+
{
|
|
132
|
+
"ref": "@e5",
|
|
133
|
+
"button": "left",
|
|
134
|
+
"clickCount": 1,
|
|
135
|
+
"modifiers": ["Control", "Shift"]
|
|
136
|
+
}
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
| Parameter | Type | Description |
|
|
140
|
+
| ------------ | ------------------------------- | ---------------------------------------------- |
|
|
141
|
+
| `ref` | `string` | Element ref from snapshot (required) |
|
|
142
|
+
| `button` | `"left" \| "right" \| "middle"` | Mouse button (optional) |
|
|
143
|
+
| `clickCount` | `number` | Number of clicks; use 2 for a double-click (optional) |
|
|
144
|
+
| `modifiers` | `string[]` | Modifier keys (optional) |
|
|
145
|
+
|
|
146
|
+
### `browser_type`
|
|
147
|
+
|
|
148
|
+
Type text into an input element.
|
|
149
|
+
|
|
150
|
+
```text
|
|
151
|
+
// Tool input
|
|
152
|
+
{
|
|
153
|
+
"ref": "@e4",
|
|
154
|
+
"text": "search query",
|
|
155
|
+
"clear": true,
|
|
156
|
+
"delay": 50
|
|
157
|
+
}
|
|
158
|
+
```
|
|
159
|
+
|
|
160
|
+
| Parameter | Type | Description |
|
|
161
|
+
| --------- | --------- | ----------------------------------------- |
|
|
162
|
+
| `ref` | `string` | Element ref from snapshot (required) |
|
|
163
|
+
| `text` | `string` | Text to type (required) |
|
|
164
|
+
| `clear` | `boolean` | Clear existing content first (optional) |
|
|
165
|
+
| `delay` | `number` | Delay between keystrokes in ms (optional) |
|
|
166
|
+
|
|
167
|
+
### `browser_press`
|
|
168
|
+
|
|
169
|
+
Press keyboard keys.
|
|
170
|
+
|
|
171
|
+
```text
|
|
172
|
+
// Tool input
|
|
173
|
+
{
|
|
174
|
+
"key": "Enter",
|
|
175
|
+
"modifiers": ["Control"]
|
|
176
|
+
}
|
|
177
|
+
|
|
178
|
+
// Key combinations
|
|
179
|
+
{ "key": "Control+a" }
|
|
180
|
+
{ "key": "Control+c" }
|
|
181
|
+
```
|
|
182
|
+
|
|
183
|
+
| Parameter | Type | Description |
|
|
184
|
+
| ----------- | ---------- | ----------------------------------------------------------------- |
|
|
185
|
+
| `key` | `string` | Key name (e.g., "Enter", "Tab", "Escape", "Control+a") (required) |
|
|
186
|
+
| `modifiers` | `string[]` | Modifier keys (optional) |
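
The examples above show two ways to express modifiers: a `modifiers` array and a combined string such as `"Control+a"`. The sketch below splits a combined string into its parts. It assumes the two forms describe the same key press, which this reference does not state explicitly, and `parseKeyCombo` is a hypothetical helper.

```typescript
interface KeyPress {
  key: string
  modifiers: string[]
}

// Split a combination such as "Control+a" into modifiers and the final key.
// Assumes the final key is not "+" itself.
function parseKeyCombo(combo: string): KeyPress {
  const parts = combo.split('+')
  const key = parts[parts.length - 1]
  return { key, modifiers: parts.slice(0, -1) }
}
```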
|
|
187
|
+
|
|
188
|
+
### `browser_select`
|
|
189
|
+
|
|
190
|
+
Select an option from a dropdown. Provide one of `value`, `label`, or `index`.
|
|
191
|
+
|
|
192
|
+
```text
|
|
193
|
+
// Tool input - by value
|
|
194
|
+
{
|
|
195
|
+
"ref": "@e10",
|
|
196
|
+
"value": "option-value"
|
|
197
|
+
}
|
|
198
|
+
|
|
199
|
+
// Tool input - by label
|
|
200
|
+
{
|
|
201
|
+
"ref": "@e10",
|
|
202
|
+
"label": "Option Text"
|
|
203
|
+
}
|
|
204
|
+
|
|
205
|
+
// Tool input - by index
|
|
206
|
+
{
|
|
207
|
+
"ref": "@e10",
|
|
208
|
+
"index": 0
|
|
209
|
+
}
|
|
210
|
+
```
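
Because the input takes one of `value`, `label`, or `index`, a caller may want to validate it before invoking the tool. The sketch below reads "one of" as "exactly one", which is an assumption; `SelectInput` and `selectorKind` are hypothetical names.

```typescript
interface SelectInput {
  ref: string
  value?: string
  label?: string
  index?: number
}

// Return which selector was provided; throw unless exactly one is present.
function selectorKind(input: SelectInput): 'value' | 'label' | 'index' {
  const kinds = (['value', 'label', 'index'] as const).filter(
    // Compare against undefined so that index 0 and empty strings still count.
    (kind) => input[kind] !== undefined,
  )
  if (kinds.length !== 1) {
    throw new Error(`Expected exactly one of value, label, or index; got ${kinds.length}`)
  }
  return kinds[0]
}
```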
|
|
211
|
+
|
|
212
|
+
### `browser_scroll`
|
|
213
|
+
|
|
214
|
+
Scroll the page or a specific element.
|
|
215
|
+
|
|
216
|
+
```text
|
|
217
|
+
// Tool input
|
|
218
|
+
{
|
|
219
|
+
"direction": "down",
|
|
220
|
+
"amount": 300,
|
|
221
|
+
"ref": "@e15"
|
|
222
|
+
}
|
|
223
|
+
```
|
|
224
|
+
|
|
225
|
+
| Parameter | Type | Description |
|
|
226
|
+
| ----------- | ------------------------------------- | ----------------------------------------------------- |
|
|
227
|
+
| `direction` | `"up" \| "down" \| "left" \| "right"` | Scroll direction (required) |
|
|
228
|
+
| `amount` | `number` | Pixels to scroll, default 300 (optional) |
|
|
229
|
+
| `ref` | `string` | Element to scroll, scrolls page if omitted (optional) |
|
|
230
|
+
|
|
231
|
+
### `browser_hover`
|
|
232
|
+
|
|
233
|
+
Hover over an element to trigger hover effects.
|
|
234
|
+
|
|
235
|
+
```text
|
|
236
|
+
// Tool input
|
|
237
|
+
{
|
|
238
|
+
"ref": "@e7"
|
|
239
|
+
}
|
|
240
|
+
```
|
|
241
|
+
|
|
242
|
+
### `browser_back`
|
|
243
|
+
|
|
244
|
+
Go back in browser history.
|
|
245
|
+
|
|
246
|
+
```text
|
|
247
|
+
// Tool input (no parameters required)
|
|
248
|
+
|
|
249
|
+
```
|
|
250
|
+
|
|
251
|
+
### `browser_dialog`
|
|
252
|
+
|
|
253
|
+
Handle browser dialogs (alert, confirm, prompt). The tool clicks the element that triggers the dialog, then accepts or dismisses it.
|
|
254
|
+
|
|
255
|
+
```text
|
|
256
|
+
// Tool input
|
|
257
|
+
{
|
|
258
|
+
"triggerRef": "@e5",
|
|
259
|
+
"action": "accept",
|
|
260
|
+
"text": "response"
|
|
261
|
+
}
|
|
262
|
+
```
|
|
263
|
+
|
|
264
|
+
| Parameter | Type | Description |
|
|
265
|
+
| ------------ | ----------------------- | ------------------------------------------- |
|
|
266
|
+
| `triggerRef` | `string` | Element that triggers the dialog (required) |
|
|
267
|
+
| `action` | `"accept" \| "dismiss"` | How to handle the dialog (required) |
|
|
268
|
+
| `text` | `string` | Text for prompt dialogs (optional) |
|
|
269
|
+
|
|
270
|
+
### `browser_wait`
|
|
271
|
+
|
|
272
|
+
Wait for an element to reach a specific state.
|
|
273
|
+
|
|
274
|
+
```text
|
|
275
|
+
// Tool input
|
|
276
|
+
{
|
|
277
|
+
"ref": "@e20",
|
|
278
|
+
"state": "visible",
|
|
279
|
+
"timeout": 30000
|
|
280
|
+
}
|
|
281
|
+
```
|
|
282
|
+
|
|
283
|
+
| Parameter | Type | Description |
|
|
284
|
+
| --------- | --------------------------------------------------- | ---------------------------------- |
|
|
285
|
+
| `ref` | `string` | Element ref to wait for (optional) |
|
|
286
|
+
| `state` | `"visible" \| "hidden" \| "attached" \| "detached"` | State to wait for (optional) |
|
|
287
|
+
| `timeout` | `number` | Max wait time in ms (optional) |
|
|
288
|
+
|
|
289
|
+
### `browser_tabs`
|
|
290
|
+
|
|
291
|
+
Manage browser tabs.
|
|
292
|
+
|
|
293
|
+
```text
|
|
294
|
+
// List all tabs
|
|
295
|
+
{ "action": "list" }
|
|
296
|
+
|
|
297
|
+
// Open new tab
|
|
298
|
+
{ "action": "new", "url": "https://example.com" }
|
|
299
|
+
|
|
300
|
+
// Switch to tab by index
|
|
301
|
+
{ "action": "switch", "index": 0 }
|
|
302
|
+
|
|
303
|
+
// Close tab by index
|
|
304
|
+
{ "action": "close", "index": 1 }
|
|
305
|
+
```
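
The four input shapes above can be modeled as a discriminated union on `action`. This is a sketch inferred from the examples only: whether `url` is optional for `"new"` is an assumption, and `TabsInput` and `describeTabsInput` are hypothetical names.

```typescript
// Input shapes inferred from the examples above; not an official type.
type TabsInput =
  | { action: 'list' }
  | { action: 'new'; url?: string }
  | { action: 'switch'; index: number }
  | { action: 'close'; index: number }

// Produce a human-readable description, e.g. for logging tool calls.
function describeTabsInput(input: TabsInput): string {
  switch (input.action) {
    case 'list':
      return 'list all tabs'
    case 'new':
      return input.url ? `open ${input.url} in a new tab` : 'open a blank tab'
    case 'switch':
      return `switch to tab ${input.index}`
    case 'close':
      return `close tab ${input.index}`
  }
}
```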
|
|
306
|
+
|
|
307
|
+
### `browser_drag`
|
|
308
|
+
|
|
309
|
+
Drag an element to a target location.
|
|
310
|
+
|
|
311
|
+
```text
|
|
312
|
+
// Tool input
|
|
313
|
+
{
|
|
314
|
+
"sourceRef": "@e10",
|
|
315
|
+
"targetRef": "@e20"
|
|
316
|
+
}
|
|
317
|
+
```
|
|
318
|
+
|
|
319
|
+
| Parameter | Type | Description |
|
|
320
|
+
| ----------- | -------- | ------------------------------ |
|
|
321
|
+
| `sourceRef` | `string` | Element to drag (required) |
|
|
322
|
+
| `targetRef` | `string` | Drop target element (required) |
|
|
323
|
+
|
|
324
|
+
### `browser_evaluate`
|
|
325
|
+
|
|
326
|
+
Execute JavaScript in the page context. Use as an escape hatch when other tools don't cover your use case.
|
|
327
|
+
|
|
328
|
+
```text
|
|
329
|
+
// Tool input
|
|
330
|
+
{
|
|
331
|
+
"script": "document.title",
|
|
332
|
+
"returnValue": true
|
|
333
|
+
}
|
|
334
|
+
```
|
|
335
|
+
|
|
336
|
+
| Parameter | Type | Description |
|
|
337
|
+
| ------------- | --------- | --------------------------------------- |
|
|
338
|
+
| `script` | `string` | JavaScript to execute (required) |
|
|
339
|
+
| `returnValue` | `boolean` | Whether to return the result (optional) |
|
|
340
|
+
|
|
341
|
+
### `browser_close`
|
|
342
|
+
|
|
343
|
+
Close the browser and clean up resources.
|
|
344
|
+
|
|
345
|
+
```text
|
|
346
|
+
// Tool input (no parameters required)
|
|
347
|
+
|
|
348
|
+
```
|
|
349
|
+
|
|
350
|
+
## How refs work
|
|
351
|
+
|
|
352
|
+
The `browser_snapshot` tool returns an accessibility tree with element refs like `@e1`, `@e2`, etc. These refs are stable identifiers you use with other tools:
|
|
353
|
+
|
|
354
|
+
1. Call `browser_snapshot` to see the page structure
|
|
355
|
+
2. Find the element you want to interact with
|
|
356
|
+
3. Use its ref with interaction tools such as `browser_click` or `browser_type`
|
|
357
|
+
|
|
358
|
+
```text
|
|
359
|
+
// 1. Get snapshot
|
|
360
|
+
// Returns: [textbox @e4] Search... [link @e5] Home
|
|
361
|
+
|
|
362
|
+
// 2. Type in the search box
|
|
363
|
+
{ "tool": "browser_type", "input": { "ref": "@e4", "text": "mastra" } }
|
|
364
|
+
|
|
365
|
+
// 3. Follow the Home link by its ref
|
|
366
|
+
{ "tool": "browser_click", "input": { "ref": "@e5" } }
|
|
367
|
+
```
|
|
368
|
+
|
|
369
|
+
## Related
|
|
370
|
+
|
|
371
|
+
- [MastraBrowser](https://mastra.ai/reference/browser/mastra-browser): Base class reference
|
|
372
|
+
- [StagehandBrowser](https://mastra.ai/reference/browser/stagehand-browser): AI-powered alternative
|
|
373
|
+
- [Browser overview](https://mastra.ai/docs/browser/overview): Conceptual guide
|
|
374
|
+
- [agent-browser guide](https://mastra.ai/docs/browser/agent-browser): Usage guide
|
|
@@ -0,0 +1,284 @@
|
|
|
1
|
+
# MastraBrowser class
|
|
2
|
+
|
|
3
|
+
The `MastraBrowser` class is the abstract base class for browser automation providers. It defines the common interface for launching browsers, managing thread isolation, streaming screencasts, and handling input events.
|
|
4
|
+
|
|
5
|
+
You don't instantiate `MastraBrowser` directly. Instead, use a provider implementation:
|
|
6
|
+
|
|
7
|
+
- [`AgentBrowser`](https://mastra.ai/reference/browser/agent-browser): Deterministic browser automation using refs
|
|
8
|
+
- [`StagehandBrowser`](https://mastra.ai/reference/browser/stagehand-browser): AI-powered browser automation using natural language
|
|
9
|
+
|
|
10
|
+
## Usage example
|
|
11
|
+
|
|
12
|
+
```typescript
|
|
13
|
+
import { Agent } from '@mastra/core/agent'
|
|
14
|
+
import { AgentBrowser } from '@mastra/agent-browser'
|
|
15
|
+
|
|
16
|
+
const browser = new AgentBrowser({
|
|
17
|
+
headless: true,
|
|
18
|
+
viewport: { width: 1280, height: 720 },
|
|
19
|
+
scope: 'thread',
|
|
20
|
+
})
|
|
21
|
+
|
|
22
|
+
export const browserAgent = new Agent({
|
|
23
|
+
name: 'browser-agent',
|
|
24
|
+
instructions: 'You can browse the web to find information.',
|
|
25
|
+
model: 'openai/gpt-5.4',
|
|
26
|
+
browser,
|
|
27
|
+
})
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
## Constructor parameters
|
|
31
|
+
|
|
32
|
+
**headless** (`boolean`): Whether to run the browser in headless mode (no visible UI). (Default: `true`)
|
|
33
|
+
|
|
34
|
+
**viewport** (`{ width: number; height: number }`): Browser viewport dimensions. Controls the size of the browser window. (Default: `{ width: 1280, height: 720 }`)
|
|
35
|
+
|
|
36
|
+
**timeout** (`number`): Default timeout in milliseconds for browser operations. (Default: `10000`)
|
|
37
|
+
|
|
38
|
+
**cdpUrl** (`string | (() => string | Promise<string>)`): CDP WebSocket URL, HTTP endpoint, or sync/async provider function. When provided, connects to an existing browser instead of launching a new one. HTTP endpoints are resolved to WebSocket internally. Can't be used with scope: 'thread' (automatically uses shared scope).
|
|
39
|
+
|
|
40
|
+
**scope** (`'shared' | 'thread'`): Browser instance scope across threads. 'shared' means all threads share a single browser instance. 'thread' means each thread gets its own browser instance (full isolation). (Default: `'thread' (or 'shared' when cdpUrl is provided)`)
|
|
41
|
+
|
|
42
|
+
**onLaunch** (`(args: { browser: MastraBrowser }) => void | Promise<void>`): Callback invoked after the browser reaches 'ready' status.
|
|
43
|
+
|
|
44
|
+
**onClose** (`(args: { browser: MastraBrowser }) => void | Promise<void>`): Callback invoked before the browser is closed.
|
|
45
|
+
|
|
46
|
+
**screencast** (`ScreencastOptions`): Configuration for streaming browser frames.
|
|
47
|
+
|
|
48
|
+
**screencast.format** (`'jpeg' | 'png'`): Image format for screencast frames.
|
|
49
|
+
|
|
50
|
+
**screencast.quality** (`number`): Image quality (1-100). Only applies to JPEG format.
|
|
51
|
+
|
|
52
|
+
**screencast.maxWidth** (`number`): Maximum width for screencast frames.
|
|
53
|
+
|
|
54
|
+
**screencast.maxHeight** (`number`): Maximum height for screencast frames.
|
|
55
|
+
|
|
56
|
+
**screencast.everyNthFrame** (`number`): Capture every Nth frame to reduce bandwidth.
|
|
57
|
+
|
|
58
|
+
## Properties
|
|
59
|
+
|
|
60
|
+
The first three properties below (`id`, `name`, `provider`) are abstract and must be defined by concrete provider implementations:
|
|
61
|
+
|
|
62
|
+
**id** (`string`): Unique identifier for this browser instance. Abstract - defined by provider.
|
|
63
|
+
|
|
64
|
+
**name** (`string`): Human-readable name of the browser provider (e.g., 'AgentBrowser', 'StagehandBrowser'). Abstract - defined by provider.
|
|
65
|
+
|
|
66
|
+
**provider** (`string`): Provider identifier (e.g., 'vercel-labs/agent-browser', 'browserbase/stagehand'). Abstract - defined by provider.
|
|
67
|
+
|
|
68
|
+
**headless** (`boolean`): Whether the browser is running in headless mode.
|
|
69
|
+
|
|
70
|
+
**status** (`BrowserStatus`): Current browser status: 'pending', 'launching', 'ready', 'error', 'closing', or 'closed'.
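The status values listed above can be modeled as a string union. The sketch below reconstructs that union from the text; the exported `BrowserStatus` type may be declared differently, and `canRunTools` is a hypothetical helper rather than part of the API.

```typescript
// Union reconstructed from the six status values named in the property description.
type BrowserStatusSketch = 'pending' | 'launching' | 'ready' | 'error' | 'closing' | 'closed'

// Hypothetical helper: treat only a 'ready' browser as able to accept tool calls.
function canRunTools(status: BrowserStatusSketch): boolean {
  return status === 'ready'
}

const allStatuses: BrowserStatusSketch[] = ['pending', 'launching', 'ready', 'error', 'closing', 'closed']
const usable = allStatuses.filter(canRunTools)
console.log(usable.join(',')) // ready
```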
|
|
71
|
+
|
|
72
|
+
## Methods
|
|
73
|
+
|
|
74
|
+
### Lifecycle
|
|
75
|
+
|
|
76
|
+
#### `ensureReady()`
|
|
77
|
+
|
|
78
|
+
Ensures the browser is launched and ready for use. Automatically called before tool execution. Implemented in the base class.
|
|
79
|
+
|
|
80
|
+
```typescript
|
|
81
|
+
await browser.ensureReady()
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
#### `close()`
|
|
85
|
+
|
|
86
|
+
Closes the browser and cleans up all resources. Implemented in the base class with race-condition-safe handling.
|
|
87
|
+
|
|
88
|
+
```typescript
|
|
89
|
+
await browser.close()
|
|
90
|
+
```
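One common way to make `ensureReady()` and `close()` safe under concurrent calls is to memoize the in-flight promise so that overlapping callers share a single launch or teardown. The sketch below illustrates that pattern with a hypothetical `LifecycleSketch` class; it is not the base class implementation, and the launch and teardown steps are simulated with counters.

```typescript
type StatusSketch = 'pending' | 'launching' | 'ready' | 'error' | 'closing' | 'closed'

// Hypothetical stand-in for a provider; no real browser is started.
class LifecycleSketch {
  status: StatusSketch = 'pending'
  launches = 0
  teardowns = 0
  private launching?: Promise<void>
  private closing?: Promise<void>

  // Concurrent callers share one in-flight launch instead of starting several.
  ensureReady(): Promise<void> {
    if (this.closing) return Promise.reject(new Error('browser is closing or closed'))
    if (!this.launching) this.launching = this.launch()
    return this.launching
  }

  // Repeated or overlapping close() calls resolve to the same teardown.
  close(): Promise<void> {
    if (!this.closing) this.closing = this.teardown()
    return this.closing
  }

  private async launch(): Promise<void> {
    this.status = 'launching'
    this.launches++
    await Promise.resolve() // placeholder for the real browser launch
    this.status = 'ready'
  }

  private async teardown(): Promise<void> {
    if (this.launching) await this.launching // let a pending launch settle first
    this.status = 'closing'
    this.teardowns++
    await Promise.resolve() // placeholder for releasing browser resources
    this.status = 'closed'
  }
}
```

In this sketch, calling `ensureReady()` twice in parallel increments `launches` once, and calling `close()` twice increments `teardowns` once.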
|
|
91
|
+
|
|
92
|
+
#### `isBrowserRunning()`
|
|
93
|
+
|
|
94
|
+
Checks if the browser is currently running.
|
|
95
|
+
|
|
96
|
+
```typescript
|
|
97
|
+
const isRunning = browser.isBrowserRunning()
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
**Returns:** `boolean`
|
|
101
|
+
|
|
102
|
+
### Thread management
|
|
103
|
+
|
|
104
|
+
#### `setCurrentThread(threadId)`
|
|
105
|
+
|
|
106
|
+
Sets the current thread ID for browser operations. Used internally by the agent runtime.
|
|
107
|
+
|
|
108
|
+
```typescript
|
|
109
|
+
browser.setCurrentThread('thread-123')
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
#### `getCurrentThread()`
|
|
113
|
+
|
|
114
|
+
Gets the current thread ID.
|
|
115
|
+
|
|
116
|
+
```typescript
|
|
117
|
+
const threadId = browser.getCurrentThread()
|
|
118
|
+
```
|
|
119
|
+
|
|
120
|
+
**Returns:** `string`
|
|
121
|
+
|
|
122
|
+
#### `hasThreadSession(threadId)`
|
|
123
|
+
|
|
124
|
+
Checks if a thread has an active browser session.
|
|
125
|
+
|
|
126
|
+
```typescript
|
|
127
|
+
const hasSession = browser.hasThreadSession('thread-123')
|
|
128
|
+
```
|
|
129
|
+
|
|
130
|
+
**Returns:** `boolean`
|
|
131
|
+
|
|
132
|
+
#### `closeThreadSession(threadId)`
|
|
133
|
+
|
|
134
|
+
Closes a specific thread's browser session. For 'thread' scope, this closes that thread's browser instance. For 'shared' scope, this clears the thread's state.
|
|
135
|
+
|
|
136
|
+
```typescript
|
|
137
|
+
await browser.closeThreadSession('thread-123')
|
|
138
|
+
```
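The difference between the two scopes in `hasThreadSession()` and `closeThreadSession()` can be pictured with a small bookkeeping model. `ThreadSessionsSketch` below is hypothetical: plain strings stand in for browser instances and per-thread state, and the `open` method exists only for this illustration.

```typescript
type ScopeSketch = 'shared' | 'thread'

// Hypothetical model: maps a thread ID to the browser handle it uses.
class ThreadSessionsSketch {
  private sessions = new Map<string, string>()
  private scope: ScopeSketch

  constructor(scope: ScopeSketch) {
    this.scope = scope
  }

  // 'thread' scope gives each thread its own handle; 'shared' reuses one handle.
  open(threadId: string): string {
    const existing = this.sessions.get(threadId)
    if (existing !== undefined) return existing
    const handle = this.scope === 'thread' ? `browser:${threadId}` : 'browser:shared'
    this.sessions.set(threadId, handle)
    return handle
  }

  hasThreadSession(threadId: string): boolean {
    return this.sessions.has(threadId)
  }

  // Removes only this thread's entry; other threads are unaffected.
  closeThreadSession(threadId: string): void {
    this.sessions.delete(threadId)
  }
}
```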
|
|
139
|
+
|
|
140
|
+
### Tools
|
|
141
|
+
|
|
142
|
+
#### `getTools()`
|
|
143
|
+
|
|
144
|
+
Returns the browser tools for use with agents. Each provider returns different tools based on its paradigm.
|
|
145
|
+
|
|
146
|
+
```typescript
|
|
147
|
+
const tools = browser.getTools()
|
|
148
|
+
```
|
|
149
|
+
|
|
150
|
+
**Returns:** `Record<string, Tool>`
|
|
151
|
+
|
|
152
|
+
### Screencast
|
|
153
|
+
|
|
154
|
+
#### `startScreencast(options?, threadId?)`
|
|
155
|
+
|
|
156
|
+
Starts streaming browser frames. Returns a `ScreencastStream` that emits frame events.
|
|
157
|
+
|
|
158
|
+
```typescript
|
|
159
|
+
const stream = await browser.startScreencast({ format: 'jpeg', quality: 80 }, 'thread-123')
|
|
160
|
+
|
|
161
|
+
stream.on('frame', frame => {
|
|
162
|
+
console.log('Frame received:', frame.data.length, 'bytes')
|
|
163
|
+
})
|
|
164
|
+
|
|
165
|
+
stream.on('stop', reason => {
|
|
166
|
+
console.log('Screencast stopped:', reason)
|
|
167
|
+
})
|
|
168
|
+
```
|
|
169
|
+
|
|
170
|
+
**Returns:** `Promise<ScreencastStream>`
|
|
171
|
+
|
|
172
|
+
### Input injection
|
|
173
|
+
|
|
174
|
+
#### `injectMouseEvent(params, threadId?)`
|
|
175
|
+
|
|
176
|
+
Injects a mouse event into the browser. Used by Studio for live interaction.
|
|
177
|
+
|
|
178
|
+
```typescript
|
|
179
|
+
await browser.injectMouseEvent({
|
|
180
|
+
type: 'mousePressed',
|
|
181
|
+
x: 100,
|
|
182
|
+
y: 200,
|
|
183
|
+
button: 'left',
|
|
184
|
+
clickCount: 1,
|
|
185
|
+
})
|
|
186
|
+
```
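A full click usually consists of a press followed by a release at the same coordinates. The sketch below builds that pair of parameter objects. The field names come from the example above, but the `'mouseReleased'` type is an assumption borrowed from the Chrome DevTools Protocol's `Input.dispatchMouseEvent` naming, and `clickSequence` is a hypothetical helper.

```typescript
// Shape copied from the example above; 'mouseReleased' is an assumption (CDP naming).
interface MouseEventParamsSketch {
  type: 'mousePressed' | 'mouseReleased'
  x: number
  y: number
  button: 'left' | 'right' | 'middle'
  clickCount: number
}

// Builds the press/release pair that together make up one left click.
function clickSequence(x: number, y: number): MouseEventParamsSketch[] {
  const base = { x, y, button: 'left' as const, clickCount: 1 }
  return [
    { ...base, type: 'mousePressed' as const },
    { ...base, type: 'mouseReleased' as const },
  ]
}

const clickEvents = clickSequence(100, 200)
console.log(clickEvents.map(e => e.type).join(' -> ')) // mousePressed -> mouseReleased
```

Each object could then be passed to `injectMouseEvent` in order, provided the provider accepts both event types.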
|
|
187
|
+
|
|
188
|
+
#### `injectKeyboardEvent(params, threadId?)`
|
|
189
|
+
|
|
190
|
+
Injects a keyboard event into the browser. Used by Studio for live interaction.
|
|
191
|
+
|
|
192
|
+
```typescript
|
|
193
|
+
await browser.injectKeyboardEvent({
|
|
194
|
+
type: 'keyDown',
|
|
195
|
+
key: 'Enter',
|
|
196
|
+
code: 'Enter',
|
|
197
|
+
})
|
|
198
|
+
```
|
|
199
|
+
|
|
200
|
+
### State
|
|
201
|
+
|
|
202
|
+
#### `getState(threadId?)`
|
|
203
|
+
|
|
204
|
+
Gets the current browser state including URL and tabs.
|
|
205
|
+
|
|
206
|
+
```typescript
|
|
207
|
+
const state = await browser.getState('thread-123')
|
|
208
|
+
console.log('Current URL:', state.currentUrl)
|
|
209
|
+
console.log('Tabs:', state.tabs)
|
|
210
|
+
```
|
|
211
|
+
|
|
212
|
+
**Returns:** `Promise<BrowserState>`
|
|
213
|
+
|
|
214
|
+
```typescript
|
|
215
|
+
interface BrowserState {
|
|
216
|
+
currentUrl: string | null
|
|
217
|
+
tabs: BrowserTabState[]
|
|
218
|
+
activeTabIndex: number
|
|
219
|
+
}
|
|
220
|
+
|
|
221
|
+
interface BrowserTabState {
|
|
222
|
+
id: string
|
|
223
|
+
url: string
|
|
224
|
+
title: string
|
|
225
|
+
}
|
|
226
|
+
```
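Because `activeTabIndex` points into `tabs`, looking up the active tab is a single index operation. The sketch below repeats the two interfaces so that it is self-contained; `getActiveTab` is a hypothetical helper, not a method on the browser class.

```typescript
interface BrowserTabState {
  id: string
  url: string
  title: string
}

interface BrowserState {
  currentUrl: string | null
  tabs: BrowserTabState[]
  activeTabIndex: number
}

// Returns the active tab, or undefined when the index does not match any tab.
function getActiveTab(state: BrowserState): BrowserTabState | undefined {
  return state.tabs[state.activeTabIndex]
}

const sampleState: BrowserState = {
  currentUrl: 'https://example.com/docs',
  tabs: [
    { id: 't1', url: 'https://example.com', title: 'Home' },
    { id: 't2', url: 'https://example.com/docs', title: 'Docs' },
  ],
  activeTabIndex: 1,
}
console.log(getActiveTab(sampleState)?.title) // Docs
```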
|
|
227
|
+
|
|
228
|
+
#### `getCurrentUrl(threadId?)`
|
|
229
|
+
|
|
230
|
+
Gets the current page URL.
|
|
231
|
+
|
|
232
|
+
```typescript
|
|
233
|
+
const url = await browser.getCurrentUrl()
|
|
234
|
+
```
|
|
235
|
+
|
|
236
|
+
**Returns:** `Promise<string | null>`
|
|
237
|
+
|
|
238
|
+
## Browser scope
|
|
239
|
+
|
|
240
|
+
The `scope` option controls how browser instances are shared across conversation threads:
|
|
241
|
+
|
|
242
|
+
| Scope | Description | Use case |
|
|
243
|
+
| ---------- | ------------------------------------------- | ---------------------------------------- |
|
|
244
|
+
| `'shared'` | All threads share a single browser instance | Cost-efficient for non-conflicting tasks |
|
|
245
|
+
| `'thread'` | Each thread gets its own browser instance | Full isolation for concurrent users |
|
|
246
|
+
|
|
247
|
+
```typescript
|
|
248
|
+
// Shared browser for all threads
|
|
249
|
+
const sharedBrowser = new AgentBrowser({
|
|
250
|
+
scope: 'shared',
|
|
251
|
+
})
|
|
252
|
+
|
|
253
|
+
// Isolated browser per thread
|
|
254
|
+
const isolatedBrowser = new AgentBrowser({
|
|
255
|
+
scope: 'thread',
|
|
256
|
+
})
|
|
257
|
+
```
|
|
258
|
+
|
|
259
|
+
When `cdpUrl` connects to an external browser, the scope automatically falls back to `'shared'` because new browser instances can't be spawned per thread.
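The defaulting rule described in this section can be written as a small function. `resolveScope` below is a hypothetical helper that mirrors the documented behavior (default `'thread'`, forced to `'shared'` when `cdpUrl` is set); it is a sketch and not the library's internal logic.

```typescript
type BrowserScope = 'shared' | 'thread'
type CdpUrlProvider = string | (() => string | Promise<string>)

// Mirrors the documented rule: cdpUrl forces 'shared', otherwise default to 'thread'.
function resolveScope(options: { scope?: BrowserScope; cdpUrl?: CdpUrlProvider }): BrowserScope {
  if (options.cdpUrl !== undefined) return 'shared'
  return options.scope ?? 'thread'
}

console.log(resolveScope({})) // thread
console.log(resolveScope({ scope: 'thread', cdpUrl: 'wss://browser.example.com/ws' })) // shared
```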
|
|
260
|
+
|
|
261
|
+
## Cloud browser providers
|
|
262
|
+
|
|
263
|
+
Connect to cloud browser services using the `cdpUrl` option:
|
|
264
|
+
|
|
265
|
+
```typescript
|
|
266
|
+
// Static CDP URL
|
|
267
|
+
const browser = new AgentBrowser({
|
|
268
|
+
cdpUrl: 'wss://browser.example.com/ws',
|
|
269
|
+
})
|
|
270
|
+
|
|
271
|
+
// Dynamic CDP URL (e.g., session-based)
|
|
272
|
+
const sessionBrowser = new AgentBrowser({
|
|
273
|
+
cdpUrl: async () => {
|
|
274
|
+
const session = await createBrowserSession()
|
|
275
|
+
return session.wsUrl
|
|
276
|
+
},
|
|
277
|
+
})
|
|
278
|
+
```
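Since `cdpUrl` accepts a string, a synchronous function, or an asynchronous function, a consumer has to normalize all three forms to one awaited string. The sketch below shows one way to do that; `resolveCdpUrl` is a hypothetical helper, and it deliberately leaves out the HTTP-to-WebSocket resolution mentioned in the parameter description.

```typescript
type CdpUrlOption = string | (() => string | Promise<string>)

// Normalizes a static URL, a sync provider, or an async provider to a string.
async function resolveCdpUrl(option: CdpUrlOption): Promise<string> {
  return typeof option === 'function' ? await option() : option
}

// All three forms resolve to a plain string.
async function demoResolveCdpUrl(): Promise<string[]> {
  return Promise.all([
    resolveCdpUrl('wss://browser.example.com/ws'),
    resolveCdpUrl(() => 'wss://sync.example.com/ws'),
    resolveCdpUrl(async () => 'wss://async.example.com/ws'),
  ])
}
```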
|
|
279
|
+
|
|
280
|
+
## Related
|
|
281
|
+
|
|
282
|
+
- [AgentBrowser](https://mastra.ai/reference/browser/agent-browser): Deterministic browser automation
|
|
283
|
+
- [StagehandBrowser](https://mastra.ai/reference/browser/stagehand-browser): AI-powered browser automation
|
|
284
|
+
- [Browser overview](https://mastra.ai/docs/browser/overview): Conceptual guide to browser automation
|
|
@@ -0,0 +1,290 @@
|
|
|
1
|
+
# StagehandBrowser class
|
|
2
|
+
|
|
3
|
+
The `StagehandBrowser` class provides AI-powered browser automation using [Stagehand](https://github.com/browserbase/stagehand). It uses natural language instructions for interactions instead of element refs.
|
|
4
|
+
|
|
5
|
+
Use `StagehandBrowser` when you want AI to interpret and execute browser actions from natural language. For deterministic automation using element refs, see [`AgentBrowser`](https://mastra.ai/reference/browser/agent-browser).
|
|
6
|
+
|
|
7
|
+
## Usage example
|
|
8
|
+
|
|
9
|
+
```typescript
|
|
10
|
+
import { Agent } from '@mastra/core/agent'
|
|
11
|
+
import { StagehandBrowser } from '@mastra/stagehand'
|
|
12
|
+
|
|
13
|
+
const browser = new StagehandBrowser({
|
|
14
|
+
headless: true,
|
|
15
|
+
model: 'openai/gpt-5.4',
|
|
16
|
+
selfHeal: true,
|
|
17
|
+
})
|
|
18
|
+
|
|
19
|
+
export const browserAgent = new Agent({
|
|
20
|
+
name: 'browser-agent',
|
|
21
|
+
instructions: `You can browse the web using natural language.
|
|
22
|
+
Use stagehand_act to perform actions like "click the login button".
|
|
23
|
+
Use stagehand_extract to get data from pages.`,
|
|
24
|
+
model: 'openai/gpt-5.4',
|
|
25
|
+
browser,
|
|
26
|
+
})
|
|
27
|
+
```
|
|
28
|
+
|
|
29
|
+
## Constructor parameters
|
|
30
|
+
|
|
31
|
+
**headless** (`boolean`): Whether to run the browser in headless mode. (Default: `true`)
|
|
32
|
+
|
|
33
|
+
**viewport** (`{ width: number; height: number }`): Browser viewport dimensions. (Default: `{ width: 1280, height: 720 }`)
|
|
34
|
+
|
|
35
|
+
**env** (`'LOCAL' | 'BROWSERBASE'`): Environment to run the browser in. Use 'BROWSERBASE' for cloud execution. (Default: `'LOCAL'`)
|
|
36
|
+
|
|
37
|
+
**apiKey** (`string`): Browserbase API key. Required when env is 'BROWSERBASE'.
|
|
38
|
+
|
|
39
|
+
**projectId** (`string`): Browserbase project ID. Required when env is 'BROWSERBASE'.
|
|
40
|
+
|
|
41
|
+
**model** (`string | ModelConfiguration`): Model configuration for AI operations. Can be a string like 'openai/gpt-5.4' or an object with modelName, apiKey, and baseURL. (Default: `'openai/gpt-5.4'`)
|
|
42
|
+
|
|
43
|
+
**selfHeal** (`boolean`): Enable self-healing selectors. When enabled, Stagehand uses AI to find elements even when initial selectors fail. (Default: `true`)
|
|
44
|
+
|
|
45
|
+
**domSettleTimeout** (`number`): Timeout in milliseconds for DOM to settle after actions. (Default: `5000`)
|
|
46
|
+
|
|
47
|
+
**verbose** (`0 | 1 | 2`): Logging verbosity level. 0 = silent, 1 = errors only, 2 = verbose. (Default: `1`)
|
|
48
|
+
|
|
49
|
+
**systemPrompt** (`string`): Custom system prompt for AI operations.
|
|
50
|
+
|
|
51
|
+
**cdpUrl** (`string | (() => string | Promise<string>)`): CDP WebSocket URL, HTTP endpoint, or sync/async provider function for connecting to an existing browser. HTTP endpoints are resolved to WebSocket internally.
|
|
52
|
+
|
|
53
|
+
**scope** (`'shared' | 'thread'`): Browser instance scope across threads. (Default: `'thread' (or 'shared' when cdpUrl is provided)`)
|
|
54
|
+
|
|
55
|
+
**timeout** (`number`): Default timeout in milliseconds for browser operations. (Default: `10000`)
|
|
56
|
+
|
|
57
|
+
**onLaunch** (`(args: { browser: MastraBrowser }) => void | Promise<void>`): Callback invoked after the browser is ready.
|
|
58
|
+
|
|
59
|
+
**onClose** (`(args: { browser: MastraBrowser }) => void | Promise<void>`): Callback invoked before the browser closes.
|
|
60
|
+
|
|
61
|
+
**screencast** (`ScreencastOptions`): Configuration for streaming browser frames to Studio.
|
|
62
|
+
|
|
63
|
+
## Tools
|
|
64
|
+
|
|
65
|
+
`StagehandBrowser` provides 6 AI-powered tools for browser automation:
|
|
66
|
+
|
|
67
|
+
| Tool | Description |
|
|
68
|
+
| -------------------- | --------------------------------------------------- |
|
|
69
|
+
| `stagehand_act` | Perform actions using natural language instructions |
|
|
70
|
+
| `stagehand_extract` | Extract structured data from pages |
|
|
71
|
+
| `stagehand_observe` | Discover actionable elements on a page |
|
|
72
|
+
| `stagehand_navigate` | Navigate to a URL |
|
|
73
|
+
| `stagehand_tabs` | Manage browser tabs |
|
|
74
|
+
| `stagehand_close` | Close the browser |
|
|
75
|
+
|
|
76
|
+
## Tool reference
|
|
77
|
+
|
|
78
|
+
### `stagehand_act`
|
|
79
|
+
|
|
80
|
+
Perform an action using natural language instructions. The AI interprets your instruction and executes the appropriate browser action.
|
|
81
|
+
|
|
82
|
+
```text
|
|
83
|
+
// Tool input
|
|
84
|
+
{
|
|
85
|
+
"instruction": "click the login button",
|
|
86
|
+
"variables": { "username": "john" },
|
|
87
|
+
"useVision": true,
|
|
88
|
+
"timeout": 30000
|
|
89
|
+
}
|
|
90
|
+
|
|
91
|
+
// With variable substitution
|
|
92
|
+
{
|
|
93
|
+
"instruction": "type %email% into the email field",
|
|
94
|
+
"variables": { "email": "user@example.com" }
|
|
95
|
+
}
|
|
96
|
+
```
|
|
97
|
+
|
|
98
|
+
| Parameter | Type | Description |
|
|
99
|
+
| ------------- | ------------------------ | ---------------------------------------------------- |
|
|
100
|
+
| `instruction` | `string` | Natural language instruction (required) |
|
|
101
|
+
| `variables` | `Record<string, string>` | Variables for %variableName% substitution (optional) |
|
|
102
|
+
| `useVision` | `boolean` | Enable vision capabilities (optional) |
|
|
103
|
+
| `timeout` | `number` | Timeout in ms (optional) |
|
|
104
|
+
|
|
105
|
+
**Returns:**
|
|
106
|
+
|
|
107
|
+
```typescript
|
|
108
|
+
interface ActResult {
|
|
109
|
+
success: boolean
|
|
110
|
+
message?: string
|
|
111
|
+
action?: string
|
|
112
|
+
url?: string
|
|
113
|
+
}
|
|
114
|
+
```
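The `%variableName%` placeholders in `instruction` are filled from the `variables` map. The substitution itself happens inside the tool; the sketch below only illustrates the placeholder syntax with a hypothetical `substituteVariables` function, and leaving unknown placeholders untouched is an assumption.

```typescript
// Replaces each %name% with variables[name]; unknown placeholders are left as-is (assumption).
function substituteVariables(instruction: string, variables: Record<string, string> = {}): string {
  return instruction.replace(/%(\w+)%/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(variables, name) ? String(variables[name]) : match,
  )
}

const filled = substituteVariables('type %email% into the email field', { email: 'user@example.com' })
console.log(filled) // type user@example.com into the email field
```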
|
|
115
|
+
|
|
116
|
+
### `stagehand_extract`
|
|
117
|
+
|
|
118
|
+
Extract structured data from a page using natural language instructions.
|
|
119
|
+
|
|
120
|
+
```text
|
|
121
|
+
// Basic extraction
|
|
122
|
+
{
|
|
123
|
+
"instruction": "extract all product names and prices"
|
|
124
|
+
}
|
|
125
|
+
|
|
126
|
+
// With schema for structured output
|
|
127
|
+
{
|
|
128
|
+
"instruction": "extract the product information",
|
|
129
|
+
"schema": {
|
|
130
|
+
"type": "object",
|
|
131
|
+
"properties": {
|
|
132
|
+
"name": { "type": "string" },
|
|
133
|
+
"price": { "type": "number" },
|
|
134
|
+
"inStock": { "type": "boolean" }
|
|
135
|
+
}
|
|
136
|
+
}
|
|
137
|
+
}
|
|
138
|
+
```
|
|
139
|
+
|
|
140
|
+
**Returns:**
|
|
141
|
+
|
|
142
|
+
```typescript
|
|
143
|
+
interface ExtractResult<T = unknown> {
|
|
144
|
+
success: boolean
|
|
145
|
+
data?: T
|
|
146
|
+
hint?: string
|
|
147
|
+
error?: string
|
|
148
|
+
url?: string
|
|
149
|
+
}
|
|
150
|
+
```
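Because `data`, `hint`, and `error` are all optional, callers typically check `success` before reading `data`. The sketch below repeats the interface so that it is self-contained and adds a hypothetical `unwrapExtract` helper; the fallback error message is illustrative only.

```typescript
interface ExtractResult<T = unknown> {
  success: boolean
  data?: T
  hint?: string
  error?: string
  url?: string
}

// Returns the extracted data, or throws with the most specific message available.
function unwrapExtract<T>(result: ExtractResult<T>): T {
  if (!result.success || result.data === undefined) {
    throw new Error(result.error ?? result.hint ?? 'extraction failed')
  }
  return result.data
}

interface Product {
  name: string
  price: number
  inStock: boolean
}

const product = unwrapExtract<Product>({
  success: true,
  data: { name: 'Widget', price: 9.5, inStock: true },
})
console.log(product.name) // Widget
```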
|
|
151
|
+
|
|
152
|
+
### `stagehand_observe`
|
|
153
|
+
|
|
154
|
+
Discover actionable elements on a page. Returns a list of elements with their selectors and descriptions.
|
|
155
|
+
|
|
156
|
+
```text
|
|
157
|
+
// Find specific elements
|
|
158
|
+
{
|
|
159
|
+
"instruction": "find all buttons related to checkout"
|
|
160
|
+
}
|
|
161
|
+
|
|
162
|
+
// Find all interactive elements
|
|
163
|
+
{
|
|
164
|
+
"onlyVisible": true
|
|
165
|
+
}
|
|
166
|
+
```
|
|
167
|
+
|
|
168
|
+
| Parameter | Type | Description |
|
|
169
|
+
| ------------- | --------- | --------------------------------------------------------- |
|
|
170
|
+
| `instruction` | `string` | Natural language instruction (optional, omit to find all) |
|
|
171
|
+
| `onlyVisible` | `boolean` | Only include visible elements (optional) |
|
|
172
|
+
| `timeout` | `number` | Timeout in ms (optional) |
|
|
173
|
+
|
|
174
|
+
**Returns:**
|
|
175
|
+
|
|
176
|
+
```typescript
|
|
177
|
+
interface ObserveResult {
|
|
178
|
+
success: boolean
|
|
179
|
+
actions: StagehandAction[]
|
|
180
|
+
url?: string
|
|
181
|
+
}
|
|
182
|
+
|
|
183
|
+
interface StagehandAction {
|
|
184
|
+
selector: string
|
|
185
|
+
description: string
|
|
186
|
+
method?: string
|
|
187
|
+
arguments?: string[]
|
|
188
|
+
}
|
|
189
|
+
```
|
|
190
|
+
|
|
191
|
+
### `stagehand_navigate`
|
|
192
|
+
|
|
193
|
+
Navigate to a URL.
|
|
194
|
+
|
|
195
|
+
```text
|
|
196
|
+
// Tool input
|
|
197
|
+
{
|
|
198
|
+
"url": "https://example.com",
|
|
199
|
+
"waitUntil": "domcontentloaded"
|
|
200
|
+
}
|
|
201
|
+
```
|
|
202
|
+
|
|
203
|
+
| Parameter | Type | Description |
|
|
204
|
+
| ----------- | ----------------------------------------------- | ----------------------------------------------- |
|
|
205
|
+
| `url` | `string` | URL to navigate to (required) |
|
|
206
|
+
| `waitUntil` | `"load" \| "domcontentloaded" \| "networkidle"` | When to consider navigation complete (optional) |
|
|
207
|
+
|
|
208
|
+
### `stagehand_tabs`
|
|
209
|
+
|
|
210
|
+
Manage browser tabs.
|
|
211
|
+
|
|
212
|
+
```text
|
|
213
|
+
// List all tabs
|
|
214
|
+
{ "action": "list" }
|
|
215
|
+
|
|
216
|
+
// Open new tab
|
|
217
|
+
{ "action": "new", "url": "https://example.com" }
|
|
218
|
+
|
|
219
|
+
// Switch to tab by index
|
|
220
|
+
{ "action": "switch", "index": 0 }
|
|
221
|
+
|
|
222
|
+
// Close tab by index (or current if omitted)
|
|
223
|
+
{ "action": "close", "index": 1 }
|
|
224
|
+
```
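The four inputs above fit a discriminated union keyed on `action`. The sketch below types that union and applies it to an in-memory list of tab URLs. `TabsInputSketch` is inferred from the examples, and the behavior modeled here (a new tab becomes active, `url` is optional, closing shifts the active index) is an assumption rather than documented tool behavior.

```typescript
// Input shapes inferred from the examples above; optionality is an assumption.
type TabsInputSketch =
  | { action: 'list' }
  | { action: 'new'; url?: string }
  | { action: 'switch'; index: number }
  | { action: 'close'; index?: number }

interface TabsModel {
  urls: string[]
  active: number
}

function assertTabIndex(model: TabsModel, index: number): void {
  if (index < 0 || index >= model.urls.length) throw new RangeError(`no tab at index ${index}`)
}

// Pure reducer over a hypothetical tab model; it never touches a real browser.
function applyTabsAction(model: TabsModel, input: TabsInputSketch): TabsModel {
  switch (input.action) {
    case 'list':
      return model
    case 'new': {
      const urls = [...model.urls, input.url ?? 'about:blank']
      return { urls, active: urls.length - 1 }
    }
    case 'switch':
      assertTabIndex(model, input.index)
      return { ...model, active: input.index }
    case 'close': {
      const index = input.index ?? model.active // current tab when index is omitted
      assertTabIndex(model, index)
      const urls = model.urls.filter((_, i) => i !== index)
      const shifted = index < model.active ? model.active - 1 : model.active
      return { urls, active: Math.max(0, Math.min(shifted, urls.length - 1)) }
    }
  }
}
```

Modeling the input as a union lets the compiler reject combinations such as `{ action: 'switch' }` without an `index`.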
|
|
225
|
+
|
|
226
|
+
### `stagehand_close`
|
|
227
|
+
|
|
228
|
+
Close the browser and clean up resources.
|
|
229
|
+
|
|
230
|
+
```text
|
|
231
|
+
// Tool input (no parameters required)
|
|
232
|
+
|
|
233
|
+
```
|
|
234
|
+
|
|
235
|
+
## Using Browserbase
|
|
236
|
+
|
|
237
|
+
Run Stagehand in the cloud using Browserbase:
|
|
238
|
+
|
|
239
|
+
```typescript
|
|
240
|
+
const browser = new StagehandBrowser({
|
|
241
|
+
env: 'BROWSERBASE',
|
|
242
|
+
apiKey: process.env.BROWSERBASE_API_KEY,
|
|
243
|
+
projectId: process.env.BROWSERBASE_PROJECT_ID,
|
|
244
|
+
model: 'openai/gpt-5.4',
|
|
245
|
+
})
|
|
246
|
+
```
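Both `apiKey` and `projectId` are required when `env` is `'BROWSERBASE'`, yet values read from `process.env` may be `undefined`. A small pre-flight check can surface that before the browser is constructed. `missingBrowserbaseFields` below is a hypothetical helper and not part of `@mastra/stagehand`.

```typescript
interface EnvOptionsSketch {
  env?: 'LOCAL' | 'BROWSERBASE'
  apiKey?: string
  projectId?: string
}

// Lists the required Browserbase fields that are absent or empty.
function missingBrowserbaseFields(options: EnvOptionsSketch): string[] {
  if ((options.env ?? 'LOCAL') !== 'BROWSERBASE') return []
  const missing: string[] = []
  if (!options.apiKey) missing.push('apiKey')
  if (!options.projectId) missing.push('projectId')
  return missing
}

const missingFields = missingBrowserbaseFields({ env: 'BROWSERBASE', apiKey: 'bb_key' })
console.log(missingFields.join(',')) // projectId
```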
|
|
247
|
+
|
|
248
|
+
## Model configuration
|
|
249
|
+
|
|
250
|
+
Configure the AI model for Stagehand operations:
|
|
251
|
+
|
|
252
|
+
```typescript
|
|
253
|
+
// String format: "provider/model"
|
|
254
|
+
const browser = new StagehandBrowser({
|
|
255
|
+
model: 'openai/gpt-5.4',
|
|
256
|
+
})
|
|
257
|
+
|
|
258
|
+
// Object format for custom configuration
|
|
259
|
+
const customBrowser = new StagehandBrowser({
|
|
260
|
+
model: {
|
|
261
|
+
modelName: 'gpt-5.4',
|
|
262
|
+
apiKey: process.env.OPENAI_API_KEY,
|
|
263
|
+
baseURL: 'https://api.openai.com/v1',
|
|
264
|
+
},
|
|
265
|
+
})
|
|
266
|
+
```
|
|
267
|
+
|
|
268
|
+
Supported providers:
|
|
269
|
+
|
|
270
|
+
- OpenAI: `"openai/gpt-5.4"`, `"openai/gpt-5.4-mini"`
|
|
271
|
+
- Anthropic: `"anthropic/claude-3-5-sonnet-20241022"`
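The string format is a provider name and a model name joined by the first `/`. The sketch below splits such a string; `parseModelString` is a hypothetical helper for illustration and says nothing about how Stagehand parses the value internally.

```typescript
interface ParsedModel {
  provider: string
  modelName: string
}

// Splits "provider/model" at the first slash; rejects strings with an empty half.
function parseModelString(model: string): ParsedModel {
  const slash = model.indexOf('/')
  if (slash <= 0 || slash === model.length - 1) {
    throw new Error(`expected "provider/model", got "${model}"`)
  }
  return { provider: model.slice(0, slash), modelName: model.slice(slash + 1) }
}

const parsedModel = parseModelString('anthropic/claude-3-5-sonnet-20241022')
console.log(parsedModel.provider, parsedModel.modelName)
```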
|
|
272
|
+
|
|
273
|
+
## AgentBrowser vs StagehandBrowser
|
|
274
|
+
|
|
275
|
+
| Aspect | AgentBrowser | StagehandBrowser |
|
|
276
|
+
| --------------- | -------------------------- | --------------------- |
|
|
277
|
+
| **Approach** | Deterministic refs (`@e5`) | Natural language |
|
|
278
|
+
| **Precision** | Exact element targeting | AI interpretation |
|
|
279
|
+
| **Flexibility** | Requires a snapshot first | Direct instructions |
|
|
280
|
+
| **Use case** | Reproducible automation | Adaptive automation |
|
|
281
|
+
| **Speed** | Faster (no AI inference) | Slower (AI inference) |
|
|
282
|
+
|
|
283
|
+
Choose `AgentBrowser` for precise, reproducible automation. Choose `StagehandBrowser` for flexible, natural language interactions.
|
|
284
|
+
|
|
285
|
+
## Related
|
|
286
|
+
|
|
287
|
+
- [MastraBrowser](https://mastra.ai/reference/browser/mastra-browser): Base class reference
|
|
288
|
+
- [AgentBrowser](https://mastra.ai/reference/browser/agent-browser): Deterministic alternative
|
|
289
|
+
- [Browser overview](https://mastra.ai/docs/browser/overview): Conceptual guide
|
|
290
|
+
- [Stagehand guide](https://mastra.ai/docs/browser/stagehand): Usage guide
|
package/.docs/reference/index.md
CHANGED
|
@@ -39,6 +39,9 @@ The Reference section provides documentation of Mastra's API, including paramete
|
|
|
39
39
|
- [Okta](https://mastra.ai/reference/auth/okta)
|
|
40
40
|
- [Supabase](https://mastra.ai/reference/auth/supabase)
|
|
41
41
|
- [WorkOS](https://mastra.ai/reference/auth/workos)
|
|
42
|
+
- [AgentBrowser](https://mastra.ai/reference/browser/agent-browser)
|
|
43
|
+
- [MastraBrowser Class](https://mastra.ai/reference/browser/mastra-browser)
|
|
44
|
+
- [StagehandBrowser](https://mastra.ai/reference/browser/stagehand-browser)
|
|
42
45
|
- [create-mastra](https://mastra.ai/reference/cli/create-mastra)
|
|
43
46
|
- [mastra](https://mastra.ai/reference/cli/mastra)
|
|
44
47
|
- [Agents API](https://mastra.ai/reference/client-js/agents)
|
package/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,12 @@
|
|
|
1
1
|
# @mastra/mcp-docs-server
|
|
2
2
|
|
|
3
|
+
## 1.1.22-alpha.13
|
|
4
|
+
|
|
5
|
+
### Patch Changes
|
|
6
|
+
|
|
7
|
+
- Updated dependencies [[`a50d220`](https://github.com/mastra-ai/mastra/commit/a50d220b01ecbc5644d489a3d446c3bd4ab30245)]:
|
|
8
|
+
- @mastra/core@1.23.0-alpha.9
|
|
9
|
+
|
|
3
10
|
## 1.1.22-alpha.12
|
|
4
11
|
|
|
5
12
|
### Patch Changes
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@mastra/mcp-docs-server",
|
|
3
|
-
"version": "1.1.22-alpha.
|
|
3
|
+
"version": "1.1.22-alpha.13",
|
|
4
4
|
"description": "MCP server for accessing Mastra.ai documentation, changelogs, and news.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "dist/index.js",
|
|
@@ -29,7 +29,7 @@
|
|
|
29
29
|
"jsdom": "^26.1.0",
|
|
30
30
|
"local-pkg": "^1.1.2",
|
|
31
31
|
"zod": "^4.3.6",
|
|
32
|
-
"@mastra/core": "1.23.0-alpha.
|
|
32
|
+
"@mastra/core": "1.23.0-alpha.9",
|
|
33
33
|
"@mastra/mcp": "^1.4.1"
|
|
34
34
|
},
|
|
35
35
|
"devDependencies": {
|
|
@@ -48,7 +48,7 @@
|
|
|
48
48
|
"vitest": "4.0.18",
|
|
49
49
|
"@internal/lint": "0.0.79",
|
|
50
50
|
"@internal/types-builder": "0.0.54",
|
|
51
|
-
"@mastra/core": "1.23.0-alpha.
|
|
51
|
+
"@mastra/core": "1.23.0-alpha.9"
|
|
52
52
|
},
|
|
53
53
|
"homepage": "https://mastra.ai",
|
|
54
54
|
"repository": {
|