@mastra/agent-browser 0.1.0-alpha.0 → 0.2.0-alpha.0

package/CHANGELOG.md CHANGED
@@ -1,5 +1,51 @@
  # @mastra/agent-browser
 
+ ## 0.2.0-alpha.0
+
+ ### Minor Changes
+
+ - Added `storageState` option and `exportStorageState()` method for lightweight auth persistence (cookies and localStorage). Also kills orphaned Chrome child processes on close to prevent zombies. ([#15194](https://github.com/mastra-ai/mastra/pull/15194))
+
+ ### Patch Changes
+
+ - AgentBrowser with default thread scope now initializes correctly. Previously, calling launch() followed by getPage() would throw "Browser not launched" when no explicit thread ID was provided. ([#15285](https://github.com/mastra-ai/mastra/pull/15285))
+
+ - Updated dependencies [[`cbdf3e1`](https://github.com/mastra-ai/mastra/commit/cbdf3e12b3d0c30a6e5347be658e2009648c130a), [`8fe46d3`](https://github.com/mastra-ai/mastra/commit/8fe46d354027f3f0f0846e64219772348de106dd), [`18c67db`](https://github.com/mastra-ai/mastra/commit/18c67dbb9c9ebc26f26f65f7d3ff836e5691ef46), [`8dcc77e`](https://github.com/mastra-ai/mastra/commit/8dcc77e78a5340f5848f74b9e9f1b3da3513c1f5), [`aa67fc5`](https://github.com/mastra-ai/mastra/commit/aa67fc59ee8a5eeff1f23eb05970b8d7a536c8ff), [`fa8140b`](https://github.com/mastra-ai/mastra/commit/fa8140bcd4251d2e3ac85fdc5547dfc4f372b5be), [`190f452`](https://github.com/mastra-ai/mastra/commit/190f45258b0640e2adfc8219fa3258cdc5b8f071), [`7e7bf60`](https://github.com/mastra-ai/mastra/commit/7e7bf606886bf374a6f9d4ca9b09dd83d0533372), [`184907d`](https://github.com/mastra-ai/mastra/commit/184907d775d8609c03c26e78ccaf37315f3aa287), [`0c4cd13`](https://github.com/mastra-ai/mastra/commit/0c4cd131931c04ac5405373c932a242dbe88edd6), [`b16a753`](https://github.com/mastra-ai/mastra/commit/b16a753d5748440248d7df82e29bb987a9c8386c)]:
+   - @mastra/core@1.25.0-alpha.3
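The patch entry for #15285 does not state the root cause of the "Browser not launched" error. As a purely illustrative sketch, assuming the failure came from `launch()` and `getPage()` resolving different keys when no thread ID is passed, the toy registry below has both methods fall back to the same default key. `ToyBrowserRegistry` and `DEFAULT_THREAD` are hypothetical names, not part of `@mastra/agent-browser`.

```typescript
// Illustrative only: a toy registry, not AgentBrowser's actual implementation.
const DEFAULT_THREAD = '__default__'; // hypothetical fallback key

interface ToyPage {
  threadId: string;
}

class ToyBrowserRegistry {
  private pages = new Map<string, ToyPage>();

  // Store the page under the same key that getPage() will later resolve.
  launch(threadId?: string): void {
    const key = threadId ?? DEFAULT_THREAD;
    this.pages.set(key, { threadId: key });
  }

  // Look up with the identical fallback, so launch() then getPage() succeeds
  // even when the caller never supplies a thread ID.
  getPage(threadId?: string): ToyPage {
    const page = this.pages.get(threadId ?? DEFAULT_THREAD);
    if (!page) {
      throw new Error('Browser not launched');
    }
    return page;
  }
}
```

The point of the sketch is only that the write path and the read path must agree on the key used for the default thread.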
+
+ ## 0.1.0
+
+ ### Minor Changes
+
+ - Add browser automation support with screencast streaming, input injection, and thread isolation ([#14938](https://github.com/mastra-ai/mastra/pull/14938))
+
+   **New Features:**
+   - Browser tools for web automation (navigate, click, type, scroll, extract, etc.)
+   - Real-time screencast streaming via WebSocket
+   - Mouse and keyboard input injection
+   - Thread-scoped browser isolation (`scope: 'thread'`)
+   - State persistence and restoration across sessions
+   - Support for cloud providers (Browserbase, Browser-Use, Browserless)
+
+   **Configuration:**
+
+   ```typescript
+   import { AgentBrowser } from '@mastra/agent-browser';
+
+   const browser = new AgentBrowser({
+     headless: true,
+     scope: 'thread', // Each thread gets isolated browser
+     viewport: { width: 1280, height: 720 },
+   });
+
+   const agent = mastra.getAgent('my-agent', { browser });
+   ```
+
+ ### Patch Changes
+
+ - Updated dependencies [[`cb15509`](https://github.com/mastra-ai/mastra/commit/cb15509b58f6a83e11b765c945082afc027db972), [`81e4259`](https://github.com/mastra-ai/mastra/commit/81e425939b4ceeb4f586e9b6d89c3b1c1f2d2fe7), [`951b8a1`](https://github.com/mastra-ai/mastra/commit/951b8a1b5ef7e1474c59dc4f2b9fc1a8b1e508b6), [`80c5668`](https://github.com/mastra-ai/mastra/commit/80c5668e365470d3a96d3e953868fd7a643ff67c), [`3d478c1`](https://github.com/mastra-ai/mastra/commit/3d478c1e13f17b80f330ac49d7aa42ef929b93ff), [`2b4ea10`](https://github.com/mastra-ai/mastra/commit/2b4ea10b053e4ea1ab232d536933a4a3c4cba999), [`a0544f0`](https://github.com/mastra-ai/mastra/commit/a0544f0a1e6bd52ac12676228967c1938e43648d), [`6039f17`](https://github.com/mastra-ai/mastra/commit/6039f176f9c457304825ff1df8c83b8e457376c0), [`06b928d`](https://github.com/mastra-ai/mastra/commit/06b928dfc2f5630d023467476cc5919dfa858d0a), [`6a8d984`](https://github.com/mastra-ai/mastra/commit/6a8d9841f2933456ee1598099f488d742b600054), [`c8c86aa`](https://github.com/mastra-ai/mastra/commit/c8c86aa1458017fbd1c0776fdc0c520d129df8a6)]:
+   - @mastra/core@1.22.0
+
  ## 0.1.0-alpha.0
 
  ### Minor Changes
package/README.md ADDED
@@ -0,0 +1,138 @@
+ # @mastra/agent-browser
+
+ Deterministic browser automation for Mastra agents using [agent-browser](https://github.com/vercel-labs/agent-browser).
+
+ ## Installation
+
+ ```bash
+ npm install @mastra/agent-browser
+ ```
+
+ ## Usage
+
+ ```typescript
+ import { Agent } from '@mastra/core/agent';
+ import { AgentBrowser } from '@mastra/agent-browser';
+
+ // Create an AgentBrowser instance
+ const browser = new AgentBrowser({
+   headless: true,
+ });
+
+ // Create an agent with the browser
+ const agent = new Agent({
+   name: 'web-agent',
+   instructions: `You are a web automation assistant.
+ Use browser_snapshot to see the page structure,
+ then interact with elements using their refs (e.g., @e5).`,
+   model: 'openai/gpt-5.4',
+   browser,
+ });
+
+ // Use the agent to browse the web
+ const result = await agent.generate('Go to example.com and click the first link');
+ ```
+
+ ## Configuration
+
+ ```typescript
+ const browser = new AgentBrowser({
+   // Run headless (default: true)
+   headless: true,
+
+   // Viewport dimensions
+   viewport: { width: 1280, height: 720 },
+
+   // Default timeout for operations in ms (default: 30000)
+   timeout: 30000,
+
+   // CDP URL for connecting to existing browser
+   cdpUrl: 'ws://localhost:9222',
+
+   // Browser instance scope
+   // Default: 'thread' for local launch, 'shared' when cdpUrl is provided
+   // 'thread': Each thread gets its own browser
+   // 'shared': All threads share one browser
+   scope: 'thread',
+
+   // Screencast settings for Studio
+   screencast: {
+     enabled: true,
+     format: 'jpeg',
+     quality: 80,
+   },
+ });
+ ```
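The `scope` comment in the configuration block describes a default that depends on whether `cdpUrl` is set. A minimal sketch of that rule, written from the README comment alone and not taken from the package source, might look like the following. `ScopeInput` and `resolveScope` are hypothetical names introduced for illustration.

```typescript
type BrowserScope = 'thread' | 'shared';

// Hypothetical subset of the options that matter for the default.
interface ScopeInput {
  cdpUrl?: string;
  scope?: BrowserScope;
}

// Mirrors the documented default: an explicit scope wins; otherwise
// 'shared' when connecting over CDP and 'thread' for a local launch.
function resolveScope({ cdpUrl, scope }: ScopeInput): BrowserScope {
  if (scope) {
    return scope;
  }
  return cdpUrl ? 'shared' : 'thread';
}
```

Under this reading, the README example, which sets both `cdpUrl` and `scope: 'thread'`, is an explicit override of the `'shared'` default.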
+
+ ## Tools
+
+ AgentBrowser exposes 15 deterministic tools using accessibility tree refs:
+
+ ### Core Tools
+
+ - **browser_goto** - Navigate to a URL
+ - **browser_snapshot** - Get accessibility tree with element refs (@e1, @e2, etc.)
+ - **browser_click** - Click an element by ref
+ - **browser_type** - Type text into an element
+ - **browser_press** - Press keyboard keys
+ - **browser_select** - Select option from dropdown
+ - **browser_scroll** - Scroll the page or element
+ - **browser_close** - Close the browser
+
+ ### Extended Tools
+
+ - **browser_hover** - Hover over an element
+ - **browser_back** - Go back in browser history
+ - **browser_dialog** - Handle browser dialogs (alert, confirm, prompt)
+ - **browser_wait** - Wait for element state changes
+ - **browser_tabs** - Manage browser tabs (list, new, switch, close)
+ - **browser_drag** - Drag and drop elements
+
+ ### Escape Hatch
+
+ - **browser_evaluate** - Execute JavaScript in the page context
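The three groups above add up to the 15 tools the README mentions (8 core, 6 extended, 1 escape hatch). As a small sketch, a caller could mirror the list in a typed constant to validate tool names before dispatching them. The `BROWSER_TOOLS` constant and `isBrowserTool` helper are hypothetical; the package may expose its tool names differently.

```typescript
// Tool names copied from the README list; the grouping keys are illustrative.
const BROWSER_TOOLS = {
  core: [
    'browser_goto',
    'browser_snapshot',
    'browser_click',
    'browser_type',
    'browser_press',
    'browser_select',
    'browser_scroll',
    'browser_close',
  ],
  extended: [
    'browser_hover',
    'browser_back',
    'browser_dialog',
    'browser_wait',
    'browser_tabs',
    'browser_drag',
  ],
  escapeHatch: ['browser_evaluate'],
} as const;

type BrowserToolName = (typeof BROWSER_TOOLS)[keyof typeof BROWSER_TOOLS][number];

// Flatten the groups into one set for O(1) membership checks.
const ALL_BROWSER_TOOLS = new Set<string>([
  ...BROWSER_TOOLS.core,
  ...BROWSER_TOOLS.extended,
  ...BROWSER_TOOLS.escapeHatch,
]);

// Type guard: narrows an arbitrary string to a known tool name.
function isBrowserTool(name: string): name is BrowserToolName {
  return ALL_BROWSER_TOOLS.has(name);
}
```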
+
+ ## How Refs Work
+
+ AgentBrowser uses accessibility tree refs for precise element targeting:
+
+ 1. Call `browser_snapshot` to get the page structure with refs
+ 2. Find the element you want to interact with
+ 3. Use its ref with other tools
+
+ ```text
+ [document] Example Page
+   [banner]
+     [link @e1] Home
+     [link @e2] About
+   [main]
+     [textbox @e3] Search...
+     [button @e4] Submit
+ ```
+
+ ```typescript
+ // Type in the search box
+ { tool: "browser_type", input: { ref: "@e3", text: "mastra" } }
+
+ // Click submit
+ { tool: "browser_click", input: { ref: "@e4" } }
+ ```
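The three steps above (snapshot, find, act) can be sketched outside the agent loop as plain text processing. The parser below assumes the snapshot looks exactly like the sample in the README, with lines of the form `[role @eN] name`; the real `browser_snapshot` output may carry more detail, and `parseRefs` and `findRef` are hypothetical helpers, not package APIs.

```typescript
interface SnapshotElement {
  role: string;
  ref: string;
  name: string;
}

// Matches lines such as "[link @e1] Home"; lines without a ref are skipped.
const REF_LINE = /^\s*\[(\w+)\s+(@e\d+)\]\s*(.*)$/;

// Step 1 output -> structured list of elements that carry a ref.
function parseRefs(snapshot: string): SnapshotElement[] {
  const elements: SnapshotElement[] = [];
  for (const line of snapshot.split('\n')) {
    const match = REF_LINE.exec(line);
    if (match) {
      const [, role = '', ref = '', name = ''] = match;
      elements.push({ role, ref, name: name.trim() });
    }
  }
  return elements;
}

// Step 2: locate an element by role and accessible name, returning its ref.
function findRef(snapshot: string, role: string, name: string): string | undefined {
  return parseRefs(snapshot).find((el) => el.role === role && el.name === name)?.ref;
}

// Sample snapshot text taken from the README.
const SAMPLE_SNAPSHOT = [
  '[document] Example Page',
  '  [banner]',
  '    [link @e1] Home',
  '    [link @e2] About',
  '  [main]',
  '    [textbox @e3] Search...',
  '    [button @e4] Submit',
].join('\n');

// Step 3: the ref found here would be passed as `input.ref` to a tool call.
const searchBoxRef = findRef(SAMPLE_SNAPSHOT, 'textbox', 'Search...');
```

In normal use the model performs this lookup itself from the snapshot text; a helper like this is mainly useful for tests or scripted checks.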
+
+ ## Comparison with StagehandBrowser
+
+ | Feature     | AgentBrowser             | StagehandBrowser             |
+ | ----------- | ------------------------ | ---------------------------- |
+ | Approach    | Deterministic refs (@e1) | Natural language             |
+ | Token cost  | Low                      | Higher (LLM calls)           |
+ | Speed       | Fast                     | Slower                       |
+ | Reliability | High (exact refs)        | Variable (AI interpretation) |
+ | Best for    | Structured workflows     | Unknown/dynamic pages        |
+
+ ## Documentation
+
+ - [agent-browser guide](https://mastra.ai/docs/browser/agent-browser) - Usage guide
+ - [AgentBrowser reference](https://mastra.ai/reference/browser/agent-browser) - API reference
+
+ ## License
+
+ Apache-2.0