opencode-agent-browser 1.0.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,332 @@
1
+ # opencode-agent-browser
2
+
3
+ OpenCode plugin for [agent-browser](https://github.com/vercel-labs/agent-browser) - browser automation for AI agents with persistent sessions, dev tools, and video streaming.
4
+
5
+ ## Features
6
+
7
+ - **Auto Viewport 1920x1080** - Browser opens with a consistent viewport size
8
+ - **Cookie Banner Handling** - Automatically dismisses cookie consent popups
9
+ - **Persistent Sessions** - Cookies are saved across navigations; log in once and stay authenticated
10
+ - **Full Dev Tools** - Console logs, network requests, cookies, localStorage, sessionStorage
11
+ - **JavaScript Execution** - Run arbitrary JS in page context
12
+ - **Video Streaming** - Real-time WebSocket streaming for visual monitoring
13
+ - **Network Mocking** - Intercept and mock API responses
14
+ - **Zero Config** - Works out of the box
15
+
16
+ ## Installation
17
+
18
+ ### 1. Install agent-browser CLI
19
+
20
+ ```bash
21
+ npm install -g agent-browser
22
+ agent-browser install # Download Chromium
23
+ ```
24
+
25
+ ### 2. Install this plugin
26
+
27
+ ```bash
28
+ # In your OpenCode config directory
29
+ cd ~/.config/opencode
30
+ npm install opencode-agent-browser
31
+ ```
32
+
33
+ Add to your `opencode.json`:
34
+
35
+ ```json
36
+ {
37
+ "plugin": ["opencode-agent-browser"]
38
+ }
39
+ ```
40
+
41
+ ## Usage Examples
42
+
43
+ ### Basic Navigation
44
+ ```
45
+ "Go to amazon.it and accept the cookies"
46
+ "Navigate to github.com and take a screenshot"
47
+ ```
48
+
49
+ ### Web Scraping
50
+ ```
51
+ "Scrape product prices from this Amazon search page"
52
+ "Get all the links from the homepage"
53
+ "Extract the main article text from this news page"
54
+ ```
55
+
56
+ ### Form Automation
57
+ ```
58
+ "Fill the login form with username 'test' and password 'test123'"
59
+ "Submit the contact form with my details"
60
+ "Search for 'laptop' on Amazon"
61
+ ```
62
+
63
+ ### Debugging & Development
64
+ ```
65
+ "Check the browser console for JavaScript errors"
66
+ "Show me all network requests to the API"
67
+ "What cookies does this site set?"
68
+ "Get the localStorage data"
69
+ ```
70
+
71
+ ### Testing
72
+ ```
73
+ "Test if the checkout flow works correctly"
74
+ "Verify the login redirects to dashboard"
75
+ "Check if the form validation shows error messages"
76
+ ```
77
+
78
+ ## How It Works
79
+
80
+ ### Architecture
81
+
82
+ ```
83
+ OpenCode                   Plugin                      agent-browser
84
+    |                          |                              |
85
+    |-- loads plugin --------->|                              |
86
+    |                          |-- injects skill awareness -->|
87
+    |                          |                              |
88
+    |-- "go to amazon" ------->|                              |
89
+    |                          |-- load_agent_browser_skill ->|
90
+    |                          |                              |
91
+    |                          |<-- skill instructions -------|
92
+    |                          |                              |
93
+    |                          |-- bash: agent-browser ------>|
94
+    |                          |                              |
95
+    |<-- page opened ----------|<-- success -----------------|
96
+ ```
97
+
98
+ ### Plugin Components
99
+
100
+ 1. **Skill Injection** - Adds `<available-skills>` to the system prompt so the AI knows when to use browser automation
101
+
102
+ 2. **Tool Registration** - Registers a `load_agent_browser_skill` tool that injects detailed instructions into the conversation
103
+
104
+ 3. **Skill Template** - Comprehensive documentation for agent-browser CLI including:
105
+ - Opening browser with correct viewport
106
+ - Cookie banner handling
107
+ - Element interaction via @refs
108
+ - Dev tools access
109
+ - Troubleshooting
110
+
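The skill-injection step in item 1 above can be pictured with a small sketch. It is not taken from the package's `index.ts`, which this diff does not include. The `SkillEntry` type, the `injectAvailableSkills` name, and the layout inside `<available-skills>` are all assumptions made for illustration; the README only names the outer tag.

```typescript
// Hypothetical shape of a skill entry; not the plugin's real type.
interface SkillEntry {
  name: string;
  description: string;
}

// Append an <available-skills> block to a system prompt.
// The inner <skill> layout is an assumption, not the plugin's actual format.
function injectAvailableSkills(systemPrompt: string, skills: SkillEntry[]): string {
  if (skills.length === 0) return systemPrompt; // nothing to advertise
  const entries = skills
    .map((s) => `  <skill name="${s.name}">${s.description}</skill>`)
    .join("\n");
  return `${systemPrompt}\n\n<available-skills>\n${entries}\n</available-skills>`;
}

const augmentedPrompt = injectAvailableSkills("You are a coding assistant.", [
  { name: "agent-browser", description: "Browser automation via the agent-browser CLI" },
]);
console.log(augmentedPrompt);
```

Keeping the injected block short and deferring the full instructions to the `load_agent_browser_skill` tool matches the two-step flow in the architecture diagram: the prompt only advertises the skill, and the detailed template is loaded on demand.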
111
+ ## Quick Reference
112
+
113
+ ### Open Browser (MUST use this pattern the first time)
114
+ ```bash
115
+ pkill -f agent-browser; sleep 1; agent-browser open <url> --headed && agent-browser set viewport 1920 1080
116
+ ```
117
+
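For callers that assemble the first-launch command programmatically, the pattern above might be wrapped as follows. This is a minimal sketch: `shellQuote` and `buildOpenCommand` are hypothetical helpers, not part of the plugin or of agent-browser, and the only CLI syntax used is the command shown in the README.

```typescript
// Quote a value for a POSIX shell by wrapping it in single quotes.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build the first-launch command string from the Quick Reference pattern.
// Defaults mirror the README's 1920x1080 viewport.
function buildOpenCommand(url: string, width = 1920, height = 1080): string {
  if (!Number.isInteger(width) || !Number.isInteger(height) || width <= 0 || height <= 0) {
    throw new RangeError("viewport dimensions must be positive integers");
  }
  return [
    "pkill -f agent-browser; sleep 1;",
    `agent-browser open ${shellQuote(url)} --headed`,
    `&& agent-browser set viewport ${width} ${height}`,
  ].join(" ");
}

console.log(buildOpenCommand("https://example.com"));
```

Quoting the URL matters because query strings often contain `&` or `?`, which an unquoted shell invocation would interpret.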
118
+ ### Handle Cookie Banner
119
+ ```bash
120
+ agent-browser snapshot -i # Check for cookie banner
121
+ agent-browser click @eX # Click "Accept" button
122
+ ```
123
+
124
+ ### Interact with Elements
125
+ ```bash
126
+ agent-browser snapshot -i # Get interactive elements with @refs
127
+ agent-browser click @e1 # Click element
128
+ agent-browser fill @e2 "text" # Fill input field
129
+ agent-browser press Enter # Press key
130
+ agent-browser select @e1 "value" # Select dropdown option
131
+ agent-browser scroll down 500 # Scroll page
132
+ ```
133
+
134
+ ### Screenshots
135
+ ```bash
136
+ agent-browser screenshot ./screenshot.png # Viewport
137
+ agent-browser screenshot ./screenshot.png --full # Full page
138
+ ```
139
+
140
+ ### Dev Tools (ALWAYS use --json!)
141
+ ```bash
142
+ agent-browser console --json # Console logs
143
+ agent-browser errors --json # Page errors only
144
+ agent-browser cookies get --json # Get all cookies
145
+ agent-browser storage local --json # Get localStorage
146
+ agent-browser storage session --json # Get sessionStorage
147
+ agent-browser eval "document.title" # Execute JavaScript
148
+ ```
149
+
150
+ ### Network
151
+ ```bash
152
+ agent-browser network requests --json # View requests
153
+ agent-browser network route "*/api/*" --abort # Block requests
154
+ agent-browser network route "*/api/*" --body '{}' # Mock response
155
+ ```
156
+
157
+ ### Video Streaming
158
+ ```bash
159
+ AGENT_BROWSER_STREAM_PORT=9223 agent-browser open <url> --headed
160
+ # Connect via WebSocket at ws://localhost:9223
161
+ ```
162
+
163
+ ## Customization
164
+
165
+ ### Modify the Plugin
166
+
167
+ Clone and customize:
168
+
169
+ ```bash
170
+ git clone https://github.com/crottolo/opencode-agent-browser.git
171
+ cd opencode-agent-browser
172
+ ```
173
+
174
+ Edit `index.ts` to customize:
175
+
176
+ #### Change Default Viewport
177
+ ```typescript
178
+ // Find this line in skillTemplate:
179
+ agent-browser set viewport 1920 1080
180
+
181
+ // Change to your preferred size:
182
+ agent-browser set viewport 1440 900
183
+ ```
184
+
185
+ #### Add Custom Headers
186
+ ```typescript
187
+ // Add to the open command:
188
+ agent-browser open <url> --headed --headers '{"Accept-Language": "en-US"}'
189
+ ```
190
+
191
+ #### Modify Cookie Banner Behavior
192
+ ```typescript
193
+ // In the RULES section, change:
194
+ 2. **Cookie banner**: ALWAYS dismiss cookie banners...
195
+
196
+ // To:
197
+ 2. **Cookie banner**: ASK user before dismissing...
198
+ ```
199
+
200
+ ### Build and Use Locally
201
+
202
+ ```bash
203
+ bun install
204
+ bun run build
205
+
206
+ # In opencode.json, point the plugin entry at the local path:
207
+ # {
208
+ #   "plugin": ["/path/to/opencode-agent-browser"]
209
+ # }
210
+ ```
211
+
212
+ ## Troubleshooting
213
+
214
+ ### "Browser not launched" Error
215
+ ```bash
216
+ # Kill any existing daemon and restart:
217
+ pkill -f agent-browser; sleep 1; agent-browser open <url> --headed
218
+ ```
219
+
220
+ ### Browser Window Not Visible
221
+ ```bash
222
+ # Always use --headed flag:
223
+ agent-browser open <url> --headed # Correct
224
+ agent-browser open <url> # Wrong - headless mode
225
+ ```
226
+
227
+ ### Console/Cookies Show Empty
228
+ ```bash
229
+ # Always use --json flag for dev tools:
230
+ agent-browser console --json # Correct
231
+ agent-browser console # Wrong - may show empty
232
+ ```
233
+
234
+ ### Cookie Banner Not Detected
235
+ ```bash
236
+ # Run snapshot to see all interactive elements:
237
+ agent-browser snapshot -i
238
+
239
+ # Look for buttons with text like "Accept", "Accetta", "OK", "Agree"
240
+ # Then click the correct @ref
241
+ ```
242
+
243
+ ### Session Lost / Need to Re-login
244
+ ```bash
245
+ # DON'T use 'close' - it deletes all cookies!
246
+ agent-browser close # Destroys session
247
+
248
+ # Instead, just navigate to new URLs:
249
+ agent-browser open <new-url> # Keeps cookies
250
+ ```
251
+
252
+ ### Element Not Found
253
+ ```bash
254
+ # Always re-snapshot after navigation or page changes:
255
+ agent-browser open <url>
256
+ agent-browser snapshot -i # Get fresh @refs
257
+ agent-browser click @e1 # Use new refs
258
+ ```
259
+
260
+ ## Auto-Trigger Prompts
261
+
262
+ The plugin automatically suggests loading the skill when your prompt contains:
263
+
264
+ | Category | Trigger Words |
265
+ |----------|---------------|
266
+ | Screenshots | screenshot, capture, snapshot |
267
+ | Scraping | scrape, extract, get data from |
268
+ | Navigation | go to, navigate, open website, browse |
269
+ | Forms | fill form, submit, login, signup |
270
+ | Testing | test website, verify, check page |
271
+ | Debugging | console log, network request, cookies |
272
+
273
+ ### Examples That Auto-Trigger
274
+ - "Take a screenshot of google.com"
275
+ - "Scrape the prices from Amazon"
276
+ - "Fill the contact form on the website"
277
+ - "Check if there are any JavaScript errors"
278
+ - "Navigate to the login page and sign in"
279
+
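The trigger table above can be read as a simple lookup. The sketch below assumes plain case-insensitive substring matching, which the README does not confirm; `TRIGGER_TABLE` and `matchTriggerCategories` are illustrative names, and only the phrases themselves come from the table.

```typescript
// Trigger phrases copied from the Auto-Trigger Prompts table.
const TRIGGER_TABLE: Array<{ category: string; phrases: string[] }> = [
  { category: "Screenshots", phrases: ["screenshot", "capture", "snapshot"] },
  { category: "Scraping", phrases: ["scrape", "extract", "get data from"] },
  { category: "Navigation", phrases: ["go to", "navigate", "open website", "browse"] },
  { category: "Forms", phrases: ["fill form", "submit", "login", "signup"] },
  { category: "Testing", phrases: ["test website", "verify", "check page"] },
  { category: "Debugging", phrases: ["console log", "network request", "cookies"] },
];

// Return the categories whose trigger phrases occur in the prompt.
// Substring matching is an assumption, not the plugin's documented behavior.
function matchTriggerCategories(userPrompt: string): string[] {
  const text = userPrompt.toLowerCase();
  return TRIGGER_TABLE
    .filter((row) => row.phrases.some((phrase) => text.indexOf(phrase) !== -1))
    .map((row) => row.category);
}

console.log(matchTriggerCategories("Take a screenshot of google.com"));
```

Under this strict reading, the listed example "Fill the contact form on the website" matches no phrase in the table. That suggests the published plugin either matches more loosely or leaves the decision to the model; this diff does not show which.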
280
+ ## Requirements
281
+
282
+ - **OpenCode** >= 1.0.0
283
+ - **agent-browser** CLI installed globally
284
+ - **Chromium** (installed via `agent-browser install`)
285
+ - **macOS/Linux** (Windows support may vary)
286
+
287
+ ## Development
288
+
289
+ ```bash
290
+ # Clone
291
+ git clone https://github.com/crottolo/opencode-agent-browser.git
292
+ cd opencode-agent-browser
293
+
294
+ # Install dependencies
295
+ bun install
296
+
297
+ # Build
298
+ bun run build
299
+
300
+ # The build creates:
301
+ # - dist/index.js (bundled plugin)
302
+ # - dist/index.d.ts (TypeScript types)
303
+ ```
304
+
305
+ ### Project Structure
306
+ ```
307
+ opencode-agent-browser/
308
+ ├── index.ts # Main plugin code with skill template
309
+ ├── package.json # npm package config
310
+ ├── tsconfig.json # TypeScript config
311
+ ├── dist/ # Built output
312
+ │ ├── index.js
313
+ │ └── index.d.ts
314
+ └── README.md
315
+ ```
316
+
317
+ ## Contributing
318
+
319
+ 1. Fork the repository
320
+ 2. Create your feature branch (`git checkout -b feature/amazing-feature`)
321
+ 3. Commit your changes (`git commit -m 'Add amazing feature'`)
322
+ 4. Push to the branch (`git push origin feature/amazing-feature`)
323
+ 5. Open a Pull Request
324
+
325
+ ## License
326
+
327
+ MIT
328
+
329
+ ## Credits
330
+
331
+ - [agent-browser](https://github.com/vercel-labs/agent-browser) by Vercel Labs
332
+ - [OpenCode](https://opencode.ai) plugin system
@@ -0,0 +1,17 @@
1
+ import type { Plugin } from "@opencode-ai/plugin";
2
+ /**
3
+ * OpenCode plugin for agent-browser automation.
4
+ *
5
+ * Provides a skill for browser automation with persistent cookies,
6
+ * full dev tools access, and video streaming support.
7
+ *
8
+ * ## Configuration
9
+ *
10
+ * ```json
11
+ * {
12
+ * "plugin": ["opencode-agent-browser"]
13
+ * }
14
+ * ```
15
+ */
16
+ export declare const OpenCodeAgentBrowser: Plugin;
17
+ export default OpenCodeAgentBrowser;