opencode-browser-plugin 1.0.0

package/AGENTS.md ADDED
@@ -0,0 +1,34 @@
1
+ # AGENTS.md
2
+
3
+ ## What this is
4
+
5
+ An [OpenCode](https://opencode.ai) plugin that exposes Playwright browser-control tools. Single source file: `index.ts`.
6
+
7
+ ## Runtime & package manager
8
+
9
+ - **Bun** (not Node). Lockfile is `bun.lock`.
10
+ - ESM only (`"type": "module"`).
11
+ - No build step — the plugin entry is raw TypeScript via package.json `exports["./server"]`.
12
+
13
+ ## Key dependencies
14
+
15
+ - `@opencode-ai/plugin` — plugin API (`tool()`, `tool.schema`, `PluginInput`, `PluginOptions`).
16
+ - `playwright` — browser automation (only `chromium` is used).
17
+
18
+ ## Running / testing
19
+
20
+ - No tests, no CI, no build script. Changes are verified by running an OpenCode session that loads the plugin.
21
+ - Playwright browsers must be installed separately: `bunx playwright install chromium`. The failure message returned by the `browser` tool's `start` action repeats this reminder (`ensureBrowser()` itself only returns the raw error text).
22
+
23
+ ## Architecture notes
24
+
25
+ - **Persistent profile**: Browser data lives at `~/.opencode/browser-profile`. Created on first launch.
26
+ - **Idle timeout**: Browser auto-closes after 30 min of inactivity (`IDLE_TIMEOUT`). A watchdog polls every 60 s.
27
+ - **Tool multiplexing**: One `browser` tool accepts an `action` param to dispatch (start/stop/open/navigate/snapshot/screenshot/click/type/evaluate/wait/close/back). Four convenience shortcuts (`browser_start`, `browser_snapshot`, `browser_click`, `browser_type`) wrap the most common actions.
28
+ - **Ref-based targeting**: `buildSnapshot()` runs ARIA snapshot on `<body>`, assigns `e1`, `e2`, … refs to interactive elements, and stores them in `state.refs`. Click/type look up refs by role + name + nth index.
29
+ - **Multi-tab**: Pages tracked in `state.pages` Map keyed by `page_id`. `state.currentPageId` tracks the active tab.
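The idle-timeout note above can be illustrated without Playwright. This is a minimal sketch, assuming a hypothetical `IdleTracker` class with an injectable clock; the plugin itself keeps the same logic in module-level functions (`touchActivity()`, `startIdleWatchdog()`) and reads `Date.now()` directly. Only the 30-minute timeout and 60-second poll period come from the source.

```typescript
// Hypothetical sketch; the plugin keeps this logic in module-level functions.
const IDLE_TIMEOUT = 30 * 60 * 1000 // 30 minutes, as in index.ts
const POLL_INTERVAL = 60 * 1000     // watchdog period, as in index.ts

class IdleTracker {
  private lastActivity: number
  private readonly now: () => number

  // The clock is injectable so the logic can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now
    this.lastActivity = now()
  }

  // Called on every tool invocation to reset the idle period.
  touch(): void {
    this.lastActivity = this.now()
  }

  // A watchdog would call this on each poll and close the browser when true.
  isExpired(timeout: number = IDLE_TIMEOUT): boolean {
    return this.now() - this.lastActivity >= timeout
  }
}

// Expiry is only checked on a poll, so the browser can outlive the
// timeout by almost one poll interval.
const worstCaseCloseDelay = IDLE_TIMEOUT + POLL_INTERVAL

// Simulated clock, in milliseconds.
let fakeNow = 0
const tracker = new IdleTracker(() => fakeNow)

fakeNow = IDLE_TIMEOUT - 1
const expiredJustBefore = tracker.isExpired() // false

fakeNow = IDLE_TIMEOUT
const expiredAtTimeout = tracker.isExpired() // true

tracker.touch()
const expiredAfterTouch = tracker.isExpired() // false
```

Injecting the clock is a choice made for this sketch only; it keeps the timeout comparison testable without a 30-minute wait.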
30
+
31
+ ## Conventions
32
+
33
+ - All browser state is in a module-level `state` object — no classes, no external state store.
34
+ - Errors are returned as strings (not thrown) from tool execute handlers.
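The "errors are returned as strings" convention could be sketched as a wrapper. This is an illustration only: `returnErrorsAsStrings` is a hypothetical helper, whereas `index.ts` inlines an equivalent `try`/`catch` in the `browser` tool's `execute` handler. The `Error: ` prefix matches what that handler returns.

```typescript
// Hypothetical helper; index.ts inlines an equivalent try/catch instead.
type Handler<A> = (args: A) => Promise<string>

function returnErrorsAsStrings<A>(handler: Handler<A>): Handler<A> {
  return async (args: A): Promise<string> => {
    try {
      return await handler(args)
    } catch (error) {
      // Same normalisation index.ts uses for non-Error throwables.
      const message = error instanceof Error ? error.message : String(error)
      return `Error: ${message}`
    }
  }
}

// Stand-in handlers; no browser is involved.
const failing = returnErrorsAsStrings(async (_args: { url: string }) => {
  throw new Error("navigation timed out")
})
const succeeding = returnErrorsAsStrings(
  async (args: { url: string }) => `Opened ${args.url}`,
)
```

With this shape the caller always receives a string, so the agent sees the failure text as ordinary tool output rather than an exception.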
package/README.md ADDED
@@ -0,0 +1,131 @@
1
+ # OpenCode Browser Plugin
2
+
3
+ A browser automation plugin for [OpenCode](https://opencode.ai), powered by [Playwright](https://playwright.dev/). It exposes browser-control tools that allow an AI agent to navigate web pages, interact with elements, take screenshots, and execute JavaScript — all from within an OpenCode session.
4
+
5
+ [中文文档](./README.zh.md)
6
+
7
+ ## Features
8
+
9
+ - **Headless & Headed Modes** — Run the browser invisibly or with a visible window for debugging/demos.
10
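+ - **Ref-Based Element Targeting** — Each ARIA snapshot assigns `ref` identifiers (e.g. `e1`, `e2`) to interactive elements, so clicks and typing can target a ref instead of a CSS selector. Refs are regenerated on every snapshot; take a new snapshot after the page changes.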
11
+ - **Multi-Tab Support** — Open and manage multiple pages simultaneously via `page_id`.
12
+ - **Persistent Browser Profile** — Session data (cookies, localStorage, etc.) is preserved at `~/.opencode/browser-profile`.
13
+ - **Idle Auto-Close** — Browser automatically shuts down after 30 minutes of inactivity to conserve resources.
14
+ - **Convenience Shortcuts** — Four dedicated shortcut tools (`browser_start`, `browser_snapshot`, `browser_click`, `browser_type`) for the most common actions.
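The multi-tab bookkeeping behind `page_id` can be sketched as a small registry. This is a hedged illustration: `TabRegistry` is a hypothetical class, and plain strings stand in for Playwright `Page` objects. The plugin stores the same data in `state.pages`, `state.currentPageId` and `state.pageCounter`. The fallback order (explicit ID, then current tab, then `"default"`) follows the shortcut tools in `index.ts`.

```typescript
// Hypothetical registry; P stands in for a Playwright Page.
class TabRegistry<P> {
  private pages = new Map<string, P>()
  private counter = 0
  current: string | null = null

  // Register a page under the given ID, or generate page_1, page_2, ...
  open(page: P, pageId?: string): string {
    const id = pageId || `page_${++this.counter}`
    this.pages.set(id, page)
    this.current = id
    return id
  }

  // Explicit ID first, then the current tab, then the literal "default".
  resolve(pageId?: string): P | undefined {
    return this.pages.get(pageId || this.current || "default")
  }

  // After closing the current tab, fall back to the first remaining one.
  close(pageId?: string): boolean {
    const id = pageId || this.current
    if (!id || !this.pages.delete(id)) return false
    if (this.current === id) {
      this.current = Array.from(this.pages.keys())[0] ?? null
    }
    return true
  }
}

const tabs = new TabRegistry<string>()
const firstId = tabs.open("page A")          // generated ID
const secondId = tabs.open("page B", "docs") // caller-chosen ID
const active = tabs.resolve()                // the most recently opened page
const closed = tabs.close()                  // closes "docs"
const fallback = tabs.current                // back to the first tab
```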
15
+
16
+ ## Prerequisites
17
+
18
+ - [Bun](https://bun.sh/) runtime
19
+ - [OpenCode](https://opencode.ai) CLI
20
+ - Playwright Chromium browser:
21
+
22
+ ```bash
23
+ bunx playwright install chromium
24
+ ```
25
+
26
+ ## Installation
27
+
28
+ Install the plugin:
29
+
30
+ ```bash
31
+ git clone https://github.com/heimoshuiyu/opencode-browser-plugin.git
32
+ cd opencode-browser-plugin
33
+ bun install
34
+ ```
35
+
36
+ Then configure it in your OpenCode settings. Add the plugin path to `opencode.json` (or `opencode.jsonc`):
37
+
38
+ ```jsonc
39
+ {
40
+ "$schema": "https://opencode.ai/config.json",
41
+ "plugin": [
42
+ "/path/to/opencode-browser-plugin"
43
+ ]
44
+ }
45
+ ```
46
+
47
+ > **Tip:** You can also use a relative path like `"./opencode-browser-plugin"` (resolved from the config file's directory).
48
+
49
+ ## Usage
50
+
51
+ ### Basic Workflow
52
+
53
+ ```
54
+ 1. Start the browser → browser(action: "start")
55
+ 2. Open a URL → browser(action: "open", url: "https://example.com")
56
+ 3. Take a snapshot → browser(action: "snapshot")
57
+ 4. Interact with elements → browser(action: "click", ref: "e1")
58
+ browser(action: "type", ref: "e3", text: "hello")
59
+ 5. Stop when done → browser(action: "stop")
60
+ ```
61
+
62
+ ### Main Tool — `browser`
63
+
64
+ A single multiplexed tool that accepts an `action` parameter:
65
+
66
+ | Action | Description | Key Parameters |
67
+ | ------------ | ----------------------------------------------- | ---------------------------------- |
68
+ | `start` | Launch the browser | `headed` |
69
+ | `stop` | Close the browser and clean up | — |
70
+ | `open` | Open a URL in a new tab | `url`, `page_id` |
71
+ | `navigate` | Navigate the current tab to a URL | `url` |
72
+ | `back` | Go back in browser history | — |
73
+ | `snapshot` | Capture an ARIA snapshot with element refs | `path` |
74
+ | `screenshot` | Take a screenshot of the page or an element | `ref`, `path`, `full_page` |
75
+ | `click` | Click an element | `ref`, `selector`, `wait` |
76
+ | `type` | Type text into an element | `ref`, `selector`, `text`, `submit`, `slowly` |
77
+ | `evaluate` | Execute JavaScript in the page context | `code` |
78
+ | `wait`       | Wait `wait` ms, or for network idle when `wait` is 0 | `wait`, `timeout`          |
79
+ | `close` | Close a specific tab | `page_id` |
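The action table above amounts to a dispatcher keyed on a normalised `action` string. The sketch below swaps the `if` chain used in `index.ts` for a lookup table, purely for brevity. The handlers are stand-ins that return the plugin's status strings without touching a browser, and `dispatch`, `handlers` and `BrowserArgs` are hypothetical names.

```typescript
// Hypothetical names; index.ts uses an if-chain inside execute() instead.
type BrowserArgs = { action: string; url?: string }
type ActionHandler = (args: BrowserArgs) => string

// Stand-in handlers for three of the twelve actions.
const handlers = new Map<string, ActionHandler>([
  ["start", () => "Browser started (headless)"],
  ["open", (args) => (args.url ? `Opened ${args.url}` : "Error: url required for open action")],
  ["stop", () => "Browser stopped"],
])

function dispatch(args: BrowserArgs): string {
  // Same normalisation as index.ts: case-insensitive, surrounding spaces ignored.
  const action = args.action.toLowerCase().trim()
  const handler = handlers.get(action)
  if (!handler) {
    const available = Array.from(handlers.keys()).join(", ")
    return `Error: Unknown action '${action}'. Available: ${available}`
  }
  return handler(args)
}

const opened = dispatch({ action: "  OPEN ", url: "https://example.com" })
const missingUrl = dispatch({ action: "open" })
const unknown = dispatch({ action: "reload" })
```

A `Map` is used rather than an object literal so that inherited keys such as `constructor` are never mistaken for actions.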
80
+
81
+ ### Shortcut Tools
82
+
83
+ | Tool | Description | Key Parameters |
84
+ | ------------------ | ---------------------------------------- | --------------------- |
85
+ | `browser_start` | Start browser (headed by default) | `headed` |
86
+ | `browser_snapshot` | Get interactive element refs from page | `page_id` |
87
+ | `browser_click` | Click an element by ref | `ref`, `page_id` |
88
+ | `browser_type` | Type text into an element by ref | `ref`, `text`, `submit`, `page_id` |
89
+
90
+ ### Example: Search on a Website
91
+
92
+ ```
93
+ 1. browser_start(headed: true)
94
+ 2. browser(action: "open", url: "https://www.google.com")
95
+ 3. browser_snapshot()
96
+ → Returns refs like: e1 (textbox), e2 (button "Google Search"), ...
97
+ 4. browser_type(ref: "e1", text: "OpenCode AI")
98
+ 5. browser_click(ref: "e2")
99
+ ```
100
+
101
+ ### Example: Take a Screenshot
102
+
103
+ ```
104
+ browser(action: "screenshot", path: "output.png", full_page: true)
105
+ ```
106
+
107
+ ### Example: Execute JavaScript
108
+
109
+ ```
110
+ browser(action: "evaluate", code: "document.title")
111
+ ```
112
+
113
+ ## Architecture
114
+
115
+ - **Single source file** — Everything lives in `index.ts`. No build step required.
116
+ - **Module-level state** — All browser state is held in a plain `state` object (context, pages, refs, etc.).
117
+ - **Ref system** — `buildSnapshot()` runs Playwright's ARIA snapshot on `<body>`, assigns sequential `e1`, `e2`, … refs to interactive elements (buttons, links, textboxes, etc.), and stores them per page. Click/type actions resolve refs via `getLocatorByRef()` using `page.getByRole()`.
118
+ - **ESM only** — The project uses `"type": "module"` and Bun's native TypeScript execution.
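The ref system described above can be reproduced as a pure string transformation. This is a minimal sketch, assuming a hypothetical `assignRefs` function that takes the ARIA snapshot text as input; in the plugin, `buildSnapshot()` obtains that text from Playwright and writes the result into `state.refs`. The regular expression and role list are copied from `index.ts`.

```typescript
type RefInfo = { role: string; name: string | undefined; nth: number | undefined }

// Roles treated as interactive, as listed in index.ts.
const INTERACTIVE_ROLES = new Set([
  "button", "link", "textbox", "checkbox", "radio", "combobox",
  "listbox", "menuitem", "searchbox", "slider", "switch", "tab",
])

// Hypothetical pure version of the annotation step in buildSnapshot().
function assignRefs(snapshot: string): { annotated: string; refs: Map<string, RefInfo> } {
  const refs = new Map<string, RefInfo>()
  const seen = new Map<string, number>() // occurrences per role + name
  let counter = 0

  const annotated = snapshot.split("\n").map((line) => {
    const match = line.match(/^(\s*)-\s*(\w+)(?:\s+"([^"]*)")?/)
    if (!match) return line

    const indent = match[1] ?? ""
    const role = (match[2] ?? "").toLowerCase()
    const name: string | undefined = match[3]

    // nth is the zero-based index among elements sharing role and name.
    const key = `${role}:${name ?? ""}`
    const nth = seen.get(key) ?? 0
    seen.set(key, nth + 1)

    if (!INTERACTIVE_ROLES.has(role)) return line

    counter++
    const ref = `e${counter}`
    refs.set(ref, { role, name, nth: nth > 0 ? nth : undefined })

    let out = `${indent}- ${role}`
    if (name) out += ` "${name}"`
    out += ` [ref=${ref}]`
    if (nth > 0) out += ` [nth=${nth}]`
    return out
  }).join("\n")

  return { annotated, refs }
}

const sample = [
  '- heading "Search"',
  '- textbox "Query"',
  '- button "Go"',
  '- button "Go"',
].join("\n")

const result = assignRefs(sample)
// result.annotated:
// - heading "Search"
// - textbox "Query" [ref=e1]
// - button "Go" [ref=e2]
// - button "Go" [ref=e3] [nth=1]
```

Because the counter restarts on each call, a ref such as `e2` is only meaningful for the snapshot that produced it.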
119
+
120
+ ## Configuration
121
+
122
+ | Setting | Default | Description |
123
+ | ------- | ------- | ----------- |
124
+ | Profile directory | `~/.opencode/browser-profile` | Browser data (cookies, storage, etc.) |
125
+ | Idle timeout | 30 minutes | Auto-close browser after inactivity |
126
+ | Viewport | 1280 × 720 | Default browser viewport size |
127
+ | Default timeout | 30 000 ms | Timeout for page navigation and actions |
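For reference, the defaults in the table can be written out as constants. This is a sketch only: `DEFAULTS` and `profileDir` are hypothetical names, and in `index.ts` the values are hard-coded (`PROFILE_DIR`, `IDLE_TIMEOUT`, the `viewport` launch option) with only the timeout overridable per call through the `timeout` argument. The path separator is a parameter here so the example does not depend on the host OS; the plugin uses `path.join` with `os.homedir()`.

```typescript
// Hypothetical grouping of the values that index.ts hard-codes.
const DEFAULTS = {
  profileSubpath: [".opencode", "browser-profile"],
  idleTimeoutMs: 30 * 60 * 1000,        // 30 minutes
  viewport: { width: 1280, height: 720 },
  timeoutMs: 30_000,                    // per-call default for navigation and actions
} as const

// Builds the profile directory under a given home directory.
function profileDir(homeDir: string, sep: string = "/"): string {
  return [homeDir, ...DEFAULTS.profileSubpath].join(sep)
}

const examplePath = profileDir("/home/alice")
// "/home/alice/.opencode/browser-profile"
```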
128
+
129
+ ## License
130
+
131
+ MIT
package/README.zh.md ADDED
@@ -0,0 +1,131 @@
1
+ # OpenCode 浏览器插件
2
+
3
+ 一个基于 [Playwright](https://playwright.dev/) 的浏览器自动化插件,专为 [OpenCode](https://opencode.ai) 设计。它提供浏览器控制工具,让 AI 代理可以在 OpenCode 会话中浏览网页、与页面元素交互、截图以及执行 JavaScript。
4
+
5
+ [English](./README.md)
6
+
7
+ ## 功能特性
8
+
9
+ - **无头与有头模式** — 支持无界面运行或打开可见窗口用于调试/演示。
10
+ - **基于 Ref 的元素定位** — 通过 ARIA 快照为可交互元素生成稳定的 `ref` 标识(如 `e1`、`e2`),实现可靠的点击和输入操作。
11
+ - **多标签页支持** — 通过 `page_id` 同时打开和管理多个页面。
12
+ - **持久化浏览器配置** — 会话数据(cookies、localStorage 等)保存在 `~/.opencode/browser-profile`。
13
+ - **空闲自动关闭** — 浏览器在 30 分钟无操作后自动关闭,节省资源。
14
+ - **快捷工具** — 提供四个专用快捷工具(`browser_start`、`browser_snapshot`、`browser_click`、`browser_type`)用于最常用的操作。
15
+
16
+ ## 前置条件
17
+
18
+ - [Bun](https://bun.sh/) 运行时
19
+ - [OpenCode](https://opencode.ai) CLI
20
+ - Playwright Chromium 浏览器:
21
+
22
+ ```bash
23
+ bunx playwright install chromium
24
+ ```
25
+
26
+ ## 安装
27
+
28
+ 克隆并安装插件:
29
+
30
+ ```bash
31
+ git clone https://github.com/heimoshuiyu/opencode-browser-plugin.git
32
+ cd opencode-browser-plugin
33
+ bun install
34
+ ```
35
+
36
+ 然后在 OpenCode 配置文件中添加插件路径。在 `opencode.json`(或 `opencode.jsonc`)中添加:
37
+
38
+ ```jsonc
39
+ {
40
+ "$schema": "https://opencode.ai/config.json",
41
+ "plugin": [
42
+ "/path/to/opencode-browser-plugin"
43
+ ]
44
+ }
45
+ ```
46
+
47
+ > **提示:** 也可以使用相对路径,如 `"./opencode-browser-plugin"`(相对于配置文件所在目录解析)。
48
+
49
+ ## 使用方法
50
+
51
+ ### 基本流程
52
+
53
+ ```
54
+ 1. 启动浏览器 → browser(action: "start")
55
+ 2. 打开网址 → browser(action: "open", url: "https://example.com")
56
+ 3. 获取页面快照 → browser(action: "snapshot")
57
+ 4. 与元素交互 → browser(action: "click", ref: "e1")
58
+ browser(action: "type", ref: "e3", text: "hello")
59
+ 5. 完成后关闭 → browser(action: "stop")
60
+ ```
61
+
62
+ ### 主工具 — `browser`
63
+
64
+ 一个通过 `action` 参数进行调度的多路复用工具:
65
+
66
+ | 操作 | 说明 | 关键参数 |
67
+ | ------------ | -------------------------- | ------------------------------------ |
68
+ | `start` | 启动浏览器 | `headed` |
69
+ | `stop` | 关闭浏览器并清理 | — |
70
+ | `open` | 在新标签页中打开 URL | `url`, `page_id` |
71
+ | `navigate` | 在当前标签页导航到 URL | `url` |
72
+ | `back` | 浏览器后退 | — |
73
+ | `snapshot` | 获取带元素 ref 的 ARIA 快照 | `path` |
74
+ | `screenshot` | 截取页面或元素的截图 | `ref`, `path`, `full_page` |
75
+ | `click` | 点击元素 | `ref`, `selector`, `wait` |
76
+ | `type` | 在元素中输入文本 | `ref`, `selector`, `text`, `submit`, `slowly` |
77
+ | `evaluate` | 在页面上下文中执行 JavaScript | `code` |
78
+ | `wait` | 等待一段时间或网络空闲 | `wait` |
79
+ | `close` | 关闭指定标签页 | `page_id` |
80
+
81
+ ### 快捷工具
82
+
83
+ | 工具 | 说明 | 关键参数 |
84
+ | ------------------ | -------------------------- | ------------------------- |
85
+ | `browser_start` | 启动浏览器(默认有头模式) | `headed` |
86
+ | `browser_snapshot` | 获取页面可交互元素的 ref | `page_id` |
87
+ | `browser_click` | 通过 ref 点击元素 | `ref`, `page_id` |
88
+ | `browser_type` | 通过 ref 在元素中输入文本 | `ref`, `text`, `submit`, `page_id` |
89
+
90
+ ### 示例:在网站中搜索
91
+
92
+ ```
93
+ 1. browser_start(headed: true)
94
+ 2. browser(action: "open", url: "https://www.google.com")
95
+ 3. browser_snapshot()
96
+ → 返回 ref 如:e1 (textbox), e2 (button "Google Search"), ...
97
+ 4. browser_type(ref: "e1", text: "OpenCode AI")
98
+ 5. browser_click(ref: "e2")
99
+ ```
100
+
101
+ ### 示例:截图
102
+
103
+ ```
104
+ browser(action: "screenshot", path: "output.png", full_page: true)
105
+ ```
106
+
107
+ ### 示例:执行 JavaScript
108
+
109
+ ```
110
+ browser(action: "evaluate", code: "document.title")
111
+ ```
112
+
113
+ ## 架构
114
+
115
+ - **单文件** — 所有代码都在 `index.ts` 中,无需构建步骤。
116
+ - **模块级状态** — 浏览器状态保存在一个普通的 `state` 对象中(context、pages、refs 等)。
117
+ - **Ref 系统** — `buildSnapshot()` 在 `<body>` 上运行 Playwright 的 ARIA 快照,为可交互元素(按钮、链接、输入框等)分配递增的 `e1`、`e2`、… ref,并按页面存储。点击/输入操作通过 `getLocatorByRef()` 使用 `page.getByRole()` 解析 ref。
118
+ - **纯 ESM** — 项目使用 `"type": "module"`,通过 Bun 原生执行 TypeScript。
119
+
120
+ ## 配置项
121
+
122
+ | 配置项 | 默认值 | 说明 |
123
+ | ------ | ------ | ---- |
124
+ | 配置文件目录 | `~/.opencode/browser-profile` | 浏览器数据(cookies、存储等) |
125
+ | 空闲超时 | 30 分钟 | 无操作后自动关闭浏览器 |
126
+ | 视口大小 | 1280 × 720 | 默认浏览器视口尺寸 |
127
+ | 默认超时 | 30 000 ms | 页面导航和操作的超时时间 |
128
+
129
+ ## 许可证
130
+
131
+ MIT
package/bun.lock ADDED
@@ -0,0 +1,38 @@
1
+ {
2
+ "lockfileVersion": 1,
3
+ "configVersion": 1,
4
+ "workspaces": {
5
+ "": {
6
+ "name": "opencode-browser-plugin",
7
+ "dependencies": {
8
+ "@opencode-ai/plugin": "*",
9
+ "playwright": "^1.52.0",
10
+ },
11
+ },
12
+ },
13
+ "packages": {
14
+ "@opencode-ai/plugin": ["@opencode-ai/plugin@1.4.3", "", { "dependencies": { "@opencode-ai/sdk": "1.4.3", "zod": "4.1.8" }, "peerDependencies": { "@opentui/core": ">=0.1.97", "@opentui/solid": ">=0.1.97" }, "optionalPeers": ["@opentui/core", "@opentui/solid"] }, "sha512-Ob/3tVSIeuMRJBr2O23RtrnC5djRe01Lglx+TwGEmjrH9yDBJ2tftegYLnNEjRoMuzITgq9LD8168p4pzv+U/A=="],
15
+
16
+ "@opencode-ai/sdk": ["@opencode-ai/sdk@1.4.3", "", { "dependencies": { "cross-spawn": "7.0.6" } }, "sha512-X0CAVbwoGAjTY2iecpWkx2B+GAa2jSaQKYpJ+xILopeF/OGKZUN15mjqci+L7cEuwLHV5wk3x2TStUOVCa5p0A=="],
17
+
18
+ "cross-spawn": ["cross-spawn@7.0.6", "", { "dependencies": { "path-key": "^3.1.0", "shebang-command": "^2.0.0", "which": "^2.0.1" } }, "sha512-uV2QOWP2nWzsy2aMp8aRibhi9dlzF5Hgh5SHaB9OiTGEyDTiJJyx0uy51QXdyWbtAHNua4XJzUKca3OzKUd3vA=="],
19
+
20
+ "fsevents": ["fsevents@2.3.2", "", { "os": "darwin" }, "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA=="],
21
+
22
+ "isexe": ["isexe@2.0.0", "", {}, "sha512-RHxMLp9lnKHGHRng9QFhRCMbYAcVpn69smSGcq3f36xjgVVWThj4qqLbTLlq7Ssj8B+fIQ1EuCEGI2lKsyQeIw=="],
23
+
24
+ "path-key": ["path-key@3.1.1", "", {}, "sha512-ojmeN0qd+y0jszEtoY48r0Peq5dwMEkIlCOu6Q5f41lfkswXuKtYrhgoTpLnyIcHm24Uhqx+5Tqm2InSwLhE6Q=="],
25
+
26
+ "playwright": ["playwright@1.59.1", "", { "dependencies": { "playwright-core": "1.59.1" }, "optionalDependencies": { "fsevents": "2.3.2" }, "bin": { "playwright": "cli.js" } }, "sha512-C8oWjPR3F81yljW9o5OxcWzfh6avkVwDD2VYdwIGqTkl+OGFISgypqzfu7dOe4QNLL2aqcWBmI3PMtLIK233lw=="],
27
+
28
+ "playwright-core": ["playwright-core@1.59.1", "", { "bin": { "playwright-core": "cli.js" } }, "sha512-HBV/RJg81z5BiiZ9yPzIiClYV/QMsDCKUyogwH9p3MCP6IYjUFu/MActgYAvK0oWyV9NlwM3GLBjADyWgydVyg=="],
29
+
30
+ "shebang-command": ["shebang-command@2.0.0", "", { "dependencies": { "shebang-regex": "^3.0.0" } }, "sha512-kHxr2zZpYtdmrN1qDjrrX/Z1rR1kG8Dx+gkpK1G4eXmvXswmcE1hTWBWYUzlraYw1/yZp6YuDY77YtvbN0dmDA=="],
31
+
32
+ "shebang-regex": ["shebang-regex@3.0.0", "", {}, "sha512-7++dFhtcx3353uBaq8DDR4NuxBetBzC7ZQOhmTQInHEd6bSrXdiEyzCvG07Z44UYdLShWUyXt5M/yhz8ekcb1A=="],
33
+
34
+ "which": ["which@2.0.2", "", { "dependencies": { "isexe": "^2.0.0" }, "bin": { "node-which": "./bin/node-which" } }, "sha512-BLI3Tl1TW3Pvl70l3yq3Y64i+awpwXqsGBYWkkqMtnbXgrMD+yj7rhW0kuEDxzJaYXGjEW5ogapKNMEKNMjibA=="],
35
+
36
+ "zod": ["zod@4.1.8", "", {}, "sha512-5R1P+WwQqmmMIEACyzSvo4JXHY5WiAFHRMg+zBZKgKS+Q1viRa0C1hmUKtHltoIFKtIdki3pRxkmpP74jnNYHQ=="],
37
+ }
38
+ }
package/index.ts ADDED
@@ -0,0 +1,509 @@
1
+ import { tool, type PluginInput, type PluginOptions } from "@opencode-ai/plugin"
2
+ import { chromium, type Page, type BrowserContext } from "playwright"
3
+ import path from "path"
4
+ import fs from "fs"
5
+ import os from "os"
6
+
7
+ const PROFILE_DIR = path.join(os.homedir(), ".opencode", "browser-profile")
8
+
9
+ interface BrowserState {
10
+ context: BrowserContext | null
11
+ pages: Map<string, Page>
12
+ currentPageId: string | null
13
+ pageCounter: number
14
+ refs: Map<string, Map<string, { role: string; name?: string; nth?: number }>>
15
+ headless: boolean
16
+ lastActivityTime: number
17
+ }
18
+
19
+ const state: BrowserState = {
20
+ context: null,
21
+ pages: new Map(),
22
+ currentPageId: null,
23
+ pageCounter: 0,
24
+ refs: new Map(),
25
+ headless: true,
26
+ lastActivityTime: 0,
27
+ }
28
+
29
+ const IDLE_TIMEOUT = 30 * 60 * 1000
30
+ let idleCheckInterval: Timer | null = null
31
+
32
+ function touchActivity() {
33
+ state.lastActivityTime = Date.now()
34
+ }
35
+
36
+ function startIdleWatchdog() {
37
+ if (idleCheckInterval) clearInterval(idleCheckInterval)
38
+
39
+ idleCheckInterval = setInterval(async () => {
40
+ if (!state.context) return
41
+
42
+ const idle = Date.now() - state.lastActivityTime
43
+ if (idle >= IDLE_TIMEOUT) {
44
+ await stopBrowser()
45
+ }
46
+ }, 60 * 1000)
47
+ }
48
+
49
+ function stopIdleWatchdog() {
50
+ if (idleCheckInterval) {
51
+ clearInterval(idleCheckInterval)
52
+ idleCheckInterval = null
53
+ }
54
+ }
55
+
56
+ interface BrowserResult {
57
+ success: boolean
58
+ error?: string
59
+ }
60
+
61
+ async function ensureBrowser(headless: boolean = true): Promise<BrowserResult> {
62
+ touchActivity()
63
+
64
+ if (state.context) {
65
+ if (!headless && state.headless) {
66
+ await stopBrowser()
67
+ } else {
68
+ startIdleWatchdog()
69
+ return { success: true }
70
+ }
71
+ }
72
+
73
+ try {
74
+ state.headless = headless
75
+
76
+ if (!fs.existsSync(PROFILE_DIR)) {
77
+ fs.mkdirSync(PROFILE_DIR, { recursive: true })
78
+ }
79
+
80
+ state.context = await chromium.launchPersistentContext(PROFILE_DIR, {
81
+ headless,
82
+ viewport: { width: 1280, height: 720 },
83
+ })
84
+
85
+ state.context.on("page", (page) => {
86
+ const pageId = nextPageId()
87
+ state.pages.set(pageId, page)
88
+ state.currentPageId = pageId
89
+ state.refs.set(pageId, new Map())
90
+ })
91
+
92
+ startIdleWatchdog()
93
+ return { success: true }
94
+ } catch (error) {
95
+ const errorMessage = error instanceof Error ? error.message : String(error)
96
+ return { success: false, error: errorMessage }
97
+ }
98
+ }
99
+
100
+ async function stopBrowser(): Promise<void> {
101
+ stopIdleWatchdog()
102
+
103
+ if (state.context) {
104
+ await state.context.close()
105
+ }
106
+
107
+ state.context = null
108
+ state.pages.clear()
109
+ state.currentPageId = null
110
+ state.pageCounter = 0
111
+ state.refs.clear()
112
+ state.headless = true
113
+ }
114
+
115
+ function nextPageId(): string {
116
+ state.pageCounter++
117
+ return `page_${state.pageCounter}`
118
+ }
119
+
120
+ function getPage(pageId: string): Page | undefined {
121
+ return state.pages.get(pageId)
122
+ }
123
+
124
+ function getRefs(pageId: string): Map<string, { role: string; name?: string; nth?: number }> {
125
+ if (!state.refs.has(pageId)) {
126
+ state.refs.set(pageId, new Map())
127
+ }
128
+ return state.refs.get(pageId)!
129
+ }
130
+
131
+ async function getLocatorByRef(page: Page, pageId: string, ref: string) {
132
+ const refs = getRefs(pageId)
133
+ const info = refs.get(ref)
134
+
135
+ if (!info) return null
136
+
137
+ const locator = page.getByRole(info.role as any, { name: info.name })
138
+ return info.nth && info.nth > 0 ? locator.nth(info.nth) : locator
139
+ }
140
+
141
+ async function buildSnapshot(page: Page): Promise<{ snapshot: string; refs: string[] }> {
142
+ const snapshot = await page.locator("body").ariaSnapshot()
143
+ const refs = new Map<string, { role: string; name?: string; nth?: number }>()
144
+
145
+ const lines = snapshot.split("\n")
146
+ const counter = { value: 0 }
147
+ const roleCounts = new Map<string, number>()
148
+
149
+ const processedLines = lines.map((line) => {
150
+ const match = line.match(/^(\s*)-\s*(\w+)(?:\s+"([^"]*)")?/)
151
+ if (!match) return line
152
+
153
+ const [, indent, role, name] = match
154
+ const roleKey = `${role}:${name || ""}`
155
+ const count = roleCounts.get(roleKey) || 0
156
+ roleCounts.set(roleKey, count + 1)
157
+
158
+ const interactiveRoles = [
159
+ "button", "link", "textbox", "checkbox", "radio", "combobox",
160
+ "listbox", "menuitem", "searchbox", "slider", "switch", "tab",
161
+ ]
162
+
163
+ if (interactiveRoles.includes(role.toLowerCase())) {
164
+ counter.value++
165
+ const ref = `e${counter.value}`
166
+ refs.set(ref, { role: role.toLowerCase(), name, nth: count > 0 ? count : undefined })
167
+
168
+ let processed = `${indent}- ${role}`
169
+ if (name) processed += ` "${name}"`
170
+ processed += ` [ref=${ref}]`
171
+ if (count > 0) processed += ` [nth=${count}]`
172
+ return processed
173
+ }
174
+
175
+ return line
176
+ })
177
+
178
+ const pageId = state.currentPageId || "default"
179
+ state.refs.set(pageId, refs)
180
+
181
+ return {
182
+ snapshot: processedLines.join("\n"),
183
+ refs: Array.from(refs.keys()),
184
+ }
185
+ }
186
+
187
+ function resolvePageId(args: { page_id?: string }) {
188
+ return args.page_id || state.currentPageId || "default"
189
+ }
190
+
191
+ function resolvePage(pageId: string): Page | string {
192
+ const page = getPage(pageId)
193
+ if (!page) return "Error: Page not found"
194
+ return page
195
+ }
196
+
197
+ export default {
198
+ id: "browser",
199
+ async server(input: PluginInput, options?: PluginOptions) {
200
+ return {
201
+ tool: {
202
+ browser: tool({
203
+ description: `Control a web browser using Playwright. Actions: start, stop, open, navigate, back, snapshot, screenshot, click, type, evaluate, wait, close.
204
+
205
+ Workflow:
206
+ 1. start (headed=true for visible window)
207
+ 2. open url
208
+ 3. snapshot to get element refs
209
+ 4. click/type using refs
210
+ 5. stop when done
211
+
212
+ Use ref from snapshot for stable element targeting. Multiple tabs supported via page_id.`,
213
+ args: {
214
+ action: tool.schema.string().describe("Action: start, stop, open, navigate, snapshot, screenshot, click, type, evaluate, wait, close, back"),
215
+ url: tool.schema.string().optional().describe("URL to open or navigate to"),
216
+ page_id: tool.schema.string().optional().describe("Page/tab identifier (defaults to the current tab)"),
217
+ ref: tool.schema.string().optional().describe("Element ref from snapshot (preferred over selector)"),
218
+ selector: tool.schema.string().optional().describe("CSS selector (use ref when possible)"),
219
+ text: tool.schema.string().optional().describe("Text to type"),
220
+ code: tool.schema.string().optional().describe("JavaScript code to evaluate"),
221
+ path: tool.schema.string().optional().describe("File path for screenshot or snapshot output"),
222
+ wait: tool.schema.number().optional().default(0).describe("Milliseconds to wait"),
223
+ full_page: tool.schema.boolean().optional().default(false).describe("Full page screenshot"),
224
+ headed: tool.schema.boolean().optional().default(false).describe("Show visible browser window"),
225
+ submit: tool.schema.boolean().optional().default(false).describe("Press Enter after typing"),
226
+ slowly: tool.schema.boolean().optional().default(false).describe("Type character by character"),
227
+ timeout: tool.schema.number().optional().default(30000).describe("Timeout in milliseconds"),
228
+ },
229
+ async execute(args, context) {
230
+ const action = args.action.toLowerCase().trim()
231
+
232
+ try {
233
+ touchActivity()
234
+
235
+ if (action === "start") {
236
+ const headless = !(args.headed ?? false)
237
+ const result = await ensureBrowser(headless)
238
+ if (!result.success) {
239
+ return `Failed to start browser: ${result.error}. Make sure Playwright is installed: bunx playwright install chromium`
240
+ }
241
+ return `Browser started (${args.headed ? "visible window" : "headless"})`
242
+ }
243
+
244
+ if (action === "stop") {
245
+ await stopBrowser()
246
+ return "Browser stopped"
247
+ }
248
+
249
+ if (action === "open") {
250
+ if (!args.url) return "Error: url required for open action"
251
+
252
+ const result = await ensureBrowser()
253
+ if (!result.success) {
254
+ return `Error: Failed to start browser: ${result.error}`
255
+ }
256
+
257
+ const pageId = args.page_id || nextPageId()
258
+ const page = await state.context!.newPage()
259
+
260
+ state.pages.set(pageId, page)
261
+ state.currentPageId = pageId
262
+ state.refs.set(pageId, new Map())
263
+
264
+ await page.goto(args.url, { timeout: args.timeout })
265
+
266
+ return `Opened ${args.url} (page_id: ${pageId})`
267
+ }
268
+
269
+ if (action === "navigate") {
270
+ if (!args.url) return "Error: url required for navigate action"
271
+
272
+ const page = resolvePage(resolvePageId(args))
273
+ if (typeof page === "string") return page
274
+
275
+ await page.goto(args.url, { timeout: args.timeout })
276
+ return `Navigated to ${args.url}`
277
+ }
278
+
279
+ if (action === "back") {
280
+ const page = resolvePage(resolvePageId(args))
281
+ if (typeof page === "string") return page
282
+
283
+ await page.goBack({ timeout: args.timeout })
284
+ return "Navigated back"
285
+ }
286
+
287
+ if (action === "snapshot") {
288
+ const page = resolvePage(resolvePageId(args))
289
+ if (typeof page === "string") return page
290
+
291
+ const result = await buildSnapshot(page)
292
+
293
+ let output = `Page: ${page.url()}\n\n`
294
+ output += `Interactive Elements:\n${result.snapshot}\n\n`
295
+ output += `Available refs: ${result.refs.join(", ")}`
296
+
297
+ if (args.path) {
298
+ const filePath = path.isAbsolute(args.path)
299
+ ? args.path
300
+ : path.join(context.directory, args.path)
301
+ await fs.promises.writeFile(filePath, output, "utf-8")
302
+ output += `\n\nSnapshot saved to: ${args.path}`
303
+ }
304
+
305
+ return output
306
+ }
307
+
308
+ if (action === "screenshot") {
309
+ const pageId = resolvePageId(args)
310
+ const page = resolvePage(pageId)
311
+ if (typeof page === "string") return page
312
+
313
+ const screenshotPath = args.path || `screenshot-${Date.now()}.png`
314
+ const filePath = path.isAbsolute(screenshotPath)
315
+ ? screenshotPath
316
+ : path.join(context.directory, screenshotPath)
317
+
318
+ if (args.ref) {
319
+ const locator = await getLocatorByRef(page, pageId, args.ref)
320
+ if (!locator) return `Error: Ref '${args.ref}' not found`
321
+ await locator.screenshot({ path: filePath })
322
+ } else {
323
+ await page.screenshot({ path: filePath, fullPage: args.full_page })
324
+ }
325
+
326
+ return `Screenshot saved to ${screenshotPath}`
327
+ }
328
+
329
+ if (action === "click") {
330
+ const pageId = resolvePageId(args)
331
+ const page = resolvePage(pageId)
332
+ if (typeof page === "string") return page
333
+
334
+ if (!args.ref && !args.selector) {
335
+ return "Error: ref or selector required for click"
336
+ }
337
+
338
+ const locator = args.ref
339
+ ? await getLocatorByRef(page, pageId, args.ref)
340
+ : page.locator(args.selector!).first()
341
+
342
+ if (!locator) return `Error: Ref '${args.ref}' not found`
343
+
344
+ await locator.click({ timeout: args.timeout })
345
+
346
+ if (args.wait && args.wait > 0) {
347
+ await page.waitForTimeout(args.wait)
348
+ }
349
+
350
+ return `Clicked ${args.ref || args.selector}`
351
+ }
352
+
353
+ if (action === "type") {
354
+ const pageId = resolvePageId(args)
355
+ const page = resolvePage(pageId)
356
+ if (typeof page === "string") return page
357
+
358
+ if (!args.ref && !args.selector) {
359
+ return "Error: ref or selector required for type"
360
+ }
361
+
362
+ if (!args.text) {
363
+ return "Error: text required for type action"
364
+ }
365
+
366
+ const locator = args.ref
367
+ ? await getLocatorByRef(page, pageId, args.ref)
368
+ : page.locator(args.selector!).first()
369
+
370
+ if (!locator) return `Error: Ref '${args.ref}' not found`
371
+
372
+ if (args.slowly) {
373
+ await locator.pressSequentially(args.text)
374
+ } else {
375
+ await locator.fill(args.text)
376
+ }
377
+
378
+ if (args.submit) {
379
+ await locator.press("Enter")
380
+ }
381
+
382
+ return `Typed into ${args.ref || args.selector}`
383
+ }
384
+
385
+ if (action === "evaluate" || action === "eval") {
386
+ const page = resolvePage(resolvePageId(args))
387
+ if (typeof page === "string") return page
388
+
389
+ if (!args.code) return "Error: code required for evaluate"
390
+
391
+ const result = await page.evaluate(args.code)
392
+ return `Result: ${JSON.stringify(result, null, 2)}`
393
+ }
394
+
395
+ if (action === "wait") {
396
+ const page = resolvePage(resolvePageId(args))
397
+ if (typeof page === "string") return page
398
+
399
+ if (args.wait && args.wait > 0) {
400
+ await page.waitForTimeout(args.wait)
401
+ return `Waited ${args.wait}ms`
402
+ }
403
+
404
+ await page.waitForLoadState("networkidle", { timeout: args.timeout })
405
+ return "Waited for network idle"
406
+ }
407
+
408
+ if (action === "close") {
409
+ const pageId = args.page_id || state.currentPageId
410
+ if (!pageId) return "Error: No page to close"
411
+
412
+ const page = getPage(pageId)
413
+ if (!page) return "Error: Page not found"
414
+
415
+ await page.close()
416
+ state.pages.delete(pageId)
417
+ state.refs.delete(pageId)
418
+
419
+ if (state.currentPageId === pageId) {
420
+ state.currentPageId = state.pages.keys().next().value || null
421
+ }
422
+
423
+ return `Closed page ${pageId}`
424
+ }
425
+
426
+ return `Error: Unknown action '${action}'. Available: start, stop, open, navigate, back, snapshot, screenshot, click, type, evaluate, wait, close`
427
+
428
+ } catch (error) {
429
+ const errorMessage = error instanceof Error ? error.message : String(error)
430
+ return `Error: ${errorMessage}`
431
+ }
432
+ },
433
+ }),
434
+
435
+ browser_start: tool({
436
+ description: "Start browser in visible mode (headed) for debugging or demos",
437
+ args: {
438
+ headed: tool.schema.boolean().default(true).describe("Show visible browser window"),
439
+ },
440
+ async execute(args) {
441
+ const headless = !args.headed
442
+ const result = await ensureBrowser(headless)
443
+ return result.success
444
+ ? `Browser started in ${args.headed ? "visible" : "headless"} mode`
445
+ : `Failed to start browser: ${result.error}`
446
+ },
447
+ }),
448
+
449
+ browser_snapshot: tool({
450
+ description: "Take a snapshot of the current page to get interactive element refs",
451
+ args: {
452
+ page_id: tool.schema.string().optional().describe("Page ID (optional)"),
453
+ },
454
+ async execute(args) {
455
+ const pageId = args.page_id || state.currentPageId || "default"
456
+ const page = getPage(pageId)
457
+ if (!page) return "Error: Page not found. Open a page first."
458
+
459
+ const result = await buildSnapshot(page)
460
+
461
+ return `Page: ${page.url()}\n\n${result.snapshot}\n\nRefs: ${result.refs.join(", ")}`
462
+ },
463
+ }),
464
+
465
+ browser_click: tool({
466
+ description: "Click an element using ref from snapshot",
467
+ args: {
468
+ ref: tool.schema.string().describe("Element ref from snapshot"),
469
+ page_id: tool.schema.string().optional().describe("Page ID (optional)"),
470
+ },
471
+ async execute(args) {
472
+ const pageId = args.page_id || state.currentPageId || "default"
473
+ const page = getPage(pageId)
474
+ if (!page) return "Error: Page not found"
475
+
476
+ const locator = await getLocatorByRef(page, pageId, args.ref)
477
+ if (!locator) return `Error: Ref '${args.ref}' not found`
478
+
479
+ await locator.click()
480
+ return `Clicked ${args.ref}`
481
+ },
482
+ }),
483
+
484
+ browser_type: tool({
485
+ description: "Type text into an element using ref from snapshot",
486
+ args: {
487
+ ref: tool.schema.string().describe("Element ref from snapshot"),
488
+ text: tool.schema.string().describe("Text to type"),
489
+ submit: tool.schema.boolean().optional().default(false).describe("Press Enter after typing"),
490
+ page_id: tool.schema.string().optional().describe("Page ID (optional)"),
491
+ },
492
+ async execute(args) {
493
+ const pageId = args.page_id || state.currentPageId || "default"
494
+ const page = getPage(pageId)
495
+ if (!page) return "Error: Page not found"
496
+
497
+ const locator = await getLocatorByRef(page, pageId, args.ref)
498
+ if (!locator) return `Error: Ref '${args.ref}' not found`
499
+
500
+ await locator.fill(args.text)
501
+ if (args.submit) await locator.press("Enter")
502
+
503
+ return `Typed into ${args.ref}`
504
+ },
505
+ }),
506
+ },
507
+ }
508
+ },
509
+ }
package/package.json ADDED
@@ -0,0 +1,26 @@
1
+ {
2
+ "name": "opencode-browser-plugin",
3
+ "version": "1.0.0",
4
+ "description": "Browser automation plugin for OpenCode, powered by Playwright",
5
+ "type": "module",
6
+ "exports": {
7
+ "./server": "./index.ts"
8
+ },
9
+ "keywords": [
10
+ "opencode",
11
+ "opencode-plugin",
12
+ "browser",
13
+ "playwright",
14
+ "automation"
15
+ ],
16
+ "author": "heimoshuiyu",
17
+ "license": "MIT",
18
+ "repository": {
19
+ "type": "git",
20
+ "url": "https://github.com/heimoshuiyu/opencode-browser-plugin.git"
21
+ },
22
+ "dependencies": {
23
+ "@opencode-ai/plugin": "*",
24
+ "playwright": "^1.52.0"
25
+ }
26
+ }