@sleepinsummer/agent-browser-cli 0.2.0
- package/AI_INSTALL.md +126 -0
- package/LICENSE +21 -0
- package/README.md +194 -0
- package/README_EN.md +169 -0
- package/assets/tmwd_cdp_bridge/background.js +436 -0
- package/assets/tmwd_cdp_bridge/config.js +1 -0
- package/assets/tmwd_cdp_bridge/content.js +79 -0
- package/assets/tmwd_cdp_bridge/disable_dialogs.js +24 -0
- package/assets/tmwd_cdp_bridge/manifest.json +40 -0
- package/assets/tmwd_cdp_bridge/popup.html +19 -0
- package/assets/tmwd_cdp_bridge/popup.js +24 -0
- package/npm/bin/agent-browser-cli.js +35 -0
- package/npm/platform/darwin-arm64/bin/agent-browser-cli +0 -0
- package/npm/platform/darwin-arm64/package.json +14 -0
- package/npm/postinstall.js +9 -0
- package/package.json +30 -0
- package/skills/agent-browser-cli/SKILL.md +315 -0
package/AI_INSTALL.md
ADDED
@@ -0,0 +1,126 @@
# AI Installation Guide

Send the prompt below to your AI assistant so that it can install the CLI, configure the skill, and verify the setup in your local environment.

```text
Please help me install agent-browser-cli: https://github.com/sleepinginsummer/agent-browser-cli

Requirements:
1. Prefer installing with npm: npm install -g @sleepinsummer/agent-browser-cli.
2. Guide me through loading the unpacked extension from assets/tmwd_cdp_bridge in Chrome.
3. If the extension was loaded before, `assets/tmwd_cdp_bridge` must be reloaded in chrome://extensions so that the latest `config.js` and `background.js` take effect.
4. Install skills/agent-browser-cli/SKILL.md into a skills directory that the current AI recognizes.
5. Run agent-browser-cli tabs, open, and status to verify that everything works.
6. If the npm platform package does not yet support the current system, fall back to building from source: cargo build --release.
```

## 1. Install the CLI

Prefer a global npm install:

```bash
npm install -g @sleepinsummer/agent-browser-cli
agent-browser-cli --help
```

The npm package currently installs a native binary per platform:

```text
@sleepinsummer/agent-browser-cli
@sleepinsummer/agent-browser-cli-darwin-arm64
@sleepinsummer/agent-browser-cli-darwin-x64
@sleepinsummer/agent-browser-cli-win32-x64
```
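
To check ahead of time whether one of the platform packages listed above matches the current machine, a lookup along the following lines may help. This is a minimal sketch, not the logic the npm wrapper actually uses: the function name `platform_package` and the alias tables are assumptions made for illustration, and only the package names come from the list above.

```python
import platform
from typing import Optional

# Package names copied from the list above; the (os, arch) keys are an assumption
# derived from the package name suffixes.
PLATFORM_PACKAGES = {
    ("darwin", "arm64"): "@sleepinsummer/agent-browser-cli-darwin-arm64",
    ("darwin", "x64"): "@sleepinsummer/agent-browser-cli-darwin-x64",
    ("win32", "x64"): "@sleepinsummer/agent-browser-cli-win32-x64",
}

# Map Python's platform names onto the npm-style names used in the suffixes.
_SYSTEM_ALIASES = {"darwin": "darwin", "windows": "win32"}
_ARCH_ALIASES = {"arm64": "arm64", "aarch64": "arm64", "x86_64": "x64", "amd64": "x64"}


def platform_package(system: Optional[str] = None, machine: Optional[str] = None) -> Optional[str]:
    """Return the matching platform package name, or None if no package is listed."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    key = (_SYSTEM_ALIASES.get(system), _ARCH_ALIASES.get(machine))
    return PLATFORM_PACKAGES.get(key)


if __name__ == "__main__":
    name = platform_package()
    print(name or "no prebuilt package listed; build from source with cargo")
```

A `None` result corresponds to the fallback case described in this guide, where the CLI is built from source.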

If the package for your platform has not been published yet, or installation fails, build from source:

```bash
git clone https://github.com/sleepinginsummer/agent-browser-cli.git
cd agent-browser-cli
cargo build --release
./target/release/agent-browser-cli --help
```

## 2. Load the Chrome extension

If you installed via npm, first download or clone the repository; it is needed to load the extension and install the skill:

```bash
git clone https://github.com/sleepinginsummer/agent-browser-cli.git
cd agent-browser-cli
```

Open in Chrome:

```text
chrome://extensions
```

Enable "Developer mode" and load the unpacked extension directory:

```text
assets/tmwd_cdp_bridge
```

If you previously installed the `tmwd_cdp_bridge` extension from an older GenericAgent, you can keep using that older extension as long as it speaks the same protocol. Loading `assets/tmwd_cdp_bridge` from this repository and clicking "Reload" is still recommended.

The current extension configuration should contain:

```js
const TID = '__agent_browser_cli_bridge_26c9f1';
```

Chrome needs at least one normal web page tab open; do not leave it only on `about:blank` or `chrome://` pages.

## 3. Install the skill

Install `skills/agent-browser-cli/SKILL.md` from the repository into the skills directory your AI uses.

Generic directory example:

```bash
mkdir -p ~/.agents/skills/agent-browser-cli
cp skills/agent-browser-cli/SKILL.md ~/.agents/skills/agent-browser-cli/SKILL.md
```

Default Codex directory example:

```bash
mkdir -p ~/.codex/skills/agent-browser-cli
cp skills/agent-browser-cli/SKILL.md ~/.codex/skills/agent-browser-cli/SKILL.md
```

If your AI uses a different skills directory, copy `SKILL.md` to `agent-browser-cli/SKILL.md` under that directory.
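
The copy commands above rely on `mkdir -p` and `cp`, which are not available in every Windows shell. The sketch below does the same copy in a portable way. It is an illustration only: the helper name `install_skill` and its parameters are hypothetical, while the source and target paths follow the layout described in this section.

```python
import shutil
from pathlib import Path


def install_skill(repo_root: str, skills_dir: str) -> Path:
    """Copy SKILL.md from a repository checkout into <skills_dir>/agent-browser-cli/."""
    source = Path(repo_root) / "skills" / "agent-browser-cli" / "SKILL.md"
    target_dir = Path(skills_dir).expanduser() / "agent-browser-cli"
    # Equivalent of `mkdir -p`: create missing parents, tolerate an existing directory.
    target_dir.mkdir(parents=True, exist_ok=True)
    # Equivalent of `cp`: overwrite any previously installed copy.
    return Path(shutil.copy2(source, target_dir / "SKILL.md"))


if __name__ == "__main__":
    # Assumes the script runs from the repository root.
    print(install_skill(".", "~/.agents/skills"))
```

Passing `~/.codex/skills` as the second argument targets the default Codex directory instead.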

## 4. Verify

```bash
agent-browser-cli tabs
agent-browser-cli open https://www.baidu.com
agent-browser-cli status
```

On success, `tabs` returns `ok: true` together with the number of open Chrome tabs.
`open` should open a new tab natively; do not substitute `exec --monitor` or `window.open` for it.

If the long-lived service needs to reload the latest code:

```bash
agent-browser-cli restart
```

## 5. Getting started

Once you have a tab ID, you can run:

```bash
agent-browser-cli scan --tab <tabId> --text-only
agent-browser-cli exec --tab <tabId> 'return document.title'
```

For the full command list and the browser operation SOP, see:

```text
skills/agent-browser-cli/SKILL.md
```
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 sleepinginsummer

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,194 @@
<div align="center">

# agent-browser-cli

A browser perception and control CLI for agents that turns a real Chrome session into reusable tab scanning, page JavaScript, Cookie, CDP, and screenshot capabilities.

Browser perception · Page control · Chrome login-state reuse · CDP · Conditional wait · Agent Skill integration

<p>
<a href="https://github.com/sleepinginsummer/agent-browser-cli"><img src="https://img.shields.io/badge/CLI-agentbrowsercli-2ea44f" alt="CLI agentbrowsercli"></a>
<a href="https://github.com/sleepinginsummer/agent-browser-cli/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-MIT-green" alt="License MIT"></a>
<a href="https://github.com/sleepinginsummer/agent-browser-cli"><img src="https://img.shields.io/badge/Windows-MacOS-0078D6?labelColor=0078D6&color=C0C0C0" alt="Windows/MacOS"></a>
<a href="https://github.com/sleepinginsummer/agent-browser-cli/releases"><img src="https://img.shields.io/badge/release-v0.2.0-blue" alt="release v0.2.0"></a>
<a href="https://github.com/sleepinginsummer/agent-browser-cli/pulls"><img src="https://img.shields.io/badge/PRs-welcome-brightgreen" alt="PRs welcome"></a>
</p>

[AI One-Line Install](#ai-one-line-install) · [Manual Installation](#manual-installation) · [Chrome Extension](#chrome-extension) · [Update](#update) · [Uninstall](#uninstall) · [Friendly Links](#friendly-links)

中文 | [English](README_EN.md)

</div>

`agent-browser-cli` is a browser perception and control tool for agents. It connects to the user's real browser through a Chrome extension, preserving login state and cookies, and provides tab scanning, page JavaScript execution, cookie reading, CDP control, screenshots, file uploads, dropdown clicks, and related capabilities.

This project is not Selenium or Playwright. It is better suited to helping agents read pages accurately and perform actions inside an existing browser session.

## Project Info

- Current version: `0.2.0`
- Supported platforms: Windows, macOS
- Runtime: native Rust binary; an npm install selects the binary package for the current platform
- Browser: Chrome / Chromium, with `assets/tmwd_cdp_bridge` loaded

## Acknowledgements

The browser control capability in this project was extracted and adapted from the Web toolchain in [GenericAgent](https://github.com/lsdefine/GenericAgent), including ideas and implementation around `TMWebDriver`, `simphtml`, and the `tmwd_cdp_bridge` extension.

Thanks to the GenericAgent project for the browser bridge, page simplification, CDP control, and practical SOPs. This repository reorganizes and enhances that work for standalone usage and CLI invocation.

## AI One-Line Install

```text
Please read https://github.com/sleepinginsummer/agent-browser-cli/blob/main/AI_INSTALL.md, follow the instructions to install the CLI, load the Chrome extension, and add `skills/agent-browser-cli/SKILL.md`.
```

## Improvements

- Extracts the browser control capability from GenericAgent and exposes it as a CLI for Codex, Claude Code, and OpenCode. The GenericAgent browser extension does not need to be reinstalled; both tools can share the same extension.
- Avoids reinitializing the browser connection on every command.
- Adds a startup lock so that multiple CLI processes starting concurrently do not bind the underlying ports more than once.
- Adds the skill `skills/agent-browser-cli/SKILL.md` as a usage reference for AI agents.
- Includes several optimizations that shorten command execution time.
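
The startup lock mentioned in the list above can be pictured with an exclusive-create lock file. The sketch below is an assumption about the general technique, not the project's actual Rust implementation: the helper `startup_lock` is hypothetical, and the file name `.agent-browser-cli.lock` is borrowed from the cleanup step in the uninstall section without knowing how the CLI really uses it.

```python
import os
from contextlib import contextmanager

# File name taken from the uninstall section; its role here is assumed.
LOCK_PATH = ".agent-browser-cli.lock"


@contextmanager
def startup_lock(path: str = LOCK_PATH):
    """Yield True if this process created the lock file, False if it already existed."""
    try:
        # O_CREAT | O_EXCL makes creation atomic: exactly one process can succeed.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False  # another process is already starting the service
        return
    try:
        os.write(fd, str(os.getpid()).encode())
        yield True
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    with startup_lock() as acquired:
        if acquired:
            print("lock held: safe to bind the service ports")
        else:
            print("another process is starting the service: wait and reuse it")
```

A process that receives `False` would skip the port binding and connect to the service started by the lock holder. A production version would also need to handle stale lock files left behind by a crashed process, which this sketch does not attempt.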

## Layout

```text
.
├── agent_browser_cli.py        # CLI entry
├── agent_browser_server.py     # Long-lived HTTP service
├── ga.py                       # web_scan / web_execute_js entry
├── TMWebDriver.py              # Browser extension WebSocket / HTTP bridge
├── simphtml.py                 # Page simplification and DOM diff
├── assets/tmwd_cdp_bridge/     # Chrome MV3 extension
├── memory/                     # Browser tool SOPs
└── skills/agent-browser-cli/   # skill
```

## Manual Installation

### npm install

```bash
npm install -g @sleepinsummer/agent-browser-cli
agent-browser-cli tabs
```

npm distribution currently uses a main package plus per-platform binary packages:

```text
@sleepinsummer/agent-browser-cli
@sleepinsummer/agent-browser-cli-darwin-arm64
@sleepinsummer/agent-browser-cli-darwin-x64
@sleepinsummer/agent-browser-cli-win32-x64
```

### Build from source locally

```bash
cargo build --release
./target/release/agent-browser-cli tabs
```

### Legacy Python runtime

The Python implementation is kept for now as a migration reference and fallback entry point:

```bash
cd /path/to/agent-browser-cli
python3 -m venv .venv
.venv/bin/python -m pip install -r requirements.txt
```

## Chrome Extension

Load this extension directory:

```text
assets/tmwd_cdp_bridge
```

Chrome needs at least one normal web page tab open; do not leave it only on `about:blank` or `chrome://` pages.

## Quick Check

```bash
agent-browser-cli tabs
agent-browser-cli open https://www.baidu.com
```

On success, it returns:

```json
{
  "ok": true,
  "result": {
    "status": "success",
    "metadata": {
      "tabs_count": 1
    }
  }
}
```
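
A script that drives the CLI can check the response shown above before continuing. The following is a minimal sketch that assumes the JSON arrives as a single string, for example captured from the command's standard output. The function name `tabs_count` is hypothetical, and only the fields visible in the sample response are inspected.

```python
import json

# Sample copied from the success response shown above.
SAMPLE = '{"ok": true, "result": {"status": "success", "metadata": {"tabs_count": 1}}}'


def tabs_count(raw: str) -> int:
    """Return the tab count from a `tabs` response, raising if it does not signal success."""
    payload = json.loads(raw)
    if payload.get("ok") is not True:
        raise RuntimeError("response did not report ok: true")
    result = payload.get("result") or {}
    if result.get("status") != "success":
        raise RuntimeError(f"unexpected status: {result.get('status')!r}")
    return int(result["metadata"]["tabs_count"])


if __name__ == "__main__":
    print(tabs_count(SAMPLE))  # prints 1
```

Raising on anything other than `ok: true` keeps a failed bridge connection from being mistaken for an empty browser.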

## Common Commands

The README only keeps the quick entry point. For the full command list and the browser operation SOP, see [skills/agent-browser-cli/SKILL.md](./skills/agent-browser-cli/SKILL.md).

```bash
agent-browser-cli tabs
```

## Update

```bash
git pull
cargo build --release
./target/release/agent-browser-cli restart
```

If the Chrome extension has updates, reload the `assets/tmwd_cdp_bridge` extension in `chrome://extensions`.

Current extension bridge identifier:

```js
const TID = '__agent_browser_cli_bridge_26c9f1';
```

If you installed the skill into a global Codex/Agent directory, copy it again after updating:

```bash
mkdir -p ~/.agents/skills/agent-browser-cli
cp skills/agent-browser-cli/SKILL.md ~/.agents/skills/agent-browser-cli/SKILL.md
```

## Uninstall

Stop the long-lived service first:

```bash
agent-browser-cli stop
```

Then clean up as needed:

```bash
rm -f .agent-browser-cli.log .agent-browser-cli.lock
rm -rf ~/.agents/skills/agent-browser-cli
```

Finally, remove the `TMWD CDP Bridge` extension from Chrome's extension management page, or remove the loaded `assets/tmwd_cdp_bridge` extension configuration.

## Ports

- `18765`: underlying `TMWebDriver` WebSocket, used by the Chrome extension to connect.
- `18766`: underlying `TMWebDriver` HTTP `/link`, used by the internal master/remote protocol.
- `18767`: outer `agent-browser-cli` HTTP service, used by the CLI to reuse the session.
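
When a command hangs or the extension does not connect, it can help to see which of the three ports listed above currently accept TCP connections. The sketch below assumes the services listen on `127.0.0.1`, which the README does not state. The helper names are hypothetical, and the check only tests whether a connection is accepted, not whether the expected protocol answers.

```python
import socket

# Port numbers and roles copied from the list above.
PORTS = {
    18765: "TMWebDriver WebSocket",
    18766: "TMWebDriver HTTP /link",
    18767: "agent-browser-cli HTTP service",
}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port is accepted within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def port_report(host: str = "127.0.0.1") -> dict:
    """Map each known port to whether something is listening on it."""
    return {port: is_listening(port, host) for port in PORTS}


if __name__ == "__main__":
    for port, up in port_report().items():
        print(f"{port} ({PORTS[port]}): {'listening' if up else 'closed'}")
```

If all three ports report closed, the long-lived service is probably not running yet; running any CLI command such as `agent-browser-cli tabs` is the documented way to exercise it.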

## Friendly Links

- [LINUX DO - A New Ideal Community](https://linux.do/)

## License

MIT License. See [LICENSE](./LICENSE).
package/README_EN.md
ADDED
@@ -0,0 +1,169 @@
<div align="center">

# agent-browser-cli

A browser perception and control CLI for agents, turning a real Chrome session into reusable tab scanning, page JavaScript, Cookie, CDP, and screenshot capabilities.

Browser perception · Page control · Chrome session reuse · CDP · Conditional wait · Agent Skill integration

<p>
<a href="https://github.com/sleepinginsummer/agent-browser-cli"><img src="https://img.shields.io/badge/CLI-agentbrowsercli-2ea44f" alt="CLI agentbrowsercli"></a>
<a href="https://github.com/sleepinginsummer/agent-browser-cli/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-MIT-green" alt="License MIT"></a>
<a href="https://www.python.org/"><img src="https://img.shields.io/badge/Python-%3E%3D3.10-3776AB?logo=python&logoColor=white" alt="Python >=3.10"></a>
<a href="https://github.com/sleepinginsummer/agent-browser-cli"><img src="https://img.shields.io/badge/Windows-MacOS-0078D6?labelColor=0078D6&color=C0C0C0" alt="Windows/MacOS"></a>
<a href="https://github.com/sleepinginsummer/agent-browser-cli/releases"><img src="https://img.shields.io/badge/release-v0.1.1-blue" alt="release v0.1.1"></a>
<a href="https://github.com/sleepinginsummer/agent-browser-cli/pulls"><img src="https://img.shields.io/badge/PRs-welcome-brightgreen" alt="PRs welcome"></a>
</p>

[AI One-Line Install](#ai-one-line-install) · [Manual Installation](#manual-installation) · [Chrome Extension](#chrome-extension) · [Update](#update) · [Uninstall](#uninstall) · [Friendly Links](#friendly-links)

[中文](README.md) | English

</div>

`agent-browser-cli` is a browser perception and control tool for agents. It connects to the user's real Chrome browser through a Chrome extension, preserving login state and cookies while providing tab scanning, page JavaScript execution, cookie reading, CDP control, screenshots, file uploads, dropdown clicks, and related capabilities.

This project is not Selenium or Playwright. It is better suited for helping agents read pages accurately and perform actions inside an existing browser session.

## Project Info

- Current version: `0.1.1`
- Supported platforms: Windows, macOS
- Python: `3.10+` recommended
- Browser: Chrome / Chromium, with `assets/tmwd_cdp_bridge` loaded

## Acknowledgements

The browser control capability in this project was extracted and adapted from the Web toolchain in [GenericAgent](https://github.com/lsdefine/GenericAgent), including ideas and implementation around `TMWebDriver`, `simphtml`, and the `tmwd_cdp_bridge` extension.

Thanks to the GenericAgent project for the browser bridge, page simplification, CDP control, and practical SOPs. This repository reorganizes and enhances that work for standalone usage and CLI invocation.

## AI One-Line Install

```text
Please read https://github.com/sleepinginsummer/agent-browser-cli/blob/main/AI_INSTALL.md, follow the instructions to install the CLI, load the Chrome extension, and add `skills/agent-browser-cli/SKILL.md`.
```

## Improvements

- Extracted browser control capability from GenericAgent and exposed it as a CLI for Codex, Claude Code, and OpenCode. The GenericAgent browser extension can be reused and does not need to be reinstalled.
- Avoids reinitializing the browser connection for every command.
- Adds a startup lock to avoid repeated low-level port binding when multiple CLI commands start concurrently.
- Adds the skill `skills/agent-browser-cli/SKILL.md` for AI usage reference.
- Includes several optimizations to reduce command execution time.

## Layout

```text
.
├── agent_browser_cli.py        # CLI entry
├── agent_browser_server.py     # Long-lived HTTP service
├── ga.py                       # web_scan / web_execute_js entry
├── TMWebDriver.py              # Browser extension WebSocket / HTTP bridge
├── simphtml.py                 # Page simplification and DOM diff
├── assets/tmwd_cdp_bridge/     # Chrome MV3 extension
├── memory/                     # Browser tool SOPs
└── skills/agent-browser-cli/   # skill
```

## Manual Installation

```bash
cd /path/to/agent-browser-cli
python3 -m venv .venv
.venv/bin/python -m pip install -r requirements.txt
```

## Chrome Extension

Load this extension directory:

```text
assets/tmwd_cdp_bridge
```

Chrome needs at least one normal web page tab open. Do not leave it only on `about:blank` or `chrome://` pages.

## Quick Check

```bash
.venv/bin/python agent_browser_cli.py tabs
.venv/bin/python agent_browser_cli.py open https://www.baidu.com
```

On success, it returns:

```json
{
  "ok": true,
  "result": {
    "status": "success",
    "metadata": {
      "tabs_count": 1
    }
  }
}
```

## Common Commands

The README only keeps the quick entry point. For the full command list and browser operation SOP, see [skills/agent-browser-cli/SKILL.md](./skills/agent-browser-cli/SKILL.md).

```bash
.venv/bin/python agent_browser_cli.py tabs
```

## Update

```bash
git pull
.venv/bin/python -m pip install -r requirements.txt
.venv/bin/python agent_browser_cli.py restart
```

If the Chrome extension has updates, reload the `assets/tmwd_cdp_bridge` extension in `chrome://extensions`.

Current extension bridge identifier:

```js
const TID = '__agent_browser_cli_bridge_26c9f1';
```

If you installed the skill into a global Codex/Agent directory, copy it again after updating:

```bash
mkdir -p ~/.agents/skills/agent-browser-cli
cp skills/agent-browser-cli/SKILL.md ~/.agents/skills/agent-browser-cli/SKILL.md
```

## Uninstall

Stop the long-lived service first:

```bash
.venv/bin/python agent_browser_cli.py stop
```

Then clean up as needed:

```bash
rm -rf .venv
rm -f .agent-browser-cli.log .agent-browser-cli.lock
rm -rf ~/.agents/skills/agent-browser-cli
```

Finally, remove the `TMWD CDP Bridge` extension from Chrome's extension management page, or remove the loaded `assets/tmwd_cdp_bridge` extension configuration.

## Ports

- `18765`: underlying `TMWebDriver` WebSocket, used by the Chrome extension.
- `18766`: underlying `TMWebDriver` HTTP `/link`, used by the internal master/remote protocol.
- `18767`: outer `agent-browser-cli` HTTP service, used by the CLI to reuse the session.

## Friendly Links

- [LINUX DO - A New Ideal Community](https://linux.do/)

## License

MIT License. See [LICENSE](./LICENSE).