@sleepinsummer/agent-browser-cli 0.2.0 → 0.2.3

package/README.md CHANGED
@@ -9,8 +9,8 @@
  <p>
  <a href="https://github.com/sleepinginsummer/agent-browser-cli"><img src="https://img.shields.io/badge/CLI-agentbrowsercli-2ea44f" alt="CLI agentbrowsercli"></a>
  <a href="https://github.com/sleepinginsummer/agent-browser-cli/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-MIT-green" alt="License MIT"></a>
- <a href="https://github.com/sleepinginsummer/agent-browser-cli"><img src="https://img.shields.io/badge/Windows-MacOS-0078D6?labelColor=0078D6&color=C0C0C0" alt="Windows/MacOS"></a>
- <a href="https://github.com/sleepinginsummer/agent-browser-cli/releases"><img src="https://img.shields.io/badge/release-v0.2.0-blue" alt="release v0.2.0"></a>
+ <a href="https://github.com/sleepinginsummer/agent-browser-cli"><img src="https://img.shields.io/badge/Windows-MacOS-0078D6?labelColor=0078D6&color=C0C0C0" alt="Windows/MacOS"></a>
+ <a href="https://github.com/sleepinginsummer/agent-browser-cli/releases"><img src="https://img.shields.io/badge/release-v0.2.3-blue" alt="release v0.2.3"></a>
  <a href="https://github.com/sleepinginsummer/agent-browser-cli/pulls"><img src="https://img.shields.io/badge/PRs-welcome-brightgreen" alt="PRs welcome"></a>
  </p>
 
@@ -26,9 +26,8 @@
 
  ## Project Info
 
- - Current version: `0.2.0`
- - Supported platforms: Windows, macOS
- - Runtime: native Rust binary; the npm install selects the binary package for the current platform
+ - Current version: `0.2.3`
+ - Supported platforms: sys win/mac
  - Browser: Chrome / Chromium, with `assets/tmwd_cdp_bridge` loaded
 
  ## Acknowledgements
@@ -50,19 +49,46 @@
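  - Adds a startup lock to avoid repeated low-level port binding when multiple CLI commands start concurrently.
  - Adds the skill `skills/agent-browser-cli/SKILL.md` for AI usage reference.
  - Includes several optimizations to reduce command execution time.
+ - Rust implementation for the CLI side.
+
+ ## Performance Reference
+
+ The following numbers were measured with the long-lived service already running and the Chrome extension already connected. Actual latency depends on page complexity, network conditions, Chrome state, and response size.
+
+ | Operation | Reference Latency |
+ | --- | --- |
+ | Open a Baidu tab | About `0.10s` |
+ | Inject JS to enter a keyword and click search | About `0.27s` |
+ | Open Baidu and search “小猫”, total | About `0.37s` |
+ | `scan --tab --text-only` to read page text | About `0.04-0.12s` |
+ | `exec 'return document.title'` to inject simple JS | About `0.04-0.12s` |
+ | `exec 'return document.body.innerText'` to read body text | Mostly `0.04-0.05s`, occasionally about `0.30s` |
+ | Query the DOM link list | About `0.27-0.36s` |
+ | `exec --monitor` page-change summary | About `0.72-0.88s` |
+
+ Rule of thumb: ordinary page reads and simple JS injection are at the `50ms` level; complex DOM queries depend mainly on page structure and returned data size, commonly around `300ms`; `--monitor` additionally generates a page-change summary and is usually close to `0.8s`.
+
+ Reference comparison with the original Python call chain:
+
+ | Item | Python Version | Rust CLI Version |
+ | --- | --- | --- |
+ | Startup model | Each call is more likely to pay for Python process startup, module loading, and connection initialization | CLI commands reuse the long-lived service and avoid repeated browser connection initialization |
+ | Simple page read / JS injection | Usually affected by process startup and the Python call chain, so latency is less stable | Commonly `0.04-0.12s` |
+ | Repeated calls | Overhead is more visible across many short commands | Better suited for high-frequency Agent calls |
+
+ This comparison only illustrates the performance trend caused by the architecture difference; actual latency still depends on page complexity, Chrome state, and response size.
 
  ## Directory Structure
 
  ```text
  .
- ├── agent_browser_cli.py # CLI entry
- ├── agent_browser_server.py # Long-lived HTTP service
- ├── ga.py # web_scan / web_execute_js entry
- ├── TMWebDriver.py # Browser extension WebSocket / HTTP bridge
- ├── simphtml.py # Page simplification and DOM diff
+ ├── Cargo.toml # Rust project config
+ ├── src/ # Rust CLI / long-lived service / bridge
  ├── assets/tmwd_cdp_bridge/ # Chrome MV3 extension
- ├── memory/ # Browser tool SOPs
- └── skills/agent-browser-cli/ # skill
+ ├── assets/simphtml_opt.js # Page simplification script
+ ├── assets/simphtml_find_list.js # List detection script
+ ├── npm/ # npm launcher scripts
+ └── skills/agent-browser-cli/ # skill
  ```
 
  ## Manual Installation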
@@ -74,15 +100,6 @@ npm install -g @sleepinsummer/agent-browser-cli
  agent-browser-cli tabs
  ```
 
- The npm distribution currently uses a main package plus per-platform binary packages:
-
- ```text
- @sleepinsummer/agent-browser-cli
- @sleepinsummer/agent-browser-cli-darwin-arm64
- @sleepinsummer/agent-browser-cli-darwin-x64
- @sleepinsummer/agent-browser-cli-win32-x64
- ```
-
  ### Build From Source
 
  ```bash
@@ -90,15 +107,6 @@ cargo build --release
  ./target/release/agent-browser-cli tabs
  ```
 
- ### Running the Legacy Python Version
-
- The Python implementation is kept for now as a migration reference and fallback entry point:
-
- ```bash
- cd /path/to/agent-browser-cli
- python3 -m venv .venv
- .venv/bin/python -m pip install -r requirements.txt
- ```
 
  ## Chrome Extension
 
@@ -182,7 +190,6 @@ rm -rf ~/.agents/skills/agent-browser-cli
  ## Ports
 
  - `18765`: underlying `TMWebDriver` WebSocket, used by the Chrome extension.
- - `18766`: underlying `TMWebDriver` HTTP `/link`, used by the internal master/remote protocol.
  - `18767`: outer `agent-browser-cli` HTTP service, used by the CLI to reuse the session.
 
  ## Friendly Links
package/README_EN.md CHANGED
@@ -9,9 +9,8 @@ Browser perception · Page control · Chrome session reuse · CDP · Conditional
  <p>
  <a href="https://github.com/sleepinginsummer/agent-browser-cli"><img src="https://img.shields.io/badge/CLI-agentbrowsercli-2ea44f" alt="CLI agentbrowsercli"></a>
  <a href="https://github.com/sleepinginsummer/agent-browser-cli/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-MIT-green" alt="License MIT"></a>
- <a href="https://www.python.org/"><img src="https://img.shields.io/badge/Python-%3E%3D3.10-3776AB?logo=python&logoColor=white" alt="Python >=3.10"></a>
  <a href="https://github.com/sleepinginsummer/agent-browser-cli"><img src="https://img.shields.io/badge/Windows-MacOS-0078D6?labelColor=0078D6&color=C0C0C0" alt="Windows/MacOS"></a>
- <a href="https://github.com/sleepinginsummer/agent-browser-cli/releases"><img src="https://img.shields.io/badge/release-v0.1.1-blue" alt="release v0.1.1"></a>
+ <a href="https://github.com/sleepinginsummer/agent-browser-cli/releases"><img src="https://img.shields.io/badge/release-v0.2.3-blue" alt="release v0.2.3"></a>
  <a href="https://github.com/sleepinginsummer/agent-browser-cli/pulls"><img src="https://img.shields.io/badge/PRs-welcome-brightgreen" alt="PRs welcome"></a>
  </p>
 
@@ -27,9 +26,8 @@ This project is not Selenium or Playwright. It is better suited for helping agen
 
  ## Project Info
 
- - Current version: `0.1.1`
+ - Current version: `0.2.3`
  - Supported platforms: Windows, macOS
- - Python: `3.10+` recommended
  - Browser: Chrome / Chromium, with `assets/tmwd_cdp_bridge` loaded
 
  ## Acknowledgements
@@ -51,29 +49,65 @@ Please read https://github.com/sleepinginsummer/agent-browser-cli/blob/main/AI_I
  - Adds a startup lock to avoid repeated low-level port binding when multiple CLI commands start concurrently.
  - Adds the skill `skills/agent-browser-cli/SKILL.md` for AI usage reference.
  - Includes several optimizations to reduce command execution time.
+ - Rust implementation for the CLI side.
+
+ ## Performance Reference
+
+ The following numbers are measured with the long-lived service already running and the Chrome extension already connected. Actual latency depends on page complexity, network conditions, Chrome state, and response size.
+
+ | Operation | Reference Latency |
+ | --- | --- |
+ | Open a Baidu tab | About `0.10s` |
+ | Inject JS to enter a keyword and submit search | About `0.27s` |
+ | Open Baidu and search “小猫” end-to-end | About `0.37s` |
+ | `scan --tab --text-only` to read page text | About `0.04-0.12s` |
+ | `exec 'return document.title'` for simple JS | About `0.04-0.12s` |
+ | `exec 'return document.body.innerText'` to read body text | Mostly `0.04-0.05s`, occasional `0.30s` |
+ | Query DOM link lists | About `0.27-0.36s` |
+ | `exec --monitor` page-change summary | About `0.72-0.88s` |
+
+ Rule of thumb: normal page reads and simple JS injection are around the `50ms` level; complex DOM queries depend on page structure and returned data size, commonly around `300ms`; `--monitor` adds page-change summary work and is usually close to `0.8s`.
+
+ Reference comparison with the original Python call chain:
+
+ | Item | Python Version | Rust CLI Version |
+ | --- | --- | --- |
+ | Startup model | Each call is more likely to pay for Python process startup, module loading, and connection initialization | CLI commands reuse the long-lived service and avoid repeated browser connection initialization |
+ | Simple page read / JS injection | Usually more affected by process startup and the Python call chain, so latency is less stable | Commonly `0.04-0.12s` |
+ | Repeated calls | Overhead is more visible across many short commands | Better suited for high-frequency Agent calls |
+
+ This comparison is intended to describe the performance trend caused by the architecture difference. Actual latency still depends on page complexity, Chrome state, and response size.
 
  ## Layout
 
  ```text
  .
- ├── agent_browser_cli.py # CLI entry
- ├── agent_browser_server.py # Long-lived HTTP service
- ├── ga.py # web_scan / web_execute_js entry
- ├── TMWebDriver.py # Browser extension WebSocket / HTTP bridge
- ├── simphtml.py # Page simplification and DOM diff
+ ├── Cargo.toml # Rust crate config
+ ├── src/ # Rust CLI / daemon / bridge
  ├── assets/tmwd_cdp_bridge/ # Chrome MV3 extension
- ├── memory/ # Browser tool SOPs
+ ├── assets/simphtml_opt.js # Page simplification script
+ ├── assets/simphtml_find_list.js # List detection script
+ ├── npm/ # npm launcher scripts
  └── skills/agent-browser-cli/ # skill
  ```
 
  ## Manual Installation
 
+ ### npm
+
+ ```bash
+ npm install -g @sleepinsummer/agent-browser-cli
+ agent-browser-cli tabs
+ ```
+
+ ### Build From Source
+
  ```bash
- cd /path/to/agent-browser-cli
- python3 -m venv .venv
- .venv/bin/python -m pip install -r requirements.txt
+ cargo build --release
+ ./target/release/agent-browser-cli tabs
  ```
 
+
  ## Chrome Extension
 
  Load this extension directory:
@@ -87,8 +121,8 @@ Chrome needs at least one normal web page tab open. Do not leave it only on `abo
  ## Quick Check
 
  ```bash
- .venv/bin/python agent_browser_cli.py tabs
- .venv/bin/python agent_browser_cli.py open https://www.baidu.com
+ agent-browser-cli tabs
+ agent-browser-cli open https://www.baidu.com
  ```
 
  On success, it returns:
@@ -110,15 +144,15 @@ On success, it returns:
  The README only keeps the quick entry point. For the full command list and browser operation SOP, see [skills/agent-browser-cli/SKILL.md](./skills/agent-browser-cli/SKILL.md).
 
  ```bash
- .venv/bin/python agent_browser_cli.py tabs
+ agent-browser-cli tabs
  ```
 
  ## Update
 
  ```bash
  git pull
- .venv/bin/python -m pip install -r requirements.txt
- .venv/bin/python agent_browser_cli.py restart
+ cargo build --release
+ ./target/release/agent-browser-cli restart
  ```
 
  If the Chrome extension has updates, reload the `assets/tmwd_cdp_bridge` extension in `chrome://extensions`.
@@ -141,13 +175,12 @@ cp skills/agent-browser-cli/SKILL.md ~/.agents/skills/agent-browser-cli/SKILL.md
  Stop the long-lived service first:
 
  ```bash
- .venv/bin/python agent_browser_cli.py stop
+ agent-browser-cli stop
  ```
 
  Then clean up as needed:
 
  ```bash
- rm -rf .venv
  rm -f .agent-browser-cli.log .agent-browser-cli.lock
  rm -rf ~/.agents/skills/agent-browser-cli
  ```
@@ -157,7 +190,6 @@ Finally, remove the `TMWD CDP Bridge` extension from Chrome's extension manageme
  ## Ports
 
  - `18765`: underlying `TMWebDriver` WebSocket, used by the Chrome extension.
- - `18766`: underlying `TMWebDriver` HTTP `/link`, used by the internal master/remote protocol.
  - `18767`: outer `agent-browser-cli` HTTP service, used by the CLI to reuse the session.
 
  ## Friendly Links
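
The latency figures in the Performance Reference tables can be sanity-checked with a small timing harness. The sketch below is not part of the package: `timeCommand`, `summarize`, and `benchmark` are hypothetical helper names, and it assumes `agent-browser-cli` is on `PATH`, the long-lived service is already running, and the Chrome extension is connected, as the README states for its own measurements.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: run a command once and return its wall-clock time in seconds.
function timeCommand(cmd, args, timeoutMs = 10000) {
  const start = process.hrtime.bigint();
  const result = spawnSync(cmd, args, { encoding: 'utf8', timeout: timeoutMs });
  const seconds = Number(process.hrtime.bigint() - start) / 1e9;
  // spawnSync reports a missing binary or a timeout through result.error.
  return { seconds, ok: !result.error && result.status === 0 };
}

// Hypothetical helper: reduce a non-empty list of samples to min / median / max.
function summarize(samples) {
  if (samples.length === 0) throw new Error('summarize needs at least one sample');
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { min: sorted[0], median, max: sorted[sorted.length - 1] };
}

// Hypothetical helper: repeat a command and summarize its latency.
function benchmark(cmd, args, runs = 10) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const r = timeCommand(cmd, args);
    if (!r.ok) throw new Error(`command failed on run ${i + 1}`);
    samples.push(r.seconds);
  }
  return summarize(samples);
}
```

A call such as `benchmark('agent-browser-cli', ['exec', 'return document.title'])` would time the simple JS injection row of the table. Reporting min, median, and max rather than a single run matches the ranges the README quotes. On Windows, npm-installed commands are usually `.cmd` shims, so `spawnSync` may need `shell: true`; that detail is an assumption about the local setup, not something this diff specifies.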
@@ -0,0 +1,266 @@
+ function findMainList(startElement = null) {
+     const root = startElement || document.body;
+     const MIN_CHILDREN = 8;
+     const MAX_CONTAINERS = 20;
+
+     // Global scan: collect candidate containers, ranked by l1 + l2*0.1 (l2 = grandchild count, which captures multi-level structures such as tables)
+     const candidates = [];
+     const allEls = root.querySelectorAll('*');
+     for (const node of allEls) {
+         if (node.closest('svg')) continue;
+         const l1 = node.children.length;
+         if (l1 < 5) continue;
+         let l2 = 0;
+         for (const child of node.children) l2 += child.children.length;
+         const score = l1 + l2 * 0.1;
+         if (score >= MIN_CHILDREN) candidates.push({node, score});
+     }
+     candidates.sort((a, b) => b.score - a.score);
+     const toProcess = candidates.slice(0, MAX_CONTAINERS).map(c => c.node);
+
+     // For each container, find candidate groups and score them
+     let allCandidates = [];
+     for (const container of toProcess) {
+         const topGroups = findTopGroups(container, 3);
+         for (const groupInfo of topGroups) {
+             const items = findMatchingElements(container, groupInfo.selector);
+             if (items.length >= 5) {
+                 const score = scoreContainer(container, items) + groupInfo.score;
+                 if (score >= 30) {
+                     allCandidates.push({ container, selector: groupInfo.selector, items, score });
+                 }
+             }
+         }
+     }
+
+     // Sort by score, descending
+     allCandidates.sort((a, b) => b.score - a.score);
+
+     // Deduplicate: drop results that overlap a higher-scoring candidate by more than 50%
+     const kept = [];
+     for (const cand of allCandidates) {
+         let dominated = false;
+         for (const k of kept) {
+             if (k.container.contains(cand.container) || cand.container.contains(k.container)) {
+                 const kSet = new Set(k.items);
+                 const overlap = cand.items.filter(it => kSet.has(it)).length;
+                 if (overlap > cand.items.length * 0.5) { dominated = true; break; }
+             }
+         }
+         if (!dominated) kept.push(cand);
+     }
+
+     function describeResult(container, items, selector, score) {
+         if(container&&!container.id)container.id='_ljq'+(window._lci=(window._lci||0)+1);
+         const cTag = container ? container.tagName : null;
+         const cId = container ? (container.id || '') : '';
+         const cClass = container ? (String(container.className || '').trim()) : '';
+         const result = {
+             containerTag: cTag, containerId: cId, containerClass: cClass,
+             itemCount: items.length,
+         };
+         let prefix = '';
+         if (cId) prefix = '#' + CSS.escape(cId);
+         if (selector) result.selector = prefix ? (prefix + ' > ' + selector) : selector;
+         if (score !== undefined) result.score = score;
+         if (items.length > 0) {
+             result.firstItemPreview = items[0].outerHTML.substring(0, 200);
+             result.itemTags = items.slice(0, 10).map(el => el.tagName + (el.className ? '.' + String(el.className).trim().split(/\s+/)[0] : ''));
+         }
+         return result;
+     }
+
+     if (kept.length === 0) return [];
+
+     return kept.map(c => describeResult(c.container, c.items, c.selector, c.score));
+ }
+
+ function findTopGroups(container, limit) {
+     const children = Array.from(container.children).filter(c => !c.closest('svg'));
+     const totalChildren = children.length;
+     if (totalChildren < 3) return [];
+
+     const minGroupSize = Math.max(3, Math.floor(totalChildren * 0.2));
+     const groups = [];
+
+     // Count tags and class names
+     const tagFreq = {}, classFreq = {}, tagMap = {}, classMap = {};
+
+     children.forEach(child => {
+         // Count the tag
+         const tag = child.tagName.toLowerCase();
+         if (tag === "td") return;
+         tagFreq[tag] = (tagFreq[tag] || 0) + 1;
+         if (!tagMap[tag]) tagMap[tag] = [];
+         tagMap[tag].push(child);
+
+         // Count the class names
+         if (child.className) {
+             child.className.trim().split(/\s+/).forEach(cls => {
+                 if (cls) {
+                     classFreq[cls] = (classFreq[cls] || 0) + 1;
+                     if (!classMap[cls]) classMap[cls] = [];
+                     classMap[cls].push(child);
+                 }
+             });
+         }
+     });
+
+     // Scoring function
+     const scoreGroup = (selector, elements) => {
+         const coverage = elements.length / totalChildren;
+         let specificity = selector.startsWith('.')
+             ? (0.6 + (selector.match(/\./g).length - 1) * 0.1) // class selector
+             : (selector.includes('.')
+                 ? (0.7 + (selector.match(/\./g).length) * 0.1) // tag + class
+                 : 0.3); // bare tag
+         return (coverage * 0.5) + (specificity * 0.5);
+     };
+
+     // Add tag groups
+     Object.keys(tagFreq).forEach(tag => {
+         if (tag !== "div" && tagFreq[tag] >= minGroupSize) {
+             groups.push({
+                 selector: tag,
+                 elements: tagMap[tag],
+                 score: scoreGroup(tag, tagMap[tag]) - 0.5
+             });
+         }
+     });
+
+     // Add class groups
+     Object.keys(classFreq).forEach(cls => {
+         if (classFreq[cls] >= minGroupSize) {
+             const selector = '.' + CSS.escape(cls);
+             groups.push({
+                 selector,
+                 elements: classMap[cls],
+                 score: scoreGroup(selector, classMap[cls])
+             });
+         }
+     });
+     // Add tag + class combinations
+     const topTags = Object.keys(tagFreq).filter(t => tagFreq[t] >= minGroupSize).slice(0, 3);
+     const topClasses = Object.keys(classFreq).filter(c => classFreq[c] >= minGroupSize).sort((a, b) => classFreq[b] - classFreq[a]).slice(0, 3);
+
+     // Tag + class
+     topTags.forEach(tag => {
+         topClasses.forEach(cls => {
+             const elements = children.filter(el =>
+                 el.tagName.toLowerCase() === tag &&
+                 el.className && el.className.split(/\s+/).includes(cls)
+             );
+
+             if (elements.length >= minGroupSize) {
+                 const selector = tag + '.' + CSS.escape(cls);
+                 groups.push({selector, elements, score: scoreGroup(selector, elements)});
+             }
+         });
+     });
+
+     // Multi-class combinations
+     for (let i = 0; i < topClasses.length; i++) {
+         for (let j = i + 1; j < topClasses.length; j++) {
+             const elements = children.filter(el =>
+                 el.className && el.className.split(/\s+/).includes(topClasses[i]) && el.className.split(/\s+/).includes(topClasses[j]));
+
+             if (elements.length >= minGroupSize) {
+                 const selector = '.' + CSS.escape(topClasses[i]) + '.' + CSS.escape(topClasses[j]);
+                 groups.push({selector, elements,score: scoreGroup(selector, elements)});
+             }
+         }
+     }
+     // Return the N highest-scoring groups
+     return groups.sort((a, b) => b.score - a.score).slice(0, limit);
+ }
+
+ function findMatchingElements(container, selector) {
+     try {
+         return Array.from(container.querySelectorAll(selector));
+     } catch (e) {
+         // Handle invalid selectors
+         console.error('Invalid selector:', selector, e);
+         return [];
+     }
+ }
+
+ function scoreContainer(container, items) {
+     if (!container || items.length < 3) return 0;
+     // 1. Compute the basic area data
+     const containerRect = container.getBoundingClientRect();
+     const containerArea = containerRect.width * containerRect.height;
+     if (containerArea < 10000) return 0; // container too small
+
+     // Collect area data for the list items
+     const itemAreas = [];
+     let totalItemArea = 0;
+     let visibleItems = 0;
+
+     items.forEach(item => {
+         const rect = item.getBoundingClientRect();
+         const area = rect.width * rect.height;
+         if (area > 0) {
+             totalItemArea += area;
+             itemAreas.push(area);
+             visibleItems++;
+         }
+     });
+     // Too few visible items: return a low score
+     if (visibleItems < 3) return 0;
+     // Guard against outliers: keep the item area from exceeding the container
+     totalItemArea = Math.min(totalItemArea, containerArea * 0.98);
+     const areaRatio = totalItemArea / containerArea;
+     // 3. Compute the individual scores, using linear interpolation rather than steps
+     // 3.2 Area-ratio score: up to 40 points, continuous curve
+     // A sigmoid keeps the score smooth
+     const areaScore = 40 / (1 + Math.exp(-12 * (areaRatio - 0.4)));
+
+     // 3.3 Uniformity score: up to 20 points, continuous curve
+     let uniformityScore = 0;
+     if (itemAreas.length >= 3) {
+         const mean = itemAreas.reduce((sum, area) => sum + area, 0) / itemAreas.length;
+         const variance = itemAreas.reduce((sum, area) => sum + Math.pow(area - mean, 2), 0) / itemAreas.length;
+         const cv = mean > 0 ? Math.sqrt(variance) / mean : 1;
+         // Exponential decay: the smaller cv is, the higher the score
+         uniformityScore = 20 * Math.exp(-2.5 * cv);
+     }
+
+     const baseScore = Math.log2(visibleItems) * 5 + Math.floor(visibleItems / 5) * 0.25;
+     const rawCountScore = Math.min(40, baseScore);
+     const countScore = rawCountScore * Math.max(0.1, uniformityScore / 20);
+
+     // 3.4 Container-size score, continuous curve (the original comment says up to 15 points; the multiplier below caps it at 2)
+     const viewportArea = window.innerWidth * window.innerHeight;
+     const containerViewportRatio = containerArea / viewportArea;
+     const sizeScore = 2 * (1 - 1/(1 + Math.exp(-10 * (containerViewportRatio - 0.25))));
+
+     let layoutScore = 0;
+     if (items.length >= 3) {
+         // Group by coordinates and count rows and columns
+         const uniqueRows = new Set(items.map(item => Math.round(item.getBoundingClientRect().top / 5) * 5)).size;
+         const uniqueCols = new Set(items.map(item => Math.round(item.getBoundingClientRect().left / 5) * 5)).size;
+         // A single row or single column gets full marks; otherwise evaluate grid quality
+         if (uniqueRows === 1 || uniqueCols === 1) { layoutScore = 20;
+         } else {
+             const coverage = Math.min(1, items.length / (uniqueRows * uniqueCols));
+             const efficiency = Math.max(0, 1 - (uniqueRows + uniqueCols) / (2 * items.length));
+             layoutScore = 20 * (0.7 * coverage + 0.3 * efficiency);
+         }
+     }
+
+     // Total score: still kept at roughly 100 points overall
+     const totalScore = countScore + areaScore + uniformityScore + layoutScore + sizeScore;
+
+     if (totalScore > 100)
+         console.log(container, {
+             total: totalScore.toFixed(2),
+             count: countScore.toFixed(2),
+             areaRatio: areaRatio.toFixed(2),
+             area: areaScore.toFixed(2),
+             uniformity: uniformityScore.toFixed(2),
+             size: sizeScore.toFixed(2),
+             layout: layoutScore.toFixed(2)
+         });
+
+     return totalScore;
+ }
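
The scoring curves in `scoreContainer` are easier to reason about once they are pulled away from the DOM. The sketch below restates three of them as pure functions with the same constants as the diff; the standalone function names and their signatures are assumptions made for illustration and do not exist in the package.

```javascript
// Area-ratio score: logistic curve centred on a ratio of 0.4, approaching 40 points.
function areaScore(areaRatio) {
  return 40 / (1 + Math.exp(-12 * (areaRatio - 0.4)));
}

// Uniformity score: 20 * exp(-2.5 * cv), where cv is the coefficient of
// variation (population standard deviation / mean) of the item areas.
function uniformityScore(itemAreas) {
  if (itemAreas.length < 3) return 0;
  const mean = itemAreas.reduce((sum, a) => sum + a, 0) / itemAreas.length;
  const variance = itemAreas.reduce((sum, a) => sum + (a - mean) ** 2, 0) / itemAreas.length;
  const cv = mean > 0 ? Math.sqrt(variance) / mean : 1;
  return 20 * Math.exp(-2.5 * cv);
}

// Count score: grows with log2 of the visible item count, is capped at 40,
// and is scaled down (to no less than 10%) when the items are not uniform.
function countScore(visibleItems, uniformity) {
  const base = Math.log2(visibleItems) * 5 + Math.floor(visibleItems / 5) * 0.25;
  return Math.min(40, base) * Math.max(0.1, uniformity / 20);
}

// Eight equally sized items that fill 40% of their container.
const uniformity = uniformityScore(Array(8).fill(2500));
console.log(areaScore(0.4), uniformity, countScore(8, uniformity));
```

With perfectly uniform items the uniformity term reaches its 20-point ceiling and the count term is left unscaled, while an area ratio of exactly 0.4 sits at the midpoint of the logistic curve. Uneven item sizes lower both the uniformity term and, through the multiplier, the count term, which is why a long but irregular set of children can still score below the threshold of 30 used in `findMainList`.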
@@ -0,0 +1,324 @@
+ function optHTML(text_only=false) {
+     function createEnhancedDOMCopy() {
+         const nodeInfo = new WeakMap();
+         const ignoreTags = ['SCRIPT', 'STYLE', 'NOSCRIPT', 'META', 'LINK', 'COLGROUP', 'COL', 'TEMPLATE', 'PARAM', 'SOURCE'];
+         const ignoreIds = ['agent-browser-cli-ind'];
+         function cloneNode(sourceNode, keep=false) {
+             if (sourceNode.nodeType === 8 ||
+                 (sourceNode.nodeType === 1 && (
+                     ignoreTags.includes(sourceNode.tagName) ||
+                     (sourceNode.id && ignoreIds.includes(sourceNode.id))
+                 ))) {
+                 return null;
+             }
+             if (sourceNode.nodeType === 3) return sourceNode.cloneNode(false);
+             const clone = sourceNode.cloneNode(false);
+             if ((sourceNode.tagName === 'INPUT' || sourceNode.tagName === 'TEXTAREA') && sourceNode.value) clone.setAttribute('value', sourceNode.value);
+             if (sourceNode.tagName === 'INPUT' && (sourceNode.type === 'radio' || sourceNode.type === 'checkbox') && sourceNode.checked) clone.setAttribute('checked', '');
+             else if (sourceNode.tagName === 'SELECT' && sourceNode.value) clone.setAttribute('data-selected', sourceNode.value);
+             try { if (sourceNode.matches && sourceNode.matches(':-webkit-autofill')) { clone.setAttribute('data-autofilled', 'true'); if (!sourceNode.value) clone.setAttribute('value', '⚠️受保护-读tmwebdriver_sop的autofill章节提取'); } } catch(e) {}
+
+             const isDropdown = sourceNode.classList?.contains('dropdown-menu') ||
+                 /dropdown|menu/i.test(sourceNode.className) || sourceNode.getAttribute('role') === 'menu';
+             const _ddItems = isDropdown ? sourceNode.querySelectorAll('a, button, [role="menuitem"], li').length : 0;
+             const isSmallDropdown = _ddItems > 0 && _ddItems <= 7 && sourceNode.textContent.length < 500;
+
+             const childNodes = [];
+             for (const child of sourceNode.childNodes) {
+                 const childClone = cloneNode(child, keep || isSmallDropdown);
+                 if (childClone) childNodes.push(childClone);
+             }
+             if (sourceNode.tagName === 'IFRAME') {
+                 try {
+                     const iDoc = sourceNode.contentDocument || sourceNode.contentWindow?.document;
+                     if (iDoc && iDoc.body && iDoc.body.children.length > 0) {
+                         const wrapper = document.createElement('div');
+                         wrapper.setAttribute('data-iframe-content', sourceNode.src || '');
+                         for (const ch of iDoc.body.childNodes) {
+                             const c = cloneNode(ch, keep);
+                             if (c) wrapper.appendChild(c);
+                         }
+                         if (wrapper.childNodes.length) childNodes.push(wrapper);
+                     }
+                 } catch(e) {}
+             }
+             if (sourceNode.shadowRoot) {
+                 for (const shadowChild of sourceNode.shadowRoot.childNodes) {
+                     const shadowClone = cloneNode(shadowChild, keep);
+                     if (shadowClone) childNodes.push(shadowClone);
+                 }
+             }
+
+             const rect = sourceNode.getBoundingClientRect();
+             const style = window.getComputedStyle(sourceNode);
+             const area = (style.display === 'none' || style.visibility === 'hidden' || parseFloat(style.opacity) <= 0)?0:rect.width * rect.height;
+             const isVisible = (rect.width > 1 && rect.height > 1 &&
+                 style.display !== 'none' && style.visibility !== 'hidden' &&
+                 parseFloat(style.opacity) > 0 &&
+                 Math.abs(rect.left) < 5000 && Math.abs(rect.top) < 5000)
+                 || isSmallDropdown;
+             const zIndex = style.position !== 'static' ? (parseInt(style.zIndex) || 0) : 0;
+
+             let info = {
+                 rect, area, isVisible, isSmallDropdown, zIndex,
+                 style: {
+                     display: style.display, visibility: style.visibility,
+                     opacity: style.opacity, position: style.position
+                 }};
+
+             const nonTextChildren = childNodes.filter(child => child.nodeType !== 3);
+             const hasValidChildren = nonTextChildren.length > 0;
+
+             if (hasValidChildren) {
+                 const childrenInfos = nonTextChildren.map(c => nodeInfo.get(c)).filter(i => i && i.rect && i.rect.width > 0 && i.rect.height > 0);
+                 const bgAlpha = (() => {
+                     const c = style.backgroundColor;
+                     if (!c || c === 'transparent') return 0;
+                     const m = c.match(/rgba?\([^)]+,\s*([\d.]+)\)/);
+                     return m ? parseFloat(m[1]) : 1;
+                 })();
+                 const hasVisualBg = bgAlpha > 0.1 || style.backgroundImage !== 'none' || (style.backdropFilter && style.backdropFilter !== 'none') || style.boxShadow !== 'none';
+
+                 if (!hasVisualBg && childrenInfos.length > 0) {
+                     // Skip fixed/absolute children when computing parent's merged rect (they're out of flow)
+                     const flowChildren = childrenInfos.filter(cInfo => cInfo.style && cInfo.style.position !== 'fixed' && cInfo.style.position !== 'absolute');
+                     if (flowChildren.length > 0) {
+                         let minL = Infinity, minT = Infinity, maxR = -Infinity, maxB = -Infinity;
+                         for (const cInfo of flowChildren) {
+                             minL = Math.min(minL, cInfo.rect.left);
+                             minT = Math.min(minT, cInfo.rect.top);
+                             maxR = Math.max(maxR, cInfo.rect.right);
+                             maxB = Math.max(maxB, cInfo.rect.bottom);
+                         }
+                         info.rect = { left: minL, top: minT, right: maxR, bottom: maxB, width: maxR - minL, height: maxB - minT };
+                         info.area = info.rect.width * info.rect.height;
+                     } else {
+                         const maxC = childrenInfos.filter(i => i.isVisible).sort((a, b) => b.area - a.area)[0];
+                         if (maxC && maxC.area > 10000 && (!isVisible || maxC.area > info.area * 5)) info = maxC;
+                     }
+                 }
+             }
+
+             if (sourceNode.nodeType === 1 && sourceNode.tagName === 'DIV') {
+                 if (!hasValidChildren && !sourceNode.textContent.trim()) return null;
+             }
+             // aria-hidden + not visible = truly hidden (e.g. mobile menus), remove even if has children
+             if (sourceNode.getAttribute && sourceNode.getAttribute('aria-hidden') === 'true' && !info.isVisible) {
+                 return null;
+             }
+             if (info.isVisible || hasValidChildren || keep) {
+                 childNodes.forEach(child => clone.appendChild(child));
+                 return clone;
+             }
+             return null;
+         }
+
+         return {
+             domCopy: cloneNode(document.body),
+             getNodeInfo: node => nodeInfo.get(node),
+             isVisible: node => {
+                 const info = nodeInfo.get(node);
+                 return info && info.isVisible;
+             }
+         };
+     }
+     const { domCopy, getNodeInfo, isVisible } = createEnhancedDOMCopy();
+     if (text_only) {
+         const blocks = new Set(['DIV','P','H1','H2','H3','H4','H5','H6','LI','TR','SECTION','ARTICLE','HEADER','FOOTER','NAV','BLOCKQUOTE','PRE','HR','BR','DT','DD','FIGCAPTION','DETAILS','SUMMARY']);
+         domCopy.querySelectorAll('*').forEach(el => {
+             if (blocks.has(el.tagName)) el.insertAdjacentText('beforebegin', '\n');
+         });
+         domCopy.querySelectorAll('input:not([type=hidden]),textarea,select').forEach(el=>{
+             const p=[el.tagName,el.id&&'#'+el.id,el.getAttribute('name')&&'name='+el.getAttribute('name'),el.tagName==='INPUT'&&'type='+(el.getAttribute('type')||'text'),el.getAttribute('placeholder')&&'"'+el.getAttribute('placeholder')+'"',el.getAttribute('data-autofilled')&&'autofilled',el.disabled&&'disabled',el.tagName==='SELECT'&&el.getAttribute('data-selected')&&'="'+el.getAttribute('data-selected')+'"'].filter(Boolean).join(' ');
+             el.insertAdjacentText('beforebegin','\n['+p+']\n');
+         });
+         domCopy.querySelectorAll('button[disabled]').forEach(el=>el.insertAdjacentText('beforebegin','[DISABLED] '));
+         return domCopy.textContent;
+     }
+     const viewportArea = window.innerWidth * window.innerHeight;
+
+     function analyzeNode(node, pPathType='main') {
+         // Handle non-element nodes and leaf nodes
+         if (node.nodeType !== 1 || !node.children.length) {
+             node.nodeType === 1 && (node.dataset.mark = 'K:leaf');
+             return;
+         }
+         const pathType = (node.dataset.mark === 'K:secondary') ? 'second' : pPathType;
+         const nodeInfoData = getNodeInfo(node);
+         if (!nodeInfoData || !nodeInfoData.rect) return;
+         const rectn = nodeInfoData.rect;
+         if (rectn.width < window.innerWidth * 0.8 && rectn.height < window.innerHeight * 0.8) return node;
+         if (node.tagName === 'TABLE') return;
+         const children = Array.from(node.children);
+         if (children.length === 1) {
+             node.dataset.mark = 'K:container';
+             return analyzeNode(children[0], pathType);
+         }
+         if (children.length > 10) return;
+
+         // Collect child info and sort it
+         const childrenInfo = children.map(child => {
+             const info = getNodeInfo(child) || { rect: {}, style: {} };
+             return { node: child, rect: info.rect, style: info.style,
+                 area: info.area, zIndex: (info.zIndex || 0), isVisible: info.isVisible };
+         });
+         childrenInfo.sort((a, b) => b.area - a.area);
+
+         // Detect whether the children partition the parent or overlay each other
+         const isOverlay = hasOverlap(childrenInfo);
+         node.dataset.mark = isOverlay ? 'K:overlayParent' : 'K:partitionParent';
+
+         if (isOverlay) handleOverlayContainer(childrenInfo, pathType);
+         else handlePartitionContainer(childrenInfo, pathType);
+
+         console.log(`${isOverlay ? '覆盖' : '划分'}容器:`, node, `子元素数量: ${children.length}`);
+         console.log('子元素及标记:', children.map(child => ({
+             element: child,
+             mark: child.dataset.mark || '无',
+             info: getNodeInfo ? getNodeInfo(child) : undefined
+         })));
+         for (const child of children)
+             if (!child.dataset.mark || child.dataset.mark[0] !== 'R') analyzeNode(child, pathType);
+     }
+
+     // Handle a partition container
+     function handlePartitionContainer(childrenInfo, pathType) {
+         childrenInfo.sort((a, b) => b.area - a.area);
+         const totalArea = childrenInfo.reduce((sum, item) => sum + item.area, 0);
+         console.log(childrenInfo[0].area / totalArea);
+         const hasMainElement = childrenInfo.length >= 1 &&
+             (childrenInfo[0].area / totalArea > 0.5) &&
+             (childrenInfo.length === 1 || childrenInfo[0].area > childrenInfo[1].area * 2);
+         if (hasMainElement) {
+             childrenInfo[0].node.dataset.mark = 'K:main';
+             for (let i = 1; i < childrenInfo.length; i++) {
+                 const child = childrenInfo[i];
+                 let className = (child.node.getAttribute('class') || '').toLowerCase();
+                 let isSecondary = containsButton(child.node);
+                 if (className.includes('nav')) isSecondary = true;
+                 if (className.includes('breadcrumbs')) isSecondary = true;
+                 if (className.includes('header') && className.includes('table')) isSecondary = true;
+                 if (child.node.innerHTML.trim().replace(/\s+/g, '').length < 500) isSecondary = true;
+                 if (child.node.textContent.trim().length > 200) isSecondary = true; // P3: keep it if it has substantial text content
+                 if (child.style.visibility === 'hidden') isSecondary = false;
+                 if (isSecondary) child.node.dataset.mark = 'K:secondary';
+                 else child.node.dataset.mark = 'K:nonEssential';
+             }
+         } else {
+             return; // relaxed: skip equalmany filtering, list truncation handles token budget
+             const uniqueClassNames = new Set(childrenInfo.map(item => item.node.getAttribute('class') || '')).size;
+             const highClassNameVariety = uniqueClassNames >= childrenInfo.length * 0.8;
+             if (pathType !== 'main' && highClassNameVariety && childrenInfo.length > 5) {
+                 childrenInfo.forEach(child => child.node.dataset.mark = 'R:equalmany');
+             } else {
+                 childrenInfo.forEach(child => child.node.dataset.mark = 'K:equal');
+             }
+         }
+     }
+
+    function containsButton(container) {
+        const hasStandardButton = container.querySelector('button, input[type="button"], input[type="submit"], [role="button"]') !== null;
+        if (hasStandardButton) return true;
+        const hasClassButton = container.querySelector('[class*="-btn"], [class*="-button"], .button, .btn, [class*="btn-"]') !== null;
+        return hasClassButton;
+    }
+
+    function handleOverlayContainer(childrenInfo, pathType) {
+        // elementFromPoint ground truth: let the browser tell us which element is visually on top
+        const _efp = document.elementFromPoint(window.innerWidth/2, window.innerHeight/2);
+        if (_efp) { let _el = _efp; while (_el) { const _h = childrenInfo.find(c => c.node.id && c.node.id === _el.id); if (_h) { _h.zIndex = 9999; break; } _el = _el.parentElement; } }
+        const sorted = [...childrenInfo].sort((a, b) => b.zIndex - a.zIndex);
+        console.log('Sorted children:', sorted);
+        if (sorted.length === 0) return;
+
+        const top = sorted[0];
+        const rect = top.rect;
+        const topNode = top.node;
+        const isComplex = top.node.querySelectorAll('input, select, textarea, button, a, [role="button"]').length >= 1;
+
+        const textContent = topNode.textContent?.trim() || '';
+        const textLength = textContent.length;
+        const hasLinks = topNode.querySelectorAll('a').length > 0;
+        const isMostlyText = textLength > 7 && !hasLinks;
+
+        const centerDiff = Math.abs((rect.left + rect.width/2) - window.innerWidth/2) / window.innerWidth;
+        const minDimensionRatio = Math.min(rect.width / window.innerWidth, rect.height / window.innerHeight);
+        const maxDimensionRatio = Math.max(rect.width / window.innerWidth, rect.height / window.innerHeight);
+        const isNearTop = rect.top < 50;
+        const isDialog = (top.node.querySelector('iframe') || top.node.querySelector('button') || top.node.querySelector('input')) && centerDiff < 0.3;
+
+        if (isComplex && centerDiff < 0.2 &&
+            ((minDimensionRatio > 0.2 && rect.width/window.innerWidth < 0.98) || minDimensionRatio > 0.95)) {
+            top.node.dataset.mark = 'K:mainInteractive';
+            sorted.slice(1).forEach(e => {
+                if ((parseInt(e.zIndex)||0) <= (parseInt(sorted[0].zIndex)||0)) {
+                    e.node.dataset.mark = 'R:covered';
+                } else {
+                    e.node.dataset.mark = 'K:noncovered';
+                }
+            });
+        } else {
+            if (isComplex && isNearTop && maxDimensionRatio > 0.4 && top.isVisible) {
+                top.node.dataset.mark = 'K:topBar';
+            } else if (isMostlyText || isComplex || isDialog) {
+                topNode.dataset.mark = 'K:messageContent';
+            } else {
+                topNode.dataset.mark = 'R:floatingAd';
+            }
+            const rest = sorted.slice(1);
+            rest.length && (!hasOverlap(rest) ? handlePartitionContainer(rest, pathType) : handleOverlayContainer(rest, pathType));
+        }
+    }
+
+    function hasOverlap(items) {
+        return items.some((a, i) =>
+            items.slice(i+1).some(b => {
+                const r1 = a.rect, r2 = b.rect;
+                if (!r1.width || !r2.width || !r1.height || !r2.height) {return false;}
+                const epsilon = 1;
+                const x1 = r1.x !== undefined ? r1.x : r1.left;
+                const y1 = r1.y !== undefined ? r1.y : r1.top;
+                const x2 = r2.x !== undefined ? r2.x : r2.left;
+                const y2 = r2.y !== undefined ? r2.y : r2.top;
+                return !(x1 + r1.width <= x2 + epsilon || x1 >= x2 + r2.width - epsilon ||
+                    y1 + r1.height <= y2 + epsilon || y1 >= y2 + r2.height - epsilon
+                );
+            })
+        );
+    }
+
+    // Hoist top 1-2 deep fixed dialogs to body level for overlay detection
+    const _fc = [...domCopy.querySelectorAll('*')].filter(el => {
+        if (el.parentNode === domCopy) return false;
+        const info = getNodeInfo(el);
+        if (!info?.rect || info.style.position !== 'fixed') return false;
+        const r = info.rect, cover = (r.width * r.height) / viewportArea;
+        const cd = Math.abs((r.left + r.width/2) - window.innerWidth/2) / window.innerWidth;
+        return cover > 0.15 && cover < 1.0 && cd < 0.3 && el.querySelector('button, input, a, [role="button"], iframe');
+    }).filter((el, _, arr) => !arr.some(o => o !== el && o.contains(el)))
+        .sort((a, b) => (getNodeInfo(b).rect.width * getNodeInfo(b).rect.height) - (getNodeInfo(a).rect.width * getNodeInfo(a).rect.height))
+        .slice(0, 2);
+    _fc.forEach(el => { const r = getNodeInfo(el).rect; console.log('[simphtml] Hoisted fixed dialog:', el.tagName + (el.id ? '#'+el.id : '') + (el.className ? '.'+String(el.className).split(' ')[0] : ''), Math.round(r.width)+'x'+Math.round(r.height), Math.round(100*r.width*r.height/viewportArea)+'%'); el.parentNode.removeChild(el); domCopy.appendChild(el); });
+    const result = analyzeNode(domCopy);
+    domCopy.querySelectorAll('[data-mark^="R:"]').forEach(el=>el.parentNode?.removeChild(el));
+    let root = domCopy;
+    while (root.children.length === 1) {
+        root = root.children[0];
+    }
+    for (let ii = 0; ii < 3; ii++) {
+        root.querySelectorAll('div').forEach(div => (!div.textContent.trim() && div.children.length === 0) && div.remove());
+    }
+    root.querySelectorAll('[data-mark]').forEach(e => e.removeAttribute('data-mark'));
+    root.removeAttribute('data-mark');
+    root.querySelectorAll('iframe').forEach(f => {
+        if (f.children.length) {
+            const d = document.createElement('div');
+            for (const a of f.attributes) d.setAttribute(a.name, a.value);
+            d.setAttribute('data-tag', 'iframe');
+            while (f.firstChild) d.appendChild(f.firstChild);
+            f.parentNode.replaceChild(d, f);
+        }
+    });
+    return root.outerHTML;
+}
+optHTML()
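The `hasOverlap` check in the file above counts two rectangles as overlapping only when they intersect by more than a one-pixel tolerance, so siblings that merely touch along an edge are still classified as a partition. A minimal standalone sketch of that test follows, assuming plain `{x, y, width, height}` objects in place of the `rect` values returned by `getNodeInfo`; the names `rectsOverlap` and `anyOverlap` are hypothetical and are not part of the package.

```javascript
// Hypothetical standalone version of the rectangle test; plain objects stand in for DOMRect.
function rectsOverlap(r1, r2, epsilon = 1) {
  // Rectangles with a zero (or missing) dimension never count as overlapping.
  if (!r1.width || !r2.width || !r1.height || !r2.height) return false;
  // Accept either x/y or left/top, mirroring the fallback used in the diff.
  const x1 = r1.x !== undefined ? r1.x : r1.left;
  const y1 = r1.y !== undefined ? r1.y : r1.top;
  const x2 = r2.x !== undefined ? r2.x : r2.left;
  const y2 = r2.y !== undefined ? r2.y : r2.top;
  // The rectangles are separated if one lies entirely to the left/right/above/below
  // the other, with `epsilon` pixels of slack on each comparison.
  const separated =
    x1 + r1.width <= x2 + epsilon ||
    x1 >= x2 + r2.width - epsilon ||
    y1 + r1.height <= y2 + epsilon ||
    y1 >= y2 + r2.height - epsilon;
  return !separated;
}

// True if any pair of items has overlapping rects.
function anyOverlap(items) {
  return items.some((a, i) => items.slice(i + 1).some(b => rectsOverlap(a.rect, b.rect)));
}
```

With this tolerance, two 100-pixel-wide boxes placed side by side at `x = 0` and `x = 100` are not treated as overlapping, while a box shifted to `x = 50` is.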
package/package.json CHANGED
@@ -1,13 +1,14 @@
  {
    "name": "@sleepinsummer/agent-browser-cli",
-   "version": "0.2.0",
+   "version": "0.2.3",
    "description": "Agent-oriented browser sensing and control CLI backed by a native Rust daemon.",
    "license": "MIT",
    "bin": {
      "agent-browser-cli": "npm/bin/agent-browser-cli.js"
    },
    "files": [
-     "npm",
+     "npm/bin",
+     "npm/postinstall.js",
      "assets",
      "skills",
      "README.md",
@@ -20,11 +21,19 @@
      "postinstall": "node npm/postinstall.js"
    },
    "optionalDependencies": {
-     "@sleepinsummer/agent-browser-cli-darwin-arm64": "0.2.0",
-     "@sleepinsummer/agent-browser-cli-darwin-x64": "0.2.0",
-     "@sleepinsummer/agent-browser-cli-win32-x64": "0.2.0"
+     "@sleepinsummer/agent-browser-cli-darwin-arm64": "0.2.3",
+     "@sleepinsummer/agent-browser-cli-darwin-x64": "0.2.3",
+     "@sleepinsummer/agent-browser-cli-win32-x64": "0.2.3"
    },
    "engines": {
      "node": ">=18"
+   },
+   "repository": {
+     "type": "git",
+     "url": "https://github.com/sleepinginsummer/agent-browser-cli"
+   },
+   "publishConfig": {
+     "access": "public",
+     "provenance": true
    }
  }
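@@ -5,7 +5,7 @@ description: Use agent-browser-cli for browser sensing and control. Applies to
 
  # agent-browser-cli
 
- Use `agent-browser-cli` for browser control. Under the hood, a resident Rust service and a Chrome extension take over the user's browser, preserving login state and cookies; this is not Selenium/Playwright. The legacy Python APIs `web_scan` and `web_execute_js` remain only as compatibility and troubleshooting entry points.
+ Use `agent-browser-cli` for browser control. Under the hood, a resident Rust service and a Chrome extension take over the user's browser, preserving login state and cookies; this is not Selenium/Playwright
 
  ## Project path
 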
@@ -73,18 +73,8 @@ agent-browser-cli restart
 
  Resident service ports:
  - `18765`: low-level `TMWebDriver` WebSocket, used by the Chrome extension to connect.
- - `18766`: low-level `TMWebDriver` HTTP `/link`, used for the internal master/remote protocol.
  - `18767`: outer `agent-browser-cli` HTTP service, which lets the CLI reuse sessions.
 
- The legacy Python implementation is only for fallback troubleshooting or development:
-
- ```bash
- .venv/bin/python - <<'PY'
- import ga
- print(ga.web_scan(tabs_only=True))
- PY
- ```
-
  Signs of success:
  - Returns `status=success`
  - `tabs_count` is visible
@@ -127,38 +117,29 @@ agent-browser-cli exec --tab 303987837 'document.querySelector("button").click()
 
  ## Basic usage
 
- `web_scan` handles sensing; `web_execute_js` handles precise operations. When a precise operation is possible, skip the full scan. Below is the legacy direct Python invocation, mainly for troubleshooting or compatibility with old scripts; for everyday command-line work, prefer the resident CLI above.
-
- ```python
- import ga
+ `scan` handles sensing; `exec` handles precise operations. When a precise operation is possible, skip the full scan.
 
- print(ga.web_scan(tabs_only=True))
- print(ga.web_scan())
- print(ga.web_scan(text_only=True))
- print(ga.web_scan(switch_tab_id="303987837"))
+ ```bash
+ agent-browser-cli scan --tabs-only
+ agent-browser-cli scan
+ agent-browser-cli scan --text-only
+ agent-browser-cli scan --tab 303987837
  ```
 
  Plain page JS:
 
- ```python
- import ga
-
- print(ga.web_execute_js("return document.title"))
- print(ga.web_execute_js("""
- return {
- title: document.title,
- url: location.href
- }
- """))
+ ```bash
+ agent-browser-cli exec 'return document.title'
+ agent-browser-cli exec 'return { title: document.title, url: location.href }'
  ```
 
- When using `await` inside `web_execute_js`, you must `return` explicitly; otherwise the result may be `null`.
+ When using `await` inside `exec`, you must `return` explicitly; otherwise the result may be `null`.
 
- `web_scan` only reads the current page and does not navigate. To switch sites, run this through `web_execute_js`:
+ `scan` only reads the current page and does not navigate. To switch sites, use `open` or `exec`:
 
- ```python
- import ga
- print(ga.web_execute_js("location.href='https://example.com'; return location.href"))
+ ```bash
+ agent-browser-cli open https://example.com
+ agent-browser-cli exec "location.href='https://example.com'; return location.href"
  ```
 
  To open a new tab, prefer the native `open` command; do not use `window.open` together with `--monitor`. Under the hood, `open` goes through the extension's `chrome.tabs.create` and does not trigger a CDP debugger attach.
@@ -174,13 +155,11 @@ JS events have `isTrusted=false`, so sensitive operations may be blocked by the page. JS clicks
 
  For cross-tab operations, cookies, CDP, extension management, or browser content permissions, prefer passing a JSON string directly rather than assembling DOM nodes yourself.
 
- ```python
- import ga
-
- print(ga.web_execute_js('{"cmd":"tabs"}'))
- print(ga.web_execute_js('{"cmd":"cookies"}'))
- print(ga.web_execute_js('{"cmd":"cdp","tabId":303987837,"method":"Page.captureScreenshot","params":{"format":"png"}}'))
- print(ga.web_execute_js('{"cmd":"batch","tabId":303987837,"commands":[{"cmd":"tabs"},{"cmd":"cookies"}]}'))
+ ```bash
+ agent-browser-cli exec '{"cmd":"tabs"}'
+ agent-browser-cli exec '{"cmd":"cookies"}'
+ agent-browser-cli exec '{"cmd":"cdp","tabId":303987837,"method":"Page.captureScreenshot","params":{"format":"png"}}'
+ agent-browser-cli exec '{"cmd":"batch","tabId":303987837,"commands":[{"cmd":"tabs"},{"cmd":"cookies"}]}'
  ```
 
  Common commands:
@@ -263,11 +242,11 @@ return fetch('PDF_URL').then(r => r.blob()).then(b => {
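  });
  ```
 
- For Google image search, do not hard-code obfuscated class names. To click a result, look for the `[role=button]` container first; `web_scan` may filter out the side panel, so once it pops up, read `document.body.innerText` with JS; for the full-size image, iterate over `img` and take the `src` with the largest `naturalWidth`; for the "Visit" (访问) link, iterate over `a` and take the `href` of the one where `textContent.includes('访问')`; extract thumbnails directly from `img[src^="data:image"]`.
+ For Google image search, do not hard-code obfuscated class names. To click a result, look for the `[role=button]` container first; `scan` may filter out the side panel, so once it pops up, read `document.body.innerText` with JS; for the full-size image, iterate over `img` and take the `src` with the largest `naturalWidth`; for the "Visit" (访问) link, iterate over `a` and take the `href` of the one where `textContent.includes('访问')`; extract thumbnails directly from `img[src^="data:image"]`.
 
  ## iframe, Shadow DOM, and screenshots
 
- Same-origin iframes are traversed automatically by `web_scan`. For cross-origin iframes, prefer CDP: use `Page.getFrameTree` to find the `frameId`, then `Page.createIsolatedWorld` to obtain a `contextId`, and finally `Runtime.evaluate` to execute in the iframe's context.
+ Same-origin iframes are traversed automatically by `scan`. For cross-origin iframes, prefer CDP: use `Page.getFrameTree` to find the `frameId`, then `Page.createIsolatedWorld` to obtain a `contextId`, and finally `Runtime.evaluate` to execute in the iframe's context.
 
  When performing a CDP click on an element inside an iframe, the coordinates must be composed: `finalX = iframeRect.x + elRect.x`, `finalY = iframeRect.y + elRect.y`. `Target.getTargets` / `Target.attachToTarget` usually return `Not allowed` in the current CDP bridge, so do not try that route first. The postMessage relay is reliable only when the content script has already been injected into the iframe; it is usually unavailable for third-party payment iframes.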
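The coordinate composition described above can be sketched as a small helper. This is an illustrative sketch only: `composeIframePoint` and its `center` option are hypothetical names, not part of `agent-browser-cli`, and it assumes `iframeRect` is the iframe's bounding rect in the top-level viewport while `elRect` is the element's bounding rect inside the iframe's own viewport.

```javascript
// Hypothetical helper, not part of agent-browser-cli.
// iframeRect: bounding rect of the <iframe> in the top-level viewport.
// elRect: bounding rect of the target element, relative to the iframe's viewport.
function composeIframePoint(iframeRect, elRect, { center = false } = {}) {
  // Assumption: with `center` set, aim at the middle of the element
  // instead of its top-left corner.
  const dx = center ? (elRect.width || 0) / 2 : 0;
  const dy = center ? (elRect.height || 0) / 2 : 0;
  return {
    finalX: iframeRect.x + elRect.x + dx,
    finalY: iframeRect.y + elRect.y + dy,
  };
}
```

The resulting `finalX` / `finalY` are the values that would go into the `x` and `y` parameters of a CDP mouse event sent to the top-level page.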
 
@@ -277,33 +256,21 @@ For closed Shadow DOM, use `DOM.getDocument({depth:-1,pierce:true})`, then level by level `
 
  Prefer CDP for screenshots:
 
- ```python
- import ga
- print(ga.web_execute_js('{"cmd":"cdp","method":"Page.captureScreenshot","params":{"format":"png"}}'))
+ ```bash
+ agent-browser-cli exec '{"cmd":"cdp","method":"Page.captureScreenshot","params":{"format":"png"}}'
  ```
 
  For CAPTCHA canvas/img elements, prefer JS `canvas.toDataURL()` or read the image `src` directly.
 
  ## Autofill and login
 
- If an input in the `web_scan` output carries `data-autofilled="true"`, its value may be shown as a protected placeholder rather than the real value. Chrome releases autofill-protected values only in the foreground tab, so you must first call CDP `Page.bringToFront`.
+ If an input in the `scan` output carries `data-autofilled="true"`, its value may be shown as a protected placeholder rather than the real value. Chrome releases autofill-protected values only in the foreground tab, so you must first call CDP `Page.bringToFront`.
 
  One-shot release flow: `Page.bringToFront` -> `mousePressed` on any field (`mouseReleased` is usually unnecessary) -> wait 500ms -> dispatch the missing `input/change` events -> click login.
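The release flow above can be sketched as a list of steps to feed to `agent-browser-cli exec` one at a time. This is a hedged sketch, not the project's implementation: `buildAutofillReleaseSteps` is a hypothetical name, the `{"cmd":"cdp",...}` envelope follows the JSON examples shown earlier in this file, and the `Input.dispatchMouseEvent` parameters and the blanket `input` selector are assumptions that a real page may need adjusted. The final "click login" step is page-specific and left out.

```javascript
// Hypothetical step builder; each `exec` payload is meant to be passed as the
// single argument of `agent-browser-cli exec '<payload>'`.
function buildAutofillReleaseSteps(tabId, field) {
  // Wrap a CDP call in the JSON envelope used by the exec examples.
  const cdp = (method, params = {}) =>
    JSON.stringify({ cmd: 'cdp', tabId, method, params });

  // Page JS that re-fires input/change so the page's framework sees the values.
  // Assumption: targeting every <input> is acceptable for the page at hand.
  const refireEvents =
    "document.querySelectorAll('input').forEach(el => {" +
    " el.dispatchEvent(new Event('input', { bubbles: true }));" +
    " el.dispatchEvent(new Event('change', { bubbles: true }));" +
    " }); return true";

  return [
    // 1. Bring the tab to the foreground so Chrome releases autofilled values.
    { kind: 'exec', payload: cdp('Page.bringToFront') },
    // 2. Press the mouse on any field; no mouseReleased is sent.
    {
      kind: 'exec',
      payload: cdp('Input.dispatchMouseEvent', {
        type: 'mousePressed', x: field.x, y: field.y, button: 'left', clickCount: 1,
      }),
    },
    // 3. Give Chrome time to populate the fields.
    { kind: 'wait', ms: 500 },
    // 4. Re-dispatch input/change events.
    { kind: 'exec', payload: refireEvents },
  ];
}
```

Keeping the steps as data makes it easy to log exactly what was sent when a login page still rejects the filled values.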
 
  ## Debugging
 
- `simphtml` debugging requires injecting JS into a real browser; local static parsing cannot simulate the DOM
-
- ```python
- from TMWebDriver import TMWebDriver
- import simphtml
-
- d = TMWebDriver()
- d.set_session('url_pattern')
- print(d.execute_js('return document.title'))
- ```
-
- `simphtml.optimize_html_for_tokens(html)` returns a BeautifulSoup Tag; wrap it in `str(...)` before displaying it.
+ Debugging page simplification requires injecting JS into a real browser; local static parsing cannot simulate the DOM. Prefer `scan --text-only` and small `exec` snippets to narrow down the problem.
 
  ## Troubleshooting order
 
@@ -312,4 +279,4 @@ print(d.execute_js('return document.title'))
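  3. If it reports that `config.js` or the manifest cannot be loaded, check `assets/tmwd_cdp_bridge/config.js`.
  4. If it reports that no tabs are available, open a normal web page first; do not open only internal pages.
  5. If the extension is not installed, load `assets/tmwd_cdp_bridge`.
- 6. If it still fails, continue with `memory/web_setup_sop.md` and `memory/tmwebdriver_sop.md`.
+ 6. If it still fails, check the Chrome extension background logs and `.agent-browser-cli.log`.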
@@ -1,14 +0,0 @@
- {
-   "name": "@sleepinsummer/agent-browser-cli-darwin-arm64",
-   "version": "0.2.0",
-   "license": "MIT",
-   "os": [
-     "darwin"
-   ],
-   "cpu": [
-     "arm64"
-   ],
-   "files": [
-     "bin"
-   ]
- }