bobo-ai-cli 3.0.4 → 3.0.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (238)
  1. package/LICENSE +21 -21
  2. package/README.md +259 -259
  3. package/bundled-skills/CORE_SKILLS.txt +18 -18
  4. package/bundled-skills/backend-expert/SKILL.md +97 -97
  5. package/bundled-skills/code-review/SKILL.md +280 -280
  6. package/bundled-skills/code-review-expert/SKILL.md +85 -85
  7. package/bundled-skills/context-budget-analyzer/SKILL.md +76 -76
  8. package/bundled-skills/context-compressor/SKILL.md +75 -75
  9. package/bundled-skills/context-optimization-suite/SKILL.md +162 -162
  10. package/bundled-skills/frontend-expert/SKILL.md +93 -93
  11. package/bundled-skills/github/SKILL.md +12 -12
  12. package/bundled-skills/high-agency/SKILL.md +473 -473
  13. package/bundled-skills/high-agency/references/builder-patterns.md +126 -126
  14. package/bundled-skills/high-agency/references/recovery-playbook.md +298 -298
  15. package/bundled-skills/memory-manager/SKILL.md +214 -214
  16. package/bundled-skills/memory-manager/references/advanced-config.md +65 -65
  17. package/bundled-skills/orchestrator/SKILL.md +681 -681
  18. package/bundled-skills/planning-with-files/SKILL.md +193 -193
  19. package/bundled-skills/skill-creator/SKILL.md +220 -220
  20. package/bundled-skills/testing-expert/SKILL.md +99 -99
  21. package/bundled-skills/verify/SKILL.md +15 -15
  22. package/dist/agent.d.ts +5 -0
  23. package/dist/agent.js +11 -1
  24. package/dist/agent.js.map +1 -1
  25. package/dist/agents/catalog.d.ts +47 -0
  26. package/dist/agents/catalog.js +63 -5
  27. package/dist/agents/catalog.js.map +1 -1
  28. package/dist/agents/router.d.ts +12 -1
  29. package/dist/agents/router.js +43 -3
  30. package/dist/agents/router.js.map +1 -1
  31. package/dist/agents/spawn.js +36 -18
  32. package/dist/agents/spawn.js.map +1 -1
  33. package/dist/autonomous.js +5 -5
  34. package/dist/cli.js +23 -21
  35. package/dist/cli.js.map +1 -1
  36. package/dist/compactor.js +39 -39
  37. package/dist/dream.js +29 -29
  38. package/dist/image-input.d.ts +44 -0
  39. package/dist/image-input.js +161 -0
  40. package/dist/image-input.js.map +1 -0
  41. package/dist/memory.js +13 -13
  42. package/dist/project.js +15 -15
  43. package/dist/repl.js +88 -0
  44. package/dist/repl.js.map +1 -1
  45. package/dist/skills.js +54 -54
  46. package/dist/sub-agents.js +65 -65
  47. package/dist/tools/browser.js +21 -21
  48. package/dist/tools/claude-code.js +10 -10
  49. package/dist/web.js +7 -7
  50. package/dist/wiki-commands.d.ts +2 -0
  51. package/dist/wiki-commands.js +249 -0
  52. package/dist/wiki-commands.js.map +1 -0
  53. package/dist/wiki.d.ts +90 -0
  54. package/dist/wiki.js +614 -0
  55. package/dist/wiki.js.map +1 -0
  56. package/knowledge/advanced-patterns.md +70 -70
  57. package/knowledge/agent-directives.md +74 -74
  58. package/knowledge/api-integration-patterns.md +102 -0
  59. package/knowledge/code-review-protocol.md +69 -0
  60. package/knowledge/dream.md +36 -36
  61. package/knowledge/engineering.md +52 -46
  62. package/knowledge/error-catalog.md +38 -33
  63. package/knowledge/event-driven-architecture.md +43 -0
  64. package/knowledge/external-alignment.md +47 -0
  65. package/knowledge/high-agency.md +73 -0
  66. package/knowledge/image-generation.md +48 -0
  67. package/knowledge/index.json +194 -169
  68. package/knowledge/llm-wiki-pattern.md +71 -0
  69. package/knowledge/long-task-management.md +79 -0
  70. package/knowledge/memory/cache-optimization-and-skill-integration.md +102 -102
  71. package/knowledge/memory/engineering-patterns.md +134 -134
  72. package/knowledge/memory/feedback_root_structure.md +15 -15
  73. package/knowledge/memory/project-contexts.md +69 -69
  74. package/knowledge/memory/tools-and-services.md +85 -85
  75. package/knowledge/memory-management.md +72 -0
  76. package/knowledge/rules/advisor-strategy.md +204 -0
  77. package/knowledge/rules/agents.md +62 -62
  78. package/knowledge/rules/blocking-rules.md +323 -323
  79. package/knowledge/rules/cache-management.md +379 -379
  80. package/knowledge/rules/capability-evolution.md +132 -132
  81. package/knowledge/rules/coding.md +126 -126
  82. package/knowledge/rules/engineering-workflows.md +225 -225
  83. package/knowledge/rules/evomap-content-guidelines.md +354 -354
  84. package/knowledge/rules/evomap-guide.md +224 -224
  85. package/knowledge/rules/external-alignment.md +22 -0
  86. package/knowledge/rules/git.md +31 -31
  87. package/knowledge/rules/hooks.md +106 -106
  88. package/knowledge/rules/performance.md +101 -101
  89. package/knowledge/rules/remotion-auto-production.md +1120 -1120
  90. package/knowledge/rules/security.md +46 -46
  91. package/knowledge/rules/testing.md +32 -32
  92. package/knowledge/rules/work-mode.md +208 -208
  93. package/knowledge/rules.md +62 -62
  94. package/knowledge/self-evolution.md +78 -0
  95. package/knowledge/self-rationalization-guard.md +52 -0
  96. package/knowledge/skills/Skill_Seekers.md +1722 -1722
  97. package/knowledge/skills/ab-test-setup.md +557 -557
  98. package/knowledge/skills/agent-sdk-dev.md +238 -238
  99. package/knowledge/skills/agent-tools.md +136 -136
  100. package/knowledge/skills/analytics-tracking.md +597 -597
  101. package/knowledge/skills/artifacts-builder.md +89 -89
  102. package/knowledge/skills/asana.md +12 -12
  103. package/knowledge/skills/backend-expert.md +97 -97
  104. package/knowledge/skills/brand-voice.md +481 -481
  105. package/knowledge/skills/browser-use.md +419 -419
  106. package/knowledge/skills/cache-optimization-skill.md +179 -179
  107. package/knowledge/skills/canvas-design.md +147 -147
  108. package/knowledge/skills/citation-validator.md +203 -203
  109. package/knowledge/skills/clangd-lsp.md +52 -52
  110. package/knowledge/skills/code-review-expert.md +85 -85
  111. package/knowledge/skills/code-review.md +280 -280
  112. package/knowledge/skills/code-simplifier.md +12 -12
  113. package/knowledge/skills/commit-commands.md +258 -258
  114. package/knowledge/skills/competitor-alternatives.md +795 -795
  115. package/knowledge/skills/content-atomizer.md +910 -910
  116. package/knowledge/skills/content-research-writer.md +605 -605
  117. package/knowledge/skills/context-optimization-suite.md +162 -162
  118. package/knowledge/skills/context7.md +12 -12
  119. package/knowledge/skills/copy-editing.md +494 -494
  120. package/knowledge/skills/copywriting.md +510 -510
  121. package/knowledge/skills/csharp-lsp.md +40 -40
  122. package/knowledge/skills/decision-making-framework.md +154 -154
  123. package/knowledge/skills/developer-growth-analysis.md +335 -335
  124. package/knowledge/skills/direct-response-copy.md +2336 -2336
  125. package/knowledge/skills/docker-expert.md +229 -229
  126. package/knowledge/skills/document-skills.md +12 -12
  127. package/knowledge/skills/documentation-expert.md +126 -126
  128. package/knowledge/skills/email-sequence.md +1061 -1061
  129. package/knowledge/skills/email-sequences.md +910 -910
  130. package/knowledge/skills/example-plugin.md +72 -72
  131. package/knowledge/skills/explanatory-output-style.md +82 -82
  132. package/knowledge/skills/feature-dev.md +458 -458
  133. package/knowledge/skills/file-organizer.md +466 -466
  134. package/knowledge/skills/firebase.disabled.md +12 -12
  135. package/knowledge/skills/form-cro.md +488 -488
  136. package/knowledge/skills/free-tool-strategy.md +636 -636
  137. package/knowledge/skills/frontend-design-offical.md +55 -55
  138. package/knowledge/skills/frontend-design.md +41 -41
  139. package/knowledge/skills/frontend-expert.md +93 -93
  140. package/knowledge/skills/github.md +12 -12
  141. package/knowledge/skills/gitlab.md +12 -12
  142. package/knowledge/skills/gopls-lsp.md +32 -32
  143. package/knowledge/skills/got-controller.md +218 -218
  144. package/knowledge/skills/greptile.md +72 -72
  145. package/knowledge/skills/hookify.md +376 -376
  146. package/knowledge/skills/image-editor.md +189 -189
  147. package/knowledge/skills/image-enhancer.md +109 -109
  148. package/knowledge/skills/jdtls-lsp.md +49 -49
  149. package/knowledge/skills/json-canvas.md +654 -654
  150. package/knowledge/skills/keyword-research.md +559 -559
  151. package/knowledge/skills/kotlin-lsp.md +28 -28
  152. package/knowledge/skills/laravel-boost.md +12 -12
  153. package/knowledge/skills/launch-strategy.md +394 -394
  154. package/knowledge/skills/lead-magnet.md +393 -393
  155. package/knowledge/skills/learning-output-style.md +106 -106
  156. package/knowledge/skills/linear.md +12 -12
  157. package/knowledge/skills/lua-lsp.md +47 -47
  158. package/knowledge/skills/marketing-ideas.md +720 -720
  159. package/knowledge/skills/marketing-psychology.md +534 -534
  160. package/knowledge/skills/mcp-builder.md +369 -369
  161. package/knowledge/skills/meeting-insights-analyzer.md +347 -347
  162. package/knowledge/skills/memory-evolution-system.md +172 -172
  163. package/knowledge/skills/multi-lens-thinking.md +407 -407
  164. package/knowledge/skills/nano-banana-pro.md +116 -116
  165. package/knowledge/skills/newsletter.md +736 -736
  166. package/knowledge/skills/notebooklm.md +296 -296
  167. package/knowledge/skills/obsidian-bases.md +634 -634
  168. package/knowledge/skills/obsidian-markdown.md +651 -651
  169. package/knowledge/skills/onboarding-cro.md +494 -494
  170. package/knowledge/skills/orchestrator.md +681 -681
  171. package/knowledge/skills/page-cro.md +379 -379
  172. package/knowledge/skills/paid-ads.md +624 -624
  173. package/knowledge/skills/paywall-upgrade-cro.md +651 -651
  174. package/knowledge/skills/php-lsp.md +36 -36
  175. package/knowledge/skills/planning-with-files.md +193 -193
  176. package/knowledge/skills/playwright.md +12 -12
  177. package/knowledge/skills/plugin-dev.md +434 -434
  178. package/knowledge/skills/popup-cro.md +520 -520
  179. package/knowledge/skills/positioning-angles.md +330 -330
  180. package/knowledge/skills/pr-review-toolkit.md +359 -359
  181. package/knowledge/skills/pricing-strategy.md +777 -777
  182. package/knowledge/skills/programmatic-seo.md +714 -714
  183. package/knowledge/skills/pyright-lsp.md +43 -43
  184. package/knowledge/skills/quality-assurance-framework.md +168 -168
  185. package/knowledge/skills/question-refiner.md +160 -160
  186. package/knowledge/skills/ralph-loop.md +205 -205
  187. package/knowledge/skills/refactoring-expert.md +103 -103
  188. package/knowledge/skills/referral-program.md +668 -668
  189. package/knowledge/skills/research-executor.md +164 -164
  190. package/knowledge/skills/review-with-security.md +12 -12
  191. package/knowledge/skills/rust-analyzer-lsp.md +50 -50
  192. package/knowledge/skills/schema-markup.md +647 -647
  193. package/knowledge/skills/security-audit-expert.md +124 -124
  194. package/knowledge/skills/security-expert.md +140 -140
  195. package/knowledge/skills/security-guidance.md +12 -12
  196. package/knowledge/skills/seedance-prompt.md +139 -139
  197. package/knowledge/skills/self-evolution.md +1160 -1160
  198. package/knowledge/skills/seo-audit.md +432 -432
  199. package/knowledge/skills/seo-content.md +787 -787
  200. package/knowledge/skills/serena.md +12 -12
  201. package/knowledge/skills/signup-flow-cro.md +409 -409
  202. package/knowledge/skills/skill-creator.md +220 -220
  203. package/knowledge/skills/skill-manager.md +226 -226
  204. package/knowledge/skills/skill-share.md +98 -98
  205. package/knowledge/skills/slack.md +12 -12
  206. package/knowledge/skills/social-content.md +878 -878
  207. package/knowledge/skills/spec-flow-skill.md +124 -124
  208. package/knowledge/skills/stripe.md +12 -12
  209. package/knowledge/skills/supabase.md +12 -12
  210. package/knowledge/skills/swift-lsp.md +40 -40
  211. package/knowledge/skills/synthesizer.md +236 -236
  212. package/knowledge/skills/template-skill.md +16 -16
  213. package/knowledge/skills/testing-expert.md +99 -99
  214. package/knowledge/skills/theme-factory.md +72 -72
  215. package/knowledge/skills/tiktok-research.md +208 -208
  216. package/knowledge/skills/typescript-lsp.md +36 -36
  217. package/knowledge/skills/ui-ux-pro-max.md +247 -247
  218. package/knowledge/skills/verify.md +15 -15
  219. package/knowledge/skills/visual-prompt-engineer.md +102 -102
  220. package/knowledge/skills/webapp-testing.md +111 -111
  221. package/knowledge/skills/wide-research.md +191 -191
  222. package/knowledge/system.md +93 -93
  223. package/knowledge/task-router.md +46 -37
  224. package/knowledge/verification.md +38 -38
  225. package/knowledge/worker-prompt-craft.md +66 -0
  226. package/knowledge/workflows/3d-viz.md +47 -47
  227. package/knowledge/workflows/data-pipeline.md +47 -47
  228. package/knowledge/workflows/db-migration.md +51 -51
  229. package/knowledge/workflows/feature-dev.md +41 -41
  230. package/knowledge/workflows/tdd-flow.md +52 -52
  231. package/knowledge/workflows/ui-verify.md +51 -51
  232. package/package.json +74 -74
  233. package/dist/claude-bridge.d.ts +0 -18
  234. package/dist/claude-bridge.js +0 -91
  235. package/dist/claude-bridge.js.map +0 -1
  236. package/dist/tools/claude-bridge-tool.d.ts +0 -4
  237. package/dist/tools/claude-bridge-tool.js +0 -44
  238. package/dist/tools/claude-bridge-tool.js.map +0 -1
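
Each entry in the file list above has the form `path +added -removed`. As a rough reading aid for a list this long, the sketch below parses a few entries copied from the list and groups them by the shape of their counts. The names `ENTRY` and `classify` and the four group labels are illustrative choices, not part of the package or of the diff tool. Equal counts on both sides do not by themselves prove a whitespace-only rewrite.

```python
import re

# Matches e.g. "22. package/dist/agent.d.ts +5 -0"; the leading index is optional.
ENTRY = re.compile(
    r"^\s*(?:\d+\.\s+)?(?P<path>\S+)\s+\+(?P<added>\d+)\s+-(?P<removed>\d+)\s*$"
)

def classify(lines):
    """Group file-list entries by count shape (hypothetical helper)."""
    groups = {"additions_only": [], "deletions_only": [], "balanced": [], "mixed": []}
    for line in lines:
        m = ENTRY.match(line)
        if m is None:
            continue  # skip anything that is not a file entry
        added, removed = int(m["added"]), int(m["removed"])
        if removed == 0:
            kind = "additions_only"
        elif added == 0:
            kind = "deletions_only"
        elif added == removed:
            kind = "balanced"
        else:
            kind = "mixed"
        groups[kind].append((m["path"], added, removed))
    return groups

# A small subset of the entries above.
sample = [
    "  22. package/dist/agent.d.ts +5 -0",
    "  23. package/dist/agent.js +11 -1",
    "  54. package/dist/wiki.js +614 -0",
    "  105. package/knowledge/skills/browser-use.md +419 -419",
    "  234. package/dist/claude-bridge.js +0 -91",
]
groups = classify(sample)
for kind, entries in groups.items():
    print(kind, [path for path, _, _ in entries])
```

Run over the full list, the same grouping would separate files that only gained lines, files that only lost lines, and the many entries with equal counts on both sides.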
@@ -1,419 +1,419 @@
1
- ---
2
- id: "browser-use"
3
- title: "Browser-Use Skill"
4
- category: "infrastructure"
5
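- tags: ["browser-use skill", "📚 overview", "🚀 quick start", "create the environment", "install browser-use and chromium", "browser use (recommended - fastest + lowest cost)", "or use another llm", "🏗️ core concepts", "🛠️ development rules", "🎯 development commands"]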
6
- triggers: []
7
- dependencies: []
8
- source: "E:/Bobo's Coding cache/.claude/skills/browser-use"
9
- ---
10
-
11
- # Browser-Use Skill
12
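-
13
- > AI-driven browser automation library - use an LLM to control the browser and complete complex tasks
14
-
15
- ## 📚 Overview
16
-
17
- Browser-Use is an async Python >= 3.11 library that drives a browser with AI through an LLM plus CDP (Chrome DevTools Protocol). Its core architecture lets AI agents navigate web pages autonomously, interact with elements, and complete complex tasks by processing HTML and making LLM-driven decisions.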
18
-
19
- ## 🚀 Quick Start
20
-
21
- ### 1. Install Browser-Use
22
-
23
- ```bash
24
- # Create the environment
25
- pip install uv
26
- uv venv --python 3.12
27
- source .venv/bin/activate
28
- # On Windows use: .venv\Scripts\activate
29
-
30
- # Install browser-use and chromium
31
- uv pip install browser-use
32
- uvx browser-use install
33
- ```
34
-
35
- ### 2. Choose your preferred LLM
36
-
37
- Create a `.env` file and add your API key:
38
-
39
- ```bash
40
- # Browser Use (recommended - fastest + lowest cost)
41
- BROWSER_USE_API_KEY=your_key_here
42
- # Get $10 of free credit at https://cloud.browser-use.com/new-api-key
43
-
44
- # Or use another LLM
45
- OPENAI_API_KEY=your_key_here
46
- ANTHROPIC_API_KEY=your_key_here
47
- GOOGLE_API_KEY=your_key_here
48
- ```
49
-
50
- ### 3. Run your first Agent
51
-
52
- ```python
53
- from browser_use import Agent, ChatBrowserUse
54
- from dotenv import load_dotenv
55
- import asyncio
56
-
57
- load_dotenv()
58
-
59
- async def main():
60
-     llm = ChatBrowserUse()
61
-     task = "Find the number 1 post on Hacker News"
62
-     agent = Agent(task=task, llm=llm)
63
-     await agent.run()
64
-
65
- if __name__ == "__main__":
66
-     asyncio.run(main())
67
- ```
68
-
69
- ### 4. Production deployment
70
-
71
- Use the `@sandbox` decorator to deploy to production and scale to millions of agents:
72
-
73
- ```python
74
- from browser_use import Browser, sandbox, ChatBrowserUse
75
- from browser_use.agent.service import Agent
76
- import asyncio
77
-
78
- @sandbox(cloud_profile_id='your-profile-id')
79
- async def production_task(browser: Browser):
80
-     agent = Agent(
81
-         task="Your authenticated task",
82
-         browser=browser,
83
-         llm=ChatBrowserUse()
84
-     )
85
-     await agent.run()
86
-
87
- asyncio.run(production_task())
88
- ```
89
-
90
- Sync local cookies to the cloud:
91
-
92
- ```bash
93
- export BROWSER_USE_API_KEY=your_key && curl -fsSL https://browser-use.com/profile.sh | sh
94
- ```
95
-
96
- ## 🏗️ Core Concepts
97
-
98
- ### Agent basics
99
-
100
- ```python
101
- from browser_use import Agent, ChatBrowserUse
102
-
103
- agent = Agent(
104
-     task="Search for the latest AI news",
105
-     llm=ChatBrowserUse(),
106
- )
107
-
108
- async def main():
109
-     history = await agent.run(max_steps=100)
110
-
111
-     # Access useful information
112
-     history.urls()  # List of visited URLs
113
-     history.action_names()  # Names of the actions executed
114
-     history.final_result()  # Final extracted content
115
-     history.is_successful()  # Check whether the run completed successfully
116
- ```
117
-
118
- ### Browser configuration
119
-
120
- ```python
121
- from browser_use import Agent, Browser, ChatBrowserUse
122
-
123
- browser = Browser(
124
-     headless=False,  # Show the browser window
125
-     window_size={'width': 1000, 'height': 700},
126
-     proxy=ProxySettings(server='http://host:8080'),
127
-     user_data_dir='./profile',  # Keep the login session
128
- )
129
-
130
- agent = Agent(
131
-     task='Search for Browser Use',
132
-     browser=browser,
133
-     llm=ChatBrowserUse(),
134
- )
135
- ```
136
-
137
- ### Tools
138
-
139
- Custom tools extend what the agent can do:
140
-
141
- ```python
142
- from browser_use import Tools, ActionResult, Browser
143
-
144
- tools = Tools()
145
-
146
- @tools.action('Ask a human a question')
147
- def ask_human(question: str, browser: Browser) -> ActionResult:
148
-     answer = input(f'{question} > ')
149
-     return f'The human answered: {answer}'
150
-
151
- agent = Agent(
152
-     task='Ask a human for help',
153
-     llm=llm,
154
-     tools=tools,
155
- )
156
- ```
157
-
158
- ## 🛠️ Development Rules
159
-
160
- ### Core principles
161
-
162
- 1. **Always use `uv` instead of `pip`**
163
-
164
- ```bash
165
- uv venv --python 3.11
166
- source .venv/bin/activate
167
- uv sync
168
- ```
169
-
170
- 2. **Type-safe coding**
171
-    - Use Pydantic v2 models for all internal operations
172
-    - Use modern Python type hints: `str | None` rather than `Optional[str]`
173
-
174
- 3. **Pre-commit formatting**
175
-    - Always run pre-commit before submitting a PR
176
-
177
- 4. **Use descriptive names and docstrings**
178
-
179
- 5. **Return structured content as `ActionResult`**
180
-    - Helps the agent reason better
181
-
182
- 6. **Never create random examples**
183
-    - Use inline code in the terminal when testing features
184
-
185
- 7. **Recommend the `ChatBrowserUse` model by default**
186
-    - Highest accuracy + fastest speed + lowest token cost
187
-
188
- ## 🎯 Development Commands
189
-
190
- ```bash
191
- # Setup
192
- uv venv --python 3.11
193
- source .venv/bin/activate
194
- uv sync
195
-
196
- # Tests
197
- uv run pytest -vxs tests/ci # CI tests
198
- uv run pytest -vxs tests/ # All tests
199
-
200
- # Quality checks
201
- uv run pyright # Type checking
202
- uv run ruff check --fix # Linting
203
- uv run ruff format # Formatting
204
- uv run pre-commit run --all-files # Pre-commit hooks
205
-
206
- # MCP server mode
207
- uvx browser-use[cli] --mcp
208
- ```
209
-
210
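- ## 📖 Available Tools (Actions)
211
-
212
- ### Navigation and browser control
213
-
214
- - `search` - Search queries (DuckDuckGo, Google, Bing)
215
- - `navigate` - Navigate to a URL
216
- - `go_back` - Go back in browser history
217
- - `wait` - Wait for a specified number of seconds
218
-
219
- ### Page interaction
220
-
221
- - `click` - Click an element by its index
222
- - `input` - Enter text into a form field
223
- - `upload_file` - Upload a file
224
- - `scroll` - Scroll the page
225
- - `find_text` - Scroll to specific text on the page
226
- - `send_keys` - Send special keys (Enter, Escape, etc.)
227
-
228
- ### JavaScript execution
229
-
230
- - `evaluate` - Execute custom JavaScript code on the page
231
-
232
- ### Tab management
233
-
234
- - `switch` - Switch between browser tabs
235
- - `close` - Close a browser tab
236
-
237
- ### Content extraction
238
-
239
- - `extract` - Extract data from a web page using the LLM
240
-
241
- ### Visual analysis
242
-
243
- - `screenshot` - Request a screenshot of the next browser state
244
-
245
- ### Form controls
246
-
247
- - `dropdown_options` - Get the values of dropdown options
248
- - `select_dropdown` - Select a dropdown option
249
-
250
- ### File operations
251
-
252
- - `write_file` - Write content to a file
253
- - `read_file` - Read the contents of a file
254
- - `replace_file` - Replace text in a file
255
-
256
- ### Task completion
257
-
258
- - `done` - Complete the task (always available)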
259
-
260
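- ## 💡 Prompting Tips
261
-
262
- ### 1. Specific vs. open-ended
263
-
264
- **✅ Specific (recommended)**
265
-
266
- ```python
267
- task = """
268
- 1. Go to https://quotes.toscrape.com/
269
- 2. Use the extract action to query "the first 3 quotes and their authors"
270
- 3. Use the write_file action to save the results to quotes.csv
271
- 4. Run a Google search for the first quote and find when it was written
272
- """
273
- ```
274
-
275
- **❌ Open-ended**
276
-
277
- ```python
278
- task = "Go to the web and make money"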
279
- ```
280
-
281
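- ### 2. Name actions directly
282
-
283
- When you know exactly what the agent should do, refer to the action by name:
284
-
285
- ```python
286
- task = """
287
- 1. Use the search action to find "Python tutorials"
288
- 2. Use click to open the first result in a new tab
289
- 3. Use the scroll action to scroll down 2 pages
290
- 4. Use extract to pull the names of the first 5 items
291
- 5. If the page has not loaded, wait 2 seconds, refresh, and wait 10 seconds
292
- 6. Use the send_keys action to enter "Tab Tab ArrowDown Enter"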
293
- """
294
- ```
295
-
296
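- ### 3. Work around interaction problems with keyboard navigation
297
-
298
- Sometimes a button cannot be clicked (you have found a bug in the library, so file an issue). The good news is that keyboard navigation can usually get around it!
299
-
300
- ```python
301
- task = """
302
- If the submit button cannot be clicked:
303
- 1. Use the send_keys action to enter "Tab Tab Enter" to navigate to it and activate it
304
- 2. Or use send_keys to enter "ArrowDown ArrowDown Enter" to submit the form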
305
- """
306
- ```
307
-
308
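- ### 4. Custom action integration
309
-
310
- ```python
311
- @controller.action("Get the 2FA code from the authenticator app")
312
- async def get_2fa_code():
313
-     # Your implementation
314
-     pass
315
-
316
- task = """
317
- Log in with 2FA:
318
- 1. Enter the username/password
319
- 2. When prompted for 2FA, use the get_2fa_code action
320
- 3. Never try to extract the 2FA code from the page manually
321
- 4. Always use the get_2fa_code action to get the authentication code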
322
- """
323
- ```
324
-
325
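- ### 5. Error recovery
326
-
327
- ```python
328
- task = """
329
- Robust data extraction:
330
- 1. Go to openai.com to find their CEO
331
- 2. If navigation fails because of anti-bot protection:
332
-    - Use Google search to find the CEO
333
- 3. If the page times out, use go_back and try an alternative approach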
334
- """
335
- ```
336
-
337
- ## 🌟 Advanced Features
338
-
339
- ### Structured output
340
-
341
- Use a Pydantic model to get structured output:
342
-
343
- ```python
344
- from pydantic import BaseModel
345
-
346
- class Quote(BaseModel):
347
-     text: str
348
-     author: str
349
-
350
- agent = Agent(
351
-     task="Extract the first 3 quotes",
352
-     llm=llm,
353
-     output_model_schema=Quote,
354
- )
355
-
356
- history = await agent.run()
357
- structured_data = history.structured_output
358
- ```
359
-
360
- ### Remote browser (Browser-Use Cloud)
361
-
362
- ```python
363
- from browser_use import Browser, ChatBrowserUse
364
-
365
- # Simple: use the Browser-Use cloud browser service
366
- browser = Browser(use_cloud=True)
367
-
368
- # Advanced: configure cloud browser parameters
369
- browser = Browser(
370
-     cloud_profile_id='your-profile-id',  # Specific browser profile
371
-     cloud_proxy_country_code='us',  # Proxy location
372
-     cloud_timeout=30,  # Session timeout (minutes)
373
- )
374
- ```
375
-
376
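- ### MCP integration
377
-
378
- Browser-Use supports two modes:
379
-
380
- 1. **As an MCP server**: exposes browser automation tools to MCP clients (such as Claude Desktop)
381
- 2. **As an MCP client**: the Agent can connect to external MCP servers to extend its capabilities
382
-
383
- ```bash
384
- # Run as an MCP server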
385
- uvx browser-use[cli] --mcp
386
- ```
387
-
388
- ## 📂 Project Structure
389
-
390
- ```
391
- browser_use/
392
- ├── agent/                  # Agent core logic
393
- │   ├── service.py          # Main orchestrator
394
- │   ├── views.py            # Pydantic models
395
- │   └── system_prompt*.md   # Agent prompts
396
- ├── browser/                # Browser management
397
- │   ├── session.py          # BrowserSession + CDP client
398
- │   └── profile.py          # Browser configuration and launch arguments
399
- ├── dom/                    # DOM processing
400
- │   └── service.py          # DomService extraction and processing
401
- ├── tools/                  # Action registry
402
- │   └── service.py          # Tool definitions
403
- ├── llm/                    # LLM integration layer
404
- └── mcp/                    # MCP integration
405
-     └── client.py           # MCP client connection
406
- ```
407
-
408
- ## 🔗 Related Resources
409
-
410
- - **GitHub**: https://github.com/browser-use/browser-use
411
- - **Docs**: https://docs.browser-use.com
412
- - **Discord**: https://link.browser-use.com/discord
413
- - **Cloud**: https://cloud.browser-use.com
414
-
415
- ## 🤝 Support
416
-
417
- - See [GitHub Issues](https://github.com/browser-use/browser-use/issues)
418
- - Ask questions in the [Discord community](https://link.browser-use.com/discord)
419
- - Enterprise support: support@browser-use.com
1
+ ---
2
+ id: "browser-use"
3
+ title: "Browser-Use Skill"
4
+ category: "infrastructure"
5
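+ tags: ["browser-use skill", "📚 overview", "🚀 quick start", "create the environment", "install browser-use and chromium", "browser use (recommended - fastest + lowest cost)", "or use another llm", "🏗️ core concepts", "🛠️ development rules", "🎯 development commands"]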
6
+ triggers: []
7
+ dependencies: []
8
+ source: "E:/Bobo's Coding cache/.claude/skills/browser-use"
9
+ ---
10
+
11
+ # Browser-Use Skill
12
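+
13
+ > AI-driven browser automation library - use an LLM to control the browser and complete complex tasks
14
+
15
+ ## 📚 Overview
16
+
17
+ Browser-Use is an async Python >= 3.11 library that drives a browser with AI through an LLM plus CDP (Chrome DevTools Protocol). Its core architecture lets AI agents navigate web pages autonomously, interact with elements, and complete complex tasks by processing HTML and making LLM-driven decisions.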
18
+
19
+ ## 🚀 Quick Start
20
+
21
+ ### 1. Install Browser-Use
22
+
23
+ ```bash
24
+ # Create the environment
25
+ pip install uv
26
+ uv venv --python 3.12
27
+ source .venv/bin/activate
28
+ # On Windows use: .venv\Scripts\activate
29
+
30
+ # Install browser-use and chromium
31
+ uv pip install browser-use
32
+ uvx browser-use install
33
+ ```
34
+
35
+ ### 2. Choose your preferred LLM
36
+
37
+ Create a `.env` file and add your API key:
38
+
39
+ ```bash
40
+ # Browser Use (recommended - fastest + lowest cost)
41
+ BROWSER_USE_API_KEY=your_key_here
42
+ # Get $10 of free credit at https://cloud.browser-use.com/new-api-key
43
+
44
+ # Or use another LLM
45
+ OPENAI_API_KEY=your_key_here
46
+ ANTHROPIC_API_KEY=your_key_here
47
+ GOOGLE_API_KEY=your_key_here
48
+ ```
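
The `.env` example above lists four key names. The following is a minimal sketch of how a script might check which of them is configured before building an LLM client, preferring the Browser Use key as the comments in the example recommend. `KEY_PREFERENCE` and `pick_key_name` are hypothetical names, and the sketch only reads environment variables; it does not call browser-use.

```python
import os

# Environment variable names as listed in the .env example, in the order shown
# there (the Browser Use key is the recommended one).
KEY_PREFERENCE = [
    "BROWSER_USE_API_KEY",
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "GOOGLE_API_KEY",
]

def pick_key_name(env=None):
    """Return the first configured key name, or None if none is set."""
    env = os.environ if env is None else env
    for name in KEY_PREFERENCE:
        # Treat empty or whitespace-only values as "not configured".
        if env.get(name, "").strip():
            return name
    return None

print(pick_key_name() or "no API key configured")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise without touching the real environment.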
49
+
50
+ ### 3. Run your first Agent
51
+
52
+ ```python
53
+ from browser_use import Agent, ChatBrowserUse
54
+ from dotenv import load_dotenv
55
+ import asyncio
56
+
57
+ load_dotenv()
58
+
59
+ async def main():
60
+     llm = ChatBrowserUse()
61
+     task = "Find the number 1 post on Hacker News"
62
+     agent = Agent(task=task, llm=llm)
63
+     await agent.run()
64
+
65
+ if __name__ == "__main__":
66
+     asyncio.run(main())
67
+ ```
68
+
69
+ ### 4. Production deployment
70
+
71
+ Use the `@sandbox` decorator to deploy to production and scale to millions of agents:
72
+
73
+ ```python
74
+ from browser_use import Browser, sandbox, ChatBrowserUse
75
+ from browser_use.agent.service import Agent
76
+ import asyncio
77
+
78
+ @sandbox(cloud_profile_id='your-profile-id')
79
+ async def production_task(browser: Browser):
80
+     agent = Agent(
81
+         task="Your authenticated task",
82
+         browser=browser,
83
+         llm=ChatBrowserUse()
84
+     )
85
+     await agent.run()
86
+
87
+ asyncio.run(production_task())
88
+ ```
89
+
90
+ Sync local cookies to the cloud:
91
+
92
+ ```bash
93
+ export BROWSER_USE_API_KEY=your_key && curl -fsSL https://browser-use.com/profile.sh | sh
94
+ ```
95
+
96
+ ## 🏗️ Core Concepts
97
+
98
+ ### Agent basics
99
+
100
+ ```python
101
+ from browser_use import Agent, ChatBrowserUse
102
+
103
+ agent = Agent(
104
+     task="Search for the latest AI news",
105
+     llm=ChatBrowserUse(),
106
+ )
107
+
108
+ async def main():
109
+     history = await agent.run(max_steps=100)
110
+
111
+     # Access useful information
112
+     history.urls()  # List of visited URLs
113
+     history.action_names()  # Names of the actions executed
114
+     history.final_result()  # Final extracted content
115
+     history.is_successful()  # Check whether the run completed successfully
116
+ ```
117
+
118
+ ### Browser configuration
119
+
120
+ ```python
121
+ from browser_use import Agent, Browser, ChatBrowserUse
122
+
123
+ browser = Browser(
124
+     headless=False,  # Show the browser window
125
+     window_size={'width': 1000, 'height': 700},
126
+     proxy=ProxySettings(server='http://host:8080'),
127
+     user_data_dir='./profile',  # Keep the login session
128
+ )
129
+
130
+ agent = Agent(
131
+     task='Search for Browser Use',
132
+     browser=browser,
133
+     llm=ChatBrowserUse(),
134
+ )
135
+ ```
136
+
137
+ ### Tools
138
+
139
+ Custom tools extend what the agent can do:
140
+
141
+ ```python
142
+ from browser_use import Tools, ActionResult, Browser
143
+
144
+ tools = Tools()
145
+
146
+ @tools.action('Ask a human a question')
147
+ def ask_human(question: str, browser: Browser) -> ActionResult:
148
+     answer = input(f'{question} > ')
149
+     return f'The human answered: {answer}'
150
+
151
+ agent = Agent(
152
+     task='Ask a human for help',
153
+     llm=llm,
154
+     tools=tools,
155
+ )
156
+ ```
157
+
158
+ ## 🛠️ Development Rules
159
+
160
+ ### Core principles
161
+
162
+ 1. **Always use `uv` instead of `pip`**
163
+
164
+ ```bash
165
+ uv venv --python 3.11
166
+ source .venv/bin/activate
167
+ uv sync
168
+ ```
169
+
170
+ 2. **Type-safe coding**
171
+    - Use Pydantic v2 models for all internal operations
172
+    - Use modern Python type hints: `str | None` rather than `Optional[str]`
173
+
174
+ 3. **Pre-commit formatting**
175
+    - Always run pre-commit before submitting a PR
176
+
177
+ 4. **Use descriptive names and docstrings**
178
+
179
+ 5. **Return structured content as `ActionResult`**
180
+    - Helps the agent reason better
181
+
182
+ 6. **Never create random examples**
183
+    - Use inline code in the terminal when testing features
184
+
185
+ 7. **Recommend the `ChatBrowserUse` model by default**
186
+    - Highest accuracy + fastest speed + lowest token cost
187
+
188
+ ## 🎯 Development Commands
189
+
190
+ ```bash
191
+ # Setup
192
+ uv venv --python 3.11
193
+ source .venv/bin/activate
194
+ uv sync
195
+
196
+ # Tests
197
+ uv run pytest -vxs tests/ci # CI tests
198
+ uv run pytest -vxs tests/ # All tests
199
+
200
+ # Quality checks
201
+ uv run pyright # Type checking
202
+ uv run ruff check --fix # Linting
203
+ uv run ruff format # Formatting
204
+ uv run pre-commit run --all-files # Pre-commit hooks
205
+
206
+ # MCP server mode
207
+ uvx browser-use[cli] --mcp
208
+ ```
209
+
210
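+ ## 📖 Available Tools (Actions)
211
+
212
+ ### Navigation and browser control
213
+
214
+ - `search` - Search queries (DuckDuckGo, Google, Bing)
215
+ - `navigate` - Navigate to a URL
216
+ - `go_back` - Go back in browser history
217
+ - `wait` - Wait for a specified number of seconds
218
+
219
+ ### Page interaction
220
+
221
+ - `click` - Click an element by its index
222
+ - `input` - Enter text into a form field
223
+ - `upload_file` - Upload a file
224
+ - `scroll` - Scroll the page
225
+ - `find_text` - Scroll to specific text on the page
226
+ - `send_keys` - Send special keys (Enter, Escape, etc.)
227
+
228
+ ### JavaScript execution
229
+
230
+ - `evaluate` - Execute custom JavaScript code on the page
231
+
232
+ ### Tab management
233
+
234
+ - `switch` - Switch between browser tabs
235
+ - `close` - Close a browser tab
236
+
237
+ ### Content extraction
238
+
239
+ - `extract` - Extract data from a web page using the LLM
240
+
241
+ ### Visual analysis
242
+
243
+ - `screenshot` - Request a screenshot of the next browser state
244
+
245
+ ### Form controls
246
+
247
+ - `dropdown_options` - Get the values of dropdown options
248
+ - `select_dropdown` - Select a dropdown option
249
+
250
+ ### File operations
251
+
252
+ - `write_file` - Write content to a file
253
+ - `read_file` - Read the contents of a file
254
+ - `replace_file` - Replace text in a file
255
+
256
+ ### Task completion
257
+
258
+ - `done` - Complete the task (always available)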
259
+
260
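+ ## 💡 Prompting Tips
261
+
262
+ ### 1. Specific vs. open-ended
263
+
264
+ **✅ Specific (recommended)**
265
+
266
+ ```python
267
+ task = """
268
+ 1. Go to https://quotes.toscrape.com/
269
+ 2. Use the extract action to query "the first 3 quotes and their authors"
270
+ 3. Use the write_file action to save the results to quotes.csv
271
+ 4. Run a Google search for the first quote and find when it was written
272
+ """
273
+ ```
274
+
275
+ **❌ Open-ended**
276
+
277
+ ```python
278
+ task = "Go to the web and make money"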
279
+ ```
280
+
281
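+ ### 2. Name actions directly
282
+
283
+ When you know exactly what the agent should do, refer to the action by name:
284
+
285
+ ```python
286
+ task = """
287
+ 1. Use the search action to find "Python tutorials"
288
+ 2. Use click to open the first result in a new tab
289
+ 3. Use the scroll action to scroll down 2 pages
290
+ 4. Use extract to pull the names of the first 5 items
291
+ 5. If the page has not loaded, wait 2 seconds, refresh, and wait 10 seconds
292
+ 6. Use the send_keys action to enter "Tab Tab ArrowDown Enter"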
293
+ """
294
+ ```
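
The numbered task prompts above name actions such as `search`, `click`, and `send_keys` directly. One way to keep such prompts consistent is to assemble them from a list of steps and check each action name against the actions listed earlier in this file. The sketch below is plain Python and does not call browser-use; `KNOWN_ACTIONS`, `unknown_actions`, and `build_task` are hypothetical helpers, and the sentence template is an assumption about how a prompt might be worded.

```python
# Action names copied from the action list earlier in this file.
KNOWN_ACTIONS = {
    "search", "navigate", "go_back", "wait", "click", "input", "upload_file",
    "scroll", "find_text", "send_keys", "evaluate", "switch", "close",
    "extract", "screenshot", "dropdown_options", "select_dropdown",
    "write_file", "read_file", "replace_file", "done",
}

def unknown_actions(steps):
    """Return, sorted, the action names in steps that are not in KNOWN_ACTIONS."""
    return sorted({action for action, _ in steps if action not in KNOWN_ACTIONS})

def build_task(steps):
    """Render (action, instruction) pairs as a numbered task prompt."""
    lines = [
        f"{i}. Use the {action} action to {instruction}"
        for i, (action, instruction) in enumerate(steps, start=1)
    ]
    return "\n".join(lines)

steps = [
    ("search", 'find "Python tutorials"'),
    ("click", "open the first result in a new tab"),
    ("scroll", "scroll down 2 pages"),
    ("extract", "pull the names of the first 5 items"),
]
missing = unknown_actions(steps)
task = build_task(steps)
print(task)
```

The resulting `task` string could then be passed to `Agent(task=task, llm=...)` in the same way as the hand-written prompts.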
295
+
296
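+ ### 3. Work around interaction problems with keyboard navigation
297
+
298
+ Sometimes a button cannot be clicked (you have found a bug in the library, so file an issue). The good news is that keyboard navigation can usually get around it!
299
+
300
+ ```python
301
+ task = """
302
+ If the submit button cannot be clicked:
303
+ 1. Use the send_keys action to enter "Tab Tab Enter" to navigate to it and activate it
304
+ 2. Or use send_keys to enter "ArrowDown ArrowDown Enter" to submit the form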
305
+ """
306
+ ```
307
+
308
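+ ### 4. Custom action integration
309
+
310
+ ```python
311
+ @controller.action("Get the 2FA code from the authenticator app")
312
+ async def get_2fa_code():
313
+     # Your implementation
314
+     pass
315
+
316
+ task = """
317
+ Log in with 2FA:
318
+ 1. Enter the username/password
319
+ 2. When prompted for 2FA, use the get_2fa_code action
320
+ 3. Never try to extract the 2FA code from the page manually
321
+ 4. Always use the get_2fa_code action to get the authentication code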
322
+ """
323
+ ```
324
+
325
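+ ### 5. Error recovery
326
+
327
+ ```python
328
+ task = """
329
+ Robust data extraction:
330
+ 1. Go to openai.com to find their CEO
331
+ 2. If navigation fails because of anti-bot protection:
332
+    - Use Google search to find the CEO
333
+ 3. If the page times out, use go_back and try an alternative approach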
334
+ """
335
+ ```
336
+
337
+ ## 🌟 Advanced Features
338
+
339
+ ### Structured output
340
+
341
+ Use a Pydantic model to get structured output:
342
+
343
+ ```python
344
+ from pydantic import BaseModel
345
+
346
+ class Quote(BaseModel):
347
+     text: str
348
+     author: str
349
+
350
+ agent = Agent(
351
+     task="Extract the first 3 quotes",
352
+     llm=llm,
353
+     output_model_schema=Quote,
354
+ )
355
+
356
+ history = await agent.run()
357
+ structured_data = history.structured_output
358
+ ```
359
+
360
+ ### Remote browser (Browser-Use Cloud)
361
+
362
+ ```python
363
+ from browser_use import Browser, ChatBrowserUse
364
+
365
+ # Simple: use the Browser-Use cloud browser service
366
+ browser = Browser(use_cloud=True)
367
+
368
+ # Advanced: configure cloud browser parameters
369
+ browser = Browser(
370
+     cloud_profile_id='your-profile-id',  # Specific browser profile
371
+     cloud_proxy_country_code='us',  # Proxy location
372
+     cloud_timeout=30,  # Session timeout (minutes)
373
+ )
374
+ ```
375
+
376
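+ ### MCP integration
377
+
378
+ Browser-Use supports two modes:
379
+
380
+ 1. **As an MCP server**: exposes browser automation tools to MCP clients (such as Claude Desktop)
381
+ 2. **As an MCP client**: the Agent can connect to external MCP servers to extend its capabilities
382
+
383
+ ```bash
384
+ # Run as an MCP server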
385
+ uvx browser-use[cli] --mcp
386
+ ```
387
+
388
+ ## 📂 Project Structure
389
+
390
+ ```
391
+ browser_use/
392
+ ├── agent/                  # Agent core logic
393
+ │   ├── service.py          # Main orchestrator
394
+ │   ├── views.py            # Pydantic models
395
+ │   └── system_prompt*.md   # Agent prompts
396
+ ├── browser/                # Browser management
397
+ │   ├── session.py          # BrowserSession + CDP client
398
+ │   └── profile.py          # Browser configuration and launch arguments
399
+ ├── dom/                    # DOM processing
400
+ │   └── service.py          # DomService extraction and processing
401
+ ├── tools/                  # Action registry
402
+ │   └── service.py          # Tool definitions
403
+ ├── llm/                    # LLM integration layer
404
+ └── mcp/                    # MCP integration
405
+     └── client.py           # MCP client connection
406
+ ```
407
+
408
+ ## 🔗 Related Resources
409
+
410
+ - **GitHub**: https://github.com/browser-use/browser-use
411
+ - **Docs**: https://docs.browser-use.com
412
+ - **Discord**: https://link.browser-use.com/discord
413
+ - **Cloud**: https://cloud.browser-use.com
414
+
415
+ ## 🤝 Support
416
+
417
+ - See [GitHub Issues](https://github.com/browser-use/browser-use/issues)
418
+ - Ask questions in the [Discord community](https://link.browser-use.com/discord)
419
+ - Enterprise support: support@browser-use.com
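
In the hunk above, the 419 removed lines and the 419 added lines show the same visible text. One possible explanation, which this view cannot confirm, is a change in line endings or other invisible whitespace. The sketch below shows how that hypothesis could be checked given the raw bytes of both versions. `normalize_newlines` and `differs_only_in_line_endings` are hypothetical helpers, and the sample bytes are invented for illustration rather than taken from the package.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def differs_only_in_line_endings(old: bytes, new: bytes) -> bool:
    """True when the inputs differ byte-for-byte but match after normalization."""
    return old != new and normalize_newlines(old) == normalize_newlines(new)

# Invented sample data standing in for two versions of the same file.
old_version = b"# Browser-Use Skill\r\n\r\n## Overview\r\n"
new_version = b"# Browser-Use Skill\n\n## Overview\n"
result = differs_only_in_line_endings(old_version, new_version)
print(result)
```

A `False` result for a real pair of files would point to some other cause, such as trailing whitespace or an encoding change.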