openclacky 1.1.2 → 1.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (58)
  1. checksums.yaml +4 -4
  2. data/.clacky/skills/gem-release/SKILL.md +27 -31
  3. data/CHANGELOG.md +30 -0
  4. data/Dockerfile +28 -0
  5. data/README.md +4 -0
  6. data/README_CN.md +198 -0
  7. data/docs/engineering-article.md +343 -0
  8. data/lib/clacky/agent/llm_caller.rb +2 -5
  9. data/lib/clacky/agent/session_serializer.rb +4 -0
  10. data/lib/clacky/agent.rb +22 -1
  11. data/lib/clacky/brand_config.rb +87 -5
  12. data/lib/clacky/cli.rb +1 -1
  13. data/lib/clacky/client.rb +15 -11
  14. data/lib/clacky/message_format/anthropic.rb +30 -2
  15. data/lib/clacky/message_format/bedrock.rb +13 -1
  16. data/lib/clacky/message_format/open_ai.rb +5 -1
  17. data/lib/clacky/providers.rb +34 -0
  18. data/lib/clacky/server/channel/adapters/dingtalk/adapter.rb +142 -5
  19. data/lib/clacky/server/channel/adapters/dingtalk/api_client.rb +309 -0
  20. data/lib/clacky/server/http_server.rb +130 -15
  21. data/lib/clacky/server/session_registry.rb +9 -6
  22. data/lib/clacky/ui2/ui_controller.rb +14 -0
  23. data/lib/clacky/ui_interface.rb +14 -0
  24. data/lib/clacky/utils/model_pricing.rb +96 -25
  25. data/lib/clacky/version.rb +1 -1
  26. data/lib/clacky/web/app.css +1286 -1116
  27. data/lib/clacky/web/brand.js +20 -5
  28. data/lib/clacky/web/i18n.js +42 -0
  29. data/lib/clacky/web/index.html +26 -7
  30. data/lib/clacky/web/onboard.js +6 -0
  31. data/lib/clacky/web/sessions.js +194 -11
  32. data/lib/clacky/web/settings.js +51 -10
  33. data/lib/clacky/web/skills.js +53 -31
  34. data/lib/clacky/web/vendor/hljs/highlight.min.js +1244 -0
  35. data/lib/clacky/web/vendor/hljs/hljs-theme.css +95 -0
  36. data/scripts/build/lib/apt.sh +30 -10
  37. data/scripts/build/lib/network.sh +3 -2
  38. data/scripts/install.sh +30 -9
  39. data/scripts/install_browser.sh +2 -1
  40. data/scripts/install_full.sh +2 -1
  41. data/scripts/install_rails_deps.sh +30 -9
  42. data/scripts/install_system_deps.sh +30 -9
  43. metadata +7 -17
  44. data/docs/HOW-TO-USE-CN.md +0 -96
  45. data/docs/HOW-TO-USE.md +0 -94
  46. data/docs/browser-cdp-native-design.md +0 -195
  47. data/docs/c-end-user-positioning.md +0 -64
  48. data/docs/config.example.yml +0 -27
  49. data/docs/deploy-architecture.md +0 -619
  50. data/docs/deploy_subagent_design.md +0 -540
  51. data/docs/install-script-simplification.md +0 -89
  52. data/docs/memory-architecture.md +0 -343
  53. data/docs/openclacky_cloud_api_reference.md +0 -584
  54. data/docs/security-design.md +0 -109
  55. data/docs/session-management-redesign.md +0 -202
  56. data/docs/system-skill-authoring-guide.md +0 -47
  57. data/docs/why-developer.md +0 -371
  58. data/docs/why-openclacky.md +0 -266
@@ -1,195 +0,0 @@
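- # Browser Tool: Native CDP Integration Design
-
- ## Background and Goals
-
- The existing browser tool depends on `agent-browser` (a Rust binary distributed via npm) and launches a separate Chrome instance on every use. This causes the following problems:
-
- - The user's login state and cookies cannot be reused
- - npm and agent-browser have to be installed separately
- - Every task pops up a new Chrome window, which makes for a poor experience
- - The dependency chain is long: npm → agent-browser binary → Chrome for Testing
-
- **Core goal**: Clacky reuses the Chrome instance the user already has open, inheriting all login state and cookies, with zero extra dependencies.
-
- ---
-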
- ## Key Changes in Chrome 146
-
- ### Timeline
-
- | Chrome version | Behavior |
- |----------------|----------|
- | ≤ 135 | `--remote-debugging-port` can connect to the default profile (not recommended, but it works) |
- | 136 ~ 145 | The default profile is blocked; an isolated profile must be opened with `--user-data-dir` (empty, with no login state) |
- | **146+** | Adds an **autoConnect toggle**: switch it on once and connect directly to the real browser, consent-based ✅ |
-
- ### User Steps (one-time)
-
- 1. Open `chrome://inspect/#remote-debugging`
- 2. Check **"Allow remote debugging for this browser instance"**
- 3. Chrome starts a CDP server on `127.0.0.1:9222`
-
- After that, each time Clacky connects, Chrome shows an **"Allow remote debugging?"** confirmation prompt, and the user only has to click Allow.
-
- ---
-
- ## Technical Approach: Pure-Ruby CDP Client
-
- ### Key Finding
-
- Chrome 146's autoConnect mode **does not expose the standard `/json` HTTP endpoint** (it returns 404). Instead, it publishes the connection details through a file:
-
- ```
- ~/Library/Application Support/Google/Chrome/DevToolsActivePort
- ```
-
- File format:
- ```
- 9222
- /devtools/browser/98823857-17b3-48ec-8f24-5805e3012a05
- ```
-
- The first line is the port and the second line is the WebSocket path. Joined together, they give:
-
- ```
- ws://127.0.0.1:9222/devtools/browser/98823857-17b3-48ec-8f24-5805e3012a05
- ```
-
- ### Connection Flow
-
- ```
- 1. Read the DevToolsActivePort file
-
- 2. Open a WebSocket connection to the Browser endpoint
-
- 3. Target.getTargets → list all real tabs
-
- 4. Target.attachToTarget(targetId, flatten: true) → obtain a sessionId
-
- 5. Send CDP commands to the chosen tab via the sessionId
- ```
-
- ### Dependencies
-
- **No new dependencies**; only what is already available:
- - `websocket-driver` (already in the gemspec)
- - `socket` (Ruby standard library)
- - `net/http` (Ruby standard library)
- - `json` (Ruby standard library)
-
- ### Verified Capabilities
-
- Verified with a script in hands-on testing (2026-03-20):
-
- - ✅ Read DevToolsActivePort and discover port 9222
- - ✅ Connect to the Browser endpoint over WebSocket
- - ✅ `Target.getTargets` lists all of the user's real tabs (with titles and URLs)
- - ✅ `Target.attachToTarget` attaches to a specific tab
- - ✅ `Runtime.evaluate` executes JS (to read the URL, title, etc.)
- - ✅ `Page.captureScreenshot` takes a screenshot
- - ✅ `Target.createTarget` opens a new tab and navigates
- - ✅ Reuses the user's login state (visiting yafeilee.com/admin goes straight to the admin backend with no re-login)
-
- ---
-
- ## Implementation Plan
-
- ### Layer 1: Discovery
-
- ```ruby
- require "socket"
-
- # Detect whether Chrome has remote debugging enabled
- def discover_chrome_cdp
-   port_file = File.expand_path(
-     "~/Library/Application Support/Google/Chrome/DevToolsActivePort"
-   )
-   return nil unless File.exist?(port_file)
-
-   lines = File.read(port_file).strip.split("\n")
-   port = lines[0].to_i
-   path = lines[1]
-
-   # Verify that the port is actually listening
-   TCPSocket.new("127.0.0.1", port).close
-   { port: port, path: path, ws_url: "ws://127.0.0.1:#{port}#{path}" }
- rescue Errno::ECONNREFUSED
-   nil
- end
- ```
-
- **Guidance when nothing is discovered**:
-
- > "Open `chrome://inspect/#remote-debugging` in the Chrome address bar
- > and check 'Allow remote debugging for this browser instance'. You only need to do this once."
-
- ### Layer 2: CDP Client (Communication Layer)
-
- Create `lib/clacky/tools/cdp_client.rb`, implementing:
-
- - WebSocket connection management
- - Command sending (with ids) and response matching
- - Session management (browser-level vs. tab-level)
- - Event listening (`Page.loadEventFired`, etc.)
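
The command/response matching and session handling listed above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `cdp_client.rb`: the class name `CdpMessageTracker` and its methods are hypothetical, and the WebSocket transport is left out so that only the id bookkeeping is shown. It assumes CDP's flat session mode (the `flatten: true` attach from the connection flow), in which a tab-level command carries a top-level `sessionId` field.

```ruby
require "json"

# Hypothetical helper (not part of openclacky): assigns ids to outgoing CDP
# commands and matches incoming frames back to them. Transport is out of scope.
class CdpMessageTracker
  def initialize
    @next_id = 0
    @pending = {} # id => method name of the command still awaiting a response
  end

  # Build the JSON text for a command. Pass session_id for tab-level commands;
  # omit it for browser-level commands such as Target.getTargets.
  def build_command(method, params = {}, session_id: nil)
    @next_id += 1
    message = { id: @next_id, method: method, params: params }
    message[:sessionId] = session_id if session_id
    @pending[@next_id] = method
    JSON.generate(message)
  end

  # Classify an incoming frame: a response carries the "id" of a pending
  # command, while an event has a "method" but no "id".
  def handle_incoming(raw)
    data = JSON.parse(raw)
    if data.key?("id")
      method = @pending.delete(data["id"])
      { type: :response, method: method, result: data["result"], error: data["error"] }
    else
      { type: :event, method: data["method"], params: data["params"], session_id: data["sessionId"] }
    end
  end

  def pending_count
    @pending.size
  end
end
```

Keeping the bookkeeping separate from the socket is one possible design choice: it lets the matching logic be exercised without a running Chrome.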
-
- ### Layer 3: Browser Tool Rework
-
- Rework strategy for `lib/clacky/tools/browser.rb`:
-
- ```
- Priority 1: DevToolsActivePort detected → the user's real Chrome (Native CDP)
- Priority 2: Fallback → the existing agent-browser (backward compatible)
- ```
-
- ### macOS Path (other platforms to be filled in)
-
- | Platform | DevToolsActivePort path |
- |----------|-------------------------|
- | macOS | `~/Library/Application Support/Google/Chrome/DevToolsActivePort` |
- | Linux | `~/.config/google-chrome/DevToolsActivePort` |
- | Windows | `%LOCALAPPDATA%\Google\Chrome\User Data\DevToolsActivePort` |
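
A platform lookup based on the table above might look like the sketch below. The function name `devtools_active_port_path` and its keyword parameters are hypothetical, introduced only for illustration. The paths are copied from the table; the heading marks the non-macOS entries as still to be filled in, so treat those two as unconfirmed.

```ruby
require "rbconfig"

# Hypothetical helper: map a host OS string to the DevToolsActivePort location
# listed in the table. Returns nil for unrecognized platforms, or on Windows
# when LOCALAPPDATA is not set.
def devtools_active_port_path(host_os: RbConfig::CONFIG["host_os"], home: Dir.home, env: ENV)
  case host_os
  when /darwin/
    File.join(home, "Library/Application Support/Google/Chrome/DevToolsActivePort")
  when /linux/
    File.join(home, ".config/google-chrome/DevToolsActivePort")
  when /mswin|mingw|cygwin/
    local = env["LOCALAPPDATA"]
    local && File.join(local, "Google", "Chrome", "User Data", "DevToolsActivePort")
  end
end
```

Taking `host_os`, `home`, and `env` as parameters is a convenience for testing; a caller on the real machine would simply call `devtools_active_port_path` with no arguments.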
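-
- ---
-
- ## Key Questions and Conclusions
-
- ### Q: The `/json` endpoint returns 404. What now?
-
- Chrome 146's autoConnect mode does not use HTTP `/json`; it uses the `DevToolsActivePort` file plus a direct WebSocket connection instead.
-
- ### Q: Is the ferrum gem suitable?
-
- **No.** `Ferrum::Browser.new(url: "http://localhost:9222")` can connect to an existing Chrome, but it creates a new incognito browser context and does not reuse the user's tabs or login state. ferrum has to be bypassed in favor of raw CDP.
-
- ### Q: Does the user have to click Allow on every connection?
-
- Yes. Chrome 146 shows the confirmation prompt for every new WebSocket connection. This is Chrome's security consent mechanism and cannot be bypassed, but the experience is acceptable (the user clearly knows the browser is being controlled).
-
- ### Q: Is agent-browser retired entirely?
-
- A gradual migration is recommended: run both in parallel at first, with Native CDP as the preferred path and agent-browser as the fallback, then remove agent-browser once things are stable.
-
- ---
-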
- ## References
-
- - [Introduction to Chrome 146 autoConnect - DEV Community](https://dev.to/minatoplanb/chrome-146-finally-lets-ai-control-your-real-browser-google-oauth-included-28b7)
- - [One Toggle That Changed Browser Automation - LinkedIn](https://www.linkedin.com/posts/surajadsul_one-toggle-that-changed-the-browser-automation-activity-7439161929664864257-0v8z)
- - [Chrome DevTools MCP Connection Modes Explained](https://www.heyuan110.com/posts/ai/2026-03-17-chrome-devtools-mcp-guide/)
- - [agent-browser #412: Support --auto-connect](https://github.com/vercel-labs/agent-browser/issues/412)
- - [Chrome DevTools Protocol official documentation](https://chromedevtools.github.io/devtools-protocol/)
- - [DevToolsActivePort WebSocket path explanation](https://deepwiki.com/ChromeDevTools/chrome-devtools-mcp/2.3-connection-modes)
- - [ferrum issue #320: Connect to existing Chrome](https://github.com/rubycdp/ferrum/issues/320)
- - [Chrome remote-debugging security changes](https://developer.chrome.com/blog/remote-debugging-port)
-
- ---
-
- ## Test Script
-
- The prototype verification script is at `tmp/cdp_test.rb`.
-
- Prerequisites:
- 1. Remote debugging is enabled in Chrome (`chrome://inspect/#remote-debugging`)
- 2. Click Allow in the prompt
-
- ```bash
- bundle exec ruby tmp/cdp_test.rb
- ```
@@ -1,64 +0,0 @@
- # C-End User Positioning
-
- > Date: 2026-03-30
-
- ---
-
- ## Market Context
-
- The "OpenClaw ecosystem" has exploded in 2026. Key players:
-
- - **OpenClaw**: open-source, self-hosted, community Skills. Designed for technical users who configure everything themselves.
- - **QClaw**: Tencent's fork. Bundled Kimi model, WeChat binding. Mass-market but Tencent-ecosystem only.
- - **Others** (Wukong, etc.): same lane.
-
- OpenClaw has 5,700+ Skills, but almost all are open-source, free, and easily copied. The ecosystem lacks **expertise-backed, production-grade Skills worth paying for**.
-
- ---
-
- ## Who openclacky Is For
-
- **Ordinary users, not technical geeks.**
-
- The target user knows OpenClaw exists, has heard about "raising a lobster", but can't or doesn't want to:
- - configure Docker / environment / webhooks
- - manage their own API keys without knowing what they'll spend
- - troubleshoot when a long task breaks halfway
-
- They want to use a lobster built by an expert (a lawyer, a trader, an SEO specialist), not build one themselves.
-
- > Core insight: **OpenClaw is built for people who create Skills. openclacky is built for people who use them.**
-
- ---
-
- ## Why openclacky Over OpenClaw: 3 Core Reasons
-
- ### 1. Zero-friction IM setup: the strongest differentiator
-
- OpenClaw requires users to manually configure webhooks, tokens, and config files to connect WeChat / Feishu / WeCom. The technical barrier is high, and most ordinary users give up.
-
- openclacky uses **AI-automated channel setup**: one sentence, and the AI configures the IM connection for you, with no plugins, no docs, and no engineering knowledge required. This is a genuine technical moat.
-
- ### 2. Built for China, natively
-
- - No VPN required, no overseas credit card
- - WeChat / Feishu / WeCom are the primary daily tools for Chinese users; openclacky treats them as first-class citizens
- - Supports domestic models (DeepSeek, Kimi, etc.) out of the box
- - QClaw is domestic too, but locked to Tencent's ecosystem and model choices
-
- ### 3. Cost transparency and long-task reliability
-
- - Real-time token cost tracking: users always know what they're spending
- - Automatic compression (up to 90% savings via Insert-then-Compress + Prompt Caching)
- - Long tasks don't break: sub-agent isolation + Time Machine architecture keeps context intact
-
- ---
-
- ## The User Progression
-
- ```
- Can use it    →  Dare to use it  →  Keep using it
- (zero setup)     (cost clarity)     (tasks don't break)
- ```
-
- Each of the 3 reasons maps directly to one stage of this progression.
@@ -1,27 +0,0 @@
- # Clacky Configuration File
- # This is a top-level array of model configurations
- # The first model in the array is used as the default
-
- # Claude Sonnet 4 (default - first in array)
- - model: "claude-sonnet-4"
-   api_key: "your-api-key-here"
-   base_url: "https://api.anthropic.com"
-   anthropic_format: true
-
- # Claude Opus 4
- - model: "claude-opus-4"
-   api_key: "your-api-key-here"
-   base_url: "https://api.anthropic.com"
-   anthropic_format: true
-
- # OpenAI GPT-4
- - model: "gpt-4"
-   api_key: "your-openai-api-key-here"
-   base_url: "https://api.openai.com/v1"
-   anthropic_format: false
-
- # Custom model (e.g., local or third-party)
- - model: "custom-model"
-   api_key: "your-custom-api-key"
-   base_url: "https://your-api-endpoint.com"
-   anthropic_format: false
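
As a rough illustration of how a file shaped like this example could be consumed, the sketch below parses the top-level array and treats the first entry as the default, as the file's own comments describe. The function name `load_model_configs` and the returned hash shape are hypothetical; this is not taken from openclacky's actual config loader.

```ruby
require "yaml"

# Hypothetical loader: parse the top-level array of model configurations and
# treat the first entry as the default model.
def load_model_configs(yaml_text)
  configs = YAML.safe_load(yaml_text) || []
  raise ArgumentError, "expected a top-level array of model configs" unless configs.is_a?(Array)

  { default: configs.first, all: configs }
end

# Two entries copied from the example file above.
example = <<~CONFIG
  - model: "claude-sonnet-4"
    api_key: "your-api-key-here"
    base_url: "https://api.anthropic.com"
    anthropic_format: true
  - model: "gpt-4"
    api_key: "your-openai-api-key-here"
    base_url: "https://api.openai.com/v1"
    anthropic_format: false
CONFIG

loaded = load_model_configs(example)
puts loaded[:default]["model"] # name of the first (default) model
```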