openclacky 1.1.2 → 1.1.3

Files changed (38)
  1. checksums.yaml +4 -4
  2. data/.clacky/skills/gem-release/SKILL.md +27 -31
  3. data/CHANGELOG.md +14 -0
  4. data/Dockerfile +28 -0
  5. data/docs/engineering-article.md +343 -0
  6. data/lib/clacky/agent/llm_caller.rb +1 -5
  7. data/lib/clacky/cli.rb +1 -1
  8. data/lib/clacky/message_format/anthropic.rb +17 -1
  9. data/lib/clacky/providers.rb +34 -0
  10. data/lib/clacky/server/channel/adapters/dingtalk/adapter.rb +142 -5
  11. data/lib/clacky/server/channel/adapters/dingtalk/api_client.rb +309 -0
  12. data/lib/clacky/ui2/ui_controller.rb +14 -0
  13. data/lib/clacky/ui_interface.rb +14 -0
  14. data/lib/clacky/utils/model_pricing.rb +96 -25
  15. data/lib/clacky/version.rb +1 -1
  16. data/lib/clacky/web/app.css +8 -0
  17. data/lib/clacky/web/index.html +1 -1
  18. data/lib/clacky/web/onboard.js +6 -0
  19. data/lib/clacky/web/settings.js +17 -5
  20. data/scripts/build/lib/apt.sh +30 -10
  21. data/scripts/build/lib/network.sh +3 -2
  22. data/scripts/install.sh +30 -9
  23. metadata +3 -16
  24. data/docs/HOW-TO-USE-CN.md +0 -96
  25. data/docs/HOW-TO-USE.md +0 -94
  26. data/docs/browser-cdp-native-design.md +0 -195
  27. data/docs/c-end-user-positioning.md +0 -64
  28. data/docs/config.example.yml +0 -27
  29. data/docs/deploy-architecture.md +0 -619
  30. data/docs/deploy_subagent_design.md +0 -540
  31. data/docs/install-script-simplification.md +0 -89
  32. data/docs/memory-architecture.md +0 -343
  33. data/docs/openclacky_cloud_api_reference.md +0 -584
  34. data/docs/security-design.md +0 -109
  35. data/docs/session-management-redesign.md +0 -202
  36. data/docs/system-skill-authoring-guide.md +0 -47
  37. data/docs/why-developer.md +0 -371
  38. data/docs/why-openclacky.md +0 -266
data/docs/HOW-TO-USE.md DELETED
@@ -1,94 +0,0 @@
1
- # How to Use OpenClacky
2
-
3
- ## Installation
4
-
5
- ```bash
6
- gem install openclacky
7
- ```
8
-
9
- **Requirements:** Ruby >= 3.1
10
-
11
- ## Quick Start
12
-
13
- ### 1. Start Clacky
14
-
15
- ```bash
16
- clacky
17
- ```
18
-
19
- ### 2. Configure API Key (First Time)
20
-
21
- In the chat interface, type:
22
-
23
- ```
24
- /config
25
- ```
26
-
27
- Then follow the prompts to set your API key:
28
- - **OpenAI**: Get key from https://platform.openai.com/api-keys
29
- - **Anthropic**: Get key from https://console.anthropic.com/
30
-
31
- ### 3. Start Chatting
32
-
33
- Just type your questions or requests in the chat:
34
-
35
- ```
36
- Help me write a Ruby script to parse CSV files
37
- ```
38
-
39
- ```
40
- Create a web scraper for extracting article titles
41
- ```
42
-
43
- ## Key Features
44
-
45
- ### 🎯 Autonomous Agent Mode
46
- Clacky can automatically execute complex tasks using built-in tools:
47
- - **File Operations**: Read, write, edit, search files
48
- - **Web Access**: Browse and search the web
49
- - **Code Execution**: Run shell commands and test code
50
- - **Project Management**: Git operations, testing, deployment
51
-
52
- ### 🔌 Skill System
53
- Use powerful skills with simple shorthand commands:
54
-
55
- ```
56
- /commit # Smart git commit helper
57
- /gem-release # Automated gem publishing
58
- ```
59
-
60
- Create your own skills in the `.clacky/skills/` directory!
61
-
62
- ### 💬 Smart Memory Management
63
- - **Automatic compression** for long conversations
64
- - **Context preservation** while reducing token costs
65
- - **Intelligent summarization** of conversation history
66
-
67
- ### ⚙️ Easy Configuration
68
- - Interactive setup wizard
69
- - Support for multiple API providers
70
- - Cost tracking and usage limits
71
- - Smart defaults for common use cases
72
-
73
- ## Common Commands in Chat
74
-
75
- ```
76
- /config # Configure API settings
77
- /help # Show available commands
78
- /skills # List available skills
79
- ```
80
-
81
- ## Why Choose OpenClacky?
82
-
83
- ✅ **Simple Setup** - Just `gem install` and start chatting
84
- ✅ **Powerful Agent** - Executes complex tasks autonomously
85
- ✅ **Extensible** - Create custom skills for your workflows
86
- ✅ **Cost-Effective** - Smart memory compression saves tokens
87
- ✅ **Multi-Provider** - Works with OpenAI and Anthropic
88
- ✅ **Well-Tested** - 367+ passing tests ensure reliability
89
-
90
- ## Learn More
91
-
92
- - GitHub: https://github.com/clacky-ai/openclacky
93
- - Report Issues: https://github.com/clacky-ai/openclacky/issues
94
- - Version: 0.7.0
data/docs/browser-cdp-native-design.md DELETED
@@ -1,195 +0,0 @@
1
- # Browser Tool: Native CDP Integration Design
2
-
3
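- ## Background and Goals
4
-
5
- The existing browser tool depends on `agent-browser` (a Rust binary distributed via npm) and launches a separate Chrome instance on every use, which causes the following problems:
6
-
7
- - The user's login state and cookies cannot be reused
8
- - npm and agent-browser must be installed separately
9
- - Every task pops up a new Chrome window, which is a poor experience
10
- - The dependency chain is long: npm → agent-browser binary → Chrome for Testing
11
-
12
- **Core goal**: Clacky directly reuses the Chrome instance the user already has open, inheriting all login state and cookies, with zero additional dependencies.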
13
-
14
- ---
15
-
16
- ## Key Changes in Chrome 146
17
-
18
- ### Timeline
19
-
20
- | Chrome version | Behavior |
21
- |------------|------|
22
- | ≤ 135 | `--remote-debugging-port` can connect to the default profile (not recommended, but it works) |
23
- | 136 ~ 145 | The default profile is blocked; an isolated profile must be opened with `--user-data-dir` (empty, with no login state) |
24
- | **146+** | Adds the **autoConnect toggle**: flip it once to connect directly to the real browser; consent-based ✅ |
25
-
26
- ### User Steps (One-Time)
27
-
28
- 1. Open `chrome://inspect/#remote-debugging`
29
- 2. Check **"Allow remote debugging for this browser instance"**
30
- 3. Chrome starts a CDP server on `127.0.0.1:9222`
31
-
32
- After that, each time Clacky connects, Chrome shows an **"Allow remote debugging?"** permission prompt, and the user only needs to click Allow.
33
-
34
- ---
35
-
36
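- ## Technical Approach: Pure-Ruby CDP Client
37
-
38
- ### Key Finding
39
-
40
- In Chrome 146, autoConnect mode **does not expose the standard `/json` HTTP endpoint** (it returns 404); instead, it publishes the connection details in a file: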
41
-
42
- ```
43
- ~/Library/Application Support/Google/Chrome/DevToolsActivePort
44
- ```
45
-
46
- File content format:
47
- ```
48
- 9222
49
- /devtools/browser/98823857-17b3-48ec-8f24-5805e3012a05
50
- ```
51
-
52
- The first line is the port and the second line is the WebSocket path; concatenating them gives:
53
-
54
- ```
55
- ws://127.0.0.1:9222/devtools/browser/98823857-17b3-48ec-8f24-5805e3012a05
56
- ```
57
-
58
- ### Connection Flow
59
-
60
- ```
61
- 1. Read the DevToolsActivePort file
62
-
63
- 2. Connect to the Browser endpoint over WebSocket
64
-
65
- 3. Target.getTargets → list all real tabs
66
-
67
- 4. Target.attachToTarget(targetId, flatten: true) → obtain a sessionId
68
-
69
- 5. Send CDP commands with the sessionId to operate on the chosen tab
70
- ```
71
-
72
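- ### Dependencies
73
-
74
- **Zero new dependencies**; only what is already available:
75
- - `websocket-driver` (already in the gemspec)
76
- - `socket` (Ruby standard library)
77
- - `net/http` (Ruby standard library)
78
- - `json` (Ruby standard library)
79
-
80
- ### Verified Capabilities
81
-
82
- Verified by script in a hands-on test (2026-03-20):
83
-
84
- - ✅ Read DevToolsActivePort and discovered port 9222
85
- - ✅ Connected to the Browser endpoint over WebSocket
86
- - ✅ `Target.getTargets` listed all of the user's real tabs (with titles and URLs)
87
- - ✅ `Target.attachToTarget` attached to a chosen tab
88
- - ✅ `Runtime.evaluate` ran JS (fetching the URL, title, etc.)
89
- - ✅ `Page.captureScreenshot` captured a screenshot
90
- - ✅ `Target.createTarget` opened a new tab and navigated
91
- - ✅ Reused the user's login state (visiting yafeilee.com/admin went straight to the admin panel without logging in again)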
92
-
93
- ---
94
-
95
- ## Implementation Plan
96
-
97
- ### Layer 1: Discovery
98
-
99
- ```ruby
100
- # Detect whether Chrome has remote debugging enabled
101
- def discover_chrome_cdp
102
-   port_file = File.expand_path(
103
-     "~/Library/Application Support/Google/Chrome/DevToolsActivePort"
104
-   )
105
-   return nil unless File.exist?(port_file)
106
-
107
-   lines = File.read(port_file).strip.split("\n")
108
-   port = lines[0].to_i
109
-   path = lines[1]
110
-
111
-   # Verify that the port is actually listening
112
-   TCPSocket.new("127.0.0.1", port).close
113
-   { port: port, path: path, ws_url: "ws://127.0.0.1:#{port}#{path}" }
114
- rescue Errno::ECONNREFUSED
115
-   nil
116
- end
117
- ```
118
-
119
- **Guidance when nothing is found**:
120
-
121
- > "Open `chrome://inspect/#remote-debugging` in the Chrome address bar,
122
- > and check 'Allow remote debugging for this browser instance'. You only need to do this once."
123
-
124
- ### Layer 2: CDP Client (Communication Layer)
125
-
126
- Create `lib/clacky/tools/cdp_client.rb`, implementing:
127
-
128
- - WebSocket connection management
129
- - Command sending (with an id) and response matching
130
- - Session management (browser-level vs. tab-level)
131
- - Event listening (Page.loadEventFired, etc.)
132
-
133
- ### Layer 3: Browser Tool Refactoring
134
-
135
- Refactoring strategy for `lib/clacky/tools/browser.rb`:
136
-
137
- ```
138
- Priority 1: DevToolsActivePort detected → the user's real Chrome (native CDP)
139
- Priority 2: Fallback → the existing agent-browser (backward compatible)
140
- ```
141
-
142
- ### macOS Path (Other Platforms To Be Confirmed)
143
-
144
- | Platform | DevToolsActivePort path |
145
- |------|------------------------|
146
- | macOS | `~/Library/Application Support/Google/Chrome/DevToolsActivePort` |
147
- | Linux | `~/.config/google-chrome/DevToolsActivePort` |
148
- | Windows | `%LOCALAPPDATA%\Google\Chrome\User Data\DevToolsActivePort` |
149
-
150
- ---
151
-
152
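- ## Key Questions and Conclusions
153
-
154
- ### Q: The `/json` endpoint returns 404. What now?
155
-
156
- Chrome 146 autoConnect mode does not use HTTP `/json`; use the `DevToolsActivePort` file plus a direct WebSocket connection instead.
157
-
158
- ### Q: Is the ferrum gem suitable?
159
-
160
- **No.** `Ferrum::Browser.new(url: "http://localhost:9222")` can connect to an existing Chrome, but it creates a new incognito browser context and does not reuse the user's tabs or login state. We need to bypass ferrum and work with raw CDP directly.
161
-
162
- ### Q: Does the user have to click Allow on every connection?
163
-
164
- Yes. Chrome 146 shows a confirmation prompt for every new WebSocket connection. This is Chrome's security consent mechanism and cannot be bypassed, but the experience is acceptable (the user clearly knows the browser is being controlled).
165
-
166
- ### Q: Should agent-browser be retired completely?
167
-
168
- A gradual migration is recommended: run both in parallel first, with native CDP as the preferred path and agent-browser as the fallback, then remove agent-browser once things are stable.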
169
-
170
- ---
171
-
172
- ## References
173
-
174
- - [Introduction to Chrome 146 autoConnect - DEV Community](https://dev.to/minatoplanb/chrome-146-finally-lets-ai-control-your-real-browser-google-oauth-included-28b7)
175
- - [One Toggle That Changed Browser Automation - LinkedIn](https://www.linkedin.com/posts/surajadsul_one-toggle-that-changed-the-browser-automation-activity-7439161929664864257-0v8z)
176
- - [Chrome DevTools MCP connection modes explained](https://www.heyuan110.com/posts/ai/2026-03-17-chrome-devtools-mcp-guide/)
177
- - [agent-browser #412: Support --auto-connect](https://github.com/vercel-labs/agent-browser/issues/412)
178
- - [Chrome DevTools Protocol official documentation](https://chromedevtools.github.io/devtools-protocol/)
179
- - [Notes on the DevToolsActivePort WebSocket path](https://deepwiki.com/ChromeDevTools/chrome-devtools-mcp/2.3-connection-modes)
180
- - [ferrum issue #320: Connect to existing Chrome](https://github.com/rubycdp/ferrum/issues/320)
181
- - [Chrome remote-debugging security changes](https://developer.chrome.com/blog/remote-debugging-port)
182
-
183
- ---
184
-
185
- ## Test Script
186
-
187
- The prototype verification script is at `tmp/cdp_test.rb`
188
-
189
- Prerequisites:
190
- 1. Remote debugging is enabled in Chrome (`chrome://inspect/#remote-debugging`)
191
- 2. Click Allow in the prompt
192
-
193
- ```bash
194
- bundle exec ruby tmp/cdp_test.rb
195
- ```
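The discovery and connection steps described in the deleted design note above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: `parse_devtools_active_port` and `cdp_command` are hypothetical helper names, and the sketch assumes the two-line file format (port, then WebSocket path) and the flat session mode (`flatten: true`) that the note describes, where tab-level commands carry a top-level `sessionId`.

```ruby
require "json"

# Hypothetical helper: turns the two-line DevToolsActivePort content
# (port on line 1, WebSocket path on line 2) into connection details.
# Returns nil when the content does not match that shape.
def parse_devtools_active_port(content, host: "127.0.0.1")
  port_line, path = content.to_s.strip.split("\n", 2).map(&:strip)
  return nil unless port_line && path

  port = Integer(port_line, exception: false)
  return nil unless port && port.between?(1, 65_535)
  return nil unless path.start_with?("/")

  { port: port, path: path, ws_url: "ws://#{host}:#{port}#{path}" }
end

# Hypothetical helper: builds one CDP command frame as JSON. Each command
# carries an id so the response can be matched to it; tab-level commands
# also carry the sessionId obtained from Target.attachToTarget.
def cdp_command(id, method, params = {}, session_id: nil)
  frame = { id: id, method: method, params: params }
  frame[:sessionId] = session_id if session_id
  JSON.generate(frame)
end

info = parse_devtools_active_port(
  "9222\n/devtools/browser/98823857-17b3-48ec-8f24-5805e3012a05\n"
)
puts info[:ws_url]
puts cdp_command(1, "Target.getTargets")
puts cdp_command(2, "Runtime.evaluate", { expression: "document.title" }, session_id: "SESSION-ID")
```

Keeping the parser separate from the TCP probe makes the file-format handling easy to test without a running Chrome; the `"SESSION-ID"` value is a placeholder for whatever `Target.attachToTarget` returns.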
data/docs/c-end-user-positioning.md DELETED
@@ -1,64 +0,0 @@
1
- # C-End User Positioning
2
-
3
- > Date: 2026-03-30
4
-
5
- ---
6
-
7
- ## Market Context
8
-
9
- The "OpenClaw ecosystem" has exploded in 2026. Key players:
10
-
11
- - **OpenClaw** — open-source, self-hosted, community Skills. Designed for technical users who configure everything themselves.
12
- - **QClaw** — Tencent's fork. Bundled Kimi model, WeChat binding. Mass-market but Tencent-ecosystem only.
13
- - **Others** (Wukong, etc.) — same lane.
14
-
15
- OpenClaw has 5,700+ Skills, but almost all are open-source, free, and easily copied. The ecosystem lacks **expertise-backed, production-grade Skills worth paying for**.
16
-
17
- ---
18
-
19
- ## Who openclacky Is For
20
-
21
- **Ordinary users, not technical geeks.**
22
-
23
- The target user knows OpenClaw exists, has heard about "raising a lobster", but can't or doesn't want to:
24
- - configure Docker / environment / webhooks
25
- - manage their own API keys without knowing what they'll spend
26
- - troubleshoot when a long task breaks halfway
27
-
28
- They want to use a lobster built by an expert (a lawyer, a trader, an SEO specialist) — not build one themselves.
29
-
30
- > Core insight: **OpenClaw is built for people who create Skills. openclacky is built for people who use them.**
31
-
32
- ---
33
-
34
- ## Why openclacky Over OpenClaw: 3 Core Reasons
35
-
36
- ### 1. Zero-friction IM setup — the strongest differentiator
37
-
38
- OpenClaw requires users to manually configure webhooks, tokens, and config files to connect WeChat / Feishu / WeCom. The technical barrier is high, and most ordinary users give up.
39
-
40
- openclacky uses **AI-automated channel setup**: one sentence, and the AI configures the IM connection for you — no plugins, no docs, no engineering knowledge required. This is a genuine technical moat.
41
-
42
- ### 2. Built for China, natively
43
-
44
- - No VPN required, no overseas credit card
45
- - WeChat / Feishu / WeCom are the primary daily tools for Chinese users — openclacky treats them as first-class citizens
46
- - Supports domestic models (DeepSeek, Kimi, etc.) out of the box
47
- - QClaw is domestic too, but locked to Tencent's ecosystem and model choices
48
-
49
- ### 3. Cost transparency and long-task reliability
50
-
51
- - Real-time token cost tracking — users always know what they're spending
52
- - Automatic compression (up to 90% savings via Insert-then-Compress + Prompt Caching)
53
- - Long tasks don't break: sub-agent isolation + Time Machine architecture keeps context intact
54
-
55
- ---
56
-
57
- ## The User Progression
58
-
59
- ```
60
- Can use it → Dare to use it → Keep using it
61
- (zero setup) (cost clarity) (tasks don't break)
62
- ```
63
-
64
- Each of the 3 reasons maps directly to one stage of this progression.
data/docs/config.example.yml DELETED
@@ -1,27 +0,0 @@
1
- # Clacky Configuration File
2
- # This is a top-level array of model configurations
3
- # The first model in the array is used as the default
4
-
5
- # Claude Sonnet 4 (default - first in array)
6
- - model: "claude-sonnet-4"
7
-   api_key: "your-api-key-here"
8
-   base_url: "https://api.anthropic.com"
9
-   anthropic_format: true
10
-
11
- # Claude Opus 4
12
- - model: "claude-opus-4"
13
-   api_key: "your-api-key-here"
14
-   base_url: "https://api.anthropic.com"
15
-   anthropic_format: true
16
-
17
- # OpenAI GPT-4
18
- - model: "gpt-4"
19
-   api_key: "your-openai-api-key-here"
20
-   base_url: "https://api.openai.com/v1"
21
-   anthropic_format: false
22
-
23
- # Custom model (e.g., local or third-party)
24
- - model: "custom-model"
25
-   api_key: "your-custom-api-key"
26
-   base_url: "https://your-api-endpoint.com"
27
-   anthropic_format: false
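The deleted example config above describes a top-level array of model entries in which the first entry is the default. A minimal sketch of reading such a file might look like the following. `ModelConfig` and `load_model_configs` are hypothetical names, not part of the gem's API, and treating a missing `anthropic_format` key as `false` is an assumption made for the example.

```ruby
require "yaml"

# Hypothetical value object mirroring the four keys used in the example config.
ModelConfig = Struct.new(:model, :api_key, :base_url, :anthropic_format, keyword_init: true)

# Hypothetical loader: parses the YAML text and returns one ModelConfig per entry.
# Raises ArgumentError when the document is not a top-level array.
def load_model_configs(yaml_text)
  entries = YAML.safe_load(yaml_text) || []
  unless entries.is_a?(Array)
    raise ArgumentError, "expected a top-level array of model configurations"
  end

  entries.map do |entry|
    ModelConfig.new(
      model: entry.fetch("model"),
      api_key: entry.fetch("api_key"),
      base_url: entry.fetch("base_url"),
      anthropic_format: entry.fetch("anthropic_format", false) # assumed default
    )
  end
end

SAMPLE_CONFIG = <<~CONFIG
  # The first model in the array is used as the default
  - model: "claude-sonnet-4"
    api_key: "your-api-key-here"
    base_url: "https://api.anthropic.com"
    anthropic_format: true
  - model: "gpt-4"
    api_key: "your-openai-api-key-here"
    base_url: "https://api.openai.com/v1"
    anthropic_format: false
CONFIG

configs = load_model_configs(SAMPLE_CONFIG)
default = configs.first
puts "default model: #{default.model} (anthropic_format=#{default.anthropic_format})"
```

Using `YAML.safe_load` rather than `YAML.load` keeps the loader from instantiating arbitrary Ruby objects out of a user-edited file.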