pikiclaw 0.2.66 → 0.2.67

package/README.md CHANGED
@@ -4,13 +4,13 @@
4
4
 
5
5
  **Put the world's smartest AI agents in your pocket. Command local Claude, Codex & Gemini via best IM.**
6
6
 
7
- *Turn the best IM apps into a top-tier Agent console for your computer*
7
+ *Let the best IM app become a top-tier Agent console on your computer*
8
8
 
9
- > npx pikiclaw@latest
10
-
11
- <img src="docs/promo-install.gif" alt="Quick install" width="700">
9
+ ```
10
+ npx pikiclaw@latest
11
+ ```
12
12
 
13
- <p align="center">
13
+ <p>
14
14
  <a href="https://www.npmjs.com/package/pikiclaw"><img src="https://img.shields.io/npm/v/pikiclaw" alt="npm"></a>
15
15
  <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License: MIT"></a>
16
16
  <a href="https://nodejs.org"><img src="https://img.shields.io/badge/Node.js-18+-green.svg" alt="Node.js 18+"></a>
@@ -18,188 +18,191 @@
18
18
 
19
19
  </div>
20
20
 
21
+ ## Demo
22
+
23
+ > Real task: ask pikiclaw to gather and summarize today's AI news — the agent reads, writes, and sends results back through Telegram, all from your phone.
24
+
25
+ <video src="docs/promo-demo.mp4" width="700" controls muted></video>
26
+
27
+ > Basic operations: send a message, watch the agent stream, receive files back.
28
+
29
+ <img src="docs/promo-basic-ops.gif" alt="Basic operations" width="700">
30
+
21
31
  ---
22
32
 
23
33
  ## Why pikiclaw?
24
34
 
25
- Many "IM Agent" solutions are, at their core, still taking a detour:
35
+ Most "IM + Agent" solutions either reinvent the agent (worse than official CLIs), run in remote sandboxes (not your environment), or only support short conversations (unusable for real tasks).
26
36
 
27
- - Either they build their own Agent, which performs worse than the official CLIs
28
- - Or they run in a remote sandbox, which is not your environment
29
- - Or they only handle short conversations, which does not suit long tasks
37
+ pikiclaw takes a different approach:
30
38
 
31
- pikiclaw's goal is straightforward:
32
-
33
- - Use the official Agent CLIs instead of reinventing one
34
- - Use your own computer instead of an unfamiliar sandbox
35
- - Use the IM you already use instead of learning yet another remote-control method
39
+ - **Official Agent CLIs** — Claude Code, Codex, Gemini CLI as-is, not a home-grown wrapper
40
+ - **Your own machine** — local files, local tools, local environment
41
+ - **Your existing IM** — Telegram, Feishu, or WeChat, no new app to learn
36
42
 
37
43
  ```
38
- You (Telegram / Feishu)
39
-
40
-
44
+ You (Telegram / Feishu / WeChat)
45
+ |
46
+ v
41
47
  pikiclaw
42
-
43
-
44
- Claude Code / Codex / Gemini
45
-
46
-
47
- Your computer
48
+ |
49
+ v
50
+ Claude Code / Codex / Gemini CLI
51
+ |
52
+ v
53
+ Your Computer
48
54
  ```
49
55
 
50
- It is not meant for "demoing AI once"; it is for when you have left your computer and the Agent keeps getting work done on your local machine.
51
-
52
- ### What it looks like in Telegram
53
-
54
- <table>
55
- <tr>
56
- <td align=”center”><b>Commands and Agent switching</b><br><img src=”docs/promo-tg-commands.png” alt=”Commands” width=”320”></td>
57
- <td align=”center”><b>Code review</b><br><img src=”docs/promo-tg-task.png” alt=”Code review” width=”320”></td>
58
- </tr>
59
- <tr>
60
- <td align=”center”><b>Multi-turn coding + file return</b><br><img src=”docs/promo-tg-complex.png” alt=”Complex task” width=”320”></td>
61
- <td align=”center”><b>Status monitoring + session management</b><br><img src=”docs/promo-tg-sessions.png” alt=”Sessions” width=”320”></td>
62
- </tr>
63
- </table>
56
+ It's designed for the moment you walk away from your desk — the agent keeps working locally, and you stay in control from your phone.
64
57
 
65
58
  ---
66
59
 
67
60
  ## Quick Start
68
61
 
69
- ### Preparation
62
+ ### Prerequisites
70
63
 
71
64
  - Node.js 18+
72
- - Any one Agent CLI installed and logged in on this machine
73
- - [`claude`](https://docs.anthropic.com/en/docs/claude-code)
74
- - [`codex`](https://github.com/openai/codex)
75
- - [`gemini`](https://github.com/google-gemini/gemini-cli)
76
- - A Telegram Bot Token or Feishu app credentials
65
+ - At least one Agent CLI installed and logged in:
66
+ - [`claude`](https://docs.anthropic.com/en/docs/claude-code) (Claude Code)
67
+ - [`codex`](https://github.com/openai/codex) (Codex CLI)
68
+ - [`gemini`](https://github.com/google-gemini/gemini-cli) (Gemini CLI)
69
+ - A bot token for your IM channel (Telegram Bot Token, Feishu app credentials, or WeChat account)
77
70
 
78
- ### Launch
71
+ ### Install & Launch
79
72
 
80
73
  ```bash
81
74
  cd your-workspace
82
75
  npx pikiclaw@latest
83
76
  ```
84
77
 
85
- By default this opens the Web Dashboard: `http://localhost:3939`
78
+ <img src="docs/promo-install.gif" alt="Quick install" width="700">
86
79
 
87
- In the Dashboard you can handle:
80
+ This opens the **Web Dashboard** at `http://localhost:3939`, where you can:
88
81
 
89
- - Channel configuration
90
- - Default Agent / model settings
91
- - Working directory switching
92
- - Viewing sessions and runtime status
82
+ - Connect IM channels (Telegram / Feishu / WeChat)
83
+ - Configure agents and models
84
+ - Manage macOS system permissions
85
+ - Set up browser & desktop automation extensions
86
+ - Monitor sessions and system resources
93
87
 
94
88
  <details>
95
- <summary>Dashboard screenshots</summary>
89
+ <summary>Alternative: terminal setup wizard</summary>
96
90
 
97
- **Configuration** — IM access, AI Agent, system permissions
91
+ ```bash
92
+ npx pikiclaw@latest --setup # interactive terminal wizard
93
+ npx pikiclaw@latest --doctor # check environment only
94
+ ```
98
95
 
99
- <img src="docs/promo-dashboard-config.png" alt="Config" width="700">
96
+ </details>
100
97
 
101
- **Session management** — session swimlanes grouped by Agent
98
+ ---
102
99
 
103
- <img src="docs/promo-dashboard-sessions.png" alt="Sessions" width="700">
100
+ ## Dashboard
104
101
 
105
- </details>
102
+ <details>
103
+ <summary>Expand to see all dashboard pages</summary>
106
104
 
107
- If you prefer a terminal wizard:
105
+ **IM Access** — Telegram, Feishu, WeChat channel status and configuration
108
106
 
109
- ```bash
110
- npx pikiclaw@latest --setup
111
- ```
107
+ <img src="docs/promo-dashboard-im.png" alt="IM Access" width="700">
112
108
 
113
- If you only want to check the environment:
109
+ **Agent Config** — Default agent / model / reasoning effort, available agents overview
114
110
 
115
- ```bash
116
- npx pikiclaw@latest --doctor
117
- ```
111
+ <img src="docs/promo-dashboard-agents.png" alt="Agent Config" width="700">
112
+
113
+ **System Permissions** — macOS accessibility, screen recording, disk access
114
+
115
+ <img src="docs/promo-dashboard-permissions.png" alt="Permissions" width="700">
116
+
117
+ **Extensions** — Managed browser & desktop automation (Appium Mac2)
118
+
119
+ <img src="docs/promo-dashboard-extensions.png" alt="Extensions" width="700">
120
+
121
+ **Sessions** — Per-agent session list and runtime status
122
+
123
+ <img src="docs/promo-dashboard-sessions.png" alt="Sessions" width="700">
124
+
125
+ **System Info** — Working directory, CPU / memory / disk monitoring
126
+
127
+ <img src="docs/promo-dashboard-system.png" alt="System Info" width="700">
128
+
129
+ </details>
118
130
 
119
131
  ---
120
132
 
121
- ## Current Capabilities
133
+ ## Features
122
134
 
123
- ### Channels And Agents
135
+ ### Channels & Agents
124
136
 
125
- - Telegram and Feishu are both supported and can run at the same time
126
- - Claude Code, Codex CLI, and Gemini CLI are all integrated
127
- - Agents are managed through a unified driver registry; model lists, session lists, and usage display share the same interface
137
+ - Telegram, Feishu, and WeChat — run one or all simultaneously
138
+ - Claude Code, Codex CLI, and Gemini CLI via unified driver registry
139
+ - Model listing, session management, and usage tracking through a single interface
128
140
 
129
141
  ### Runtime
130
142
 
131
- - Streaming preview and continuous message updates
132
- - Session switching, resume, and multi-turn continuation
133
- - Working directory browsing and switching
134
- - File attachments automatically go into the session workspace
135
- - Sleep prevention for long tasks, watchdog supervision, and auto-restart
136
- - Long text is split automatically; images and files can be sent straight back to IM
137
- - The Dashboard shows runtime status, sessions, usage, host status, and macOS permission status
143
+ - Streaming preview with continuous message updates
144
+ - Session switching, resume, and multi-turn conversations
145
+ - Working directory browsing and switching
146
+ - File attachments automatically enter the session workspace
147
+ - Long-task sleep prevention, watchdog, and auto-restart
148
+ - Long text auto-splitting; images and files sent back to IM directly
149
+ - Light / dark theme and i18n (Chinese & English)
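One way to picture the "long text auto-splitting" item in the Runtime list is the sketch below. It is not taken from pikiclaw's source: `splitMessage` and its newline-first strategy are hypothetical, and the default of 4096 reflects the Telegram Bot API limit for a single text message. Other channels would likely need a different limit.

```typescript
// Hypothetical helper, not pikiclaw's actual implementation.
// Splits text into chunks of at most `limit` UTF-16 code units,
// preferring to break at the last newline inside each chunk.
function splitMessage(text: string, limit: number = 4096): string[] {
  if (limit <= 0) throw new RangeError("limit must be positive");
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Look for the last newline within the first `limit` characters.
    const head = rest.slice(0, limit);
    const breakAt = head.lastIndexOf("\n");
    if (breakAt > 0) {
      // Break on the newline and drop it so the next chunk starts cleanly.
      chunks.push(rest.slice(0, breakAt));
      rest = rest.slice(breakAt + 1);
    } else {
      // No usable newline: fall back to a hard cut at the limit.
      chunks.push(head);
      rest = rest.slice(limit);
    }
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

With a limit of 6, `splitMessage("ab\ncd\nef", 6)` returns the two chunks `"ab\ncd"` and `"ef"`. Breaking on newlines first keeps code and log lines intact where possible; the hard cut only applies when a single line exceeds the limit.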
138
150
 
139
151
  ### Skills
140
152
 
141
- - Project-level skills: `.pikiclaw/skills/*/SKILL.md` is the canonical entry point
142
- - Compatible with `.claude/commands/*.md`
143
- - Compatible with legacy `.claude/skills` / `.agents/skills`, which can be merged back into `.pikiclaw/skills`
144
- - Can be triggered in IM via `/skills` and `/sk_<name>`
153
+ - Project-level skills at `.pikiclaw/skills/*/SKILL.md`
154
+ - Compatible with `.claude/commands/*.md`
155
+ - Legacy `.claude/skills` / `.agents/skills` support with migration path
156
+ - Trigger via `/skills` and `/sk_<name>` in chat
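The relationship between the `/sk_<name>` chat command and the `.pikiclaw/skills/*/SKILL.md` layout could be sketched roughly as follows. This is an illustration under assumptions, not pikiclaw's actual resolver: `parseSkillCommand`, the `SkillCommand` shape, the allowed name characters, and the idea that `<name>` maps directly to a directory name are all hypothetical.

```typescript
// Hypothetical sketch: the README does not specify how names map to files.
interface SkillCommand {
  name: string;      // text after the "/sk_" prefix
  skillFile: string; // assumed location of the skill definition
  args: string;      // remaining message text, passed along to the skill
}

function parseSkillCommand(message: string): SkillCommand | null {
  // Match "/sk_<name>" at the start, optionally followed by arguments.
  const match = /^\/sk_([A-Za-z0-9_]+)(?:\s+([\s\S]*))?$/.exec(message.trim());
  if (!match) return null;
  const name = match[1];
  return {
    name,
    // Assumption: the skill name equals its directory name.
    skillFile: `.pikiclaw/skills/${name}/SKILL.md`,
    args: (match[2] ?? "").trim(),
  };
}
```

Under these assumptions, `/sk_review src/app.ts` resolves to `.pikiclaw/skills/review/SKILL.md` with `src/app.ts` as its argument text, while `/skills` and plain text return `null` and would fall through to other handlers.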
145
157
 
146
158
  ### Codex Human Loop
147
159
 
148
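- When Codex requests additional user input while running, pikiclaw turns the question into an interactive prompt in Telegram / Feishu; the current task continues after the user replies.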
160
+ When Codex requests additional user input mid-task, pikiclaw surfaces the question as an interactive prompt in your IM. Reply there and the task continues.
149
161
 
150
- ### MCP And GUI Automation
162
+ ### MCP & GUI Automation
151
163
 
152
- Each Agent stream starts a session-level MCP bridge that injects local tools into the Agent for that task.
164
+ Each agent stream launches a session-scoped MCP bridge that injects local tools:
153
165
 
154
- Current built-in tools:
166
+ - `im_list_files` — list session workspace files
167
+ - `im_send_file` — send files back to IM in real time
155
168
 
156
- - `im_list_files`: list session workspace files
157
- - `im_send_file`: send files back to IM in real time
169
+ Optional GUI capabilities:
158
170
 
159
- Optional GUI capabilities:
160
-
161
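- - Browser automation: manages a dedicated persistent Chrome profile via `@playwright/mcp`; on first use, log in to the sites you need in this automation browser, and later tasks reuse the same profile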
162
- - macOS desktop automation: provides tools such as `desktop_open_app`, `desktop_snapshot`, `desktop_click`, `desktop_type`, and `desktop_screenshot` via Appium Mac2
171
+ - **Browser automation** — managed Chrome profile via `@playwright/mcp`; log in once, reuse across tasks
172
+ - **macOS desktop automation** — Appium Mac2 with `desktop_open_app`, `desktop_snapshot`, `desktop_click`, `desktop_type`, `desktop_screenshot`
163
173
 
164
174
  ---
165
175
 
166
176
  ## Commands
167
177
 
168
- | Command | Description |
178
+ | Command | Description |
169
179
  |---|---|
170
- | `/start` | Show entry info, current Agent, working directory |
171
- | `/sessions` | View, switch, or create sessions |
172
- | `/agents` | Switch Agent |
173
- | `/models` | View and switch model / reasoning effort |
174
- | `/switch` | Browse and switch working directory |
175
- | `/status` | View runtime status, tokens, usage, session info |
176
- | `/host` | View host CPU / memory / disk / battery |
177
- | `/skills` | Browse project skills |
178
- | `/restart` | Restart and relaunch the bot |
179
- | `/sk_<name>` | Run a project skill |
180
-
181
- Plain text messages are forwarded directly to the current Agent.
182
-
183
- <details>
184
- <summary>Preview of commands in Telegram</summary>
185
-
186
- <img src="docs/promo-tg-commands.png" alt="Commands in Telegram" width="360">
187
-
188
- </details>
180
+ | `/start` | Show entry info, current agent, working directory |
181
+ | `/sessions` | View, switch, or create sessions |
182
+ | `/agents` | Switch agent |
183
+ | `/models` | View and switch model / reasoning effort |
184
+ | `/switch` | Browse and switch working directory |
185
+ | `/status` | Runtime status, tokens, usage, session info |
186
+ | `/host` | Host CPU / memory / disk / battery |
187
+ | `/skills` | Browse project skills |
188
+ | `/restart` | Restart and re-launch bot |
189
+ | `/sk_<name>` | Run a project skill |
190
+
191
+ Plain text messages are forwarded directly to the current agent.
189
192
 
190
193
  ---
191
194
 
192
- ## Config And Setup Notes
195
+ ## Configuration
193
196
 
194
- - Persistent configuration lives in `~/.pikiclaw/setting.json`
195
- - The Dashboard is the main configuration entry point; other runtime configuration remains available
196
- - Common variables for desktop GUI:
197
- - `PIKICLAW_DESKTOP_GUI`
198
- - `PIKICLAW_DESKTOP_APPIUM_URL`
197
+ - Persistent config lives in `~/.pikiclaw/setting.json`
198
+ - The Dashboard is the primary configuration interface
199
199
 
200
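- Browser automation is managed jointly by the dashboard and the local runtime, which automatically create and reuse a dedicated Chrome profile directory. You only need to log in once, in this dedicated browser, to the site accounts you want to automate.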
200
+ <details>
201
+ <summary>GUI automation setup</summary>
201
202
 
202
- To enable macOS desktop automation, set up Appium Mac2 first:
203
+ **Browser automation** is managed by the dashboard and runtime together — a dedicated Chrome profile is created and reused automatically. Just log in to the sites you need once in that browser.
204
+
205
+ **macOS desktop automation** requires Appium Mac2:
203
206
 
204
207
  ```bash
205
208
  npm install -g appium
@@ -207,15 +210,21 @@ appium driver install mac2
207
210
  appium
208
211
  ```
209
212
 
210
- Then grant macOS Accessibility permission to the terminal app that runs `pikiclaw`.
213
+ Then grant macOS Accessibility permission to your terminal app.
214
+
215
+ Relevant environment variables:
216
+ - `PIKICLAW_DESKTOP_GUI`
217
+ - `PIKICLAW_DESKTOP_APPIUM_URL`
218
+
219
+ </details>
211
220
 
212
221
  ---
213
222
 
214
223
  ## Roadmap
215
224
 
216
- - Keep extending the current session-level MCP bridge into a more complete top-level tool integration layer
217
- - Keep improving GUI automation, especially the coordination between browser and desktop tools
218
- - Add more IM channels; WhatsApp is still being planned
225
+ - Expand session-scoped MCP bridge into a more complete top-level tool layer
226
+ - Improve GUI automation, especially browser + desktop tool coordination
227
+ - More IM channels (WhatsApp, etc.)
219
228
 
220
229
  ---
221
230
 
@@ -229,25 +238,15 @@ npm run build
229
238
  npm test
230
239
  ```
231
240
 
232
- Common commands:
233
-
234
241
  ```bash
235
- npm run dev
236
- npm run build
237
- npm test
238
- npm run test:e2e
239
- npx vitest run test/channel-feishu.unit.test.ts
240
- npx pikiclaw@latest --doctor
242
+ npm run dev # local dev (--no-daemon, logs to ~/.pikiclaw/dev/dev.log)
243
+ npm run build # production build
244
+ npm test # unit tests
245
+ npm run test:e2e # end-to-end tests
246
+ npx pikiclaw@latest --doctor # environment check
241
247
  ```
242
248
 
243
- `npm run dev` runs only the local source path and always uses `--no-daemon`, so it does not hand off to the production/bootstrap `npx pikiclaw@latest`.
244
- It also writes all logs from the current launch to `~/.pikiclaw/dev/dev.log`, clearing the old log at each start.
245
-
246
- For more implementation details, see:
247
-
248
- - [ARCHITECTURE.md](ARCHITECTURE.md)
249
- - [INTEGRATION.md](INTEGRATION.md)
250
- - [TESTING.md](TESTING.md)
249
+ See also: [ARCHITECTURE.md](ARCHITECTURE.md) · [INTEGRATION.md](INTEGRATION.md) · [TESTING.md](TESTING.md)
251
250
 
252
251
  ---
253
252