pikiclaw 0.3.34 → 0.3.36

Files changed (47)
  1. package/README.md +78 -75
  2. package/README.zh-CN.md +350 -0
  3. package/dashboard/dist/assets/{AgentTab-DYtMGHpC.js → AgentTab-B0k6567P.js} +1 -1
  4. package/dashboard/dist/assets/{BrandIcon-DHaeSgX8.js → BrandIcon-DuvssI5P.js} +1 -1
  5. package/dashboard/dist/assets/{DirBrowser-DRg-wxu3.js → DirBrowser-CfaNqvEe.js} +1 -1
  6. package/dashboard/dist/assets/ExtensionsTab-BXEctSc6.js +1 -0
  7. package/dashboard/dist/assets/{IMAccessTab-bD_RJqfb.js → IMAccessTab-DM1rsj-u.js} +1 -1
  8. package/dashboard/dist/assets/{Modal-DxYHCIeK.js → Modal-DHVhz66H.js} +1 -1
  9. package/dashboard/dist/assets/{Modals-CDGPHZxx.js → Modals-Dhxu3hSi.js} +1 -1
  10. package/dashboard/dist/assets/{PermissionsTab-DQeaSUNi.js → PermissionsTab-B-KiKU8D.js} +1 -1
  11. package/dashboard/dist/assets/{Select-CtTWpzUC.js → Select-lWGnI7Cq.js} +1 -1
  12. package/dashboard/dist/assets/SessionPanel-DL2E4VnO.js +1 -0
  13. package/dashboard/dist/assets/{SystemTab-D5khpzb6.js → SystemTab-B6NbPTbB.js} +1 -1
  14. package/dashboard/dist/assets/index-BWqrPOhO.js +16 -0
  15. package/dashboard/dist/assets/index-CmjPBA1x.js +3 -0
  16. package/dashboard/dist/assets/index-DgLQyCkc.css +1 -0
  17. package/dashboard/dist/assets/{shared-JcpxfGic.js → shared-DKIT06du.js} +1 -1
  18. package/dashboard/dist/index.html +2 -2
  19. package/dist/agent/cli/auth.js +94 -1
  20. package/dist/agent/cli/catalog.js +5 -1
  21. package/dist/agent/cli/index.js +1 -1
  22. package/dist/agent/drivers/codex.js +80 -1
  23. package/dist/agent/drivers/hermes.js +59 -14
  24. package/dist/agent/goal.js +274 -0
  25. package/dist/agent/index.js +5 -1
  26. package/dist/agent/mcp/bridge.js +34 -17
  27. package/dist/agent/mcp/extensions.js +4 -2
  28. package/dist/agent/mcp/session-server.js +5 -0
  29. package/dist/agent/mcp/tools/goal.js +144 -0
  30. package/dist/bot/bot.js +223 -1
  31. package/dist/bot/commands.js +90 -0
  32. package/dist/bot/menu.js +1 -1
  33. package/dist/catalog/cli-tools.js +19 -0
  34. package/dist/catalog/mcp-servers.js +13 -0
  35. package/dist/catalog/skill-repos.js +10 -0
  36. package/dist/channels/feishu/bot.js +66 -2
  37. package/dist/channels/telegram/bot.js +12 -1
  38. package/dist/channels/weixin/bot.js +12 -1
  39. package/dist/dashboard/routes/cli.js +20 -1
  40. package/dist/dashboard/routes/extensions.js +227 -14
  41. package/dist/dashboard/routes/sessions.js +97 -0
  42. package/package.json +1 -1
  43. package/dashboard/dist/assets/ExtensionsTab-KsnTj9HM.js +0 -1
  44. package/dashboard/dist/assets/SessionPanel-BtE77tDC.js +0 -1
  45. package/dashboard/dist/assets/index-CypsxtrZ.js +0 -3
  46. package/dashboard/dist/assets/index-DIJ7MPen.js +0 -16
  47. package/dashboard/dist/assets/index-wGnw5EXO.css +0 -1
package/README.md CHANGED
@@ -17,7 +17,11 @@ npx pikiclaw@latest
17
17
  <a href="https://www.npmjs.com/package/pikiclaw"><img src="https://img.shields.io/npm/dm/pikiclaw?label=downloads&color=success" alt="npm downloads"></a>
18
18
  <a href="https://github.com/xiaotonng/pikiclaw/stargazers"><img src="https://img.shields.io/github/stars/xiaotonng/pikiclaw?style=flat&color=yellow" alt="GitHub stars"></a>
19
19
  <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="License: MIT"></a>
20
- <a href="https://nodejs.org"><img src="https://img.shields.io/badge/node-%E2%89%A518-green.svg" alt="Node 18+"></a>
20
+ <a href="https://nodejs.org"><img src="https://img.shields.io/badge/node-%E2%89%A520-green.svg" alt="Node 20+"></a>
21
+ </p>
22
+
23
+ <p>
24
+ <b>English</b> | <a href="README.zh-CN.md">简体中文</a>
21
25
  </p>
22
26
 
23
27
  <img src="docs/workspace.png" alt="Workspace" width="780">
@@ -33,27 +37,27 @@ npx pikiclaw@latest
33
37
  The product is the orchestrator. Everything else plugs in. **And the orchestrator is built using itself** — pikiclaw is what we use to build pikiclaw.
34
38
 
35
39
  ```
36
- Terminal layer Telegram Feishu WeChat Web Dashboard ( …mobile · voice · future )
37
- \__________________|__________________/
38
- v
39
- ┌──────────────────────────────┐
40
- │ pikiclaw orchestrator │
41
- └──────────────────────────────┘
42
- |
43
- ┌────────────────────────────────┼────────────────────────────────┐
44
- v v v
45
- Agent layer Model layer Tool layer
46
- Claude Code · Codex · Gemini Claude · GPT · Gemini · DeepSeek Skills · MCP · CLI
47
- Hermes · … (driver registry) 豆包 · MiMo · MiniMax · OpenRouter (global × workspace)
48
- · any third-party proxy · …
49
- |
50
- v
51
- Your computer
40
+ Terminal layer Telegram · Feishu · WeChat · Slack · Discord · DingTalk · WeCom · Web Dashboard
41
+ \__________________________|__________________________/
42
+ v
43
+ ┌──────────────────────────────┐
44
+ │ pikiclaw orchestrator │
45
+ └──────────────────────────────┘
46
+ |
47
+ ┌────────────────────────────────────────┼────────────────────────────────────────┐
48
+ v v v
49
+ Agent layer Model layer Tool layer
50
+ Claude Code · Codex · Gemini · Hermes Claude · GPT · Gemini · DeepSeek Skills · MCP · CLI
51
+ (driver registry · ACP · any agent) 豆包 · MiMo · MiniMax · OpenRouter (global × workspace)
52
+ · any OpenAI-compatible proxy · …
53
+ |
54
+ v
55
+ Your computer
52
56
  ```
53
57
 
54
- - **Terminal layer** — Telegram, Feishu, WeChat, and the Web Dashboard are co-equal entry points. New terminals plug in here.
55
- - **Agent layer** — Official Claude Code / Codex / Gemini CLIs as drivers. Hermes is next; the registry takes any agent.
56
- - **Model layer** — Claude / GPT / Gemini, the domestic Chinese series (DeepSeek, 豆包, MiMo, MiniMax), plus OpenRouter and any third-party proxy. Wrappers let an agent run on top of arbitrary models.
58
+ - **Terminal layer** — Telegram, Feishu, WeChat, Slack, Discord, DingTalk, WeCom, and the Web Dashboard are co-equal entry points. New terminals plug in here.
59
+ - **Agent layer** — Official Claude Code / Codex / Gemini / Hermes CLIs as drivers. Hermes speaks ACP (Agent Client Protocol); the registry takes any agent.
60
+ - **Model layer** — Claude / GPT / Gemini, the domestic Chinese series (DeepSeek, 豆包, MiMo, MiniMax), plus OpenRouter and any OpenAI-compatible proxy. Providers + Profiles are a first-class layer with their own credential vault, models.dev catalog, and per-agent injection.
57
61
  - **Tool layer** — Skills, MCP servers, and CLI tools merged across global and workspace scopes, injected into every session.
58
62
 
59
63
  ---
@@ -82,7 +86,7 @@ Most "AI dev tools" assume one user, one agent, one task at a time. pikiclaw ass
82
86
  - **Mix-and-match agents** — Claude Code in pane 1, Codex in pane 2, Gemini in pane 3, all on different repos / workspaces.
83
87
  - **One toolkit** — global skills, global MCP servers, and per-workspace overrides apply uniformly. You configure once; every session inherits.
84
88
  - **Steer anywhere** — interrupt any running stream, queue a follow-up, hand control to the next agent in line.
85
- - **Group-mode** — drop the orchestrator into a Feishu / WeChat group; teammates share the same swarm.
89
+ - **Group-mode** — drop the orchestrator into a Feishu / Slack / Discord / WeCom group; teammates share the same swarm.
86
90
 
87
91
  This is the shape that matters: one creator, with a swarm at their fingertips.
88
92
 
@@ -99,21 +103,23 @@ This is the shape that matters: one creator, with a swarm at their fingertips.
99
103
  <p align="center"><img src="docs/promo-dashboard-workspace.png" alt="Web Dashboard workspace" width="780"></p>
100
104
 
101
105
  <details>
102
- <summary><b>More: basic ops · IM access · agent config · extensions · permissions · system info</b></summary>
106
+ <summary><b>More: basic ops · IM access · agents · models · extensions · permissions · system info</b></summary>
103
107
 
104
108
  > Send a message, watch the agent stream, receive files back.
105
109
 
106
110
  <img src="docs/promo-basic-ops.gif" alt="Basic operations" width="780">
107
111
 
108
- > **IM Access** — Telegram, Feishu, WeChat channel status and configuration
112
+ > **IM Access** — Telegram, Feishu, WeChat, Slack, Discord, DingTalk, WeCom channel status and configuration
109
113
 
110
114
  <img src="docs/promo-dashboard-im.png" alt="IM Access" width="780">
111
115
 
112
- > **Agent Config** — default agent / model / reasoning effort, available agents overview
116
+ > **Agents** — installed agent CLIs, default agent, per-agent model / reasoning effort
117
+
118
+ <img src="docs/promo-dashboard-agents.png" alt="Agents" width="780">
113
119
 
114
- <img src="docs/promo-dashboard-agents.png" alt="Agent Config" width="780">
120
+ > **Models** — Providers + Profiles vault (Claude, GPT, Gemini, DeepSeek, 豆包, MiMo, MiniMax, OpenRouter, any OpenAI-compatible proxy), validated against models.dev catalog and injected per agent
115
121
 
116
- > **Extensions** — global MCP servers, community skills, browser & desktop automation
122
+ > **Extensions** — global MCP servers, community skills, managed browser + macOS desktop (Peekaboo) automation
117
123
 
118
124
  <img src="docs/promo-dashboard-extensions.png" alt="Extensions" width="780">
119
125
 
@@ -131,11 +137,12 @@ This is the shape that matters: one creator, with a swarm at their fingertips.
131
137
 
132
138
  ## Quick start
133
139
 
134
- **Prereqs:** Node.js 18+, plus at least one official Agent CLI logged in:
140
+ **Prereqs:** Node.js 20+, plus at least one official Agent CLI logged in:
135
141
 
136
142
  - [`claude`](https://docs.anthropic.com/en/docs/claude-code) (Claude Code)
137
143
  - [`codex`](https://github.com/openai/codex) (Codex CLI)
138
144
  - [`gemini`](https://github.com/google-gemini/gemini-cli) (Gemini CLI)
145
+ - `hermes` (Hermes — via ACP / Agent Client Protocol)
139
146
 
140
147
  **Launch:**
141
148
 
@@ -167,8 +174,8 @@ npx pikiclaw@latest --doctor # environment check only
167
174
  - **Walk-away coding** — kick off a long refactor, close the laptop, drive it from your phone over Telegram. The agent keeps running locally; results stream back to chat.
168
175
  - **Multi-agent on one workspace** — let Claude Code draft an implementation, switch to Codex to review, then Gemini for a different perspective. Same files, same session history.
169
176
  - **Domestic-model routing** — run Claude Code over DeepSeek or 豆包 via a wrapper driver when latency, cost, or compliance demands a non-frontier model.
170
- - **Group-chat agent** — drop pikiclaw into a Feishu / WeChat work group; the team shares one orchestrator, one workspace, one set of skills.
171
- - **Headless operator** — give the agent browser + macOS desktop control via the built-in MCP bridge, then steer it from anywhere: book a meeting, scrape a dashboard, run an end-to-end test.
177
+ - **Group-chat agent** — drop pikiclaw into a Feishu / Slack / Discord / WeCom work group; the team shares one orchestrator, one workspace, one set of skills.
178
+ - **Computer-use, controlled by you** — toggle on the managed Chrome (Playwright) and macOS desktop (Peekaboo, via Accessibility + ScreenCaptureKit). The agent can `see` the screen, click, type, manage windows / menus / Dock — and you steer it from any phone. Book a meeting, scrape a dashboard, run an end-to-end test, or drive any native macOS app.
172
179
  - **Skill-driven workflows** — install community skills (`promote`, `snipe`, `review`, `security-review`, …) once and trigger them from any terminal with `/sk_<name>`.
173
180
 
174
181
  ---
@@ -177,29 +184,33 @@ npx pikiclaw@latest --doctor # environment check only
177
184
 
178
185
  ### Terminal layer
179
186
 
180
- - **Telegram, Feishu, WeChat**: run one or all simultaneously. Each channel is physically isolated; adding a new one (WhatsApp, mobile app, …) doesn't touch the others.
187
+ - **Seven IM channels** — Telegram, Feishu, WeChat (personal), Slack, Discord, DingTalk, WeCom. Run one, several, or all simultaneously. Each channel is physically isolated; adding a new one (WhatsApp, mobile app, …) doesn't touch the others.
181
188
  - **Web Dashboard** — drive sessions directly from the browser with the same conversation, tool-use, and streaming surfaces as IM. Multi-pane workspace (1 / 2 / 3 / 6 panes), light / dark theme, EN / 中文 i18n.
182
189
  - **Live streaming preview** — message updates in place as the agent thinks; long text auto-splits; images and files stream back in real time.
183
190
 
184
191
  ### Agent layer
185
192
 
186
- - **Official CLIs as drivers** — Claude Code, Codex CLI, Gemini CLI. No home-grown agent rewrite. You get the upstream behavior, on day-zero updates.
187
- - **Pluggable registry** — `agent-driver.ts` is the only contract. Hermes and future agents drop in.
193
+ - **Official CLIs as drivers** — Claude Code, Codex CLI, Gemini CLI, and Hermes (via ACP). No home-grown agent rewrite; you get upstream behavior on day-zero updates.
194
+ - **ACP-native** — Hermes integrates through the [Agent Client Protocol](https://agentclientprotocol.com), spawning `hermes acp` over JSON-RPC stdio. Any future ACP-compatible agent plugs in the same way.
195
+ - **Pluggable registry** — `src/agent/driver.ts` is the only contract. New CLI- or ACP-based agents drop in alongside the four built-ins.
188
196
  - **Per-session agent switching** — same workspace, swap the brain.
189
197
  - **Steer** — interrupt a running task and let a queued message jump ahead in the queue.
190
198
  - **Codex human-in-the-loop** — when Codex pauses to ask, the question becomes an interactive IM prompt. Reply there; the task continues.
199
+ - **Persistent goals** — `/goal` sets a long-running objective per session with token budget and pause/resume; the agent self-terminates when it audits the goal complete.
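
Reviewer note on the ACP-native bullet above: the README says Hermes is driven by spawning `hermes acp` over JSON-RPC stdio. A minimal sketch of what the message framing for such a transport could look like follows. It assumes one JSON-RPC 2.0 message per newline-terminated line; `encodeRequest` and `LineDecoder` are hypothetical names, not taken from pikiclaw's `src/agent/acp-client.ts`.

```typescript
// Hypothetical sketch: JSON-RPC 2.0 framing for a stdio transport.
// Assumes one JSON message per line; not taken from pikiclaw's source.

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

// Serialize a request as a single newline-terminated line for the child's stdin.
function encodeRequest(id: number, method: string, params?: unknown): string {
  const req: JsonRpcRequest = { jsonrpc: "2.0", id, method, params };
  return JSON.stringify(req) + "\n";
}

// Accumulates stdout chunks and returns every message completed so far.
class LineDecoder {
  private buffer = "";

  push(chunk: string): JsonRpcResponse[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on "\n").
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim().length > 0)
      .map((line) => JSON.parse(line) as JsonRpcResponse);
  }
}
```

In a real driver the encoded line would presumably be written to the stdin of the spawned `hermes acp` process, with each stdout chunk fed to `push`. Buffering matters because a pipe may deliver a message split across several chunks.
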
191
200
 
192
201
  ### Model layer
193
202
 
194
- - **Frontier + domestic + proxies** — Claude (4 family), GPT-5 / Codex, Gemini, DeepSeek, 豆包 (Doubao), MiMo, MiniMax, OpenRouter, and any third-party model proxy.
195
- - **Per-session model + reasoning effort** — picked from the dashboard or `/models`.
196
- - **Wrapper drivers** run Claude Code or Codex on top of arbitrary models when the upstream client allows.
203
+ - **Frontier + domestic + proxies** — Claude (4 family), GPT-5 / Codex, Gemini, DeepSeek, 豆包 (Doubao), MiMo, MiniMax, OpenRouter, and any OpenAI-compatible model proxy.
204
+ - **Providers + Profiles vault** — first-class data model with its own credential store under `~/.pikiclaw/setting.json`. Browse a read-only models.dev catalog, validate keys with a real provider probe, then bind a profile to an agent so spawn-time env injection is automatic.
205
+ - **Per-session model + reasoning effort** picked from the dashboard, `/models`, or `/mode`.
206
+ - **Per-agent injection** — `resolveAgentInjection(agentId)` applies the active profile's env vars at spawn time, so Claude Code can run on top of DeepSeek or Doubao without touching the upstream client config.
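
Reviewer note on the per-agent injection bullet above: the README describes `resolveAgentInjection(agentId)` as applying the active profile's env vars at spawn time. The sketch below illustrates that idea only. The `Profile` shape and `buildSpawnEnv` are hypothetical, and the real function's signature and return value are not shown in this diff.

```typescript
// Hypothetical shapes; pikiclaw's real Profile type is not shown in the README.
interface Profile {
  agentId: string;
  // Env vars to apply when this agent is spawned,
  // e.g. a base URL and API key for an alternative provider.
  env: Record<string, string>;
}

type Env = Record<string, string | undefined>;

// Build the environment for a child agent process.
// Profile vars override inherited ones; the base object is never mutated.
function buildSpawnEnv(baseEnv: Env, profiles: Profile[], agentId: string): Env {
  const active = profiles.find((p) => p.agentId === agentId);
  return active ? { ...baseEnv, ...active.env } : { ...baseEnv };
}
```

Because the override happens only in the child's environment, the upstream CLI's own config files stay untouched, which matches the README's claim for running Claude Code on DeepSeek or Doubao.
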
197
207
 
198
208
  ### Tool layer
199
209
 
200
210
  - **Skills** — project skills in `.pikiclaw/skills/*/SKILL.md`, compatible with `.claude/commands/*.md`. One-click install from GitHub repos (`owner/repo`) or browse recommended packs (Anthropic Official, Vercel Agent Skills, …). Trigger with `/skills` and `/sk_<name>`.
201
- - **MCP servers** — browse the [MCP Registry](https://registry.modelcontextprotocol.io), add custom stdio / HTTP servers, health-check with a real handshake, enable per scope. Built-ins include GitHub, Filesystem, PostgreSQL, Slack, Brave Search, Memory, Fetch, SQLite, Git, Sentry.
202
- - **CLI tools** — invoked through the agent's normal tool surface, augmented by pikiclaw's session-scoped MCP bridge.
211
+ - **MCP servers** — browse the [MCP Registry](https://registry.modelcontextprotocol.io), add custom stdio / HTTP servers, health-check with a real handshake, OAuth 2.1 with Dynamic Client Registration, enable per scope. Recommended catalog includes GitHub, Atlassian, Notion, Linear, Sentry, Cloudflare, Slack, Feishu/Lark, Stripe, Hugging Face, Gamma, Brave Search, Perplexity, Filesystem, SQLite, PostgreSQL — plus two built-in computer-use servers (`pikiclaw-browser` for Chrome via Playwright, `peekaboo` for macOS GUI via Peekaboo).
212
+ - **CLI tools** — auto-detected with live version + auth state, OAuth-web login sessions for browser-based CLIs, all invoked through the agent's normal tool surface.
213
+ - **Session-scoped MCP bridge** — `im_list_files`, `im_send_file`, `im_ask_user`, the managed-browser tools, and the macOS desktop tools (when enabled) are injected into every session automatically.
203
214
  - **Two-scope merge** — `global < workspace < built-in`, applied automatically to every session.
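
Reviewer note on the merge bullet above: one plausible reading of `global < workspace < built-in` is that entries are keyed by name and a later scope replaces an earlier one wholesale. The sketch below encodes that reading. `McpServerEntry` and `mergeScopes` are hypothetical, and pikiclaw's actual merge may be field-level or may treat skills and CLI tools differently.

```typescript
// Hypothetical entry shape for illustration only.
interface McpServerEntry {
  command: string;
  args?: string[];
}

type Scope = Record<string, McpServerEntry>;

// Later spreads win, so precedence is global < workspace < built-in.
// Entries are replaced as a whole; no per-field merging is attempted.
function mergeScopes(globalScope: Scope, workspace: Scope, builtIn: Scope): Scope {
  return { ...globalScope, ...workspace, ...builtIn };
}
```

Under this reading, a workspace `.mcp.json` entry shadows a global entry of the same name, and a built-in server cannot be shadowed by either.
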
204
215
 
205
216
  <p align="center"><img src="docs/promo-dashboard-extensions-add.png" alt="Add MCP server" width="780"></p>
@@ -208,11 +219,10 @@ npx pikiclaw@latest --doctor # environment check only
208
219
 
209
220
  - **Session workspace** — every session owns a directory; file attachments land there automatically.
210
221
  - **Resume, switch, classify** — multi-turn conversations, session classification (answer / proposal / implementation / blocked / …).
211
- - **Session-scoped MCP bridge** — built-in `im_list_files` / `im_send_file` for streaming files back to chat.
212
- - **GUI automation** (optional):
213
- - **Browser** — managed Chrome profile via `@playwright/mcp`; log in once, reuse credentials across tasks.
214
- - **Desktop (macOS)** — Appium Mac2 with `desktop_open_app`, `desktop_snapshot`, `desktop_click`, `desktop_type`, `desktop_screenshot`.
215
- - **Long-task hardening** — sleep prevention, watchdog, auto-restart, daemon mode.
222
+ - **Session-scoped MCP tools** — `im_list_files`, `im_send_file`, `im_ask_user`, and goal-management tools auto-injected into every stream.
223
+ - **Computer-use (browser)** — built-in `pikiclaw-browser` MCP wraps `@playwright/mcp` with a shared Chrome profile and a process-level supervisor; log in once, reuse credentials across tasks.
224
+ - **Computer-use (macOS desktop)** — built-in `peekaboo` MCP runs [Peekaboo](https://peekaboo.sh/) over Accessibility + ScreenCaptureKit; exposes `see`, `click`, `type`, `scroll`, `window`, `menu`, `app`, `dock`. Opt-in from Extensions; needs Accessibility + Screen Recording permissions. macOS only.
225
+ - **Long-task hardening** — sleep prevention, watchdog, auto-restart, daemon mode, channel supervisor.
216
226
 
217
227
  ---
218
228
 
@@ -220,14 +230,14 @@ npx pikiclaw@latest --doctor # environment check only
220
230
 
221
231
  | | pikiclaw | IDE assistants<br>(Cursor / Windsurf / Aider) | Cloud agents<br>(Devin / web Claude) | Single-agent IM bots |
222
232
  |---|---|---|---|---|
223
- | **Terminal** | IM + Web + future plug-ins | IDE only | Web app | One IM, one bot |
233
+ | **Terminal** | 7 IM channels + Web + future plug-ins | IDE only | Web app | One IM, one bot |
224
234
  | **Where the agent runs** | Your machine | Your machine | Vendor sandbox | Often vendor |
225
- | **Agent choice** | Claude Code · Codex · Gemini · Hermes · … | Bundled | Single | Single |
226
- | **Model choice** | Frontier + domestic Chinese | Vendor-controlled | Vendor-controlled | Single |
235
+ | **Agent choice** | Claude Code · Codex · Gemini · Hermes (ACP) · … | Bundled | Single | Single |
236
+ | **Model choice** | Frontier + domestic Chinese + any OpenAI-compatible | Vendor-controlled | Vendor-controlled | Single |
227
237
  | **Parallel agents** | **N agents × N windows × N workspaces** | One per IDE | Sequential | One |
228
238
  | **Files / tools** | Your files, your MCP, your CLIs | Your files | Sandbox | None / limited |
229
239
  | **Plug new terminal** | Add a `Channel` class | n/a | n/a | Fork |
230
- | **Plug new agent** | Add an `AgentDriver` | n/a | n/a | Fork |
240
+ | **Plug new agent** | Add an `AgentDriver` (CLI or ACP) | n/a | n/a | Fork |
231
241
  | **Self-bootstrapping** | **Yes — built with itself** | No | No | No |
232
242
 
233
243
  The shape that matters: **you stay in your environment, you keep your choice of brain, you run a swarm in parallel, and the orchestrator is the same one we use to build the orchestrator.**
@@ -240,10 +250,12 @@ The shape that matters: **you stay in your environment, you keep your choice of
240
250
  |---|---|
241
251
  | `/start` | Entry info, current agent, working directory |
242
252
  | `/sessions` | View, switch, or create sessions |
243
- | `/agents` | Switch agent |
253
+ | `/agents` | Switch agent (Claude · Codex · Gemini · Hermes) |
244
254
  | `/models` | View and switch model / reasoning effort |
245
255
  | `/mode` | Toggle plan mode (reasoning effort) |
246
256
  | `/switch` | Browse and switch working directory |
257
+ | `/workspaces` | Pick a saved workspace from the Dashboard's quick-pick list |
258
+ | `/goal` | Set or inspect a long-running, self-terminating session goal |
247
259
  | `/stop` | Stop current session |
248
260
  | `/status` | Runtime status, tokens, usage, session info |
249
261
  | `/host` | Host CPU / memory / disk / battery |
@@ -258,40 +270,30 @@ Plain text is forwarded to the current agent.
258
270
 
259
271
  ## Configuration
260
272
 
261
- - Persistent config: `~/.pikiclaw/setting.json`
262
- - The Dashboard is the primary configuration surface
263
- - Global MCP extensions: `~/.pikiclaw/setting.json` `extensions.mcp`
264
- - Workspace MCP extensions: standard `.mcp.json`
273
+ - Persistent config: `~/.pikiclaw/setting.json` — channels, agents, Providers/Profiles, workspaces, MCP extensions
274
+ - The Dashboard is the primary configuration surface; the terminal wizard (`--setup`) and `--doctor` exist for headless setups
275
+ - Global MCP extensions live under `extensions.mcp` in the setting file
276
+ - Workspace MCP extensions: standard `.mcp.json` in the project root
277
+ - Project skills: `.pikiclaw/skills/*/SKILL.md` (also picks up `.claude/commands/*.md`)
265
278
 
266
- <details>
267
- <summary><b>GUI automation setup (browser + macOS desktop)</b></summary>
268
-
269
- **Browser** is fully managed by the dashboard — a dedicated Chrome profile is created and reused. Log in to the sites you need once, every future agent session reuses those credentials.
279
+ **Computer-use** is gated by two toggles under Extensions:
270
280
 
271
- **macOS desktop** needs Appium Mac2:
272
-
273
- ```bash
274
- npm install -g appium
275
- appium driver install mac2
276
- appium
277
- ```
278
-
279
- Then grant macOS Accessibility permission to your terminal app.
280
-
281
- Env vars: `PIKICLAW_DESKTOP_GUI`, `PIKICLAW_DESKTOP_APPIUM_URL`.
282
-
283
- </details>
281
+ - `browserEnabled` — managed Chrome (Playwright). The first time an agent needs Chrome, pikiclaw creates a dedicated profile under `~/.pikiclaw` and reuses it across sessions. Log in to the sites you need once; every future session reuses those credentials.
282
+ - `peekabooEnabled` — macOS desktop (Peekaboo). When on (macOS only), pikiclaw spawns `@steipete/peekaboo`'s `peekaboo-mcp` binary and injects its tools. Grant the parent terminal **Accessibility** and **Screen Recording** in System Settings → Privacy & Security before flipping the toggle.
284
283
 
285
284
  ---
286
285
 
287
286
  ## Roadmap
288
287
 
289
- - **Hermes driver**: first-class plug-in for the Hermes agent
290
- - **ACP (Agent Client Protocol)** — unified driver for any ACP-compatible agent, replacing per-agent CLI parsing — see [ACP Migration Plan](docs/acp-migration.md)
288
+ Already shipped: Hermes driver · ACP (Agent Client Protocol) · Provider/Profile model vault · seven IM channels · computer-use (Playwright browser + Peekaboo macOS desktop).
289
+
290
+ - **More ACP agents** — every new ACP-compatible agent should drop in without a hand-written driver
291
291
  - **More terminals** — WhatsApp, dedicated mobile app, voice
292
292
  - **Deeper model layer** — agent-on-arbitrary-model wrappers for more domestic series
293
293
  - **Better tool ecosystem** — recommended MCP packs, skill templates, marketplace
294
- - **GUI co-ordination** — tighter browser + desktop tool interplay
294
+ - **Cross-platform computer-use** — Windows / Linux desktop drivers alongside the macOS Peekaboo bridge
295
+
296
+ See [ACP Migration Plan](docs/acp-migration.md) for the protocol-side details.
295
297
 
296
298
  ---
297
299
 
@@ -307,9 +309,8 @@ npm test
307
309
 
308
310
  ```bash
309
311
  npm run dev # local dev (--no-daemon, logs to ~/.pikiclaw/dev/dev.log)
310
- npm run build # production build
311
- npm test # unit tests
312
- npm run test:e2e # end-to-end tests
312
+ npm run build # production build (dashboard + tsc)
313
+ npm test # vitest run
313
314
  npx pikiclaw@latest --doctor # environment check
314
315
  ```
315
316
 
@@ -327,10 +328,12 @@ The project is built around layers that are *meant* to be extended. New terminal
327
328
 
328
329
  | Where | What you'd add |
329
330
  |---|---|
330
- | `src/agent/driver.ts`, `src/agent/drivers/*.ts` | A new agent driver |
331
+ | `src/agent/driver.ts`, `src/agent/drivers/*.ts`, `src/agent/acp-client.ts` | A new agent driver (CLI- or ACP-based) |
331
332
  | `src/channels/base.ts`, `src/channels/*/` | A new terminal / IM channel |
333
+ | `src/model/`, `src/model/injector.ts` | A new model provider or per-agent injection rule |
332
334
  | `src/dashboard/routes/*.ts` | A new dashboard API surface |
333
335
  | `src/agent/mcp/tools/*.ts`, `src/agent/mcp/bridge.ts` | New session-scoped MCP tools |
336
+ | `src/catalog/*.ts` | A recommended MCP server / CLI tool / skill repo |
334
337
 
335
338
  ---
336
339