npm - pikiclaw - Versions diffs - 0.3.51 → 0.3.53 - Mend

pikiclaw 0.3.51 → 0.3.53


Files changed (16)

package/README.md +41 -62
package/README.zh-CN.md +44 -65
package/dist/agent/drivers/claude-tui.js +42 -0
package/dist/agent/session.js +20 -4
package/dist/agent/stream.js +14 -1
package/dist/bot/bot.js +27 -0
package/dist/bot/command-ui.js +43 -8
package/dist/bot/commands.js +17 -3
package/dist/bot/session-hub.js +65 -16
package/dist/channels/dingtalk/bot.js +4 -4
package/dist/channels/discord/bot.js +4 -4
package/dist/channels/feishu/render.js +9 -2
package/dist/channels/slack/bot.js +4 -4
package/dist/channels/wecom/bot.js +4 -4
package/dist/channels/weixin/bot.js +3 -3
package/package.json +1 -1

package/README.md CHANGED

@@ -24,7 +24,7 @@ npx pikiclaw@latest
 <b>English</b> | <a href="README.zh-CN.md">简体中文</a>
 </p>
-<img src="docs/promo-dashboard-workspace.png" alt="Workspace" width="780">
+<img src="docs/promo-orchestrator.png" alt="Pikiclaw — AI-Native Agent Orchestrator" width="820">
 </div>
@@ -36,29 +36,14 @@ npx pikiclaw@latest
 The product is the orchestrator itself. Everything else simply plugs in. **And what's cooler is that this orchestrator is entirely self-bootstrapped**—pikiclaw is what we use to build pikiclaw.
-```text
-   Terminal Layer    Telegram · Feishu · WeChat · Slack · Discord · DingTalk · WeCom · Web Dashboard
-                              \__________________________|__________________________/
-                                                         v
-                                          ┌──────────────────────────────┐
-                                          │     pikiclaw orchestrator    │
-                                          └──────────────────────────────┘
-                                                         |
-                ┌────────────────────────────────────────┼────────────────────────────────────────┐
-                v                                        v                                        v
-           Agent Layer                              Model Layer                               Tool Layer
-   Claude Code · Codex · Gemini · Hermes    Claude · GPT · Gemini · DeepSeek           Skills · MCP · CLI
-   (driver registry · ACP · any agent)      Doubao · MiMo · MiniMax · OpenRouter       (global × workspace)
-                                            · any OpenAI-compatible proxy · …
-                                                         |
-                                                         v
-                                                   Your Machine
-```
+The diagram above maps the four layers we stitch together:
+- **Entry Points** — Telegram, Feishu, WeChat, Slack, Discord, DingTalk, WeCom, the Web Dashboard, and the local API/CLI are all first-class, co-equal terminals. New ones plug right in.
+- **Pluggable Agents** — Claude Code, Codex, Gemini, and Hermes ship as built-in drivers. Hermes speaks ACP (Agent Client Protocol); the registry accepts any CLI- or ACP-based agent through the same `AgentDriver` contract.
+- **Model Routing** — Frontier (Claude · GPT · Gemini), Chinese domestic (DeepSeek · Doubao · MiMo · MiniMax · Qwen), local runtimes (Ollama, mlx-lm on Apple Silicon), OpenRouter, and any OpenAI-compatible proxy. Providers + Profiles are a first-class vault with a read-only `models.dev` catalog and per-agent environment injection at spawn time.
+- **Tool Mesh** — Skills, MCP servers, CLI tools, web search, and desktop automation, intelligently merged across global × workspace scopes and silently injected into every session.
-- **Terminal Layer** — Telegram, Feishu, WeChat, Slack, Discord, DingTalk, WeCom, and the Web Dashboard are all first-class, co-equal entry points. New terminals plug right in.
-- **Agent Layer** — We use the official Claude Code, Codex, Gemini, and Hermes CLIs as underlying drivers. Hermes communicates via ACP (Agent Client Protocol); our flexible registry can accommodate virtually any agent.
-- **Model Layer** — Access Claude, GPT, Gemini, leading Chinese domestic models (DeepSeek, Doubao, MiMo, MiniMax), plus OpenRouter and any OpenAI-compatible proxy. Providers and Profiles are treated as a first-class layer with their own credential vault, a read-only models.dev catalog, and per-agent environment injection.
-- **Tool Layer** — Skills, MCP servers, and CLI tools are intelligently merged across global and workspace scopes, automatically injected into every session.
+Sitting in the middle is the **Pikiclaw Orchestration Core** — the runtime that owns routing, memory, observability, and the bot lifecycle so any terminal can talk to any agent on any model through any tool.
 ---
@@ -98,7 +83,7 @@ This is the shape that matters: one creator, with a swarm of AI agents at their
 <p align="center"><img src="docs/promo-demo.gif" alt="Demo: Ask Telegram, agent works locally, result returns to chat" width="780"></p>
-> **Web Dashboard** — A multi-pane workspace featuring a session list, conversation threads, tool-use traces, and an input composer (supporting 1, 2, 3, or 6-pane layouts).
+> **Web Dashboard** — Multi-pane workspace with a session list, live conversation threads, tool-use traces, file/image attachments, queued-task chips, and a unified input composer (1 / 2 / 3 / 6 pane layouts, light/dark theme, EN/中文).
 <p align="center"><img src="docs/promo-dashboard-workspace.png" alt="Web Dashboard workspace" width="780"></p>
@@ -113,13 +98,13 @@ This is the shape that matters: one creator, with a swarm of AI agents at their
 <img src="docs/promo-dashboard-im.png" alt="IM Access" width="780">
-> **Agents** — Manage installed agent CLIs, set your default agent, and configure per-agent models and reasoning effort levels.
+> **Agents** — Manage installed agent CLIs, set your default agent, configure per-agent models and reasoning effort, and bind a Profile to drive an agent on a non-native model.
 <img src="docs/promo-dashboard-agents.png" alt="Agents" width="780">
-> **Models** — A secure Providers + Profiles vault (supporting Claude, GPT, Gemini, DeepSeek, Doubao, MiMo, MiniMax, OpenRouter, and any OpenAI-compatible proxy), validated against the models.dev catalog and injected directly per agent.
+> **Models** — A secure Providers + Profiles vault (Claude · GPT · Gemini · DeepSeek · Doubao · MiMo · MiniMax · Qwen · OpenRouter · any OpenAI-compatible proxy), validated against the read-only `models.dev` catalog and injected per-agent at spawn time. Local backends (Ollama, mlx-lm on Apple Silicon) attach automatically the moment they're detected.
-> **Extensions** — Manage global MCP servers, community skills, and built-in automation for headless browsers and macOS desktop (Peekaboo).
+> **Extensions** — Manage global MCP servers, community skills, and built-in automation for headless browsers and macOS desktop (Peekaboo). Add servers via stdio, HTTP, or OAuth 2.1 with Dynamic Client Registration.
 <img src="docs/promo-dashboard-extensions.png" alt="Extensions" width="780">
@@ -151,8 +136,6 @@ cd your-workspace
 npx pikiclaw@latest
 ```
-<p align="center"><img src="docs/promo-install.gif" alt="One-command install" width="780"></p>
 This instantly opens the **Web Dashboard** at `http://localhost:3939`. From there, you can drive sessions in the browser, connect IM channels, configure agents and models, install MCP servers and skills, and manage system permissions. Everything else is just one click away.
 <details>
@@ -193,8 +176,9 @@ agent CLI versions).
 - **Self-Hosted Dev Loop** — pikiclaw was built using pikiclaw. The dev workflow *is* the product: drive the orchestrator from your phone, write code, ship a release, and iterate.
 - **Walk-Away Coding** — Kick off a massive refactoring task, close your laptop, and monitor/steer it from your phone over Telegram. The agent continues running locally, streaming results back to your chat.
 - **Multi-Agent Tag Team** — Let Claude Code draft an initial implementation, switch to Codex for an in-depth review, and finally hand it over to Gemini for a fresh perspective. Same files, same continuous session history.
-- **Domestic Model Routing** — When latency, cost, or compliance demands a non-frontier model, use a wrapper driver to run Claude Code effortlessly on DeepSeek or Doubao.
+- **Domestic Model Routing** — When latency, cost, or compliance demands a non-frontier model, use a wrapper driver to run Claude Code effortlessly on DeepSeek, Doubao, Qwen, or a fully local Ollama / mlx-lm endpoint.
 - **The Group Chat Agent** — Drop pikiclaw into a Feishu, Slack, Discord, or WeCom workgroup. The entire team shares one orchestrator, one project workspace, and a unified set of powerful skills.
+- **Codex-Generated Images On Tap** — Ask Codex to draft a poster, a diagram, a UI mock — the image streams back as a real attachment in the chat with a click-to-reveal Image Prompt so you can audit the exact text that went to the model. Iterate by replying, not by re-opening a browser.
 - **Computer-Use, Controlled by You** — Enable the managed Chrome (Playwright) and macOS desktop (Peekaboo, via Accessibility + ScreenCaptureKit) capabilities. The agent can suddenly `see` the screen, click, type, and manage windows, menus, and the Dock—while you steer it from your phone. Book a meeting, scrape a complex dashboard, run end-to-end tests, or drive any native macOS application.
 - **Skill-Driven Workflows** — Install community skills (`promote`, `snipe`, `review`, `security-review`, etc.) once, and trigger them instantly from any connected terminal using `/sk_<name>`.
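
Review note on the Domestic Model Routing item in this hunk: the README describes running an agent on a non-native model by injecting a bound Profile's environment variables when the agent process is spawned. The package's real types and the body of `resolveAgentInjection` are not visible in this diff, so the following is only a minimal sketch of that idea. `Profile`, `buildSpawnEnv`, and the `EXAMPLE_*` variable names are hypothetical and chosen for illustration.

```typescript
// Hypothetical shape: the package's real Provider/Profile types are not shown in this diff.
interface Profile {
  id: string;
  // Environment variables this profile wants the agent process to see.
  env: Record<string, string>;
}

// Build the environment for one agent process: start from the parent
// environment, then let the bound profile's variables win on conflict.
function buildSpawnEnv(
  parentEnv: Record<string, string | undefined>,
  profile: Profile | undefined,
): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const [key, value] of Object.entries(parentEnv)) {
    if (value !== undefined) merged[key] = value;
  }
  if (profile) {
    for (const [key, value] of Object.entries(profile.env)) {
      merged[key] = value;
    }
  }
  return merged;
}

// Placeholder values only; these are not real pikiclaw or provider settings.
const exampleProfile: Profile = {
  id: "deepseek-demo",
  env: { EXAMPLE_BASE_URL: "https://example.invalid/v1", EXAMPLE_API_KEY: "placeholder" },
};

const spawnEnv = buildSpawnEnv(
  { PATH: "/usr/bin", EXAMPLE_API_KEY: "from-shell" },
  exampleProfile,
);
```

In a Node.js process, an object like `spawnEnv` could be passed as the `env` option of `child_process.spawn`, which would leave the upstream client's own config files untouched. Whether pikiclaw does exactly this is an assumption.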
@@ -204,45 +188,46 @@ agent CLI versions).
 ### Terminal Layer
-- **Seven Native IM Channels** — Telegram, Feishu, WeChat (personal), Slack, Discord, DingTalk, and WeCom. Run one, several, or all of them simultaneously. Each channel is strictly isolated at the code level; adding a new one (like WhatsApp or a mobile app) requires zero changes to the others.
-- **Web Dashboard** — Drive sessions directly from your browser with the exact same conversational flow, tool-use tracing, and streaming experience as IM. Enjoy a multi-pane workspace (1/2/3/6 panes), light/dark themes, and full EN/中文 i18n support.
-- **Live Streaming Preview** — Watch messages update in place as the agent thinks. Long text auto-splits beautifully; images and files stream back to the UI in real time.
+- **Seven Native IM Channels** — Telegram, Feishu, WeChat (personal), Slack, Discord, DingTalk, and WeCom. Run one, several, or all of them simultaneously. Each channel is strictly isolated at the code level; adding a new one (WhatsApp, a mobile app, voice) requires zero changes to the others.
+- **Web Dashboard** — Drive sessions directly from your browser with the exact same conversational flow, tool-use tracing, and streaming experience as IM. Multi-pane workspace (1/2/3/6 panes), light/dark themes, and full EN/中文 i18n.
+- **Live Streaming Preview** — Watch messages update in place as the agent thinks. Long text auto-splits cleanly; thinking traces, tool calls, and plans surface as collapsible cards; images and files stream back to the UI in real time.
+- **Queue & Steer From One Composer** — Send while the stream is still running. New messages line up as queued chips you can preview, recall, or hand-steer; one click stops the active turn AND drains the rest of the queue.
 ### Agent Layer
 - **Official CLIs as Drivers** — Powered directly by Claude Code, Codex CLI, Gemini CLI, and Hermes (via ACP). We don't rewrite the agent core—you inherit upstream capabilities and Day-0 updates automatically.
 - **ACP-Native Architecture** — Hermes integrates natively through the [Agent Client Protocol](https://agentclientprotocol.com), spawning `hermes acp` over JSON-RPC stdio. Any future ACP-compatible agent plugs in the exact same way.
 - **Pluggable Driver Registry** — The only contract is `src/agent/driver.ts`. New CLI- or ACP-based agents can drop right in alongside our four built-in drivers.
-- **Per-Session Agent Switching** — Swap the "brain" on the fly without leaving your workspace.
-- **Steer & Interrupt** — Interrupt a heavy running task and force a queued message to the front of the line.
-- **Codex Human-in-the-Loop** — When Codex pauses to ask you a question, it forwards the prompt interactively to your IM. Reply directly in the chat, and the task resumes seamlessly.
-- **Persistent Goals** — Use `/goal` to set a long-running, session-scoped objective complete with a token budget. Supports pause/resume, and the agent will autonomously self-terminate only when it verifies the goal is complete.
+- **Per-Session Agent Switching** — Swap the "brain" on the fly without leaving your workspace; the same conversation history follows you to the next agent.
+- **Steer & Interrupt** — Interrupt a heavy running task and force a queued message to the front of the line, or stop everything for this session in one click.
+- **Codex Human-in-the-Loop** — When Codex pauses to ask you a question, the prompt is forwarded to your active terminal (IM or Dashboard). Reply inline and the task resumes seamlessly.
+- **Persistent Goals, Routed by Agent** — `/goal <objective>` keeps a session working toward a target until the agent self-audits completion. Codex uses its native `thread/goal/*` RPC with optional `budget=N` tokens and full pause/resume; Claude uses its in-process Stop hook with a Haiku judge and auto-clears when the condition is met; other agents fall back to pikiclaw's portable continuation loop.
+- **Image Generation, Surfaced End-to-End** — Codex's built-in `image_gen` tool (and Claude MCP / Gemini Imagen sources) lands as a real image attachment in the chat — not a wall of base64. The agent's actual `revised_prompt` rides along as a click-to-reveal **Image Prompt** disclosure in the Dashboard, so you can audit *why* the model drew what it drew. A "Generating image…" chip ticks alongside the assistant turn while the call is in flight.
 ### Model Layer
-- **Frontier + Domestic + Proxies** — Supports the Claude 4 family, GPT-5 / Codex, Gemini, DeepSeek, Doubao, MiMo, MiniMax, OpenRouter, and any custom OpenAI-compatible proxy endpoint.
-- **Providers & Profiles Vault** — A first-class data model that securely isolates credentials in `~/.pikiclaw/setting.json`. Browse a read-only models.dev catalog, validate keys with real provider probes, and bind a profile to an agent for automatic environment injection at spawn-time.
-- **Per-Session Model & Reasoning Effort** — Switch models or adjust reasoning capabilities dynamically via the Dashboard, `/models`, or `/mode`.
-- **Per-Agent Deep Injection** — `resolveAgentInjection(agentId)` forces the active profile's environment variables down at spawn time. This means you can run Claude Code on top of DeepSeek or Doubao without ever touching the upstream client's config.
+- **Frontier + Domestic + Local + Proxies** — Frontier (Claude · GPT-5/Codex · Gemini), Chinese domestic (DeepSeek · Doubao · MiMo · MiniMax · Qwen), local runtimes (Ollama, mlx-lm on Apple Silicon), OpenRouter, and any custom OpenAI-compatible proxy.
+- **Providers & Profiles Vault** — Credentials are isolated in `~/.pikiclaw/setting.json`. Browse a read-only `models.dev` catalog, validate keys with real provider probes, and bind a Profile to an agent for automatic environment injection at spawn time.
+- **Local Models, Zero-Config Attach** — Detected Ollama or mlx-lm backends auto-attach as a Provider — no extra wiring. The Dashboard tile shows status, install hints (brew/pipx), the exact `ollama pull` / `mlx_lm.server` command, and RAM headroom warnings against the host's total memory.
+- **Per-Session Model & Reasoning Effort** — Switch models or reasoning effort live via the Dashboard, `/models`, or `/mode`. Effort levels are per-agent (Claude: low → max; Codex: low → very high; Hermes: minimal → very high).
+- **Per-Agent Deep Injection** — `resolveAgentInjection(agentId)` forces the bound Profile's env vars down at spawn time. Run Claude Code on top of DeepSeek, Doubao, or a local Ollama model without ever editing the upstream client's config.
 ### Tool Layer
-- **Robust Skills System** — Project-specific skills live safely in `.pikiclaw/skills/*/SKILL.md` (and we fully support legacy `.claude/commands/*.md` formats). Install community packages with one click from GitHub (`owner/repo`) or browse our curated packs (like Anthropic Official, Vercel Agent Skills, etc.). Trigger them anywhere with `/skills` and `/sk_<name>`.
-- **Massive MCP Server Ecosystem** — Browse the [MCP Registry](https://registry.modelcontextprotocol.io), add custom stdio or HTTP servers, enforce real handshake health-checks, and utilize OAuth 2.1 with Dynamic Client Registration. Our recommended catalog flawlessly covers GitHub, Atlassian, Notion, Linear, Sentry, Cloudflare, Slack, Feishu/Lark, Stripe, Hugging Face, Gamma, Brave Search, Perplexity, Filesystem, SQLite, and PostgreSQL. Furthermore, we ship with two built-in, hyper-powerful computer-use servers: `pikiclaw-browser` (driving Chrome via Playwright) and `peekaboo` (driving the macOS GUI via Peekaboo).
-- **Seamless CLI Tool Integration** — Auto-detects versions and authentication states for popular CLIs. We natively support OAuth-web login handoffs for browser-based authentications, routing everything smoothly through the agent's standard tool surface.
-- **Session-Scoped MCP Bridge** — Foundational tools like `im_list_files`, `im_send_file`, `im_ask_user`, alongside the managed browser and macOS desktop tools (when enabled), are automatically injected deep into every single session you launch.
-- **Two-Tier Merge Resolution** — Tool scopes follow a simple rule: `global < workspace < built-in`. The engine automatically resolves and merges these, applying them silently to every session.
-<p align="center"><img src="docs/promo-dashboard-extensions-add.png" alt="Add MCP server" width="780"></p>
+- **Robust Skills System** — Project skills live in `.pikiclaw/skills/*/SKILL.md` (legacy `.claude/commands/*.md` still works). Install community packs in one click from GitHub (`owner/repo`) or pick from our curated set (Anthropic Official, Vercel Agent Skills, etc.). Trigger anywhere with `/skills` and `/sk_<name>`.
+- **Massive MCP Server Ecosystem** — Browse the [MCP Registry](https://registry.modelcontextprotocol.io), add custom stdio or HTTP servers, enforce real-handshake health checks, and authenticate with OAuth 2.1 + Dynamic Client Registration. The recommended catalog covers GitHub, Atlassian, Notion, Linear, Sentry, Cloudflare, Slack, Feishu/Lark, Stripe, Hugging Face, Gamma, Brave Search, Perplexity, Filesystem, SQLite, and PostgreSQL — plus two built-in computer-use servers we ship ourselves: `pikiclaw-browser` (Chrome via Playwright) and `peekaboo` (macOS GUI via Peekaboo).
+- **Seamless CLI Tool Integration** — Auto-detects versions and authentication state for popular CLIs (gh, brew, npm, uv, …). OAuth-web login handoffs route through the agent's normal tool surface.
+- **Session-Scoped MCP Bridge** — Foundational tools (`im_list_files`, `im_send_file`, `im_ask_user`, `goal_get`, `goal_update`) plus the managed browser and macOS desktop tools (when enabled) are auto-injected into every session you launch.
+- **Three-Way Merge Resolution** — Scope precedence is simple: `global < workspace < built-in`. The engine resolves and merges these silently for every session.
 ### Runtime & Developer Experience
-- **Dedicated Session Workspaces** — Every session gets its own isolated directory; file attachments and generated assets drop there automatically.
-- **Resume, Switch, and Classify** — Flawless multi-turn conversation support with smart session classification (identifying answers, proposals, implementations, or blocked states).
-- **Auto-Injected Base Tools** — Core MCP tools like file listing, sending, user prompting, and goal tracking are hard-wired into every stream.
-- **Computer-Use (Browser Engine)** — The built-in `pikiclaw-browser` MCP is a hyper-charged wrapper over `@playwright/mcp`. It includes a process-level supervisor and shares an isolated Chrome profile. Log in to your tools once, and reuse those authenticated sessions across all future tasks!
-- **Computer-Use (macOS Desktop)** — Enable the `peekaboo` MCP built-in server (macOS only) to unleash the [Peekaboo](https://peekaboo.sh/) framework over Accessibility and ScreenCaptureKit APIs. It exposes a god-mode suite of tools: `see`, `click`, `type`, `scroll`, `window`, `menu`, `app`, and `dock`. Requires explicit OS-level permissions but grants unprecedented control.
-- **Hardened for Long Tasks** — Built with sleep prevention, watchdog timers, auto-restarts, daemon modes, and a robust channel supervisor. You can walk away knowing your marathon tasks are protected by an ironclad runtime.
+- **Dedicated Session Workspaces** — Every session gets its own isolated directory; uploaded files and generated assets (including agent-produced images) drop there automatically.
+- **Resume, Switch, and Classify** — Multi-turn conversations resume cleanly. Sessions are auto-classified (answer, proposal, implementation, blocked) and the workspace list sorts by recent activity across all installed agents.
+- **Auto-Injected Base Tools** — `im_*` (file listing, sending, asking the user) and `goal_*` tools are hard-wired into every stream — the agent can hand a file back to your IM or pause to ask a question without you wiring anything up.
+- **Computer-Use (Browser Engine)** — The built-in `pikiclaw-browser` MCP wraps `@playwright/mcp` with a process-level supervisor and a shared, isolated Chrome profile. Log in to your tools once and reuse those authenticated sessions across every future task.
+- **Computer-Use (macOS Desktop)** — Enable the `peekaboo` MCP server (macOS only) to unleash the [Peekaboo](https://peekaboo.sh/) framework over Accessibility and ScreenCaptureKit. Tools include `see`, `click`, `type`, `scroll`, `window`, `menu`, `app`, `dock`, plus the `agent` sub-agent for goal-directed control. Requires Accessibility + Screen Recording permission in System Settings.
+- **Hardened for Long Tasks** — Sleep prevention, watchdog timers, auto-restart, daemon mode, and a channel supervisor. Restart is blocked while tasks are active so a hot reload never kills your marathon job.
 ---
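
Review note on the Tool Layer text in this hunk: the README states the scope precedence as `global < workspace < built-in` but does not show how conflicts are resolved. One plausible reading is that entries are keyed by name and a higher-precedence scope overrides a lower one. The sketch below assumes that reading; `ToolEntry`, `Scope`, and `mergeScopes` are hypothetical names, not the package's API.

```typescript
// Hypothetical tool entry; real pikiclaw config shapes are not shown in this diff.
interface ToolEntry {
  name: string;
  command: string;
}

type Scope = "global" | "workspace" | "built-in";

// Lowest precedence first, matching `global < workspace < built-in`.
const SCOPE_ORDER: Scope[] = ["global", "workspace", "built-in"];

// Walk the scopes from lowest to highest precedence so that a later
// scope replaces any earlier entry with the same name.
function mergeScopes(
  scopes: Partial<Record<Scope, ToolEntry[]>>,
): Map<string, ToolEntry & { scope: Scope }> {
  const resolved = new Map<string, ToolEntry & { scope: Scope }>();
  for (const scope of SCOPE_ORDER) {
    for (const entry of scopes[scope] ?? []) {
      resolved.set(entry.name, { ...entry, scope });
    }
  }
  return resolved;
}

const mergedTools = mergeScopes({
  global: [
    { name: "search", command: "global-search" },
    { name: "notes", command: "global-notes" },
  ],
  workspace: [{ name: "search", command: "workspace-search" }],
  "built-in": [{ name: "browser", command: "builtin-browser" }],
});
```

Here the workspace definition of `search` replaces the global one, while `notes` and `browser` pass through from their own scopes. Recording the winning scope on each entry makes it easy to show in a UI where a tool came from.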
@@ -253,7 +238,7 @@ agent CLI versions).
 | **Terminal Access** | 7 IM channels + Web + Extensible | Locked inside the IDE | Confined to a Web app | One specific IM app |
 | **Execution Environment** | Your local machine | Your local machine | Vendor's remote sandbox | Usually vendor servers |
 | **Agent Flexibility** | Claude Code, Codex, Gemini, Hermes (ACP), etc. | Locked in | Single | Single |
-| **Model Freedom** | Frontier models, domestic giants, OpenAI-proxies | Controlled by the platform | Controlled by the vendor | Single, hardcoded |
+| **Model Freedom** | Frontier · Chinese domestic · local (Ollama, mlx-lm) · OpenAI-compatible proxies | Controlled by the platform | Controlled by the vendor | Single, hardcoded |
 | **Concurrency Power** | **N Agents × N Windows × N Workspaces** | One agent per IDE window | Strictly sequential | Single thread |
 | **Files & Tools Access** | Your entire local disk, your MCPs, your CLIs | Local project files | Heavily sandboxed | None or extremely limited |
 | **Add a New Terminal** | Drop in a simple `Channel` class | Impossible | Impossible | Requires a hard fork |
@@ -305,13 +290,7 @@ The shape that truly matters: **You never have to leave your preferred environme
 ## Roadmap
-**Already Shipped:** Hermes driver integration · ACP (Agent Client Protocol) · Secure Provider/Profile vault · Seven native IM channels · Computer-use via Playwright and Peekaboo (macOS).
-- **More ACP Agents** — Ensuring any new ACP-compatible agent can drop in with zero code changes.
-- **Broader Terminal Ecosystem** — Adding support for WhatsApp, a dedicated mobile app, and voice interfaces.
-- **Deeper Model Wrapping** — Building agent-on-arbitrary-model wrappers to support a wider array of domestic and open-source models seamlessly.
-- **Richer Tool Ecosystem** — Releasing official MCP packs, skill templates, and a community marketplace.
-- **Cross-Platform Computer-Use** — Extending desktop control drivers beyond macOS to support Windows and Linux.
+- **SupporterAgent** — A high-level meta-agent layered on top of the existing orchestration stack (Terminals × Agents × Models × Tools). It takes a complex objective and centrally owns the full loop: decomposition and planning, scheduling the right sub-agents on the right models with the right tools, watching their streams as they run, and stepping in to correct course when a sub-agent stalls, drifts, or contradicts the plan. The aim is a step-change in how reliably pikiclaw can drive long-horizon, multi-agent work without a human babysitting every turn.
 ---

package/README.zh-CN.md CHANGED

@@ -24,7 +24,7 @@ npx pikiclaw@latest
 <a href="README.md">English</a> | <b>简体中文</b>
 </p>
-<img src="docs/promo-dashboard-workspace.png" alt="Workspace" width="780">
+<img src="docs/promo-orchestrator.png" alt="Pikiclaw: AI-Native Agent Orchestrator" width="820">
 </div>
@@ -36,29 +36,14 @@ npx pikiclaw@latest
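 The core product is the orchestrator itself; every other component is pluggable. **Even cooler, the orchestrator was built by itself**: pikiclaw is the tool we use to develop pikiclaw.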
-```
-   Terminal Layer    Telegram · Feishu · WeChat · Slack · Discord · DingTalk · WeCom · Web Dashboard
-                              \__________________________|__________________________/
-                                                         v
-                                          ┌──────────────────────────────┐
-                                          │     pikiclaw orchestrator    │
-                                          └──────────────────────────────┘
-                                                         |
-                ┌────────────────────────────────────────┼────────────────────────────────────────┐
-                v                                        v                                        v
-           Agent Layer                              Model Layer                               Tool Layer
-   Claude Code · Codex · Gemini · Hermes    Claude · GPT · Gemini · DeepSeek           Skills · MCP · CLI
-   (driver registry · ACP · any agent)      Doubao · MiMo · MiniMax · OpenRouter       (global × workspace)
-                                            · any OpenAI-compatible proxy · …
-                                                         |
-                                                         v
-                                                   Your Machine
-```
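+The architecture diagram above outlines the four layers we stitch together:
+- **Entry Points**: Telegram, Feishu, WeChat, Slack, Discord, DingTalk, WeCom, the Web Dashboard, and the local API / CLI are all first-class terminals on a fully equal footing. Adding a new terminal has no effect on the other channels.
+- **Pluggable Agents**: Claude Code, Codex, Gemini, and Hermes all ship as built-in drivers. Hermes uses ACP (Agent Client Protocol); any CLI- or ACP-based agent can join the registry through the same `AgentDriver` contract.
+- **Model Routing**: Frontier models (Claude · GPT · Gemini), the Chinese domestic lineup (DeepSeek · Doubao · MiMo · MiniMax · Qwen), local inference (Ollama, and mlx-lm on Apple Silicon), OpenRouter, and any OpenAI-compatible proxy. Providers + Profiles form a first-class credential vault with a read-only `models.dev` catalog and per-agent environment variable injection at spawn time.
+- **Tool Mesh**: Skills, MCP servers, CLI tools, web search, desktop automation, and more are intelligently merged across the "global × workspace" scopes and quietly injected into every session.
-- **Terminal Layer**: Telegram, Feishu, WeChat, Slack, Discord, DingTalk, WeCom, and the Web Dashboard are all first-class entry points. New kinds of terminals can be plugged in at any time.
-- **Agent Layer**: The official Claude Code / Codex / Gemini / Hermes CLIs serve as the underlying drivers. Hermes uses ACP (Agent Client Protocol); the registry mechanism allows any other agent to be plugged in seamlessly.
-- **Model Layer**: Claude / GPT / Gemini, the Chinese domestic series (DeepSeek, Doubao, MiMo, MiniMax), plus OpenRouter and any OpenAI-compatible proxy service. Providers and Profiles are a first-class module with its own credential vault, a models.dev catalog, and environment variable injection dedicated to each agent.
-- **Tool Layer**: Skills, MCP servers, and CLI tools. They are intelligently merged across the global and workspace levels and automatically injected into every session.
+At the very center of all this is the **Pikiclaw Orchestration Core**, which manages routing, memory, observability, and the bot lifecycle in one place, so that any terminal can use any tool to run any agent on any model.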
 ---
@@ -98,7 +83,7 @@ npx pikiclaw@latest
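 <p align="center"><img src="docs/promo-demo.gif" alt="Demo: start a task from Telegram, the agent runs locally, and the result returns to the chat" width="780"></p>
-> **Web Dashboard**: A multi-pane workspace with a session list, conversation stream, tool-call traces, and an input area (supports 1 / 2 / 3 / 6 pane layouts).
+> **Web Dashboard**: A multi-pane workspace that combines a session list, live conversation stream, tool-call traces, file/image attachments, queued-task chips, and a unified input box (supports 1 / 2 / 3 / 6 pane layouts, light and dark themes, and Chinese/English i18n).
 <p align="center"><img src="docs/promo-dashboard-workspace.png" alt="Web Dashboard workspace" width="780"></p>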
@@ -113,13 +98,13 @@ npx pikiclaw@latest
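 <img src="docs/promo-dashboard-im.png" alt="IM Access" width="780">
-> **Agent Management**: The list of installed agent CLIs, the default agent setting, and independent model / reasoning-effort configuration for each.
+> **Agent Management**: The list of installed agent CLIs, the default agent setting, and per-agent model / reasoning-effort configuration; a Profile can be bound so that an agent runs on a non-native model.
 <img src="docs/promo-dashboard-agents.png" alt="Agent" width="780">
-> **Model Configuration**: A credential vault that combines Providers + Profiles (covering Claude, GPT, Gemini, DeepSeek, Doubao, MiMo, MiniMax, OpenRouter, and any OpenAI-compatible proxy), with validation against the models.dev catalog and low-level environment variable injection for a specified agent.
+> **Model Configuration**: A credential vault that combines Providers + Profiles (covering Claude, GPT, Gemini, DeepSeek, Doubao, MiMo, MiniMax, Qwen, OpenRouter, and any OpenAI-compatible proxy), with validation against the read-only `models.dev` catalog and targeted injection of the matching environment variables when an agent starts; local backends such as Ollama / mlx-lm (Apple Silicon) are automatically attached as a Provider once detected.
-> **Extensions**: Centralized management of global MCP servers, community Skills, the built-in managed browser environment, and macOS desktop (Peekaboo) automation.
+> **Extensions**: Centralized management of global MCP servers, community Skills, the built-in managed browser environment, and macOS desktop (Peekaboo) automation, with servers attachable via stdio, HTTP, or OAuth 2.1 with Dynamic Client Registration.
 <img src="docs/promo-dashboard-extensions.png" alt="Extensions" width="780">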
@@ -151,8 +136,6 @@ cd your-workspace
 npx pikiclaw@latest
 ```
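-<p align="center"><img src="docs/promo-install.gif" alt="One-command install" width="780"></p>
 This command automatically opens the **Web Dashboard** at `http://localhost:3939`. From there you can drive any session in the browser, connect the IM channels you need, configure agents and models, install MCP servers and Skills, and manage all system permissions. Everything else is one click away.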
 <details>
@@ -192,8 +175,9 @@ docker run -d --name pikiclaw -p 3939:3939 \
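 - **Self-Contained Dev Loop**: pikiclaw was developed with pikiclaw itself. This development flow is the product in its most original form: you can even operate the orchestrator from your phone while away, having the agent write code, publish releases, and keep iterating.
 - **Walk-Away Coding**: Kick off a very long-running, large refactoring task, close your laptop, and monitor and control it from your phone over Telegram while you are out. The agent keeps running on your local machine, and results stream back to the chat in real time.
 - **Multi-Agent Relay in One Workspace**: Have Claude Code write a first draft of a feature, hand it seamlessly to Codex for an in-depth review, and finally pass it to Gemini for optimization suggestions from a completely different angle. All of this happens in the same code directory and the same session history.
-- **Flexible Domestic Model Routing**: When your task has hard requirements on latency, cost, or compliance, the model driver wrapper layer lets Claude Code run directly on the affordable and fast DeepSeek or Doubao models.
+- **Flexible Domestic / Local Model Routing**: When your task has hard requirements on latency, cost, or compliance, the model injection layer lets Claude Code run directly on DeepSeek, Doubao, Qwen, or even a fully offline Ollama / mlx-lm.
 - **Group-Chat Collaboration Agent**: Add pikiclaw to a Feishu / Slack / Discord / WeCom group chat; the whole team can share the same orchestrator, a unified project workspace, and a set of team-specific skills.
+- **Codex Image Generation on Demand**: Ask Codex for a poster, a diagram, or a UI sketch, and the result streams back into the chat as a real image attachment, with an expandable "Image Prompt" so you can check at any time the instruction the model actually received. The next iteration only requires continuing the chat, without switching back to a browser.
 - **Fully Controlled Computer-Use**: Enable the built-in managed Chrome browser (based on Playwright) and managed macOS desktop environment (based on Peekaboo, via Accessibility and ScreenCaptureKit). The agent instantly gains "sight" (`see`), can click and type freely, and can manage windows, the menu bar, and the Dock, while you still steer it precisely from your phone. Booking a meeting, scraping data from a dashboard, running an end-to-end automated test, or driving any native macOS application are all within reach.
 - **Skill-Based Automated Workflows**: Install commonly used community skills once (for example `promote`, `snipe`, `review`, `security-review`), and from then on just type `/sk_<name>` in any connected terminal to trigger them.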
@@ -203,45 +187,46 @@ docker run -d --name pikiclaw -p 3939:3939 \
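 ### Terminal Layer
-- **Seven Mainstream IM Channels**: Full integration with Telegram, Feishu, WeChat (personal account), Slack, Discord, DingTalk, and WeCom. You can enable just one or run several at once. Each channel is completely isolated in the underlying code; adding new channels later (such as WhatsApp or your own mobile app) will not affect the stability of existing logic.
-- **Web Dashboard**: Drive all sessions directly in a web browser, with the same natural conversation, tool-call trace tracking, and fast streaming feedback as IM. The dashboard offers 1 / 2 / 3 / 6 multi-window concurrent layouts, adaptive dark/light themes, and Chinese/English (i18n) bilingual support.
-- **Live Streaming Preview**: Whenever the agent starts thinking, the message refreshes in place in real time; very long text is split automatically into readable segments; generated images and files are pushed back to the front end immediately and unchanged.
+- **Seven Mainstream IM Channels**: Telegram, Feishu, WeChat (personal account), Slack, Discord, DingTalk, and WeCom. Enable one, several, or all of them. Each channel is physically isolated at the code level; adding new channels later (WhatsApp, an in-house mobile app, a voice terminal) will not touch the others.
+- **Web Dashboard**: Drive all sessions directly in the browser; the conversation stream, tool-call traces, and streaming feedback are identical to IM. Offers 1 / 2 / 3 / 6 pane layouts, light and dark themes, and Chinese/English i18n.
+- **Live Streaming Preview**: Messages update in place while the agent thinks; very long text is split automatically; thinking, tool calls, and plans are each collapsed into cards; images and files are pushed back to the front end in real time, unchanged.
+- **Queue and Steer From One Input Box**: You can keep sending while the previous message is still running; new messages appear as queued chips that you can preview, recall, or have the agent run immediately ahead of the queue; one click stops both the current task and all queued tasks.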
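 ### Agent layer
-- **Official CLIs as native underlying drivers**: built-in support for Claude Code, Codex CLI, Gemini CLI, and Hermes (via the ACP protocol). We firmly refuse to build our own "wrapper Agent engine": whenever upstream ships a new feature, you get it immediately and without loss.
-- **Native ACP support**: Hermes is integrated entirely through the [Agent Client Protocol](https://agentclientprotocol.com), launching `hermes acp` over standard JSON-RPC (stdin/stdout streams). This means any future ACP-compatible Agent can also land on the platform right away.
-- **Freely pluggable registry mechanism**: across the whole codebase, the only mandatory contract for this part is `src/agent/driver.ts`. New Agents, whether built on a traditional CLI or the emerging ACP protocol, can join the registry at any time alongside the four built-in engines.
-- **Seamless session-level Agent switching**: without leaving the current code workspace, you can swap the AI for a "brain" with different characteristics at any point mid-session.
-- **Takeover and steering (Steer) control**: interrupt a heavy task in progress at will and let an urgent queued message jump to the front of the queue.
-- **Codex human-in-the-loop mechanism**: when Codex needs you to confirm operation details, the prompts are automatically converted into interactive questions in IM. Just reply in your usual chat box and the paused task resumes.
-- **Persistent goals**: use the `/goal` command to set a long-lived completion goal with an explicit Token budget for a session. Tasks support pause/resume, and the Agent ends its current run only once its own audit determines the goal has been met.
+- **Official CLIs as native underlying drivers**: built-in support for Claude Code, Codex CLI, Gemini CLI, and Hermes (via the ACP protocol). We firmly refuse to build our own "wrapper Agent engine": as soon as upstream updates, you benefit immediately.
+- **Native ACP support**: Hermes is integrated entirely through the [Agent Client Protocol](https://agentclientprotocol.com), launching `hermes acp` over JSON-RPC stdio. Any future ACP-compatible Agent can also land right away.
+- **Pluggable driver registry**: the only contract in the whole codebase is `src/agent/driver.ts`. New Agents, whether CLI-based or ACP-based, can be added alongside the four built-in engines.
+- **Session-level Agent switching**: without leaving the current workspace, you can swap the AI's "brain" mid-session, and the conversation history stays in effect.
+- **Takeover and steering (Steer)**: interrupt a heavy task in progress at any time and let an urgent queued message jump to the front, or stop the whole session with one click.
+- **Codex human-in-the-loop**: when Codex needs an operation confirmed, the prompt is automatically forwarded to your active terminal (IM or Dashboard). Reply in place and the paused task continues.
+- **Persistent goal system, routed per Agent**: `/goal <objective>` keeps the session working until the Agent's self-review is satisfied. Codex uses the native `thread/goal/*` RPC, with an optional `budget=N` Token budget and pause / resume support; Claude uses a native Stop hook plus Haiku review and clears the goal automatically on completion; other Agents use pikiclaw's own portable continuation.
+- **End-to-end image generation handling**: images produced by Codex's built-in `image_gen` (as well as Claude MCP / Gemini Imagen) land in the chat as real image attachments rather than a blob of base64. The `revised_prompt` the Agent actually sent to the image model is attached next to the image as an expandable "**Image Prompt**"; while an image is being generated, a "Generating image…" chip also flashes under the assistant reply to tell you why this turn is slow.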
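 ### Model layer
-- **Frontier, Chinese, and proxy models all covered**: the Claude family, GPT-5 / Codex, and Gemini; the leading Chinese models DeepSeek, Doubao, MiMo, and MiniMax; plus native compatibility with OpenRouter and any third-party proxy that speaks the OpenAI-style API.
-- **Providers & Profiles credential vault**: a strictly isolated data-protection model in which API credentials are encrypted and stored separately in a dedicated area of `~/.pikiclaw/setting.json`. You can browse the read-only models.dev catalog, validate keys with a real API probe, and then bind the Profile to any Agent so that its environment variables are injected in isolation at run time.
-- **Free session-level configuration**: both the model itself and the reasoning effort for a particularly hard task can be switched on the fly in the Dashboard, or by sending `/models` and `/mode`.
-- **Agent-level forced injection**: the core function `resolveAgentInjection(agentId)` forcibly overrides the corresponding environment variables at the very start of launch. This means you can have Claude Code run entirely on cost-effective DeepSeek or Doubao models without changing a single line of configuration in the upstream client.
+- **Frontier + Chinese + local + proxies**: frontier models (Claude · GPT-5 / Codex · Gemini), Chinese models (DeepSeek · Doubao · MiMo · MiniMax · Qwen), local inference (Ollama, and mlx-lm on Apple Silicon), OpenRouter, and any OpenAI-compatible proxy.
+- **Providers & Profiles credential vault**: API credentials are stored in isolation in `~/.pikiclaw/setting.json`. Browse models in the read-only `models.dev` catalog, validate keys with a real Provider probe, then bind a Profile to an Agent so the matching environment variables are injected automatically at launch.
+- **Zero-config local models**: when an Ollama or mlx-lm backend is detected, it is mounted as a Provider automatically with no extra configuration. The Dashboard card shows status, the `brew/pipx` install command, the matching `ollama pull` / `mlx_lm.server` model-pull command, and a RAM headroom hint relative to the machine's memory.
+- **Session-level model / reasoning-effort switching**: switch in real time from the Dashboard, `/models`, or `/mode`. Reasoning effort levels are provided per Agent (Claude: low → max; Codex: low → very high; Hermes: minimal → very high).
+- **Agent-level deep environment injection**: `resolveAgentInjection(agentId)` force-writes the bound Profile's environment variables at launch. This means you can have Claude Code run entirely on DeepSeek, Doubao, or even local Ollama without touching the upstream CLI's configuration.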
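 ### Tools layer
-- **Powerful Skills system**: each project-specific skill lives in `.pikiclaw/skills/*/SKILL.md` (the standard `.claude/commands/*.md` format is also fully supported for backward compatibility). Skills can be pulled and installed in one step from a public GitHub repository (`owner/repo`), or picked from our curated packs (such as the Anthropic official pack or the Vercel Agent Skills pack). Send `/skills` to inspect all currently loaded skills, then trigger one with `/sk_<name>`.
-- **Broad mainstream MCP server support**: browse the global [MCP Registry](https://registry.modelcontextprotocol.io) or manually add local stdio and remote HTTP servers; the framework enforces a real handshake health check and supports OAuth 2.1 with dynamic client registration, with fine-grained control over which scopes are enabled. The curated catalog currently covers GitHub, Atlassian, Notion, Linear, Sentry, Cloudflare, Slack, Feishu/Lark, Stripe, Hugging Face, Gamma, Brave Search, Perplexity, local filesystem access, SQLite, and PostgreSQL. The system also bundles two Computer-use servers: `pikiclaw-browser`, which drives Chrome through Playwright, and `peekaboo`, which controls the macOS GUI through Peekaboo.
-- **Seamless integration with popular CLI tools**: automatic detection of version compatibility and of login state. For CLIs that authenticate through the browser, OAuth-web authorization is handed off seamlessly. Everything is then invoked through the Agent's native call interface.
-- **Global session-level MCP bridge**: the base tools `im_list_files`, `im_send_file`, and `im_ask_user`, together with the built-in browser and macOS desktop automation tools described above (once their safety switch is enabled), are injected automatically into every session.
-- **Minimal two-scope permission merge**: all tool scope grants follow one rule: `global < workspace < built-in`. The engine merges them automatically each time and applies the result to subsequent conversations.
-<p align="center"><img src="docs/promo-dashboard-extensions-add.png" alt="Add MCP server" width="780"></p>
+- **Powerful Skills system**: project-specific skills live in `.pikiclaw/skills/*/SKILL.md` (the legacy `.claude/commands/*.md` format is also supported). Install community packs from GitHub (`owner/repo`) in one step, or pick from our curated official packs (Anthropic Official, Vercel Agent Skills, and others). Send `/skills` in any terminal to browse, and `/sk_<name>` to trigger one.
+- **Large MCP ecosystem**: browse the [MCP Registry](https://registry.modelcontextprotocol.io), manually add stdio / HTTP servers, enforce a real handshake health check, and use OAuth 2.1 with dynamic client registration. The curated catalog covers GitHub, Atlassian, Notion, Linear, Sentry, Cloudflare, Slack, Feishu/Lark, Stripe, Hugging Face, Gamma, Brave Search, Perplexity, Filesystem, SQLite, and PostgreSQL, plus our two bundled computer-use servers: `pikiclaw-browser` (Playwright-driven Chrome) and `peekaboo` (Peekaboo-driven macOS GUI).
+- **Seamless integration with mainstream CLI tools**: automatic detection of version and login state (gh, brew, npm, uv, and others); the OAuth-web browser authorization flow is handed off seamlessly at the Agent call surface.
+- **Session-level MCP bridge**: base tools such as `im_list_files`, `im_send_file`, `im_ask_user`, `goal_get`, and `goal_update`, plus the browser and macOS desktop tools once enabled, are injected automatically into every session.
+- **Three-layer merge rule**: tool scopes always follow `global < workspace < built-in`. The engine merges them automatically and the result takes effect transparently.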
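 ### Runtime & DX
-- **Dedicated per-session project workspace**: each new session gets its own isolated directory on disk from the engine, and attachments land there directly.
-- **Multi-turn session history control**: resume and switch freely, with a semantic session classification scheme (answer, proposal, implementation, blocked, and similar clear status labels).
-- **Self-injected base tools**: `im_list_files`, `im_send_file`, and `im_ask_user`, plus the goal-tracking tools, are mounted automatically just before launch.
-- **Computer-use (browser engine layer)**: the system bundles the `pikiclaw-browser` MCP, a wrapper around `@playwright/mcp` that adds process-level Supervisor oversight and a dedicated Chrome profile shared across task processes. Log in to frequently used sites once; every future task inherits that data and connects without logging in again.
-- **Computer-use (macOS desktop control layer)**: once you enable the `peekaboo` MCP in the extensions panel and grant the terminal "Accessibility" and "Screen Recording" permissions in System Settings (macOS only), the [Peekaboo](https://peekaboo.sh/) framework exposes a full set of system control tools: sight (`see`), precise clicking (`click`), typing (`type`), scrolling (`scroll`), system-wide window control (`window`), the main menu (`menu`), application lifecycle (`app`), and even the Dock (`dock`).
-- **Solid protection for long-running tasks**: built-in sleep prevention, a watchdog module, automatic restart on failure, daemon mode, and a channel Supervisor service together keep even very long unattended tasks stable.
+- **Independent session workspaces**: every session has its own isolated directory; uploaded files and Agent-generated artifacts (including images) land there.
+- **Resumable / switchable / auto-classified**: resume and switch multi-turn sessions freely, with automatic semantic classification (answer / proposal / implementation / blocked); the workspace session list is sorted by most recent activity and covers all installed Agents.
+- **Base tools injected automatically**: `im_*` (list files / send file / ask user) and `goal_*` are available by default in every stream, so the Agent needs no configuration to push files back to your IM or to stop midway and ask you a question.
+- **Computer-use (browser layer)**: the built-in `pikiclaw-browser` MCP wraps `@playwright/mcp` with a process-level Supervisor and a shared, isolated Chrome Profile. Log in to frequently used sites once, and every later task reuses the login state.
+- **Computer-use (macOS desktop layer)**: enable the `peekaboo` MCP (macOS only) to use the full set of desktop control tools provided by [Peekaboo](https://peekaboo.sh/): `see`, `click`, `type`, `scroll`, `window`, `menu`, `app`, `dock`, and the `agent` sub-agent for goal-directed autonomous control. The terminal must be granted "Accessibility" and "Screen Recording" permissions in System Settings.
+- **Runtime hardened for long tasks**: sleep prevention, watchdog, automatic restart, daemon mode, and channel Supervisor are all included; restarts are actively blocked while tasks are still running, so a marathon job is not broken by a hot reload.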
 ---
@@ -252,7 +237,7 @@ docker run -d --name pikiclaw -p 3939:3939 \
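 | **Operating terminals** | 7 major IMs + Web + continually expanding | Inside the IDE only | Limited to a dedicated web UI | Tied to a single Bot in a single IM |
 | **Where the Agent runs** | Entirely on your own local machine | Your local machine | In a vendor-assigned cloud sandbox | Usually on the vendor's servers |
 | **Agent choice** | Claude Code · Codex · Gemini · Hermes (ACP) · … (your pick) | Tightly bound, no choice | Single | Single |
-| **Underlying model choice** | International frontier models + the full range of Chinese models + any OpenAI-compatible model | Platform-controlled | Vendor-bound | Single, not swappable |
+| **Underlying model choice** | Frontier · Chinese · local (Ollama / mlx-lm) · OpenAI-compatible proxies | Platform-controlled | Vendor-bound | Single, not swappable |
 | **Concurrency** | **N Agents × N windows × N workspaces** | One at a time per IDE window | Serial queue | Single thread |
 | **File and tool control** | All local files on your host, MCP resources, and local CLI tools | Local files | Restricted sandbox environment | Severely limited |
 | **Adding a new terminal channel** | Implement a basic `Channel` class and you are connected | Not possible | Not possible | Requires forking the whole project |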
@@ -304,13 +289,7 @@ docker run -d --name pikiclaw -p 3939:3939 \
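 ## Roadmap
-Already shipped: Hermes driver support · underlying ACP (Agent Client Protocol) integration · Provider/Profile model vault · seven IM channels · Computer-use (Playwright browser control + Peekaboo macOS desktop control).
-- **More ACP Agents**: ensure any new ACP-compatible Agent can be added smoothly with no code and no configuration.
-- **Broader terminal ecosystem**: support for WhatsApp, a standalone mobile app, and a voice interaction module.
-- **Deeper model-layer wrapping**: build a general Agent Wrapper over arbitrary models to drive more Chinese models seamlessly.
-- **Richer tool ecosystem**: an officially recommended MCP plugin collection, a Skill template library, and a community marketplace.
-- **Computer-use on every platform**: beyond the existing macOS Peekaboo driver, add desktop control support for Windows / Linux.
+- **SupporterAgent**: add a high-level meta-agent on top of the existing "terminal × Agent × model × tools" orchestration stack to manage the full lifecycle of a complex task: from decomposition and planning, to dispatching the right sub-Agent onto the right model and tools, to watching every stream throughout and stepping in to correct course when a sub-Agent stalls, drifts, or conflicts with the plan. The goal is to raise pikiclaw's stability in long-horizon, multi-Agent collaboration to a new level, so people no longer need to watch every sub-task turn by turn.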
 ---

package/dist/agent/drivers/claude-tui.js CHANGED Viewed

@@ -131,6 +131,11 @@ process.stdin.on("end", () => {
       tool_name: typeof payload.tool_name === "string" ? payload.tool_name : null,
       tool_input: payload.tool_input || null,
       tool_response: payload.tool_response || null,
+      // Claude Code tags sub-agent tool calls with agent_id so the parent can
+      // tell them apart from main-thread calls. Forwarding it lets the driver
+      // route the hook to the right sub-agent card instead of the parent's
+      // 执行 list.
+      agent_id: typeof payload.agent_id === "string" ? payload.agent_id : null,
     }) + "\\n";
     try { fs.appendFileSync(toolEventsFile, line); } catch (_) {}
     process.stdout.write(JSON.stringify({ continue: true }) + "\\n");
@@ -250,6 +255,26 @@ function applyHookToolEvent(ev, s) {
     const toolName = String(ev?.tool_name || '').trim();
     if (!toolName || !toolUseId)
         return false;
+    // Sub-agent tool calls fire the parent's Pre/PostToolUse hooks too (one
+    // hook pipeline per CLI process). Claude Code tags those payloads with
+    // `agent_id`; route them to the matching sub-agent's tool list instead of
+    // appending to the parent's recentActivity. Without this every Task spawn
+    // floods the parent's 执行 card with the children's tool stream while the
+    // sub-agent cards sit empty until the sidecar JSONL flushes at Stop.
+    const subAgentId = typeof ev?.agent_id === 'string' && ev.agent_id ? ev.agent_id : '';
+    if (subAgentId) {
+        if (ev.event === 'PreToolUse') {
+            const parentToolUseId = s.subAgentIdToParent?.get(subAgentId);
+            const sub = parentToolUseId ? s.subAgents?.get(parentToolUseId) : undefined;
+            if (sub && !sub.tools.some((t) => t.id === toolUseId)) {
+                const summary = toolName === 'TodoWrite'
+                    ? 'Update plan'
+                    : summarizeClaudeToolUse(toolName, ev.tool_input || {});
+                sub.tools.push({ id: toolUseId, name: toolName, summary });
+            }
+        }
+        return true;
+    }
     if (ev.event === 'PreToolUse') {
         if (s.seenClaudeToolIds.has(toolUseId))
             return false;
@@ -835,6 +860,14 @@ export async function doClaudeTuiStream(opts) {
             catch {
                 continue;
             }
+            // A Task PreToolUse and the first sub-agent tool PreToolUse can land in
+            // the same tick batch. If the sub-agent's hook arrives before we've
+            // discovered its sidecar (and thus before s.subAgentIdToParent knows
+            // its agent_id), refresh discovery so the hook resolves its parent on
+            // this pass instead of leaking through unattributed.
+            const subAgentId = typeof ev?.agent_id === 'string' ? ev.agent_id : '';
+            if (subAgentId && !s.subAgentIdToParent?.has(subAgentId))
+                tryDiscoverSubAgents();
             try {
                 if (applyHookToolEvent(ev, s))
                     any = true;
@@ -880,6 +913,15 @@ export async function doClaudeTuiStream(opts) {
                 continue;
             const sidecarPath = path.join(sidecarDir, `${stem}.jsonl`);
             trackedSubAgents.set(stem, { sidecarPath, offset: 0, parentToolUseId });
+            // `stem` is "agent-<id>"; Claude Code's hook payload `agent_id` carries
+            // just the raw id. Keep both keys so applyHookToolEvent can attribute
+            // sub-agent tool hooks to the parent's Task tool_use no matter which
+            // form arrives.
+            const rawAgentId = stem.startsWith('agent-') ? stem.slice('agent-'.length) : stem;
+            if (!s.subAgentIdToParent)
+                s.subAgentIdToParent = new Map();
+            s.subAgentIdToParent.set(rawAgentId, parentToolUseId);
+            s.subAgentIdToParent.set(stem, parentToolUseId);
             agentLog(`[claude-tui] subagent sidecar discovered ${stem} parent=${parentToolUseId.slice(0, 14)}`);
         }
     };
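
The three claude-tui.js hunks above combine into one routing rule: a hook event that carries `agent_id` is attributed to the sub-agent whose sidecar stem is `agent-<id>`, and it never reaches the parent's activity list. The following is a minimal sketch of that rule, not the package's code: `registerSubAgent`, `routeHookEvent`, the `tool_use_id` field name, and the `parentActivity` array are hypothetical stand-ins, and only the dual-key lookup and the `agent-` prefix come from the diff.

```javascript
// Hypothetical helper: store both the raw id and the "agent-<id>" stem so a
// lookup succeeds whichever form the hook payload carries.
function registerSubAgent(idToParent, stem, parentToolUseId) {
  const rawId = stem.startsWith('agent-') ? stem.slice('agent-'.length) : stem;
  idToParent.set(rawId, parentToolUseId);
  idToParent.set(stem, parentToolUseId);
}

// Hypothetical router: returns where the event was attributed.
function routeHookEvent(ev, idToParent, subAgents, parentActivity) {
  const agentId = typeof ev.agent_id === 'string' ? ev.agent_id : '';
  if (!agentId) {
    // Main-thread tool call: stays on the parent.
    parentActivity.push(ev.tool_name);
    return 'parent';
  }
  const parentToolUseId = idToParent.get(agentId);
  const sub = parentToolUseId ? subAgents.get(parentToolUseId) : undefined;
  // Only PreToolUse adds an entry, and only once per tool-use id.
  if (ev.event === 'PreToolUse' && sub && !sub.tools.some((t) => t.id === ev.tool_use_id)) {
    sub.tools.push({ id: ev.tool_use_id, name: ev.tool_name });
  }
  // Sub-agent events are consumed even when the parent is not yet known,
  // so they cannot leak into the parent's list.
  return 'sub-agent';
}
```

As in the diff, an event whose sub-agent cannot yet be resolved is dropped from the parent's view rather than misattributed, which is why the stream loop refreshes sidecar discovery before applying such an event.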

package/dist/agent/session.js CHANGED Viewed

@@ -222,7 +222,11 @@ function normalizeSessionRecord(raw, workdir) {
         title: typeof raw?.title === 'string' && raw.title.trim() ? raw.title.trim() : null,
         model: typeof raw?.model === 'string' && raw.model.trim() ? raw.model.trim() : null,
         thinkingEffort: typeof raw?.thinkingEffort === 'string' && raw.thinkingEffort.trim() ? raw.thinkingEffort.trim() : null,
+        profileId: typeof raw?.profileId === 'string' && raw.profileId.trim() ? raw.profileId.trim() : null,
         stagedFiles: Array.isArray(raw?.stagedFiles) ? dedupeStrings(raw.stagedFiles.filter((v) => typeof v === 'string')) : [],
+        lastUserAttachments: Array.isArray(raw?.lastUserAttachments)
+            ? dedupeStrings(raw.lastUserAttachments.filter((v) => typeof v === 'string'))
+            : [],
         runState: normalizeSessionRunState(raw?.runState),
         runDetail: normalizeSessionRunDetail(raw?.runState, raw?.runDetail),
         runUpdatedAt: normalizeSessionRunUpdatedAt(raw?.runUpdatedAt, typeof raw?.updatedAt === 'string' && raw.updatedAt.trim() ? raw.updatedAt : new Date().toISOString()),
@@ -548,7 +552,7 @@ export function ensureSessionWorkspace(opts) {
             workspacePath: sessionWorkspacePath(workdir, opts.agent, sessionId),
             threadId,
             createdAt: new Date().toISOString(), updatedAt: new Date().toISOString(),
-            title: summarizePromptTitle(opts.title) || null, model: null, thinkingEffort: null, stagedFiles: [],
+            title: summarizePromptTitle(opts.title) || null, model: null, thinkingEffort: null, profileId: null, stagedFiles: [], lastUserAttachments: [],
             runState: 'completed', runDetail: null, runUpdatedAt: new Date().toISOString(),
             runPid: null,
             classification: null, userStatus: null, userNote: null,
@@ -588,6 +592,7 @@ function managedRecordToSessionInfo(record) {
         threadId: record.threadId,
         model: record.model,
         thinkingEffort: record.thinkingEffort,
+        profileId: record.profileId ?? null,
         createdAt: record.createdAt,
         title,
         running: record.runState === 'running',
@@ -686,12 +691,17 @@ export async function deleteAgentSession(opts) {
     return result;
 }
 /**
- * Look up the persisted model and thinkingEffort for an existing session.
- * Returns null values when the session is not found or fields are not set.
+ * Look up the persisted model, thinkingEffort, and bound profileId for an
+ * existing session. Returns null values when the session is not found or
+ * fields are not set.
  */
 export function getSessionStoredConfig(workdir, agent, sessionId) {
     const record = findPikiclawSession(workdir, agent, sessionId);
-    return { model: record?.model ?? null, thinkingEffort: record?.thinkingEffort ?? null };
+    return {
+        model: record?.model ?? null,
+        thinkingEffort: record?.thinkingEffort ?? null,
+        profileId: record?.profileId ?? null,
+    };
 }
 export function ensureManagedSession(opts) {
     const session = ensureSessionWorkspace({
@@ -705,6 +715,12 @@ export function ensureManagedSession(opts) {
         session.record.title = summarizePromptTitle(opts.title);
     if (!session.record.model && opts.model)
         session.record.model = opts.model.trim() || null;
+    if (!session.record.thinkingEffort && opts.thinkingEffort) {
+        session.record.thinkingEffort = opts.thinkingEffort.trim().toLowerCase() || null;
+    }
+    if (!session.record.profileId && opts.profileId) {
+        session.record.profileId = opts.profileId.trim() || null;
+    }
     saveSessionRecord(opts.workdir, session.record);
     return managedRecordToSessionInfo(session.record);
 }
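
The session.js hunks normalise the two new record fields the same way as the existing ones: an optional string becomes a trimmed value or `null`, and a list keeps only strings. A minimal sketch of that pattern follows. `trimmedOrNull`, `stringList`, and `normalizeRecord` are hypothetical names, and the `Set`-based de-duplication is an assumption standing in for the package's `dedupeStrings`, whose exact behaviour is not shown in the diff.

```javascript
// Hypothetical helper: non-empty trimmed string, else null.
function trimmedOrNull(value) {
  return typeof value === 'string' && value.trim() ? value.trim() : null;
}

// Hypothetical helper: keep only strings and drop duplicates
// (stand-in for dedupeStrings, which may differ in detail).
function stringList(value) {
  return Array.isArray(value)
    ? [...new Set(value.filter((v) => typeof v === 'string'))]
    : [];
}

// Normalise a raw, possibly malformed session record.
function normalizeRecord(raw) {
  return {
    model: trimmedOrNull(raw?.model),
    thinkingEffort: trimmedOrNull(raw?.thinkingEffort),
    profileId: trimmedOrNull(raw?.profileId),
    lastUserAttachments: stringList(raw?.lastUserAttachments),
  };
}
```

Records written before 0.3.53 have neither field, so under this pattern they load with `profileId: null` and an empty `lastUserAttachments` list instead of `undefined`.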

package/dist/agent/stream.js CHANGED Viewed

@@ -9,7 +9,7 @@ import { restartManagedBrowser } from '../browser-supervisor.js';
 import { terminateProcessTree } from '../core/process-control.js';
 import { AGENT_DETECT_TIMEOUTS, AGENT_STREAM_HARD_KILL_GRACE_MS } from '../core/constants.js';
 import { getDriver, allDrivers, getAcceptedProviderKinds } from './driver.js';
-import { resolveAgentInjection, getActiveProfile, getProvider, updateProfile, listProfiles, } from '../model/index.js';
+import { resolveAgentInjection, getActiveProfile, getActiveProfileId, getProvider, updateProfile, listProfiles, } from '../model/index.js';
 import { Q, agentLog, agentWarn, agentError, joinErrorMessages, normalizeErrorMessage, buildStreamPreviewMeta, computeContext, shortValue, isPendingSessionId, dedupeStrings, normalizeStreamPreviewPlan, } from './utils.js';
 import { saveSessionRecord, setSessionRunState, applySessionRunResult, ensureSessionWorkspace, importFilesIntoWorkspace, syncManagedSessionIdentity, summarizePromptTitle, recordFork, } from './session.js';
 import { collapseSkillPrompt } from './skills.js';
@@ -346,6 +346,11 @@ function prepareStreamOpts(opts) {
     // Capture staged files for MCP bridge before clearing
     const stagedFiles = [...session.record.stagedFiles];
     session.record.stagedFiles = [];
+    // Remember this turn's attachments so dashboard fallbacks (called while the
+    // agent CLI hasn't yet flushed the user event to its native session file)
+    // can still render the user's image bubble. Cleared/overwritten at the
+    // start of the NEXT turn — always reflects the turn currently in flight.
+    session.record.lastUserAttachments = [...attachmentRelPaths];
     if (!session.record.title)
         session.record.title = summarizePromptTitle(displayPrompt) || null;
     session.record.lastQuestion = shortValue(displayPrompt, 500);
@@ -383,6 +388,14 @@ function finalizeStreamResult(result, workdir, prompt, session) {
     session.record.model = result.model || session.record.model;
     if (result.thinkingEffort)
         session.record.thinkingEffort = result.thinkingEffort;
+    // Capture the BYOK Profile that was in effect for this run so a future
+    // `session.switch` can re-bind it (null = native CLI auth).
+    try {
+        session.record.profileId = getActiveProfileId(session.record.agent);
+    }
+    catch {
+        /* model layer not initialised in tests — leave profileId untouched */
+    }
     const displayPrompt = collapseSkillPrompt(prompt) ?? prompt;
     if (!session.record.title)
         session.record.title = summarizePromptTitle(displayPrompt);

package/dist/bot/bot.js CHANGED Viewed

@@ -640,6 +640,8 @@ export class Bot {
             workdir: 'workdir' in session && session.workdir ? session.workdir : this.workdir,
             title: session.title ?? null,
             model: session.model ?? null,
+            thinkingEffort: session.thinkingEffort ?? null,
+            profileId: session.profileId ?? null,
             threadId: session.threadId ?? null,
         });
         const runtime = this.hydrateSessionRuntime({
@@ -649,11 +651,16 @@ export class Bot {
             workspacePath: managed.workspacePath ?? session.workspacePath ?? null,
             threadId: managed.threadId ?? session.threadId ?? null,
             modelId: session.model ?? managed.model ?? null,
+            thinkingEffort: session.thinkingEffort ?? managed.thinkingEffort ?? null,
         });
         if (!runtime) {
             this.applySessionSelection(cs, null);
             return;
         }
+        // Adopting an existing session is an explicit user pick — drop any
+        // queued handover from a prior agent toggle so we don't accidentally
+        // prepend the wrong context to the resumed session's next turn.
+        cs.pendingHandoverFrom = null;
         this.applySessionSelection(cs, runtime);
     }
     syncSelectedChats(session) {
@@ -1759,6 +1766,26 @@ export class Bot {
         this.adoptSession(cs, session);
         return this.getSelectedSession(cs);
     }
+    /**
+     * Resume an existing session in a chat and restore the agent's persistent
+     * model / effort / BYOK Profile binding so the next stream — and the IM
+     * picker chips — match the session that was just adopted. This is the
+     * shared "click a row from the workspace list" path used by both the
+     * interactive selector and the text-command `/sessions <#>` flow.
+     */
+    resumeSessionForChat(chatId, session) {
+        const runtime = this.adoptExistingSessionForChat(chatId, session);
+        if (session.model) {
+            this.switchModelForChat(chatId, session.model, session.profileId ?? null);
+        }
+        else if (session.profileId !== undefined) {
+            this.switchModelForChat(chatId, this.modelForAgent(session.agent), null);
+        }
+        if (session.thinkingEffort) {
+            this.switchEffortForChat(chatId, session.thinkingEffort);
+        }
+        return runtime;
+    }
     switchAgentForChat(chatId, agent) {
         const cs = this.chat(chatId);
         if (cs.agent === agent)
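
The branching in `resumeSessionForChat` is easier to check when separated from the `Bot` instance. The sketch below restates it as a pure function that returns the switches to perform. `planRestore` and the step objects are hypothetical; the conditions mirror the diff, including the case where a session with no stored model but a defined `profileId` falls back to the agent's default model with the Profile cleared.

```javascript
// Hypothetical pure restatement of the restore logic in the hunk above.
// defaultModelForAgent stands in for bot.modelForAgent(session.agent).
function planRestore(session, defaultModelForAgent) {
  const steps = [];
  if (session.model) {
    // Stored model wins; the Profile binding travels with it (null = native auth).
    steps.push({ op: 'switchModel', model: session.model, profileId: session.profileId ?? null });
  } else if (session.profileId !== undefined) {
    // No stored model: fall back to the agent default and clear any Profile.
    steps.push({ op: 'switchModel', model: defaultModelForAgent, profileId: null });
  }
  if (session.thinkingEffort) {
    steps.push({ op: 'switchEffort', effort: session.thinkingEffort });
  }
  return steps;
}
```

A session object that has neither `model` nor a `profileId` key produces no model switch at all, so the chat keeps whatever model and Profile were active before the resume.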

package/dist/bot/command-ui.js CHANGED Viewed

@@ -135,8 +135,11 @@ export function decodeCommandAction(data) {
 }
 export async function buildSessionsCommandView(bot, chatId, page, pageSize = 5) {
     const data = await getSessionsPageData(bot, chatId, page, pageSize);
+    // Multi-row: one button per session on its own line, prefixed with the
+    // agent badge so a mixed workspace list reads cleanly. Avoid cramming
+    // multiple buttons onto one row (some IM clients truncate).
     const sessionButtons = data.sessions.map(session => [{
-            label: session.title,
+            label: `[${session.agent}] ${session.title} · ${session.time}`,
             action: { kind: 'session.switch', sessionId: session.key },
             state: buttonStateFromFlags({ isCurrent: session.isCurrent, isRunning: session.isRunning }),
             primary: session.isCurrent,
@@ -147,20 +150,27 @@ export async function buildSessionsCommandView(bot, chatId, page, pageSize = 5)
     navRow.push({ label: '+ New', action: { kind: 'session.new' } });
     if (data.page < data.totalPages - 1)
         navRow.push({ label: `p${data.page + 2} ▶`, action: { kind: 'sessions.page', page: data.page + 1 } });
+    const agentChips = Object.entries(data.agentTotals)
+        .sort((a, b) => b[1] - a[1])
+        .map(([agent, count]) => `${agent}:${count}`)
+        .join(' · ');
+    const headerDetail = data.workspaceName
+        ? (agentChips ? `${data.workspaceName} · ${agentChips}` : data.workspaceName)
+        : (agentChips || null);
     return {
         kind: 'sessions',
         title: 'Sessions',
-        detail: data.agent,
+        detail: headerDetail,
         metaLines: [`${data.total} total · p${data.page + 1}/${data.totalPages}`],
         items: data.sessions.map(session => ({
-            label: session.title,
+            label: `[${session.agent}] ${session.title}`,
             detail: session.time,
             state: buttonStateFromFlags({ isCurrent: session.isCurrent, isRunning: session.isRunning }),
         })),
-        emptyText: 'No sessions found.',
+        emptyText: 'No sessions found in this workspace.',
         helperText: data.totalPages > 1
-            ? `Use the controls below to switch or turn pages.`
-            : 'Use the controls below to switch or start a new session.',
+            ? `Pick a row to resume (agent/model/effort restore automatically).`
+            : 'Pick a row to resume, or start a new session.',
         rows: navRow.length ? [...sessionButtons, navRow] : sessionButtons,
     };
 }
@@ -391,22 +401,47 @@ export async function executeCommandAction(bot, chatId, action, opts = {}) {
         }
         case 'session.switch': {
             const chat = bot.chat(chatId);
-            const result = await bot.fetchSessions(chat.agent, bot.chatWorkdir(chatId));
+            // Workspace-wide lookup (no agent filter) so a row from any agent can be
+            // resumed directly from a single mixed list.
+            const result = await bot.fetchSessions(undefined, bot.chatWorkdir(chatId));
             if (!result.ok)
                 return { kind: 'noop', message: 'Failed to load sessions' };
             const session = result.sessions.find(entry => entry.sessionId === action.sessionId);
             if (!session)
                 return { kind: 'noop', message: 'Session not found' };
+            const prevAgent = chat.agent;
             const runtime = bot.adoptExistingSessionForChat(chatId, session);
+            // Restore the agent's persistent model / effort / Profile binding so the
+            // next stream — and the IM picker chips — match the resumed session.
+            if (session.model) {
+                bot.switchModelForChat(chatId, session.model, session.profileId ?? null);
+            }
+            else if (session.profileId !== undefined) {
+                // Session was native (profileId === null) — explicitly clear any
+                // active Profile so we don't run with a stale BYOK binding.
+                bot.switchModelForChat(chatId, bot.modelForAgent(session.agent), null);
+            }
+            if (session.thinkingEffort) {
+                bot.switchEffortForChat(chatId, session.thinkingEffort);
+            }
             const displayId = session.sessionId || action.sessionId;
             const sessionStatus = getSessionStatusForChat(bot, chat, session);
+            const runDetail = summarizeSessionRun({ ...session, running: sessionStatus.isRunning }).noticeDetail;
+            const restoreParts = [];
+            if (prevAgent !== session.agent)
+                restoreParts.push(`agent → ${session.agent}`);
+            if (session.model)
+                restoreParts.push(`model → ${session.model}`);
+            if (session.thinkingEffort)
+                restoreParts.push(`effort → ${session.thinkingEffort}`);
+            const detail = restoreParts.length ? `${runDetail} · ${restoreParts.join(' · ')}` : runDetail;
             return {
                 kind: 'notice',
                 callbackText: `Switched: ${displayId.slice(0, 12)}`,
                 notice: {
                     title: 'Session Switched',
                     value: displayId,
-                    detail: summarizeSessionRun({ ...session, running: sessionStatus.isRunning }).noticeDetail,
+                    detail,
                     valueMode: 'code',
                 },
                 session: runtime,
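
The new header line of the sessions view is composed from the workspace name and per-agent counts. A small sketch of just that composition is shown here; `buildHeaderDetail` is a hypothetical wrapper name, while the sort order, the `agent:count` chip format, and the ` · ` separator follow the hunk in `buildSessionsCommandView`.

```javascript
// Hypothetical wrapper around the header composition shown in the diff.
function buildHeaderDetail(workspaceName, agentTotals) {
  const agentChips = Object.entries(agentTotals)
    .sort((a, b) => b[1] - a[1]) // busiest agent first
    .map(([agent, count]) => `${agent}:${count}`)
    .join(' · ');
  if (workspaceName) {
    return agentChips ? `${workspaceName} · ${agentChips}` : workspaceName;
  }
  return agentChips || null;
}

// Example: a workspace with five Codex sessions and two Claude sessions.
console.log(buildHeaderDetail('demo', { claude: 2, codex: 5 }));
```

With no workspace name and no sessions the function returns `null`, which matches the diff's `headerDetail` fallback.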

package/dist/bot/commands.js CHANGED Viewed

@@ -198,12 +198,18 @@ export function summarizeSessionRun(session) {
 }
 export async function getSessionsPageData(bot, chatId, page, pageSize = 5) {
     const cs = bot.chat(chatId);
-    const res = await bot.fetchSessions(cs.agent, bot.chatWorkdir(chatId));
+    // Workspace-wide: drop the cs.agent filter so the list matches what the
+    // dashboard shows for this workspace (all installed agents, sorted by
+    // most-recent activity).
+    const res = await bot.fetchSessions(undefined, bot.chatWorkdir(chatId));
     const sessions = res.ok ? res.sessions : [];
     const total = sessions.length;
     const totalPages = Math.max(1, Math.ceil(total / pageSize));
     const pg = Math.max(0, Math.min(page, totalPages - 1));
     const slice = sessions.slice(pg * pageSize, (pg + 1) * pageSize);
+    const agentTotals = {};
+    for (const s of sessions)
+        agentTotals[s.agent] = (agentTotals[s.agent] || 0) + 1;
     const entries = [];
     for (const s of slice) {
         const sessionKey = s.sessionId || '';
@@ -216,12 +222,13 @@ export async function getSessionsPageData(bot, chatId, page, pageSize = 5) {
             runDetail: s.runDetail,
         });
         const displayText = sessionListDisplayTitle(s);
-        const title = displayText ? displayText.replace(/\n/g, ' ').slice(0, 20) : sessionKey.slice(0, 20);
+        const title = displayText ? displayText.replace(/\n/g, ' ').slice(0, 28) : sessionKey.slice(0, 28);
         const time = s.createdAt
             ? new Date(s.createdAt).toLocaleString('zh-CN', { timeZone: 'Asia/Shanghai', month: '2-digit', day: '2-digit', hour: '2-digit', minute: '2-digit' })
             : '?';
         entries.push({
             key: sessionKey,
+            agent: s.agent,
             title,
             time: `${time} · ${runSummary.shortLabel}`,
             isCurrent: status.isCurrent,
@@ -230,7 +237,14 @@ export async function getSessionsPageData(bot, chatId, page, pageSize = 5) {
             runDetail: s.runDetail,
         });
     }
-    return { agent: cs.agent, total, page: pg, totalPages, sessions: entries };
+    return {
+        workspaceName: res.workspaceName || '',
+        agentTotals,
+        total,
+        page: pg,
+        totalPages,
+        sessions: entries,
+    };
 }
 export function extractLastSessionTurn(messages) {
     if (!messages.length)
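
`getSessionsPageData` now counts sessions per agent across the whole workspace while still paging the list. The sketch below isolates the page clamping and the tally. `paginateSessions` is a hypothetical name, and the session objects are reduced to an `agent` field for illustration.

```javascript
// Hypothetical reduction of the paging and tally logic in the hunk above.
function paginateSessions(sessions, page, pageSize = 5) {
  const total = sessions.length;
  const totalPages = Math.max(1, Math.ceil(total / pageSize));
  // Clamp the requested page into [0, totalPages - 1].
  const pg = Math.max(0, Math.min(page, totalPages - 1));
  // Totals are computed over ALL sessions, not just the visible slice.
  const agentTotals = {};
  for (const s of sessions) {
    agentTotals[s.agent] = (agentTotals[s.agent] || 0) + 1;
  }
  return {
    total,
    totalPages,
    page: pg,
    agentTotals,
    slice: sessions.slice(pg * pageSize, (pg + 1) * pageSize),
  };
}
```

Because the totals are taken before slicing, the header chips describe the whole workspace even when the visible page contains sessions from a single agent.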

package/dist/bot/session-hub.js CHANGED Viewed

@@ -98,27 +98,73 @@ export async function querySessions(opts) {
 // ---------------------------------------------------------------------------
 // Session detail queries
 // ---------------------------------------------------------------------------
-/**
- * Build a 1-2 message fallback transcript from the pikiclaw session record
- * for runs that crashed before the agent could write its own transcript file
- * (e.g. gemini auth failure, codex spawn failure). Without this the dashboard
- * detail panel would render blank for clearly-failed sessions.
- */
+const IMAGE_EXTENSIONS = new Set(['.png', '.jpg', '.jpeg', '.gif', '.webp', '.bmp', '.svg']);
+const MIME_BY_EXT = {
+    '.png': 'image/png', '.jpg': 'image/jpeg', '.jpeg': 'image/jpeg',
+    '.gif': 'image/gif', '.webp': 'image/webp', '.bmp': 'image/bmp', '.svg': 'image/svg+xml',
+};
+/** Build image MessageBlocks from a session record's `lastUserAttachments`
+ *  (relative paths under `workspacePath`). Used by fallback paths so the
+ *  dashboard can still render the user's image bubble while the agent CLI
+ *  has not yet flushed the turn to its own session file. Non-image
+ *  attachments are skipped — the fallback is text-first and doesn't try to
+ *  reconstruct generic file references. */
+function imageBlocksFromManagedRecord(record) {
+    const attachments = record.lastUserAttachments;
+    if (!attachments?.length)
+        return [];
+    const blocks = [];
+    for (const rel of attachments) {
+        const ext = path.extname(rel).toLowerCase();
+        if (!IMAGE_EXTENSIONS.has(ext))
+            continue;
+        const abs = path.isAbsolute(rel) ? rel : path.join(record.workspacePath, rel);
+        blocks.push({
+            type: 'image',
+            // `file://` sentinel — `rewriteImageBlocksForTransport` (dashboard
+            // response layer) converts it to a proper /attachment URL.
+            content: `file://${abs}`,
+            imagePath: abs,
+            imageMime: MIME_BY_EXT[ext] || 'application/octet-stream',
+        });
+    }
+    return blocks;
+}
 function tailFallbackFromManagedRecord(opts) {
+    const fb = managedFallbackContent(opts);
+    if (!fb)
+        return null;
+    const limit = Math.max(1, opts.limit ?? fb.messages.length);
+    return { ok: true, messages: fb.messages.slice(-limit), error: null };
+}
+function managedFallbackContent(opts) {
     const record = findPikiclawSession(opts.workdir, opts.agent, opts.sessionId);
     if (!record)
         return null;
     const messages = [];
-    if (record.lastQuestion)
-        messages.push({ role: 'user', text: record.lastQuestion });
+    const richMessages = [];
+    if (record.lastQuestion) {
+        const text = record.lastQuestion;
+        messages.push({ role: 'user', text });
+        const blocks = text ? [{ type: 'text', content: text }] : [];
+        blocks.push(...imageBlocksFromManagedRecord(record));
+        if (blocks.length)
+            richMessages.push({ role: 'user', text, blocks, usage: null });
+    }
     const failureText = record.lastAnswer
         || (record.runState === 'incomplete' ? record.runDetail : null);
-    if (failureText)
+    if (failureText) {
         messages.push({ role: 'assistant', text: failureText });
+        richMessages.push({
+            role: 'assistant',
+            text: failureText,
+            blocks: [{ type: 'text', content: failureText }],
+            usage: null,
+        });
+    }
     if (!messages.length)
         return null;
-    const limit = Math.max(1, opts.limit ?? messages.length);
-    return { ok: true, messages: messages.slice(-limit), error: null };
+    return { messages, richMessages };
 }
 /** Get recent messages from a session (tail). */
 export async function querySessionTail(opts) {
@@ -169,17 +215,20 @@ function collapseSkillPromptsInResult(result) {
 export async function querySessionMessages(opts) {
     const result = await _getSessionMessages(opts);
     if (!result.ok || !result.messages.length) {
-        const fallback = tailFallbackFromManagedRecord({
+        const fb = managedFallbackContent({
             agent: opts.agent,
             sessionId: opts.sessionId,
             workdir: opts.workdir,
-            limit: result.messages.length || undefined,
         });
-        if (fallback) {
+        if (fb) {
+            const totalTurns = fb.messages.filter(m => m.role === 'user').length;
             return collapseSkillPromptsInResult({
                 ok: true,
-                messages: fallback.messages.map(m => ({ role: m.role, text: m.text })),
-                totalTurns: fallback.messages.filter(m => m.role === 'user').length,
+                messages: fb.messages.map(m => ({ role: m.role, text: m.text })),
+                // Always emit richMessages so the dashboard can render image blocks
+                // for the first user turn while the agent CLI is still spinning up.
+                richMessages: fb.richMessages,
+                totalTurns,
                 error: null,
             });
         }
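
Note on the hunk above: the fallback now returns two parallel lists, a plain `messages` list and a `richMessages` list whose user turn also carries image blocks. The sketch below is a rough illustration of that shape, not the package's code. It works on an in-memory record instead of calling `findPikiclawSession`, and `sketchFallback`, `extOf`, and the trimmed `MIME` table are hypothetical names. It also joins paths with `/` rather than `path.join`, so it assumes POSIX-style paths.

```javascript
// Hypothetical subset of the diff's MIME_BY_EXT table.
const MIME = { '.png': 'image/png', '.jpg': 'image/jpeg', '.jpeg': 'image/jpeg' };

// Stand-in for path.extname (assumption): lowercased extension with dot, or ''.
function extOf(file) {
    const base = file.slice(file.lastIndexOf('/') + 1);
    const dot = base.lastIndexOf('.');
    return dot > 0 ? base.slice(dot).toLowerCase() : '';
}

// Build { messages, richMessages } from a session record, or null if empty.
function sketchFallback(record) {
    const messages = [];
    const richMessages = [];
    if (record.lastQuestion) {
        const text = record.lastQuestion;
        messages.push({ role: 'user', text });
        const blocks = [{ type: 'text', content: text }];
        for (const rel of record.lastUserAttachments ?? []) {
            const mime = MIME[extOf(rel)];
            if (!mime)
                continue; // non-image attachments are skipped
            const abs = rel.startsWith('/') ? rel : `${record.workspacePath}/${rel}`;
            // `file://` sentinel, to be rewritten by the response layer.
            blocks.push({ type: 'image', content: `file://${abs}`, imagePath: abs, imageMime: mime });
        }
        richMessages.push({ role: 'user', text, blocks, usage: null });
    }
    const failureText = record.lastAnswer
        || (record.runState === 'incomplete' ? record.runDetail : null);
    if (failureText) {
        messages.push({ role: 'assistant', text: failureText });
        richMessages.push({
            role: 'assistant',
            text: failureText,
            blocks: [{ type: 'text', content: failureText }],
            usage: null,
        });
    }
    return messages.length ? { messages, richMessages } : null;
}
```

As in the diff, image blocks are attached only when `lastQuestion` is non-empty, so a turn with attachments but no text yields no user message in this fallback.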

package/dist/channels/dingtalk/bot.js CHANGED

@@ -364,11 +364,11 @@ export class DingtalkBot extends Bot {
             const d = await getSessionsPageData(this, ctx.chatId, 0, 100);
             const target = d.sessions[idx - 1];
             if (target) {
-                const result = await this.fetchSessions(this.chat(ctx.chatId).agent, this.chatWorkdir(ctx.chatId));
+                const result = await this.fetchSessions(undefined, this.chatWorkdir(ctx.chatId));
                 const session = result.sessions.find(s => s.sessionId === target.key);
                 if (session) {
-                    this.adoptExistingSessionForChat(ctx.chatId, session);
-                    await ctx.reply(`Switched to session ${target.title}`);
+                    this.resumeSessionForChat(ctx.chatId, session);
+                    await ctx.reply(`Switched to [${session.agent}] ${target.title}`);
                 }
                 else {
                     await ctx.reply('Session not found.');
@@ -387,7 +387,7 @@ export class DingtalkBot extends Bot {
         d.sessions.forEach((s, i) => {
             const mark = s.isCurrent ? ' ←' : '';
             const running = s.isRunning ? ' [running]' : '';
-            lines.push(`${i + 1}. ${s.title} · ${s.time}${mark}${running}`);
+            lines.push(`${i + 1}. [${s.agent}] ${s.title} · ${s.time}${mark}${running}`);
         });
         lines.push('', 'Usage: /sessions new | /sessions <#>');
         await ctx.reply(lines.join('\n'));
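
Note on the hunks above: the same change repeats in the Discord, Slack, WeCom, and Weixin bots. Sessions are fetched without an agent filter, the match is resumed through `resumeSessionForChat`, and both the list line and the reply carry an `[agent]` tag. The sketch below reproduces only the string logic on plain arrays. `formatSessionLine` and `switchReply` are hypothetical helper names, and the `null` return for an out-of-range index is an assumption, since the diff does not show that branch.

```javascript
// Format one row of the /sessions listing, as in the updated forEach body.
function formatSessionLine(s, i) {
    const mark = s.isCurrent ? ' ←' : '';
    const running = s.isRunning ? ' [running]' : '';
    return `${i + 1}. [${s.agent}] ${s.title} · ${s.time}${mark}${running}`;
}

// Resolve a 1-based index from the listing against the unfiltered session list.
function switchReply(pageSessions, fetchedSessions, idx) {
    const target = pageSessions[idx - 1];
    if (!target)
        return null; // assumption: this branch is not visible in the diff
    const session = fetchedSessions.find(s => s.sessionId === target.key);
    return session
        ? `Switched to [${session.agent}] ${target.title}`
        : 'Session not found.';
}
```

Because the lookup is no longer restricted to the chat's current agent, a listing entry owned by another agent can resolve to a session instead of falling through to "Session not found."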

package/dist/channels/discord/bot.js CHANGED

@@ -360,11 +360,11 @@ export class DiscordBot extends Bot {
             const d = await getSessionsPageData(this, ctx.chatId, 0, 100);
             const target = d.sessions[idx - 1];
             if (target) {
-                const result = await this.fetchSessions(this.chat(ctx.chatId).agent, this.chatWorkdir(ctx.chatId));
+                const result = await this.fetchSessions(undefined, this.chatWorkdir(ctx.chatId));
                 const session = result.sessions.find(s => s.sessionId === target.key);
                 if (session) {
-                    this.adoptExistingSessionForChat(ctx.chatId, session);
-                    await ctx.reply(`Switched to session ${target.title}`);
+                    this.resumeSessionForChat(ctx.chatId, session);
+                    await ctx.reply(`Switched to [${session.agent}] ${target.title}`);
                 }
                 else {
                     await ctx.reply('Session not found.');
@@ -383,7 +383,7 @@ export class DiscordBot extends Bot {
         d.sessions.forEach((s, i) => {
             const mark = s.isCurrent ? ' ←' : '';
             const running = s.isRunning ? ' [running]' : '';
-            lines.push(`${i + 1}. ${s.title} · ${s.time}${mark}${running}`);
+            lines.push(`${i + 1}. [${s.agent}] ${s.title} · ${s.time}${mark}${running}`);
         });
         lines.push('', 'Usage: /sessions new | /sessions <#>');
         await ctx.reply(lines.join('\n'));

package/dist/channels/feishu/render.js CHANGED

@@ -309,8 +309,15 @@ export function renderStart(d) {
     return lines.join('\n');
 }
 export function renderSessionsPage(d) {
+    const agentChips = Object.entries(d.agentTotals)
+        .sort((a, b) => b[1] - a[1])
+        .map(([agent, count]) => `${agent}:${count}`)
+        .join(' · ');
+    const header = d.workspaceName
+        ? (agentChips ? `${d.workspaceName} · ${agentChips}` : d.workspaceName)
+        : (agentChips || 'sessions');
     const lines = [
-        `**${d.agent} sessions** (${d.total})  p${d.page + 1}/${d.totalPages}`,
+        `**${header}** (${d.total})  p${d.page + 1}/${d.totalPages}`,
         '',
     ];
     if (!d.sessions.length) {
@@ -320,7 +327,7 @@ export function renderSessionsPage(d) {
         for (let i = 0; i < d.sessions.length; i++) {
             const s = d.sessions[i];
             const icon = s.isRunning ? '🟢' : s.isCurrent ? '●' : '○';
-            lines.push(`${icon} **${i + 1}.** ${s.title}  ${s.time}${s.isCurrent ? ' ← current' : ''}`);
+            lines.push(`${icon} **${i + 1}.** [${s.agent}] ${s.title}  ${s.time}${s.isCurrent ? ' ← current' : ''}`);
         }
         lines.push('');
         lines.push('*Use the controls below to switch, or reply with session number / "new".*');
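
Note on the hunk above: the sessions page header is no longer tied to a single agent. It is built from the workspace name plus per-agent counts sorted in descending order. The sketch below isolates that header logic. `sessionsHeader` is a hypothetical wrapper name, and the field names on `d` are taken from the diff.

```javascript
// Compose the first line of the sessions page from page data `d`.
function sessionsHeader(d) {
    const agentChips = Object.entries(d.agentTotals)
        .sort((a, b) => b[1] - a[1]) // most sessions first
        .map(([agent, count]) => `${agent}:${count}`)
        .join(' · ');
    const header = d.workspaceName
        ? (agentChips ? `${d.workspaceName} · ${agentChips}` : d.workspaceName)
        : (agentChips || 'sessions');
    return `**${header}** (${d.total})  p${d.page + 1}/${d.totalPages}`;
}
```

With no workspace name and no agent totals the header falls back to the literal `sessions`.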

package/dist/channels/slack/bot.js CHANGED

@@ -369,11 +369,11 @@ export class SlackBot extends Bot {
             const d = await getSessionsPageData(this, ctx.chatId, 0, 100);
             const target = d.sessions[idx - 1];
             if (target) {
-                const result = await this.fetchSessions(this.chat(ctx.chatId).agent, this.chatWorkdir(ctx.chatId));
+                const result = await this.fetchSessions(undefined, this.chatWorkdir(ctx.chatId));
                 const session = result.sessions.find(s => s.sessionId === target.key);
                 if (session) {
-                    this.adoptExistingSessionForChat(ctx.chatId, session);
-                    await ctx.reply(`Switched to session ${target.title}`);
+                    this.resumeSessionForChat(ctx.chatId, session);
+                    await ctx.reply(`Switched to [${session.agent}] ${target.title}`);
                 }
                 else {
                     await ctx.reply('Session not found.');
@@ -392,7 +392,7 @@ export class SlackBot extends Bot {
         d.sessions.forEach((s, i) => {
             const mark = s.isCurrent ? ' ←' : '';
             const running = s.isRunning ? ' [running]' : '';
-            lines.push(`${i + 1}. ${s.title} · ${s.time}${mark}${running}`);
+            lines.push(`${i + 1}. [${s.agent}] ${s.title} · ${s.time}${mark}${running}`);
         });
         lines.push('', 'Usage: /sessions new | /sessions <#>');
         await ctx.reply(lines.join('\n'));

package/dist/channels/wecom/bot.js CHANGED

@@ -372,11 +372,11 @@ export class WeComBot extends Bot {
             const d = await getSessionsPageData(this, ctx.chatId, 0, 100);
             const target = d.sessions[idx - 1];
             if (target) {
-                const result = await this.fetchSessions(this.chat(ctx.chatId).agent, this.chatWorkdir(ctx.chatId));
+                const result = await this.fetchSessions(undefined, this.chatWorkdir(ctx.chatId));
                 const session = result.sessions.find(s => s.sessionId === target.key);
                 if (session) {
-                    this.adoptExistingSessionForChat(ctx.chatId, session);
-                    await ctx.reply(`Switched to session ${target.title}`);
+                    this.resumeSessionForChat(ctx.chatId, session);
+                    await ctx.reply(`Switched to [${session.agent}] ${target.title}`);
                 }
                 else {
                     await ctx.reply('Session not found.');
@@ -395,7 +395,7 @@ export class WeComBot extends Bot {
         d.sessions.forEach((s, i) => {
             const mark = s.isCurrent ? ' ←' : '';
             const running = s.isRunning ? ' [running]' : '';
-            lines.push(`${i + 1}. ${s.title} · ${s.time}${mark}${running}`);
+            lines.push(`${i + 1}. [${s.agent}] ${s.title} · ${s.time}${mark}${running}`);
         });
         lines.push('', 'Usage: /sessions new | /sessions <#>');
         await ctx.reply(lines.join('\n'));

package/dist/channels/weixin/bot.js CHANGED

@@ -475,11 +475,11 @@ export class WeixinBot extends Bot {
             const d = await getSessionsPageData(this, ctx.chatId, 0, 100);
             const target = d.sessions[idx - 1];
             if (target) {
-                const result = await this.fetchSessions(this.chat(ctx.chatId).agent, this.chatWorkdir(ctx.chatId));
+                const result = await this.fetchSessions(undefined, this.chatWorkdir(ctx.chatId));
                 const session = result.sessions.find(s => s.sessionId === target.key);
                 if (session) {
-                    this.adoptExistingSessionForChat(ctx.chatId, session);
-                    await ctx.reply(`Switched to session ${target.title}`);
+                    this.resumeSessionForChat(ctx.chatId, session);
+                    await ctx.reply(`Switched to [${session.agent}] ${target.title}`);
                 }
                 else {
                     await ctx.reply(`Session not found.`);

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pikiclaw",
-  "version": "0.3.51",
+  "version": "0.3.53",
   "description": "Put the world's smartest AI agents in your pocket. Command local Claude & Gemini via IM. | 让最好用的 IM 变成你电脑上的顶级 Agent 控制台",
   "type": "module",
   "bin": {