pikiclaw 0.2.36 → 0.2.37

package/README.md CHANGED
@@ -40,6 +40,19 @@ pikiclaw takes a different approach: **pick only the best tools, then combine them to the fullest.
40
40
  └──────── streaming progress / files / screenshots ←──────────┘
41
41
  ```
42
42
 
43
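+ pikiclaw is not another terminal wrapper, nor another cloud IDE.
44
+
45
+ It is closer to a local execution hub that makes official coding agents **remotely dispatchable, able to run continuously, and able to report results back**.
46
+
47
+ When you want:
48
+
49
+ - to hand off work with a single message from your phone
50
+ - tasks that must run on your own computer, in your existing codebase and local toolchain
51
+ - to switch freely between Claude Code / Codex CLI / Gemini CLI
52
+ - progress, screenshots, and files to come back to the chat automatically
53
+
54
+ pikiclaw will feel smoother than "staying glued to the terminal", "SSH + tmux", or a "single-vendor cloud agent".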
55
+
43
56
  ---
44
57
 
45
58
  ## Quick Start
@@ -48,7 +61,7 @@ pikiclaw takes a different approach: **pick only the best tools, then combine them to the fullest.
48
61
 
49
62
  - Node.js 18+
50
63
  - Any one of [`claude`](https://docs.anthropic.com/en/docs/claude-code), [`codex`](https://github.com/openai/codex), or [`gemini`](https://github.com/google-gemini/gemini-cli) installed locally
51
- - A [Telegram Bot Token](https://t.me/BotFather) or[Feishu app](https://open.feishu.cn)credentials
64
+ - A [Telegram Bot Token](https://t.me/BotFather) or [Feishu app](https://open.feishu.cn) credentials
52
65
 
53
66
  ### One-Line Start
54
67
 
@@ -87,10 +100,10 @@ npx pikiclaw@latest --setup
87
100
 
88
101
  ### IM Channels
89
102
 
90
- | Channel | Message editing | File upload | Callback buttons | Emoji reactions | Message threads |
91
- |------|---------|---------|---------|---------|---------|
92
- | **Telegram** | ✅ | ✅ | ✅ | | |
93
- | **Feishu** | ✅ | ✅ | ✅ | ✅ | |
103
+ | Channel | Message editing | File send/receive | Callback buttons | Command menu | Scenario |
104
+ |------|---------|---------|---------|---------|------|
105
+ | **Telegram** | ✅ | ✅ | ✅ | | Global / personal |
106
+ | **Feishu** | ✅ | ✅ | ✅ | ✅ | China / teams |
94
107
 
95
108
  Both channels can be **started at the same time**.
96
109
 
@@ -99,16 +112,15 @@ npx pikiclaw@latest --setup
99
112
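  | Capability | Description |
100
113
  |------|------|
101
114
  | Real-time streaming output | Messages update continuously while the agent works |
102
- | Thinking / Reasoning | View the agent's thinking and reasoning in real time |
115
+ | Thinking / Reasoning / Plan | View the agent's thinking, reasoning, and plan steps in real time |
103
116
  | Token tracking | Input/output/cache statistics, with context usage shown in real time |
104
- | Artifact return | Screenshots, logs, and generated files are sent back automatically |
105
- | Long-run sleep prevention | System-level sleep prevention keeps hour-long tasks from being interrupted |
106
- | Daemon | Automatic restart on crash, with exponential backoff (3s → 60s) |
117
+ | Artifact return | Screenshots, logs, and generated files are sent back to the chat automatically |
118
+ | Long-running task safeguards | System-level sleep prevention + daemon + self-healing on errors |
107
119
  | Long text handling | Overlong output is automatically split or packaged as `.md` |
108
120
  | Multi-session management | Switch between and resume past sessions at any time |
109
121
  | Image/file input | Send screenshots, PDFs, and documents directly to the agent |
110
122
  | Project Skills | Custom skills in `.pikiclaw/skills/`, compatible with `.claude/commands/` |
111
- | Safe mode | Confirmation cards pushed for dangerous operations, allowlist access control |
123
+ | Safe mode | Allowlist access control, with support for switching to safer agent permission modes |
112
124
  | Web Dashboard | Visual configuration, session browsing, host monitoring |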
113
125
 
114
126
  ---
@@ -134,24 +146,23 @@ npx pikiclaw@latest --setup
134
146
 
135
147
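  | | Run directly in terminal | SSH + tmux | Cloud agent | **pikiclaw** |
136
148
  |---|---|---|---|---|
137
- | Execution environment | ✅ Local | ✅ Local | Sandbox | ✅ Local |
149
+ | Execution environment | ✅ Local | ✅ Local | ⚠️ Usually remote or sandboxed | ✅ Local |
138
150
  | Keeps running after you walk away | ❌ Stops when the lid closes | ⚠️ Requires tmux setup | ✅ | ✅ Sleep prevention + daemon |
139
151
  | Controllable from phone | ❌ | ⚠️ Painful typing | ✅ | ✅ IM-native |
140
- | Real-time progress | ✅ Terminal | ⚠️ Must connect to check | Mostly a black box | ✅ Streamed to chat |
152
+ | Real-time progress | ✅ Terminal | ⚠️ Must connect to check | ⚠️ Depends on the platform | ✅ Streamed to chat |
141
153
  | Results returned automatically | ❌ | ❌ | ⚠️ Depends on the platform | ✅ Screenshots/files/long text |
142
- | Setup barrier | None | SSH/tunneling/tmux | Sign-up/payment | One `npx` line |
154
+ | Setup barrier | None | SSH/tunneling/tmux | Sign up and adapt to the platform's workflow | One `npx` line |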
143
155
 
144
- ### pikiclaw vs. similar projects
156
+ ### pikiclaw vs. OpenClaw / official entry points
145
157
 
146
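- | | **pikiclaw** | OpenClaw | cc-connect |
158
+ | Dimension | **pikiclaw** | OpenClaw | Official entry points (Claude / Codex / Gemini) |
147
159
  |---|---|---|---|
148
- | **Philosophy** | **Pick the best tools and combine them to the fullest** | Open-source autonomous AI agent ecosystem | Multi-channel, multi-client connector |
149
- | **Agent** | Claude Code / Codex / Gemini CLI (official releases) | Built-in agent (bring your own model) | Various local CLIs |
150
- | **IM** | Telegram + Feishu (deeply polished) | Web / mobile | Slack / Discord / LINE, etc. |
151
- | **Long-running tasks** | Sleep prevention · daemon · self-healing on errors | Geared toward immediate tasks | Geared toward short conversations |
152
- | **Artifact return** | Screenshots · files · long-text packaging | ⚠️ Depends on the client | ⚠️ Basic attachments |
153
- | **Streaming experience** | Real-time streaming inside IM | | ⚠️ Depends on bridge capability |
154
- | **Getting-started cost** | **One `npx` line** | Requires deploying a backend | Requires installing a server |
160
+ | Product layer | IM-driven control plane for local agents | General-purpose personal AI assistant / multi-channel ecosystem | Single-vendor native agent entry point |
161
+ | Agent strategy | Reuses official CLIs, picking up each vendor's latest capabilities | Ships its own runtime / agent stack | Serves only the vendor's own models |
162
+ | Execution environment | Your computer | Personal device network / local nodes | Local CLI or vendor cloud |
163
+ | Channel strategy | Telegram + Feishu, deeply polished | Broad coverage | Mainly Web / App / Slack / CLI |
164
+ | Lock-in | Low; engines can be switched at any time | | |
165
+ | Strongest scenario | Remote coding, long tasks, local automation | All-round personal assistant | Native model experience, enterprise integration |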
155
166
 
156
167
  ---
157
168
 
@@ -159,6 +170,8 @@ npx pikiclaw@latest --setup
159
170
 
160
171
  pikiclaw is not limited to programming. Whatever your agent can do, pikiclaw can dispatch remotely.
161
172
 
173
+ It is especially suited to tasks that **must run in your own environment** while you also want **progress and results to come straight back to IM**.
174
+
162
175
  **Engineering refactor**: "Migrate the whole project from JS to TS and run the tests until they all pass. Tell me when it's done."
163
176
 
164
177
  **Document processing**: "Consolidate all the scattered documents under docs/, extract the core metrics, and produce a report."
@@ -298,7 +311,7 @@ npx pikiclaw@latest --doctor # check the environment
298
311
  ## Development
299
312
 
300
313
  ```bash
301
- git clone https://github.com/nicepkg/pikiclaw.git
314
+ git clone https://github.com/xiaotonng/pikiclaw.git
302
315
  cd pikiclaw
303
316
  npm install
304
317
  echo "TELEGRAM_BOT_TOKEN=your_token" > .env
package/dist/bot.js CHANGED
@@ -11,26 +11,28 @@ import { getActiveUserConfig, onUserConfigChange, resolveUserWorkdir, setUserWor
11
11
  import { doStream, getSessions, getSessionTail, getUsage, initializeProjectSkills, listAgents, listModels, listSkills, isPendingSessionId, } from './code-agent.js';
12
12
  import { getDriver, hasDriver, allDriverIds } from './agent-driver.js';
13
13
  import { terminateProcessTree } from './process-control.js';
14
- export const VERSION = '0.2.36';
14
+ export const VERSION = '0.2.37';
15
15
  const MACOS_USER_ACTIVITY_PULSE_INTERVAL_MS = 20_000;
16
16
  const MACOS_USER_ACTIVITY_PULSE_TIMEOUT_S = 30;
17
17
  // ---------------------------------------------------------------------------
18
18
  // Helpers
19
19
  // ---------------------------------------------------------------------------
20
20
  /**
21
- * If `dir` has a .gitignore, ensure `.pikiclaw/` is ignored so it doesn't
22
- * pollute the project. Never modify .claude or .agents gitignore entries.
21
+ * If `dir` has a .gitignore, ignore managed `.pikiclaw` state without hiding
22
+ * `.pikiclaw/skills`, which may be committed as project skills.
23
23
  */
24
- function ensureGitignore(dir) {
24
+ export function ensureGitignore(dir) {
25
25
  try {
26
26
  const gi = path.join(dir, '.gitignore');
27
27
  if (!fs.existsSync(gi))
28
28
  return;
29
- const managedLines = ['.pikiclaw/'];
30
- const legacyLines = new Set([
29
+ const managedLines = [
31
30
  '.pikiclaw/*',
32
31
  '!.pikiclaw/skills/',
33
32
  '!.pikiclaw/skills/**',
33
+ ];
34
+ const legacyLines = new Set([
35
+ '.pikiclaw/',
34
36
  '.claude/skills/',
35
37
  '.agents/skills/',
36
38
  ]);
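The reordered lists above change what gets written into `.gitignore`: `.pikiclaw/` moves into the legacy set, and `.pikiclaw/*` plus two negation rules become the managed set. The switch matters because Git cannot re-include a path whose parent directory is itself excluded, so `!.pikiclaw/skills/` only takes effect when the rule ignores the directory's contents rather than the directory. This hunk does not show how the two lists are applied, so the sketch below assumes legacy lines are dropped and missing managed lines are appended. `rewriteGitignore` is a hypothetical name, not a function from the package.

```javascript
// Lists copied from the 0.2.37 side of the hunk.
const MANAGED_LINES = ['.pikiclaw/*', '!.pikiclaw/skills/', '!.pikiclaw/skills/**'];
const LEGACY_LINES = new Set(['.pikiclaw/', '.claude/skills/', '.agents/skills/']);

// Hypothetical helper (assumption): drop legacy rules, then append any
// managed rule that is not already present. Works on text, not on disk.
function rewriteGitignore(content) {
  const kept = content
    .split(/\r?\n/)
    .filter(line => !LEGACY_LINES.has(line.trim()));
  // Drop trailing blank lines so appended rules follow the last existing rule.
  while (kept.length && kept[kept.length - 1].trim() === '') kept.pop();
  const present = new Set(kept.map(line => line.trim()));
  for (const line of MANAGED_LINES) {
    if (!present.has(line)) kept.push(line);
  }
  return kept.join('\n') + '\n';
}

const before = 'node_modules/\n.pikiclaw/\n';
const after = rewriteGitignore(before);
console.log(after);
// node_modules/
// .pikiclaw/*
// !.pikiclaw/skills/
// !.pikiclaw/skills/**
```

Running the helper a second time on its own output leaves the text unchanged, which is the property a startup hook such as the `ensureGitignore(workdir)` call added to `run.js` would want.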
@@ -947,24 +947,55 @@ function ensureDirSymlink(linkPath, targetDir) {
947
947
  fs.mkdirSync(path.dirname(linkPath), { recursive: true });
948
948
  fs.symlinkSync(desiredTarget, linkPath, 'dir');
949
949
  }
950
+ function resetDir(dirPath) {
951
+ try {
952
+ fs.rmSync(dirPath, { recursive: true, force: true });
953
+ }
954
+ catch { }
955
+ fs.mkdirSync(dirPath, { recursive: true });
956
+ }
957
+ function copyMergedTree(sourceRoot, targetRoot, opts = {}) {
958
+ for (const relPath of listRelativeFiles(sourceRoot)) {
959
+ const sourcePath = path.join(sourceRoot, relPath);
960
+ const targetPath = path.join(targetRoot, relPath);
961
+ if (hasFile(targetPath)) {
962
+ opts.log?.(`skills merge skipped existing file: ${relPath}`);
963
+ continue;
964
+ }
965
+ fs.mkdirSync(path.dirname(targetPath), { recursive: true });
966
+ fs.copyFileSync(sourcePath, targetPath);
967
+ }
968
+ }
950
969
  export function initializeProjectSkills(workdir, opts = {}) {
951
970
  const canonicalRoot = path.join(workdir, '.pikiclaw', 'skills');
952
971
  const claudeRoot = path.join(workdir, '.claude', 'skills');
953
- // Only create .pikiclaw/skills → .claude/skills symlink.
954
- // Never modify files under .claude or .agents.
955
- if (hasDir(claudeRoot)) {
956
- ensureDirSymlink(canonicalRoot, claudeRoot);
957
- opts.log?.(`skills linked: .pikiclaw/skills → .claude/skills workdir=${workdir}`);
972
+ const agentsRoot = path.join(workdir, '.agents', 'skills');
973
+ const mergeRoot = path.join(workdir, '.pikiclaw', '.skills-merge-tmp');
974
+ const sourceRoots = [canonicalRoot, claudeRoot, agentsRoot];
975
+ const seenSourceReals = new Set();
976
+ resetDir(mergeRoot);
977
+ // Merge order defines precedence: existing canonical content wins first,
978
+ // then legacy Claude skills, then legacy agents skills.
979
+ for (const sourceRoot of sourceRoots) {
980
+ if (!hasDir(sourceRoot))
981
+ continue;
982
+ const realSource = realPathOrNull(sourceRoot);
983
+ if (realSource && seenSourceReals.has(realSource))
984
+ continue;
985
+ if (realSource)
986
+ seenSourceReals.add(realSource);
987
+ copyMergedTree(sourceRoot, mergeRoot, opts);
958
988
  }
959
- else {
960
- // Remove stale symlink before creating real directory
961
- try {
962
- if (fs.lstatSync(canonicalRoot).isSymbolicLink())
963
- fs.unlinkSync(canonicalRoot);
964
- }
965
- catch { }
966
- fs.mkdirSync(canonicalRoot, { recursive: true });
989
+ resetDir(canonicalRoot);
990
+ copyMergedTree(mergeRoot, canonicalRoot, opts);
991
+ try {
992
+ fs.rmSync(mergeRoot, { recursive: true, force: true });
993
+ }
994
+ catch { }
995
+ for (const linkRoot of [claudeRoot, agentsRoot]) {
996
+ ensureDirSymlink(linkRoot, canonicalRoot);
967
997
  }
998
+ opts.log?.(`skills merged into .pikiclaw/skills and linked to .claude/.agents workdir=${workdir}`);
968
999
  }
969
1000
  export function getProjectSkillPaths(workdir, skillName) {
970
1001
  const sharedSkillFile = path.join(workdir, '.pikiclaw', 'skills', skillName, 'SKILL.md');
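The new `initializeProjectSkills` merges three skill directories in a fixed order and never overwrites a file that an earlier source already supplied, so `.pikiclaw/skills` wins over `.claude/skills`, which wins over `.agents/skills`. The sketch below reproduces only that precedence rule in memory. `mergeFirstWins` and the sample trees are hypothetical; the package itself copies real files through a temporary `.skills-merge-tmp` directory.

```javascript
// In-memory stand-in for the merge: each source is a Map of
// relative path -> file content. Names and contents are illustrative only.
function mergeFirstWins(sources, log = () => {}) {
  const merged = new Map();
  for (const { root, files } of sources) {
    for (const [relPath, content] of files) {
      if (merged.has(relPath)) {
        // Same rule as copyMergedTree: keep the file from the earlier source.
        log(`skills merge skipped existing file: ${relPath} (from ${root})`);
        continue;
      }
      merged.set(relPath, content);
    }
  }
  return merged;
}

const sources = [
  { root: '.pikiclaw/skills', files: new Map([['review/SKILL.md', 'canonical review']]) },
  { root: '.claude/skills', files: new Map([['review/SKILL.md', 'claude review'], ['deploy/SKILL.md', 'claude deploy']]) },
  { root: '.agents/skills', files: new Map([['deploy/SKILL.md', 'agents deploy'], ['lint/SKILL.md', 'agents lint']]) },
];

const skipped = [];
const merged = mergeFirstWins(sources, msg => skipped.push(msg));
console.log(merged.get('review/SKILL.md')); // canonical review
console.log(merged.get('deploy/SKILL.md')); // claude deploy
console.log(merged.get('lint/SKILL.md'));   // agents lint
console.log(skipped.length);                // 2
```

The hunk also skips any source whose real path was already seen. Once `.claude/skills` and `.agents/skills` have been replaced by symlinks to `.pikiclaw/skills`, that check keeps later runs from reading the same directory three times.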
@@ -18,7 +18,6 @@
18
18
  import fs from 'node:fs';
19
19
  import path from 'node:path';
20
20
  import { workspaceTools } from './tools/workspace.js';
21
- import { captureTools } from './tools/capture.js';
22
21
  import { guiTools } from './tools/gui.js';
23
22
  // ---------------------------------------------------------------------------
24
23
  // Logging — writes to stderr + file so it doesn't interfere with stdio MCP transport
@@ -65,7 +64,7 @@ log(`started workspace=${ctx.workspace} stagedFiles=${ctx.stagedFiles.length} ca
65
64
  // ---------------------------------------------------------------------------
66
65
  // Tool registry — collect all tool modules
67
66
  // ---------------------------------------------------------------------------
68
- const TOOL_MODULES = [workspaceTools, captureTools, guiTools];
67
+ const TOOL_MODULES = [workspaceTools, guiTools];
69
68
  const ALL_TOOLS = TOOL_MODULES.flatMap(m => m.tools);
70
69
  /** Lookup: tool name → module that handles it. */
71
70
  const TOOL_HANDLERS = new Map();
package/dist/run.js CHANGED
@@ -7,7 +7,7 @@
7
7
  * npm run command -- claude-models
8
8
  * npm run command -- codex-models
9
9
  */
10
- import { formatThinkingForDisplay } from './bot.js';
10
+ import { ensureGitignore, formatThinkingForDisplay } from './bot.js';
11
11
  import { initializeProjectSkills, listAgents, listModels, listSkills, getUsage, doStream, getSessions, getSessionTail } from './code-agent.js';
12
12
  import { loadUserConfig, resolveUserWorkdir } from './user-config.js';
13
13
  function parseArgs(argv) {
@@ -98,6 +98,7 @@ async function main() {
98
98
  const args = parseArgs(process.argv.slice(2));
99
99
  const userConfig = loadUserConfig();
100
100
  const workdir = resolveUserWorkdir({ workdir: args.workdir, config: userConfig });
101
+ ensureGitignore(workdir);
101
102
  initializeProjectSkills(workdir);
102
103
  if (args.help || !args.command) {
103
104
  process.stdout.write(HELP);
@@ -90,7 +90,6 @@ function handleListFiles(args, ctx) {
90
90
  async function handleSendFile(args, ctx) {
91
91
  const filePath = typeof args?.path === 'string' ? args.path.trim() : '';
92
92
  const kind = typeof args?.kind === 'string' ? args.kind : undefined;
93
- toolLog('im_send_file', `path=${filePath} kind=${kind || 'auto'}`);
94
93
  if (!filePath) {
95
94
  toolLog('im_send_file', 'ERROR missing path');
96
95
  return toolResult('Error: "path" is required', true);
@@ -99,6 +98,8 @@ async function handleSendFile(args, ctx) {
99
98
  toolLog('im_send_file', 'ERROR no callback URL');
100
99
  return toolResult('Error: MCP callback URL is not configured', true);
101
100
  }
101
+ const callbackTarget = describeSendFileTarget(ctx.callbackUrl);
102
+ toolLog('im_send_file', `path=${filePath} kind=${kind || 'auto'} callback=${callbackTarget}`);
102
103
  try {
103
104
  const result = await callbackSendFile(ctx.callbackUrl, filePath, {
104
105
  caption: typeof args?.caption === 'string' ? args.caption : undefined,
@@ -109,13 +110,15 @@ async function handleSendFile(args, ctx) {
109
110
  return toolResult(`File sent successfully: ${filePath}`);
110
111
  }
111
112
  else {
112
- toolLog('im_send_file', `FAILED ${result.error || 'unknown error'}`);
113
- return toolResult(`Failed to send file: ${result.error || 'unknown error'}`, true);
113
+ const detail = formatSendFileFailure(result);
114
+ toolLog('im_send_file', `FAILED ${detail}`);
115
+ return toolResult(`Failed to send file: ${detail}`, true);
114
116
  }
115
117
  }
116
118
  catch (e) {
117
- toolLog('im_send_file', `ERROR ${e.message}`);
118
- return toolResult(`Error sending file: ${e.message}`, true);
119
+ const message = e instanceof Error ? e.message : String(e);
120
+ toolLog('im_send_file', `ERROR callback=${callbackTarget} ${message}`);
121
+ return toolResult(`Error sending file: ${message}`, true);
119
122
  }
120
123
  }
121
124
  // ---------------------------------------------------------------------------
@@ -132,14 +135,45 @@ function callbackSendFile(callbackUrl, filePath, opts) {
132
135
  let data = '';
133
136
  res.on('data', (chunk) => { data += chunk; });
134
137
  res.on('end', () => {
138
+ const statusCode = res.statusCode;
139
+ const statusMessage = res.statusMessage || undefined;
140
+ const bodyPreview = data ? previewText(data) : undefined;
141
+ let parsed = null;
135
142
  try {
136
- resolve(JSON.parse(data));
143
+ parsed = data ? JSON.parse(data) : null;
137
144
  }
138
- catch {
139
- resolve({ ok: false, error: 'invalid callback response' });
145
+ catch { }
146
+ if (statusCode && statusCode >= 400) {
147
+ const parsedError = typeof parsed?.error === 'string' ? parsed.error : null;
148
+ resolve({
149
+ ok: false,
150
+ error: parsedError || describeHttpFailure(statusCode, statusMessage, bodyPreview),
151
+ statusCode,
152
+ statusMessage,
153
+ bodyPreview,
154
+ });
155
+ return;
156
+ }
157
+ if (parsed && typeof parsed.ok === 'boolean') {
158
+ resolve({
159
+ ok: parsed.ok,
160
+ error: typeof parsed.error === 'string' ? parsed.error : undefined,
161
+ statusCode,
162
+ statusMessage,
163
+ bodyPreview,
164
+ });
165
+ return;
140
166
  }
167
+ resolve({
168
+ ok: false,
169
+ error: describeHttpFailure(statusCode, statusMessage, bodyPreview, 'invalid callback response'),
170
+ statusCode,
171
+ statusMessage,
172
+ bodyPreview,
173
+ });
141
174
  });
142
175
  });
176
+ req.setTimeout(30_000, () => req.destroy(new Error('send-file callback timed out after 30s')));
143
177
  req.on('error', e => reject(e));
144
178
  req.write(body);
145
179
  req.end();
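The rewritten `end` handler sorts a callback response into three outcomes: an HTTP error (status 400 or above), a well-formed `{ ok, error }` JSON body, or anything else. The sketch below lifts that decision into a pure function so the branches can be exercised without a socket. `classifyCallbackResponse` is a hypothetical name, and the body preview is simplified (the package's `previewText` also truncates to 400 characters).

```javascript
// Copied from the diff: builds "HTTP <code> <message>; body=<preview>".
function describeHttpFailure(statusCode, statusMessage, bodyPreview, fallback = 'callback request failed') {
  const status = statusCode ? `HTTP ${statusCode}${statusMessage ? ` ${statusMessage}` : ''}` : fallback;
  return bodyPreview ? `${status}; body=${bodyPreview}` : status;
}

// Hypothetical pure version of the res.on('end') logic in callbackSendFile.
function classifyCallbackResponse(statusCode, statusMessage, data) {
  // Simplified preview (assumption): collapse whitespace, no truncation.
  const bodyPreview = data ? data.replace(/\s+/g, ' ').trim() : undefined;
  let parsed = null;
  try {
    parsed = data ? JSON.parse(data) : null;
  } catch { /* non-JSON bodies fall through to the checks below */ }
  const meta = { statusCode, statusMessage, bodyPreview };
  if (statusCode && statusCode >= 400) {
    // Prefer the server's own error string when it sent one.
    const parsedError = typeof parsed?.error === 'string' ? parsed.error : null;
    return { ok: false, error: parsedError || describeHttpFailure(statusCode, statusMessage, bodyPreview), ...meta };
  }
  if (parsed && typeof parsed.ok === 'boolean') {
    return { ok: parsed.ok, error: typeof parsed.error === 'string' ? parsed.error : undefined, ...meta };
  }
  return { ok: false, error: describeHttpFailure(statusCode, statusMessage, bodyPreview, 'invalid callback response'), ...meta };
}

console.log(classifyCallbackResponse(200, 'OK', '{"ok":true}').ok);
// true
console.log(classifyCallbackResponse(502, 'Bad Gateway', '<html>upstream down</html>').error);
// HTTP 502 Bad Gateway; body=<html>upstream down</html>
console.log(classifyCallbackResponse(200, 'OK', 'not json').error);
// HTTP 200 OK; body=not json
```

The third case shows a detail that is easy to miss when reading the diff: the `'invalid callback response'` fallback text appears only when no status code is available, because `describeHttpFailure` prefers the status line whenever one exists.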
@@ -156,6 +190,32 @@ function safeRealpath(p) {
156
190
  return null;
157
191
  }
158
192
  }
193
+ function previewText(text, max = 400) {
194
+ const normalized = String(text || '').replace(/\s+/g, ' ').trim();
195
+ if (!normalized)
196
+ return '';
197
+ return normalized.length <= max ? normalized : `${normalized.slice(0, Math.max(0, max - 3)).trimEnd()}...`;
198
+ }
199
+ function describeSendFileTarget(callbackUrl) {
200
+ try {
201
+ const url = new URL('/send-file', callbackUrl);
202
+ return `${url.origin}${url.pathname}`;
203
+ }
204
+ catch {
205
+ return callbackUrl;
206
+ }
207
+ }
208
+ function describeHttpFailure(statusCode, statusMessage, bodyPreview, fallback = 'callback request failed') {
209
+ const status = statusCode ? `HTTP ${statusCode}${statusMessage ? ` ${statusMessage}` : ''}` : fallback;
210
+ return bodyPreview ? `${status}; body=${bodyPreview}` : status;
211
+ }
212
+ function formatSendFileFailure(result) {
213
+ const base = result.error?.trim() || describeHttpFailure(result.statusCode, result.statusMessage, result.bodyPreview, 'unknown error');
214
+ if (result.bodyPreview && !base.includes(result.bodyPreview)) {
215
+ return `${base}; body=${result.bodyPreview}`;
216
+ }
217
+ return base;
218
+ }
159
219
  // ---------------------------------------------------------------------------
160
220
  // Module export
161
221
  // ---------------------------------------------------------------------------
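Two of the helpers added in this hunk shape what ends up in the log line: `describeSendFileTarget` reduces the callback URL to origin plus `/send-file`, and `previewText` collapses whitespace and caps the response body. The snippet below copies both functions from the diff and runs them on made-up inputs; the host, port, and `token` query parameter are illustrative assumptions, not values from the package.

```javascript
// Copied from the diff.
function previewText(text, max = 400) {
  const normalized = String(text || '').replace(/\s+/g, ' ').trim();
  if (!normalized)
    return '';
  return normalized.length <= max ? normalized : `${normalized.slice(0, Math.max(0, max - 3)).trimEnd()}...`;
}

// Copied from the diff.
function describeSendFileTarget(callbackUrl) {
  try {
    const url = new URL('/send-file', callbackUrl);
    return `${url.origin}${url.pathname}`;
  }
  catch {
    return callbackUrl;
  }
}

// Resolving '/send-file' against the base replaces its path and query string.
console.log(describeSendFileTarget('http://127.0.0.1:8123/mcp?token=secret'));
// http://127.0.0.1:8123/send-file

// An unparseable base is logged unchanged.
console.log(describeSendFileTarget('not a url'));
// not a url

// Whitespace runs, including newlines, collapse to single spaces.
console.log(previewText('a\n\n  b   c'));
// a b c

// Long bodies are cut so the result, including "...", fits within max.
console.log(previewText('abcdefghij', 8));
// abcde...
console.log(previewText('x'.repeat(500)).length);
// 400
```

Because only `origin` and `pathname` are printed, any query string or embedded credentials on the callback URL stay out of the tool log whenever the URL parses.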
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pikiclaw",
3
- "version": "0.2.36",
3
+ "version": "0.2.37",
4
4
  "description": "The best IM-driven remote coding experience. Bridge AI coding agents to any IM.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -1,260 +0,0 @@
1
- /**
2
- * tools/capture.ts — Screen capture tool.
3
- *
4
- * take_screenshot — captures a screenshot (full screen, region, or window), cross-platform
5
- */
6
- import fs from 'node:fs';
7
- import path from 'node:path';
8
- import os from 'node:os';
9
- import { execFile } from 'node:child_process';
10
- import { toolResult, toolLog } from './types.js';
11
- // ---------------------------------------------------------------------------
12
- // Tool definitions
13
- // ---------------------------------------------------------------------------
14
- const tools = [
15
- {
16
- name: 'take_screenshot',
17
- description: 'Take a screenshot and return the saved PNG path.',
18
- inputSchema: {
19
- type: 'object',
20
- properties: {
21
- display_number: {
22
- type: 'number',
23
- description: 'Optional display number.',
24
- },
25
- window_title: {
26
- type: 'string',
27
- description: 'Optional window title.',
28
- },
29
- region: {
30
- type: 'object',
31
- description: 'Optional capture region.',
32
- properties: {
33
- x: { type: 'number', description: 'Left.' },
34
- y: { type: 'number', description: 'Top.' },
35
- width: { type: 'number', description: 'Width.' },
36
- height: { type: 'number', description: 'Height.' },
37
- },
38
- required: ['x', 'y', 'width', 'height'],
39
- },
40
- },
41
- },
42
- },
43
- ];
44
- // ---------------------------------------------------------------------------
45
- // Handler
46
- // ---------------------------------------------------------------------------
47
- async function handleTakeScreenshot(args) {
48
- const displayNumber = typeof args?.display_number === 'number' ? args.display_number : undefined;
49
- const windowTitle = typeof args?.window_title === 'string' ? args.window_title.trim() : undefined;
50
- const regionRaw = args?.region;
51
- let region;
52
- toolLog('take_screenshot', `display=${displayNumber ?? 'default'} window=${windowTitle || 'none'} region=${regionRaw ? 'yes' : 'none'}`);
53
- if (regionRaw && typeof regionRaw === 'object') {
54
- const x = Number(regionRaw.x);
55
- const y = Number(regionRaw.y);
56
- const w = Number(regionRaw.width);
57
- const h = Number(regionRaw.height);
58
- if ([x, y, w, h].some(v => !Number.isFinite(v) || v < 0) || w === 0 || h === 0) {
59
- toolLog('take_screenshot', 'ERROR invalid region params');
60
- return toolResult('Error: region requires finite non-negative x, y, width, height (width/height > 0)', true);
61
- }
62
- region = { x, y, width: w, height: h };
63
- }
64
- try {
65
- const filePath = await captureScreenshot({ displayNumber, region, windowTitle });
66
- const size = fs.statSync(filePath).size;
67
- toolLog('take_screenshot', `OK saved ${filePath} (${Math.round(size / 1024)} KB)`);
68
- return toolResult(`Screenshot saved: ${filePath} (${Math.round(size / 1024)} KB)`);
69
- }
70
- catch (e) {
71
- toolLog('take_screenshot', `ERROR ${e.message}`);
72
- return toolResult(`Screenshot failed: ${e.message}`, true);
73
- }
74
- }
75
- // ---------------------------------------------------------------------------
76
- // Shell exec helper
77
- // ---------------------------------------------------------------------------
78
- function execAsync(cmd, args) {
79
- return new Promise((resolve, reject) => {
80
- execFile(cmd, args, { timeout: 15_000, maxBuffer: 1024 * 1024 }, (err, stdout, stderr) => {
81
- if (err)
82
- reject(err);
83
- else
84
- resolve({ stdout, stderr });
85
- });
86
- });
87
- }
88
- // ---------------------------------------------------------------------------
89
- // Window ID lookup — cross-platform
90
- // ---------------------------------------------------------------------------
91
- /** macOS: use JXA to find CGWindowID by title substring. */
92
- async function findWindowIdDarwin(title) {
93
- const jxa = `
94
- ObjC.import('CoreGraphics');
95
- const list = $.CGWindowListCopyWindowInfo($.kCGWindowListOptionOnScreenOnly, 0);
96
- const count = ObjC.unwrap(list).length;
97
- const needle = ${JSON.stringify(title.toLowerCase())};
98
- for (let i = 0; i < count; i++) {
99
- const entry = ObjC.unwrap(list)[i];
100
- const name = ObjC.unwrap(entry.kCGWindowName || '') || '';
101
- const owner = ObjC.unwrap(entry.kCGWindowOwnerName || '') || '';
102
- if (name.toLowerCase().includes(needle) || owner.toLowerCase().includes(needle)) {
103
- const wid = ObjC.unwrap(entry.kCGWindowNumber);
104
- if (wid) { console.log(wid); $.exit(0); }
105
- }
106
- }
107
- console.log(''); $.exit(1);
108
- `.trim();
109
- const { stdout } = await execAsync('osascript', ['-l', 'JavaScript', '-e', jxa]);
110
- const wid = stdout.trim();
111
- if (!wid)
112
- throw new Error(`No window found matching "${title}" on macOS`);
113
- return wid;
114
- }
115
- /** Linux: use xdotool to find window ID by title substring. */
116
- async function findWindowIdLinux(title) {
117
- const { stdout } = await execAsync('xdotool', ['search', '--name', title]);
118
- const wid = stdout.trim().split('\n')[0];
119
- if (!wid)
120
- throw new Error(`No window found matching "${title}" on Linux (via xdotool)`);
121
- return wid;
122
- }
123
- /** Windows: find window handle by title substring, return bounding rect. */
124
- async function findWindowRectWindows(title) {
125
- const psScript = [
126
- 'Add-Type @"',
127
- 'using System; using System.Runtime.InteropServices;',
128
- 'public class Win32 {',
129
- ' [DllImport("user32.dll")] public static extern bool GetWindowRect(IntPtr hWnd, out RECT lpRect);',
130
- ' [StructLayout(LayoutKind.Sequential)] public struct RECT { public int Left, Top, Right, Bottom; }',
131
- '}',
132
- '"@',
133
- `$procs = Get-Process | Where-Object { $_.MainWindowTitle -like '*${title.replace(/'/g, "''")}*' -and $_.MainWindowHandle -ne 0 } | Select-Object -First 1`,
134
- 'if (-not $procs) { Write-Error "No window found"; exit 1 }',
135
- '$rect = New-Object Win32+RECT',
136
- '[Win32]::GetWindowRect($procs.MainWindowHandle, [ref]$rect) | Out-Null',
137
- '$obj = @{ x=$rect.Left; y=$rect.Top; width=$rect.Right-$rect.Left; height=$rect.Bottom-$rect.Top }',
138
- '$obj | ConvertTo-Json -Compress',
139
- ].join('\n');
140
- const { stdout } = await execAsync('powershell', ['-NoProfile', '-Command', psScript]);
141
- const rect = JSON.parse(stdout.trim());
142
- if (!rect || rect.width <= 0 || rect.height <= 0)
143
- throw new Error(`No window found matching "${title}" on Windows`);
144
- return rect;
145
- }
146
- // ---------------------------------------------------------------------------
147
- // Screenshot capture — cross-platform
148
- // ---------------------------------------------------------------------------
149
- async function captureScreenshot(opts) {
150
- const { displayNumber, region, windowTitle } = opts;
151
- const outFile = path.join(os.tmpdir(), `pikiclaw_screenshot_${process.pid}_${Date.now()}.png`);
152
- const platform = process.platform;
153
- if (platform === 'darwin') {
154
- const args = ['-x'];
155
- if (windowTitle) {
156
- const wid = await findWindowIdDarwin(windowTitle);
157
- args.push('-l', wid);
158
- }
159
- else {
160
- if (displayNumber != null)
161
- args.push('-D', String(displayNumber));
162
- if (region)
163
- args.push('-R', `${region.x},${region.y},${region.width},${region.height}`);
164
- }
165
- args.push(outFile);
166
- await execAsync('screencapture', args);
167
- }
168
- else if (platform === 'win32') {
169
- let bounds;
170
- if (windowTitle) {
171
- const rect = await findWindowRectWindows(windowTitle);
172
- bounds = `$bounds = New-Object System.Drawing.Rectangle(${rect.x},${rect.y},${rect.width},${rect.height})`;
173
- }
174
- else if (region) {
175
- bounds = `$bounds = New-Object System.Drawing.Rectangle(${region.x},${region.y},${region.width},${region.height})`;
176
- }
177
- else if (displayNumber != null) {
178
- bounds = `$screen = [System.Windows.Forms.Screen]::AllScreens[${displayNumber}]; $bounds = $screen.Bounds`;
179
- }
180
- else {
181
- bounds = `$screen = [System.Windows.Forms.Screen]::PrimaryScreen; $bounds = $screen.Bounds`;
182
- }
183
- const psScript = [
184
- 'Add-Type -AssemblyName System.Windows.Forms',
185
- 'Add-Type -AssemblyName System.Drawing',
186
- bounds,
187
- '$bmp = New-Object System.Drawing.Bitmap($bounds.Width, $bounds.Height)',
188
- '$g = [System.Drawing.Graphics]::FromImage($bmp)',
189
- '$g.CopyFromScreen($bounds.Location, [System.Drawing.Point]::Empty, $bounds.Size)',
190
- '$g.Dispose()',
191
- `$bmp.Save('${outFile.replace(/'/g, "''")}', [System.Drawing.Imaging.ImageFormat]::Png)`,
192
- '$bmp.Dispose()',
193
- ].join('; ');
194
- await execAsync('powershell', ['-NoProfile', '-Command', psScript]);
195
- }
196
- else {
197
- let windowId;
198
- if (windowTitle)
199
- windowId = await findWindowIdLinux(windowTitle);
200
- const captured = await captureLinux(outFile, { region, windowId });
201
- if (!captured)
202
- throw new Error('No screenshot tool found. Install one of: maim, scrot, gnome-screenshot, or import (ImageMagick).');
203
- }
204
- if (!fs.existsSync(outFile))
205
- throw new Error('Screenshot command succeeded but no file was produced.');
206
- return outFile;
207
- }
208
- async function captureLinux(tmpFile, opts) {
209
- const { region, windowId } = opts;
210
- // maim — best window/region support
211
- try {
212
- const args = [];
213
- if (windowId)
214
- args.push('-i', windowId);
215
- else if (region)
216
- args.push('-g', `${region.width}x${region.height}+${region.x}+${region.y}`);
217
- args.push(tmpFile);
218
- await execAsync('maim', args);
219
- return true;
220
- }
221
- catch { }
222
- // scrot
223
- try {
224
- const args = [];
225
- if (region)
226
- args.push('-a', `${region.x},${region.y},${region.width},${region.height}`);
227
- args.push(tmpFile);
228
- await execAsync('scrot', args);
229
- return true;
230
- }
231
- catch { }
232
- // gnome-screenshot
233
- try {
234
- await execAsync('gnome-screenshot', ['-f', tmpFile]);
235
- return true;
236
- }
237
- catch { }
238
- // import (ImageMagick)
239
- try {
240
- const win = windowId || 'root';
241
- const geometry = region ? `${region.width}x${region.height}+${region.x}+${region.y}` : undefined;
242
- const args = geometry ? ['-window', win, '-crop', geometry, tmpFile] : ['-window', win, tmpFile];
243
- await execAsync('import', args);
244
- return true;
245
- }
246
- catch { }
247
- return false;
248
- }
249
- // ---------------------------------------------------------------------------
250
- // Module export
251
- // ---------------------------------------------------------------------------
252
- export const captureTools = {
253
- tools,
254
- handle(name, args) {
255
- switch (name) {
256
- case 'take_screenshot': return handleTakeScreenshot(args);
257
- default: return toolResult(`Unknown capture tool: ${name}`, true);
258
- }
259
- },
260
- };