junis 0.3.11 → 0.3.12
- package/dist/cli/index.js +125 -62
- package/dist/server/mcp.js +125 -62
- package/dist/server/stdio.js +5 -4
- package/package.json +1 -1
package/dist/cli/index.js
CHANGED
@@ -491,7 +491,7 @@ var FilesystemTools = class {
 "- Use for system commands, package managers (npm, pip, brew), git, build tools, and scripting.",
 "- For reading files prefer read_file, for editing prefer edit_block, for searching prefer search_code.",
 "- NOT for macOS app GUI interaction. When the user asks to interact with, control, or automate any application (clicking, typing, reading screen, navigating menus), use the desktop_* tools instead (desktop_open_app, desktop_see, desktop_click, desktop_type, desktop_paste, desktop_hotkey, desktop_scroll, desktop_menu, desktop_screenshot).",
-"- The ONLY exception:
+"- The ONLY exception: permission fix commands (swift -e for CGRequestScreenCaptureAccess/AXIsProcessTrustedWithOptions, peekaboo permissions, or open 'x-apple.systempreferences:...').",
 "",
 "BEHAVIOR:",
 "- Execute commands directly when the user requests them. Do not ask for confirmation \u2014 the user has already decided.",
@@ -1633,9 +1633,10 @@ Cause: ${e.message}${hint}` }],
 "Use action='start' to begin, action='stop' to end and save. Only one recording can be active at a time.",
 "Platform-specific: macOS (screencapture -v), Windows/Linux (ffmpeg).",
 "",
-"PERMISSIONS (macOS): Screen Recording permission is needed
-"
-"
+"PERMISSIONS (macOS): Screen Recording permission is needed (inherited from terminal app).",
+"If denied, run via execute_command: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
+"\u2192 macOS dialog appears. Ask user to click Allow, then retry.",
+"Fallback: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'"
 ].join("\n"),
 {
 action: z4.enum(["start", "stop"]).describe("'start': begin recording, 'stop': end recording and save the file"),
@@ -1779,10 +1780,13 @@ var consecutiveFailures = 0;
 var MAX_CONSECUTIVE_FAILURES = 2;
 var PERM_FIX_HINT = [
 "\n\n\u{1F527} PERMISSION FIX \u2014 run these via execute_command:",
-"1. Check
-"2. Screen Recording:
-"3. Accessibility:
-"
+"1. Check: peekaboo permissions --json-output",
+"2. Screen Recording: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
+"3. Accessibility: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
+"\u2192 macOS system dialogs appear. Ask user to click Allow, then retry.",
+"NOTE: peekaboo inherits permissions from the terminal app \u2014 do NOT look for 'peekaboo' in System Preferences.",
+"Fallback (if Swift fails): open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
+" open 'x-apple.systempreferences:com.apple.preference.security?Privacy_Accessibility'"
 ].join("\n");
 function isPermissionError(msg) {
 const lower = msg.toLowerCase();
@@ -1821,27 +1825,33 @@ var DesktopTools = class {
 "Workflow: desktop_open_app \u2192 desktop_see \u2192 desktop_click/type/paste \u2192 verify with desktop_see or desktop_screenshot.",
 "",
 "WORKFLOW TIPS:",
-"- If accessibility tree times out (complex UI apps like KakaoTalk):
+"- If accessibility tree times out (complex UI apps like KakaoTalk): increase timeout parameter, or fall back to:",
+" desktop_screenshot \u2192 desktop_list_windows (get window bounds x,y,w,h) \u2192 calculate coordinates \u2192 desktop_click with coords parameter.",
 "- For Korean/Japanese/Chinese text input: always use desktop_paste (NOT desktop_type).",
 "- For multi-window apps: use desktop_list_windows to find specific windows.",
 "- Pass snapshotId to subsequent calls for 240x speed improvement.",
+"- Double-click to open items (e.g. chat windows in KakaoTalk): use desktop_click with doubleClick=true.",
 "",
-"PERMISSIONS: Requires Accessibility + Screen Recording
-"
-"
-"
-"
-"
+"PERMISSIONS: Requires Accessibility + Screen Recording.",
+"peekaboo inherits permissions from the parent terminal app \u2014 it does NOT need its own entry in System Preferences.",
+"If denied, fix via execute_command:",
+" 1. peekaboo permissions --json-output (check which are missing)",
+" 2. Screen Recording: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
+" 3. Accessibility: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
+" \u2192 macOS system dialogs appear. Ask user to click Allow, then retry.",
+" Fallback: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
 "",
 "SAFETY: Terminal, iTerm, and Finder are blocked. Two consecutive failures trigger automatic safety stop."
 ].join("\n"),
 {
-app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'Notes', 'Google Chrome'). Omit for the frontmost app.")
+app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'Notes', 'Google Chrome'). Omit for the frontmost app."),
+timeout: z5.number().optional().describe("Timeout in seconds (default: 20). Increase for complex UI apps. If it still times out, fall back to desktop_screenshot + coordinate-based desktop_click.")
 },
-async ({ app }) => {
+async ({ app, timeout }) => {
 checkBlacklist(app);
 const args = ["see"];
 if (app) args.push("--app", app);
+if (timeout) args.push("--timeout-seconds", String(timeout));
 const result = await peekaboo(args);
 const data = result.data;
 const snapshotId = data?.snapshot_id ?? result.snapshotId ?? result.snapshot_id;
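The new workflow tip in this hunk ends with "calculate coordinates" but does not spell out the arithmetic. The sketch below is one way it might be done. It is an illustration only: the helper name toClickCoords is hypothetical, and it assumes that window bounds are reported in screen points with a top-left origin and that the screenshot shows just that window at `scale` pixels per point. None of those assumptions is confirmed by the diff.

```javascript
// Hypothetical helper (not part of junis): map a pixel position measured in a
// window screenshot to the 'x,y' string that desktop_click's coords expects.
// Assumptions: bounds = { x, y, w, h } in screen points, top-left origin;
// the screenshot covers only that window at `scale` pixels per point.
function toClickCoords(bounds, pixelX, pixelY, scale = 1) {
  const x = Math.round(bounds.x + pixelX / scale);
  const y = Math.round(bounds.y + pixelY / scale);
  const inside =
    x >= bounds.x && x <= bounds.x + bounds.w &&
    y >= bounds.y && y <= bounds.y + bounds.h;
  if (!inside) {
    throw new RangeError(`Point ${x},${y} falls outside the window bounds`);
  }
  return `${x},${y}`;
}

// A button seen at pixel (340, 176) in a 2x screenshot of a window at (900, 100).
const exampleBounds = { x: 900, y: 100, w: 400, h: 600 };
console.log(toClickCoords(exampleBounds, 340, 176, 2)); // prints "1070,188"
```

The bounds check is a guard against clicking outside the target window when the scale factor or the offsets are wrong.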
@@ -1862,27 +1872,48 @@ var DesktopTools = class {
 server.tool(
 "desktop_click",
 [
-"Click a macOS UI element by
+"Click a macOS UI element by text query, element ID, or x,y coordinates.",
+"",
+"PARAMETER GUIDE:",
+"- query: Text/label to search for (e.g. 'Save', 'Submit'). Searches visible UI elements.",
+"- on: Element ID from a previous desktop_see snapshot (e.g. 'B1', 'T2'). Fastest with snapshotId.",
+"- coords: Click at exact screen coordinates as 'x,y' (e.g. '1070,188'). Use when accessibility tree times out.",
 "",
-"
-"
+"PROVEN WORKFLOW (from KakaoTalk automation):",
+"1. Try desktop_see first to get element IDs \u2192 click with 'on' parameter.",
+"2. If desktop_see times out: use desktop_screenshot \u2192 calculate coordinates \u2192 click with 'coords'.",
+"3. Use desktop_list_windows to get window bounds (x,y,w,h) for coordinate calculation.",
 "",
-"PERMISSIONS: Requires
+"PERMISSIONS: Requires Accessibility (inherited from terminal app).",
 "",
 "SAFETY: Terminal, iTerm, and Finder are blocked. Two consecutive failures trigger automatic safety stop."
 ].join("\n"),
 {
-
-
-
-
+query: z5.string().optional().describe("Text/label to search and click (e.g. 'Save', 'Submit Button')"),
+on: z5.string().optional().describe("Element ID from desktop_see snapshot (e.g. 'B1', 'T2')"),
+coords: z5.string().optional().describe("Screen coordinates as 'x,y' (e.g. '1070,188'). Use when accessibility tree is unavailable."),
+app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'KakaoTalk')"),
+snapshot: z5.string().optional().describe("snapshotId from desktop_see for cached interaction (240x faster)"),
+doubleClick: z5.boolean().optional().default(false).describe("Double-click instead of single click (e.g. open files, open chat windows)"),
+rightClick: z5.boolean().optional().default(false).describe("Right-click (context menu)")
 },
-async ({ on, app, snapshot, doubleClick }) => {
+async ({ query, on, coords, app, snapshot, doubleClick, rightClick }) => {
 checkBlacklist(app);
-
+if (!query && !on && !coords) {
+throw new Error("Provide at least one of: query (text search), on (element ID), or coords ('x,y').");
+}
+const args = ["click"];
+if (coords) {
+args.push("--coords", coords);
+} else if (on) {
+args.push("--on", on);
+} else if (query) {
+args.push(query);
+}
 if (app) args.push("--app", app);
 if (snapshot) args.push("--snapshot", snapshot);
-if (doubleClick) args.push("--double
+if (doubleClick) args.push("--double");
+if (rightClick) args.push("--right");
 const result = await peekaboo(args);
 return {
 content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
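The added handler in this hunk picks exactly one targeting mode, in the order coords, then on, then query. Pulled out as a standalone function, that precedence can be exercised without running peekaboo. The sketch below is only a restatement of the added lines: the name buildClickArgs is hypothetical, checkBlacklist and the peekaboo call are left out, and the error message is shortened.

```javascript
// Hypothetical extraction of the argument-building logic added to desktop_click.
// The flags (--coords, --on, --app, --snapshot, --double, --right) are the ones
// that appear in the diff; everything else is illustrative.
function buildClickArgs({ query, on, coords, app, snapshot, doubleClick = false, rightClick = false } = {}) {
  if (!query && !on && !coords) {
    throw new Error("Provide at least one of: query, on, or coords.");
  }
  const args = ["click"];
  // Only one targeting mode is forwarded: coords wins over on, on wins over query.
  if (coords) {
    args.push("--coords", coords);
  } else if (on) {
    args.push("--on", on);
  } else if (query) {
    args.push(query);
  }
  if (app) args.push("--app", app);
  if (snapshot) args.push("--snapshot", snapshot);
  if (doubleClick) args.push("--double");
  if (rightClick) args.push("--right");
  return args;
}

// coords takes priority, so the 'on' value is silently dropped here.
console.log(buildClickArgs({ on: "B1", coords: "1070,188", app: "KakaoTalk", doubleClick: true }));
```

One consequence worth noting: a caller that passes both on and coords gets a coordinate click, with no warning that the element ID was ignored.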
@@ -1892,23 +1923,27 @@ var DesktopTools = class {
 server.tool(
 "desktop_type",
 [
-"Type text into the currently focused UI element on macOS
+"Type text into the currently focused UI element on macOS via keyboard simulation.",
 "",
-"IMPORTANT:
-"
+"IMPORTANT: For Korean/Japanese/Chinese/emoji text, use desktop_paste instead \u2014 keyboard simulation does not support CJK.",
+"Always click the target input field first (via desktop_click) before typing.",
 "",
-"PERMISSIONS: Requires
+"PERMISSIONS: Requires Accessibility (inherited from terminal app).",
 "",
 "SAFETY: Terminal, iTerm, and Finder are blocked."
 ].join("\n"),
 {
-text: z5.string().describe("Text to type
-app: z5.string().optional().describe("App name to focus before typing")
+text: z5.string().describe("Text to type (ASCII only \u2014 for CJK/emoji use desktop_paste)"),
+app: z5.string().optional().describe("App name to focus before typing"),
+pressReturn: z5.boolean().optional().default(false).describe("Press Return/Enter after typing (e.g. to send a message or submit a form)"),
+clear: z5.boolean().optional().default(false).describe("Clear the field before typing (Cmd+A, Delete)")
 },
-async ({ text, app }) => {
+async ({ text, app, pressReturn, clear }) => {
 checkBlacklist(app);
 const args = ["type", text];
 if (app) args.push("--app", app);
+if (clear) args.push("--clear");
+if (pressReturn) args.push("--return");
 const result = await peekaboo(args);
 return {
 content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
@@ -1922,7 +1957,8 @@ var DesktopTools = class {
 "",
 "Common shortcuts: 'cmd,c' (copy), 'cmd,v' (paste), 'cmd,z' (undo), 'cmd,s' (save), 'cmd,w' (close tab), 'cmd,q' (quit), 'cmd,shift,t' (reopen tab), 'cmd,tab' (switch app).",
 "",
-"PERMISSIONS: Requires
+"PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
+"Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
 "",
 "SAFETY: Terminal, iTerm, and Finder are blocked."
 ].join("\n"),
@@ -1947,7 +1983,8 @@ var DesktopTools = class {
 "",
 "Use 'ticks' to control scroll distance (default: 3, higher = more scrolling). Can target a specific element by label or ID from a previous accessibility tree capture.",
 "",
-"PERMISSIONS: Requires
+"PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
+"Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
 "",
 "SAFETY: Terminal, iTerm, and Finder are blocked."
 ].join("\n"),
@@ -2004,7 +2041,8 @@ var DesktopTools = class {
 "If no app is specified, lists windows for the frontmost application.",
 "Use this after identifying running apps to find specific windows before capturing the accessibility tree or taking a screenshot.",
 "",
-"PERMISSIONS: Requires
+"PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
+"Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'"
 ].join("\n"),
 {
 app: z5.string().optional().describe("Filter by app name. Omit to query the frontmost app.")
@@ -2041,23 +2079,41 @@ var DesktopTools = class {
 server.tool(
 "desktop_screenshot",
 [
-"Take a high-quality macOS screenshot
+"Take a high-quality macOS screenshot. Returns base64 image data.",
+"",
+"MODES:",
+"- 'screen': full display capture (default). Use screenIndex for multi-monitor setups.",
+"- 'window': specific app window. Specify with app, windowTitle, or windowIndex.",
+"- 'frontmost': capture only the frontmost window.",
+"- 'auto': peekaboo chooses the best mode automatically.",
+"",
+"TARGETING SPECIFIC WINDOWS:",
+"- app: capture by app name (e.g. 'Safari', 'KakaoTalk')",
+"- windowTitle: capture a specific window by title (partial match supported)",
+"- windowIndex: capture by window z-order (0 = frontmost window of the app)",
+"- screenIndex: which display to capture in 'screen' mode (0-based, for multi-monitor)",
 "",
-"MODES: 'screen' captures the full display, 'window' captures a specific app window.",
 "TIP: Prefer the accessibility tree for understanding UI structure \u2014 use screenshots only when visual appearance matters (layouts, images, colors).",
 "",
-"PERMISSIONS: Requires
+"PERMISSIONS: Requires Screen Recording (inherited from terminal app, not peekaboo itself).",
+"Fix if denied via execute_command: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
 "",
 "SAFETY: Terminal, iTerm, and Finder are blocked."
 ].join("\n"),
 {
-app: z5.string().optional().describe("Capture a specific app's window (by name)"),
-mode: z5.enum(["screen", "window"]).optional().default("screen").describe("'screen': full display
+app: z5.string().optional().describe("Capture a specific app's window (by name, e.g. 'Safari', 'KakaoTalk')"),
+mode: z5.enum(["screen", "window", "frontmost", "auto"]).optional().default("screen").describe("'screen': full display, 'window': specific app window, 'frontmost': frontmost window, 'auto': peekaboo decides"),
+windowTitle: z5.string().optional().describe("Capture window by title (partial match). Use with mode='window'."),
+windowIndex: z5.number().optional().describe("Window z-order index (0 = frontmost window of the app). Use with mode='window'."),
+screenIndex: z5.number().optional().describe("Display index for multi-monitor (0-based). Use with mode='screen'.")
 },
-async ({ app, mode }) => {
+async ({ app, mode, windowTitle, windowIndex, screenIndex }) => {
 checkBlacklist(app);
 const args = ["image", "--mode", mode];
 if (app) args.push("--app", app);
+if (windowTitle) args.push("--window-title", windowTitle);
+if (windowIndex !== void 0) args.push("--window-index", String(windowIndex));
+if (screenIndex !== void 0) args.push("--screen-index", String(screenIndex));
 const result = await peekaboo(args);
 const data = result.data;
 const files = data?.files;
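The two index parameters in this hunk are compared against `void 0` rather than tested for truthiness. That matters because 0 is a meaningful value for both (frontmost window, first display), and a plain `if (windowIndex)` would drop it. The sketch below restates the added lines as a standalone function to show the difference; the name buildImageArgs is hypothetical and the peekaboo call is omitted.

```javascript
// Hypothetical extraction of the argument-building logic added to desktop_screenshot.
// Flags are the ones shown in the diff; `undefined` is equivalent to the `void 0`
// emitted by the bundler.
function buildImageArgs({ app, mode = "screen", windowTitle, windowIndex, screenIndex } = {}) {
  const args = ["image", "--mode", mode];
  if (app) args.push("--app", app);
  if (windowTitle) args.push("--window-title", windowTitle);
  // Explicit undefined checks keep index 0, which a truthiness test would discard.
  if (windowIndex !== undefined) args.push("--window-index", String(windowIndex));
  if (screenIndex !== undefined) args.push("--screen-index", String(screenIndex));
  return args;
}

console.log(buildImageArgs({ app: "Safari", mode: "window", windowIndex: 0 }));
// prints [ 'image', '--mode', 'window', '--app', 'Safari', '--window-index', '0' ]
```

By contrast, the timeout added to desktop_see in the earlier hunk uses `if (timeout)`, so a timeout of 0 is treated the same as no timeout at all.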
@@ -2085,7 +2141,8 @@ var DesktopTools = class {
 "Examples: ['File', 'New Tab'], ['Edit', 'Find', 'Find...'], ['View', 'Enter Full Screen'].",
 "Omit the 'app' parameter to target the frontmost app. The target app must be running.",
 "",
-"PERMISSIONS: Requires
+"PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
+"Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
 "",
 "SAFETY: Terminal, iTerm, and Finder are blocked."
 ].join("\n"),
@@ -2118,21 +2175,24 @@ var DesktopTools = class {
 server.tool(
 "desktop_paste",
 [
-"Paste text via clipboard into the focused element.
+"Paste text via clipboard into the focused element. Automatically sets clipboard, pastes (Cmd+V), then restores previous clipboard.",
 "",
-"
+"ALWAYS USE THIS instead of desktop_type for: Korean, Japanese, Chinese, emoji, or any non-ASCII text.",
+"Unlike desktop_type (keyboard simulation), this uses the system clipboard \u2014 works with ALL character sets.",
 "",
-
+`PROVEN: In KakaoTalk automation, 'peekaboo paste "\uC548\uB155?"' successfully sent Korean text while 'type' would have failed.`,
+"",
+"PERMISSIONS: Requires Accessibility (inherited from terminal app).",
 "",
 "SAFETY: Terminal, iTerm, and Finder are blocked."
 ].join("\n"),
 {
-text: z5.string().describe("Text to paste
+text: z5.string().describe("Text to paste (supports Korean, Japanese, Chinese, emoji, any Unicode)"),
 app: z5.string().optional().describe("App name to focus before pasting")
 },
 async ({ text, app }) => {
 checkBlacklist(app);
-const args = ["
+const args = ["paste", text];
 if (app) args.push("--app", app);
 const result = await peekaboo(args);
 return {
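The descriptions in this hunk and in the desktop_type hunk state the same rule from both sides: ASCII text can go through keyboard simulation, anything else should go through the clipboard. In this version the rule lives only in the tool descriptions, so the caller is expected to follow it. A caller-side check might look like the sketch below; pickTextTool is a hypothetical helper, not something the package ships.

```javascript
// Hypothetical caller-side helper: choose between desktop_type and desktop_paste
// based on the rule in the tool descriptions (ASCII only for desktop_type).
function pickTextTool(text) {
  // Matches only when every UTF-16 code unit is in the ASCII range 0x00-0x7F.
  // Hangul, kana, Han characters and emoji (surrogate pairs) all fail the test.
  const asciiOnly = /^[\x00-\x7F]*$/.test(text);
  return asciiOnly ? "desktop_type" : "desktop_paste";
}

console.log(pickTextTool("hello world")); // prints "desktop_type"
console.log(pickTextTool("\uC548\uB155?")); // prints "desktop_paste"
```

The second call uses the same Korean string as the PROVEN line in the diff, written with the same escapes.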
@@ -2145,8 +2205,10 @@ var DesktopTools = class {
 [
 "Launch or bring to front a macOS application. Use this as the FIRST STEP when automating any app.",
 "",
-"
-"
+"PROVEN WORKFLOW (from KakaoTalk automation):",
+"1. desktop_open_app \u2192 2. desktop_list_apps (verify) \u2192 3. desktop_see or desktop_screenshot \u2192 4. interact",
+"",
+"After launching, use desktop_list_apps to confirm the app is running, then desktop_see to capture UI.",
 "",
 "SAFETY: Terminal, iTerm, and Finder are blocked for automation safety."
 ].join("\n"),
@@ -2155,29 +2217,30 @@ var DesktopTools = class {
 },
 async ({ app }) => {
 checkBlacklist(app);
-
-
+const args = ["app", "launch", app, "--wait-until-ready"];
+const result = await peekaboo(args);
 return {
-content: [{ type: "text", text:
+content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
 };
 }
 );
 server.tool(
 "desktop_open_url",
 [
-"Open a URL
+"Open a URL or file with its default (or specified) application.",
 "",
-"Examples: 'https://google.com', '
+"Examples: 'https://google.com', '~/Documents/report.pdf', 'x-apple.systempreferences:...'"
 ].join("\n"),
 {
-url: z5.string().describe("URL
-app: z5.string().optional().describe("Specific app to open
+url: z5.string().describe("URL or file path to open"),
+app: z5.string().optional().describe("Specific app to open with (e.g. 'Google Chrome', 'Preview')")
 },
 async ({ url, app }) => {
-const args =
-
+const args = ["open", url];
+if (app) args.push("--app", app);
+const result = await peekaboo(args);
 return {
-content: [{ type: "text", text:
+content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
 };
 }
 );
package/dist/server/mcp.js
CHANGED
@@ -77,7 +77,7 @@ var FilesystemTools = class {
 "- Use for system commands, package managers (npm, pip, brew), git, build tools, and scripting.",
 "- For reading files prefer read_file, for editing prefer edit_block, for searching prefer search_code.",
 "- NOT for macOS app GUI interaction. When the user asks to interact with, control, or automate any application (clicking, typing, reading screen, navigating menus), use the desktop_* tools instead (desktop_open_app, desktop_see, desktop_click, desktop_type, desktop_paste, desktop_hotkey, desktop_scroll, desktop_menu, desktop_screenshot).",
-"- The ONLY exception:
+"- The ONLY exception: permission fix commands (swift -e for CGRequestScreenCaptureAccess/AXIsProcessTrustedWithOptions, peekaboo permissions, or open 'x-apple.systempreferences:...').",
 "",
 "BEHAVIOR:",
 "- Execute commands directly when the user requests them. Do not ask for confirmation \u2014 the user has already decided.",
@@ -1219,9 +1219,10 @@ Cause: ${e.message}${hint}` }],
 "Use action='start' to begin, action='stop' to end and save. Only one recording can be active at a time.",
 "Platform-specific: macOS (screencapture -v), Windows/Linux (ffmpeg).",
 "",
-"PERMISSIONS (macOS): Screen Recording permission is needed
-"
-"
+"PERMISSIONS (macOS): Screen Recording permission is needed (inherited from terminal app).",
+"If denied, run via execute_command: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
+"\u2192 macOS dialog appears. Ask user to click Allow, then retry.",
+"Fallback: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'"
 ].join("\n"),
 {
 action: z4.enum(["start", "stop"]).describe("'start': begin recording, 'stop': end recording and save the file"),
@@ -1365,10 +1366,13 @@ var consecutiveFailures = 0;
 var MAX_CONSECUTIVE_FAILURES = 2;
 var PERM_FIX_HINT = [
 "\n\n\u{1F527} PERMISSION FIX \u2014 run these via execute_command:",
-"1. Check
-"2. Screen Recording:
-"3. Accessibility:
-"
+"1. Check: peekaboo permissions --json-output",
+"2. Screen Recording: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
+"3. Accessibility: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
+"\u2192 macOS system dialogs appear. Ask user to click Allow, then retry.",
+"NOTE: peekaboo inherits permissions from the terminal app \u2014 do NOT look for 'peekaboo' in System Preferences.",
+"Fallback (if Swift fails): open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
+" open 'x-apple.systempreferences:com.apple.preference.security?Privacy_Accessibility'"
 ].join("\n");
 function isPermissionError(msg) {
 const lower = msg.toLowerCase();
@@ -1407,27 +1411,33 @@ var DesktopTools = class {
 "Workflow: desktop_open_app \u2192 desktop_see \u2192 desktop_click/type/paste \u2192 verify with desktop_see or desktop_screenshot.",
 "",
 "WORKFLOW TIPS:",
-"- If accessibility tree times out (complex UI apps like KakaoTalk):
+"- If accessibility tree times out (complex UI apps like KakaoTalk): increase timeout parameter, or fall back to:",
+" desktop_screenshot \u2192 desktop_list_windows (get window bounds x,y,w,h) \u2192 calculate coordinates \u2192 desktop_click with coords parameter.",
 "- For Korean/Japanese/Chinese text input: always use desktop_paste (NOT desktop_type).",
 "- For multi-window apps: use desktop_list_windows to find specific windows.",
 "- Pass snapshotId to subsequent calls for 240x speed improvement.",
+"- Double-click to open items (e.g. chat windows in KakaoTalk): use desktop_click with doubleClick=true.",
 "",
-"PERMISSIONS: Requires Accessibility + Screen Recording
-"
-"
-"
-"
-"
+"PERMISSIONS: Requires Accessibility + Screen Recording.",
+"peekaboo inherits permissions from the parent terminal app \u2014 it does NOT need its own entry in System Preferences.",
+"If denied, fix via execute_command:",
+" 1. peekaboo permissions --json-output (check which are missing)",
+" 2. Screen Recording: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
+" 3. Accessibility: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
+" \u2192 macOS system dialogs appear. Ask user to click Allow, then retry.",
+" Fallback: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
 "",
 "SAFETY: Terminal, iTerm, and Finder are blocked. Two consecutive failures trigger automatic safety stop."
 ].join("\n"),
 {
-app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'Notes', 'Google Chrome'). Omit for the frontmost app.")
+app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'Notes', 'Google Chrome'). Omit for the frontmost app."),
+timeout: z5.number().optional().describe("Timeout in seconds (default: 20). Increase for complex UI apps. If it still times out, fall back to desktop_screenshot + coordinate-based desktop_click.")
 },
-async ({ app }) => {
+async ({ app, timeout }) => {
 checkBlacklist(app);
 const args = ["see"];
 if (app) args.push("--app", app);
+if (timeout) args.push("--timeout-seconds", String(timeout));
 const result = await peekaboo(args);
 const data = result.data;
 const snapshotId = data?.snapshot_id ?? result.snapshotId ?? result.snapshot_id;
@@ -1448,27 +1458,48 @@ var DesktopTools = class {
|
|
|
1448
1458
|
server.tool(
|
|
1449
1459
|
"desktop_click",
|
|
1450
1460
|
[
|
|
1451
|
-
"Click a macOS UI element by
|
|
1461
|
+
"Click a macOS UI element by text query, element ID, or x,y coordinates.",
|
|
1462
|
+
"",
|
|
1463
|
+
"PARAMETER GUIDE:",
|
|
1464
|
+
"- query: Text/label to search for (e.g. 'Save', 'Submit'). Searches visible UI elements.",
|
|
1465
|
+
"- on: Element ID from a previous desktop_see snapshot (e.g. 'B1', 'T2'). Fastest with snapshotId.",
|
|
1466
|
+
"- coords: Click at exact screen coordinates as 'x,y' (e.g. '1070,188'). Use when accessibility tree times out.",
|
|
1452
1467
|
"",
|
|
1453
|
-
"
|
|
1454
|
-
"
|
|
1468
|
+
"PROVEN WORKFLOW (from KakaoTalk automation):",
|
|
1469
|
+
"1. Try desktop_see first to get element IDs \u2192 click with 'on' parameter.",
|
|
1470
|
+
"2. If desktop_see times out: use desktop_screenshot \u2192 calculate coordinates \u2192 click with 'coords'.",
|
|
1471
|
+
"3. Use desktop_list_windows to get window bounds (x,y,w,h) for coordinate calculation.",
|
|
1455
1472
|
"",
|
|
1456
|
-
"PERMISSIONS: Requires
|
|
1473
|
+
"PERMISSIONS: Requires Accessibility (inherited from terminal app).",
|
|
1457
1474
|
"",
|
|
1458
1475
|
"SAFETY: Terminal, iTerm, and Finder are blocked. Two consecutive failures trigger automatic safety stop."
|
|
1459
1476
|
].join("\n"),
|
|
1460
1477
|
{
|
|
1461
|
-
|
|
1462
|
-
|
|
1463
|
-
|
|
1464
|
-
|
|
1478
|
+
query: z5.string().optional().describe("Text/label to search and click (e.g. 'Save', 'Submit Button')"),
|
|
1479
|
+
on: z5.string().optional().describe("Element ID from desktop_see snapshot (e.g. 'B1', 'T2')"),
|
|
1480
|
+
coords: z5.string().optional().describe("Screen coordinates as 'x,y' (e.g. '1070,188'). Use when accessibility tree is unavailable."),
|
|
1481
|
+
app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'KakaoTalk')"),
|
|
1482
|
+
snapshot: z5.string().optional().describe("snapshotId from desktop_see for cached interaction (240x faster)"),
|
|
1483
|
+
doubleClick: z5.boolean().optional().default(false).describe("Double-click instead of single click (e.g. open files, open chat windows)"),
|
|
1484
|
+
rightClick: z5.boolean().optional().default(false).describe("Right-click (context menu)")
|
|
1465
1485
|
},
|
|
1466
|
-
async ({ on, app, snapshot, doubleClick }) => {
|
|
1486
|
+
async ({ query, on, coords, app, snapshot, doubleClick, rightClick }) => {
|
|
1467
1487
|
checkBlacklist(app);
|
|
1468
|
-
|
|
1488
|
+
if (!query && !on && !coords) {
|
|
1489
|
+
throw new Error("Provide at least one of: query (text search), on (element ID), or coords ('x,y').");
|
|
1490
|
+
}
|
|
1491
|
+
const args = ["click"];
|
|
1492
|
+
if (coords) {
|
|
1493
|
+
args.push("--coords", coords);
|
|
1494
|
+
} else if (on) {
|
|
1495
|
+
args.push("--on", on);
|
|
1496
|
+
} else if (query) {
|
|
1497
|
+
args.push(query);
|
|
1498
|
+
}
|
|
1469
1499
|
if (app) args.push("--app", app);
|
|
1470
1500
|
if (snapshot) args.push("--snapshot", snapshot);
|
|
1471
|
-
if (doubleClick) args.push("--double
|
|
1501
|
+
if (doubleClick) args.push("--double");
|
|
1502
|
+
if (rightClick) args.push("--right");
|
|
1472
1503
|
const result = await peekaboo(args);
|
|
1473
1504
|
return {
|
|
1474
1505
|
content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
|
|
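Note on the desktop_click hunk above: the new handler picks exactly one targeting mode, in the order coords, then on, then query, and only then appends the optional flags. A minimal standalone sketch of that branch follows. The function name buildClickArgs is hypothetical (the bundled handler inlines this logic), and the flag spellings are copied from the added lines rather than checked against the peekaboo CLI.

```javascript
// Sketch of the argument-building branch in the desktop_click handler.
// buildClickArgs is a hypothetical name used only for illustration.
function buildClickArgs({ query, on, coords, app, snapshot, doubleClick = false, rightClick = false }) {
  if (!query && !on && !coords) {
    throw new Error("Provide at least one of: query (text search), on (element ID), or coords ('x,y').");
  }
  const args = ["click"];
  // Only one targeting mode is used: coords wins over on, on wins over query.
  if (coords) {
    args.push("--coords", coords);
  } else if (on) {
    args.push("--on", on);
  } else if (query) {
    args.push(query);
  }
  if (app) args.push("--app", app);
  if (snapshot) args.push("--snapshot", snapshot);
  if (doubleClick) args.push("--double");
  if (rightClick) args.push("--right");
  return args;
}
```

If a caller passes both coords and on, the element ID is silently ignored, which matches the fallback order described in the tool text.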
@@ -1478,23 +1509,27 @@ var DesktopTools = class {
|
|
|
1478
1509
|
server.tool(
|
|
1479
1510
|
"desktop_type",
|
|
1480
1511
|
[
|
|
1481
|
-
"Type text into the currently focused UI element on macOS
|
|
1512
|
+
"Type text into the currently focused UI element on macOS via keyboard simulation.",
|
|
1482
1513
|
"",
|
|
1483
|
-
"IMPORTANT:
|
|
1484
|
-
"
|
|
1514
|
+
"IMPORTANT: For Korean/Japanese/Chinese/emoji text, use desktop_paste instead \u2014 keyboard simulation does not support CJK.",
|
|
1515
|
+
"Always click the target input field first (via desktop_click) before typing.",
|
|
1485
1516
|
"",
|
|
1486
|
-
"PERMISSIONS: Requires
|
|
1517
|
+
"PERMISSIONS: Requires Accessibility (inherited from terminal app).",
|
|
1487
1518
|
"",
|
|
1488
1519
|
"SAFETY: Terminal, iTerm, and Finder are blocked."
|
|
1489
1520
|
].join("\n"),
|
|
1490
1521
|
{
|
|
1491
|
-
text: z5.string().describe("Text to type
|
|
1492
|
-
app: z5.string().optional().describe("App name to focus before typing")
|
|
1522
|
+
text: z5.string().describe("Text to type (ASCII only \u2014 for CJK/emoji use desktop_paste)"),
|
|
1523
|
+
app: z5.string().optional().describe("App name to focus before typing"),
|
|
1524
|
+
pressReturn: z5.boolean().optional().default(false).describe("Press Return/Enter after typing (e.g. to send a message or submit a form)"),
|
|
1525
|
+
clear: z5.boolean().optional().default(false).describe("Clear the field before typing (Cmd+A, Delete)")
|
|
1493
1526
|
},
|
|
1494
|
-
async ({ text, app }) => {
|
|
1527
|
+
async ({ text, app, pressReturn, clear }) => {
|
|
1495
1528
|
checkBlacklist(app);
|
|
1496
1529
|
const args = ["type", text];
|
|
1497
1530
|
if (app) args.push("--app", app);
|
|
1531
|
+
if (clear) args.push("--clear");
|
|
1532
|
+
if (pressReturn) args.push("--return");
|
|
1498
1533
|
const result = await peekaboo(args);
|
|
1499
1534
|
return {
|
|
1500
1535
|
content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
|
|
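Note on the desktop_type hunk above: the new description routes any non-ASCII text (Korean, Japanese, Chinese, emoji) to desktop_paste. The package leaves that choice to the calling model; the sketch below shows one way a caller could make it mechanically. Both helper names are hypothetical and do not appear in the package.

```javascript
// Hypothetical helpers, not part of the package: choose between
// desktop_type and desktop_paste based on the characters in the text.
function isAsciiOnly(text) {
  // True when every UTF-16 code unit is in the 7-bit ASCII range.
  return /^[\x00-\x7F]*$/.test(text);
}

function chooseTextTool(text) {
  // Keyboard simulation is described as ASCII only; anything else goes via the clipboard.
  return isAsciiOnly(text) ? "desktop_type" : "desktop_paste";
}
```

Note that pressReturn and clear exist only on desktop_type in this diff, so a caller that switches to desktop_paste would need a separate step (for example desktop_hotkey) to submit or clear the field.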
@@ -1508,7 +1543,8 @@ var DesktopTools = class {
|
|
|
1508
1543
|
"",
|
|
1509
1544
|
"Common shortcuts: 'cmd,c' (copy), 'cmd,v' (paste), 'cmd,z' (undo), 'cmd,s' (save), 'cmd,w' (close tab), 'cmd,q' (quit), 'cmd,shift,t' (reopen tab), 'cmd,tab' (switch app).",
|
|
1510
1545
|
"",
|
|
1511
|
-
"PERMISSIONS: Requires
|
|
1546
|
+
"PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
|
|
1547
|
+
"Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
|
|
1512
1548
|
"",
|
|
1513
1549
|
"SAFETY: Terminal, iTerm, and Finder are blocked."
|
|
1514
1550
|
].join("\n"),
|
|
@@ -1533,7 +1569,8 @@ var DesktopTools = class {
|
|
|
1533
1569
|
"",
|
|
1534
1570
|
"Use 'ticks' to control scroll distance (default: 3, higher = more scrolling). Can target a specific element by label or ID from a previous accessibility tree capture.",
|
|
1535
1571
|
"",
|
|
1536
|
-
"PERMISSIONS: Requires
|
|
1572
|
+
"PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
|
|
1573
|
+
"Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
|
|
1537
1574
|
"",
|
|
1538
1575
|
"SAFETY: Terminal, iTerm, and Finder are blocked."
|
|
1539
1576
|
].join("\n"),
|
|
@@ -1590,7 +1627,8 @@ var DesktopTools = class {
|
|
|
1590
1627
|
"If no app is specified, lists windows for the frontmost application.",
|
|
1591
1628
|
"Use this after identifying running apps to find specific windows before capturing the accessibility tree or taking a screenshot.",
|
|
1592
1629
|
"",
|
|
1593
|
-
"PERMISSIONS: Requires
|
|
1630
|
+
"PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
|
|
1631
|
+
"Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'"
|
|
1594
1632
|
].join("\n"),
|
|
1595
1633
|
{
|
|
1596
1634
|
app: z5.string().optional().describe("Filter by app name. Omit to query the frontmost app.")
|
|
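Note on the coordinate fallback: the desktop_click text says to take window bounds (x,y,w,h) from desktop_list_windows and calculate click coordinates from them. A minimal sketch of that arithmetic follows. It assumes the bounds are in the same screen coordinate space that the 'coords' parameter expects, and that the target is known as a fraction of the window size; the helper name and the shape of the bounds object are hypothetical.

```javascript
// Hypothetical helper: convert a position inside a window (as fractions of
// its width and height) into the 'x,y' string used by the coords parameter.
function toClickCoords(bounds, relX, relY) {
  if (relX < 0 || relX > 1 || relY < 0 || relY > 1) {
    throw new RangeError("relX and relY must be between 0 and 1");
  }
  const x = Math.round(bounds.x + bounds.w * relX);
  const y = Math.round(bounds.y + bounds.h * relY);
  return `${x},${y}`;
}
```

When the position is measured on a screenshot instead, the image may be at a different pixel scale than the window bounds (for example on high-density displays), so converting to fractions of the window first avoids mixing the two scales.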
@@ -1627,23 +1665,41 @@ var DesktopTools = class {
|
|
|
1627
1665
|
server.tool(
|
|
1628
1666
|
"desktop_screenshot",
|
|
1629
1667
|
[
|
|
1630
|
-
"Take a high-quality macOS screenshot
|
|
1668
|
+
"Take a high-quality macOS screenshot. Returns base64 image data.",
|
|
1669
|
+
"",
|
|
1670
|
+
"MODES:",
|
|
1671
|
+
"- 'screen': full display capture (default). Use screenIndex for multi-monitor setups.",
|
|
1672
|
+
"- 'window': specific app window. Specify with app, windowTitle, or windowIndex.",
|
|
1673
|
+
"- 'frontmost': capture only the frontmost window.",
|
|
1674
|
+
"- 'auto': peekaboo chooses the best mode automatically.",
|
|
1675
|
+
"",
|
|
1676
|
+
"TARGETING SPECIFIC WINDOWS:",
|
|
1677
|
+
"- app: capture by app name (e.g. 'Safari', 'KakaoTalk')",
|
|
1678
|
+
"- windowTitle: capture a specific window by title (partial match supported)",
|
|
1679
|
+
"- windowIndex: capture by window z-order (0 = frontmost window of the app)",
|
|
1680
|
+
"- screenIndex: which display to capture in 'screen' mode (0-based, for multi-monitor)",
|
|
1631
1681
|
"",
|
|
1632
|
-
"MODES: 'screen' captures the full display, 'window' captures a specific app window.",
|
|
1633
1682
|
"TIP: Prefer the accessibility tree for understanding UI structure \u2014 use screenshots only when visual appearance matters (layouts, images, colors).",
|
|
1634
1683
|
"",
|
|
1635
|
-
"PERMISSIONS: Requires
|
|
1684
|
+
"PERMISSIONS: Requires Screen Recording (inherited from terminal app, not peekaboo itself).",
|
|
1685
|
+
"Fix if denied via execute_command: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
|
|
1636
1686
|
"",
|
|
1637
1687
|
"SAFETY: Terminal, iTerm, and Finder are blocked."
|
|
1638
1688
|
].join("\n"),
|
|
1639
1689
|
{
|
|
1640
|
-
app: z5.string().optional().describe("Capture a specific app's window (by name)"),
|
|
1641
|
-
mode: z5.enum(["screen", "window"]).optional().default("screen").describe("'screen': full display
|
|
1690
|
+
app: z5.string().optional().describe("Capture a specific app's window (by name, e.g. 'Safari', 'KakaoTalk')"),
|
|
1691
|
+
mode: z5.enum(["screen", "window", "frontmost", "auto"]).optional().default("screen").describe("'screen': full display, 'window': specific app window, 'frontmost': frontmost window, 'auto': peekaboo decides"),
|
|
1692
|
+
windowTitle: z5.string().optional().describe("Capture window by title (partial match). Use with mode='window'."),
|
|
1693
|
+
windowIndex: z5.number().optional().describe("Window z-order index (0 = frontmost window of the app). Use with mode='window'."),
|
|
1694
|
+
screenIndex: z5.number().optional().describe("Display index for multi-monitor (0-based). Use with mode='screen'.")
|
|
1642
1695
|
},
|
|
1643
|
-
async ({ app, mode }) => {
|
|
1696
|
+
async ({ app, mode, windowTitle, windowIndex, screenIndex }) => {
|
|
1644
1697
|
checkBlacklist(app);
|
|
1645
1698
|
const args = ["image", "--mode", mode];
|
|
1646
1699
|
if (app) args.push("--app", app);
|
|
1700
|
+
if (windowTitle) args.push("--window-title", windowTitle);
|
|
1701
|
+
if (windowIndex !== void 0) args.push("--window-index", String(windowIndex));
|
|
1702
|
+
if (screenIndex !== void 0) args.push("--screen-index", String(screenIndex));
|
|
1647
1703
|
const result = await peekaboo(args);
|
|
1648
1704
|
const data = result.data;
|
|
1649
1705
|
const files = data?.files;
|
|
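Note on the desktop_screenshot hunk above: windowIndex and screenIndex are compared against void 0 rather than tested for truthiness, because 0 is a valid index (the frontmost window, or the first display) and would be dropped by a plain `if (windowIndex)`. A standalone sketch of the same argument building follows; buildScreenshotArgs is a hypothetical name, and the flag spellings are taken from the added lines.

```javascript
// Sketch of the argument building in the desktop_screenshot handler.
// buildScreenshotArgs is a hypothetical name used only for illustration.
function buildScreenshotArgs({ app, mode = "screen", windowTitle, windowIndex, screenIndex }) {
  const args = ["image", "--mode", mode];
  if (app) args.push("--app", app);
  if (windowTitle) args.push("--window-title", windowTitle);
  // Explicit undefined checks keep index 0, which a truthiness test would skip.
  if (windowIndex !== undefined) args.push("--window-index", String(windowIndex));
  if (screenIndex !== undefined) args.push("--screen-index", String(screenIndex));
  return args;
}
```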
@@ -1671,7 +1727,8 @@ var DesktopTools = class {
|
|
|
1671
1727
|
"Examples: ['File', 'New Tab'], ['Edit', 'Find', 'Find...'], ['View', 'Enter Full Screen'].",
|
|
1672
1728
|
"Omit the 'app' parameter to target the frontmost app. The target app must be running.",
|
|
1673
1729
|
"",
|
|
1674
|
-
"PERMISSIONS: Requires
|
|
1730
|
+
"PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
|
|
1731
|
+
"Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
|
|
1675
1732
|
"",
|
|
1676
1733
|
"SAFETY: Terminal, iTerm, and Finder are blocked."
|
|
1677
1734
|
].join("\n"),
|
|
@@ -1704,21 +1761,24 @@ var DesktopTools = class {
|
|
|
1704
1761
|
server.tool(
|
|
1705
1762
|
"desktop_paste",
|
|
1706
1763
|
[
|
|
1707
|
-
"Paste text via clipboard into the focused element.
|
|
1764
|
+
"Paste text via clipboard into the focused element. Automatically sets clipboard, pastes (Cmd+V), then restores previous clipboard.",
|
|
1708
1765
|
"",
|
|
1709
|
-
"
|
|
1766
|
+
"ALWAYS USE THIS instead of desktop_type for: Korean, Japanese, Chinese, emoji, or any non-ASCII text.",
|
|
1767
|
+
"Unlike desktop_type (keyboard simulation), this uses the system clipboard \u2014 works with ALL character sets.",
|
|
1710
1768
|
"",
|
|
1711
|
-
|
|
1769
|
+
`PROVEN: In KakaoTalk automation, 'peekaboo paste "\uC548\uB155?"' successfully sent Korean text while 'type' would have failed.`,
|
|
1770
|
+
"",
|
|
1771
|
+
"PERMISSIONS: Requires Accessibility (inherited from terminal app).",
|
|
1712
1772
|
"",
|
|
1713
1773
|
"SAFETY: Terminal, iTerm, and Finder are blocked."
|
|
1714
1774
|
].join("\n"),
|
|
1715
1775
|
{
|
|
1716
|
-
text: z5.string().describe("Text to paste
|
|
1776
|
+
text: z5.string().describe("Text to paste (supports Korean, Japanese, Chinese, emoji, any Unicode)"),
|
|
1717
1777
|
app: z5.string().optional().describe("App name to focus before pasting")
|
|
1718
1778
|
},
|
|
1719
1779
|
async ({ text, app }) => {
|
|
1720
1780
|
checkBlacklist(app);
|
|
1721
|
-
const args = ["
|
|
1781
|
+
const args = ["paste", text];
|
|
1722
1782
|
if (app) args.push("--app", app);
|
|
1723
1783
|
const result = await peekaboo(args);
|
|
1724
1784
|
return {
|
|
@@ -1731,8 +1791,10 @@ var DesktopTools = class {
|
|
|
1731
1791
|
[
|
|
1732
1792
|
"Launch or bring to front a macOS application. Use this as the FIRST STEP when automating any app.",
|
|
1733
1793
|
"",
|
|
1734
|
-
"
|
|
1735
|
-
"
|
|
1794
|
+
"PROVEN WORKFLOW (from KakaoTalk automation):",
|
|
1795
|
+
"1. desktop_open_app \u2192 2. desktop_list_apps (verify) \u2192 3. desktop_see or desktop_screenshot \u2192 4. interact",
|
|
1796
|
+
"",
|
|
1797
|
+
"After launching, use desktop_list_apps to confirm the app is running, then desktop_see to capture UI.",
|
|
1736
1798
|
"",
|
|
1737
1799
|
"SAFETY: Terminal, iTerm, and Finder are blocked for automation safety."
|
|
1738
1800
|
].join("\n"),
|
|
@@ -1741,29 +1803,30 @@ var DesktopTools = class {
|
|
|
1741
1803
|
},
|
|
1742
1804
|
async ({ app }) => {
|
|
1743
1805
|
checkBlacklist(app);
|
|
1744
|
-
|
|
1745
|
-
|
|
1806
|
+
const args = ["app", "launch", app, "--wait-until-ready"];
|
|
1807
|
+
const result = await peekaboo(args);
|
|
1746
1808
|
return {
|
|
1747
|
-
content: [{ type: "text", text:
|
|
1809
|
+
content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
|
|
1748
1810
|
};
|
|
1749
1811
|
}
|
|
1750
1812
|
);
|
|
1751
1813
|
server.tool(
|
|
1752
1814
|
"desktop_open_url",
|
|
1753
1815
|
[
|
|
1754
|
-
"Open a URL
|
|
1816
|
+
"Open a URL or file with its default (or specified) application.",
|
|
1755
1817
|
"",
|
|
1756
|
-
"Examples: 'https://google.com', '
|
|
1818
|
+
"Examples: 'https://google.com', '~/Documents/report.pdf', 'x-apple.systempreferences:...'"
|
|
1757
1819
|
].join("\n"),
|
|
1758
1820
|
{
|
|
1759
|
-
url: z5.string().describe("URL
|
|
1760
|
-
app: z5.string().optional().describe("Specific app to open
|
|
1821
|
+
url: z5.string().describe("URL or file path to open"),
|
|
1822
|
+
app: z5.string().optional().describe("Specific app to open with (e.g. 'Google Chrome', 'Preview')")
|
|
1761
1823
|
},
|
|
1762
1824
|
async ({ url, app }) => {
|
|
1763
|
-
const args =
|
|
1764
|
-
|
|
1825
|
+
const args = ["open", url];
|
|
1826
|
+
if (app) args.push("--app", app);
|
|
1827
|
+
const result = await peekaboo(args);
|
|
1765
1828
|
return {
|
|
1766
|
-
content: [{ type: "text", text:
|
|
1829
|
+
content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
|
|
1767
1830
|
};
|
|
1768
1831
|
}
|
|
1769
1832
|
);
|
package/dist/server/stdio.js
CHANGED
|
@@ -78,7 +78,7 @@ var FilesystemTools = class {
|
|
|
78
78
|
"- Use for system commands, package managers (npm, pip, brew), git, build tools, and scripting.",
|
|
79
79
|
"- For reading files prefer read_file, for editing prefer edit_block, for searching prefer search_code.",
|
|
80
80
|
"- NOT for macOS app GUI interaction. When the user asks to interact with, control, or automate any application (clicking, typing, reading screen, navigating menus), use the desktop_* tools instead (desktop_open_app, desktop_see, desktop_click, desktop_type, desktop_paste, desktop_hotkey, desktop_scroll, desktop_menu, desktop_screenshot).",
|
|
81
|
-
"- The ONLY exception:
|
|
81
|
+
"- The ONLY exception: permission fix commands (swift -e for CGRequestScreenCaptureAccess/AXIsProcessTrustedWithOptions, peekaboo permissions, or open 'x-apple.systempreferences:...').",
|
|
82
82
|
"",
|
|
83
83
|
"BEHAVIOR:",
|
|
84
84
|
"- Execute commands directly when the user requests them. Do not ask for confirmation \u2014 the user has already decided.",
|
|
@@ -1220,9 +1220,10 @@ Cause: ${e.message}${hint}` }],
|
|
|
1220
1220
|
"Use action='start' to begin, action='stop' to end and save. Only one recording can be active at a time.",
|
|
1221
1221
|
"Platform-specific: macOS (screencapture -v), Windows/Linux (ffmpeg).",
|
|
1222
1222
|
"",
|
|
1223
|
-
"PERMISSIONS (macOS): Screen Recording permission is needed
|
|
1224
|
-
"
|
|
1225
|
-
"
|
|
1223
|
+
"PERMISSIONS (macOS): Screen Recording permission is needed (inherited from terminal app).",
|
|
1224
|
+
"If denied, run via execute_command: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
|
|
1225
|
+
"\u2192 macOS dialog appears. Ask user to click Allow, then retry.",
|
|
1226
|
+
"Fallback: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'"
|
|
1226
1227
|
].join("\n"),
|
|
1227
1228
|
{
|
|
1228
1229
|
action: z4.enum(["start", "stop"]).describe("'start': begin recording, 'stop': end recording and save the file"),
|