junis 0.3.11 → 0.3.13

package/dist/cli/index.js CHANGED
@@ -491,7 +491,7 @@ var FilesystemTools = class {
491
491
  "- Use for system commands, package managers (npm, pip, brew), git, build tools, and scripting.",
492
492
  "- For reading files prefer read_file, for editing prefer edit_block, for searching prefer search_code.",
493
493
  "- NOT for macOS app GUI interaction. When the user asks to interact with, control, or automate any application (clicking, typing, reading screen, navigating menus), use the desktop_* tools instead (desktop_open_app, desktop_see, desktop_click, desktop_type, desktop_paste, desktop_hotkey, desktop_scroll, desktop_menu, desktop_screenshot).",
494
- "- The ONLY exception: opening System Preferences URLs for permissions (e.g. open 'x-apple.systempreferences:...').",
494
+ "- The ONLY exception: permission fix commands (swift -e for CGRequestScreenCaptureAccess/AXIsProcessTrustedWithOptions, peekaboo permissions, or open 'x-apple.systempreferences:...').",
495
495
  "",
496
496
  "BEHAVIOR:",
497
497
  "- Execute commands directly when the user requests them. Do not ask for confirmation \u2014 the user has already decided.",
@@ -1633,9 +1633,10 @@ Cause: ${e.message}${hint}` }],
1633
1633
  "Use action='start' to begin, action='stop' to end and save. Only one recording can be active at a time.",
1634
1634
  "Platform-specific: macOS (screencapture -v), Windows/Linux (ffmpeg).",
1635
1635
  "",
1636
- "PERMISSIONS (macOS): Screen Recording permission is needed. If denied, run via execute_command:",
1637
- " open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1638
- "Toggle ON for 'screencapture' (or your terminal app), then retry."
1636
+ "PERMISSIONS (macOS): Screen Recording permission is needed (inherited from terminal app).",
1637
+ "If denied, run via execute_command: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
1638
+ "\u2192 macOS dialog appears. Ask user to click Allow, then retry.",
1639
+ "Fallback: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'"
1639
1640
  ].join("\n"),
1640
1641
  {
1641
1642
  action: z4.enum(["start", "stop"]).describe("'start': begin recording, 'stop': end recording and save the file"),
@@ -1779,10 +1780,13 @@ var consecutiveFailures = 0;
1779
1780
  var MAX_CONSECUTIVE_FAILURES = 2;
1780
1781
  var PERM_FIX_HINT = [
1781
1782
  "\n\n\u{1F527} PERMISSION FIX \u2014 run these via execute_command:",
1782
- "1. Check status: peekaboo permissions --json-output",
1783
- "2. Screen Recording: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1784
- "3. Accessibility: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_Accessibility'",
1785
- "Toggle ON for 'peekaboo' in the opened panel, then retry."
1783
+ "1. Check: peekaboo permissions --json-output",
1784
+ "2. Screen Recording: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
1785
+ "3. Accessibility: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
1786
+ "\u2192 macOS system dialogs appear. Ask user to click Allow, then retry.",
1787
+ "NOTE: peekaboo inherits permissions from the terminal app \u2014 do NOT look for 'peekaboo' in System Preferences.",
1788
+ "Fallback (if Swift fails): open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1789
+ " open 'x-apple.systempreferences:com.apple.preference.security?Privacy_Accessibility'"
1786
1790
  ].join("\n");
1787
1791
  function isPermissionError(msg) {
1788
1792
  const lower = msg.toLowerCase();
@@ -1821,27 +1825,33 @@ var DesktopTools = class {
1821
1825
  "Workflow: desktop_open_app \u2192 desktop_see \u2192 desktop_click/type/paste \u2192 verify with desktop_see or desktop_screenshot.",
1822
1826
  "",
1823
1827
  "WORKFLOW TIPS:",
1824
- "- If accessibility tree times out (complex UI apps like KakaoTalk): use desktop_screenshot + coordinate-based desktop_click instead.",
1828
+ "- If accessibility tree times out (complex UI apps like KakaoTalk): increase timeout parameter, or fall back to:",
1829
+ " desktop_screenshot \u2192 desktop_list_windows (get window bounds x,y,w,h) \u2192 calculate coordinates \u2192 desktop_click with coords parameter.",
1825
1830
  "- For Korean/Japanese/Chinese text input: always use desktop_paste (NOT desktop_type).",
1826
1831
  "- For multi-window apps: use desktop_list_windows to find specific windows.",
1827
1832
  "- Pass snapshotId to subsequent calls for 240x speed improvement.",
1833
+ "- Double-click to open items (e.g. chat windows in KakaoTalk): use desktop_click with doubleClick=true.",
1828
1834
  "",
1829
- "PERMISSIONS: Requires Accessibility + Screen Recording for 'peekaboo'.",
1830
- "If denied, run via execute_command:",
1831
- " 1. peekaboo permissions --json-output",
1832
- " 2. open 'x-apple.systempreferences:com.apple.preference.security?Privacy_Accessibility'",
1833
- " 3. open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1834
- "Toggle ON for 'peekaboo', then retry.",
1835
+ "PERMISSIONS: Requires Accessibility + Screen Recording.",
1836
+ "peekaboo inherits permissions from the parent terminal app \u2014 it does NOT need its own entry in System Preferences.",
1837
+ "If denied, fix via execute_command:",
1838
+ " 1. peekaboo permissions --json-output (check which are missing)",
1839
+ " 2. Screen Recording: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
1840
+ " 3. Accessibility: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
1841
+ " \u2192 macOS system dialogs appear. Ask user to click Allow, then retry.",
1842
+ " Fallback: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1835
1843
  "",
1836
1844
  "SAFETY: Terminal, iTerm, and Finder are blocked. Two consecutive failures trigger automatic safety stop."
1837
1845
  ].join("\n"),
1838
1846
  {
1839
- app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'Notes', 'Google Chrome'). Omit for the frontmost app.")
1847
+ app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'Notes', 'Google Chrome'). Omit for the frontmost app."),
1848
+ timeout: z5.number().optional().describe("Timeout in seconds (default: 20). Increase for complex UI apps. If it still times out, fall back to desktop_screenshot + coordinate-based desktop_click.")
1840
1849
  },
1841
- async ({ app }) => {
1850
+ async ({ app, timeout }) => {
1842
1851
  checkBlacklist(app);
1843
1852
  const args = ["see"];
1844
1853
  if (app) args.push("--app", app);
1854
+ if (timeout) args.push("--timeout-seconds", String(timeout));
1845
1855
  const result = await peekaboo(args);
1846
1856
  const data = result.data;
1847
1857
  const snapshotId = data?.snapshot_id ?? result.snapshotId ?? result.snapshot_id;
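
Review note: the new workflow tip asks the caller to "calculate coordinates" from the window bounds but does not say how. A minimal sketch of one way to do it follows. The helper name `toClickCoords`, the `{x, y, w, h}` object shape, and the `scale` parameter are all hypothetical and not part of junis or peekaboo. It assumes the bounds are in screen points with a top-left origin and that the offset was measured in screenshot pixels.

```javascript
// Hypothetical helper, not part of junis or peekaboo.
// Assumptions: bounds {x, y, w, h} are in screen points (top-left origin);
// pixelOffset is measured inside a screenshot of that window; scale is the
// screenshot's pixels-per-point ratio (for example 2 on a Retina display).
function toClickCoords(bounds, pixelOffset, scale = 1) {
  // Convert the screenshot offset back to points.
  const px = pixelOffset.x / scale;
  const py = pixelOffset.y / scale;
  // Refuse offsets that would land outside the window.
  if (px < 0 || py < 0 || px > bounds.w || py > bounds.h) {
    throw new RangeError("Offset falls outside the window bounds.");
  }
  // Format as the 'x,y' string that the coords parameter expects.
  return `${Math.round(bounds.x + px)},${Math.round(bounds.y + py)}`;
}

const coords = toClickCoords({ x: 900, y: 100, w: 400, h: 600 }, { x: 340, y: 176 }, 2);
console.log(coords); // "1070,188"
```

The result can be passed as the `coords` value of desktop_click. Whether a scale correction is needed depends on how the screenshot was captured, so treat the `scale` argument as something to verify per setup.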
@@ -1862,27 +1872,48 @@ var DesktopTools = class {
1862
1872
  server.tool(
1863
1873
  "desktop_click",
1864
1874
  [
1865
- "Click a macOS UI element by its accessibility label, ID, or x,y coordinates.",
1875
+ "Click a macOS UI element by text query, element ID, or x,y coordinates.",
1876
+ "",
1877
+ "PARAMETER GUIDE:",
1878
+ "- query: Text/label to search for (e.g. 'Save', 'Submit'). Searches visible UI elements.",
1879
+ "- on: Element ID from a previous desktop_see snapshot (e.g. 'B1', 'T2'). Fastest with snapshotId.",
1880
+ "- coords: Click at exact screen coordinates as 'x,y' (e.g. '1070,188'). Use when accessibility tree times out.",
1866
1881
  "",
1867
- "The 'on' parameter accepts: element label text (e.g. 'Save'), accessibility ID from a previous accessibility tree capture, or coordinates as 'x,y' string.",
1868
- "For faster interaction, pass the snapshotId from a recent accessibility tree capture.",
1882
+ "PROVEN WORKFLOW (from KakaoTalk automation):",
1883
+ "1. Try desktop_see first to get element IDs \u2192 click with 'on' parameter.",
1884
+ "2. If desktop_see times out: use desktop_screenshot \u2192 calculate coordinates \u2192 click with 'coords'.",
1885
+ "3. Use desktop_list_windows to get window bounds (x,y,w,h) for coordinate calculation.",
1869
1886
  "",
1870
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
1887
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app).",
1871
1888
  "",
1872
1889
  "SAFETY: Terminal, iTerm, and Finder are blocked. Two consecutive failures trigger automatic safety stop."
1873
1890
  ].join("\n"),
1874
1891
  {
1875
- on: z5.string().describe("Element label, accessibility ID, or 'x,y' coordinates to click"),
1876
- app: z5.string().optional().describe("App name to target (e.g. 'Safari')"),
1877
- snapshot: z5.string().optional().describe("snapshotId from a previous accessibility tree capture for cached interaction (240x faster)"),
1878
- doubleClick: z5.boolean().optional().default(false).describe("Double-click instead of single click")
1892
+ query: z5.string().optional().describe("Text/label to search and click (e.g. 'Save', 'Submit Button')"),
1893
+ on: z5.string().optional().describe("Element ID from desktop_see snapshot (e.g. 'B1', 'T2')"),
1894
+ coords: z5.string().optional().describe("Screen coordinates as 'x,y' (e.g. '1070,188'). Use when accessibility tree is unavailable."),
1895
+ app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'KakaoTalk')"),
1896
+ snapshot: z5.string().optional().describe("snapshotId from desktop_see for cached interaction (240x faster)"),
1897
+ doubleClick: z5.boolean().optional().default(false).describe("Double-click instead of single click (e.g. open files, open chat windows)"),
1898
+ rightClick: z5.boolean().optional().default(false).describe("Right-click (context menu)")
1879
1899
  },
1880
- async ({ on, app, snapshot, doubleClick }) => {
1900
+ async ({ query, on, coords, app, snapshot, doubleClick, rightClick }) => {
1881
1901
  checkBlacklist(app);
1882
- const args = ["click", "--on", on];
1902
+ if (!query && !on && !coords) {
1903
+ throw new Error("Provide at least one of: query (text search), on (element ID), or coords ('x,y').");
1904
+ }
1905
+ const args = ["click"];
1906
+ if (coords) {
1907
+ args.push("--coords", coords);
1908
+ } else if (on) {
1909
+ args.push("--on", on);
1910
+ } else if (query) {
1911
+ args.push(query);
1912
+ }
1883
1913
  if (app) args.push("--app", app);
1884
1914
  if (snapshot) args.push("--snapshot", snapshot);
1885
- if (doubleClick) args.push("--double-click");
1915
+ if (doubleClick) args.push("--double");
1916
+ if (rightClick) args.push("--right");
1886
1917
  const result = await peekaboo(args);
1887
1918
  return {
1888
1919
  content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
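
Review note: the handler above picks exactly one target, with `coords` taking precedence over `on`, and `on` over `query`. The sketch below restates that logic as a standalone function so the precedence can be checked without running peekaboo. The name `buildClickArgs` is hypothetical, and the flags are copied from this hunk rather than verified against the peekaboo CLI.

```javascript
// Standalone restatement of the 0.3.13 desktop_click argument logic.
// buildClickArgs is a hypothetical name; nothing here invokes peekaboo.
function buildClickArgs({ query, on, coords, app, snapshot, doubleClick = false, rightClick = false }) {
  if (!query && !on && !coords) {
    throw new Error("Provide at least one of: query (text search), on (element ID), or coords ('x,y').");
  }
  const args = ["click"];
  // Only one target is forwarded: coords wins over on, on wins over query.
  if (coords) {
    args.push("--coords", coords);
  } else if (on) {
    args.push("--on", on);
  } else {
    args.push(query);
  }
  if (app) args.push("--app", app);
  if (snapshot) args.push("--snapshot", snapshot);
  if (doubleClick) args.push("--double");
  if (rightClick) args.push("--right");
  return args;
}

// Both 'on' and 'coords' are supplied; only coords is forwarded.
console.log(buildClickArgs({ on: "B1", coords: "1070,188", doubleClick: true }));
// → ['click', '--coords', '1070,188', '--double']
```

One consequence worth noting: a caller that passes several targets gets no error, and the lower-priority ones are silently dropped.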
@@ -1892,23 +1923,27 @@ var DesktopTools = class {
1892
1923
  server.tool(
1893
1924
  "desktop_type",
1894
1925
  [
1895
- "Type text into the currently focused UI element on macOS. The text is sent as keyboard input character-by-character.",
1926
+ "Type text into the currently focused UI element on macOS via keyboard simulation.",
1896
1927
  "",
1897
- "IMPORTANT: Always capture the accessibility tree first to verify the correct element is focused before typing.",
1898
- "For Korean/Japanese/Chinese text or emoji, use desktop_paste instead \u2014 keyboard input does not support CJK characters.",
1928
+ "IMPORTANT: For Korean/Japanese/Chinese/emoji text, use desktop_paste instead \u2014 keyboard simulation does not support CJK.",
1929
+ "Always click the target input field first (via desktop_click) before typing.",
1899
1930
  "",
1900
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
1931
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app).",
1901
1932
  "",
1902
1933
  "SAFETY: Terminal, iTerm, and Finder are blocked."
1903
1934
  ].join("\n"),
1904
1935
  {
1905
- text: z5.string().describe("Text to type into the focused element"),
1906
- app: z5.string().optional().describe("App name to focus before typing")
1936
+ text: z5.string().describe("Text to type (ASCII only \u2014 for CJK/emoji use desktop_paste)"),
1937
+ app: z5.string().optional().describe("App name to focus before typing"),
1938
+ pressReturn: z5.boolean().optional().default(false).describe("Press Return/Enter after typing (e.g. to send a message or submit a form)"),
1939
+ clear: z5.boolean().optional().default(false).describe("Clear the field before typing (Cmd+A, Delete)")
1907
1940
  },
1908
- async ({ text, app }) => {
1941
+ async ({ text, app, pressReturn, clear }) => {
1909
1942
  checkBlacklist(app);
1910
1943
  const args = ["type", text];
1911
1944
  if (app) args.push("--app", app);
1945
+ if (clear) args.push("--clear");
1946
+ if (pressReturn) args.push("--return");
1912
1947
  const result = await peekaboo(args);
1913
1948
  return {
1914
1949
  content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
@@ -1922,7 +1957,8 @@ var DesktopTools = class {
1922
1957
  "",
1923
1958
  "Common shortcuts: 'cmd,c' (copy), 'cmd,v' (paste), 'cmd,z' (undo), 'cmd,s' (save), 'cmd,w' (close tab), 'cmd,q' (quit), 'cmd,shift,t' (reopen tab), 'cmd,tab' (switch app).",
1924
1959
  "",
1925
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
1960
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
1961
+ "Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
1926
1962
  "",
1927
1963
  "SAFETY: Terminal, iTerm, and Finder are blocked."
1928
1964
  ].join("\n"),
@@ -1947,7 +1983,8 @@ var DesktopTools = class {
1947
1983
  "",
1948
1984
  "Use 'ticks' to control scroll distance (default: 3, higher = more scrolling). Can target a specific element by label or ID from a previous accessibility tree capture.",
1949
1985
  "",
1950
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
1986
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
1987
+ "Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
1951
1988
  "",
1952
1989
  "SAFETY: Terminal, iTerm, and Finder are blocked."
1953
1990
  ].join("\n"),
@@ -2004,7 +2041,8 @@ var DesktopTools = class {
2004
2041
  "If no app is specified, lists windows for the frontmost application.",
2005
2042
  "Use this after identifying running apps to find specific windows before capturing the accessibility tree or taking a screenshot.",
2006
2043
  "",
2007
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'."
2044
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
2045
+ "Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'"
2008
2046
  ].join("\n"),
2009
2047
  {
2010
2048
  app: z5.string().optional().describe("Filter by app name. Omit to query the frontmost app.")
@@ -2041,23 +2079,41 @@ var DesktopTools = class {
2041
2079
  server.tool(
2042
2080
  "desktop_screenshot",
2043
2081
  [
2044
- "Take a high-quality macOS screenshot (Retina display support). Returns base64 image data.",
2082
+ "Take a high-quality macOS screenshot. Returns base64 image data.",
2083
+ "",
2084
+ "MODES:",
2085
+ "- 'screen': full display capture (default). Use screenIndex for multi-monitor setups.",
2086
+ "- 'window': specific app window. Specify with app, windowTitle, or windowIndex.",
2087
+ "- 'frontmost': capture only the frontmost window.",
2088
+ "- 'auto': peekaboo chooses the best mode automatically.",
2089
+ "",
2090
+ "TARGETING SPECIFIC WINDOWS:",
2091
+ "- app: capture by app name (e.g. 'Safari', 'KakaoTalk')",
2092
+ "- windowTitle: capture a specific window by title (partial match supported)",
2093
+ "- windowIndex: capture by window z-order (0 = frontmost window of the app)",
2094
+ "- screenIndex: which display to capture in 'screen' mode (0-based, for multi-monitor)",
2045
2095
  "",
2046
- "MODES: 'screen' captures the full display, 'window' captures a specific app window.",
2047
2096
  "TIP: Prefer the accessibility tree for understanding UI structure \u2014 use screenshots only when visual appearance matters (layouts, images, colors).",
2048
2097
  "",
2049
- "PERMISSIONS: Requires macOS Screen Recording permission for 'peekaboo'.",
2098
+ "PERMISSIONS: Requires Screen Recording (inherited from terminal app, not peekaboo itself).",
2099
+ "Fix if denied via execute_command: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
2050
2100
  "",
2051
2101
  "SAFETY: Terminal, iTerm, and Finder are blocked."
2052
2102
  ].join("\n"),
2053
2103
  {
2054
- app: z5.string().optional().describe("Capture a specific app's window (by name)"),
2055
- mode: z5.enum(["screen", "window"]).optional().default("screen").describe("'screen': full display capture, 'window': specific app window only")
2104
+ app: z5.string().optional().describe("Capture a specific app's window (by name, e.g. 'Safari', 'KakaoTalk')"),
2105
+ mode: z5.enum(["screen", "window", "frontmost", "auto"]).optional().default("screen").describe("'screen': full display, 'window': specific app window, 'frontmost': frontmost window, 'auto': peekaboo decides"),
2106
+ windowTitle: z5.string().optional().describe("Capture window by title (partial match). Use with mode='window'."),
2107
+ windowIndex: z5.number().optional().describe("Window z-order index (0 = frontmost window of the app). Use with mode='window'."),
2108
+ screenIndex: z5.number().optional().describe("Display index for multi-monitor (0-based). Use with mode='screen'.")
2056
2109
  },
2057
- async ({ app, mode }) => {
2110
+ async ({ app, mode, windowTitle, windowIndex, screenIndex }) => {
2058
2111
  checkBlacklist(app);
2059
2112
  const args = ["image", "--mode", mode];
2060
2113
  if (app) args.push("--app", app);
2114
+ if (windowTitle) args.push("--window-title", windowTitle);
2115
+ if (windowIndex !== void 0) args.push("--window-index", String(windowIndex));
2116
+ if (screenIndex !== void 0) args.push("--screen-index", String(screenIndex));
2061
2117
  const result = await peekaboo(args);
2062
2118
  const data = result.data;
2063
2119
  const files = data?.files;
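
Review note: `windowIndex` and `screenIndex` are checked with `!== void 0`, so an index of 0 is still forwarded. The `timeout` parameter added to desktop_see uses a plain truthiness check, so a value of 0 would be dropped there. The sketch below isolates the screenshot argument logic to show the zero case. `buildImageArgs` is a hypothetical name and the flags are taken from this hunk as-is.

```javascript
// Standalone restatement of the 0.3.13 desktop_screenshot argument logic.
// buildImageArgs is a hypothetical name; nothing here invokes peekaboo.
function buildImageArgs({ app, mode = "screen", windowTitle, windowIndex, screenIndex }) {
  const args = ["image", "--mode", mode];
  if (app) args.push("--app", app);
  if (windowTitle) args.push("--window-title", windowTitle);
  // Compare against undefined so that index 0 (the frontmost window or the
  // first display) is not mistaken for "not provided".
  if (windowIndex !== undefined) args.push("--window-index", String(windowIndex));
  if (screenIndex !== undefined) args.push("--screen-index", String(screenIndex));
  return args;
}

console.log(buildImageArgs({ app: "Safari", mode: "window", windowIndex: 0 }));
// → ['image', '--mode', 'window', '--app', 'Safari', '--window-index', '0']
```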
@@ -2085,7 +2141,8 @@ var DesktopTools = class {
2085
2141
  "Examples: ['File', 'New Tab'], ['Edit', 'Find', 'Find...'], ['View', 'Enter Full Screen'].",
2086
2142
  "Omit the 'app' parameter to target the frontmost app. The target app must be running.",
2087
2143
  "",
2088
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
2144
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
2145
+ "Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
2089
2146
  "",
2090
2147
  "SAFETY: Terminal, iTerm, and Finder are blocked."
2091
2148
  ].join("\n"),
@@ -2118,21 +2175,24 @@ var DesktopTools = class {
2118
2175
  server.tool(
2119
2176
  "desktop_paste",
2120
2177
  [
2121
- "Paste text via clipboard into the focused element. Use this for Korean, Japanese, Chinese, emoji, or any non-ASCII text.",
2178
+ "Paste text via clipboard into the focused element. Automatically sets clipboard, pastes (Cmd+V), then restores previous clipboard.",
2122
2179
  "",
2123
- "Unlike desktop_type (which sends keyboard input character-by-character), this uses the system clipboard to paste text, supporting all character sets including CJK and emoji.",
2180
+ "ALWAYS USE THIS instead of desktop_type for: Korean, Japanese, Chinese, emoji, or any non-ASCII text.",
2181
+ "Unlike desktop_type (keyboard simulation), this uses the system clipboard \u2014 works with ALL character sets.",
2124
2182
  "",
2125
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
2183
+ `PROVEN: In KakaoTalk automation, 'peekaboo paste "\uC548\uB155?"' successfully sent Korean text while 'type' would have failed.`,
2184
+ "",
2185
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app).",
2126
2186
  "",
2127
2187
  "SAFETY: Terminal, iTerm, and Finder are blocked."
2128
2188
  ].join("\n"),
2129
2189
  {
2130
- text: z5.string().describe("Text to paste into the focused element (supports Korean, Japanese, Chinese, emoji)"),
2190
+ text: z5.string().describe("Text to paste (supports Korean, Japanese, Chinese, emoji, any Unicode)"),
2131
2191
  app: z5.string().optional().describe("App name to focus before pasting")
2132
2192
  },
2133
2193
  async ({ text, app }) => {
2134
2194
  checkBlacklist(app);
2135
- const args = ["type", "--paste", text];
2195
+ const args = ["paste", text];
2136
2196
  if (app) args.push("--app", app);
2137
2197
  const result = await peekaboo(args);
2138
2198
  return {
@@ -2145,8 +2205,10 @@ var DesktopTools = class {
2145
2205
  [
2146
2206
  "Launch or bring to front a macOS application. Use this as the FIRST STEP when automating any app.",
2147
2207
  "",
2148
- "This uses macOS native 'open -a' command. The app will be launched if not running, or brought to front if already running.",
2149
- "After launching, wait briefly then use desktop_see to capture the accessibility tree.",
2208
+ "PROVEN WORKFLOW (from KakaoTalk automation):",
2209
+ "1. desktop_open_app \u2192 2. desktop_list_apps (verify) \u2192 3. desktop_see or desktop_screenshot \u2192 4. interact",
2210
+ "",
2211
+ "After launching, use desktop_list_apps to confirm the app is running, then desktop_see to capture UI.",
2150
2212
  "",
2151
2213
  "SAFETY: Terminal, iTerm, and Finder are blocked for automation safety."
2152
2214
  ].join("\n"),
@@ -2155,29 +2217,30 @@ var DesktopTools = class {
2155
2217
  },
2156
2218
  async ({ app }) => {
2157
2219
  checkBlacklist(app);
2158
- await execa("open", ["-a", app]);
2159
- await new Promise((r) => setTimeout(r, 1500));
2220
+ const args = ["app", "launch", app, "--wait-until-ready"];
2221
+ const result = await peekaboo(args);
2160
2222
  return {
2161
- content: [{ type: "text", text: `Launched ${app}` }]
2223
+ content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
2162
2224
  };
2163
2225
  }
2164
2226
  );
2165
2227
  server.tool(
2166
2228
  "desktop_open_url",
2167
2229
  [
2168
- "Open a URL in the default browser or a specified app. Also works for file paths and custom URL schemes.",
2230
+ "Open a URL or file with its default (or specified) application.",
2169
2231
  "",
2170
- "Examples: 'https://google.com', 'file:///path/to/file.html', 'x-apple.systempreferences:...'"
2232
+ "Examples: 'https://google.com', '~/Documents/report.pdf', 'x-apple.systempreferences:...'"
2171
2233
  ].join("\n"),
2172
2234
  {
2173
- url: z5.string().describe("URL to open (https://, file://, or custom scheme)"),
2174
- app: z5.string().optional().describe("Specific app to open the URL with (e.g. 'Google Chrome', 'Firefox')")
2235
+ url: z5.string().describe("URL or file path to open"),
2236
+ app: z5.string().optional().describe("Specific app to open with (e.g. 'Google Chrome', 'Preview')")
2175
2237
  },
2176
2238
  async ({ url, app }) => {
2177
- const args = app ? ["-a", app, url] : [url];
2178
- await execa("open", args);
2239
+ const args = ["open", url];
2240
+ if (app) args.push("--app", app);
2241
+ const result = await peekaboo(args);
2179
2242
  return {
2180
- content: [{ type: "text", text: `Opened: ${url}` }]
2243
+ content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
2181
2244
  };
2182
2245
  }
2183
2246
  );
@@ -77,7 +77,7 @@ var FilesystemTools = class {
77
77
  "- Use for system commands, package managers (npm, pip, brew), git, build tools, and scripting.",
78
78
  "- For reading files prefer read_file, for editing prefer edit_block, for searching prefer search_code.",
79
79
  "- NOT for macOS app GUI interaction. When the user asks to interact with, control, or automate any application (clicking, typing, reading screen, navigating menus), use the desktop_* tools instead (desktop_open_app, desktop_see, desktop_click, desktop_type, desktop_paste, desktop_hotkey, desktop_scroll, desktop_menu, desktop_screenshot).",
80
- "- The ONLY exception: opening System Preferences URLs for permissions (e.g. open 'x-apple.systempreferences:...').",
80
+ "- The ONLY exception: permission fix commands (swift -e for CGRequestScreenCaptureAccess/AXIsProcessTrustedWithOptions, peekaboo permissions, or open 'x-apple.systempreferences:...').",
81
81
  "",
82
82
  "BEHAVIOR:",
83
83
  "- Execute commands directly when the user requests them. Do not ask for confirmation \u2014 the user has already decided.",
@@ -1219,9 +1219,10 @@ Cause: ${e.message}${hint}` }],
1219
1219
  "Use action='start' to begin, action='stop' to end and save. Only one recording can be active at a time.",
1220
1220
  "Platform-specific: macOS (screencapture -v), Windows/Linux (ffmpeg).",
1221
1221
  "",
1222
- "PERMISSIONS (macOS): Screen Recording permission is needed. If denied, run via execute_command:",
1223
- " open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1224
- "Toggle ON for 'screencapture' (or your terminal app), then retry."
1222
+ "PERMISSIONS (macOS): Screen Recording permission is needed (inherited from terminal app).",
1223
+ "If denied, run via execute_command: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
1224
+ "\u2192 macOS dialog appears. Ask user to click Allow, then retry.",
1225
+ "Fallback: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'"
1225
1226
  ].join("\n"),
1226
1227
  {
1227
1228
  action: z4.enum(["start", "stop"]).describe("'start': begin recording, 'stop': end recording and save the file"),
@@ -1365,10 +1366,13 @@ var consecutiveFailures = 0;
1365
1366
  var MAX_CONSECUTIVE_FAILURES = 2;
1366
1367
  var PERM_FIX_HINT = [
1367
1368
  "\n\n\u{1F527} PERMISSION FIX \u2014 run these via execute_command:",
1368
- "1. Check status: peekaboo permissions --json-output",
1369
- "2. Screen Recording: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1370
- "3. Accessibility: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_Accessibility'",
1371
- "Toggle ON for 'peekaboo' in the opened panel, then retry."
1369
+ "1. Check: peekaboo permissions --json-output",
1370
+ "2. Screen Recording: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
1371
+ "3. Accessibility: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
1372
+ "\u2192 macOS system dialogs appear. Ask user to click Allow, then retry.",
1373
+ "NOTE: peekaboo inherits permissions from the terminal app \u2014 do NOT look for 'peekaboo' in System Preferences.",
1374
+ "Fallback (if Swift fails): open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1375
+ " open 'x-apple.systempreferences:com.apple.preference.security?Privacy_Accessibility'"
1372
1376
  ].join("\n");
1373
1377
  function isPermissionError(msg) {
1374
1378
  const lower = msg.toLowerCase();
@@ -1407,27 +1411,33 @@ var DesktopTools = class {
1407
1411
  "Workflow: desktop_open_app \u2192 desktop_see \u2192 desktop_click/type/paste \u2192 verify with desktop_see or desktop_screenshot.",
1408
1412
  "",
1409
1413
  "WORKFLOW TIPS:",
1410
- "- If accessibility tree times out (complex UI apps like KakaoTalk): use desktop_screenshot + coordinate-based desktop_click instead.",
1414
+ "- If accessibility tree times out (complex UI apps like KakaoTalk): increase timeout parameter, or fall back to:",
1415
+ " desktop_screenshot \u2192 desktop_list_windows (get window bounds x,y,w,h) \u2192 calculate coordinates \u2192 desktop_click with coords parameter.",
1411
1416
  "- For Korean/Japanese/Chinese text input: always use desktop_paste (NOT desktop_type).",
1412
1417
  "- For multi-window apps: use desktop_list_windows to find specific windows.",
1413
1418
  "- Pass snapshotId to subsequent calls for 240x speed improvement.",
1419
+ "- Double-click to open items (e.g. chat windows in KakaoTalk): use desktop_click with doubleClick=true.",
1414
1420
  "",
1415
- "PERMISSIONS: Requires Accessibility + Screen Recording for 'peekaboo'.",
1416
- "If denied, run via execute_command:",
1417
- " 1. peekaboo permissions --json-output",
1418
- " 2. open 'x-apple.systempreferences:com.apple.preference.security?Privacy_Accessibility'",
1419
- " 3. open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1420
- "Toggle ON for 'peekaboo', then retry.",
1421
+ "PERMISSIONS: Requires Accessibility + Screen Recording.",
1422
+ "peekaboo inherits permissions from the parent terminal app \u2014 it does NOT need its own entry in System Preferences.",
1423
+ "If denied, fix via execute_command:",
1424
+ " 1. peekaboo permissions --json-output (check which are missing)",
1425
+ " 2. Screen Recording: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
1426
+ " 3. Accessibility: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
1427
+ " \u2192 macOS system dialogs appear. Ask user to click Allow, then retry.",
1428
+ " Fallback: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1421
1429
  "",
1422
1430
  "SAFETY: Terminal, iTerm, and Finder are blocked. Two consecutive failures trigger automatic safety stop."
1423
1431
  ].join("\n"),
1424
1432
  {
1425
- app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'Notes', 'Google Chrome'). Omit for the frontmost app.")
1433
+ app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'Notes', 'Google Chrome'). Omit for the frontmost app."),
1434
+ timeout: z5.number().optional().describe("Timeout in seconds (default: 20). Increase for complex UI apps. If it still times out, fall back to desktop_screenshot + coordinate-based desktop_click.")
1426
1435
  },
1427
- async ({ app }) => {
1436
+ async ({ app, timeout }) => {
1428
1437
  checkBlacklist(app);
1429
1438
  const args = ["see"];
1430
1439
  if (app) args.push("--app", app);
1440
+ if (timeout) args.push("--timeout-seconds", String(timeout));
1431
1441
  const result = await peekaboo(args);
1432
1442
  const data = result.data;
1433
1443
  const snapshotId = data?.snapshot_id ?? result.snapshotId ?? result.snapshot_id;
@@ -1448,27 +1458,48 @@ var DesktopTools = class {
1448
1458
  server.tool(
1449
1459
  "desktop_click",
1450
1460
  [
1451
- "Click a macOS UI element by its accessibility label, ID, or x,y coordinates.",
1461
+ "Click a macOS UI element by text query, element ID, or x,y coordinates.",
1462
+ "",
1463
+ "PARAMETER GUIDE:",
1464
+ "- query: Text/label to search for (e.g. 'Save', 'Submit'). Searches visible UI elements.",
1465
+ "- on: Element ID from a previous desktop_see snapshot (e.g. 'B1', 'T2'). Fastest with snapshotId.",
1466
+ "- coords: Click at exact screen coordinates as 'x,y' (e.g. '1070,188'). Use when accessibility tree times out.",
1452
1467
  "",
1453
- "The 'on' parameter accepts: element label text (e.g. 'Save'), accessibility ID from a previous accessibility tree capture, or coordinates as 'x,y' string.",
1454
- "For faster interaction, pass the snapshotId from a recent accessibility tree capture.",
1468
+ "PROVEN WORKFLOW (from KakaoTalk automation):",
1469
+ "1. Try desktop_see first to get element IDs \u2192 click with 'on' parameter.",
1470
+ "2. If desktop_see times out: use desktop_screenshot \u2192 calculate coordinates \u2192 click with 'coords'.",
1471
+ "3. Use desktop_list_windows to get window bounds (x,y,w,h) for coordinate calculation.",
1455
1472
  "",
1456
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
1473
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app).",
1457
1474
  "",
1458
1475
  "SAFETY: Terminal, iTerm, and Finder are blocked. Two consecutive failures trigger automatic safety stop."
1459
1476
  ].join("\n"),
1460
1477
  {
1461
- on: z5.string().describe("Element label, accessibility ID, or 'x,y' coordinates to click"),
1462
- app: z5.string().optional().describe("App name to target (e.g. 'Safari')"),
1463
- snapshot: z5.string().optional().describe("snapshotId from a previous accessibility tree capture for cached interaction (240x faster)"),
1464
- doubleClick: z5.boolean().optional().default(false).describe("Double-click instead of single click")
1478
+ query: z5.string().optional().describe("Text/label to search and click (e.g. 'Save', 'Submit Button')"),
1479
+ on: z5.string().optional().describe("Element ID from desktop_see snapshot (e.g. 'B1', 'T2')"),
1480
+ coords: z5.string().optional().describe("Screen coordinates as 'x,y' (e.g. '1070,188'). Use when accessibility tree is unavailable."),
1481
+ app: z5.string().optional().describe("App name to target (e.g. 'Safari', 'KakaoTalk')"),
1482
+ snapshot: z5.string().optional().describe("snapshotId from desktop_see for cached interaction (240x faster)"),
1483
+ doubleClick: z5.boolean().optional().default(false).describe("Double-click instead of single click (e.g. open files, open chat windows)"),
1484
+ rightClick: z5.boolean().optional().default(false).describe("Right-click (context menu)")
1465
1485
  },
1466
- async ({ on, app, snapshot, doubleClick }) => {
1486
+ async ({ query, on, coords, app, snapshot, doubleClick, rightClick }) => {
1467
1487
  checkBlacklist(app);
1468
- const args = ["click", "--on", on];
1488
+ if (!query && !on && !coords) {
1489
+ throw new Error("Provide at least one of: query (text search), on (element ID), or coords ('x,y').");
1490
+ }
1491
+ const args = ["click"];
1492
+ if (coords) {
1493
+ args.push("--coords", coords);
1494
+ } else if (on) {
1495
+ args.push("--on", on);
1496
+ } else if (query) {
1497
+ args.push(query);
1498
+ }
1469
1499
  if (app) args.push("--app", app);
1470
1500
  if (snapshot) args.push("--snapshot", snapshot);
1471
- if (doubleClick) args.push("--double-click");
1501
+ if (doubleClick) args.push("--double");
1502
+ if (rightClick) args.push("--right");
1472
1503
  const result = await peekaboo(args);
1473
1504
  return {
1474
1505
  content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
@@ -1478,23 +1509,27 @@ var DesktopTools = class {
1478
1509
  server.tool(
1479
1510
  "desktop_type",
1480
1511
  [
1481
- "Type text into the currently focused UI element on macOS. The text is sent as keyboard input character-by-character.",
1512
+ "Type text into the currently focused UI element on macOS via keyboard simulation.",
1482
1513
  "",
1483
- "IMPORTANT: Always capture the accessibility tree first to verify the correct element is focused before typing.",
1484
- "For Korean/Japanese/Chinese text or emoji, use desktop_paste instead \u2014 keyboard input does not support CJK characters.",
1514
+ "IMPORTANT: For Korean/Japanese/Chinese/emoji text, use desktop_paste instead \u2014 keyboard simulation does not support CJK.",
1515
+ "Always click the target input field first (via desktop_click) before typing.",
1485
1516
  "",
1486
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
1517
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app).",
1487
1518
  "",
1488
1519
  "SAFETY: Terminal, iTerm, and Finder are blocked."
1489
1520
  ].join("\n"),
1490
1521
  {
1491
- text: z5.string().describe("Text to type into the focused element"),
1492
- app: z5.string().optional().describe("App name to focus before typing")
1522
+ text: z5.string().describe("Text to type (ASCII only \u2014 for CJK/emoji use desktop_paste)"),
1523
+ app: z5.string().optional().describe("App name to focus before typing"),
1524
+ pressReturn: z5.boolean().optional().default(false).describe("Press Return/Enter after typing (e.g. to send a message or submit a form)"),
1525
+ clear: z5.boolean().optional().default(false).describe("Clear the field before typing (Cmd+A, Delete)")
1493
1526
  },
1494
- async ({ text, app }) => {
1527
+ async ({ text, app, pressReturn, clear }) => {
1495
1528
  checkBlacklist(app);
1496
1529
  const args = ["type", text];
1497
1530
  if (app) args.push("--app", app);
1531
+ if (clear) args.push("--clear");
1532
+ if (pressReturn) args.push("--return");
1498
1533
  const result = await peekaboo(args);
1499
1534
  return {
1500
1535
  content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
@@ -1508,7 +1543,8 @@ var DesktopTools = class {
1508
1543
  "",
1509
1544
  "Common shortcuts: 'cmd,c' (copy), 'cmd,v' (paste), 'cmd,z' (undo), 'cmd,s' (save), 'cmd,w' (close tab), 'cmd,q' (quit), 'cmd,shift,t' (reopen tab), 'cmd,tab' (switch app).",
1510
1545
  "",
1511
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
1546
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
1547
+ "Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
1512
1548
  "",
1513
1549
  "SAFETY: Terminal, iTerm, and Finder are blocked."
1514
1550
  ].join("\n"),
@@ -1533,7 +1569,8 @@ var DesktopTools = class {
1533
1569
  "",
1534
1570
  "Use 'ticks' to control scroll distance (default: 3, higher = more scrolling). Can target a specific element by label or ID from a previous accessibility tree capture.",
1535
1571
  "",
1536
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
1572
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
1573
+ "Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
1537
1574
  "",
1538
1575
  "SAFETY: Terminal, iTerm, and Finder are blocked."
1539
1576
  ].join("\n"),
@@ -1590,7 +1627,8 @@ var DesktopTools = class {
1590
1627
  "If no app is specified, lists windows for the frontmost application.",
1591
1628
  "Use this after identifying running apps to find specific windows before capturing the accessibility tree or taking a screenshot.",
1592
1629
  "",
1593
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'."
1630
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
1631
+ "Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'"
1594
1632
  ].join("\n"),
1595
1633
  {
1596
1634
  app: z5.string().optional().describe("Filter by app name. Omit to query the frontmost app.")
@@ -1627,23 +1665,41 @@ var DesktopTools = class {
1627
1665
  server.tool(
1628
1666
  "desktop_screenshot",
1629
1667
  [
1630
- "Take a high-quality macOS screenshot (Retina display support). Returns base64 image data.",
1668
+ "Take a high-quality macOS screenshot. Returns base64 image data.",
1669
+ "",
1670
+ "MODES:",
1671
+ "- 'screen': full display capture (default). Use screenIndex for multi-monitor setups.",
1672
+ "- 'window': specific app window. Specify with app, windowTitle, or windowIndex.",
1673
+ "- 'frontmost': capture only the frontmost window.",
1674
+ "- 'auto': peekaboo chooses the best mode automatically.",
1675
+ "",
1676
+ "TARGETING SPECIFIC WINDOWS:",
1677
+ "- app: capture by app name (e.g. 'Safari', 'KakaoTalk')",
1678
+ "- windowTitle: capture a specific window by title (partial match supported)",
1679
+ "- windowIndex: capture by window z-order (0 = frontmost window of the app)",
1680
+ "- screenIndex: which display to capture in 'screen' mode (0-based, for multi-monitor)",
1631
1681
  "",
1632
- "MODES: 'screen' captures the full display, 'window' captures a specific app window.",
1633
1682
  "TIP: Prefer the accessibility tree for understanding UI structure \u2014 use screenshots only when visual appearance matters (layouts, images, colors).",
1634
1683
  "",
1635
- "PERMISSIONS: Requires macOS Screen Recording permission for 'peekaboo'.",
1684
+ "PERMISSIONS: Requires Screen Recording (inherited from terminal app, not peekaboo itself).",
1685
+ "Fix if denied via execute_command: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
1636
1686
  "",
1637
1687
  "SAFETY: Terminal, iTerm, and Finder are blocked."
1638
1688
  ].join("\n"),
1639
1689
  {
1640
- app: z5.string().optional().describe("Capture a specific app's window (by name)"),
1641
- mode: z5.enum(["screen", "window"]).optional().default("screen").describe("'screen': full display capture, 'window': specific app window only")
1690
+ app: z5.string().optional().describe("Capture a specific app's window (by name, e.g. 'Safari', 'KakaoTalk')"),
1691
+ mode: z5.enum(["screen", "window", "frontmost", "auto"]).optional().default("screen").describe("'screen': full display, 'window': specific app window, 'frontmost': frontmost window, 'auto': peekaboo decides"),
1692
+ windowTitle: z5.string().optional().describe("Capture window by title (partial match). Use with mode='window'."),
1693
+ windowIndex: z5.number().optional().describe("Window z-order index (0 = frontmost window of the app). Use with mode='window'."),
1694
+ screenIndex: z5.number().optional().describe("Display index for multi-monitor (0-based). Use with mode='screen'.")
1642
1695
  },
1643
- async ({ app, mode }) => {
1696
+ async ({ app, mode, windowTitle, windowIndex, screenIndex }) => {
1644
1697
  checkBlacklist(app);
1645
1698
  const args = ["image", "--mode", mode];
1646
1699
  if (app) args.push("--app", app);
1700
+ if (windowTitle) args.push("--window-title", windowTitle);
1701
+ if (windowIndex !== void 0) args.push("--window-index", String(windowIndex));
1702
+ if (screenIndex !== void 0) args.push("--screen-index", String(screenIndex));
1647
1703
  const result = await peekaboo(args);
1648
1704
  const data = result.data;
1649
1705
  const files = data?.files;
@@ -1671,7 +1727,8 @@ var DesktopTools = class {
1671
1727
  "Examples: ['File', 'New Tab'], ['Edit', 'Find', 'Find...'], ['View', 'Enter Full Screen'].",
1672
1728
  "Omit the 'app' parameter to target the frontmost app. The target app must be running.",
1673
1729
  "",
1674
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
1730
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app, not peekaboo itself).",
1731
+ "Fix if denied via execute_command: swift -e 'import ApplicationServices; let opts = [kAXTrustedCheckOptionPrompt.takeUnretainedValue(): true] as CFDictionary; AXIsProcessTrustedWithOptions(opts)'",
1675
1732
  "",
1676
1733
  "SAFETY: Terminal, iTerm, and Finder are blocked."
1677
1734
  ].join("\n"),
@@ -1704,21 +1761,24 @@ var DesktopTools = class {
1704
1761
  server.tool(
1705
1762
  "desktop_paste",
1706
1763
  [
1707
- "Paste text via clipboard into the focused element. Use this for Korean, Japanese, Chinese, emoji, or any non-ASCII text.",
1764
+ "Paste text via clipboard into the focused element. Automatically sets clipboard, pastes (Cmd+V), then restores previous clipboard.",
1708
1765
  "",
1709
- "Unlike desktop_type (which sends keyboard input character-by-character), this uses the system clipboard to paste text, supporting all character sets including CJK and emoji.",
1766
+ "ALWAYS USE THIS instead of desktop_type for: Korean, Japanese, Chinese, emoji, or any non-ASCII text.",
1767
+ "Unlike desktop_type (keyboard simulation), this uses the system clipboard \u2014 works with ALL character sets.",
1710
1768
  "",
1711
- "PERMISSIONS: Requires macOS Accessibility permission for 'peekaboo'.",
1769
+ `PROVEN: In KakaoTalk automation, 'peekaboo paste "\uC548\uB155?"' successfully sent Korean text while 'type' would have failed.`,
1770
+ "",
1771
+ "PERMISSIONS: Requires Accessibility (inherited from terminal app).",
1712
1772
  "",
1713
1773
  "SAFETY: Terminal, iTerm, and Finder are blocked."
1714
1774
  ].join("\n"),
1715
1775
  {
1716
- text: z5.string().describe("Text to paste into the focused element (supports Korean, Japanese, Chinese, emoji)"),
1776
+ text: z5.string().describe("Text to paste (supports Korean, Japanese, Chinese, emoji, any Unicode)"),
1717
1777
  app: z5.string().optional().describe("App name to focus before pasting")
1718
1778
  },
1719
1779
  async ({ text, app }) => {
1720
1780
  checkBlacklist(app);
1721
- const args = ["type", "--paste", text];
1781
+ const args = ["paste", text];
1722
1782
  if (app) args.push("--app", app);
1723
1783
  const result = await peekaboo(args);
1724
1784
  return {
@@ -1731,8 +1791,10 @@ var DesktopTools = class {
1731
1791
  [
1732
1792
  "Launch or bring to front a macOS application. Use this as the FIRST STEP when automating any app.",
1733
1793
  "",
1734
- "This uses macOS native 'open -a' command. The app will be launched if not running, or brought to front if already running.",
1735
- "After launching, wait briefly then use desktop_see to capture the accessibility tree.",
1794
+ "PROVEN WORKFLOW (from KakaoTalk automation):",
1795
+ "1. desktop_open_app \u2192 2. desktop_list_apps (verify) \u2192 3. desktop_see or desktop_screenshot \u2192 4. interact",
1796
+ "",
1797
+ "After launching, use desktop_list_apps to confirm the app is running, then desktop_see to capture UI.",
1736
1798
  "",
1737
1799
  "SAFETY: Terminal, iTerm, and Finder are blocked for automation safety."
1738
1800
  ].join("\n"),
@@ -1741,29 +1803,30 @@ var DesktopTools = class {
1741
1803
  },
1742
1804
  async ({ app }) => {
1743
1805
  checkBlacklist(app);
1744
- await execa("open", ["-a", app]);
1745
- await new Promise((r) => setTimeout(r, 1500));
1806
+ const args = ["app", "launch", app, "--wait-until-ready"];
1807
+ const result = await peekaboo(args);
1746
1808
  return {
1747
- content: [{ type: "text", text: `Launched ${app}` }]
1809
+ content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
1748
1810
  };
1749
1811
  }
1750
1812
  );
1751
1813
  server.tool(
1752
1814
  "desktop_open_url",
1753
1815
  [
1754
- "Open a URL in the default browser or a specified app. Also works for file paths and custom URL schemes.",
1816
+ "Open a URL or file with its default (or specified) application.",
1755
1817
  "",
1756
- "Examples: 'https://google.com', 'file:///path/to/file.html', 'x-apple.systempreferences:...'"
1818
+ "Examples: 'https://google.com', '~/Documents/report.pdf', 'x-apple.systempreferences:...'"
1757
1819
  ].join("\n"),
1758
1820
  {
1759
- url: z5.string().describe("URL to open (https://, file://, or custom scheme)"),
1760
- app: z5.string().optional().describe("Specific app to open the URL with (e.g. 'Google Chrome', 'Firefox')")
1821
+ url: z5.string().describe("URL or file path to open"),
1822
+ app: z5.string().optional().describe("Specific app to open with (e.g. 'Google Chrome', 'Preview')")
1761
1823
  },
1762
1824
  async ({ url, app }) => {
1763
- const args = app ? ["-a", app, url] : [url];
1764
- await execa("open", args);
1825
+ const args = ["open", url];
1826
+ if (app) args.push("--app", app);
1827
+ const result = await peekaboo(args);
1765
1828
  return {
1766
- content: [{ type: "text", text: `Opened: ${url}` }]
1829
+ content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
1767
1830
  };
1768
1831
  }
1769
1832
  );
@@ -78,7 +78,7 @@ var FilesystemTools = class {
78
78
  "- Use for system commands, package managers (npm, pip, brew), git, build tools, and scripting.",
79
79
  "- For reading files prefer read_file, for editing prefer edit_block, for searching prefer search_code.",
80
80
  "- NOT for macOS app GUI interaction. When the user asks to interact with, control, or automate any application (clicking, typing, reading screen, navigating menus), use the desktop_* tools instead (desktop_open_app, desktop_see, desktop_click, desktop_type, desktop_paste, desktop_hotkey, desktop_scroll, desktop_menu, desktop_screenshot).",
81
- "- The ONLY exception: opening System Preferences URLs for permissions (e.g. open 'x-apple.systempreferences:...').",
81
+ "- The ONLY exception: permission fix commands (swift -e for CGRequestScreenCaptureAccess/AXIsProcessTrustedWithOptions, peekaboo permissions, or open 'x-apple.systempreferences:...').",
82
82
  "",
83
83
  "BEHAVIOR:",
84
84
  "- Execute commands directly when the user requests them. Do not ask for confirmation \u2014 the user has already decided.",
@@ -1220,9 +1220,10 @@ Cause: ${e.message}${hint}` }],
1220
1220
  "Use action='start' to begin, action='stop' to end and save. Only one recording can be active at a time.",
1221
1221
  "Platform-specific: macOS (screencapture -v), Windows/Linux (ffmpeg).",
1222
1222
  "",
1223
- "PERMISSIONS (macOS): Screen Recording permission is needed. If denied, run via execute_command:",
1224
- " open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'",
1225
- "Toggle ON for 'screencapture' (or your terminal app), then retry."
1223
+ "PERMISSIONS (macOS): Screen Recording permission is needed (inherited from terminal app).",
1224
+ "If denied, run via execute_command: swift -e 'import CoreGraphics; CGRequestScreenCaptureAccess()'",
1225
+ "\u2192 macOS dialog appears. Ask user to click Allow, then retry.",
1226
+ "Fallback: open 'x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture'"
1226
1227
  ].join("\n"),
1227
1228
  {
1228
1229
  action: z4.enum(["start", "stop"]).describe("'start': begin recording, 'stop': end recording and save the file"),
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "junis",
3
- "version": "0.3.11",
3
+ "version": "0.3.13",
4
4
  "description": "One-line device control for AI agents",
5
5
  "type": "module",
6
6
  "bin": {