@syntesseraai/opencode-feature-factory 0.2.44 → 0.2.45
- package/package.json +4 -4
- package/skills/ff-computer-use/SKILL.md +0 -473
package/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"$schema": "https://json.schemastore.org/package.json",
|
|
3
3
|
"name": "@syntesseraai/opencode-feature-factory",
|
|
4
|
-
"version": "0.2.44",
|
|
4
|
+
"version": "0.2.45",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"description": "OpenCode plugin for Feature Factory agents - provides sub-agents and skills for validation, review, security, and architecture assessment",
|
|
7
7
|
"license": "MIT",
|
|
@@ -30,6 +30,7 @@
|
|
|
30
30
|
"security-audit",
|
|
31
31
|
"aws-well-architected"
|
|
32
32
|
],
|
|
33
|
+
"scripts": {},
|
|
33
34
|
"dependencies": {
|
|
34
35
|
"@opencode-ai/plugin": "^1.1.48",
|
|
35
36
|
"glob": "^10.0.0",
|
|
@@ -39,6 +40,5 @@
|
|
|
39
40
|
"@types/bun": "^1.2.6",
|
|
40
41
|
"@types/node": "^22.0.0",
|
|
41
42
|
"typescript": "^5.0.0"
|
|
42
|
-
}
|
|
43
|
-
|
|
44
|
-
}
|
|
43
|
+
}
|
|
44
|
+
}
|
package/skills/ff-computer-use/SKILL.md
DELETED
|
@@ -1,473 +0,0 @@
|
|
|
1
|
-
# ff-computer-use: iPhone Mirroring Interaction Skill
|
|
2
|
-
|
|
3
|
-
> **Purpose:** Enable agents to interact with iPhone apps via macOS iPhone Mirroring using MCP tools for mouse control, keyboard input, screenshots, and scrolling.
|
|
4
|
-
|
|
5
|
-
## Required MCP Servers
|
|
6
|
-
|
|
7
|
-
This skill depends on two MCP servers that must be configured:
|
|
8
|
-
|
|
9
|
-
### 1. `computer-use-mcp` (Primary — mouse, keyboard, screenshots)
|
|
10
|
-
|
|
11
|
-
Provides the `computer` tool for clicking, dragging, typing, screenshots, and cursor control.
|
|
12
|
-
|
|
13
|
-
```json
|
|
14
|
-
"computer-use-mcp": {
|
|
15
|
-
"type": "local",
|
|
16
|
-
"command": ["npx", "-y", "computer-use-mcp@latest"],
|
|
17
|
-
"enabled": true
|
|
18
|
-
}
|
|
19
|
-
```
|
|
20
|
-
|
|
21
|
-
### 2. `macos-automator` (Supplementary — scrolling, AppleScript, window discovery)
|
|
22
|
-
|
|
23
|
-
Provides `execute_script` for running AppleScript/JXA (needed for scroll wheel events and window management).
|
|
24
|
-
|
|
25
|
-
```json
|
|
26
|
-
"macos-automator": {
|
|
27
|
-
"type": "local",
|
|
28
|
-
"command": ["npx", "-y", "@steipete/macos-automator-mcp@latest"],
|
|
29
|
-
"enabled": true
|
|
30
|
-
}
|
|
31
|
-
```
|
|
32
|
-
|
|
33
|
-
## Prerequisites
|
|
34
|
-
|
|
35
|
-
- **macOS 15+ (Sequoia)** with iPhone Mirroring enabled
|
|
36
|
-
- **iPhone Mirroring app** running and connected to your iPhone
|
|
37
|
-
- **Accessibility permissions** granted to the terminal/editor app (System Settings → Privacy & Security → Accessibility)
|
|
38
|
-
- **Screen Recording permissions** granted to the terminal/editor app (System Settings → Privacy & Security → Screen Recording)
|
|
39
|
-
- **Node.js** installed (for `npx` to run the MCP servers)
|
|
40
|
-
|
|
41
|
-
---
|
|
42
|
-
|
|
43
|
-
## Architecture
|
|
44
|
-
|
|
45
|
-
### How It Works
|
|
46
|
-
|
|
47
|
-
iPhone Mirroring on macOS renders the iPhone screen as a native macOS window. We interact with it using:
|
|
48
|
-
|
|
49
|
-
1. **`computer` tool** (computer-use-mcp) — Mouse clicks, drags, typing, key presses, and screenshots via nut.js
|
|
50
|
-
2. **`execute_script` tool** (macos-automator) — AppleScript/JXA for scroll wheel events (CGEvent), window discovery, and app management
|
|
51
|
-
3. **`get_scripting_tips` tool** (macos-automator) — Search 200+ pre-built macOS automation scripts
|
|
52
|
-
|
|
53
|
-
### Why Two MCPs?
|
|
54
|
-
|
|
55
|
-
- **`computer-use-mcp`** handles most interactions (click, drag, type, screenshot) but has **no scroll wheel action**
|
|
56
|
-
- **`macos-automator`** fills the gap with JXA scripts that post `CGEvent` scroll wheel events, which iPhone Mirroring correctly translates to touch scrolls
|
|
57
|
-
|
|
58
|
-
### What Doesn't Work
|
|
59
|
-
|
|
60
|
-
- **Arrow keys for scrolling** — iPhone Mirroring doesn't translate keyboard arrow keys to scrolling
|
|
61
|
-
- **Mouse drag for scrolling** — Simple mouse drag doesn't trigger touch-scroll in iPhone Mirroring
|
|
62
|
-
- **mobile-mcp / Appium** — External mobile automation tools don't work through the mirroring layer
|
|
63
|
-
- **AX UI inspection** — iPhone Mirroring window doesn't expose iOS accessibility elements to macOS
|
|
64
|
-
|
|
65
|
-
---
|
|
66
|
-
|
|
67
|
-
## Tool Reference
|
|
68
|
-
|
|
69
|
-
### `computer` — Mouse, Keyboard & Screenshots
|
|
70
|
-
|
|
71
|
-
The `computer` tool from `computer-use-mcp` provides these actions:
|
|
72
|
-
|
|
73
|
-
#### `get_screenshot` — Capture the Screen
|
|
74
|
-
|
|
75
|
-
Takes a screenshot and returns it as a base64 PNG with display dimensions.
|
|
76
|
-
|
|
77
|
-
```
|
|
78
|
-
Tool: computer
|
|
79
|
-
Action: get_screenshot
|
|
80
|
-
```
|
|
81
|
-
|
|
82
|
-
> **Tip:** Use this before and after every interaction to understand the current state.
|
|
83
|
-
|
|
84
|
-
#### `left_click` — Click at Coordinates
|
|
85
|
-
|
|
86
|
-
```
|
|
87
|
-
Tool: computer
|
|
88
|
-
Action: left_click
|
|
89
|
-
Coordinate: [x, y]
|
|
90
|
-
```
|
|
91
|
-
|
|
92
|
-
#### `right_click` — Right-Click at Coordinates
|
|
93
|
-
|
|
94
|
-
```
|
|
95
|
-
Tool: computer
|
|
96
|
-
Action: right_click
|
|
97
|
-
Coordinate: [x, y]
|
|
98
|
-
```
|
|
99
|
-
|
|
100
|
-
#### `double_click` — Double-Click at Coordinates
|
|
101
|
-
|
|
102
|
-
```
|
|
103
|
-
Tool: computer
|
|
104
|
-
Action: double_click
|
|
105
|
-
Coordinate: [x, y]
|
|
106
|
-
```
|
|
107
|
-
|
|
108
|
-
#### `mouse_move` — Move Cursor
|
|
109
|
-
|
|
110
|
-
```
|
|
111
|
-
Tool: computer
|
|
112
|
-
Action: mouse_move
|
|
113
|
-
Coordinate: [x, y]
|
|
114
|
-
```
|
|
115
|
-
|
|
116
|
-
#### `drag` — Drag to Coordinates
|
|
117
|
-
|
|
118
|
-
Presses left mouse button at current position, moves to target, and releases.
|
|
119
|
-
|
|
120
|
-
```
|
|
121
|
-
Tool: computer
|
|
122
|
-
Action: drag
|
|
123
|
-
Coordinate: [x, y] # Target/end position
|
|
124
|
-
```
|
|
125
|
-
|
|
126
|
-
> **Important:** Move the cursor to the start position first with `mouse_move`, then use `drag` to the end position.
|
|
127
|
-
|
|
128
|
-
#### `type` — Type Text
|
|
129
|
-
|
|
130
|
-
Types a string of text. Use this when a text field is focused.
|
|
131
|
-
|
|
132
|
-
```
|
|
133
|
-
Tool: computer
|
|
134
|
-
Action: type
|
|
135
|
-
Text: "Hello, World!"
|
|
136
|
-
```
|
|
137
|
-
|
|
138
|
-
#### `key` — Press Keyboard Shortcut
|
|
139
|
-
|
|
140
|
-
Press a key or key combination.
|
|
141
|
-
|
|
142
|
-
```
|
|
143
|
-
Tool: computer
|
|
144
|
-
Action: key
|
|
145
|
-
Text: "Return" # Enter key
|
|
146
|
-
Text: "BackSpace" # Delete
|
|
147
|
-
Text: "Escape" # Escape
|
|
148
|
-
Text: "Tab" # Tab
|
|
149
|
-
Text: "cmd+a" # Select all
|
|
150
|
-
Text: "cmd+c" # Copy
|
|
151
|
-
```
|
|
152
|
-
|
|
153
|
-
#### `cursor_position` — Get Current Cursor Position
|
|
154
|
-
|
|
155
|
-
```
|
|
156
|
-
Tool: computer
|
|
157
|
-
Action: cursor_position
|
|
158
|
-
```
|
|
159
|
-
|
|
160
|
-
---
|
|
161
|
-
|
|
162
|
-
### `execute_script` — Scrolling & Window Management
|
|
163
|
-
|
|
164
|
-
The `execute_script` tool from `macos-automator` runs AppleScript or JXA scripts.
|
|
165
|
-
|
|
166
|
-
#### Scroll via CGEvent (JXA)
|
|
167
|
-
|
|
168
|
-
This is the **only reliable way** to scroll in iPhone Mirroring:
|
|
169
|
-
|
|
170
|
-
```javascript
|
|
171
|
-
// Scroll DOWN (content moves up, see content below)
|
|
172
|
-
// Use execute_script with language: "javascript"
|
|
173
|
-
ObjC.import('CoreGraphics');
|
|
174
|
-
|
|
175
|
-
const x = CURSOR_X; // Absolute X coordinate
|
|
176
|
-
const y = CURSOR_Y; // Absolute Y coordinate
|
|
177
|
-
const delta = -5; // Negative = scroll down, Positive = scroll up
|
|
178
|
-
|
|
179
|
-
// Move cursor to position
|
|
180
|
-
const moveEvent = $.CGEventCreateMouseEvent(null, $.kCGEventMouseMoved, $.CGPointMake(x, y), 0);
|
|
181
|
-
$.CGEventPost($.kCGHIDEventTap, moveEvent);
|
|
182
|
-
delay(0.05);
|
|
183
|
-
|
|
184
|
-
// Post scroll wheel event
|
|
185
|
-
const scrollEvent = $.CGEventCreateScrollWheelEvent(null, 1, 1, delta);
|
|
186
|
-
$.CGEventPost($.kCGHIDEventTap, scrollEvent);
|
|
187
|
-
```
|
|
188
|
-
|
|
189
|
-
**Scroll direction:**
|
|
190
|
-
|
|
191
|
-
- `delta > 0` → Scroll **UP** (content moves down, you see content above)
|
|
192
|
-
- `delta < 0` → Scroll **DOWN** (content moves up, you see content below)
|
|
193
|
-
|
|
194
|
-
**Recommended scroll amounts:**
|
|
195
|
-
|
|
196
|
-
- Small scroll: `3` to `5`
|
|
197
|
-
- Medium scroll: `8` to `12`
|
|
198
|
-
- Large scroll: `15` to `25`
|
|
199
|
-
- Full page: `30` to `50`
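
The direction and size rules above can be folded into one small helper. This is only a sketch: `scrollDelta` and the size labels are hypothetical names, and picking the midpoint of each recommended range is an assumption, not something the skill prescribes.

```javascript
// Recommended magnitude ranges, copied from the list above.
const SCROLL_RANGES = new Map([
  ["small", [3, 5]],
  ["medium", [8, 12]],
  ["large", [15, 25]],
  ["page", [30, 50]],
]);

// Returns a signed delta for the CGEvent scroll script:
// negative scrolls down, positive scrolls up.
function scrollDelta(direction, size = "small") {
  const range = SCROLL_RANGES.get(size);
  if (!range) {
    throw new Error(`Unknown scroll size: ${size}`);
  }
  // Assumption: use the midpoint of the recommended range.
  const magnitude = Math.round((range[0] + range[1]) / 2);
  if (direction === "down") return -magnitude;
  if (direction === "up") return magnitude;
  throw new Error(`Unknown scroll direction: ${direction}`);
}
```

For example, `scrollDelta("down", "medium")` yields a delta in the medium range with the sign that reveals content further down the page.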
|
|
200
|
-
|
|
201
|
-
#### Get Window Bounds (AppleScript)
|
|
202
|
-
|
|
203
|
-
```applescript
|
|
204
|
-
-- Use execute_script with language: "applescript"
|
|
205
|
-
tell application "System Events"
|
|
206
|
-
tell process "iPhone Mirroring"
|
|
207
|
-
set winPos to position of window 1
|
|
208
|
-
set winSize to size of window 1
|
|
209
|
-
return (item 1 of winPos as text) & "," & (item 2 of winPos as text) & "," & (item 1 of winSize as text) & "," & (item 2 of winSize as text)
|
|
210
|
-
end tell
|
|
211
|
-
end tell
|
|
212
|
-
```
|
|
213
|
-
|
|
214
|
-
Returns: `x,y,width,height` — where `(x, y)` is the top-left corner of the window.
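
Since the script returns a plain comma-separated string, the caller has to turn it into numbers before doing any coordinate math. A minimal sketch of that step follows; `parseWindowBounds` is a hypothetical helper name and is not provided by either MCP server.

```javascript
// Parse the "x,y,width,height" string returned by the Get Window Bounds script.
function parseWindowBounds(raw) {
  const fields = String(raw).trim().split(",").map((f) => f.trim());
  const numeric = /^-?\d+(\.\d+)?$/;
  if (fields.length !== 4 || !fields.every((f) => numeric.test(f))) {
    throw new Error(`Unexpected window bounds: ${JSON.stringify(raw)}`);
  }
  const [x, y, width, height] = fields.map(Number);
  return { x, y, width, height };
}
```

Rejecting malformed input early makes it easier to notice when the window was not found and the script returned an error message instead of numbers.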
|
|
215
|
-
|
|
216
|
-
#### Activate iPhone Mirroring (AppleScript)
|
|
217
|
-
|
|
218
|
-
```applescript
|
|
219
|
-
tell application "iPhone Mirroring" to activate
|
|
220
|
-
delay 2
|
|
221
|
-
```
|
|
222
|
-
|
|
223
|
-
#### Bring to Front (AppleScript)
|
|
224
|
-
|
|
225
|
-
```applescript
|
|
226
|
-
tell application "System Events"
|
|
227
|
-
tell process "iPhone Mirroring"
|
|
228
|
-
set frontmost to true
|
|
229
|
-
end tell
|
|
230
|
-
end tell
|
|
231
|
-
```
|
|
232
|
-
|
|
233
|
-
---
|
|
234
|
-
|
|
235
|
-
### `get_scripting_tips` — Find Pre-Built Scripts
|
|
236
|
-
|
|
237
|
-
Search the macos-automator knowledge base for automation scripts:
|
|
238
|
-
|
|
239
|
-
```
|
|
240
|
-
Tool: get_scripting_tips
|
|
241
|
-
search_term: "screenshot"
|
|
242
|
-
limit: 5
|
|
243
|
-
```
|
|
244
|
-
|
|
245
|
-
---
|
|
246
|
-
|
|
247
|
-
## Coordinate System
|
|
248
|
-
|
|
249
|
-
### Understanding Coordinates
|
|
250
|
-
|
|
251
|
-
All coordinates are **absolute screen coordinates** (macOS global coordinate space).
|
|
252
|
-
|
|
253
|
-
The iPhone Mirroring window has:
|
|
254
|
-
|
|
255
|
-
- A **title bar** (~28px) at the top that is NOT part of the iPhone screen
|
|
256
|
-
- The **iPhone content area** below the title bar
|
|
257
|
-
|
|
258
|
-
### Coordinate Calculation
|
|
259
|
-
|
|
260
|
-
To interact with a point within the iPhone screen:
|
|
261
|
-
|
|
262
|
-
```
|
|
263
|
-
absoluteX = windowX + relativeX
|
|
264
|
-
absoluteY = windowY + titleBarHeight + relativeY
|
|
265
|
-
```
|
|
266
|
-
|
|
267
|
-
Where:
|
|
268
|
-
|
|
269
|
-
- `windowX`, `windowY` = window position from AppleScript (see Get Window Bounds above)
|
|
270
|
-
- `titleBarHeight` = ~28 pixels (macOS window title bar)
|
|
271
|
-
- `relativeX`, `relativeY` = position within the iPhone content area
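
The two formulas above translate directly into code. The sketch below assumes the window position has already been obtained and parsed into an object with `x` and `y` fields; `toAbsolute` is a hypothetical helper name, and the 28px default is the approximate title bar height quoted above.

```javascript
// Approximate macOS title bar height, per the text above.
const TITLE_BAR_HEIGHT = 28;

// Convert a point inside the iPhone content area to absolute screen coordinates.
function toAbsolute(win, relativeX, relativeY, titleBarHeight = TITLE_BAR_HEIGHT) {
  return [win.x + relativeX, win.y + titleBarHeight + relativeY];
}
```

With a window at `(100, 50)`, the content-area point `(168, 364)` maps to `[268, 442]`, which can be passed as the `Coordinate` of a click.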
|
|
272
|
-
|
|
273
|
-
### Typical iPhone Mirroring Window Dimensions
|
|
274
|
-
|
|
275
|
-
| iPhone Model | Window Width | Content Height | Total Height (with title bar) |
|
|
276
|
-
| --------------- | ------------ | -------------- | ----------------------------- |
|
|
277
|
-
| Standard (6.1") | ~336px | ~728px | ~756px |
|
|
278
|
-
| Plus/Max (6.7") | ~336px | ~728px | ~756px |
|
|
279
|
-
|
|
280
|
-
> **Note:** Actual dimensions may vary. Always discover dynamically using the Get Window Bounds AppleScript.
|
|
281
|
-
|
|
282
|
-
### Common Tap Targets
|
|
283
|
-
|
|
284
|
-
For a standard iPhone Mirroring window at position (x, y):
|
|
285
|
-
|
|
286
|
-
| Target | Approximate Coordinates |
|
|
287
|
-
| ------------------------- | ------------------------- |
|
|
288
|
-
| Status bar | `(x + 168, y + 28 + 10)` |
|
|
289
|
-
| Center of screen | `(x + 168, y + 28 + 364)` |
|
|
290
|
-
| Bottom tab bar (1st item) | `(x + 42, y + 28 + 695)` |
|
|
291
|
-
| Bottom tab bar (2nd item) | `(x + 126, y + 28 + 695)` |
|
|
292
|
-
| Bottom tab bar (3rd item) | `(x + 210, y + 28 + 695)` |
|
|
293
|
-
| Bottom tab bar (4th item) | `(x + 294, y + 28 + 695)` |
|
|
294
|
-
| Home indicator area | `(x + 168, y + 28 + 720)` |
|
|
295
|
-
| Back button (top-left) | `(x + 30, y + 28 + 55)` |
|
|
296
|
-
| Navigation title | `(x + 168, y + 28 + 55)` |
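
The four tab bar offsets in the table (42, 126, 210, 294) match evenly spaced tab centers on a ~336px-wide window. Under that assumption the targets can be derived from the measured window width instead of being hardcoded. `tabBarTargets` is a hypothetical helper name, and the 695px vertical offset is taken from the table, so it may not hold for every app or device.

```javascript
const TITLE_BAR_HEIGHT = 28; // approximate, per the table above
const TAB_BAR_OFFSET_Y = 695; // vertical offset of the tab bar, per the table above

// Compute absolute tap targets for an evenly spaced bottom tab bar.
// Assumption: tabs are equal width and span the full window width.
function tabBarTargets(win, tabCount = 4) {
  const targets = [];
  for (let i = 0; i < tabCount; i++) {
    const centerX = win.x + ((i + 0.5) * win.width) / tabCount;
    targets.push([centerX, win.y + TITLE_BAR_HEIGHT + TAB_BAR_OFFSET_Y]);
  }
  return targets;
}
```

Always confirm the result against a fresh screenshot, since apps with a different number of tabs or a floating tab bar will not follow this layout.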
|
|
297
|
-
|
|
298
|
-
---
|
|
299
|
-
|
|
300
|
-
## Workflows
|
|
301
|
-
|
|
302
|
-
### Workflow 1: Discover and Interact
|
|
303
|
-
|
|
304
|
-
The standard workflow for interacting with an iPhone app:
|
|
305
|
-
|
|
306
|
-
```
|
|
307
|
-
Step 1: Activate iPhone Mirroring
|
|
308
|
-
→ execute_script (applescript): tell application "iPhone Mirroring" to activate
|
|
309
|
-
|
|
310
|
-
Step 2: Get window position
|
|
311
|
-
→ execute_script (applescript): Get Window Bounds script (see above)
|
|
312
|
-
→ Parse the returned "x,y,width,height" string
|
|
313
|
-
|
|
314
|
-
Step 3: Take a screenshot to see current state
|
|
315
|
-
→ computer: get_screenshot
|
|
316
|
-
|
|
317
|
-
Step 4: Analyze the screenshot
|
|
318
|
-
→ Identify UI elements and calculate their absolute coordinates
|
|
319
|
-
→ Remember: absoluteY = windowY + 28 + relativeY
|
|
320
|
-
|
|
321
|
-
Step 5: Click on a target
|
|
322
|
-
→ computer: left_click at [absoluteX, absoluteY]
|
|
323
|
-
|
|
324
|
-
Step 6: Wait briefly, then screenshot to verify
|
|
325
|
-
→ (pause ~1 second)
|
|
326
|
-
→ computer: get_screenshot
|
|
327
|
-
```
|
|
328
|
-
|
|
329
|
-
### Workflow 2: Scroll Through Content
|
|
330
|
-
|
|
331
|
-
```
|
|
332
|
-
Step 1: Get window position (see Workflow 1, Steps 1-2)
|
|
333
|
-
→ Calculate center: centerX = windowX + width/2, centerY = windowY + 28 + contentHeight/2
|
|
334
|
-
|
|
335
|
-
Step 2: Scroll down
|
|
336
|
-
→ execute_script (javascript): CGEvent scroll script with x=centerX, y=centerY, delta=-5
|
|
337
|
-
|
|
338
|
-
Step 3: Wait and screenshot
|
|
339
|
-
→ (pause ~0.5 seconds)
|
|
340
|
-
→ computer: get_screenshot
|
|
341
|
-
|
|
342
|
-
Step 4: Repeat as needed with larger/smaller delta values
|
|
343
|
-
```
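
Steps 1 and 2 of this workflow can be sketched as two helpers: one that finds the center of the content area, and one that fills the CGEvent scroll script with concrete numbers. Both function names are hypothetical. The sketch also assumes the height returned by Get Window Bounds includes the title bar, so the content height is `height - 28`.

```javascript
const TITLE_BAR_HEIGHT = 28; // approximate title bar height

// Center of the iPhone content area in absolute screen coordinates.
function contentCenter(win) {
  const contentHeight = win.height - TITLE_BAR_HEIGHT;
  return [win.x + win.width / 2, win.y + TITLE_BAR_HEIGHT + contentHeight / 2];
}

// Build the JXA source to pass to execute_script with language "javascript".
function buildScrollScript(x, y, delta) {
  for (const value of [x, y, delta]) {
    if (!Number.isFinite(value)) {
      throw new Error("x, y and delta must be finite numbers");
    }
  }
  return [
    "ObjC.import('CoreGraphics');",
    `const moveEvent = $.CGEventCreateMouseEvent(null, $.kCGEventMouseMoved, $.CGPointMake(${x}, ${y}), 0);`,
    "$.CGEventPost($.kCGHIDEventTap, moveEvent);",
    "delay(0.05);",
    `const scrollEvent = $.CGEventCreateScrollWheelEvent(null, 1, 1, ${delta});`,
    "$.CGEventPost($.kCGHIDEventTap, scrollEvent);",
  ].join("\n");
}
```

Validating the inputs as numbers before interpolating them keeps arbitrary text out of the generated script.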
|
|
344
|
-
|
|
345
|
-
### Workflow 3: Navigate Between Screens
|
|
346
|
-
|
|
347
|
-
```
|
|
348
|
-
Step 1: Click a list item to navigate forward
|
|
349
|
-
→ computer: left_click at [itemX, itemY]
|
|
350
|
-
|
|
351
|
-
Step 2: Wait for transition
|
|
352
|
-
→ (pause ~1 second)
|
|
353
|
-
|
|
354
|
-
Step 3: Screenshot the new screen
|
|
355
|
-
→ computer: get_screenshot
|
|
356
|
-
|
|
357
|
-
Step 4: Go back (tap back button, typically top-left)
|
|
358
|
-
→ computer: left_click at [windowX + 30, windowY + 28 + 55]
|
|
359
|
-
```
|
|
360
|
-
|
|
361
|
-
### Workflow 4: Type Text
|
|
362
|
-
|
|
363
|
-
iPhone Mirroring supports keyboard input when a text field is focused:
|
|
364
|
-
|
|
365
|
-
```
|
|
366
|
-
Step 1: Tap on a text field to focus it
|
|
367
|
-
→ computer: left_click at [fieldX, fieldY]
|
|
368
|
-
|
|
369
|
-
Step 2: Wait for keyboard to appear
|
|
370
|
-
→ (pause ~0.5 seconds)
|
|
371
|
-
|
|
372
|
-
Step 3: Type the text
|
|
373
|
-
→ computer: type "Hello, World!"
|
|
374
|
-
|
|
375
|
-
Step 4: Press Enter if needed
|
|
376
|
-
→ computer: key "Return"
|
|
377
|
-
```
|
|
378
|
-
|
|
379
|
-
### Workflow 5: Swipe / Drag Gesture
|
|
380
|
-
|
|
381
|
-
```
|
|
382
|
-
Step 1: Move cursor to start position
|
|
383
|
-
→ computer: mouse_move to [startX, startY]
|
|
384
|
-
|
|
385
|
-
Step 2: Drag to end position
|
|
386
|
-
→ computer: drag to [endX, endY]
|
|
387
|
-
|
|
388
|
-
Examples (illustrative coordinates; offset them by the window position in practice):
|
|
389
|
-
- Swipe left (next page): mouse_move [300, 500] → drag [100, 500]
|
|
390
|
-
- Pull to refresh: mouse_move [168, 400] → drag [168, 600]
|
|
391
|
-
- Swipe right (go back): mouse_move [50, 400] → drag [300, 400]
|
|
392
|
-
```
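
The two-step gesture can be expressed as a pair of `computer` tool calls built from content-relative points. This is a sketch under assumptions: `swipeActions` is a hypothetical helper, and the `{ action, coordinate }` argument shape mirrors the Action/Coordinate fields shown in the Tool Reference rather than a confirmed schema.

```javascript
const TITLE_BAR_HEIGHT = 28; // approximate title bar height

// Build the mouse_move + drag pair for a swipe between two content-relative points.
function swipeActions(win, from, to, titleBarHeight = TITLE_BAR_HEIGHT) {
  const toAbsolute = ([relX, relY]) => [win.x + relX, win.y + titleBarHeight + relY];
  return [
    { action: "mouse_move", coordinate: toAbsolute(from) }, // position the cursor at the start
    { action: "drag", coordinate: toAbsolute(to) }, // press, move to the end, release
  ];
}
```

Issue the two actions in order, then take a screenshot to confirm that the gesture was recognized.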
|
|
393
|
-
|
|
394
|
-
---
|
|
395
|
-
|
|
396
|
-
## Troubleshooting
|
|
397
|
-
|
|
398
|
-
### "Not permitted to send input events"
|
|
399
|
-
|
|
400
|
-
Your terminal/editor needs Accessibility permissions:
|
|
401
|
-
|
|
402
|
-
1. Open **System Settings → Privacy & Security → Accessibility**
|
|
403
|
-
2. Add your terminal app (Terminal.app, iTerm2, VS Code, Cursor, etc.)
|
|
404
|
-
3. Toggle it ON
|
|
405
|
-
4. Restart the app
|
|
406
|
-
|
|
407
|
-
### Screenshots not capturing correctly
|
|
408
|
-
|
|
409
|
-
Your terminal/editor needs Screen Recording permissions:
|
|
410
|
-
|
|
411
|
-
1. Open **System Settings → Privacy & Security → Screen Recording**
|
|
412
|
-
2. Add your terminal/editor app
|
|
413
|
-
3. Toggle it ON
|
|
414
|
-
4. Restart the app
|
|
415
|
-
|
|
416
|
-
### iPhone Mirroring window not found
|
|
417
|
-
|
|
418
|
-
```
|
|
419
|
-
→ execute_script (applescript):
|
|
420
|
-
tell application "System Events" to name of every process whose name contains "iPhone"
|
|
421
|
-
```
|
|
422
|
-
|
|
423
|
-
If not running:
|
|
424
|
-
|
|
425
|
-
```
|
|
426
|
-
→ execute_script (applescript):
|
|
427
|
-
tell application "iPhone Mirroring" to activate
|
|
428
|
-
delay 3
|
|
429
|
-
```
|
|
430
|
-
|
|
431
|
-
### Clicks not registering
|
|
432
|
-
|
|
433
|
-
1. Ensure the iPhone Mirroring window is **not minimized**
|
|
434
|
-
2. Ensure coordinates are within the window bounds
|
|
435
|
-
3. Bring the window to focus first:
|
|
436
|
-
```
|
|
437
|
-
→ execute_script (applescript):
|
|
438
|
-
tell application "iPhone Mirroring" to activate
|
|
439
|
-
```
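
Item 2 above can be checked mechanically before sending a click. The sketch below assumes the window bounds have been parsed into `{ x, y, width, height }` and that the height includes the ~28px title bar; `isInsideContent` is a hypothetical helper name.

```javascript
const TITLE_BAR_HEIGHT = 28; // approximate title bar height

// True when an absolute screen point falls inside the iPhone content area,
// that is, inside the window but below the title bar.
function isInsideContent(win, x, y, titleBarHeight = TITLE_BAR_HEIGHT) {
  const left = win.x;
  const right = win.x + win.width;
  const top = win.y + titleBarHeight;
  const bottom = win.y + win.height;
  return x >= left && x < right && y >= top && y < bottom;
}
```

A `false` result usually means the window moved since its bounds were last read, so re-run the Get Window Bounds script before retrying.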
|
|
440
|
-
|
|
441
|
-
### Scroll not working
|
|
442
|
-
|
|
443
|
-
1. Ensure the cursor position is within the iPhone content area (not the title bar)
|
|
444
|
-
2. Try larger scroll values (e.g., `-15` instead of `-3`)
|
|
445
|
-
3. Add a small delay between scroll events if doing multiple scrolls
|
|
446
|
-
4. Verify you're using the JXA CGEvent scroll approach (not arrow keys or mouse drag)
|
|
447
|
-
|
|
448
|
-
---
|
|
449
|
-
|
|
450
|
-
## Best Practices
|
|
451
|
-
|
|
452
|
-
1. **Always screenshot first** — Before interacting, take a screenshot to understand the current state
|
|
453
|
-
2. **Analyze screenshots visually** — Use the returned screenshot image to identify UI elements and their positions
|
|
454
|
-
3. **Calculate coordinates dynamically** — Always get window position via AppleScript; never hardcode coordinates
|
|
455
|
-
4. **Add delays after interactions** — Wait ~1 second after clicks/scrolls to let the UI update before screenshotting
|
|
456
|
-
5. **Scroll incrementally** — Use small scroll values (5-10) and check results rather than large jumps
|
|
457
|
-
6. **Verify after each action** — Take a screenshot after each interaction to confirm it worked
|
|
458
|
-
7. **Handle the title bar** — Always add 28px to Y coordinates to account for the macOS title bar
|
|
459
|
-
8. **Keep iPhone Mirroring focused** — Activate the app before interactions
|
|
460
|
-
9. **Use `mouse_move` before `drag`** — The drag action goes FROM current cursor position TO the target
|
|
461
|
-
10. **Prefer `computer` tool for most actions** — Only use `execute_script` for scrolling and window management
|
|
462
|
-
|
|
463
|
-
---
|
|
464
|
-
|
|
465
|
-
## Limitations
|
|
466
|
-
|
|
467
|
-
- **No element inspection** — Cannot query iOS accessibility tree through mirroring
|
|
468
|
-
- **Coordinate-based only** — All interactions require knowing pixel coordinates
|
|
469
|
-
- **Single touch only** — Cannot simulate multi-touch gestures (pinch, rotate)
|
|
470
|
-
- **No gesture recognition** — Drag works but complex gestures may not translate correctly
|
|
471
|
-
- **Screen resolution dependent** — Coordinates depend on display scaling settings
|
|
472
|
-
- **Requires visual analysis** — Must use screenshots + vision to understand UI state
|
|
473
|
-
- **No scroll in `computer` tool** — Must use `execute_script` with JXA CGEvent for scrolling
|