@syntesseraai/opencode-feature-factory 0.2.44 → 0.2.45

package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "$schema": "https://json.schemastore.org/package.json",
  "name": "@syntesseraai/opencode-feature-factory",
- "version": "0.2.44",
+ "version": "0.2.45",
  "type": "module",
  "description": "OpenCode plugin for Feature Factory agents - provides sub-agents and skills for validation, review, security, and architecture assessment",
  "license": "MIT",
@@ -30,6 +30,7 @@
  "security-audit",
  "aws-well-architected"
  ],
+ "scripts": {},
  "dependencies": {
  "@opencode-ai/plugin": "^1.1.48",
  "glob": "^10.0.0",
@@ -39,6 +40,5 @@
  "@types/bun": "^1.2.6",
  "@types/node": "^22.0.0",
  "typescript": "^5.0.0"
- },
- "scripts": {}
- }
+ }
+ }
@@ -1,473 +0,0 @@
- # ff-computer-use: iPhone Mirroring Interaction Skill
-
- > **Purpose:** Enable agents to interact with iPhone apps via macOS iPhone Mirroring using MCP tools for mouse control, keyboard input, screenshots, and scrolling.
-
- ## Required MCP Servers
-
- This skill depends on two MCP servers that must be configured:
-
- ### 1. `computer-use-mcp` (Primary — mouse, keyboard, screenshots)
-
- Provides the `computer` tool for clicking, dragging, typing, screenshots, and cursor control.
-
- ```json
- "computer-use-mcp": {
- "type": "local",
- "command": ["npx", "-y", "computer-use-mcp@latest"],
- "enabled": true
- }
- ```
-
- ### 2. `macos-automator` (Supplementary — scrolling, AppleScript, window discovery)
-
- Provides `execute_script` for running AppleScript/JXA (needed for scroll wheel events and window management).
-
- ```json
- "macos-automator": {
- "type": "local",
- "command": ["npx", "-y", "@steipete/macos-automator-mcp@latest"],
- "enabled": true
- }
- ```
-
- ## Prerequisites
-
- - **macOS 15+ (Sequoia)** with iPhone Mirroring enabled
- - **iPhone Mirroring app** running and connected to your iPhone
- - **Accessibility permissions** granted to the terminal/editor app (System Settings → Privacy & Security → Accessibility)
- - **Screen Recording permissions** granted to the terminal/editor app (System Settings → Privacy & Security → Screen Recording)
- - **Node.js** installed (for `npx` to run the MCP servers)
-
- ---
-
- ## Architecture
-
- ### How It Works
-
- iPhone Mirroring on macOS renders the iPhone screen as a native macOS window. We interact with it using:
-
- 1. **`computer` tool** (computer-use-mcp) — Mouse clicks, drags, typing, key presses, and screenshots via nut.js
- 2. **`execute_script` tool** (macos-automator) — AppleScript/JXA for scroll wheel events (CGEvent), window discovery, and app management
- 3. **`get_scripting_tips` tool** (macos-automator) — Search 200+ pre-built macOS automation scripts
-
- ### Why Two MCPs?
-
- - **`computer-use-mcp`** handles most interactions (click, drag, type, screenshot) but has **no scroll wheel action**
- - **`macos-automator`** fills the gap with JXA scripts that post `CGEvent` scroll wheel events, which iPhone Mirroring correctly translates to touch scrolls
-
- ### What Doesn't Work
-
- - **Arrow keys for scrolling** — iPhone Mirroring doesn't translate keyboard arrow keys to scrolling
- - **Mouse drag for scrolling** — Simple mouse drag doesn't trigger touch-scroll in iPhone Mirroring
- - **mobile-mcp / Appium** — External mobile automation tools don't work through the mirroring layer
- - **AX UI inspection** — iPhone Mirroring window doesn't expose iOS accessibility elements to macOS
-
- ---
-
- ## Tool Reference
-
- ### `computer` — Mouse, Keyboard & Screenshots
-
- The `computer` tool from `computer-use-mcp` provides these actions:
-
- #### `get_screenshot` — Capture the Screen
-
- Takes a screenshot and returns it as a base64 PNG with display dimensions.
-
- ```
- Tool: computer
- Action: get_screenshot
- ```
-
- > **Tip:** Use this before and after every interaction to understand the current state.
-
- #### `left_click` — Click at Coordinates
-
- ```
- Tool: computer
- Action: left_click
- Coordinate: [x, y]
- ```
-
- #### `right_click` — Right-Click at Coordinates
-
- ```
- Tool: computer
- Action: right_click
- Coordinate: [x, y]
- ```
-
- #### `double_click` — Double-Click at Coordinates
-
- ```
- Tool: computer
- Action: double_click
- Coordinate: [x, y]
- ```
-
- #### `mouse_move` — Move Cursor
-
- ```
- Tool: computer
- Action: mouse_move
- Coordinate: [x, y]
- ```
-
- #### `drag` — Drag to Coordinates
-
- Presses left mouse button at current position, moves to target, and releases.
-
- ```
- Tool: computer
- Action: drag
- Coordinate: [x, y] # Target/end position
- ```
-
- > **Important:** Move the cursor to the start position first with `mouse_move`, then use `drag` to the end position.
-
- #### `type` — Type Text
-
- Types a string of text. Use this when a text field is focused.
-
- ```
- Tool: computer
- Action: type
- Text: "Hello, World!"
- ```
-
- #### `key` — Press Keyboard Shortcut
-
- Press a key or key combination.
-
- ```
- Tool: computer
- Action: key
- Text: "Return" # Enter key
- Text: "BackSpace" # Delete
- Text: "Escape" # Escape
- Text: "Tab" # Tab
- Text: "ctrl+a" # Select all
- Text: "cmd+c" # Copy
- ```
-
- #### `cursor_position` — Get Current Cursor Position
-
- ```
- Tool: computer
- Action: cursor_position
- ```
-
- ---
-
- ### `execute_script` — Scrolling & Window Management
-
- The `execute_script` tool from `macos-automator` runs AppleScript or JXA scripts.
-
- #### Scroll via CGEvent (JXA)
-
- This is the **only reliable way** to scroll in iPhone Mirroring:
-
- ```javascript
- // Scroll DOWN (content moves up, see content below)
- // Use execute_script with language: "javascript"
- ObjC.import('CoreGraphics');
-
- const x = CURSOR_X; // Absolute X coordinate
- const y = CURSOR_Y; // Absolute Y coordinate
- const delta = -5; // Negative = scroll down, Positive = scroll up
-
- // Move cursor to position
- const moveEvent = $.CGEventCreateMouseEvent(null, $.kCGEventMouseMoved, $.CGPointMake(x, y), 0);
- $.CGEventPost($.kCGHIDEventTap, moveEvent);
- delay(0.05);
-
- // Post scroll wheel event
- const scrollEvent = $.CGEventCreateScrollWheelEvent(null, 1, 1, delta);
- $.CGEventPost($.kCGHIDEventTap, scrollEvent);
- ```
-
- **Scroll direction:**
-
- - `delta > 0` → Scroll **UP** (content moves down, you see content above)
- - `delta < 0` → Scroll **DOWN** (content moves up, you see content below)
-
- **Recommended scroll amounts:**
-
- - Small scroll: `3` to `5`
- - Medium scroll: `8` to `12`
- - Large scroll: `15` to `25`
- - Full page: `30` to `50`
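Reviewer note: the sign convention and the recommended ranges in the removed text can be folded into one small helper. The sketch below is only illustrative; `scrollDelta`, `SCROLL_RANGES`, and the choice of the range midpoint as the magnitude are assumptions, not part of the removed skill.

```javascript
// Hypothetical helper: the size names and ranges are copied from the
// "Recommended scroll amounts" list; everything else is illustrative.
const SCROLL_RANGES = {
  small: [3, 5],
  medium: [8, 12],
  large: [15, 25],
  page: [30, 50],
};

function scrollDelta(direction, size = 'small') {
  const range = SCROLL_RANGES[size];
  if (!range) throw new Error(`unknown scroll size: ${size}`);
  if (direction !== 'up' && direction !== 'down') {
    throw new Error(`unknown direction: ${direction}`);
  }
  // Assumption: use the midpoint of the recommended range as the magnitude.
  const magnitude = Math.round((range[0] + range[1]) / 2);
  // Per the text: negative delta scrolls down, positive delta scrolls up.
  return direction === 'down' ? -magnitude : magnitude;
}
```

The returned value would be substituted for `delta` in the JXA script before passing it to `execute_script`.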
-
- #### Get Window Bounds (AppleScript)
-
- ```applescript
- -- Use execute_script with language: "applescript"
- tell application "System Events"
- tell process "iPhone Mirroring"
- set winPos to position of window 1
- set winSize to size of window 1
- return (item 1 of winPos as text) & "," & (item 2 of winPos as text) & "," & (item 1 of winSize as text) & "," & (item 2 of winSize as text)
- end tell
- end tell
- ```
-
- Returns: `x,y,width,height` — where `(x, y)` is the top-left corner of the window.
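Reviewer note: the `x,y,width,height` string still has to be parsed before any coordinate math. A minimal sketch of that step follows; `parseWindowBounds` is a hypothetical name, and the strict validation is an assumption about what a caller would want rather than something the removed file specified.

```javascript
// Hypothetical parser for the "x,y,width,height" string returned by the
// Get Window Bounds AppleScript. Rejects anything that is not four numbers.
function parseWindowBounds(raw) {
  const parts = String(raw).trim().split(',').map((p) => p.trim());
  const isNumber = (p) => /^-?\d+(\.\d+)?$/.test(p);
  if (parts.length !== 4 || !parts.every(isNumber)) {
    throw new Error(`unexpected bounds string: ${raw}`);
  }
  const [x, y, width, height] = parts.map(Number);
  return { x, y, width, height };
}
```

For example, `parseWindowBounds("100,50,336,756")` yields an object with `x: 100`, `y: 50`, `width: 336`, and `height: 756`.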
215
-
216
- #### Activate iPhone Mirroring (AppleScript)
217
-
218
- ```applescript
219
- tell application "iPhone Mirroring" to activate
220
- delay 2
221
- ```
222
-
223
- #### Bring to Front (AppleScript)
224
-
225
- ```applescript
226
- tell application "System Events"
227
- tell process "iPhone Mirroring"
228
- set frontmost to true
229
- end tell
230
- end tell
231
- ```
232
-
233
- ---
234
-
235
- ### `get_scripting_tips` — Find Pre-Built Scripts
236
-
237
- Search the macos-automator knowledge base for automation scripts:
238
-
239
- ```
240
- Tool: get_scripting_tips
241
- search_term: "screenshot"
242
- limit: 5
243
- ```
244
-
245
- ---
246
-
247
- ## Coordinate System
248
-
249
- ### Understanding Coordinates
250
-
251
- All coordinates are **absolute screen coordinates** (macOS global coordinate space).
252
-
253
- The iPhone Mirroring window has:
254
-
255
- - A **title bar** (~28px) at the top that is NOT part of the iPhone screen
256
- - The **iPhone content area** below the title bar
257
-
258
- ### Coordinate Calculation
259
-
260
- To interact with a point within the iPhone screen:
261
-
262
- ```
263
- absoluteX = windowX + relativeX
264
- absoluteY = windowY + titleBarHeight + relativeY
265
- ```
266
-
267
- Where:
268
-
269
- - `windowX`, `windowY` = window position from AppleScript (see Get Window Bounds above)
270
- - `titleBarHeight` = ~28 pixels (macOS window title bar)
271
- - `relativeX`, `relativeY` = position within the iPhone content area
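Reviewer note: the two formulas translate directly into code. The sketch below assumes a window object with `x` and `y` fields; `toAbsolute` and `TITLE_BAR_HEIGHT` are hypothetical names, and 28 is the approximate title bar height quoted in the text, not a measured constant.

```javascript
// Approximate title bar height from the removed text (an estimate).
const TITLE_BAR_HEIGHT = 28;

// Convert a point inside the iPhone content area to absolute screen
// coordinates: absoluteX = windowX + relativeX,
//              absoluteY = windowY + titleBarHeight + relativeY.
function toAbsolute(win, relativeX, relativeY, titleBarHeight = TITLE_BAR_HEIGHT) {
  return [win.x + relativeX, win.y + titleBarHeight + relativeY];
}
```

With a window at `(100, 50)`, the content-area point `(168, 364)` maps to `[268, 442]`.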
272
-
273
- ### Typical iPhone Mirroring Window Dimensions
274
-
275
- | iPhone Model | Window Width | Content Height | Total Height (with title bar) |
276
- | --------------- | ------------ | -------------- | ----------------------------- |
277
- | Standard (6.1") | ~336px | ~728px | ~756px |
278
- | Plus/Max (6.7") | ~336px | ~728px | ~756px |
279
-
280
- > **Note:** Actual dimensions may vary. Always discover dynamically using the Get Window Bounds AppleScript.
281
-
282
- ### Common Tap Targets
283
-
284
- For a standard iPhone Mirroring window at position (x, y):
285
-
286
- | Target | Approximate Coordinates |
287
- | ------------------------- | ------------------------- |
288
- | Status bar | `(x + 168, y + 28 + 10)` |
289
- | Center of screen | `(x + 168, y + 28 + 364)` |
290
- | Bottom tab bar (1st item) | `(x + 42, y + 28 + 695)` |
291
- | Bottom tab bar (2nd item) | `(x + 126, y + 28 + 695)` |
292
- | Bottom tab bar (3rd item) | `(x + 210, y + 28 + 695)` |
293
- | Bottom tab bar (4th item) | `(x + 294, y + 28 + 695)` |
294
- | Home indicator area | `(x + 168, y + 28 + 720)` |
295
- | Back button (top-left) | `(x + 30, y + 28 + 55)` |
296
- | Navigation title | `(x + 168, y + 28 + 55)` |
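Reviewer note: the four tab bar offsets (42, 126, 210, 294) are the centers of four equal slots across a 336px-wide window, so they can be derived from the window width instead of being hardcoded. The sketch below assumes that reading of the table; `tabBarTargets` and its parameter names are hypothetical, and the 28 and 695 offsets are the approximate values from the table.

```javascript
// Derive tab bar tap targets by splitting the window width into equal slots
// and taking each slot's center. Offsets default to the table's estimates.
function tabBarTargets(win, itemCount = 4, titleBarHeight = 28, tabBarOffsetY = 695) {
  const slot = win.width / itemCount;
  const targets = [];
  for (let i = 0; i < itemCount; i++) {
    targets.push([
      Math.round(win.x + slot * (i + 0.5)),
      win.y + titleBarHeight + tabBarOffsetY,
    ]);
  }
  return targets;
}
```

For a 336px-wide window at the origin this reproduces the table's x offsets; apps with a different number of tabs would pass a different `itemCount`.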
-
- ---
-
- ## Workflows
-
- ### Workflow 1: Discover and Interact
-
- The standard workflow for interacting with an iPhone app:
-
- ```
- Step 1: Activate iPhone Mirroring
- → execute_script (applescript): tell application "iPhone Mirroring" to activate
-
- Step 2: Get window position
- → execute_script (applescript): Get Window Bounds script (see above)
- → Parse the returned "x,y,width,height" string
-
- Step 3: Take a screenshot to see current state
- → computer: get_screenshot
-
- Step 4: Analyze the screenshot
- → Identify UI elements and calculate their absolute coordinates
- → Remember: absoluteY = windowY + 28 + relativeY
-
- Step 5: Click on a target
- → computer: left_click at [absoluteX, absoluteY]
-
- Step 6: Wait briefly, then screenshot to verify
- → (pause ~1 second)
- → computer: get_screenshot
- ```
-
- ### Workflow 2: Scroll Through Content
-
- ```
- Step 1: Get window position (see Workflow 1, Steps 1-2)
- → Calculate center: centerX = windowX + width/2, centerY = windowY + 28 + contentHeight/2
-
- Step 2: Scroll down
- → execute_script (javascript): CGEvent scroll script with x=centerX, y=centerY, delta=-5
-
- Step 3: Wait and screenshot
- → (pause ~0.5 seconds)
- → computer: get_screenshot
-
- Step 4: Repeat as needed with larger/smaller delta values
- ```
-
- ### Workflow 3: Navigate Between Screens
-
- ```
- Step 1: Click a list item to navigate forward
- → computer: left_click at [itemX, itemY]
-
- Step 2: Wait for transition
- → (pause ~1 second)
-
- Step 3: Screenshot the new screen
- → computer: get_screenshot
-
- Step 4: Go back (tap back button, typically top-left)
- → computer: left_click at [windowX + 30, windowY + 28 + 55]
- ```
-
- ### Workflow 4: Type Text
-
- iPhone Mirroring supports keyboard input when a text field is focused:
-
- ```
- Step 1: Tap on a text field to focus it
- → computer: left_click at [fieldX, fieldY]
-
- Step 2: Wait for keyboard to appear
- → (pause ~0.5 seconds)
-
- Step 3: Type the text
- → computer: type "Hello, World!"
-
- Step 4: Press Enter if needed
- → computer: key "Return"
- ```
-
- ### Workflow 5: Swipe / Drag Gesture
-
- ```
- Step 1: Move cursor to start position
- → computer: mouse_move to [startX, startY]
-
- Step 2: Drag to end position
- → computer: drag to [endX, endY]
-
- Examples:
- - Swipe left (next page): mouse_move [300, 500] → drag [100, 500]
- - Pull to refresh: mouse_move [168, 400] → drag [168, 600]
- - Swipe right (go back): mouse_move [50, 400] → drag [300, 400]
- ```
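Reviewer note: the swipe examples use bare numbers such as `[300, 500]`, while the rest of the removed file insists on absolute, dynamically computed coordinates. One way to reconcile the two is to treat the example points as content-relative and convert them before issuing the `mouse_move` and `drag` pair. The sketch below takes that reading as an assumption; `swipeActions` is a hypothetical name, and the `{ action, coordinate }` object shape mirrors the labels in the tool reference rather than a confirmed request schema.

```javascript
// Build the mouse_move + drag pair for a swipe, converting content-relative
// points to absolute screen coordinates (assumed interpretation).
function swipeActions(win, from, to, titleBarHeight = 28) {
  const toAbs = ([rx, ry]) => [win.x + rx, win.y + titleBarHeight + ry];
  return [
    // Move first: drag starts from the current cursor position.
    { action: 'mouse_move', coordinate: toAbs(from) },
    { action: 'drag', coordinate: toAbs(to) },
  ];
}
```

For a window at `(100, 50)`, the "swipe left" example becomes a move to `[400, 578]` followed by a drag to `[200, 578]`.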
-
- ---
-
- ## Troubleshooting
-
- ### "Not permitted to send input events"
-
- Your terminal/editor needs Accessibility permissions:
-
- 1. Open **System Settings → Privacy & Security → Accessibility**
- 2. Add your terminal app (Terminal.app, iTerm2, VS Code, Cursor, etc.)
- 3. Toggle it ON
- 4. Restart the app
-
- ### Screenshots not capturing correctly
-
- Your terminal/editor needs Screen Recording permissions:
-
- 1. Open **System Settings → Privacy & Security → Screen Recording**
- 2. Add your terminal/editor app
- 3. Toggle it ON
- 4. Restart the app
-
- ### iPhone Mirroring window not found
-
- ```
- → execute_script (applescript):
- tell application "System Events" to name of every process whose name contains "iPhone"
- ```
-
- If not running:
-
- ```
- → execute_script (applescript):
- tell application "iPhone Mirroring" to activate
- delay 3
- ```
-
- ### Clicks not registering
-
- 1. Ensure the iPhone Mirroring window is **not minimized**
- 2. Ensure coordinates are within the window bounds
- 3. Bring the window to focus first:
- ```
- → execute_script (applescript):
- tell application "iPhone Mirroring" to activate
- ```
-
- ### Scroll not working
-
- 1. Ensure the cursor position is within the iPhone content area (not the title bar)
- 2. Try larger scroll values (e.g., `-15` instead of `-3`)
- 3. Add a small delay between scroll events if doing multiple scrolls
- 4. Verify you're using the JXA CGEvent scroll approach (not arrow keys or mouse drag)
-
- ---
-
- ## Best Practices
-
- 1. **Always screenshot first** — Before interacting, take a screenshot to understand the current state
- 2. **Analyze screenshots visually** — Use the returned screenshot image to identify UI elements and their positions
- 3. **Calculate coordinates dynamically** — Always get window position via AppleScript; never hardcode coordinates
- 4. **Add delays after interactions** — Wait ~1 second after clicks/scrolls to let the UI update before screenshotting
- 5. **Scroll incrementally** — Use small scroll values (5-10) and check results rather than large jumps
- 6. **Verify after each action** — Take a screenshot after each interaction to confirm it worked
- 7. **Handle the title bar** — Always add 28px to Y coordinates to account for the macOS title bar
- 8. **Keep iPhone Mirroring focused** — Activate the app before interactions
- 9. **Use `mouse_move` before `drag`** — The drag action goes FROM current cursor position TO the target
- 10. **Prefer `computer` tool for most actions** — Only use `execute_script` for scrolling and window management
-
- ---
-
- ## Limitations
-
- - **No element inspection** — Cannot query iOS accessibility tree through mirroring
- - **Coordinate-based only** — All interactions require knowing pixel coordinates
- - **Single touch only** — Cannot simulate multi-touch gestures (pinch, rotate)
- - **No gesture recognition** — Drag works but complex gestures may not translate correctly
- - **Screen resolution dependent** — Coordinates depend on display scaling settings
- - **Requires visual analysis** — Must use screenshots + vision to understand UI state
- - **No scroll in `computer` tool** — Must use `execute_script` with JXA CGEvent for scrolling