mobile-debug-mcp 0.24.8 → 0.25.0

@@ -0,0 +1,388 @@
1
+ # Mobile Debug MCP Prioritized Roadmap
2
+
3
+ ## Prioritization Criteria
4
+
5
+ Ordered by:
6
+
7
+ 1. Impact on agent reliability
8
+ 2. Reduction in retries / brittleness
9
+ 3. Breadth of app coverage improved
10
+ 4. Implementation complexity vs payoff
11
+
12
+ ## Program-Level Success Metrics
13
+ Track roadmap impact across releases using:
14
+
15
+ - Retry reduction rate (% fewer action retries per task)
16
+ - Element match success rate (% successful element targeting)
17
+ - Verification success rate (% of `expect_*` checks passing on the first attempt)
18
+ - Wait success rate for asynchronous UI flows
19
+ - Custom control interaction success rate
20
+ - Gesture success rate
21
+ - Mean time to root cause during debugging
22
+ - Overall agent task completion rate
23
+
24
+ Primary KPI:
25
+ Higher task success with fewer retries.
26
+
27
+ ---
28
+
29
+ # Priority 1 — Stronger State Verification
30
+
31
+ ## Why first
32
+ Highest-leverage improvement.
33
+
34
+ Most failures are not “can’t act”; they are caused by:
35
+ - uncertain state
36
+ - weak verification
37
+ - retry loops caused by inference
38
+
39
+ ## Deliver
40
+ - Direct readable control values
41
+ - Expanded `expect_*` verification
42
+ - Move from inference to state introspection
43
+
44
+ ## Expected Impact
45
+ Very high.
46
+
47
+ ## Done Criteria
48
+ - Control state readable for core widgets (toggle, slider, input, dropdown)
49
+ - New `expect_*` state verifiers implemented
50
+ - Agents can verify state without visual inference in representative flows
51
+ - Documentation and snapshot response shape updated
52
+
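As an illustration of moving from inference to state introspection, the following is a minimal sketch of what an `expect_*`-style state verifier could look like. The names `ControlState`, `VerificationResult`, and `expect_control_value` are hypothetical and do not come from the package; the sketch assumes a snapshot can expose a directly readable value per control.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ControlState:
    """Hypothetical shape of a readable control entry in a snapshot."""
    element_id: str
    role: str   # e.g. "toggle", "slider", "input", "dropdown"
    value: Any  # value read directly from the control; None if unreadable


@dataclass
class VerificationResult:
    passed: bool
    reason: str


def _is_number(x: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def expect_control_value(
    state: Optional[ControlState], expected: Any, tolerance: float = 0.0
) -> VerificationResult:
    """Compare a control's readable value with the expected one."""
    if state is None:
        return VerificationResult(False, "element not found in snapshot")
    if state.value is None:
        return VerificationResult(False, "control value not readable")
    if _is_number(expected) and _is_number(state.value):
        ok = abs(state.value - expected) <= tolerance  # sliders and similar
    else:
        ok = state.value == expected  # toggles, inputs, dropdowns
    reason = "match" if ok else f"expected {expected!r}, got {state.value!r}"
    return VerificationResult(ok, reason)
```

Returning a reason alongside the pass/fail flag lets an agent distinguish "value differs" from "value could not be read", which is the distinction that otherwise tends to produce retry loops.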
53
+ ## Success Metrics
54
+ - 30%+ retry reduction on stateful tasks
55
+ - Higher first-pass verification success
56
+ - Reduced false positive verifications
57
+
58
+ ## Dependencies
59
+ Blocks or strengthens:
60
+ - Priority 5 — Better Compose / Custom Control Semantics
61
+ - Priority 6 — Pinch to Zoom verification
62
+ - Priority 7 — Action Trace Correlation
63
+
64
+ ---
65
+
66
+ # Priority 2 — Richer Element Identity
67
+
68
+ ## Why second
69
+ Directly reduces selector brittleness.
70
+
71
+ Improves:
72
+ - targeting stability
73
+ - repeatability
74
+ - agent confidence
75
+
76
+ ## Deliver
77
+ - Stable IDs / test tags prioritization
78
+ - Selector confidence metadata
79
+ - Preferred selector hierarchy
80
+
81
+ ## Expected Impact
82
+ Very high.
83
+
84
+ ## Done Criteria
85
+ - Stable selector preference order implemented
86
+ - Test tags/resource IDs surfaced where available
87
+ - Selector confidence metadata available
88
+ - Structural fallback selectors defined
89
+
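One way to picture a preferred selector hierarchy with confidence metadata is the sketch below. The preference order, the confidence numbers, and the names `SELECTOR_PREFERENCE`, `Selector`, and `pick_selector` are all assumptions made for illustration; the roadmap does not specify them.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed preference order, most stable first; structural path is the fallback.
SELECTOR_PREFERENCE = [
    "test_tag",
    "resource_id",
    "accessibility_label",
    "text",
    "structural_path",
]

# Illustrative confidence values only; real values would need calibration.
CONFIDENCE = {
    "test_tag": 0.95,
    "resource_id": 0.9,
    "accessibility_label": 0.7,
    "text": 0.5,
    "structural_path": 0.3,
}


@dataclass
class Selector:
    kind: str
    value: str
    confidence: float


def pick_selector(element: dict) -> Optional[Selector]:
    """Return the most stable selector available for an element, if any."""
    for kind in SELECTOR_PREFERENCE:
        value = element.get(kind)
        if value:  # skip missing or empty attributes
            return Selector(kind, value, CONFIDENCE[kind])
    return None
```

For example, an element carrying both a test tag and visible text would be targeted by its test tag, while an element with only visible text would be targeted by text with a lower confidence attached.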
90
+ ## Success Metrics
91
+ - Higher element match rate
92
+ - Reduced selector drift failures
93
+ - Lower retargeting retries
94
+
95
+ ## Dependencies
96
+ Blocks or strengthens:
97
+ - Priority 4 — Long Press targeting reliability
98
+ - Priority 5 — Better Compose / Custom Control Semantics
99
+ - Priority 6 — Pinch to Zoom targeting
100
+
101
+ ---
102
+
103
+ # Priority 3 — Wait and Synchronization Reliability
104
+
105
+ ## Why third
106
+ Reliable async synchronization is foundational for agent success and should precede gesture expansion.
107
+
108
+ Addresses failures where agents:
109
+ - skip UI waits after actions
110
+ - rely on network/log signals too early
111
+ - struggle with in-place UI updates
112
+ - misread stale UI snapshots
113
+
114
+ ## Deliver
115
+ - UI-first synchronization policy guidance
116
+ - `wait_for_ui_change` (hierarchy-diff-based waiting)
117
+ - Structured loading state detection
118
+ - Snapshot revision / staleness metadata
119
+ - Compose-aware wait robustness improvements
120
+
121
+ ## Expected Impact
122
+ Very high.
123
+
124
+ ## Done Criteria
125
+ - wait_for_ui_change implemented
126
+ - Loading state detection available for representative controls
127
+ - Snapshot revision or staleness metadata exposed
128
+ - UI-first sync guidance added to spec guardrails
129
+ - In-place update waits validated on benchmark flows
130
+
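Hierarchy-diff-based waiting can be sketched as polling the UI hierarchy and comparing a fingerprint against a baseline. This is a simplified illustration, not the package's implementation: `get_hierarchy` stands in for whatever call returns the current UI tree, and the parameter names and defaults are assumptions.

```python
import hashlib
import json
import time
from typing import Callable, Optional


def hierarchy_fingerprint(hierarchy: dict) -> str:
    """Hash a canonical JSON form so equal trees give equal fingerprints."""
    canonical = json.dumps(hierarchy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def wait_for_ui_change(
    get_hierarchy: Callable[[], dict],
    baseline: dict,
    timeout_s: float = 5.0,
    poll_interval_s: float = 0.1,
) -> Optional[dict]:
    """Poll until the hierarchy differs from the baseline.

    Returns the changed hierarchy, or None if the timeout elapses first.
    """
    baseline_fp = hierarchy_fingerprint(baseline)
    deadline = time.monotonic() + timeout_s
    while True:
        current = get_hierarchy()
        if hierarchy_fingerprint(current) != baseline_fp:
            return current
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval_s)
```

Comparing whole-tree fingerprints also catches in-place updates, such as a label changing from "Save" to "Saved", that do not add or remove any element. A production version would likely ignore volatile attributes (timestamps, animation frames) before hashing.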
131
+ ## Success Metrics
132
+ - Reduced missed async UI transitions
133
+ - Fewer retries caused by premature actions
134
+ - Higher wait success rate for dynamic UI flows
135
+ - Lower fallback usage to network/log checks
136
+
137
+ ## Dependencies
138
+ Depends on:
139
+ - Priority 1 — Stronger State Verification
140
+ - Priority 2 — Richer Element Identity
141
+
142
+ Blocks or strengthens:
143
+ - Priority 5 — Better Compose / Custom Control Semantics
144
+ - Priority 7 — Action Trace Correlation
145
+
146
+ ---
147
+
148
+ # Priority 4 — Long Press Gesture
149
+
150
+ ## Why fourth
151
+ High utility, relatively low complexity.
152
+
153
+ Unlocks many currently awkward interactions:
154
+
155
+ - context menus
156
+ - hidden actions
157
+ - reorder handles
158
+ - press-and-hold controls
159
+
160
+ Broad usefulness.
161
+
162
+ ## Deliver
163
+ New tool:
164
+
165
+ ```text
166
+ long_press(element_id, duration_ms?)
167
+ ```
168
+
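The optional `duration_ms` argument implies a default plus validation of overrides. A minimal sketch of that argument handling follows; the default of 500 ms, the upper bound, and the names `LongPressRequest` and `build_long_press` are assumptions, since the roadmap does not state them.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_LONG_PRESS_MS = 500   # assumed default; not specified in the roadmap
MAX_LONG_PRESS_MS = 10_000    # assumed safety cap


@dataclass(frozen=True)
class LongPressRequest:
    element_id: str
    duration_ms: int


def build_long_press(
    element_id: str, duration_ms: Optional[int] = None
) -> LongPressRequest:
    """Validate arguments and apply the default duration when omitted."""
    if not element_id:
        raise ValueError("element_id is required")
    if duration_ms is None:
        duration_ms = DEFAULT_LONG_PRESS_MS
    if not 0 < duration_ms <= MAX_LONG_PRESS_MS:
        raise ValueError(
            f"duration_ms must be in (0, {MAX_LONG_PRESS_MS}], got {duration_ms}"
        )
    return LongPressRequest(element_id, duration_ms)
```

Rejecting out-of-range durations at the boundary keeps a malformed agent request from turning into an unbounded press on the device.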
169
+ Verification alignment:
170
+ - expect_context_menu
171
+ - expect_press_effect
172
+
173
+ ## Expected Impact
174
+ High.
175
+
176
+ ## Done Criteria
177
+ - long_press tool implemented across supported platforms
178
+ - Duration defaults and overrides supported
179
+ - Verification patterns for long press outcomes defined
180
+ - Included in action capability model
181
+
182
+ ## Success Metrics
183
+ - Increased hidden/control-surface interaction coverage
184
+ - Reduced dead-end interaction failures
185
+ - Long press task success rate tracked
186
+
187
+ ## Dependencies
188
+ Depends on:
189
+ - Priority 2 — Richer Element Identity
190
+
191
+ Strengthens:
192
+ - Priority 5 — Better Compose / Custom Control Semantics (interaction contracts)
193
+
194
+ ---
195
+
196
+ # Priority 5 — Better Compose / Custom Control Semantics
197
+
198
+ ## Why fifth
199
+ Important, but strengthened by priorities 1–4 first.
200
+
201
+ Semantics become more useful once:
202
+ - identity is stronger
203
+ - verification is stronger
204
+ - gestures are richer
205
+ - synchronization is more reliable
206
+
207
+ ## Deliver
208
+ - Composite control traits
209
+ - Control role enrichment (adjustable, expandable, selectable_group)
210
+ - Interaction contracts metadata
211
+ - Custom widget gesture affordance hints
212
+ - Semantic confidence annotations
213
+ - Compose-aware selectors for waits (merged semantics and element relationships)
214
+
215
+ ## Expected Impact
216
+ High.
217
+
218
+ ## Done Criteria
219
+ - Semantic traits implemented for major custom control classes
220
+ - Interaction contracts surfaced in snapshot model
221
+ - Confidence model defined for derived semantics
222
+ - Custom control manipulation success validated in benchmark flows
223
+
224
+ ## Success Metrics
225
+ - Higher custom control interaction success rate
226
+ - Fewer retries on non-standard widgets
227
+ - Reduced semantic ambiguity failures
228
+
229
+ ## Dependencies
230
+ Depends on:
231
+ - Priority 1 — Stronger State Verification
232
+ - Priority 2 — Richer Element Identity
233
+ - Priority 3 — Wait and Synchronization Reliability
234
+ - Priority 4 — Long Press Gesture
235
+
236
+ ---
237
+
238
+ # Priority 6 — Pinch to Zoom
239
+
240
+ ## Why sixth
241
+ Valuable, but narrower than long press.
242
+
243
+ Applies mainly to:
244
+ - maps
245
+ - images
246
+ - canvases
247
+ - zoomable custom surfaces
248
+
249
+ Useful, but less universal.
250
+
251
+ ## Deliver
252
+
253
+ ```text
254
+ pinch_to_zoom(target, scale, center?)
255
+ ```
256
+
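To make the `scale` and optional `center` parameters concrete, the sketch below computes start and end points for two fingers moving horizontally around a center. It is a geometric illustration under stated assumptions: `bounds` is taken to be `(left, top, right, bottom)` in pixels, `start_span` is an invented tuning parameter, and `pinch_paths` is a hypothetical helper rather than part of the tool.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]
Path = Tuple[Point, Point]  # (start, end) for one finger


def pinch_paths(
    bounds: Tuple[float, float, float, float],
    scale: float,
    center: Optional[Point] = None,
    start_span: float = 0.25,
) -> Tuple[Path, Path]:
    """Return (start, end) points for two fingers on a horizontal axis.

    scale > 1 moves the fingers apart (zoom in); scale < 1 moves them
    together (zoom out). Offsets are clamped to stay inside the target.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    left, top, right, bottom = bounds
    if center is None:
        center = ((left + right) / 2, (top + bottom) / 2)
    cx, cy = center
    max_offset = min(cx - left, right - cx)  # room available on each side
    start_offset = min((right - left) / 2 * start_span, max_offset)
    end_offset = min(start_offset * scale, max_offset)
    finger_a = ((cx - start_offset, cy), (cx - end_offset, cy))
    finger_b = ((cx + start_offset, cy), (cx + end_offset, cy))
    return finger_a, finger_b
```

Because offsets are clamped to the target bounds, a large requested scale near an edge yields a smaller effective gesture, which is one reason a verification step such as `expect_zoom_level` is useful after the gesture.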
257
+ Verification:
258
+ - expect_zoom_level
259
+ - expect_viewport_change
260
+
261
+ ## Expected Impact
262
+ Medium-high.
263
+
264
+ ## Done Criteria
265
+ - pinch_to_zoom implemented
266
+ - Zoom in/out flows supported
267
+ - Verification primitives for viewport or zoom state available
268
+ - Gesture integrated into action model
269
+
270
+ ## Success Metrics
271
+ - Successful execution across zoomable surfaces
272
+ - Reduced failures on map/image workflows
273
+ - Gesture success rate tracked
274
+
275
+ ## Dependencies
276
+ Depends on:
277
+ - Priority 1 — Stronger State Verification
278
+ - Priority 2 — Richer Element Identity
279
+
280
+ ---
281
+
282
+ # Priority 7 — Action Trace Correlation
283
+
284
+ ## Why seventh
285
+ Very valuable for debugging,
286
+ but less critical than improving control success first.
287
+
288
+ Improves diagnosis more than task completion.
289
+
290
+ ## Deliver
291
+ - Action correlation metadata
292
+ - UI/network/log linkage
293
+
294
+ ## Expected Impact
295
+ Medium-high.
296
+
297
+ ## Done Criteria
298
+ - Action correlation model defined
299
+ - UI/network/log linkage captured for representative actions
300
+ - Correlation metadata exposed to agents
301
+ - Debugging workflows validated with trace linkage
302
+
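A simple way to think about UI/network/log linkage is to attach every event observed shortly after an action to that action's trace. The sketch below uses a fixed time window, which is a naive heuristic chosen for illustration; the roadmap does not define the correlation model, and `Event`, `ActionTrace`, and `correlate` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Event:
    source: str        # "ui", "network", or "log"
    timestamp_ms: int
    detail: str


@dataclass
class ActionTrace:
    action_id: str
    started_ms: int
    window_ms: int
    events: List[Event] = field(default_factory=list)


def correlate(
    action_id: str,
    started_ms: int,
    events: Iterable[Event],
    window_ms: int = 2000,
) -> ActionTrace:
    """Link events that fall within window_ms after the action started."""
    end_ms = started_ms + window_ms
    linked = [e for e in events if started_ms <= e.timestamp_ms <= end_ms]
    linked.sort(key=lambda e: e.timestamp_ms)  # chronological order
    return ActionTrace(action_id, started_ms, window_ms, linked)
```

Time-window matching can misattribute events when actions overlap, so a fuller model would probably propagate an explicit action identifier into network requests and log lines where the platform allows it.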
303
+ ## Success Metrics
304
+ - Lower time-to-root-cause
305
+ - Faster diagnosis of partial failures
306
+ - Improved action causality attribution
307
+
308
+ ## Dependencies
309
+ Depends on:
310
+ - Priority 1 — Stronger State Verification
311
+ - Priority 2 — Richer Element Identity
312
+ - Priority 3 — Wait and Synchronization Reliability
313
+
314
+ ---
315
+
316
+ # Delivery Waves
317
+
318
+ ## Dependency Summary
319
+ Foundational sequence:
320
+
321
+ Layer 1 (Foundations)
322
+ - Priority 1
323
+ - Priority 2
324
+
325
+ Layer 2 (Synchronization)
326
+ - Priority 3 depends on 1,2
327
+
328
+ Layer 3 (Interaction Expansion)
329
+ - Priority 4 depends on 2
330
+ - Priority 5 depends on 1,2,3,4
331
+ - Priority 6 depends on 1,2
332
+
333
+ Layer 4 (Observability)
334
+ - Priority 7 depends on 1,2,3
335
+
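The dependency summary above can be checked mechanically. The sketch below transcribes the stated dependencies into a map, confirms that the execution order 1 through 7 never schedules a priority before something it depends on, and computes the earliest possible grouping using Python's standard `graphlib` module (Python 3.9+). The helper names are illustrative.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Transcribed from the dependency summary: priority -> priorities it depends on.
DEPENDS_ON: Dict[int, Set[int]] = {
    1: set(),
    2: set(),
    3: {1, 2},
    4: {2},
    5: {1, 2, 3, 4},
    6: {1, 2},
    7: {1, 2, 3},
}


def respects_dependencies(order: List[int], depends_on: Dict[int, Set[int]]) -> bool:
    """True if every priority appears after all of its dependencies."""
    position = {p: i for i, p in enumerate(order)}
    return all(
        position[dep] < position[p]
        for p, deps in depends_on.items()
        for dep in deps
    )


def earliest_layers(depends_on: Dict[int, Set[int]]) -> List[List[int]]:
    """Group priorities into the earliest layer their dependencies allow."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    layers = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        layers.append(ready)
        sorter.done(*ready)
    return layers
```

Note that the purely dependency-driven grouping differs from the thematic layers listed above: dependencies alone would allow Priorities 3, 4, and 6 to start together once Priorities 1 and 2 are complete, whereas the roadmap groups work by theme and payoff as well.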
336
+ ## Wave 1 (Immediate)
337
+ - Stronger State Verification
338
+ - Richer Element Identity
339
+ - Wait and Synchronization Reliability
340
+
341
+ Focus:
342
+ Make the core loop more reliable.
343
+
344
+ ---
345
+
346
+ ## Wave 2
347
+ - Long Press
348
+ - Better Compose / Custom Control Semantics
349
+
350
+ Focus:
351
+ Expand interaction capability.
352
+
353
+ ---
354
+
355
+ ## Wave 3
356
+ - Pinch to Zoom
357
+ - Action Trace Correlation
358
+
359
+ Focus:
360
+ Advanced gestures and observability.
361
+
362
+ ---
363
+
364
+ # Priority Stack Summary
365
+
366
+ Execution Order:
367
+ 1. Stronger State Verification
368
+ 2. Richer Element Identity
369
+ 3. Wait and Synchronization Reliability
370
+ 4. Long Press
371
+ 5. Better Compose / Custom Control Semantics
372
+ 6. Pinch to Zoom
373
+ 7. Action Trace Correlation
374
+
375
+ Rationale:
376
+ - Priorities 1–3 harden control, verification, and synchronization.
377
+ - Priorities 4–6 expand interaction capability.
378
+ - Priority 7 adds observability once control reliability matures.
379
+
380
+ ---
381
+
382
+ ## Explicitly Deferred
383
+ Still out of scope:
384
+
385
+ - Recovery planning logic
386
+ - Autonomous retry strategy
387
+ - MCP-level agent orchestration
388
+ - Autonomous recovery hinting (future consideration only)