mobile-debug-mcp 0.26.2 → 0.26.4
- package/AGENTS.md +3 -0
- package/dist/interact/index.js +600 -70
- package/dist/observe/ios.js +1 -1
- package/dist/server/tool-definitions.js +59 -1
- package/dist/server/tool-handlers.js +25 -0
- package/dist/server-core.js +1 -1
- package/dist/utils/android/utils.js +2 -2
- package/docs/CHANGELOG.md +6 -0
- package/docs/ROADMAP.md +72 -16
- package/docs/rfcs/007-actionability-resolution-and-executable-target-selection.md +277 -0
- package/docs/rfcs/008-adjustable-control-support-and-semantic-value-manipulation.md +273 -0
- package/docs/specs/mcp-tooling-spec-v1.md +1 -1
- package/docs/tools/interact.md +30 -1
- package/package.json +1 -1
- package/src/interact/index.ts +761 -72
- package/src/observe/ios.ts +1 -1
- package/src/server/tool-definitions.ts +59 -1
- package/src/server/tool-handlers.ts +26 -0
- package/src/server-core.ts +1 -1
- package/src/types.ts +90 -0
- package/src/utils/android/utils.ts +2 -2
- package/test/unit/interact/adjust_control.test.ts +365 -0
- package/test/unit/observe/find_element.test.ts +5 -0
- package/test/unit/observe/state_extraction.test.ts +24 -0
- package/test/unit/server/contract.test.ts +8 -0
- package/test/unit/server/response_shapes.test.ts +39 -0
@@ -0,0 +1,273 @@
+
+
+
+# RFC 008 — Adjustable Control Support and Semantic Value Manipulation
+
+## 1. Summary
+
+This RFC defines semantic interaction support for adjustable controls whose primary interaction changes a value rather than triggering a discrete action.
+
+Examples include:
+- sliders
+- steppers
+- seek bars
+- drag-based range controls
+- quantized parameter selectors
+
+Goal:
+Enable reliable value-setting interactions with verification, minimizing coordinate guessing and brittle gesture calibration.
+
+Builds on:
+- RFC 005 — correctness model
+- RFC 006 — runtime execution binding
+- RFC 007 — target resolution
+
+---
+
+## 2. Problem Statement
+
+Current control adjustment often degrades into coordinate heuristics.
+
+Observed failure modes:
+- slider handles not semantically surfaced
+- coordinate calibration guesswork
+- snapping or quantized values behaving unexpectedly
+- weak confirmation of resulting value
+- retries caused by partial adjustment success
+
+This is not primarily a gesture problem.
+
+It is:
+- a control semantics problem
+- a value verification problem
+- an adjustment convergence problem
+
+---
+
+## 3. Design Goals
+
+Support MUST:
+- Prefer semantic adjustment over coordinate manipulation
+- Support deterministic value targeting
+- Verify resulting value after adjustment
+- Handle quantized or snapping controls
+- Support bounded tolerance when exact values are impossible
+- Use coordinate fallback only as degraded mode
+
+---
+
+## 4. Adjustable Control Model
+
+Treat adjustable controls as a combination of:
+- target control
+- value model
+- adjustment mechanism
+- verification loop
+
+Control should expose where possible:
+- current value
+- minimum and maximum range
+- step granularity (if known)
+- adjustable role metadata
+
+---
+
+## 5. Primary Primitive
+
+Illustrative adjustment primitive (conceptual, not yet a committed tool surface):
+
+set_slider_value(target, value, tolerance?)
+
+This denotes an adjustment capability, not a required standalone tool API. Implementations may realize it as:
+- an extension of existing gesture tools
+- an internal adjustment helper
+- a future dedicated control-adjustment tool surface
+
+This RFC does not mandate which mechanism is used.
+
+Semantics:
+- resolve control target
+- perform adjustment
+- read back resulting value
+- converge or fail explicitly
+
+This is not:
+blind drag gestures
+
+It is:
+set and verify
+
+## 5.1 Tool Surface Boundary
+
+This RFC specifies adjustment semantics, not a committed tool API.
+
+It does not assume a new public tool is introduced in this RFC.
+Implementation may extend existing gesture/runtime surfaces before introducing any dedicated adjustment command.
+
+---
+
+## 6. Adjustment Modes
+
+### Semantic mode (preferred)
+Use control semantics to set value directly or intelligently drive adjustment.
+
+---
+
+### Gesture-assisted mode
+Use controlled drag informed by:
+- bounds
+- target percentage
+- snapping model
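
As a rough illustration of how bounds, target percentage, and a snapping model could combine into a drag destination, here is a minimal TypeScript sketch. It is not taken from the package: `Bounds`, `ValueRange`, `snapToStep`, and `dragTargetPoint` are hypothetical names, and the sketch assumes a horizontal, left-to-right, linear track.

```typescript
// Hypothetical types and helpers; none of these names come from the package source.
interface Bounds { left: number; top: number; right: number; bottom: number }
interface ValueRange { min: number; max: number; step?: number }

// Clamp the requested value to the range, then snap it to the nearest step (if a step is known).
function snapToStep(value: number, range: ValueRange): number {
  const clamped = Math.min(range.max, Math.max(range.min, value));
  if (range.step === undefined || range.step <= 0) return clamped;
  const steps = Math.round((clamped - range.min) / range.step);
  return Math.min(range.max, range.min + steps * range.step);
}

// Map the snapped value to a point on the track: x from the target percentage, y at the vertical center.
// Assumption: the track spans the full width of the control bounds and the scale is linear.
function dragTargetPoint(
  bounds: Bounds,
  range: ValueRange,
  target: number,
): { x: number; y: number } {
  const snapped = snapToStep(target, range);
  const span = range.max - range.min;
  const fraction = span > 0 ? (snapped - range.min) / span : 0;
  return {
    x: Math.round(bounds.left + fraction * (bounds.right - bounds.left)),
    y: Math.round((bounds.top + bounds.bottom) / 2),
  };
}
```

A computed point like this is only a starting estimate; the resulting value still has to be read back and verified.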
+
+---
+
+### Coordinate fallback (last resort)
+Allowed only when semantic control support is absent.
+
+Must be explicit degraded mode.
+
+---
+
+## 7. Verification Loop
+
+Adjustment is incomplete until verified.
+
+Loop:
+- adjust
+- read back
+- compare to target or tolerance
+- converge or fail after bounded retries
+
+Possible outcomes:
+- Verified
+- Tolerance satisfied
+- Failed to converge
+
+Aligns with RFC 005.
+Verification may initially be realized through explicit expect_state mappings for control value assertions. Dedicated value verifiers are a possible future extension, not a prerequisite of this RFC.
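
The adjust, read back, compare, and bounded-retry loop can be sketched as follows. This is an illustrative TypeScript sketch under stated assumptions, not the package's implementation: `adjustWithVerification`, `LoopResult`, and the `adjust` and `readBack` callbacks are hypothetical, and the callbacks are synchronous only to keep the example short (real device I/O would presumably be asynchronous).

```typescript
// Hypothetical names throughout; the three outcomes mirror the list in the RFC text.
type Outcome = "verified" | "tolerance_satisfied" | "failed_to_converge";

interface LoopResult {
  outcome: Outcome;
  actual: number | null; // last value read back, or null if it was never readable
  attempts: number;
}

function adjustWithVerification(
  target: number,
  tolerance: number,
  maxAttempts: number,
  adjust: (target: number) => void, // assumed: performs one adjustment toward the target
  readBack: () => number | null,    // assumed: returns the current value, or null if unreadable
): LoopResult {
  let actual: number | null = null;
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts += 1;
    adjust(target);
    actual = readBack();
    if (actual === null) continue; // unreadable value: retry within the attempt budget
    if (actual === target) return { outcome: "verified", actual, attempts };
    if (Math.abs(actual - target) <= tolerance) {
      return { outcome: "tolerance_satisfied", actual, attempts };
    }
  }
  // The retry budget is exhausted: fail explicitly rather than report partial success.
  return { outcome: "failed_to_converge", actual, attempts };
}
```

Passing the adjustment and read-back steps as callbacks keeps the loop independent of whether the adjustment is semantic, gesture-assisted, or a coordinate fallback.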
+
+---
+
+## 8. Quantized / Snapping Controls
+
+Support:
+- discrete step controls
+- snapping values
+- non-linear scales (where detectable)
+
+Tolerance model required.
+
+Example:
+Target 30
+Actual 29.8
+Within tolerance acceptable
+Discrete or non-numeric controls may satisfy convergence through semantic state equivalence rather than numeric tolerance alone.
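
One possible reading of the tolerance model, covering both the numeric example above and non-numeric states, is sketched below. `ControlState` and `statesMatch` are hypothetical names, and treating labels as equivalent after trimming and lowercasing is an assumption made for illustration, not a rule stated by the RFC.

```typescript
// Hypothetical helper; the equivalence rule for non-numeric states is an assumption.
type ControlState = number | string;

function statesMatch(
  target: ControlState,
  actual: ControlState,
  tolerance: number = 0,
): boolean {
  // Numeric states: accept any value within the bounded tolerance.
  if (typeof target === "number" && typeof actual === "number") {
    return Math.abs(actual - target) <= tolerance;
  }
  // Discrete or non-numeric states: compare normalized labels instead of distances.
  const normalize = (s: ControlState): string => String(s).trim().toLowerCase();
  return normalize(target) === normalize(actual);
}
```

With a tolerance of 0.5, a target of 30 and an actual value of 29.8 would count as converged; with a tolerance of 0.1 they would not.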
+
+---
+
+## 9. Control Resolution Dependency
+
+Uses RFC 007 target resolution for:
+- locating actual adjustable control
+- avoiding fake slider containers
+- resolving executable adjustable target
+
+RFC 007 resolves what to adjust.
+RFC 008 defines how to adjust it.
+
+---
+
+## 10. Compose / Custom Control Support
+
+Support derived adjustable semantics for:
+- Compose sliders
+- custom parameter widgets
+- composite adjustable controls
+
+Strengthens Better Compose / Custom Control Semantics.
+
+---
+
+## 11. Output / Result Model (Illustrative)
+
+Illustrative result shape:
+
+{
+  "target_state": 30,
+  "actual_state": 30,
+  "converged": true,
+  "adjustment_mode": "semantic"
+}
+
+State may be numeric, discrete, or semantic depending on control type. This model is illustrative and current implementations may expose only a subset through existing verification surfaces.
+
+---
+
+## 12. Success Metrics
+
+Track:
+- reduction in coordinate fallback usage
+- reduced retries adjusting controls
+- improved first-pass value convergence
+- improved custom control adjustment success
+
+---
+
+## 13. Dependencies
+
+Depends on:
+- Stronger State Verification
+- Actionability Resolution (RFC 007)
+
+Strengthens:
+- Better Compose / Custom Control Semantics
+- Pinch to Zoom (future)
+
+---
+
+## 14. Relationship to Prior RFCs
+
+RFC 005
+Defines successful adjustment verification.
+
+RFC 006
+Defines runtime execution interpretation.
+
+RFC 007
+Defines which adjustable target gets selected.
+
+RFC 008
+Defines how value-changing controls are manipulated reliably.
+
+Together:
+- RFC 005 — correctness
+- RFC 006 — runtime binding
+- RFC 007 — target resolution
+- RFC 008 — value manipulation
+
+---
+
+## 15. Summary
+
+This RFC moves adjustable controls from:
+- gesture guesswork
+
+to:
+- semantic value manipulation with verification
+
+It reduces one of the largest remaining sources of interaction brittleness.
+
+---
+
+## 16. Non-Goals / Scope Boundary
+
+This RFC defines adjustment semantics and convergence behavior.
+
+It does not commit in this RFC to:
+- a specific runtime tool API
+- full adjustable control support across all control types
+- generalized gesture framework support
+- arbitrary drag or canvas manipulation
+- pinch-to-zoom or broader gesture semantics
+
+This RFC specifies the behavioral model adjustable-control support should satisfy as implementations mature.
@@ -52,7 +52,7 @@ For backend/API activity, `wait_for_screen_change` is not the right verification
 Action tools mutate application state.
 
 Includes:
-`start_app`, `restart_app`, `tap`, `tap_element`, `swipe`, `scroll_to_element`, `type_text`, `press_back`
+`start_app`, `restart_app`, `tap`, `tap_element`, `swipe`, `scroll_to_element`, `type_text`, `press_back`, `adjust_control`
 
 ### 4.2 Required Semantics
 
package/docs/tools/interact.md CHANGED
@@ -172,6 +172,27 @@ Guidance:
 
 ---
 
+## adjust_control
+
+Purpose:
+
+- adjust a numeric control value with bounded verification
+
+Notes:
+
+- initial support is for slider-like controls that expose `value_range` or readable numeric value state
+- `expect_state` is the verification surface used to read back the resulting value
+- direct target placement is preferred; drag fallback is treated as degraded mode
+- the tool returns `target_state`, `actual_state`, `within_tolerance`, `converged`, `attempts`, and `adjustment_mode`
+
+Input example:
+
+```json
+{ "selector": { "text": "Duration" }, "property": "value", "targetValue": 30, "tolerance": 0.5, "platform": "android", "deviceId": "emulator-5554" }
+```
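
A caller consuming the `adjust_control` result might want to check its shape before acting on it. The TypeScript sketch below uses the six field names listed in the notes above; the field types, the `AdjustControlResult` interface, and the `isAdjustControlResult` guard are assumptions for illustration and are not taken from the package's `types.ts`.

```typescript
// Field names follow the adjust_control notes; the types are assumed, not confirmed by the source.
interface AdjustControlResult {
  target_state: number;
  actual_state: number | null;
  within_tolerance: boolean;
  converged: boolean;
  attempts: number;
  adjustment_mode: string; // e.g. "semantic"; other mode strings are not listed in the docs shown here
}

// Hypothetical runtime guard for a parsed tool response.
function isAdjustControlResult(value: unknown): value is AdjustControlResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const actual = v["actual_state"];
  return (
    typeof v["target_state"] === "number" &&
    (typeof actual === "number" || actual === null) &&
    typeof v["within_tolerance"] === "boolean" &&
    typeof v["converged"] === "boolean" &&
    typeof v["attempts"] === "number" &&
    typeof v["adjustment_mode"] === "string"
  );
}
```

A guard of this kind lets the caller distinguish a converged adjustment from a malformed or partial response before deciding whether to retry.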
+
+---
+
 ## find_element
 
 Locate a UI element on the current screen using semantic matching and return an actionable element descriptor.
@@ -199,7 +220,14 @@ Output:
     "telemetry": { "matchedIndex": 3, "matchedInteractable": true }
   },
   "score": 1.0,
-  "confidence": 1.0
+  "confidence": 1.0,
+  "resolution": {
+    "confidence": 1.0,
+    "reason": "exact_text_match",
+    "fallback_available": false,
+    "matched_count": 1,
+    "alternates": []
+  }
 }
 ```
 
@@ -207,6 +235,7 @@ Notes:
 
 - Best used when no precise selector is available yet.
 - `tapCoordinates` are suitable for `tap` calls.
+- `resolution` explains why the element was selected and may include fallback alternates when the runtime had to promote a parent or nearby control.
 - Prefer `wait_for_ui` when you already know a deterministic selector and want a stable `elementId`.
 
 ---