mobile-debug-mcp 0.24.7 → 0.25.0

@@ -0,0 +1,452 @@
+ # RFC-001: Stronger State Verification
+
+ Priority: 1
+ Depends on: Snapshot v1
+ Related Roadmap Item: Priority 1 — Stronger State Verification
+
+ ---
+
+ # 1. Problem
+
+ Agents too often infer UI control state from surrounding text or visual cues instead of reading authoritative control state.
+
+ This causes:
+
+ - Retry loops caused by uncertainty
+ - False-positive verification
+ - Brittle behavior in dynamic UIs
+ - Poor handling of sliders, toggles, selectors, and custom controls
+
+ Current failure mode:
+
+ Infer:
+ - “Toggle looks enabled”
+ - “Dropdown probably changed”
+ - “Slider appears moved”
+
+ Desired:
+
+ Read:
+ - checked=true
+ - selected="Dark Mode"
+ - value=75
+
+ Shift from visual inference to explicit state introspection.
+
+ ---
+
+ # 2. Goals
+
+ This RFC introduces:
+
+ 1. Readable control state in snapshot responses
+ 2. Stronger verification primitives for state assertions
+ 3. Clear conflict and fallback semantics
+ 4. Backward-compatible protocol extension
+
+ Success goals:
+
+ - Reduce retries on stateful tasks
+ - Improve first-pass verification success
+ - Reduce ambiguity in control manipulation
+
+ ---
+
+ # 3. Non-Goals
+
+ This RFC does not:
+
+ - Introduce planner or recovery logic
+ - Add agent orchestration behavior
+ - Redesign semantic enrichment architecture
+ - Add gesture capabilities (covered by later RFCs)
+ - Add control interaction contracts (later RFC)
+
+ ---
+
+ # 4. Proposed Model
+
+ ## 4.1 State Model Additions
+
+ Add an optional readable state block to elements where applicable.
+
+ Candidate fields:
+
+ - checked
+ - selected
+ - value
+ - focused
+ - expanded
+ - enabled
+ - text_value
+
+ Illustrative shape:
+
+ ```json
+ {
+   "element_id": "wifi_toggle",
+   "role": "switch",
+   "state": {
+     "checked": true,
+     "enabled": true
+   }
+ }
+ ```
+
+ Rules:
+
+ - State is optional when unavailable.
+ - Absence of state is not a negative assertion.
+ - Raw exposed state is authoritative.
+ - Derived semantic guesses MUST NOT override raw state.
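
The rules above imply a three-way outcome when a property is read: present and true, present and false, or unavailable. The following is a minimal TypeScript sketch of that idea, not the package's implementation. The names `ElementState`, `SnapshotElement`, `StateRead`, and `readState` are hypothetical; only the field names come from the candidate list.

```typescript
// Hypothetical names; only the state field names come from Section 4.1.
type StateValue = boolean | number | string;

interface ElementState {
  checked?: boolean;
  selected?: string;
  value?: number | string;
  focused?: boolean;
  expanded?: boolean;
  enabled?: boolean;
  text_value?: string;
}

interface SnapshotElement {
  element_id: string;
  role?: string;
  state?: ElementState; // omitted when the platform exposes no readable state
}

type StateRead =
  | { available: true; value: StateValue }
  | { available: false };

// Reads one property without collapsing "missing" into false.
function readState(element: SnapshotElement, property: keyof ElementState): StateRead {
  const value = element.state?.[property];
  return value === undefined ? { available: false } : { available: true, value };
}

const wifiToggle: SnapshotElement = {
  element_id: "wifi_toggle",
  role: "switch",
  state: { checked: true, enabled: true },
};
const customKnob: SnapshotElement = { element_id: "custom_knob" };

const toggleRead = readState(wifiToggle, "checked"); // available, with value true
const knobRead = readState(customKnob, "checked"); // unavailable, which is not the same as false
```

Returning an explicit `available: false` keeps callers from treating a missing field as `checked: false`.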
+
+ ---
+
+ ## 4.2 Verification Primitive Additions
+
+ Canonical verification primitive:
+
+ ```text
+ expect_state(element, property, expected)
+ ```
+
+ The canonical primitive is the normative API.
+
+ Convenience aliases MAY be exposed:
+
+ - expect_checked(...)
+ - expect_value(...)
+ - expect_selected(...)
+ - expect_expanded(...)
+ - expect_enabled(...)
+
+ Rules:
+
+ - `expect_state` is the authoritative verification primitive.
+ - Specialized expect_* forms are aliases over the same semantics.
+ - New control properties should extend `expect_state`, not proliferate bespoke tools.
+
+ Rationale:
+
+ - Preserves extensibility
+ - Avoids tool surface explosion
+ - Keeps a single verification model
+ - Retains ergonomic shortcuts for agents
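
One way to read "aliases over the same semantics" is that each alias only fixes the property name and delegates to the canonical primitive. The sketch below assumes a hypothetical in-memory `SnapshotElement` and a hypothetical `ExpectResult` shape with a separate `unavailable` outcome. The real tool works against a device and its result format is not specified here.

```typescript
type StateValue = boolean | number | string;

interface SnapshotElement {
  element_id: string;
  state?: Record<string, StateValue | undefined>;
}

// Hypothetical result shape: a missing property is reported, not treated as a mismatch.
type ExpectResult =
  | { status: "pass"; observed: StateValue }
  | { status: "fail"; observed: StateValue }
  | { status: "unavailable" };

function expectState(element: SnapshotElement, property: string, expected: StateValue): ExpectResult {
  const observed = element.state?.[property];
  if (observed === undefined) {
    return { status: "unavailable" };
  }
  return observed === expected
    ? { status: "pass", observed }
    : { status: "fail", observed };
}

// Aliases add no semantics of their own; they only fix the property name.
const expectChecked = (el: SnapshotElement, expected: boolean) => expectState(el, "checked", expected);
const expectValue = (el: SnapshotElement, expected: number | string) => expectState(el, "value", expected);
const expectEnabled = (el: SnapshotElement, expected: boolean) => expectState(el, "enabled", expected);

const slider: SnapshotElement = { element_id: "volume", state: { value: 75 } };

const viaAlias = expectValue(slider, 75);
const viaCanonical = expectState(slider, "value", 75);
const missing = expectEnabled(slider, true); // "enabled" was never exposed for this element
```

Because every alias routes through `expectState`, a new property needs no new tool, only a new property name.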
+
+ ---
+
+ ## 4.3 Minimum v1 Widget Coverage
+
+ Implementations MUST support readable state and verification for the required v1 controls.
+
+ Required v1 controls:
+
+ - Switch / toggle
+ - Slider / seekbar
+ - Text input
+ - Single-select dropdown or picker
+
+ Deferred from v1:
+
+ - Multi-select controls
+ - Date/time pickers
+ - Rich custom composite widgets
+ - Advanced gesture-driven controls
+
+ RFC-001 is considered incomplete without required v1 control support.
+
+ ---
+
+ ## 4.4 Value Normalization
+
+ ### Numeric values
+
+ Implementations MUST expose canonical normalized values where applicable.
+
+ Preferred shape:
+
+ ```json
+ {
+   "value": 75,
+   "value_range": {
+     "min": 0,
+     "max": 100
+   },
+   "raw_value": 0.75,
+   "raw_value_unit": "fraction"
+ }
+ ```
+
+ Canonical/raw model rules:
+
+ - `value` is the canonical comparison value used for verification.
+ - `raw_value` is an optional implementation-native representation and MUST NOT be used as canonical comparison state.
+ - If both are present, canonical and raw representations MUST refer to the same underlying control state.
+ - Agents and verification primitives MUST prefer canonical values over raw values.
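
As an illustration of the canonical/raw split, the sketch below derives the canonical `value` from a raw fraction and compares only the canonical number. It assumes that a `raw_value_unit` of `"fraction"` means a 0..1 position within `value_range`, which the RFC does not define. `fromFraction`, `valueMatches`, and the `tolerance` parameter are hypothetical.

```typescript
interface ValueRange {
  min: number;
  max: number;
}

interface NumericState {
  value: number; // canonical comparison value
  value_range: ValueRange;
  raw_value?: number; // optional, never compared directly
  raw_value_unit?: string;
}

// Assumption: a raw "fraction" is a 0..1 position within value_range.
function fromFraction(raw: number, range: ValueRange): NumericState {
  if (raw < 0 || raw > 1) {
    throw new RangeError(`fraction out of bounds: ${raw}`);
  }
  const value = range.min + raw * (range.max - range.min);
  return { value, value_range: range, raw_value: raw, raw_value_unit: "fraction" };
}

// Verification looks at the canonical value only.
function valueMatches(state: NumericState, expected: number, tolerance = 0): boolean {
  return Math.abs(state.value - expected) <= tolerance;
}

const sliderState = fromFraction(0.75, { min: 0, max: 100 });
const matchesCanonical = valueMatches(sliderState, 75);
const matchesRaw = valueMatches(sliderState, 0.75); // expecting the raw number does not match
```

Both representations describe the same slider position, but an agent that expects `75` never has to know the platform reported `0.75`.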
+
+ ---
+
+ ### Selected values
+
+ Selected values SHOULD expose a stable identity and a user-visible label when possible.
+
+ Preferred shape:
+
+ ```json
+ {
+   "selected": {
+     "id": "dark_mode",
+     "label": "Dark Mode"
+   }
+ }
+ ```
+
+ Rules:
+
+ - Prefer stable identifiers over label-only values.
+ - Labels may vary; identifiers should remain stable.
+ - String-only selected values are allowed only when richer identity is unavailable.
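
A comparison that follows these rules might prefer the identifier and consult the label only when no identifier is available. This is a sketch under that assumption. `SelectedValue`, `SelectedExpectation`, and `selectedMatches` are hypothetical names, and the localized label is invented for illustration.

```typescript
type SelectedValue = string | { id: string; label: string };

interface SelectedExpectation {
  id?: string;
  label?: string;
}

// Prefers the stable id; falls back to the label only when no id can be compared.
function selectedMatches(observed: SelectedValue, expected: SelectedExpectation): boolean {
  if (typeof observed === "string") {
    // String-only value: the label is all there is to compare.
    return expected.label !== undefined && observed === expected.label;
  }
  if (expected.id !== undefined) {
    return observed.id === expected.id;
  }
  return expected.label !== undefined && observed.label === expected.label;
}

const rich: SelectedValue = { id: "dark_mode", label: "Dunkler Modus" }; // localized label
const labelOnly: SelectedValue = "Dark Mode";

const byId = selectedMatches(rich, { id: "dark_mode", label: "Dark Mode" });
const byLabel = selectedMatches(labelOnly, { id: "dark_mode", label: "Dark Mode" });
const wrongId = selectedMatches(rich, { id: "light_mode" });
```

Here `byId` succeeds even though the visible label differs from the expected one, which is the case the "labels may vary" rule is meant to cover.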
+
+ ---
+
+ # 5. Guardrails and Conflict Semantics
+
+ Required invariants:
+
+ ## Raw wins on conflict
+
+ If raw state conflicts with derived semantics:
+
+ - raw state is authoritative
+ - semantic layer is advisory only
+
+ ---
+
+ ## Missing state handling
+
+ If state is unavailable:
+
+ - do not infer false
+ - allow agent fallback strategies
+ - degrade gracefully
+
+ ---
+
+ ## Verification ownership
+
+ State assertions remain owned by:
+
+ - expect_* verification
+
+ Observation and verification stay distinct.
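
The first two invariants can be sketched as a small resolution function. This is one possible reading, not the package's logic: it assumes a derived guess may be surfaced as a non-authoritative fallback when raw state is missing. `Resolved` and `resolveChecked` are hypothetical names.

```typescript
type Resolved =
  | { source: "raw"; value: boolean; authoritative: true }
  | { source: "derived"; value: boolean; authoritative: false }
  | { source: "none" };

function resolveChecked(raw: boolean | undefined, derived: boolean | undefined): Resolved {
  if (raw !== undefined) {
    // Raw wins, even when the semantic layer disagrees.
    return { source: "raw", value: raw, authoritative: true };
  }
  if (derived !== undefined) {
    // Advisory only: the agent may use this as fallback evidence.
    return { source: "derived", value: derived, authoritative: false };
  }
  // Nothing known: report that, rather than inferring false.
  return { source: "none" };
}

const conflict = resolveChecked(false, true);
const fallback = resolveChecked(undefined, true);
const unknown = resolveChecked(undefined, undefined);
```

In the `conflict` case the derived guess of `true` is discarded, and in the `unknown` case no value is synthesized.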
+
+ ---
+
+ # 6. Protocol Delta (Draft)
+
+ Snapshot schema extension:
+
+ ```text
+ state?: {
+   checked?: boolean,
+   selected?: string | { id: string, label: string },
+   value?: number | string,
+   focused?: boolean,
+   expanded?: boolean,
+   enabled?: boolean,
+   text_value?: string
+ }
+ ```
+
+ Notes:
+
+ - Additive only
+ - Backward compatible
+ - No breaking changes intended
+
+ Versioning approach under consideration:
+
+ - capability flag preferred over protocol fork
+
+ Illustrative:
+
+ ```json
+ {
+   "capabilities": {
+     "state_verification_v2": true
+   }
+ }
+ ```
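
A client could use such a flag to decide whether state assertions are available at all. The sketch below is hedged: the `ServerCapabilities` shape, where the flag is delivered, and the choice of `expect_element_visible` as the fallback are assumptions. Only the flag name and the two tool names appear in this diff.

```typescript
// Hypothetical handshake shape; only the flag name comes from the illustrative snippet above.
interface ServerCapabilities {
  state_verification_v2?: boolean;
}

type VerificationTool = "expect_state" | "expect_element_visible";

// Older servers omit the flag, so anything other than `true` means "not supported".
function pickVerificationTool(capabilities: ServerCapabilities | undefined): VerificationTool {
  return capabilities?.state_verification_v2 === true ? "expect_state" : "expect_element_visible";
}

const newServer = pickVerificationTool({ state_verification_v2: true });
const oldServer = pickVerificationTool({});
const noHandshake = pickVerificationTool(undefined);
```

Treating an absent flag as unsupported is what keeps the extension additive: clients written against older servers keep working unchanged.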
+
+ ---
+
+ # 7. Failure Modes
+
+ Behavior MUST be defined for:
+
+ ## Unsupported state
+
+ Example:
+ custom control exposes no readable value.
+
+ Expected behavior:
+ - omit state field
+ - agent may fall back to other evidence
+
+ ---
+
+ ## Partial state
+
+ Example:
+ slider exposes value but not enabled.
+
+ Expected behavior:
+ - partial state is valid
+ - no synthetic completion
+
+ ---
+
+ ## Stale reads
+
+ If snapshot freshness is uncertain, agents and implementations must not assume real-time synchronization.
+ (Interaction with the synchronization RFC is expected later.)
+
+ ---
+
+ # 8. Acceptance Test Vectors
+
+ Representative benchmark flows:
+
+ ## Toggle
+
+ Given:
+ - toggle off
+
+ When:
+ - agent toggles on
+
+ Then:
+ - `expect_checked` passes with expected `true`
+
+ ---
+
+ ## Slider
+
+ Given:
+ - slider at 50
+
+ When:
+ - set to 75
+
+ Then:
+ - `expect_value` verifies 75
+
+ ---
+
+ ## Dropdown
+
+ Given:
+ - current option A
+
+ When:
+ - select option B
+
+ Then:
+ - `expect_selected` verifies B
+
+ ---
+
+ ## Input
+
+ Given:
+ - editable text field
+
+ When:
+ - enter new value
+
+ Then:
+ - `text_value` reflects the update
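
The four vectors above share a Given/When/Then shape and could be expressed as a data table. The sketch below runs them against an in-memory stand-in, not a real device. `Vector`, `ControlState`, `runVectors`, and the sample text `"hello"` are hypothetical.

```typescript
type StateValue = boolean | number | string;
type ControlState = Record<string, StateValue>;

interface Vector {
  name: string;
  given: ControlState;
  when: (state: ControlState) => ControlState; // stands in for the agent's action
  property: string;
  expected: StateValue;
}

const vectors: Vector[] = [
  { name: "toggle", given: { checked: false }, when: (s) => ({ ...s, checked: true }), property: "checked", expected: true },
  { name: "slider", given: { value: 50 }, when: (s) => ({ ...s, value: 75 }), property: "value", expected: 75 },
  { name: "dropdown", given: { selected: "A" }, when: (s) => ({ ...s, selected: "B" }), property: "selected", expected: "B" },
  { name: "input", given: { text_value: "" }, when: (s) => ({ ...s, text_value: "hello" }), property: "text_value", expected: "hello" },
];

// Runs every vector against the in-memory fake and returns the names that failed.
function runVectors(all: Vector[]): string[] {
  return all
    .filter((v) => v.when(v.given)[v.property] !== v.expected)
    .map((v) => v.name);
}

const failures = runVectors(vectors);
```

A real benchmark would replace each `when` with a device action followed by a fresh snapshot; the table structure stays the same.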
+
+ ---
+
+ # 9. Acceptance Criteria
+
+ RFC-001 is considered complete when:
+
+ - Core widget readable state implemented
+ - Verification primitives implemented
+ - Representative benchmark flows passing
+ - Documentation and schema updated
+ - Roadmap done criteria satisfied
+
+ ---
+
+ # 10. Success Metrics
+
+ Target outcomes:
+
+ - 30%+ retry reduction on stateful tasks
+ - Higher first-pass verification success
+ - Reduced false-positive verifications
+
+ Measured against roadmap KPIs.
+
+ ---
+
+ # 11. Normative Requirements (v1)
+
+ Implementations conforming to RFC-001:
+
+ MUST:
+
+ - Support readable state for required v1 controls
+ - Implement `expect_state(...)`
+ - Treat raw exposed state as authoritative
+ - Preserve additive backward compatibility
+ - Gracefully omit unsupported state rather than synthesize values
+
+ SHOULD:
+
+ - Expose normalized numeric values
+ - Expose stable identifiers for selected values
+ - Support convenience expect_* aliases
+
+ MAY:
+
+ - Expose raw platform-native values alongside canonical values
+ - Extend support beyond required v1 controls
+
+ ---
+
+ # 12. Resolved Design Decisions
+
+ Former open questions resolved for RFC-001:
+
+ 1. Verification API:
+    - `expect_state(...)` is normative.
+    - expect_* forms are convenience aliases.
+
+ 2. Raw vs semantic split:
+    - Raw exposed state remains authoritative.
+    - Semantic layer remains advisory only.
+
+ 3. Numeric normalization:
+    - Canonical normalized values are normative.
+    - Raw values are optional supplemental representation.
+
+ 4. Capability negotiation:
+    - Capability flag approach retained for initial implementation.
+
+ 5. v1 widget scope:
+    - Defined by checklist in Section 4.3.
+
+ No unresolved design blockers remain for RFC-001 draft scope.
+
+ ---
+
+ # 13. Deferred To Later RFCs
+
+ Deferred to later RFCs:
+
+ - Interaction contracts
+ - Compose semantic enrichment extensions
+ - Wait/synchronization improvements
+ - Action trace correlation
+ - Gesture support
@@ -40,6 +40,7 @@ Outcome-specific guidance:
 
  - visible navigation expected -> `wait_for_screen_change` (optional) -> `expect_screen`
  - local UI change expected -> `wait_for_ui` (optional) -> `expect_element_visible`
+ - readable element state expected -> `wait_for_ui` (optional) -> `expect_state`
  - backend/API activity expected without a visible UI change -> compare `get_screen_fingerprint` before/after, then call `get_network_activity` immediately after the action and `classify_action_outcome` with the observed requests
 
  For backend/API activity, `wait_for_screen_change` is not the right verification tool unless a visible transition is also expected.
@@ -108,6 +109,7 @@ Primary:
 
  - `expect_screen`
  - `expect_element_visible`
+ - `expect_state`
 
  ### 5.2 Required Semantics
 
@@ -130,6 +132,7 @@ An `expect_*` tool is applicable when:
 
  - expected destination screen is known -> `expect_screen`
  - expected UI element state is known -> `expect_element_visible`
+ - expected readable state property is known -> `expect_state`
  - outcome is explicitly defined or testable
 
  Rules:
@@ -234,6 +237,7 @@ The semantic layer is derived, best-effort, and MUST be generated exclusively fr
  Raw layer contents include:
 
  - UI hierarchy or accessibility tree
+ - normalized readable element state where exposed by the platform
  - screenshot when available
  - element-level attributes
  - logs and fingerprint/activity observations
@@ -53,6 +53,7 @@ Preferred verification:
 
  - navigation outcome known -> `expect_screen`
  - local UI change known -> `expect_element_visible`
+ - readable element state known -> `expect_state`
  - backend/API activity expected -> `classify_action_outcome` + `get_network_activity`
 
  Use `wait_for_screen_change` only when a visible transition is the expected outcome. If a button should trigger an API request but the screen should stay the same, rely on network activity and classification instead.
@@ -459,6 +460,30 @@ Notes:
 
  ---
 
+ ## expect_state
+
+ Deterministically verify a readable state property on a visible element.
+
+ Input:
+
+ ```json
+ {
+   "selector": { "text": "Notifications" },
+   "property": "checked",
+   "expected": true,
+   "platform": "android",
+   "deviceId": "emulator-5554"
+ }
+ ```
+
+ Notes:
+
+ - Use this when the element is visible but its state also matters.
+ - Supported properties include `checked`, `selected`, `focused`, `expanded`, `enabled`, `text_value`, `value`, and `raw_value`.
+ - The tool compares normalized state and returns the observed value when available.
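
A caller might validate the property name before sending a request. The sketch below mirrors the input example above. The property list is taken from the notes, while `ExpectStateInput`, `isStateProperty`, `buildExpectState`, the `"ios"` platform literal, and the hard-coded Android platform are assumptions made for illustration.

```typescript
// Property names are taken from the notes above; everything else is a hypothetical helper.
const SUPPORTED_PROPERTIES = [
  "checked", "selected", "focused", "expanded", "enabled", "text_value", "value", "raw_value",
] as const;

type StateProperty = (typeof SUPPORTED_PROPERTIES)[number];

interface ExpectStateInput {
  selector: { text: string };
  property: StateProperty;
  expected: boolean | number | string;
  platform: "android" | "ios"; // "ios" is an assumed literal
  deviceId: string;
}

function isStateProperty(name: string): name is StateProperty {
  return (SUPPORTED_PROPERTIES as readonly string[]).indexOf(name) !== -1;
}

// Builds a request, rejecting property names the tool does not list.
function buildExpectState(
  text: string,
  property: string,
  expected: boolean | number | string,
  deviceId: string,
): ExpectStateInput {
  if (!isStateProperty(property)) {
    throw new Error(`unsupported state property: ${property}`);
  }
  return { selector: { text }, property, expected, platform: "android", deviceId };
}

const request = buildExpectState("Notifications", "checked", true, "emulator-5554");
```

Rejecting unknown property names on the client side gives a clearer error than a failed assertion on the device.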
+
+ ---
+
  ## classify_action_outcome + get_network_activity
 
  Use this pair when the action is expected to trigger network/backend work and the screen may not visibly change.
@@ -83,11 +83,12 @@ Input:
  Response (example):
 
  ```json
- { "device": { "platform": "android", "id": "emulator-5554" }, "screen": "", "resolution": { "width": 1080, "height": 2400 }, "elements": [ { "text": "Sign in", "type": "android.widget.Button", "resourceId": "com.example:id/signin", "clickable": true, "bounds": [0,0,100,50] } ] }
+ { "device": { "platform": "android", "id": "emulator-5554" }, "screen": "", "resolution": { "width": 1080, "height": 2400 }, "elements": [ { "text": "Sign in", "type": "android.widget.Button", "resourceId": "com.example:id/signin", "clickable": true, "bounds": [0,0,100,50], "state": { "enabled": true } } ] }
  ```
 
  Notes:
  - Useful for inspection, selector development, and fallback debugging.
+ - Elements may include a normalized `state` object when the platform exposes readable state such as checked, selected, focused, expanded, text input, or slider values.
  - Prefer `wait_for_ui` for deterministic element resolution in interactive flows.
 
  ---
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "mobile-debug-mcp",
- "version": "0.24.7",
+ "version": "0.25.0",
  "description": "MCP server for mobile app debugging (Android + iOS), with focus on security and reliability",
  "type": "module",
  "bin": {