mobile-debug-mcp 0.25.1 → 0.26.1
- package/dist/interact/classify.js +48 -11
- package/dist/interact/index.js +113 -0
- package/dist/observe/android.js +10 -1
- package/dist/observe/index.js +19 -1
- package/dist/observe/ios.js +15 -1
- package/dist/observe/snapshot-metadata.js +88 -0
- package/dist/server/tool-definitions.js +49 -14
- package/dist/server/tool-handlers.js +12 -0
- package/dist/server-core.js +1 -1
- package/docs/CHANGELOG.md +9 -0
- package/docs/ROADMAP.md +66 -38
- package/docs/rfcs/003-wait-and-synchronization-reliability.md +296 -0
- package/docs/rfcs/004-action-verification-routing.md +342 -0
- package/docs/specs/mcp-tooling-spec-v1.md +11 -3
- package/docs/tools/interact.md +31 -8
- package/docs/tools/observe.md +4 -2
- package/package.json +1 -1
- package/skills/rfc-review/SKILL.md +52 -0
- package/skills/rfc-review/references/rfc-review-checklist.md +12 -0
- package/skills/rfc-review/references/rfc-review-template.md +28 -0
- package/src/interact/classify.ts +53 -13
- package/src/interact/index.ts +151 -0
- package/src/observe/android.ts +11 -1
- package/src/observe/index.ts +26 -1
- package/src/observe/ios.ts +28 -13
- package/src/observe/snapshot-metadata.ts +107 -0
- package/src/server/tool-definitions.ts +49 -14
- package/src/server/tool-handlers.ts +13 -0
- package/src/server-core.ts +1 -1
- package/src/types.ts +23 -0
- package/test/unit/interact/classify_action_outcome.test.ts +44 -25
- package/test/unit/interact/wait_for_ui_change.test.ts +76 -0
- package/test/unit/server/contract.test.ts +8 -6
- package/test/unit/server/response_shapes.test.ts +37 -3
- package/docs/rfcs/003-wait-and-synchronization-reliability +0 -232
package/docs/ROADMAP.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
|
-
# Mobile Debug MCP
|
|
1
|
+
# Mobile Debug MCP Roadmap
|
|
2
2
|
|
|
3
|
-
##
|
|
3
|
+
## Planning Principles
|
|
4
4
|
|
|
5
5
|
Ordered by:
|
|
6
6
|
|
|
@@ -26,33 +26,45 @@ Higher task success with fewer retries.
|
|
|
26
26
|
|
|
27
27
|
---
|
|
28
28
|
|
|
29
|
-
#
|
|
29
|
+
# Roadmap Status Overview
|
|
30
30
|
|
|
31
|
-
|
|
31
|
+
## Completed Foundations
|
|
32
32
|
|
|
33
|
-
|
|
34
|
-
|
|
33
|
+
| Capability | Status | Notes |
|
|
34
|
+
|-----------|--------|-------|
|
|
35
|
+
| Stronger State Verification | Complete | Foundational verification layer shipped |
|
|
36
|
+
| Richer Element Identity | Complete | Identity and selector confidence foundations shipped |
|
|
37
|
+
|
|
38
|
+
## Current Focus
|
|
39
|
+
|
|
40
|
+
- Wait and Synchronization Reliability
|
|
41
|
+
|
|
42
|
+
## Upcoming Work
|
|
43
|
+
|
|
44
|
+
- Long Press Gesture
|
|
45
|
+
- Better Compose / Custom Control Semantics
|
|
35
46
|
|
|
36
|
-
|
|
47
|
+
## Later Horizon
|
|
37
48
|
|
|
38
|
-
-
|
|
39
|
-
-
|
|
49
|
+
- Pinch to Zoom
|
|
50
|
+
- Action Trace Correlation
|
|
40
51
|
|
|
41
52
|
---
|
|
42
53
|
|
|
43
|
-
#
|
|
54
|
+
# Stronger State Verification
|
|
44
55
|
|
|
45
56
|
## Why first
|
|
46
57
|
Highest leverage improvement.
|
|
47
58
|
|
|
48
|
-
**Status:** Completed
|
|
59
|
+
**Status:** Completed
|
|
60
|
+
**Priority:** P1
|
|
49
61
|
|
|
50
62
|
Most failures are not “can’t act,” they’re:
|
|
51
63
|
- uncertain state
|
|
52
64
|
- weak verification
|
|
53
65
|
- retry loops caused by inference
|
|
54
66
|
|
|
55
|
-
##
|
|
67
|
+
## Scope
|
|
56
68
|
- Direct readable control values
|
|
57
69
|
- Expanded `expect_*` verification
|
|
58
70
|
- Move from inference to state introspection
|
|
@@ -60,7 +72,7 @@ Most failures are not “can’t act,” they’re:
|
|
|
60
72
|
## Expected Impact
|
|
61
73
|
Very high.
|
|
62
74
|
|
|
63
|
-
##
|
|
75
|
+
## Exit Criteria
|
|
64
76
|
- Control state readable for core widgets (toggle, slider, input, dropdown)
|
|
65
77
|
- New expect_* state verifiers implemented
|
|
66
78
|
- Agents can verify state without visual inference in representative flows
|
|
@@ -79,19 +91,20 @@ Blocks or strengthens:
|
|
|
79
91
|
|
|
80
92
|
---
|
|
81
93
|
|
|
82
|
-
#
|
|
94
|
+
# Richer Element Identity
|
|
83
95
|
|
|
84
96
|
## Why second
|
|
85
97
|
Directly reduces selector brittleness.
|
|
86
98
|
|
|
87
|
-
**Status:** Completed
|
|
99
|
+
**Status:** Completed
|
|
100
|
+
**Priority:** P2
|
|
88
101
|
|
|
89
102
|
Improves:
|
|
90
103
|
- targeting stability
|
|
91
104
|
- repeatability
|
|
92
105
|
- agent confidence
|
|
93
106
|
|
|
94
|
-
##
|
|
107
|
+
## Scope
|
|
95
108
|
- Stable IDs / test tags prioritization
|
|
96
109
|
- Selector confidence metadata
|
|
97
110
|
- Preferred selector hierarchy
|
|
@@ -99,7 +112,7 @@ Improves:
|
|
|
99
112
|
## Expected Impact
|
|
100
113
|
Very high.
|
|
101
114
|
|
|
102
|
-
##
|
|
115
|
+
## Exit Criteria
|
|
103
116
|
- Stable selector preference order implemented
|
|
104
117
|
- Test tags/resource IDs surfaced where available
|
|
105
118
|
- Selector confidence metadata available
|
|
@@ -118,18 +131,21 @@ Blocks or strengthens:
|
|
|
118
131
|
|
|
119
132
|
---
|
|
120
133
|
|
|
121
|
-
#
|
|
134
|
+
# Wait and Synchronization Reliability
|
|
122
135
|
|
|
123
136
|
## Why third
|
|
124
137
|
Reliable async synchronization is foundational for agent success and should precede gesture expansion.
|
|
125
138
|
|
|
139
|
+
**Status:** Spec Ready
|
|
140
|
+
**Priority:** P3
|
|
141
|
+
|
|
126
142
|
Addresses failures where agents:
|
|
127
143
|
- skip UI waits after actions
|
|
128
144
|
- rely on network/log signals too early
|
|
129
145
|
- struggle with in-place UI updates
|
|
130
146
|
- misread stale UI snapshots
|
|
131
147
|
|
|
132
|
-
##
|
|
148
|
+
## Scope
|
|
133
149
|
- UI-first synchronization policy guidance
|
|
134
150
|
- wait_for_ui_change (hierarchy diff based waiting)
|
|
135
151
|
- Structured loading state detection
|
|
@@ -139,7 +155,7 @@ Addresses failures where agents:
|
|
|
139
155
|
## Expected Impact
|
|
140
156
|
Very high.
|
|
141
157
|
|
|
142
|
-
##
|
|
158
|
+
## Exit Criteria
|
|
143
159
|
- wait_for_ui_change implemented
|
|
144
160
|
- Loading state detection available for representative controls
|
|
145
161
|
- Snapshot revision or staleness metadata exposed
|
|
@@ -163,11 +179,14 @@ Blocks or strengthens:
|
|
|
163
179
|
|
|
164
180
|
---
|
|
165
181
|
|
|
166
|
-
#
|
|
182
|
+
# Long Press Gesture
|
|
167
183
|
|
|
168
184
|
## Why fourth
|
|
169
185
|
High utility, relatively low complexity.
|
|
170
186
|
|
|
187
|
+
**Status:** Planned
|
|
188
|
+
**Priority:** P4
|
|
189
|
+
|
|
171
190
|
Unlocks many currently awkward interactions:
|
|
172
191
|
|
|
173
192
|
- context menus
|
|
@@ -177,7 +196,7 @@ Unlocks many currently awkward interactions:
|
|
|
177
196
|
|
|
178
197
|
Broad usefulness.
|
|
179
198
|
|
|
180
|
-
##
|
|
199
|
+
## Scope
|
|
181
200
|
New tool:
|
|
182
201
|
|
|
183
202
|
```json
|
|
@@ -191,7 +210,7 @@ Verification alignment:
|
|
|
191
210
|
## Expected Impact
|
|
192
211
|
High.
|
|
193
212
|
|
|
194
|
-
##
|
|
213
|
+
## Exit Criteria
|
|
195
214
|
- long_press tool implemented across supported platforms
|
|
196
215
|
- Duration defaults and overrides supported
|
|
197
216
|
- Verification patterns for long press outcomes defined
|
|
@@ -211,18 +230,21 @@ Strengthens:
|
|
|
211
230
|
|
|
212
231
|
---
|
|
213
232
|
|
|
214
|
-
#
|
|
233
|
+
# Better Compose / Custom Control Semantics
|
|
215
234
|
|
|
216
235
|
## Why fifth
|
|
217
236
|
Important, but strengthened by priorities 1–4 first.
|
|
218
237
|
|
|
238
|
+
**Status:** Planned
|
|
239
|
+
**Priority:** P5
|
|
240
|
+
|
|
219
241
|
Semantics become more useful once:
|
|
220
242
|
- identity is stronger
|
|
221
243
|
- verification is stronger
|
|
222
244
|
- gestures are richer
|
|
223
245
|
- synchronization is more reliable
|
|
224
246
|
|
|
225
|
-
##
|
|
247
|
+
## Scope
|
|
226
248
|
- Composite control traits
|
|
227
249
|
- Control role enrichment (adjustable, expandable, selectable_group)
|
|
228
250
|
- Interaction contracts metadata
|
|
@@ -233,7 +255,7 @@ Semantics become more useful once:
|
|
|
233
255
|
## Expected Impact
|
|
234
256
|
High.
|
|
235
257
|
|
|
236
|
-
##
|
|
258
|
+
## Exit Criteria
|
|
237
259
|
- Semantic traits implemented for major custom control classes
|
|
238
260
|
- Interaction contracts surfaced in snapshot model
|
|
239
261
|
- Confidence model defined for derived semantics
|
|
@@ -253,11 +275,14 @@ Depends on:
|
|
|
253
275
|
|
|
254
276
|
---
|
|
255
277
|
|
|
256
|
-
#
|
|
278
|
+
# Pinch to Zoom
|
|
257
279
|
|
|
258
280
|
## Why sixth
|
|
259
281
|
Valuable, but narrower than long press.
|
|
260
282
|
|
|
283
|
+
**Status:** Planned
|
|
284
|
+
**Priority:** P6
|
|
285
|
+
|
|
261
286
|
Applies mainly to:
|
|
262
287
|
- maps
|
|
263
288
|
- images
|
|
@@ -266,7 +291,7 @@ Applies mainly to:
|
|
|
266
291
|
|
|
267
292
|
Useful, but less universal.
|
|
268
293
|
|
|
269
|
-
##
|
|
294
|
+
## Scope
|
|
270
295
|
|
|
271
296
|
```json
|
|
272
297
|
pinch_to_zoom(target, scale, center?)
|
|
@@ -279,7 +304,7 @@ Verification:
|
|
|
279
304
|
## Expected Impact
|
|
280
305
|
Medium-high.
|
|
281
306
|
|
|
282
|
-
##
|
|
307
|
+
## Exit Criteria
|
|
283
308
|
- pinch_to_zoom implemented
|
|
284
309
|
- Zoom in/out flows supported
|
|
285
310
|
- Verification primitives for viewport or zoom state available
|
|
@@ -297,22 +322,25 @@ Depends on:
|
|
|
297
322
|
|
|
298
323
|
---
|
|
299
324
|
|
|
300
|
-
#
|
|
325
|
+
# Action Trace Correlation
|
|
301
326
|
|
|
302
327
|
## Why seventh
|
|
303
328
|
Very valuable for debugging,
|
|
304
329
|
but less critical than improving control success first.
|
|
305
330
|
|
|
331
|
+
**Status:** Planned
|
|
332
|
+
**Priority:** P7
|
|
333
|
+
|
|
306
334
|
Improves diagnosis more than task completion.
|
|
307
335
|
|
|
308
|
-
##
|
|
336
|
+
## Scope
|
|
309
337
|
- Action correlation metadata
|
|
310
338
|
- UI/network/log linkage
|
|
311
339
|
|
|
312
340
|
## Expected Impact
|
|
313
341
|
Medium-high.
|
|
314
342
|
|
|
315
|
-
##
|
|
343
|
+
## Exit Criteria
|
|
316
344
|
- Action correlation model defined
|
|
317
345
|
- UI/network/log linkage captured for representative actions
|
|
318
346
|
- Correlation metadata exposed to agents
|
|
@@ -331,7 +359,7 @@ Depends on:
|
|
|
331
359
|
|
|
332
360
|
---
|
|
333
361
|
|
|
334
|
-
#
|
|
362
|
+
# Roadmap Sequence
|
|
335
363
|
|
|
336
364
|
## Dependency Summary
|
|
337
365
|
Foundational sequence:
|
|
@@ -351,7 +379,7 @@ Layer 3 (Interaction Expansion)
|
|
|
351
379
|
Layer 4 (Observability)
|
|
352
380
|
- Priority 7 depends on 1,2,3
|
|
353
381
|
|
|
354
|
-
## Wave 1 (
|
|
382
|
+
## Wave 1 (Current Focus)
|
|
355
383
|
- Stronger State Verification
|
|
356
384
|
- Richer Element Identity
|
|
357
385
|
- Wait and Synchronization Reliability
|
|
@@ -361,7 +389,7 @@ Make core loop more reliable.
|
|
|
361
389
|
|
|
362
390
|
---
|
|
363
391
|
|
|
364
|
-
## Wave 2
|
|
392
|
+
## Wave 2 (Expansion)
|
|
365
393
|
- Long Press
|
|
366
394
|
- Better Compose Semantics
|
|
367
395
|
|
|
@@ -370,7 +398,7 @@ Expand interaction capability.
|
|
|
370
398
|
|
|
371
399
|
---
|
|
372
400
|
|
|
373
|
-
## Wave 3
|
|
401
|
+
## Wave 3 (Advanced)
|
|
374
402
|
- Pinch to Zoom
|
|
375
403
|
- Action Trace Correlation
|
|
376
404
|
|
|
@@ -379,7 +407,7 @@ Advanced gestures + observability.
|
|
|
379
407
|
|
|
380
408
|
---
|
|
381
409
|
|
|
382
|
-
#
|
|
410
|
+
# Capability Sequence
|
|
383
411
|
|
|
384
412
|
Execution Order:
|
|
385
413
|
1. Stronger State Verification
|
|
@@ -397,7 +425,7 @@ Rationale:
|
|
|
397
425
|
|
|
398
426
|
---
|
|
399
427
|
|
|
400
|
-
##
|
|
428
|
+
## Future Considerations
|
|
401
429
|
Still out of scope:
|
|
402
430
|
|
|
403
431
|
- Recovery planning logic
|
|
@@ -0,0 +1,296 @@
|
|
|
1
|
+
# RFC-003: Wait and Synchronization Reliability
|
|
2
|
+
|
|
3
|
+
Priority: 3
|
|
4
|
+
Depends on: RFC-001 (Stronger State Verification), RFC-002 (Platform-Native Element Metadata and Resolution Hints)
|
|
5
|
+
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
# 1. Problem
|
|
9
|
+
|
|
10
|
+
Agents can often identify the right element (RFC-002) and verify the right state (RFC-001), but still fail because they act before the UI has reached the intended post-action state.
|
|
11
|
+
|
|
12
|
+
This causes:
|
|
13
|
+
|
|
14
|
+
- retries caused by racing the UI
|
|
15
|
+
- false failures from stale snapshots
|
|
16
|
+
- overuse of network/log verification when UI evidence should suffice
|
|
17
|
+
- flakiness in asynchronous and in-place update flows
|
|
18
|
+
- unreliable behaviour in Compose-heavy or thin accessibility trees
|
|
19
|
+
|
|
20
|
+
Current system limitations:
|
|
21
|
+
|
|
22
|
+
- wait_for_ui is underused after actions involving async state changes
|
|
23
|
+
- current waits focus on expected elements appearing, not general UI transition detection
|
|
24
|
+
- snapshot staleness is not explicitly surfaced
|
|
25
|
+
- loading state transitions are inconsistently observable
|
|
26
|
+
|
|
27
|
+
---
|
|
28
|
+
|
|
29
|
+
# 2. Goals
|
|
30
|
+
|
|
31
|
+
This RFC introduces:
|
|
32
|
+
|
|
33
|
+
1. UI-first synchronization policy after actions
|
|
34
|
+
2. Snapshot staleness and revision metadata
|
|
35
|
+
3. UI-change based waiting for in-place updates
|
|
36
|
+
4. Structured loading-state detection
|
|
37
|
+
5. Compose-aware synchronization hints
|
|
38
|
+
|
|
39
|
+
Success goals:
|
|
40
|
+
|
|
41
|
+
- reduce retries caused by premature actions
|
|
42
|
+
- increase successful post-action verification
|
|
43
|
+
- reduce unnecessary fallbacks to logs/network checks
|
|
44
|
+
- improve reliability in asynchronous UI flows
|
|
45
|
+
|
|
46
|
+
---
|
|
47
|
+
|
|
48
|
+
# 3. Non-Goals
|
|
49
|
+
|
|
50
|
+
This RFC does not:
|
|
51
|
+
|
|
52
|
+
- redefine state verification semantics (RFC-001)
|
|
53
|
+
- redefine element identity contracts (RFC-002)
|
|
54
|
+
- add new interaction primitives (long press, pinch, etc.)
|
|
55
|
+
- replace network or log verification where no UI outcome exists
|
|
56
|
+
|
|
57
|
+
---
|
|
58
|
+
|
|
59
|
+
# 4. Proposed Model
|
|
60
|
+
|
|
61
|
+
## 4.1 UI-First Synchronization Contract (v1)
|
|
62
|
+
|
|
63
|
+
Default post-action flow SHOULD be:
|
|
64
|
+
|
|
65
|
+
```text
|
|
66
|
+
action
|
|
67
|
+
→ wait_for_ui(expected outcome)
|
|
68
|
+
→ verify state
|
|
69
|
+
→ only fall back to network/logs when no UI outcome exists or wait fails
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
Tool-level contract:
|
|
73
|
+
|
|
74
|
+
- After actions expected to cause visible UI changes, agents SHOULD invoke wait_for_ui or wait_for_ui_change before verification.
|
|
75
|
+
- wait_for_ui SHOULD be used when an expected element or explicit outcome is known.
|
|
76
|
+
- wait_for_ui_change SHOULD be used for in-place mutations where a specific element target is not known.
|
|
77
|
+
- wait_for_screen_change SHOULD remain preferred for full navigation transitions when available.
|
|
78
|
+
|
|
79
|
+
Rules:
|
|
80
|
+
|
|
81
|
+
- UI evidence MUST be preferred over network or log evidence when a UI outcome is expected.
|
|
82
|
+
- Actions that trigger navigation, async mutation, or visible state changes SHOULD be followed by a wait.
|
|
83
|
+
- Network/log checks are fallback signals, not primary synchronization mechanisms.
|
|
84
|
+
- This synchronization order is normative tool behavior for agents, not advisory prose.
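The tool-selection rules above can be read as a small decision function. The following is a minimal sketch under stated assumptions: `ActionContext`, `WaitPrimitive`, and `chooseWait` are hypothetical names introduced for illustration and are not part of the package; only the three wait tool names come from the RFC text.

```typescript
// Hypothetical illustration of the Section 4.1 selection rules.
// ActionContext, WaitPrimitive and chooseWait are assumed names, not package APIs.
type WaitPrimitive =
  | "wait_for_screen_change"
  | "wait_for_ui"
  | "wait_for_ui_change"
  | "none";

interface ActionContext {
  expectsUiOutcome: boolean;        // does the action have any visible result?
  isNavigation: boolean;            // is a full navigation transition expected?
  knownTarget: boolean;             // is an expected element or explicit outcome known?
  screenChangeAvailable?: boolean;  // assumed to default to true when omitted
}

function chooseWait(ctx: ActionContext): WaitPrimitive {
  // No UI outcome expected: network/log evidence may be primary, so no UI wait.
  if (!ctx.expectsUiOutcome) return "none";
  // Navigation-level transitions prefer wait_for_screen_change when available.
  if (ctx.isNavigation && ctx.screenChangeAvailable !== false) {
    return "wait_for_screen_change";
  }
  // A known element or explicit outcome calls for wait_for_ui.
  if (ctx.knownTarget) return "wait_for_ui";
  // Otherwise treat the action as an in-place mutation with no specific target.
  return "wait_for_ui_change";
}
```

Under these assumptions, an in-place toggle with no known target element maps to `wait_for_ui_change`, while the same action with a known confirmation label maps to `wait_for_ui`.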
|
|
85
|
+
|
|
86
|
+
---
|
|
87
|
+
|
|
88
|
+
## 4.2 Snapshot Revision Contract
|
|
89
|
+
|
|
90
|
+
All snapshot responses MUST include revision metadata.
|
|
91
|
+
|
|
92
|
+
Emission scope:
|
|
93
|
+
|
|
94
|
+
- snapshot_revision and captured_at_ms MUST be emitted on snapshot responses.
|
|
95
|
+
- get_ui_tree responses SHOULD emit the same fields when backed by the same snapshot generation layer.
|
|
96
|
+
- If both surfaces exist, revision values MUST be consistent across them when derived from the same underlying snapshot.
|
|
97
|
+
|
|
98
|
+
Required snapshot envelope:
|
|
99
|
+
|
|
100
|
+
```json
|
|
101
|
+
{
|
|
102
|
+
"snapshot_revision": 184,
|
|
103
|
+
"captured_at_ms": 1714452012301
|
|
104
|
+
}
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
Field requirements:
|
|
108
|
+
|
|
109
|
+
- snapshot_revision REQUIRED on every snapshot response.
|
|
110
|
+
- captured_at_ms REQUIRED on every snapshot response.
|
|
111
|
+
|
|
112
|
+
Source of truth:
|
|
113
|
+
|
|
114
|
+
- snapshot_revision originates in the snapshot generation layer.
|
|
115
|
+
- It MUST increment when a meaningful hierarchy delta is detected.
|
|
116
|
+
- Cosmetic-only changes MUST NOT increment revision.
|
|
117
|
+
|
|
118
|
+
Meaningful deltas include:
|
|
119
|
+
|
|
120
|
+
- node added or removed
|
|
121
|
+
- visible text mutation
|
|
122
|
+
- control state change
|
|
123
|
+
- list content mutation
|
|
124
|
+
- navigation or view transition
|
|
125
|
+
|
|
126
|
+
Cosmetic churn examples (MUST NOT increment):
|
|
127
|
+
|
|
128
|
+
- cursor blink
|
|
129
|
+
- focus-only changes
|
|
130
|
+
- animation-only transitions
|
|
131
|
+
- timestamp or unrelated ephemeral text changes
|
|
132
|
+
|
|
133
|
+
Rules:
|
|
134
|
+
|
|
135
|
+
- Agents SHOULD use revision changes as synchronization signals.
|
|
136
|
+
- Stale revisions SHOULD trigger reacquisition before verification.
|
|
137
|
+
- This extends the snapshot response contract defined by RFC-002.
|
|
138
|
+
|
|
139
|
+
- Snapshot responses are the normative required emission surface; get_ui_tree emission is recommended for consistency.
|
|
140
|
+
- snapshot_revision MUST be monotonically increasing within a session.
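One way to picture the revision rules is a tracker that fingerprints only the meaningful parts of the hierarchy and bumps the revision when that fingerprint changes. This is a sketch, not the package's implementation: `UiNode`, `fingerprint`, and `RevisionTracker` are hypothetical, and the node fields are assumptions chosen to mirror the meaningful and cosmetic categories listed above.

```typescript
// Hypothetical sketch of the Section 4.2 revision rules.
// UiNode, fingerprint and RevisionTracker are assumed names, not package APIs.
interface UiNode {
  id: string;
  text?: string;      // visible text (meaningful)
  state?: string;     // control state such as "checked" (meaningful)
  focused?: boolean;  // focus-only changes are cosmetic and ignored below
  children?: UiNode[];
}

// Serialize only structure, text and control state; focus is left out on purpose.
function fingerprint(node: UiNode): string {
  return JSON.stringify([
    node.id,
    node.text ?? "",
    node.state ?? "",
    (node.children ?? []).map((child) => fingerprint(child)),
  ]);
}

class RevisionTracker {
  private revision = 0;
  private lastFingerprint: string | null = null;

  // Returns the envelope fields. The revision only ever increases within
  // the lifetime of one tracker, which stands in for a session here.
  capture(root: UiNode, nowMs: number): { snapshot_revision: number; captured_at_ms: number } {
    const current = fingerprint(root);
    if (current !== this.lastFingerprint) {
      this.revision += 1;
      this.lastFingerprint = current;
    }
    return { snapshot_revision: this.revision, captured_at_ms: nowMs };
  }
}
```

This sketch cannot tell a meaningful text mutation from ephemeral text such as a clock, so a real implementation would need some additional filter for that case.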
|
|
141
|
+
|
|
142
|
+
---
|
|
143
|
+
|
|
144
|
+
## 4.3 wait_for_ui_change API
|
|
145
|
+
|
|
146
|
+
Concrete API contract:
|
|
147
|
+
|
|
148
|
+
```ts
|
|
149
|
+
wait_for_ui_change({
|
|
150
|
+
expected_change?: "hierarchy_diff" | "text_change" | "state_change",
|
|
151
|
+
timeout_ms?: number,
|
|
152
|
+
stability_window_ms?: number
|
|
153
|
+
}) => {
|
|
154
|
+
success: boolean,
|
|
155
|
+
observed_change: "hierarchy_diff" | "text_change" | "state_change" | null,
|
|
156
|
+
snapshot_revision?: number,
|
|
157
|
+
timeout: boolean
|
|
158
|
+
}
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
Relationship to other wait primitives:
|
|
162
|
+
|
|
163
|
+
- wait_for_screen_change remains the preferred primitive for navigation-level transitions.
|
|
164
|
+
- wait_for_ui_change is the preferred primitive for non-navigation UI mutations and in-place updates.
|
|
165
|
+
- wait_for_ui_change is additive to wait_for_screen_change, not a replacement for it.
|
|
166
|
+
|
|
167
|
+
Rules:
|
|
168
|
+
|
|
169
|
+
- stability_window_ms is how long a detected change must remain stable before the wait reports success.
|
|
170
|
+
- Meaningful delta semantics are inherited from Section 4.2.
|
|
171
|
+
- wait_for_ui_change complements wait_for_ui; it does not replace it.
|
|
172
|
+
|
|
173
|
+
- Agents SHOULD prefer wait_for_screen_change for navigation and wait_for_ui_change for non-navigation changes.
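To make the stability window concrete, the sketch below evaluates a list of already-polled revision samples rather than polling a device. All names (`Sample`, `evaluateWait`, and its parameters) are hypothetical. The sketch only sees revision numbers, so it always reports `"hierarchy_diff"` as the observed change; classifying text or state changes is out of its scope.

```typescript
// Hypothetical sketch of the stability-window rule in Section 4.3.
// Sample and evaluateWait are assumed names, not package APIs.
type ChangeKind = "hierarchy_diff" | "text_change" | "state_change";

interface Sample {
  atMs: number;      // when the snapshot was polled
  revision: number;  // its snapshot_revision
}

interface WaitResult {
  success: boolean;
  observed_change: ChangeKind | null;
  snapshot_revision?: number;
  timeout: boolean;
}

function evaluateWait(
  baselineRevision: number,
  samples: Sample[],          // polled snapshots, ordered by atMs
  startMs: number,
  timeoutMs: number,
  stabilityWindowMs: number
): WaitResult {
  let candidate = baselineRevision;
  let candidateSinceMs = startMs;
  for (const s of samples) {
    if (s.atMs - startMs > timeoutMs) break; // past the deadline
    if (s.revision !== candidate) {
      // A newer revision restarts the stability clock.
      candidate = s.revision;
      candidateSinceMs = s.atMs;
    }
    const changed = candidate !== baselineRevision;
    if (changed && s.atMs - candidateSinceMs >= stabilityWindowMs) {
      return {
        success: true,
        observed_change: "hierarchy_diff", // assumption: no finer classification here
        snapshot_revision: candidate,
        timeout: false,
      };
    }
  }
  // Running out of samples is treated the same as hitting the deadline.
  return { success: false, observed_change: null, timeout: true };
}
```

With a window of 0 ms the first changed revision succeeds immediately. A longer window makes the wait ride out bursts of intermediate revisions before it reports success.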
|
|
174
|
+
|
|
175
|
+
---
|
|
176
|
+
|
|
177
|
+
## 4.4 Structured Loading-State Contract
|
|
178
|
+
|
|
179
|
+
Loading signals are OPTIONAL overall, but when a detectable loading signal exists they SHOULD be surfaced on snapshot responses and UI tree responses, and if emitted they MUST conform to the contract below.
|
|
180
|
+
|
|
181
|
+
Required shape:
|
|
182
|
+
|
|
183
|
+
```json
|
|
184
|
+
{
|
|
185
|
+
"loading_state": {
|
|
186
|
+
"active": true,
|
|
187
|
+
"signal": "progress_indicator",
|
|
188
|
+
"source": "snapshot"
|
|
189
|
+
}
|
|
190
|
+
}
|
|
191
|
+
```
|
|
192
|
+
|
|
193
|
+
Required fields:
|
|
194
|
+
|
|
195
|
+
- active
|
|
196
|
+
- signal
|
|
197
|
+
- source
|
|
198
|
+
|
|
199
|
+
Rules:
|
|
200
|
+
|
|
201
|
+
- Loading signals are synchronization hints only.
|
|
202
|
+
- Loading completion MUST NOT alone be treated as success.
|
|
203
|
+
- If emitted, the shape above MUST be used.
|
|
204
|
+
- Absence of loading_state is valid when no reliable loading signal is detectable; malformed or partial loading_state emission is not valid.
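The "absent is valid, partial is not" rule can be sketched as a small validator. The names `LoadingState`, `isLoadingState`, and `checkLoadingState` are hypothetical. The field types (boolean, string, string) are an assumption inferred from the example payload above, since the RFC does not state them explicitly.

```typescript
// Hypothetical validator for the Section 4.4 loading_state shape.
// Field types are inferred from the RFC's example, not specified by it.
interface LoadingState {
  active: boolean;
  signal: string;
  source: string;
}

function isLoadingState(value: unknown): value is LoadingState {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.active === "boolean" &&
    typeof v.signal === "string" &&
    typeof v.source === "string"
  );
}

// Absence is valid; a present but malformed or partial loading_state is not.
function checkLoadingState(
  response: { loading_state?: unknown }
): "absent" | "valid" | "invalid" {
  if (response.loading_state === undefined) return "absent";
  return isLoadingState(response.loading_state) ? "valid" : "invalid";
}
```

A consumer could treat `"invalid"` as a contract violation to report, and treat `"absent"` as the absence of a hint. An absent `loading_state` is not evidence that loading has finished.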
|
|
205
|
+
|
|
206
|
+
---
|
|
207
|
+
|
|
208
|
+
## 4.5 Compose-Aware Synchronization Hints
|
|
209
|
+
|
|
210
|
+
For Compose or thin accessibility structures:
|
|
211
|
+
|
|
212
|
+
Systems SHOULD support:
|
|
213
|
+
|
|
214
|
+
- merged semantic node changes as wait signals
|
|
215
|
+
- text mutations within existing nodes
|
|
216
|
+
- in-place recomposition awareness
|
|
217
|
+
|
|
218
|
+
These are synchronization hints layered on top of standard wait behaviour.
|
|
219
|
+
|
|
220
|
+
---
|
|
221
|
+
|
|
222
|
+
# 5. Failure Modes
|
|
223
|
+
|
|
224
|
+
## 5.1 Premature Action Progression
|
|
225
|
+
|
|
226
|
+
If an action is followed immediately by verification without waiting:
|
|
227
|
+
|
|
228
|
+
- system SHOULD bias toward suggesting wait_for_ui
|
|
229
|
+
- retries SHOULD prefer synchronization correction before repeated action execution
|
|
230
|
+
|
|
231
|
+
---
|
|
232
|
+
|
|
233
|
+
## 5.2 Stale Snapshot Reads
|
|
234
|
+
|
|
235
|
+
If verification uses an old snapshot:
|
|
236
|
+
|
|
237
|
+
- revision metadata SHOULD expose staleness
|
|
238
|
+
- agents SHOULD reacquire snapshot before retrying verification
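An agent-side staleness check might look like the sketch below. `SnapshotMeta` and `isStale` are hypothetical names, and the optional age limit `maxAgeMs` is an assumption added for illustration; the RFC itself defines staleness through revision metadata only.

```typescript
// Hypothetical agent-side staleness check for Section 5.2.
// isStale and maxAgeMs are assumptions, not part of the RFC or the package.
interface SnapshotMeta {
  snapshot_revision: number;
  captured_at_ms: number;
}

function isStale(
  held: SnapshotMeta,
  latestKnownRevision: number, // e.g. taken from a wait result, if one is available
  nowMs: number,
  maxAgeMs?: number            // optional age limit; omitted means revision-only
): boolean {
  // A newer revision exists, so the held snapshot no longer reflects the UI.
  if (held.snapshot_revision < latestKnownRevision) return true;
  // Optionally also treat old snapshots as stale even without a known newer revision.
  return maxAgeMs !== undefined && nowMs - held.captured_at_ms > maxAgeMs;
}
```

When the check returns true, the agent would reacquire the snapshot before retrying verification instead of repeating the action.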
|
|
239
|
+
|
|
240
|
+
---
|
|
241
|
+
|
|
242
|
+
## 5.3 No Visible UI Outcome
|
|
243
|
+
|
|
244
|
+
If no UI outcome is expected:
|
|
245
|
+
|
|
246
|
+
- network/log verification MAY be primary evidence
|
|
247
|
+
- UI-first policy does not apply rigidly
|
|
248
|
+
|
|
249
|
+
---
|
|
250
|
+
|
|
251
|
+
## 5.4 False Positive UI Change Detection
|
|
252
|
+
|
|
253
|
+
If unrelated UI churn triggers early wait completion:
|
|
254
|
+
|
|
255
|
+
- systems SHOULD reject cosmetic-only changes using Section 4.2 rules
|
|
256
|
+
- agents SHOULD prefer stability windows before considering waits satisfied
|
|
257
|
+
|
|
258
|
+
---
|
|
259
|
+
|
|
260
|
+
# 6. Acceptance Criteria
|
|
261
|
+
|
|
262
|
+
RFC-003 specification is complete when:
|
|
263
|
+
|
|
264
|
+
- Snapshot Revision Contract is fully defined and mandatory.
|
|
265
|
+
- wait_for_ui_change API contract is fully defined.
|
|
266
|
+
- Loading-State Contract required schema is defined.
|
|
267
|
+
- Synchronization tool-selection rules are explicitly specified.
|
|
268
|
+
- False-positive change handling is specified.
|
|
269
|
+
|
|
270
|
+
Implementation success is measured when:
|
|
271
|
+
|
|
272
|
+
- snapshot revisions reduce stale-read retries
|
|
273
|
+
- synchronization retries decrease
|
|
274
|
+
- post-action verification success increases
|
|
275
|
+
|
|
276
|
+
---
|
|
277
|
+
|
|
278
|
+
# 7. Success Metrics
|
|
279
|
+
|
|
280
|
+
- Fewer retries caused by timing/synchronization errors
|
|
281
|
+
- Higher post-action verification success rate
|
|
282
|
+
- Reduced unnecessary fallback to network/log evidence
|
|
283
|
+
- Improved stability in asynchronous and Compose-heavy flows
|
|
284
|
+
|
|
285
|
+
---
|
|
286
|
+
|
|
287
|
+
# 8. Deferred To Later RFCs
|
|
288
|
+
|
|
289
|
+
- Advanced subscriptions / notify-when-element-appears APIs
|
|
290
|
+
- Full action-to-UI trace correlation (Priority 7)
|
|
291
|
+
- Gesture-trigger-specific synchronization logic
|
|
292
|
+
- Element appearance subscription / notify-when-ready APIs
|
|
293
|
+
|
|
294
|
+
---
|
|
295
|
+
|
|
296
|
+
This RFC standardises temporal reliability and synchronization signals layered on top of state verification and element identity guarantees from RFC-001 and RFC-002.
|