mobile-debug-mcp 0.26.4 → 0.27.0

@@ -0,0 +1,265 @@
1
+
2
+
3
+ # RFC 010 — Verification Stabilization and Temporal Convergence
4
+
5
+ ## 1. Summary
6
+
7
+ This RFC defines a verification stabilization layer that ensures UI state transitions are not misclassified due to timing instability, transient UI states, or stale snapshots.
8
+
9
+ It introduces temporal semantics into verification so that readiness and state checks are based on convergence over time, not a single snapshot.
10
+
11
+ ---
12
+
13
+ ## 2. Problem Statement
14
+
15
+ Current verification behavior is snapshot-based and may produce false-negative failures when UI state is in transition.
16
+
17
+ Observed issues include:
18
+
19
+ - readiness checks timing out even though the UI converges shortly afterward
20
+ - stale snapshots being treated as authoritative state
21
+ - transient UI states causing premature failure classification
22
+ - mismatch between UI convergence and verification success
23
+
24
+ These issues lead to unnecessary retries, incorrect failure classification, and degraded automation reliability.
25
+
26
+ ---
27
+
28
+ ## 3. Goals
29
+
30
+ This RFC introduces a temporal verification model that MUST:
31
+
32
+ - reduce false-negative readiness failures
33
+ - ensure verification reflects stable UI convergence
34
+ - introduce bounded recheck before failure
35
+ - debounce transient mismatches
36
+ - maintain deterministic verification behavior
37
+
38
+ ---
39
+
40
+ ## 4. Non-Goals
41
+
42
+ This RFC does NOT define:
43
+
44
+ - recovery or replanning strategies (covered by a later RFC)
45
+ - probabilistic verification
46
+ - ML-based state inference
47
+ - changes to action execution semantics
48
+
49
+ Verification remains deterministic and grounded in observable UI state.
50
+
51
+ ---
52
+
53
+ ## 5. Runtime Ownership and Integration
54
+
55
+ This RFC applies to existing verification surfaces:
56
+
57
+ - `expect_*` handlers (e.g. `expect_state`)
58
+ - readiness checks in `wait_for_ui_element`
59
+ - post-action verification in `src/interact`
60
+
61
+ It augments these surfaces with temporal semantics; it does not replace them.
62
+
63
+ ### 5.1 Ownership and Composition with Existing Logic
64
+
65
+ This RFC refines existing behavior rather than introducing a parallel mechanism.
66
+
67
+ - `wait_for_ui_element` (and underlying `waitForUICore`) owns **readiness stabilization**.
68
+ - `expect_*` handlers (e.g. `expect_state`) own **state verification stabilization**.
69
+ - `src/interact` owns **post-action verification application** of these rules.
70
+
71
+ Composition rules:
72
+ - `wait_for_ui_element` MUST apply stabilization for presence/readiness before returning success or failure.
73
+ - `expect_*` MUST apply stabilization for state/value assertions.
74
+ - If both are used in sequence, `wait_for_ui_element` completes first, then `expect_*` applies its own stabilization.
75
+ - Stabilization MUST NOT be duplicated across layers for the same check.
76
+
77
+ ---
78
+
79
+ ## 6. Temporal Verification Model
80
+
81
+ Verification MUST consider state over time, not a single observation.
82
+
83
+ ### 6.1 Stabilization Window
84
+
85
+ Verification SHOULD use a bounded observation window before declaring failure.
86
+
87
+ Within this window:
88
+ - multiple UI reads MAY be performed
89
+ - transient mismatches MUST NOT immediately trigger failure
90
+
91
+ ### 6.2 Verify-Until-Stable
92
+
93
+ Verification SHOULD require state to be stable across consecutive observations before success is confirmed.
94
+
95
+ Example:
96
+ - state must match expected condition for N consecutive reads
97
+
98
+ ### 6.3 Debounce Semantics
99
+
100
+ Transient mismatches SHOULD be debounced.
101
+
102
+ Short-lived mismatches within the stabilization window MUST NOT be treated as terminal failure.
103
+
104
+ ### 6.4 Deterministic Defaults (Required)
105
+
106
+ Implementations MUST use bounded defaults unless explicitly overridden:
107
+
108
+ - `stabilization_window_ms`: 1000ms (range: 500–1500ms)
109
+ - `stable_observation_count`: 2 consecutive matching reads
110
+ - `max_recheck_attempts`: 3
111
+ - `min_read_interval_ms`: 100–200ms between reads
112
+
113
+ These values MUST be configurable but bounded to prevent unbounded waits.
114
+
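A minimal sketch of how the bounded defaults above might be applied to caller overrides. The `StabilizationConfig` type, `resolveConfig`, and `clamp` are hypothetical names, not part of the runtime. The RFC gives ranges only for the window and the read interval, so the lower bounds on the two count fields and the choice of 100ms as the interval default are assumptions.

```typescript
// Hypothetical config shape; field names follow the defaults listed in 6.4.
interface StabilizationConfig {
  stabilization_window_ms: number;
  stable_observation_count: number;
  max_recheck_attempts: number;
  min_read_interval_ms: number;
}

const DEFAULT_STABILIZATION: StabilizationConfig = {
  stabilization_window_ms: 1000,
  stable_observation_count: 2,
  max_recheck_attempts: 3,
  min_read_interval_ms: 100, // assumption: lower end of the 100–200ms range
};

// Clamp a value into the inclusive range [min, max].
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Merge overrides onto the defaults, then force every field back into bounds.
function resolveConfig(
  overrides: Partial<StabilizationConfig> = {}
): StabilizationConfig {
  const merged = { ...DEFAULT_STABILIZATION, ...overrides };
  return {
    stabilization_window_ms: clamp(merged.stabilization_window_ms, 500, 1500),
    // Assumption: at least one matching read is always required.
    stable_observation_count: Math.max(1, Math.floor(merged.stable_observation_count)),
    // Assumption: zero rechecks is the smallest meaningful value.
    max_recheck_attempts: Math.max(0, Math.floor(merged.max_recheck_attempts)),
    min_read_interval_ms: clamp(merged.min_read_interval_ms, 100, 200),
  };
}
```

Clamping rather than rejecting out-of-range overrides keeps the wait bounded even when a caller passes an extreme value; an implementation could equally choose to throw.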
115
+ ---
116
+
117
+ ## 6.5 Reference Stabilization Algorithm
118
+
119
+ For a given verification predicate `P(snapshot)`:
120
+
121
+ 1. Start timer `t0`.
122
+ 2. Initialize `stable_count = 0`, `attempts = 0`.
123
+ 3. Loop until `now - t0 > stabilization_window_ms` OR `stable_count >= stable_observation_count`:
124
+ - Read fresh snapshot `S`.
125
+ - If `P(S)` is true:
126
+ - `stable_count += 1`
127
+ Else:
128
+ - `stable_count = 0`
129
+ - `attempts += 1`
130
+ - Sleep `min_read_interval_ms`.
131
+ 4. If `stable_count >= stable_observation_count`: SUCCESS
132
+ 5. Else if `attempts < max_recheck_attempts`:
133
+ - Perform one additional fresh read and re-evaluate once.
134
+ 6. Else: FAILURE
135
+
136
+ Notes:
137
+ - Implementations MUST ensure at least one fresh read occurs before failure.
138
+ - Debounce is achieved via resetting `stable_count` on mismatch.
139
+
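The numbered steps above can be sketched as a single function. This is an illustration under stated assumptions, not the runtime's implementation: `verifyUntilStable`, `StabilizeDeps`, and the result shape are hypothetical, the clock and sleep are injected and synchronous so the sketch is deterministic (a real runtime would likely be asynchronous), and step 5 is interpreted as "succeed if the one extra read matches".

```typescript
interface StabilizeOptions {
  stabilization_window_ms: number;
  stable_observation_count: number;
  max_recheck_attempts: number;
  min_read_interval_ms: number;
}

// Injected dependencies (hypothetical): fresh reads, a millisecond clock, a wait.
interface StabilizeDeps<S> {
  readSnapshot: () => S;
  now: () => number;
  sleep: (ms: number) => void;
}

interface StabilizeResult {
  success: boolean;
  attempts: number; // mismatching reads seen inside the window
  reads: number;    // total fresh reads performed
}

function verifyUntilStable<S>(
  predicate: (snapshot: S) => boolean,
  deps: StabilizeDeps<S>,
  opts: StabilizeOptions
): StabilizeResult {
  const t0 = deps.now();          // step 1
  let stableCount = 0;            // step 2
  let attempts = 0;
  let reads = 0;

  // Step 3: loop until the window is exceeded or enough consecutive matches.
  while (
    deps.now() - t0 <= opts.stabilization_window_ms &&
    stableCount < opts.stable_observation_count
  ) {
    const snapshot = deps.readSnapshot();
    reads += 1;
    if (predicate(snapshot)) {
      stableCount += 1;
    } else {
      stableCount = 0;            // debounce: a mismatch resets the streak
      attempts += 1;
    }
    deps.sleep(opts.min_read_interval_ms);
  }

  // Step 4: stable for the required number of consecutive reads.
  if (stableCount >= opts.stable_observation_count) {
    return { success: true, attempts, reads };
  }
  // Step 5 (interpretation): one extra fresh read decides the outcome.
  if (attempts < opts.max_recheck_attempts) {
    reads += 1;
    return { success: predicate(deps.readSnapshot()), attempts, reads };
  }
  // Step 6: failure after the window and recheck budget are spent.
  return { success: false, attempts, reads };
}
```

Because the loop condition is checked before the first read, at least one fresh read always happens before a failure is returned, which matches the first note above.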
140
+ ---
141
+
142
+ ## 7. Snapshot Freshness
143
+
144
+ Verification MUST account for snapshot freshness.
145
+
146
+ ### 7.1 Freshness Constraints
147
+
148
+ - snapshots older than `snapshot_stale_threshold_ms` MUST be considered stale (default: 500ms)
149
+ - stale snapshots MUST NOT be used as final verification evidence and MUST trigger a fresh read
150
+
151
+ ### 7.2 Re-read Requirement
152
+
153
+ Before declaring failure, the system MUST attempt at least one fresh UI read within the stabilization window.
154
+
155
+ ### 7.3 Freshness Defaults
156
+
157
+ - `snapshot_stale_threshold_ms`: 500ms (range: 300–800ms)
158
+
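The freshness rules in this section could be checked with a small helper like the one below. `TimedSnapshot`, `isStale`, and `ensureFresh` are hypothetical names, and the `captured_at_ms` timestamp field is an assumption about how a snapshot records its age.

```typescript
// Hypothetical snapshot wrapper carrying its capture time in milliseconds.
interface TimedSnapshot<T> {
  data: T;
  captured_at_ms: number;
}

const DEFAULT_STALE_THRESHOLD_MS = 500;

// A snapshot is stale when it is older than the threshold (strictly greater).
function isStale<T>(
  snapshot: TimedSnapshot<T>,
  nowMs: number,
  thresholdMs: number = DEFAULT_STALE_THRESHOLD_MS
): boolean {
  return nowMs - snapshot.captured_at_ms > thresholdMs;
}

// Return the cached snapshot only if it is still fresh; otherwise re-read.
function ensureFresh<T>(
  cached: TimedSnapshot<T> | undefined,
  read: () => TimedSnapshot<T>,
  nowMs: number,
  thresholdMs: number = DEFAULT_STALE_THRESHOLD_MS
): TimedSnapshot<T> {
  if (cached === undefined || isStale(cached, nowMs, thresholdMs)) {
    return read(); // stale evidence must trigger a fresh read
  }
  return cached;
}
```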
159
+ ---
160
+
161
+ ## 8. Runtime Failure Code Mapping
162
+
163
+ Existing runtime failure signals MUST map into RFC 010 failure categories.
164
+
165
+ | Runtime Code | RFC 010 Category |
166
+ |--------------|------------------|
167
+ | ELEMENT_NOT_FOUND | Target Resolution Failure |
168
+ | STALE_REFERENCE | Target Resolution Failure |
169
+ | AMBIGUOUS_TARGET | Target Resolution Failure |
170
+ | TIMEOUT | Execution Failure |
171
+ | ACTION_REJECTED | Execution Failure |
172
+ | VERIFICATION_FAILED | Verification Failure |
173
+ | EXPECT_STATE_MISMATCH | Verification Failure |
174
+ | CONTROL_CONVERGENCE_FAILED | Control Convergence Failure |
175
+ | SEMANTIC_MISMATCH | Semantic Mismatch Failure |
176
+ | UNKNOWN | Execution Failure (default fallback) |
177
+
178
+ This mapping MUST be deterministic, exhaustive, and versioned with the runtime.
179
+
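One way to keep the mapping deterministic and exhaustive is to encode the table as a typed record, so the compiler rejects a runtime code with no category. The type and function names below are hypothetical; the codes and categories are copied from the table.

```typescript
type RuntimeCode =
  | "ELEMENT_NOT_FOUND"
  | "STALE_REFERENCE"
  | "AMBIGUOUS_TARGET"
  | "TIMEOUT"
  | "ACTION_REJECTED"
  | "VERIFICATION_FAILED"
  | "EXPECT_STATE_MISMATCH"
  | "CONTROL_CONVERGENCE_FAILED"
  | "SEMANTIC_MISMATCH"
  | "UNKNOWN";

type FailureCategory =
  | "Target Resolution Failure"
  | "Execution Failure"
  | "Verification Failure"
  | "Control Convergence Failure"
  | "Semantic Mismatch Failure";

// Record<RuntimeCode, ...> forces every code to have exactly one category.
const FAILURE_CATEGORY: Record<RuntimeCode, FailureCategory> = {
  ELEMENT_NOT_FOUND: "Target Resolution Failure",
  STALE_REFERENCE: "Target Resolution Failure",
  AMBIGUOUS_TARGET: "Target Resolution Failure",
  TIMEOUT: "Execution Failure",
  ACTION_REJECTED: "Execution Failure",
  VERIFICATION_FAILED: "Verification Failure",
  EXPECT_STATE_MISMATCH: "Verification Failure",
  CONTROL_CONVERGENCE_FAILED: "Control Convergence Failure",
  SEMANTIC_MISMATCH: "Semantic Mismatch Failure",
  UNKNOWN: "Execution Failure",
};

// Unrecognized codes fall back to the UNKNOWN row, as the table specifies.
function categorize(code: string): FailureCategory {
  return Object.prototype.hasOwnProperty.call(FAILURE_CATEGORY, code)
    ? FAILURE_CATEGORY[code as RuntimeCode]
    : FAILURE_CATEGORY.UNKNOWN;
}
```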
180
+ ### 8.1 Failure Gating Rules
181
+
182
+ Failure MUST only be emitted when both of the following hold:
183
+
184
+ - stabilization window is exhausted
185
+ - fresh snapshot verification still fails
186
+
187
+ Transient mismatches SHOULD NOT be classified as:
188
+ - TIMEOUT
189
+ - VERIFICATION_FAILED
190
+
191
+ until stabilization logic has completed.
192
+
193
+ - FAILURE MUST NOT be emitted if `stable_observation_count` consecutive matching reads have not been attempted within the stabilization window.
194
+ - FAILURE MUST NOT be emitted without at least one fresh read within `snapshot_stale_threshold_ms`.
195
+ - TIMEOUT MUST correspond to exhaustion of `stabilization_window_ms`, not a single read failure.
196
+
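The gating rules can be read as a single guard evaluated before any failure is surfaced. The sketch below is a simplification: `GateInput` and `mayEmitFailure` are hypothetical, and it models only the window-exhaustion and fresh-read conditions, not the rule about `stable_observation_count`.

```typescript
// Hypothetical summary of the verification run at the moment of decision.
interface GateInput {
  elapsed_ms: number;                    // time since verification started
  stabilization_window_ms: number;
  last_fresh_read_age_ms: number | null; // null when no fresh read happened
  snapshot_stale_threshold_ms: number;
  last_fresh_read_matched: boolean;      // predicate result on that read
}

// Failure may be emitted only when the window is exhausted AND a sufficiently
// fresh read still does not satisfy the predicate.
function mayEmitFailure(input: GateInput): boolean {
  const windowExhausted = input.elapsed_ms > input.stabilization_window_ms;
  const hasFreshRead =
    input.last_fresh_read_age_ms !== null &&
    input.last_fresh_read_age_ms <= input.snapshot_stale_threshold_ms;
  return windowExhausted && hasFreshRead && !input.last_fresh_read_matched;
}
```

When the guard returns `false` because no fresh read exists, the caller is expected to perform one rather than report success.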
197
+ ---
198
+
199
+ ## 9. Integration with RFC 005 (Verification Correctness)
200
+
201
+ RFC 005 defines what correctness means.
202
+
203
+ RFC 010 defines when correctness can be confidently evaluated.
204
+
205
+ RFC 010 augments RFC 005 by introducing temporal convergence requirements before asserting success or failure.
206
+
207
+ ---
208
+
209
+ ## 10. Integration with RFC 006 (Execution Layer)
210
+
211
+ Post-action verification in `src/interact` MUST apply stabilization logic before returning failure.
212
+
213
+ Execution MUST NOT prematurely surface verification failure without applying temporal checks defined in this RFC.
214
+
215
+ `src/interact` MUST wrap post-action verification with the reference stabilization algorithm. It MUST pass through configuration (window, counts) and MUST NOT short-circuit on first mismatch.
216
+
217
+ ---
218
+
219
+ ## 11. Integration with RFC 011.1 (Recovery Contract)
220
+
221
+ Verification stabilization reduces spurious failure signals (the false-negative failures described in Section 2) that would otherwise trigger downstream recovery mechanisms (defined in a companion RFC).
222
+
223
+ ---
224
+
225
+ ## 13. Output Behavior (Progressive Extension)
226
+
227
+ Future implementations MAY expose additional metadata such as:
228
+
229
+ ```ts
230
+ interface VerificationMetadata {
231
+ stabilization_attempts?: number;
232
+ stabilization_window_ms?: number;
233
+ stable_observation_count?: number;
234
+ snapshot_freshness_ms?: number;
235
+ }
236
+ ```
237
+
238
+ These fields are optional and for observability only.
239
+
240
+ ---
241
+
242
+ ## 14. Failure Modes
243
+
244
+ Verification stabilization MAY fail due to:
245
+
246
+ - UI never converging to expected state
247
+ - repeated oscillation of UI state
248
+ - persistent stale snapshot conditions
249
+
250
+ In these cases, failure MUST be emitted after the stabilization window is exhausted.
251
+
252
+ ---
253
+
254
+ ## 15. Success Metrics
255
+
256
+ - reduced false-negative readiness failures
257
+ - higher first-pass verification success
258
+ - lower premature timeout rates
259
+ - improved reliability of wait and readiness checks
260
+
261
+ ---
262
+
263
+ ## 16. Summary
264
+
265
+ This RFC introduces temporal stabilization into verification, ensuring that UI state is evaluated based on convergence over time rather than single snapshots. It improves reliability by debouncing transient mismatches and rejecting stale snapshots as final evidence, without introducing probabilistic behavior.
@@ -0,0 +1,321 @@
1
+
2
+
3
+ # RFC 011 — Recovery and Replanning for Failed or Ambiguous Interaction Flows
4
+
5
+ ## 1. Summary
6
+
7
+ This RFC defines a structured recovery and replanning model for UI interaction failures, enabling the system to respond to execution uncertainty with bounded, deterministic recovery strategies.
8
+
9
+ It extends the interaction stack defined in RFCs 005–009 by introducing explicit failure classification, recovery policy selection, and bounded replanning of interaction sequences.
10
+
11
+ ---
12
+
13
+ ## 2. Problem Statement
14
+
15
+ Even with reliable execution primitives (RFC 005–009), UI interactions can fail due to:
16
+
17
+ - incorrect or stale target resolution
18
+ - state drift between observation and execution
19
+ - ambiguous or partial UI snapshots
20
+ - control convergence failures (RFC 008)
21
+ - semantic mismatches in custom/Compose controls (RFC 009)
22
+
23
+ Currently, failure handling is implicit and ad hoc, often resulting in:
24
+
25
+ - repeated identical retries
26
+ - stalled flows with no recovery path
27
+ - loss of interaction context
28
+ - inability to switch strategy after failure
29
+
30
+ This leads to brittle automation behavior even when core primitives are correct.
31
+
32
+ ---
33
+
34
+ ## 3. Goals
35
+
36
+ This RFC introduces a structured recovery system that MUST:
37
+
38
+ - classify failures into distinct categories
39
+ - select appropriate recovery strategies based on failure type
40
+ - enable bounded replanning of interaction flows
41
+ - prevent infinite retry loops
42
+ - preserve interaction context across recovery attempts
43
+ - improve robustness under UI drift or ambiguity
44
+
45
+ ---
46
+
47
+ ## 4. Non-Goals
48
+
49
+ This RFC does NOT define:
50
+
51
+ - new UI interaction primitives (covered in RFC 006–008)
52
+ - new target resolution mechanisms (RFC 007)
53
+ - new control semantics (RFC 008–009)
54
+ - a general autonomous planning system
55
+ - ML-based decision making or probabilistic policy learning
56
+
57
+ Recovery is deterministic and rule-based in this version.
58
+
59
+ ---
60
+
61
+ ## 5. Runtime Ownership and Integration
62
+
63
+ Recovery is a cross-layer concern with explicit ownership:
64
+
65
+ ### 5.1 Server Layer (src/server)
66
+ - Detects failure conditions from action execution results
67
+ - Emits normalized failure objects
68
+ - Applies initial failure classification mapping
69
+
70
+ ### 5.2 Interact Layer (src/interact)
71
+ - Executes recovery strategies
72
+ - Performs re-resolution, retry, and step-back operations where supported
73
+ - Maintains bounded retry loops
74
+
75
+ ### 5.3 Shared Contract Layer
76
+ - Defines failure schema
77
+ - Defines recovery state machine transitions
78
+
79
+ Recovery is NOT owned by a single layer; it is a coordinated contract between server and interact.
80
+
81
+ ---
82
+
83
+ ## 5. Failure Classification Model
84
+
85
+ All interaction failures MUST be classified into one of the following categories:
86
+
87
+ ### 5.1 Target Resolution Failure
88
+ - element not found
89
+ - ambiguous or multiple matches
90
+ - stale UI tree snapshot
91
+
92
+ ### 5.2 Execution Failure
93
+ - action could not be dispatched
94
+ - runtime rejection of interaction
95
+ - invalid gesture or control interaction
96
+
97
+ ### 5.3 Verification Failure
98
+ - action executed but expected state not observed
99
+ - expect_state mismatch (RFC 005)
100
+
101
+ ### 5.4 Control Convergence Failure
102
+ - adjustable control failed to reach target state (RFC 008)
103
+
104
+ ### 5.5 Semantic Mismatch Failure
105
+ - control semantics inferred incorrectly (RFC 009)
106
+
107
+ ---
108
+
109
+ ## 6. Runtime Failure Code Mapping
110
+
111
+ Existing runtime failure signals MUST map into RFC 011 failure categories.
112
+
113
+ | Runtime Code | RFC 011 Category |
114
+ |--------------|------------------|
115
+ | ELEMENT_NOT_FOUND | Target Resolution Failure |
116
+ | STALE_REFERENCE | Target Resolution Failure |
117
+ | AMBIGUOUS_TARGET | Target Resolution Failure |
118
+ | TIMEOUT | Execution Failure |
119
+ | ACTION_REJECTED | Execution Failure |
120
+ | VERIFICATION_FAILED | Verification Failure |
121
+ | EXPECT_STATE_MISMATCH | Verification Failure |
122
+ | CONTROL_CONVERGENCE_FAILED | Control Convergence Failure |
+ | SEMANTIC_MISMATCH | Semantic Mismatch Failure |
123
+ | UNKNOWN | Execution Failure (default fallback) |
124
+
125
+ This mapping MUST be deterministic and versioned with the runtime.
126
+
127
+ ---
128
+
129
+ ## 6. Recovery Strategy Model
130
+
131
+ Each failure type MUST map to a bounded set of recovery strategies:
132
+
133
+ ### 6.1 Re-resolve Strategy
134
+ Re-run target resolution (RFC 007) with updated context.
135
+
136
+ Used for:
137
+ - stale snapshot
138
+ - ambiguous target
139
+
140
+ ---
141
+
142
+ ### 6.2 Alternate Candidate Strategy
143
+ Select next-best candidate from resolved targets.
144
+
145
+ Used for:
146
+ - multiple matches
147
+ - incorrect initial resolution
148
+
149
+ ---
150
+
151
+ ### 6.3 State Refresh Strategy
152
+ Re-observe UI state before retrying action.
153
+
154
+ Used for:
155
+ - drift between observation and execution
156
+
157
+ ---
158
+
159
+ ### 6.4 Retry with Constraint Adjustment
160
+ Retry action with adjusted parameters:
161
+ - increased tolerance (RFC 008)
162
+ - alternative interaction mode
163
+
164
+ Used for:
165
+ - convergence failures
166
+ - flaky execution paths
167
+
168
+ ---
169
+
170
+ ### 6.5 Step-back Strategy
171
+ Roll back the interaction context one step and re-enter the flow.
172
+
173
+ Used for:
174
+ - persistent verification failure
175
+ - inconsistent UI state transitions
176
+
177
+ ---
178
+
179
+ ## 7. Replanning Model
180
+
181
+ Replanning is the process of constructing a new bounded interaction sequence after failure.
182
+
183
+ A replanned sequence MUST:
184
+
185
+ - preserve original intent
186
+ - incorporate failure classification context
187
+ - apply a recovery strategy
188
+ - remain bounded in retry depth
189
+
190
+ Replanning is NOT full autonomous task planning.
191
+
192
+ ---
193
+
194
+ ## 7.1 Scope of Replanning
195
+
196
+ Replanning in this RFC is strictly scoped to:
197
+
198
+ - Single-action recovery sequences
199
+ - Local retry chains
200
+ - Bounded corrective adjustments
201
+
202
+ It does NOT include:
203
+
204
+ - multi-step autonomous task planning
205
+ - global goal decomposition
206
+ - long-horizon planning
207
+
208
+ Replanning is therefore a bounded extension of execution, not a planning system.
209
+
210
+ ---
211
+
212
+ ## 8. Recovery State and Budget Contract
213
+
214
+ The system MUST represent recovery state explicitly per action.
215
+
216
+ ### 8.1 Recovery State Schema (conceptual)
217
+
218
+ {
219
+ "failure_class": "TargetResolutionFailure | ExecutionFailure | VerificationFailure | ControlConvergenceFailure | SemanticMismatchFailure",
220
+ "recovery_strategy": "re_resolve | alternate_candidate | state_refresh | retry_adjustment | step_back",
221
+ "recovery_attempts": 0,
222
+ "max_recovery_attempts": 3,
223
+ "retry_depth": 0,
224
+ "max_retry_depth": 3
225
+ }
226
+
227
+ ### 8.2 Budget Rules
228
+
229
+ - Each action MUST track recovery_attempts
230
+ - Recovery MUST NOT exceed max_recovery_attempts
231
+ - retry_depth MUST be bounded per interaction step and MUST NOT exceed max_retry_depth
232
+ - Exhaustion MUST produce a terminal failure state
233
+
234
+ ### 8.3 Enforcement Point
235
+
236
+ Budget enforcement is the responsibility of the Interact layer (src/interact), with the Server layer (src/server) providing initial values.
237
+
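A possible shape for the budget check, using the field names from the conceptual schema in 8.1. `beginRecoveryAttempt` and `BudgetDecision` are hypothetical, and treating `retry_depth` as a read-only limit here (incremented elsewhere) is an assumption, since the RFC does not say when it changes.

```typescript
interface RecoveryState {
  failure_class: string;
  recovery_strategy: string;
  recovery_attempts: number;
  max_recovery_attempts: number;
  retry_depth: number;
  max_retry_depth: number;
}

type BudgetDecision =
  | { kind: "continue"; state: RecoveryState }
  | { kind: "terminal"; state: RecoveryState };

// Grant one more recovery attempt if the budget allows it; otherwise report
// a terminal state. The input state is never mutated.
function beginRecoveryAttempt(state: RecoveryState): BudgetDecision {
  const exhausted =
    state.recovery_attempts >= state.max_recovery_attempts ||
    state.retry_depth >= state.max_retry_depth;
  if (exhausted) {
    return { kind: "terminal", state };
  }
  return {
    kind: "continue",
    state: { ...state, recovery_attempts: state.recovery_attempts + 1 },
  };
}
```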
238
+ ---
239
+
240
+ ## 9. Execution Context Model
241
+
242
+ Full rollback is NOT required or assumed.
243
+
244
+ The system MUST preserve:
245
+
246
+ - last resolved target set (RFC 007)
247
+ - last executed action descriptor (RFC 006)
248
+ - last verification result (RFC 005)
249
+ - recovery_attempts counter
250
+
251
+ The system MAY optionally retain:
252
+
253
+ - prior candidate selections
254
+ - intermediate resolution outputs
255
+
256
+ Step-back is implemented as re-resolution followed by re-execution, NOT as a full state rollback system.
257
+
258
+ ---
259
+
260
+ ## 10. Relationship to Existing RFCs
261
+
262
+ ### RFC 005 — Correctness Model
263
+ Defines verification failures that trigger recovery.
264
+
265
+ ### RFC 006 — Runtime Binding
266
+ Defines execution surface where failures occur.
267
+
268
+ ### RFC 007 — Target Resolution
269
+ Provides alternate candidates for recovery strategies.
270
+
271
+ ### RFC 008 — Control-State Convergence
272
+ Defines recovery paths for control adjustment failures.
273
+
274
+ ### RFC 009 — Semantic Control Model
275
+ Defines classification of semantic mismatch failures.
276
+
277
+ ---
278
+
279
+ ## 11. Expected System Behaviour
280
+
281
+ On failure:
282
+
283
+ 1. classify runtime failure using deterministic mapping (Section 6, Runtime Failure Code Mapping)
284
+ 2. select recovery strategy
285
+ 3. optionally re-resolve target
286
+ 4. re-execute bounded action
287
+ 5. verify the outcome per RFC 005; if verification fails, record a failed recovery attempt
288
+ 6. escalate with a terminal failure once the recovery budget (Section 8) is exhausted
289
+
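The six steps above can be sketched as one bounded loop. Everything here is illustrative: `recover`, `RecoveryDeps`, and the `"recovered"` outcome are hypothetical, and the classification, strategy selection, and execution steps are injected as callbacks rather than implemented. Steps 3 to 5 (re-resolve, re-execute, verify) are folded into the single `execute` callback.

```typescript
interface RuntimeFailure {
  runtime_code: string;
}

// Injected collaborators (all hypothetical).
interface RecoveryDeps {
  classify: (failure: RuntimeFailure) => string;
  selectStrategy: (failureClass: string, tried: string[]) => string | undefined;
  // Runs the strategy and verifies the outcome; null means verified success.
  execute: (strategy: string) => RuntimeFailure | null;
}

interface RecoveryOutcome {
  final_state: "recovered" | "failed";
  attempted_recovery_strategies: string[];
  recovery_attempts: number;
}

function recover(
  initial: RuntimeFailure,
  deps: RecoveryDeps,
  maxRecoveryAttempts: number = 3
): RecoveryOutcome {
  const tried: string[] = [];
  let failure = initial;
  let attempts = 0;

  while (attempts < maxRecoveryAttempts) {
    const failureClass = deps.classify(failure);               // step 1
    const strategy = deps.selectStrategy(failureClass, tried); // step 2
    if (strategy === undefined) break; // nothing left to try: escalate
    tried.push(strategy);
    attempts += 1;
    const result = deps.execute(strategy);                     // steps 3 to 5
    if (result === null) {
      return { final_state: "recovered", attempted_recovery_strategies: tried, recovery_attempts: attempts };
    }
    failure = result; // reclassify the new failure on the next pass
  }
  // Step 6: budget exhausted or no strategy available.
  return { final_state: "failed", attempted_recovery_strategies: tried, recovery_attempts: attempts };
}
```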
290
+ ---
291
+
292
+ ## 12. Structured Failure Output Contract
293
+
294
+ When recovery is exhausted or fails, the system MUST emit a structured failure object:
295
+
296
+ {
297
+ "failure_class": "...",
298
+ "runtime_code": "...",
299
+ "resolved_target": "...",
300
+ "attempted_recovery_strategies": ["..."],
301
+ "recovery_attempts": 3,
302
+ "final_state": "failed"
303
+ }
304
+
305
+ This ensures consistent observability across server and interact layers.
306
+
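Consumers of this object may want to validate it before use. The sketch below types the fields shown in the example and adds a runtime guard; `StructuredFailure` and `isStructuredFailure` are hypothetical names, and treating `resolved_target` and `final_state` as plain strings is an assumption, since the RFC shows only placeholder values.

```typescript
interface StructuredFailure {
  failure_class: string;
  runtime_code: string;
  resolved_target: string;
  attempted_recovery_strategies: string[];
  recovery_attempts: number;
  final_state: string;
}

// Runtime check that an unknown value has every field with the expected type.
function isStructuredFailure(value: unknown): value is StructuredFailure {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const strategies = v.attempted_recovery_strategies;
  return (
    typeof v.failure_class === "string" &&
    typeof v.runtime_code === "string" &&
    typeof v.resolved_target === "string" &&
    Array.isArray(strategies) &&
    strategies.every((s) => typeof s === "string") &&
    typeof v.recovery_attempts === "number" &&
    typeof v.final_state === "string"
  );
}
```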
307
+ ---
308
+
309
+ ## 13. Success Metrics
310
+
311
+ - reduction in stuck interaction flows
312
+ - reduced repeated identical retries
313
+ - improved recovery success rate after first failure
314
+ - improved robustness under UI drift
315
+ - clearer structured failure outputs
316
+
317
+ ---
318
+
319
+ ## 14. Summary
320
+
321
+ This RFC introduces deterministic recovery and replanning for UI interaction failures, enabling the system to remain robust under ambiguity, drift, and execution uncertainty while preserving bounded and explainable behavior.