mobile-debug-mcp 0.26.0 → 0.26.2

@@ -0,0 +1,342 @@
1
+
2
+
3
+ # RFC 004: Verification Routing for Local and Side-Effect Actions
4
+
5
+ ## Status
6
+ Draft
7
+
8
+ ## Summary
9
+
10
+ This RFC corrects a specification flaw in action verification routing where agents may treat lack of obvious UI change as a trigger to inspect network activity by default.
11
+
12
+ The current fallback can cause unnecessary network calls during purely local UI interactions (for example sliders, pickers, toggles, text entry), creating noise and reinforcing incorrect agent behavior.
13
+
14
+ This RFC separates:
15
+ - action verification
16
+ - failure diagnosis
17
+ - backend signal inspection
18
+
19
+ It also introduces context-aware routing based on action type.
20
+
21
+ ## Motivation
22
+
23
+ Observed agent sessions showed `get_network_activity` being invoked during local UI manipulation solely because an action produced no coarse-grained UI diff.
24
+
25
+ Current implicit reasoning resembles:
26
+
27
+ ```text
+ if uiChanged == false:
+     inspect network activity
+ ```
31
+
32
+ This is overly broad.
33
+
34
+ For many interactions, absence of obvious snapshot change does not imply backend ambiguity. It often means verification used the wrong signals.
35
+
36
+ Examples:
37
+ - Slider value changed but tree structure did not.
38
+ - Picker selection updated in-place.
39
+ - Toggle changed checked state only.
40
+ - Text field value changed without large snapshot delta.
41
+ - Tab or accordion state changed through selection metadata.
42
+
43
+ In these cases, network inspection is diagnostic noise, not evidence.
44
+
45
+ ## Problem Statement
46
+
47
+ The current model conflates:
48
+
49
+ 1. Verifying whether an action succeeded.
50
+ 2. Diagnosing why an action may have failed.
51
+
52
+ These are distinct phases.
53
+
54
+ As a result:
55
+ - agents overuse network inspection
56
+ - verification costs increase
57
+ - local-state actions are treated as ambiguous too often
58
+ - network hints can be elevated beyond their intended role
59
+
60
+ ## Goals
61
+
62
+ This RFC:
63
+ - Prevents default network fallbacks for local-state actions.
64
+ - Makes verification primarily state-driven.
65
+ - Restricts network activity inspection to side-effect actions where ambiguity remains.
66
+ - Refines `classify_action_outcome` decision routing.
67
+
68
+ ## Non-Goals
69
+
70
+ This RFC does not:
71
+ - change raw snapshot precedence (raw remains authoritative)
72
+ - redefine expect_* ownership of verification
73
+ - make network activity mandatory evidence
74
+ - expand semantic hints into executable truth
75
+
76
+ ## Action Categories
77
+
78
+ ### Category A: Local-State Actions
79
+
80
+ Actions expected to modify client-side UI state.
81
+
82
+ Examples:
83
+ - tap toggle
84
+ - drag slider
85
+ - picker selection
86
+ - text entry
87
+ - scrolling
88
+ - tab switching
89
+ - expand/collapse
90
+ - local navigation controls
91
+
92
+ ### Category B: Side-Effect Actions
93
+
94
+ Actions that may trigger backend or asynchronous side effects.
95
+
96
+ Examples:
97
+ - submit
98
+ - save
99
+ - sync
100
+ - search
101
+ - refresh
102
+ - login
103
+ - purchase flows
104
+
105
+ ## Action Classification Source of Truth
106
+
107
+ ## Action Type Emission (Runtime Contract)
108
+
109
+ `action_type` MUST be emitted by the runtime layer that produces or executes actions. It is not inferred by the agent.
110
+
111
+ There are three valid sources of truth, in order of precedence:
112
+
113
+ ### 1. Tool Schema Annotation (preferred)
114
+
115
+ If the action originates from a tool invocation, `action_type` MUST be defined in the tool’s schema definition.
116
+
117
+ Example:
118
+
119
+ ```json
+ {
+   "name": "toggle_switch",
+   "action_type": "local_state"
+ }
+ ```
125
+
126
+ or
127
+
128
+ ```json
+ {
+   "name": "submit_form",
+   "action_type": "side_effect"
+ }
+ ```
134
+
135
+ This is the canonical source.
136
+
137
+ ### 2. Handler Output (runtime execution layer)
138
+
139
+ If tool schema does not define `action_type`, the runtime handler that executes the action MUST attach it before returning the action result.
140
+
141
+ Example:
142
+
143
+ ```json
+ {
+   "action": "click",
+   "target": "save_button",
+   "action_type": "side_effect"
+ }
+ ```
150
+
151
+ This is valid only when schema-level annotation is absent.
152
+
153
+ ### 3. Fallback Mapping Table (last resort, deterministic only)
154
+
155
+ If neither schema nor handler provides `action_type`, the system MUST use a deterministic mapping table maintained by the runtime.
156
+
157
+ This table MUST be:
158
+ - static (no runtime inference)
159
+ - versioned
160
+ - explicitly defined in implementation
161
+
162
+ Example mapping:
163
+
164
+ | action     | action_type |
+ |------------|-------------|
+ | tap_toggle | local_state |
+ | enter_text | local_state |
+ | submit     | side_effect |
+ | refresh    | side_effect |
170
+
171
+ If an action is not in the table, it MUST default to:
172
+
173
+ ```
174
+ side_effect
175
+ ```
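The precedence described above (tool schema, then handler output, then the static table, then the `side_effect` default) can be sketched in Python. This is only an illustration under stated assumptions: `resolve_action_type`, `FALLBACK_TABLE`, and `FALLBACK_TABLE_VERSION` are hypothetical names, not part of the mobile-debug-mcp API, and the table rows are copied from the example mapping.

```python
from typing import Mapping, Optional

VALID_ACTION_TYPES = {"local_state", "side_effect"}

# Static, versioned fallback table (rows copied from the example mapping).
FALLBACK_TABLE_VERSION = "1"  # hypothetical version label
FALLBACK_TABLE: Mapping[str, str] = {
    "tap_toggle": "local_state",
    "enter_text": "local_state",
    "submit": "side_effect",
    "refresh": "side_effect",
}


def resolve_action_type(
    action: str,
    tool_schema: Optional[Mapping[str, object]] = None,
    handler_result: Optional[Mapping[str, object]] = None,
) -> str:
    """Resolve action_type with precedence: schema > handler > table > default."""
    for source in (tool_schema, handler_result):
        if source is not None:
            value = source.get("action_type")
            if isinstance(value, str) and value in VALID_ACTION_TYPES:
                return value
    # Actions missing from the table default to side_effect.
    return FALLBACK_TABLE.get(action, "side_effect")
```

Because the lookup never reads UI state, snapshot diffs, or network activity, the result depends only on runtime-provided metadata.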
176
+
177
+ ### Hard Constraint
178
+
179
+ Agents MUST NOT infer or override `action_type` based on UI state changes, snapshot diffs, or network activity.
180
+
181
+ ### Normative Interpretation
182
+
183
+ `action_type` is part of the execution contract, not the reasoning layer.
184
+
185
+ Action type MUST be explicitly defined by the action schema or tool output.
186
+
187
+ Valid values:
188
+ - local_state
189
+ - side_effect
190
+
191
+ Agents MUST NOT infer action type from UI changes.
192
+
193
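+ If `action_type` is missing from an action result, agents MUST treat the action as `side_effect`, matching the fallback default above. Agents MUST NOT reclassify it as `local_state` based on their own judgment of whether backend interaction is plausible.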
194
+
195
+ ## Revised Verification Routing
196
+
197
+ ### For Local-State Actions
198
+
199
+ Verification priority:
200
+
201
+ 1. Expected state assertions
202
+ 2. Refreshed snapshot comparison
203
+ 3. Element property checks
204
+ 4. Targeted expect_* verification
205
+
206
+ Signals may include:
207
+ - value changes
208
+ - selected state
209
+ - checked state
210
+ - focus changes
211
+ - labels/text
212
+ - enabled/disabled transitions
213
+ - position/state metadata
214
+
215
+ Network activity MUST NOT be used as a default fallback for these actions.
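A slider drag usually changes one element property while the tree structure stays the same, so a property-level comparison catches what a coarse diff misses. The sketch below assumes a snapshot element is available as a plain mapping; the property names and both helper functions are hypothetical and chosen only to mirror the signal list above.

```python
from typing import Any, Dict, Mapping, Tuple

# Properties worth comparing for local-state actions (mirrors the signal list).
STATE_PROPERTIES = ("value", "selected", "checked", "focused", "text", "enabled")


def changed_properties(
    before: Mapping[str, Any], after: Mapping[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {property: (old, new)} for every tracked property that differs."""
    return {
        name: (before.get(name), after.get(name))
        for name in STATE_PROPERTIES
        if before.get(name) != after.get(name)
    }


def verify_expected_state(after: Mapping[str, Any], expected: Mapping[str, Any]) -> bool:
    """True when every expected property holds its expected value."""
    return all(after.get(name) == want for name, want in expected.items())


# A slider drag: the tree shape is identical, only `value` moved.
before = {"id": "volume", "type": "slider", "value": 0.2, "enabled": True}
after = {"id": "volume", "type": "slider", "value": 0.8, "enabled": True}
delta = changed_properties(before, after)
```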
216
+
217
+ ### For Side-Effect Actions
218
+
219
+ Verification priority:
220
+
221
+ 1. Expected UI/state verification first
222
+ 2. Retry richer local verification if ambiguous
223
+ 3. Only then optionally inspect network or log signals
224
+
225
+ Network signals are supporting hints, not primary proof of success.
226
+
227
+ ## Decision Logic Update
228
+
229
+ Replace implied logic:
230
+
231
+ ```text
+ if uiChanged == false:
+     get_network_activity()
+ ```
235
+
236
+ With:
237
+
238
+ ```text
+ if expected_state_verified:
+     success
+
+ elif action_type == local_state:
+     retry using richer state verification
+
+ elif action_type == side_effect and ambiguity_remains:
+     optionally inspect network activity
+
+ else:
+     inconclusive
+ ```
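The revised pseudocode translates directly into a small routing function. The following Python sketch is illustrative only: `NextStep` and `route_verification` are hypothetical names, and the three inputs are assumed to be computed elsewhere by the runtime and the verification layer.

```python
from enum import Enum


class NextStep(Enum):
    SUCCESS = "success"
    RETRY_RICHER_STATE_VERIFICATION = "retry_richer_state_verification"
    OPTIONALLY_INSPECT_NETWORK = "optionally_inspect_network"
    INCONCLUSIVE = "inconclusive"


def route_verification(
    expected_state_verified: bool,
    action_type: str,
    ambiguity_remains: bool,
) -> NextStep:
    """Pick the next verification step; uiChanged is deliberately not an input."""
    if expected_state_verified:
        return NextStep.SUCCESS
    if action_type == "local_state":
        # Local-state actions never escalate to network inspection.
        return NextStep.RETRY_RICHER_STATE_VERIFICATION
    if action_type == "side_effect" and ambiguity_remains:
        return NextStep.OPTIONALLY_INSPECT_NETWORK
    return NextStep.INCONCLUSIVE
```

Note that a local-state action reaches the network branch under no combination of inputs, which is the behavior Rule 1 asks for.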
251
+
252
+ ## Definition of Ambiguity
253
+
254
+ Ambiguity exists only when:
255
+
256
+ - expected state cannot be evaluated from UI snapshot, AND
257
+ - no single deterministic state predicate can be computed from UI fields
258
+
259
+ Ambiguity does NOT include:
260
+ - absence of visual diff
261
+ - absence of network activity
262
+ - lack of large UI tree changes
263
+
264
+ ## Normative Rules
265
+
266
+ ### Rule 1
267
+
268
+ Agents MUST NOT use network activity inspection as a default fallback for local-state actions solely because coarse UI diffs are absent.
269
+
270
+ ### Rule 2
271
+
272
+ Agents MUST prefer explicit state verification over backend diagnostics whenever the action is expected to be locally observable.
273
+
274
+ ### Rule 3
275
+
276
+ Network activity MAY be consulted only when:
277
+ - the action plausibly triggers backend work, and
278
+ - local verification remains ambiguous under the defined ambiguity criteria.
279
+
280
+ ### Rule 4
281
+
282
+ Network activity evidence MUST be treated as an auxiliary signal, not authoritative proof of action success.
283
+
284
+ ## Unified Diagnostic Signals
285
+
286
+ Network activity and log inspection are equivalent diagnostic signals.
287
+
288
+ Both:
289
+ - are secondary to UI state verification
290
+ - MUST NOT be used as a default fallback for local-state actions
291
+ - follow the same escalation rules defined in this RFC
292
+
293
+ ## Impact on classify_action_outcome
294
+
295
+ `classify_action_outcome` should be interpreted as routing logic, not a mandatory network escalation path.
296
+
297
+ For `uiChanged=false`, the action category determines the next step.
298
+
299
+ The following implication no longer holds automatically:
300
+
301
+ ```text
302
+ uiChanged=false => inspect network
303
+ ```
304
+
305
+ ## Expected Benefits
306
+
307
+ - Fewer unnecessary tool calls
308
+ - Cleaner verification traces
309
+ - Reduced cargo-cult network probing
310
+ - Better behavior for local UI interactions
311
+ - Stronger separation between verification and diagnosis
312
+ - More reliable agent reasoning
313
+
314
+ ## Compatibility
315
+
316
+ This is a patch-level specification correction.
317
+
318
+ It refines routing semantics but does not break:
319
+ - existing expect_* semantics
320
+ - snapshot response shape
321
+ - raw-over-semantic precedence
322
+ - action execution model
323
+
324
+ ## Implementation Notes
325
+
326
+ Follow-up work may include:
327
+ - prompt updates
328
+ - regression examples for sliders/toggles/pickers
329
+ - protocol examples showing correct routing
330
+ - telemetry on reduced unnecessary network inspections
331
+
332
+ ## Open Questions
333
+
334
+ Questions for review:
335
+
336
+ 1. Should action category be explicitly emitted as runtime metadata, or is heuristic inference acceptable only within the fallback mapping layer defined in the Action Type Emission contract?
337
+ 2. Should side-effect actions permit optional log inspection alongside network hints?
338
+ 3. Should local-state verification examples be added to the core spec or to an examples appendix?
339
+
340
+ ## Decision Requested
341
+
342
+ Adopt verification routing based on action type and remove implicit default escalation from missing UI diffs to network inspection.
@@ -0,0 +1,216 @@
1
+ # RFC 005: Unified Action Execution and Verification Model
2
+
3
+ ## 1. Summary
4
+
5
+ This RFC defines a unified execution and verification model for all agent-driven UI actions.
6
+
7
+ It standardises:
8
+ - how actions are resolved
9
+ - how they are executed
10
+ - how outcomes are verified
11
+ - how failures are classified
12
+ - how observability signals are emitted
13
+
14
+ The goal is to eliminate inconsistent per-feature execution logic and establish a single deterministic lifecycle for all UI interactions.
15
+
16
+ ---
17
+
18
+ ## 2. Problem Statement
19
+
20
+ Current execution paths are fragmented across interaction types:
21
+
22
+ - Tap / click actions rely on implicit success assumptions
23
+ - Control adjustments (sliders, inputs) use ad-hoc verification logic
24
+ - Gesture actions lack consistent post-execution validation
25
+ - Action success is often inferred from indirect UI changes or logs
26
+
27
+ This leads to:
28
+ - ambiguous success states
29
+ - inconsistent retries
30
+ - weak failure classification
31
+ - poor observability signal quality
32
+
33
+ ---
34
+
35
+ ## 3. Design Goals
36
+
37
+ The model must:
38
+
39
+ - Provide a single lifecycle for all actions
40
+ - Separate target resolution from execution
41
+ - Require explicit verification of state change
42
+ - Standardise failure classification
43
+ - Integrate with observability systems cleanly
44
+ - Support both simple and parameterised actions
45
+
46
+ ---
47
+
48
+ ## 4. Action Lifecycle
49
+
50
+ Every action MUST progress through the following states in order, ending in exactly one of the two terminal states, Verified or Failed:
51
+
52
+ 1. Resolved
+    - A target has been identified via Actionability Resolution
+    - The target is executable (not just visible)
+
+ 2. Dispatched
+    - The action has been issued to the runtime layer
+
+ 3. Pending Verification
+    - Waiting for expected UI or state change
+
+ 4. Verified
+    - Expected outcome confirmed
+
+ 5. Failed
+    - Verification did not succeed within constraints
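The lifecycle can be read as a small state machine. The Python sketch below is one possible encoding, not the runtime's implementation: `LifecycleState`, `ALLOWED_TRANSITIONS`, and `ActionLifecycle` are hypothetical names, and the `RESOLVED -> FAILED` edge (for a dispatch failure) is an assumption this RFC does not spell out.

```python
from enum import Enum


class LifecycleState(Enum):
    RESOLVED = "resolved"
    DISPATCHED = "dispatched"
    PENDING_VERIFICATION = "pending_verification"
    VERIFIED = "verified"
    FAILED = "failed"


# Verified and Failed are terminal; RESOLVED -> FAILED is an assumed edge.
ALLOWED_TRANSITIONS = {
    LifecycleState.RESOLVED: {LifecycleState.DISPATCHED, LifecycleState.FAILED},
    LifecycleState.DISPATCHED: {LifecycleState.PENDING_VERIFICATION},
    LifecycleState.PENDING_VERIFICATION: {LifecycleState.VERIFIED, LifecycleState.FAILED},
    LifecycleState.VERIFIED: set(),
    LifecycleState.FAILED: set(),
}


class ActionLifecycle:
    """Tracks one action's state and rejects transitions the table forbids."""

    def __init__(self) -> None:
        self.state = LifecycleState.RESOLVED
        self.history = [self.state]

    def advance(self, new_state: LifecycleState) -> None:
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)
```

Keeping the transition table as data makes it straightforward to log every entry in `history` as a lifecycle signal.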
67
+
68
+ ---
69
+
70
+ ## 5. Action Types
71
+
72
+ All actions are categorised into canonical types:
73
+
74
+ - Navigation
75
+ - Input
76
+ - Selection
77
+ - Gesture
78
+ - Control Adjustment
79
+
80
+ Each type may have type-specific execution adapters but MUST conform to the same lifecycle.
81
+
82
+ ---
83
+
84
+ ## 6. Execution Contract
85
+
86
+ All actions MUST define:
87
+
88
+ ### 6.1 Target
89
+ A resolved executable entity (not a UI label or text node)
90
+
91
+ ### 6.2 Intent
92
+ The intended effect of the action
93
+
94
+ ### 6.3 Expected State Delta
95
+ What must change in the UI or application state
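The three required fields map naturally onto a small record type. This Python sketch is an assumption about shape only: `ActionSpec`, `Snapshot`, and the `wifi_switch` example are hypothetical, and the expected state delta is modelled as a predicate over before/after snapshots.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping

Snapshot = Mapping[str, Any]


@dataclass(frozen=True)
class ActionSpec:
    target_id: str  # 6.1: a resolved executable entity, not a label
    intent: str     # 6.2: the intended effect
    # 6.3: predicate that is True when the required change happened
    expected_delta: Callable[[Snapshot, Snapshot], bool]


toggle_wifi = ActionSpec(
    target_id="wifi_switch",
    intent="turn Wi-Fi on",
    expected_delta=lambda before, after: not before["checked"] and after["checked"],
)
```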
96
+
97
+ ---
98
+
99
+ ## 7. Verification Model
100
+
101
+ Verification MUST be explicit and deterministic.
102
+
103
+ ### 7.1 Verification Sources
104
+ At least one of the following MUST be used:
105
+
106
+ - UI state diff
107
+ - element property change
108
+ - navigation change
109
+ - value update (for controls)
110
+
111
+ ### 7.2 Timeout Behaviour
112
+ - Each action defines a verification window
113
+ - Failure occurs if no valid state delta is observed in time
114
+
115
+ ### 7.3 No Implicit Success
116
+ Actions MUST NOT be considered successful without explicit verification.
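A verification window can be implemented as a bounded polling loop that returns success only when the check passes. The sketch below is a minimal Python illustration; `wait_for_verification`, the polling interval, and the injectable `clock` and `sleep` parameters are hypothetical choices, not a documented API.

```python
import time
from typing import Callable


def wait_for_verification(
    check: Callable[[], bool],
    window_s: float,
    poll_interval_s: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it passes or the verification window elapses."""
    deadline = clock() + window_s
    while True:
        if check():
            return True  # explicit verification succeeded
        if clock() >= deadline:
            return False  # no valid state delta observed in time
        sleep(poll_interval_s)
```

The function has no path that reports success without `check` returning true, which matches the "no implicit success" rule in 7.3.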
117
+
118
+ ---
119
+
120
+ ## 8. Actionability Integration
121
+
122
+ This model depends on Actionability Resolution:
123
+
124
+ - Only resolved executable targets may be executed
125
+ - Visible but non-actionable nodes are invalid targets
126
+ - Execution is blocked if confidence is below threshold
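The three gating rules can be expressed as a single predicate. This Python sketch is hedged: the RFC does not state a threshold value, so `CONFIDENCE_THRESHOLD = 0.8` is a placeholder, and `ResolvedTarget` and `may_execute` are hypothetical names.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8  # placeholder; the RFC does not specify a value


@dataclass(frozen=True)
class ResolvedTarget:
    target_id: str
    visible: bool
    actionable: bool
    confidence: float


def may_execute(target: ResolvedTarget, threshold: float = CONFIDENCE_THRESHOLD) -> bool:
    """Visibility alone is not enough: require actionability and confidence."""
    return target.actionable and target.confidence >= threshold
```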
127
+
128
+ ---
129
+
130
+ ## 9. Control Adjustment Model
131
+
132
+ Control actions (sliders, inputs) are treated as parameterised actions:
133
+
134
+ Example:
135
+
136
+ set_slider_value(target, value, tolerance)
137
+
138
+ Each control adjustment MUST include:
139
+ - pre-state value
140
+ - post-state verification
141
+ - tolerance-aware validation
142
+
143
+ Fallback to coordinate-based interaction is allowed only if semantic control resolution fails.
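A tolerance-aware adjustment might look like the following Python sketch. It is not the package's implementation: `FakeSlider` stands in for a real control that snaps to discrete steps, and `AdjustmentResult` is a hypothetical result shape carrying the pre-state value, post-state value, and verification outcome.

```python
from dataclasses import dataclass


@dataclass
class FakeSlider:
    """Stand-in control that snaps requested values to a fixed step."""
    value: float
    step: float = 0.05

    def set(self, requested: float) -> None:
        self.value = round(round(requested / self.step) * self.step, 10)


@dataclass(frozen=True)
class AdjustmentResult:
    pre_value: float
    post_value: float
    verified: bool


def set_slider_value(target: FakeSlider, value: float, tolerance: float) -> AdjustmentResult:
    pre_value = target.value   # capture pre-state
    target.set(value)          # dispatch the adjustment
    post_value = target.value  # read back post-state
    verified = abs(post_value - value) <= tolerance
    return AdjustmentResult(pre_value, post_value, verified)
```

Reading the value back instead of trusting the dispatch is what lets a snapped or clamped slider be distinguished from a successful adjustment.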
144
+
145
+ ---
146
+
147
+ ## 10. Observability Hooks
148
+
149
+ Each action emits structured signals:
150
+
151
+ - action_id
152
+ - target_id
153
+ - action_type
154
+ - lifecycle_state transitions
155
+ - verification result
156
+ - failure reason (if applicable)
157
+
158
+ These signals feed:
159
+ - Signal-Oriented Diagnostic Filtering
160
+ - Action Trace Correlation
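One way to carry these fields is a single structured record serialised as JSON. The Python sketch below assumes a flat payload; `ActionSignal`, its field names beyond those listed above, and the JSON encoding are illustrative choices, since the emission format is delegated to the runtime layer.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ActionSignal:
    action_id: str
    target_id: str
    action_type: str
    lifecycle_transitions: List[str] = field(default_factory=list)
    verification_result: Optional[str] = None
    failure_reason: Optional[str] = None  # set only when the action failed

    def to_json(self) -> str:
        """Serialise with stable key order so traces are easy to diff."""
        return json.dumps(asdict(self), sort_keys=True)


signal = ActionSignal("a1", "volume_slider", "local_state")
signal.lifecycle_transitions.extend(
    ["resolved", "dispatched", "pending_verification", "verified"]
)
signal.verification_result = "verified"
payload = json.loads(signal.to_json())
```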
161
+
162
+ ---
163
+
164
+ ## 11. Failure Classification
165
+
166
+ Failures MUST be categorised:
167
+
168
+ - Target resolution failure
169
+ - Dispatch failure
170
+ - Verification timeout
171
+ - Unexpected state delta
172
+ - No state change observed
173
+
174
+ This enables consistent debugging and telemetry.
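The five categories can be assigned by a deterministic function of what the runtime observed. This Python sketch is one possible ordering, stated as an assumption: the RFC does not define how to separate "verification timeout" from "no state change observed", so here a timeout means the window elapsed, and "no state change" means a check completed without seeing any delta. `FailureCategory` and `classify_failure` are hypothetical names.

```python
from enum import Enum
from typing import Optional


class FailureCategory(Enum):
    TARGET_RESOLUTION_FAILURE = "target_resolution_failure"
    DISPATCH_FAILURE = "dispatch_failure"
    VERIFICATION_TIMEOUT = "verification_timeout"
    UNEXPECTED_STATE_DELTA = "unexpected_state_delta"
    NO_STATE_CHANGE_OBSERVED = "no_state_change_observed"


def classify_failure(
    *,
    resolved: bool = True,
    dispatched: bool = True,
    state_changed: bool = False,
    matched_expected: bool = False,
    timed_out: bool = False,
) -> Optional[FailureCategory]:
    """Return the failure category, or None when the action was verified."""
    if not resolved:
        return FailureCategory.TARGET_RESOLUTION_FAILURE
    if not dispatched:
        return FailureCategory.DISPATCH_FAILURE
    if state_changed and matched_expected:
        return None
    if state_changed:
        return FailureCategory.UNEXPECTED_STATE_DELTA
    if timed_out:
        return FailureCategory.VERIFICATION_TIMEOUT
    return FailureCategory.NO_STATE_CHANGE_OBSERVED
```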
175
+
176
+ ---
177
+
178
+ ## 12. Relationship to Existing Roadmap
179
+
180
+ This RFC provides the foundation for:
181
+
182
+ - Actionability Resolution (#4)
183
+ - Adjustable Control Support (#5)
184
+ - Signal-Oriented Diagnostic Filtering (#6)
185
+
186
+ It defines the shared execution substrate those capabilities plug into.
187
+
188
+ ---
189
+
190
+ ## 13. Scope Boundary
191
+
192
+ This RFC defines the execution model and lifecycle semantics for agent-driven UI actions.
193
+
194
+ - Action types referenced in this RFC correspond to the existing runtime `action_type` contract and do not redefine or extend the underlying taxonomy
195
+ - Lifecycle signals described in this RFC are emitted by the runtime execution layer (defined in RFC 006), not by this specification directly
196
+
197
+ It does NOT define:
198
+ - runtime instrumentation details
199
+ - how lifecycle states are emitted in code
200
+ - mapping to specific source modules (e.g. src/server, src/interact)
201
+ - tool schema implementation details
202
+ - mapping between semantic action categories and runtime implementation modules (this is defined in RFC 006)
203
+
204
+ Those concerns are delegated to a separate binding-layer RFC which defines how this model is implemented in the current system.
205
+
206
+ ---
207
+
208
+ ## 14. Summary
209
+
210
+ This model enforces a single, verifiable lifecycle for all UI actions.
211
+
212
+ It ensures:
213
+ - deterministic execution
214
+ - explicit verification
215
+ - consistent failure handling
216
+ - unified observability