npm - mobile-debug-mcp - Versions diffs - 0.25.0 → 0.26.0 - Mend

mobile-debug-mcp 0.25.0 → 0.26.0

Files changed (33)

package/dist/interact/index.js +143 -4
package/dist/observe/android.js +10 -1
package/dist/observe/index.js +19 -1
package/dist/observe/ios.js +86 -3
package/dist/observe/snapshot-metadata.js +88 -0
package/dist/server/tool-definitions.js +30 -2
package/dist/server/tool-handlers.js +10 -0
package/dist/server-core.js +1 -1
package/dist/utils/android/utils.js +68 -3
package/docs/CHANGELOG.md +12 -0
package/docs/ROADMAP.md +19 -1
package/docs/rfcs/002-richer-element-identity +400 -0
package/docs/rfcs/003-wait-and-synchronization-reliability.md +296 -0
package/docs/specs/mcp-tooling-spec-v1.md +9 -0
package/docs/tools/interact.md +21 -0
package/docs/tools/observe.md +5 -2
package/package.json +1 -1
package/skills/rfc-review/SKILL.md +52 -0
package/skills/rfc-review/references/rfc-review-checklist.md +12 -0
package/skills/rfc-review/references/rfc-review-template.md +28 -0
package/src/interact/index.ts +186 -4
package/src/observe/android.ts +11 -1
package/src/observe/index.ts +32 -1
package/src/observe/ios.ts +97 -16
package/src/observe/snapshot-metadata.ts +107 -0
package/src/server/tool-definitions.ts +30 -2
package/src/server/tool-handlers.ts +11 -0
package/src/server-core.ts +1 -1
package/src/types.ts +49 -1
package/src/utils/android/utils.ts +78 -20
package/test/unit/interact/wait_for_ui_change.test.ts +76 -0
package/test/unit/observe/state_extraction.test.ts +47 -0
package/test/unit/server/response_shapes.test.ts +37 -3

package/docs/rfcs/003-wait-and-synchronization-reliability.md ADDED

@@ -0,0 +1,296 @@
+# RFC-003: Wait and Synchronization Reliability
+Priority: 3
+Depends on: RFC-001 (Stronger State Verification), RFC-002 (Platform-Native Element Metadata and Resolution Hints)
+---
+# 1. Problem
+Agents can often identify the right element (RFC-002) and verify the right state (RFC-001), but still fail because they act before the UI has reached the intended post-action state.
+This causes:
+- retries caused by racing the UI
+- false failures from stale snapshots
+- overuse of network/log verification when UI evidence should suffice
+- flakiness in asynchronous and in-place update flows
+- unreliable behaviour in Compose-heavy or thin accessibility trees
+Current system limitations:
+- wait_for_ui is underused after actions involving async state changes
+- current waits focus on expected elements appearing, not general UI transition detection
+- snapshot staleness is not explicitly surfaced
+- loading state transitions are inconsistently observable
+---
+# 2. Goals
+This RFC introduces:
+1. UI-first synchronization policy after actions
+2. Snapshot staleness and revision metadata
+3. UI-change based waiting for in-place updates
+4. Structured loading-state detection
+5. Compose-aware synchronization hints
+Success goals:
+- reduce retries caused by premature actions
+- increase successful post-action verification
+- reduce unnecessary fallbacks to logs/network checks
+- improve reliability in asynchronous UI flows
+---
+# 3. Non-Goals
+This RFC does not:
+- redefine state verification semantics (RFC-001)
+- redefine element identity contracts (RFC-002)
+- add new interaction primitives (long press, pinch, etc.)
+- replace network or log verification where no UI outcome exists
+---
+# 4. Proposed Model
+## 4.1 UI-First Synchronization Contract (v1)
+Default post-action flow SHOULD be:
+```text
+action
+→ wait_for_ui(expected outcome)
+→ verify state
+→ only fall back to network/logs when no UI outcome exists or wait fails
+```
+Tool-level contract:
+- After actions expected to cause visible UI changes, agents SHOULD invoke wait_for_ui or wait_for_ui_change before verification.
+- wait_for_ui SHOULD be used when an expected element or explicit outcome is known.
+- wait_for_ui_change SHOULD be used for in-place mutations where a specific element target is not known.
+- wait_for_screen_change SHOULD remain preferred for full navigation transitions when available.
+Rules:
+- UI evidence MUST be preferred over network or log evidence when a UI outcome is expected.
+- Actions that trigger navigation, async mutation, or visible state changes SHOULD be followed by a wait.
+- Network/log checks are fallback signals, not primary synchronization mechanisms.
+- This synchronization order is normative tool behavior for agents, not advisory prose.
+---
+## 4.2 Snapshot Revision Contract
+All snapshot responses MUST include revision metadata.
+Emission scope:
+- snapshot_revision and captured_at_ms MUST be emitted on snapshot responses.
+- get_ui_tree responses SHOULD emit the same fields when backed by the same snapshot generation layer.
+- If both surfaces exist, revision values MUST be consistent across them when derived from the same underlying snapshot.
+Required snapshot envelope:
+```json
+{
+  "snapshot_revision": 184,
+  "captured_at_ms": 1714452012301
+}
+```
+Field requirements:
+- snapshot_revision REQUIRED on every snapshot response.
+- captured_at_ms REQUIRED on every snapshot response.
+Source of truth:
+- snapshot_revision originates in the snapshot generation layer.
+- It MUST increment when a meaningful hierarchy delta is detected.
+- Cosmetic-only changes MUST NOT increment revision.
+Meaningful deltas include:
+- node added or removed
+- visible text mutation
+- control state change
+- list content mutation
+- navigation or view transition
+Cosmetic churn examples (must not increment):
+- cursor blink
+- focus-only changes
+- animation-only transitions
+- timestamp or unrelated ephemeral text changes
+Rules:
+- Agents SHOULD use revision changes as synchronization signals.
+- Stale revisions SHOULD trigger reacquisition before verification.
+- This extends the snapshot response contract defined by RFC-002.
+- Snapshot responses are the normative required emission surface; get_ui_tree emission is recommended for consistency.
+- snapshot_revision MUST be monotonically increasing within a session.
+---
+## 4.3 wait_for_ui_change API
+Concrete API contract:
+```ts
+wait_for_ui_change({
+  expected_change?: "hierarchy_diff" | "text_change" | "state_change",
+  timeout_ms?: number,
+  stability_window_ms?: number
+}) => {
+  success: boolean,
+  observed_change: "hierarchy_diff" | "text_change" | "state_change" | null,
+  snapshot_revision?: number,
+  timeout: boolean
+}
+```
+Relationship to other wait primitives:
+- wait_for_screen_change remains the preferred primitive for navigation-level transitions.
+- wait_for_ui_change is the preferred primitive for non-navigation UI mutations and in-place updates.
+- wait_for_ui_change is additive to wait_for_screen_change, not a replacement for it.
+Rules:
+- stability_window_ms represents time a detected change must remain stable before success.
+- Meaningful delta semantics are inherited from Section 4.2.
+- wait_for_ui_change complements wait_for_ui; it does not replace it.
+- Agents SHOULD prefer wait_for_screen_change for navigation and wait_for_ui_change for non-navigation changes.
+---
+## 4.4 Structured Loading-State Contract
+Loading signals are OPTIONAL overall, but when a detectable loading signal exists they SHOULD be surfaced on snapshot responses and UI tree responses, and if emitted they MUST conform to the contract below.
+Required shape:
+```json
+{
+  "loading_state": {
+    "active": true,
+    "signal": "progress_indicator",
+    "source": "snapshot"
+  }
+}
+```
+Required fields:
+- active
+- signal
+- source
+Rules:
+- Loading signals are synchronization hints only.
+- Loading completion MUST NOT alone be treated as success.
+- If emitted, the shape above MUST be used.
+- Absence of loading_state is valid when no reliable loading signal is detectable; malformed or partial loading_state emission is not valid.
+---
+## 4.5 Compose-Aware Synchronization Hints
+For Compose or thin accessibility structures:
+Systems SHOULD support:
+- merged semantic node changes as wait signals
+- text mutations within existing nodes
+- in-place recomposition awareness
+These are synchronization hints layered on top of standard wait behaviour.
+---
+# 5. Failure Modes
+## 5.1 Premature Action Progression
+If an action is followed immediately by verification without waiting:
+- system SHOULD bias toward suggesting wait_for_ui
+- retries SHOULD prefer synchronization correction before repeated action execution
+---
+## 5.2 Stale Snapshot Reads
+If verification uses an old snapshot:
+- revision metadata SHOULD expose staleness
+- agents SHOULD reacquire snapshot before retrying verification
+---
+## 5.3 No Visible UI Outcome
+If no UI outcome is expected:
+- network/log verification MAY be primary evidence
+- UI-first policy does not apply rigidly
+---
+## 5.4 False Positive UI Change Detection
+If unrelated UI churn triggers early wait completion:
+- systems SHOULD reject cosmetic-only changes using Section 4.2 rules
+- agents SHOULD prefer stability windows before considering waits satisfied
+---
+# 6. Acceptance Criteria
+RFC-003 specification is complete when:
+- Snapshot Revision Contract is fully defined and mandatory.
+- wait_for_ui_change API contract is fully defined.
+- Loading-State Contract required schema is defined.
+- Synchronization tool-selection rules are explicitly specified.
+- False-positive change handling is specified.
+Implementation readiness success is measured when:
+- snapshot revisions reduce stale-read retries
+- synchronization retries decrease
+- post-action verification success increases
+---
+# 7. Success Metrics
+- Fewer retries caused by timing/synchronization errors
+- Higher post-action verification success rate
+- Reduced unnecessary fallback to network/log evidence
+- Improved stability in asynchronous and Compose-heavy flows
+---
+# 8. Deferred To Later RFCs
+- Advanced subscriptions / notify-when-element-appears APIs
+- Full action-to-ui trace correlation (Priority 7)
+- Gesture-trigger-specific synchronization logic
+- Element appearance subscription / notify-when-ready APIs
+---
+This RFC standardises temporal reliability and synchronization signals layered on top of state verification and element identity guarantees from RFC-001 and RFC-002.
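
Reviewer sketch for Section 4.2 of the RFC above: one possible way a snapshot generation layer could bump `snapshot_revision` only on meaningful deltas. This is an illustration under stated assumptions, not the package's implementation. The `UiNode` shape, the `classifyDelta` and `capture` helpers, and the use of `checked` as a stand-in for control state are all hypothetical; only `snapshot_revision`, `captured_at_ms`, and the three change labels come from the RFC text.

```typescript
// Hypothetical, simplified node model; the real package types may differ.
interface UiNode {
  id: string;        // assumed to be unique and stable per node
  text?: string;
  checked?: boolean; // stand-in for "control state"
  focused?: boolean; // focus-only changes are cosmetic per Section 4.2
}

interface Snapshot {
  snapshot_revision: number;
  captured_at_ms: number;
  nodes: UiNode[];
}

type Delta = "hierarchy_diff" | "text_change" | "state_change" | null;

// Returns the first meaningful delta found, or null for cosmetic-only churn.
function classifyDelta(prev: UiNode[], next: UiNode[]): Delta {
  const prevById = new Map(prev.map((n) => [n.id, n] as const));
  if (prev.length !== next.length || next.some((n) => !prevById.has(n.id))) {
    return "hierarchy_diff"; // node added or removed
  }
  let stateChanged = false;
  for (const node of next) {
    const before = prevById.get(node.id)!;
    if (before.text !== node.text) return "text_change";
    if (before.checked !== node.checked) stateChanged = true;
    // `focused` is deliberately ignored here.
  }
  return stateChanged ? "state_change" : null;
}

// Builds the next snapshot, incrementing the revision only on a meaningful delta.
function capture(prev: Snapshot | null, nodes: UiNode[], nowMs: number): Snapshot {
  if (prev === null) {
    return { snapshot_revision: 1, captured_at_ms: nowMs, nodes };
  }
  const bump = classifyDelta(prev.nodes, nodes) !== null ? 1 : 0;
  return { snapshot_revision: prev.snapshot_revision + bump, captured_at_ms: nowMs, nodes };
}

const first = capture(null, [{ id: "btn", text: "Save", focused: false }], 1000);
const second = capture(first, [{ id: "btn", text: "Save", focused: true }], 1100);
const third = capture(second, [{ id: "btn", text: "Saved", focused: true }], 1200);
console.log(first.snapshot_revision, second.snapshot_revision, third.snapshot_revision); // 1 1 2
```

In this sketch the revision never decreases, so an agent holding an older revision can treat any larger value as evidence that its snapshot is stale. Filtering "timestamp or unrelated ephemeral text changes" would need extra information that this model does not carry.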

package/docs/specs/mcp-tooling-spec-v1.md CHANGED

@@ -151,6 +151,7 @@ Examples:
 - `wait_for_ui`
 - `wait_for_screen_change`
+- `wait_for_ui_change`
 ### 6.2 Rules
@@ -238,6 +239,9 @@ Raw layer contents include:
 - UI hierarchy or accessibility tree
 - normalized readable element state where exposed by the platform
+- platform-native identity hints such as stable identifiers, roles, and test tags
+- snapshot metadata such as `snapshot_revision` and `captured_at_ms`
+- `loading_state` when a reliable loading signal is detectable
 - screenshot when available
 - element-level attributes
 - logs and fingerprint/activity observations
@@ -304,10 +308,15 @@ Canonical pattern:
 `wait_for_ui -> tap_element -> wait_for_screen_change (optional) -> expect_screen`
+For in-place UI mutations, agents SHOULD prefer:
+`wait_for_ui_change -> expect_element_visible / expect_state`
 Interpretation:
 - `tap_element.success` = executed
 - `wait_for_screen_change.success` = UI changed
+- `wait_for_ui_change.success` = in-place UI mutation observed and stable
 - `expect_screen.success` = correct outcome verified
 ## 12. Known Deviations
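
Reviewer sketch for the "Interpretation" list in the hunk above: how an agent might fold the per-step `success` flags of an in-place flow into one outcome. The `StepResult` and `InPlaceFlowResults` shapes and the outcome labels are hypothetical; the spec text only states what each `success` flag means.

```typescript
// Hypothetical result shape; only the `success` flag is taken from the spec text.
interface StepResult {
  success: boolean;
}

interface InPlaceFlowResults {
  action: StepResult;              // e.g. tap_element: success = executed
  wait_for_ui_change?: StepResult; // success = in-place mutation observed and stable
  expect?: StepResult;             // e.g. expect_state: success = correct outcome verified
}

type FlowOutcome =
  | "action_failed"
  | "verified"
  | "verification_failed"
  | "changed_unverified"
  | "executed_only";

// Only an expect_* success counts as a verified outcome; a successful wait
// alone says that the UI changed, not that it changed correctly.
function interpretInPlaceFlow(results: InPlaceFlowResults): FlowOutcome {
  if (!results.action.success) return "action_failed";
  if (results.expect !== undefined) {
    return results.expect.success ? "verified" : "verification_failed";
  }
  if (results.wait_for_ui_change?.success) return "changed_unverified";
  return "executed_only";
}

console.log(
  interpretInPlaceFlow({
    action: { success: true },
    wait_for_ui_change: { success: true },
  })
); // "changed_unverified"
```

The design choice in this sketch is that the wait and the expectation are never merged into a single boolean, which mirrors the spec's separation of "executed", "UI changed", and "correct outcome verified".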

package/docs/tools/interact.md CHANGED

@@ -58,6 +58,7 @@ Preferred verification:
 Use `wait_for_screen_change` only when a visible transition is the expected outcome. If a button should trigger an API request but the screen should stay the same, rely on network activity and classification instead.
 For backend-only actions, prefer comparing `get_screen_fingerprint` before/after and call `get_network_activity` immediately after the action; do not wait on `wait_for_screen_change` if no visible transition is expected.
+Use `wait_for_ui_change` when the screen stays in place but visible text or element state should change.
 ---
@@ -148,6 +149,26 @@ Notes:
 ---
+## wait_for_ui_change
+Purpose:
+- detect a stable in-place UI mutation without naming a target element first
+Capabilities:
+- waits for hierarchy, text, or state deltas
+- uses snapshot revision metadata when available
+- confirms the change remains stable before returning success
+Guidance:
+- prefer `wait_for_screen_change` for navigation
+- prefer `wait_for_ui_change` for in-place updates and recomposition-style changes
+- follow with `expect_*` when the expected final state is known
+---
 ## find_element
 Locate a UI element on the current screen using semantic matching and return an actionable element descriptor.
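
Reviewer sketch for the `wait_for_ui_change` section added in the hunk above: one way the "confirms the change remains stable before returning success" behaviour could be evaluated from polled revisions. This is not the package's implementation. `RevisionSample`, `evaluateWait`, and the idea of evaluating a pre-recorded sample list are hypothetical; `timeout_ms`, `stability_window_ms`, `snapshot_revision`, `success`, and `timeout` are the names used in RFC-003.

```typescript
// Hypothetical polled observation; at_ms is measured from the start of the wait.
interface RevisionSample {
  at_ms: number;
  snapshot_revision: number;
}

interface WaitOptions {
  timeout_ms: number;
  stability_window_ms: number;
}

interface WaitResult {
  success: boolean;
  snapshot_revision?: number;
  timeout: boolean;
}

// Assumes `samples` is sorted by at_ms. A change counts only once the same
// new revision has been observed for at least stability_window_ms.
function evaluateWait(
  baselineRevision: number,
  samples: RevisionSample[],
  opts: WaitOptions
): WaitResult {
  let candidate: number | undefined;
  let candidateSince = 0;
  for (const sample of samples) {
    if (sample.at_ms > opts.timeout_ms) break;
    if (sample.snapshot_revision === baselineRevision) continue;
    if (sample.snapshot_revision !== candidate) {
      // A further change arrived: restart the stability clock.
      candidate = sample.snapshot_revision;
      candidateSince = sample.at_ms;
    }
    if (sample.at_ms - candidateSince >= opts.stability_window_ms) {
      return { success: true, snapshot_revision: sample.snapshot_revision, timeout: false };
    }
  }
  // In this sketch, running out of samples is treated the same as a timeout.
  return { success: false, timeout: true };
}

const polled: RevisionSample[] = [
  { at_ms: 0, snapshot_revision: 12 },
  { at_ms: 100, snapshot_revision: 13 },
  { at_ms: 200, snapshot_revision: 14 },
  { at_ms: 300, snapshot_revision: 14 },
  { at_ms: 450, snapshot_revision: 14 },
];
console.log(evaluateWait(12, polled, { timeout_ms: 1000, stability_window_ms: 200 }));
// { success: true, snapshot_revision: 14, timeout: false }
```

Because revision 13 is superseded at 200 ms, the stability clock restarts there and the wait succeeds only at the 450 ms sample. With `timeout_ms: 400` the same samples would produce a timeout instead.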

package/docs/tools/observe.md CHANGED

@@ -83,12 +83,14 @@ Input:
 Response (example):
 ```json
-{ "device": { "platform": "android", "id": "emulator-5554" }, "screen": "", "resolution": { "width": 1080, "height": 2400 }, "elements": [ { "text": "Sign in", "type": "android.widget.Button", "resourceId": "com.example:id/signin", "clickable": true, "bounds": [0,0,100,50], "state": { "enabled": true } } ] }
+{ "device": { "platform": "android", "id": "emulator-5554" }, "screen": "", "resolution": { "width": 1080, "height": 2400 }, "snapshot_revision": 12, "captured_at_ms": 1710000000123, "loading_state": { "active": true, "signal": "spinner", "source": "ui_tree" }, "elements": [ { "text": "Sign in", "type": "android.widget.Button", "resourceId": "com.example:id/signin", "clickable": true, "bounds": [0,0,100,50], "state": { "enabled": true }, "stable_id": "com.example:id/signin", "role": "button", "test_tag": "com.example:id/signin", "selector": { "value": "com.example:id/signin", "confidence": { "score": 1, "reason": "resource_id" } }, "semantic": { "is_clickable": true, "is_container": false } } ] }
 ```
 Notes:
 - Useful for inspection, selector development, and fallback debugging.
 - Elements may include a normalized `state` object when the platform exposes readable state such as checked, selected, focused, expanded, text input, or slider values.
+- Elements may also include platform-native identity hints such as `stable_id`, `role`, `test_tag`, `selector`, and `semantic`.
+- The tree response may include `snapshot_revision`, `captured_at_ms`, and `loading_state` when a reliable signal is available.
 - Prefer `wait_for_ui` for deterministic element resolution in interactive flows.
 ---
@@ -135,7 +137,8 @@ Behavior:
 - Fast by default: does not wait for new logs and avoids long blocking operations.
 - Returns a dual-layer payload:
   - `raw` is authoritative and contains the underlying observation data unchanged.
-  - `semantic` is optional, derived from `raw`, and intended for planning only.
+- `semantic` is optional, derived from `raw`, and intended for planning only.
+- `raw` now includes `snapshot_revision`, `captured_at_ms`, and `loading_state` when detectable.
 Response (example):
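
Reviewer sketch for the `loading_state` field that the `observe.md` hunks above add to tree and `raw` responses: a small consumer-side check of the rule from RFC-003 Section 4.4 that absence is valid but partial emission is not. The helper names and the choice to type `signal` and `source` as plain strings are assumptions; the diff shows only example values such as `"spinner"`, `"progress_indicator"`, `"ui_tree"`, and `"snapshot"`.

```typescript
// Field names come from the RFC; the string typing of signal/source is assumed.
interface LoadingState {
  active: boolean;
  signal: string;
  source: string;
}

// Type guard: all three required fields must be present with the expected types.
function isValidLoadingState(value: unknown): value is LoadingState {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate["active"] === "boolean" &&
    typeof candidate["signal"] === "string" &&
    typeof candidate["source"] === "string"
  );
}

// Absence is valid; a present but partial or malformed value is not.
function checkLoadingState(raw: Record<string, unknown>): "absent" | "valid" | "malformed" {
  if (!("loading_state" in raw)) return "absent";
  return isValidLoadingState(raw["loading_state"]) ? "valid" : "malformed";
}

console.log(checkLoadingState({ snapshot_revision: 12, captured_at_ms: 1710000000123 })); // "absent"
console.log(
  checkLoadingState({ loading_state: { active: true, signal: "spinner", source: "ui_tree" } })
); // "valid"
console.log(checkLoadingState({ loading_state: { active: true } })); // "malformed"
```

Per the RFC, a valid `loading_state` is a synchronization hint only, so a check like this would gate whether the hint is trusted and would not replace an `expect_*` verification.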

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "mobile-debug-mcp",
-  "version": "0.25.0",
+  "version": "0.26.0",
   "description": "MCP server for mobile app debugging (Android + iOS), with focus on security and reliability",
   "type": "module",
   "bin": {

package/skills/rfc-review/SKILL.md ADDED

@@ -0,0 +1,52 @@
+# RFC Review skill
+name: rfc-review
+version: 0.1.1
+summary: Reusable workflow for reviewing RFCs/specs in this repository with a consistent readiness rubric and output template.
+# Purpose
+Help an agent review an RFC for clarity, implementation readiness, and alignment with the current codebase. Use a common template so reviews stay consistent across documents and reviewers.
+# Activation conditions
+Activate when an agent needs to:
+- review a new or revised RFC
+- assess whether an RFC is implementation-ready
+- identify whether feedback is an RFC issue or an implementation issue
+- compare a spec against the current `src/` contract surface and docs
+# Surface area (actions)
+- locate-rfc
+- compare-against-code
+- assess-contract-completeness
+- classify-gaps
+- produce-review
+# Core guidance
+1. Read the RFC first, then compare it against the relevant code, docs, and tests.
+2. Separate **spec gaps** from **implementation gaps**.
+3. Check for: problem clarity, scope boundaries, explicit contracts, acceptance criteria, non-goals, and consistency with existing behavior.
+4. Prefer precise feedback that names the missing contract, unclear rule, or inconsistent behavior.
+5. Use the shared review template in `references/rfc-review-template.md` for the final output.
+6. If the RFC is not ready, say exactly what must be clarified before implementation can start.
+7. Classify each blocker as either a **spec gap** or an **implementation contract gap** and stop at that boundary.
+# Inputs & outputs
+- review-rfc(input: { rfcPath, relatedPaths?, focusAreas? }) -> { verdict, risks, specGaps, implementationGaps, recommendations }
+- compare-against-code(input: { rfcPath, codePaths[] }) -> { matches, mismatches, notes }
+- produce-review(input: { rfcPath, findings[] }) -> { summary, verdict, checklist, nextStep }
+# Failure handling
+- If the RFC file is missing, stop and report the missing path explicitly.
+- If the RFC is ambiguous, classify each concern as either "spec" or "implementation" instead of blending them.
+- If the review cannot be grounded in the current repo, state that the RFC is not reviewable yet.
+# Progressive disclosure
+- Keep this file short.
+- Load the reference template only when writing the final review.
+# References
+- `references/rfc-review-template.md` — standard review format and verdict rubric
+- `references/rfc-review-checklist.md` — questions to apply while reviewing an RFC
+# License
+Same as repository (MIT).

package/skills/rfc-review/references/rfc-review-checklist.md ADDED

@@ -0,0 +1,12 @@
+# RFC Review Checklist
+Ask these questions while reviewing:
+1. Is the problem statement specific and grounded in current failures?
+2. Are non-goals explicit?
+3. Are contracts concrete enough to implement?
+4. Are acceptance criteria testable?
+5. Does the RFC define the source of truth for new fields or behaviors?
+6. Does it match existing code paths and public tool surfaces?
+7. Can each open concern be classified as a spec issue or an implementation issue?
+8. Is the RFC ready to implement without further interpretation?

package/skills/rfc-review/references/rfc-review-template.md ADDED

@@ -0,0 +1,28 @@
+# RFC Review Template
+Use this structure for every RFC review:
+## Verdict
+- Ready / Needs clarification / Needs implementation contract / Not ready
+## Summary
+- One short paragraph on the RFC's current quality.
+## What is good
+- List the strongest parts of the RFC.
+## Issues
+For each issue, include:
+- **Type:** spec / implementation / implementation contract / doc
+- **Severity:** low / medium / high
+- **Why it matters:** one sentence
+- **Fix:** exact change needed
+## Missing contract surfaces
+- List any API shapes, response fields, state transitions, or invariants that are still undefined.
+## Codebase alignment
+- Note whether the RFC matches current `src/`, docs, and tests.
+## Next step
+- State the smallest next action needed to move the RFC forward.