ta-studio-mcp 1.2.4 → 1.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -6,13 +6,6 @@ AI agents often struggle with project-specific context, unique navigation patter
 
  ---
 
- ## 📋 Prerequisites
-
- - **Node.js**: `v18.0.0` or higher.
- - **MCP Client**: An IDE or tool that supports the [Model Context Protocol](https://modelcontextprotocol.io) (e.g., Claude Desktop, Cursor, Windsurf, VS Code).
-
- ---
-
  ## ⚡ Quick Start
 
  ```bash
@@ -21,41 +14,93 @@ npx ta-studio-mcp
 
  ---
 
- ## 🧠 Expert Knowledge & Deep Technical Lore
-
- This section documents the state-of-the-art implementations used by the TA Studio team.
-
- ### 1. Set-of-Mark (SoM) Screenshot Annotation
- Based on OmniParser's SoM approach, we use color-coded, type-aware bounding boxes to provide visual anchors.
- - **Type-Aware Palette**: 9 distinct colors (e.g., **Dodger Blue** for buttons, **Orange** for inputs, **Purple** for toggles).
- - **PIL Threading**: `asyncio.to_thread(_draw_bounding_boxes_threaded)` for non-blocking UI drawing.
- - **TOON Optimization**: **Token Optimized Object Notation** reduces prompt tokens by 40% by stripping redundant XML.
- - **Scaling Correction**: Screenshots are compressed to ~45% resolution. We apply `native_coord * (img_width / native_width)` for pixel-perfect alignment.
-
- ### 2. Deep Subagent Handoff Protocol
- Our "Deep Agent Pattern" orchestrates specialized specialists via a strict chain of custody:
- 1. **Perceptor** (`Screen Classifier`): Returns structured state and **TOON** elements.
- 2. **Planner** (`Device Agent`): Proposes action based on the identified UI state.
- 3. **Guardrail** (`Action Verifier`): Applies **Boolean Verification** (Safe/Relevant/Executable).
- 4. **Actor** (`Mobile MCP`): Executes the approved action on the target device.
- 5. **Doctor** (`Failure Diagnosis`): Categorizes failures and suggests recovery (Jitter/Wait/Backtrack).
-
- ### 3. Boolean Verification vs. Numerical Scoring
- We reject "0.85 confidence" scores. Every action must pass three binary checks:
- - **is_safe**: Does this action cause data loss or unauthorized access? (YES/NO)
- - **is_relevant**: Does this move the needle on the task goal? (YES/NO)
- - **is_executable**: Is the target realistically reachable on the current screen? (YES/NO)
- - **Logic**: Action executes ONLY if ALL checks are YES. Reject and propose an `alternative_action` otherwise.
-
- ### 4. Real-Time HUD & Parallel Execution
- - **Observation Pipeline**: Achieve `<200ms` lag via `on_step` async callbacks that emit SSE events to the frontend.
- - **Concurrency Control**: `asyncio.Semaphore` and per-simulation `asyncio.Lock` manage multiple parallel device streams without resource collision.
- - **Retention**: Automated 24h age or 100 total simulations cleanup before auto-purge.
-
- ### 5. Model Tiering (Jan 2026 Standard)
- - **Thinking Tier (GPT-5.2)**: High-level orchestration (Coordinator) and complex visual reasoning. reasoning effort: `high`.
- - **Core Tier (GPT-5-mini)**: Specialized agents (Classifier, Verifier, Diagnosis). *Never use nano for classification.*
- - **Utility Tier (GPT-5-nano)**: MCP tool call formatting and data distillation.
+ ## 🎨 Figma Flow Analysis Pipeline
+
+ TA Studio's design-to-test workflow centers on a 3-phase analysis pipeline that converts Figma documents into actionable test plans.
+
+ ### Phase 1: Direct Extraction
+ We use the Figma REST API with a specific recursive traversal logic:
+ - **Depth-3 Extraction**: `DOC` -> `CANVAS` -> `SECTION` -> `FRAME`.
+ - **Logic**: We stop at the Frame level to capture screens. Critical insight: standard `depth=2` extractions only reach the Section level, often missing the actual UI Frames nested within.
+ - **Node Analysis**: Every frame is analyzed for its name, `transitionNodeID` (prototype links), and spatial coordinates.
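The depth-3 traversal can be sketched as a recursive walk over the Figma node tree. This is an illustrative sketch, not the `flow_analyzer.py` implementation; the helper name `extract_frames` and the sample document are invented:

```python
def extract_frames(node: dict, depth: int = 0, max_depth: int = 3) -> list[dict]:
    """Recursively collect FRAME nodes along DOC -> CANVAS -> SECTION -> FRAME."""
    frames = []
    if node.get("type") == "FRAME":
        # Capture the screen-level frame: name, prototype link, position.
        bbox = node.get("absoluteBoundingBox", {})
        frames.append({
            "name": node.get("name"),
            "transition": node.get("transitionNodeID"),
            "x": bbox.get("x"), "y": bbox.get("y"),
        })
        return frames  # stop here: a frame's children are components, not screens
    if depth < max_depth:
        for child in node.get("children", []):
            frames.extend(extract_frames(child, depth + 1, max_depth))
    return frames

# Hypothetical document shaped like a Figma file-API response:
doc = {"type": "DOCUMENT", "children": [
    {"type": "CANVAS", "children": [
        {"type": "SECTION", "name": "Onboarding", "children": [
            {"type": "FRAME", "name": "Onboarding / Welcome",
             "transitionNodeID": "2:10",
             "absoluteBoundingBox": {"x": 0, "y": 0}},
        ]},
    ]},
]}

print([f["name"] for f in extract_frames(doc)])  # ['Onboarding / Welcome']
```

Note that `extract_frames(doc, max_depth=2)` returns no frames at all on the same document, which is exactly the depth-2 pitfall described above.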
+
+ ### Phase 2: Multi-Signal Priority Clustering
+ To group screens into logical user flows (e.g., "Onboarding", "Checkout"), we use a cascade of grouping signals in order of reliability:
+ 1. **Section-Based**: (Highest Priority) Uses Figma's native `Section` grouping if available.
+ 2. **Prototype Connections**: Uses a Union-Find algorithm on `transitionNodeID` links to group screens that are functionally connected.
+ 3. **Name-Prefix Matching**: Split by common naming separators like ` / `, ` - `, or ` — `.
+ 4. **Spatial Clustering**: (Lowest Priority) Groups by proximity using Y-binning and X-gap splitting.
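Signal 2's prototype-link grouping can be sketched with a minimal union-find. The frame IDs below are invented, and this is not the `flow_analyzer.py` code:

```python
class UnionFind:
    """Minimal union-find with path compression for grouping connected screens."""
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

# Hypothetical frames: (node_id, transitionNodeID or None)
frames = [("1:1", "1:2"), ("1:2", "1:3"), ("1:3", None), ("9:1", "9:2"), ("9:2", None)]

uf = UnionFind()
for node_id, target in frames:
    uf.find(node_id)               # register every screen, even unlinked ones
    if target is not None:
        uf.union(node_id, target)  # a prototype link joins the two screens

groups = {}
for node_id, _ in frames:
    groups.setdefault(uf.find(node_id), []).append(node_id)
print(sorted(len(g) for g in groups.values()))  # two flows: [2, 3]
```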
+
+ ### Phase 3: Visual Overlay & CV Fallback
+ - **PIL Visualizer**: Generates a high-contrast overlay with 12 distinct colors, semi-transparent fills (alpha=40), and strong outlines (alpha=200).
+ - **Rate-Limit Fallback**: When the Figma Images API is rate-limited (common with `429` errors), we switch to a Computer Vision (CV) pipeline:
+   - **Brightness Thresholding**: Identify sections and frames by detecting canvas brightness deltas (>80 for sections, >100 for frames).
+   - **Morphological Closing/Opening**: Uses `scipy.ndimage` to bridge gaps in detected outlines (closing) and remove speckle noise (opening).
+   - **Connected Component Analysis**: Reconstructs frame hierarchies from pixel data when API metadata is unavailable.
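A toy version of the brightness-thresholding and connected-component steps, using only `numpy` (the real pipeline uses `scipy.ndimage`; the flood fill below is a stand-in for `scipy.ndimage.label`, and the delta of 100 comes from the frame threshold above):

```python
import numpy as np

def frame_mask(gray: np.ndarray, canvas_level: float, delta: float = 100.0) -> np.ndarray:
    """Mark pixels whose brightness differs from the canvas by more than `delta`."""
    return np.abs(gray.astype(float) - canvas_level) > delta

def connected_components(mask: np.ndarray) -> int:
    """Count 4-connected components via flood fill (stand-in for scipy.ndimage.label)."""
    seen = np.zeros_like(mask, dtype=bool)
    count = 0
    h, w = mask.shape
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                count += 1
                stack = [(i, j)]
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and mask[y, x] and not seen[y, x]:
                        seen[y, x] = True
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return count

# Synthetic canvas (brightness 230) with two dark frame rectangles (brightness 40).
canvas = np.full((40, 60), 230.0)
canvas[5:15, 5:20] = 40.0
canvas[5:15, 30:50] = 40.0

mask = frame_mask(canvas, canvas_level=230.0)
print(connected_components(mask))  # 2 candidate frames
```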
+
+ Key files:
+ - `backend/app/figma/flow_analyzer.py` (707 lines)
+ - `scripts/figma_cv_overlay.py` (162 lines)
+
+ ---
+
+ ## 📱 Device Testing & Simulation Lifecycle
+
+ The TA Studio backend manages thousands of automated simulation steps across a fleet of Android emulators.
+
+ ### Concurrency & Thread Safety
+ - **Async Execution**: Each device task runs in its own `asyncio.Task`.
+ - **Semaphore Guard**: A global `asyncio.Semaphore(max_concurrent)` prevents host resource exhaustion.
+ - **Simulation Lock**: Per-simulation `asyncio.Lock` ensures that result indexing is thread-safe and deterministic.
+
+ ### Simulation States
+ `queued` -> `running` -> `completed` | `failed` | `cancelled`
+ - **Auto-Purge**: 24h age-out or a 100-simulation retention cap to prevent memory bloat.
+ - **Device Auto-Select**: Automatically ranks devices (`emulator-5554` first, then other emulators, then physical devices).
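The auto-select ranking can be expressed as a simple sort key (a sketch; `device_rank` and the serial numbers are hypothetical):

```python
def device_rank(device_id: str) -> tuple[int, str]:
    """Sort key implementing the auto-select priority described above."""
    if device_id == "emulator-5554":
        return (0, device_id)   # preferred default emulator
    if device_id.startswith("emulator-"):
        return (1, device_id)   # any other emulator
    return (2, device_id)       # physical devices last

devices = ["R58M12ABC", "emulator-5556", "emulator-5554"]
print(sorted(devices, key=device_rank))
# ['emulator-5554', 'emulator-5556', 'R58M12ABC']
```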
+
+ Key file: `backend/app/agents/coordinator/coordinator_service.py`
+
+ ---
+
+ ## 👁️ Vision Click & Agentic Visual Reasoning
+
+ When the accessibility tree (`list_elements_on_screen`) fails—common in canvas-based UIs or custom views—we switch to Agentic Vision.
+
+ ### Two-Layer Architecture
+ 1. **Layer 1: SoM Annotation**: Deterministic, fast (<100ms) annotation of the screenshot using color-coded bounding boxes derived from the element list.
+ 2. **Layer 2: GPT-5.2 Agentic Vision**:
+    - **Think-Act-Observe**: GPT-5.2 generates specialized Python code and runs it in a `LocalCodeExecutor`.
+    - **Analysis**: The agent analyzes the SoM-annotated image + the element metadata to find the target.
+    - **Feedback**: Results are fed back into the OAVR loop for final execution.
+
+ Key file: `backend/app/agents/device_testing/agentic_vision_service.py` (847 lines)
+
+ ---
+
+ ## 🔗 Mobile MCP & ADB Fallback
+
+ `ta-studio-mcp` bridges to `mobile-mcp` and adds a critical safety layer for production stability.
+
+ ### The v0.0.36 Bug Fix
+ Mobile MCP v0.0.36 has a critical flaw: it fails to detect *any* device if *one* device in the list is offline.
+ - **The Solution**: Our `MobileMCPClient` implements a 1:1 ADB bridge fallback.
+ - **ADB Commands**: Uses `exec-out screencap -p` for fast PNG streaming and `uiautomator dump /dev/tty` for zero-file-I/O UI tree extraction.
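The fallback shape can be sketched as below. The function names are hypothetical and not the `MobileMCPClient` API; only the `adb exec-out screencap -p` invocation comes from the text:

```python
import subprocess
from typing import Callable

def screenshot_with_fallback(mcp_screenshot: Callable[[], bytes],
                             device_id: str = "emulator-5554") -> bytes:
    """Try Mobile MCP first; on failure, fall back to a direct ADB screencap."""
    try:
        return mcp_screenshot()
    except RuntimeError:
        # exec-out streams the PNG straight to stdout: no file lands on the device.
        proc = subprocess.run(
            ["adb", "-s", device_id, "exec-out", "screencap", "-p"],
            capture_output=True, check=True,
        )
        return proc.stdout

def broken_mcp() -> bytes:
    """Simulates the v0.0.36 failure mode: one offline device breaks detection."""
    raise RuntimeError("device enumeration failed: offline device in list")

print(screenshot_with_fallback(lambda: b"\x89PNG...")[:4])  # b'\x89PNG'
```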
+
+ Key file: `backend/app/agents/device_testing/mobile_mcp_client.py`
+
+ ---
+
+ ## ⚡ Flicker Detection & Performance Metrics
+
+ We detect visual regressions that occur faster than standard screenshot intervals can capture (16-200ms).
+
+ ### 4-Layer Flicker Pipeline
+ 1. **Trigger**: High-speed `screenrecord` capture (10s limit).
+ 2. **Extraction**: `ffmpeg` scene filter (`gt(scene,0.003)`).
+ 3. **Analysis**: Consecutive-frame SSIM (Structural Similarity Index) calculation; SSIM drops > 0.15 are flagged.
+ 4. **Verification**: GPT-5.2 Vision confirms the flicker and generates a video artifact for regression triage.
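Layer 3 can be sketched with a simplified whole-frame SSIM. The production service presumably uses windowed SSIM; this sketch also interprets an "SSIM drop > 0.15" as the score falling below 0.85 between consecutive frames, which is an assumption:

```python
import numpy as np

def global_ssim(a: np.ndarray, b: np.ndarray, L: float = 255.0) -> float:
    """Whole-frame SSIM (single window), a simplification of sliding-window SSIM."""
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    a, b = a.astype(float), b.astype(float)
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a**2 + mu_b**2 + c1) * (var_a + var_b + c2)
    )

def flag_flicker(frames: list, drop: float = 0.15) -> list[int]:
    """Flag frame indices whose SSIM vs. the previous frame falls below 1 - drop."""
    return [i for i in range(1, len(frames))
            if global_ssim(frames[i - 1], frames[i]) < 1.0 - drop]

rng = np.random.default_rng(0)
base = rng.integers(0, 255, (32, 32))
glitch = 255 - base                      # an inverted frame simulates a flicker
frames = [base, base.copy(), glitch, base.copy()]
print(flag_flicker(frames))  # [2, 3]
```

Frames 2 and 3 are flagged because the glitch frame breaks structural similarity with both of its neighbors.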
+
+ Key file: `backend/app/agents/device_testing/flicker_detection_service.py`
 
  ---
 
@@ -63,22 +108,23 @@ We reject "0.85 confidence" scores. Every action must pass three binary checks:
 
  | Severity | Issue | Root Cause & Expert Fix |
  |----------|-------|-------------------------|
- | **CRITICAL** | Bbox Misalignment | **RC**: 45% Scaling Delta. **Fix**: Apply `img_width / native_width` factor. |
- | **CRITICAL** | Async to_thread | **RC**: CORO vs CALL mismatch. **Fix**: Remove `async` from functions passed to `asyncio.to_thread`. |
- | **CRITICAL** | Race Condition | **RC**: Parallel session state collision. **Fix**: Set `parallel_tool_calls=False`. |
- | **HIGH** | Simulation Leak | **RC**: Memory persistence. **Fix**: 24h/100-run auto-purge with `asyncio.Lock`. |
- | **HIGH** | Mobile MCP Bug | **RC**: Offline device fail (v0.0.36). **Fix**: Full ADB bridge fallback (screencap/uiautomator). |
+ | **CRITICAL** | Bbox Misalignment | **RC**: 45% scaling delta between native and JPEG resolutions. **Fix**: Map coordinates via `img_width / native_width`. |
+ | **CRITICAL** | Async to_thread | **RC**: Passing `async` functions to `to_thread` yields unawaited coroutines. **Fix**: Ensure the target is a plain (sync) function. |
+ | **CRITICAL** | Race Condition | **RC**: Parallel tool calls in device sessions. **Fix**: `parallel_tool_calls=False` is mandatory. |
+ | **HIGH** | Simulation Leak | **RC**: Simulation contexts persisted indefinitely, growing memory without bound. **Fix**: 24h retention + indexing lock. |
+ | **HIGH** | Figma API 429 | **RC**: Heavy polling of the Figma API. **Fix**: Integrated CV brightness-thresholding fallback. |
+ | **MEDIUM** | SSIM False Positive | **RC**: Normal UI transitions. **Fix**: Raised the SSIM-drop threshold to 0.15 so routine transitions are not flagged. |
 
  ---
 
  ## 🔄 Core Workflows
 
  ### The Ralph Loop (Closed-Loop Verification)
- 1. **CODE** → Implement feature or fix.
- 2. **LINT** → `mypy` / `eslint` verification.
- 3. **UNIT TEST** → Specific module verification.
- 4. **CHECK ASYNC** → Confirm `to_thread` safety.
- 5. **VERIFY HUD** → Watch the emulator stream while agent runs autonomously.
+ 1. **CODE** -> Implement the feature or fix.
+ 2. **LINT** -> Run `mypy` and `eslint`.
+ 3. **UNIT TEST** -> Verify the module in isolation.
+ 4. **CHECK ASYNC** -> Explicitly audit `to_thread` safety and concurrency locks.
+ 5. **VERIFY HUD** -> Launch the emulator and watch the real-time Stream HUD during autonomous execution.
 
  ---
 
@@ -95,4 +141,4 @@ Add `npx -y ta-studio-mcp@latest` as a command-type MCP server.
  ---
 
  ## 📜 License
- MIT © 2026 TA Studios.
+ MIT © 2026 TA Studios.
package/dist/index.js CHANGED
@@ -14,7 +14,7 @@ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
  import { registerAllTools } from './tools/register-all.js';
  const server = new McpServer({
    name: 'ta-studio-mcp',
- version: '1.2.4',
+ version: '1.3.0',
  }, {
    capabilities: {
      logging: {},
@@ -1 +1 @@
- {"version":3,"file":"methodology.d.ts","sourceRoot":"","sources":["../../src/knowledge/methodology.ts"],"names":[],"mappings":"AAAA;;GAEG;AAEH,eAAO,MAAM,kBAAkB,EAAE,MAAM,CAAC,MAAM,EAAE,MAAM,CA2LrD,CAAC;AAEF,eAAO,MAAM,sBAAsB,UAAkC,CAAC"}
+ {"version":3,"file":"methodology.d.ts","sourceRoot":"","sources":["../../src/knowledge/methodology.ts"],"names":[],"mappings":"AAAA;;GAEG;AAEH,eAAO,MAAM,kBAAkB,EAAE,MAAM,CAAC,MAAM,EAAE,MAAM,CAqPrD,CAAC;AAEF,eAAO,MAAM,sBAAsB,UAAkC,CAAC"}
@@ -4,7 +4,7 @@
  export const METHODOLOGY_TOPICS = {
    overview: `# TA Studio Methodologies — Overview
 
- Available topics: oavr, som_annotation, coordinate_scaling, agent_config, flicker_detection, golden_bugs, mobile_mcp, vision_click, failure_diagnosis, self_correction, model_tiering, simulation_lifecycle, subagent_handoff, boolean_verification, hud_streaming
+ Available topics: oavr, som_annotation, coordinate_scaling, agent_config, flicker_detection, golden_bugs, mobile_mcp_fallback, vision_click, failure_diagnosis, self_correction, model_tiering, simulation_lifecycle, subagent_handoff, boolean_verification, hud_streaming, figma_pipeline
 
  ## Architecture
  - **Backend**: FastAPI (Python 3.11+) at backend/
@@ -175,6 +175,61 @@ Key files: agents/coordinator/coordinator_service.py, api/device_simulation.py`,
  Sequential execution ensures session stability.
 
  Key file: agents/device_testing/device_testing_agent.py`,
+ vision_click: `# Vision Click — Agentic Vision (GPT-5.2)
+
+ When the accessibility tree (list_elements_on_screen) is insufficient — canvas-based UIs, loading states, or custom views — we use GPT-5.2 with code execution to find elements visually.
+
+ ## Two-Layer Architecture
+ - **Layer 1**: SoM Structural Annotation (deterministic, <100ms, free) — accessibility tree → element classification → color-coded bounding boxes
+ - **Layer 2**: GPT-5.2 Agentic Vision (intelligent, Think-Act-Observe) — SoM-annotated image + element list → GPT-5.2 generates Python code → LocalCodeExecutor runs it → results fed back
+
+ ## Workflow
+ 1. Take screenshot via Mobile MCP
+ 2. Get screen size for coordinate mapping
+ 3. Call AgenticVisionClient.multi_step_vision() with image + query
+ 4. GPT-5.2 Think-Act-Observe loop: analyze image → generate Python code → execute locally → feed results back
+ 5. Parse COORDINATES: (x, y) from final analysis
+ 6. Execute click at found coordinates
+
+ Key file: agents/device_testing/agentic_vision_service.py (847 lines)`,
+ figma_pipeline: `# Figma Flow Analysis — 3-Phase Pipeline
+
+ ## Phase 1: Extract (Figma REST API, depth=3)
+ DOC → CANVAS → SECTION → FRAME tree traversal. CRITICAL: depth=3, not depth=2 — depth=2 only reaches SECTION nodes, missing the FRAMEs inside them.
+
+ ## Phase 2: Cluster (Multi-Signal Priority Cascade)
+ Tries each signal in order and uses the first that produces ≥2 groups:
+ 1. **Section-Based** (highest priority) — group by section_name
+ 2. **Prototype Connections** — Union-Find on transitionNodeID links
+ 3. **Name-Prefix Matching** — split by " / ", " - ", " — " separators
+ 4. **Spatial Clustering** (lowest) — Y-binning + X-gap splitting
+
+ ## Phase 3: Visualize (PIL Overlay)
+ 12 distinct colors, semi-transparent fill (alpha=40), strong outline (alpha=200), group labels.
+
+ ## CV Overlay Fallback (No API)
+ When the Figma Images API is rate-limited (429 with Retry-After: 396156 s ≈ 4.6 days):
+ - Brightness thresholding (>80 for sections, >100 for frames)
+ - Morphological closing/opening (scipy.ndimage) to bridge gaps
+ - Connected component analysis for section groups
+ - Column brightness profiling for sub-frame detection
+
+ Key files: app/figma/flow_analyzer.py (707 lines), scripts/figma_cv_overlay.py (162 lines)`,
+ self_correction: `# Self-Correction Protocol (Ralph Loop)
+
+ When verification fails:
+ 1. **Read the error** — Understand what broke
+ 2. **Diagnose root cause** — Don't just patch symptoms
+ 3. **Fix systematically** — Update code, tests, and docs together
+ 4. **Re-verify** — Run full verification again
+ 5. **Iterate** — Repeat until green
+
+ ## Verification Commands
+ - Backend: pytest --tb=short
+ - Frontend: npm run build && npm run lint
+ - E2E: npx playwright test
+
+ Key principle: NEVER commit without running verification.`,
  };
  export const METHODOLOGY_TOPIC_LIST = Object.keys(METHODOLOGY_TOPICS);
  //# sourceMappingURL=methodology.js.map
@@ -1 +1 @@
- {"version":3,"file":"methodology.js","sourceRoot":"","sources":["../../src/knowledge/methodology.ts"],"names":[],"mappings":"AAAA;;GAEG;AAEH,MAAM,CAAC,MAAM,kBAAkB,GAA2B;IACvD,QAAQ,EAAE;;;;;;;;;;gEAUmD;IAE7D,IAAI,EAAE;;;;;;;;;oDAS2C;IAEjD,cAAc,EAAE;;;;;;;;;;;;;;;;;qEAiBkD;IAElE,kBAAkB,EAAE;;;;;;;;;;;;;;uDAcgC;IAEpD,iBAAiB,EAAE;;;;;;;;;6DASuC;IAE1D,WAAW,EAAE;;;;;;;;;;;;sDAYsC;IAEnD,iBAAiB,EAAE;;;;;;;;;;qEAU+C;IAElE,aAAa,EAAE;;;;;;;;+CAQ6B;IAE5C,mBAAmB,EAAE;;;;;;;;;;qDAU6B;IAElD,oBAAoB,EAAE;;;;;;;;;oDAS2B;IAEjD,gBAAgB,EAAE;;;;;;;;;;;;;;wDAcmC;IAErD,oBAAoB,EAAE;;;;;;;;;;;;;mEAa0C;IAEhE,aAAa,EAAE;;;;;;;;;;;;;+EAa6D;IAE5E,YAAY,EAAE;;;;;;;;;;;wDAWuC;CACvD,CAAC;AAEF,MAAM,CAAC,MAAM,sBAAsB,GAAG,MAAM,CAAC,IAAI,CAAC,kBAAkB,CAAC,CAAC"}
+ {"version":3,"file":"methodology.js","sourceRoot":"","sources":["../../src/knowledge/methodology.ts"],"names":[],"mappings":"AAAA;;GAEG;AAEH,MAAM,CAAC,MAAM,kBAAkB,GAA2B;IACvD,QAAQ,EAAE;;;;;;;;;;gEAUmD;IAE7D,IAAI,EAAE;;;;;;;;;oDAS2C;IAEjD,cAAc,EAAE;;;;;;;;;;;;;;;;;qEAiBkD;IAElE,kBAAkB,EAAE;;;;;;;;;;;;;;uDAcgC;IAEpD,iBAAiB,EAAE;;;;;;;;;6DASuC;IAE1D,WAAW,EAAE;;;;;;;;;;;;sDAYsC;IAEnD,iBAAiB,EAAE;;;;;;;;;;qEAU+C;IAElE,aAAa,EAAE;;;;;;;;+CAQ6B;IAE5C,mBAAmB,EAAE;;;;;;;;;;qDAU6B;IAElD,oBAAoB,EAAE;;;;;;;;;oDAS2B;IAEjD,gBAAgB,EAAE;;;;;;;;;;;;;;wDAcmC;IAErD,oBAAoB,EAAE;;;;;;;;;;;;;mEAa0C;IAEhE,aAAa,EAAE;;;;;;;;;;;;;+EAa6D;IAE5E,YAAY,EAAE;;;;;;;;;;;wDAWuC;IAErD,YAAY,EAAE;;;;;;;;;;;;;;;;sEAgBqD;IAEnE,cAAc,EAAE;;;;;;;;;;;;;;;;;;;;;;2FAsBwE;IAExF,eAAe,EAAE;;;;;;;;;;;;;;0DAcsC;CACzD,CAAC;AAEF,MAAM,CAAC,MAAM,sBAAsB,GAAG,MAAM,CAAC,IAAI,CAAC,kBAAkB,CAAC,CAAC"}
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "ta-studio-mcp",
- "version": "1.2.4",
+ "version": "1.3.0",
    "description": "TA Studio MCP — Domain knowledge, patterns, bug fixes, and workflows for AI agents working on the TA Studio mobile test automation platform.",
    "type": "module",
    "bin": {
@@ -56,4 +56,4 @@
    "engines": {
      "node": ">=18"
    }
- }
+ }