npm - ta-studio-mcp - Versions diffs - 1.2.0 → 1.2.1 - Mend

ta-studio-mcp 1.2.0 → 1.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3) hide show

package/README.md CHANGED Viewed

@@ -21,34 +21,35 @@ npx ta-studio-mcp
 ---
-## 🧪 Methodologies & Deep Technical Lore
+## 🧠 Expert Knowledge & Deep Technical Lore
 This section documents the state-of-the-art implementations used by the TA Studio team.
-### 1. Model Tiering (Jan 2026 Standard)
-We avoid "one size fits all" model selection. Models are tiered by "Thinking Budget":
-- **Thinking Tier (GPT-5.2)**: Used for high-level orchestration (Coordinator) and complex visual reasoning. reasoning effort: `high`.
-- **Core Tier (GPT-5-mini)**: Used for specialized specialists (Classifier, Verifier, Diagnosis). *Never use nano for classification.*
-- **Utility Tier (GPT-5-nano)**: Used for MCP tool formatting, data distillation, and search cleanup.
+### 1. Set-of-Mark (SoM) Screenshot Annotation
+Based on OmniParser's SoM approach, we use color-coded, type-aware bounding boxes to provide visual anchors for agents.
+- **Type-Aware Palette**: 9 distinct colors (e.g., **Dodger Blue** for buttons, **Orange** for inputs, **Purple** for toggles).
+- **PIL Threading**: To prevent UI blocking during heavy drawing (50+ boxes), we use `asyncio.to_thread(_draw_bounding_boxes_threaded)`.
+- **TOON Optimization**: **Token Optimized Object Notation** strips redundant XML metadata, reducing prompt tokens by 40% while maintaining coordinate precision.
+- **Scaling Correction**: Screenshots are compressed to ~45% resolution. We apply `native_coord * (img_width / native_width)` to ensure pixel-perfect alignment.
-### 2. Failure Taxonomy (OAVR "Reason")
+### 2. Model Tiering (Jan 2026 Standard)
+Models are strictly tiered by "Thinking Budget":
+- **Thinking Tier (GPT-5.2)**: Orchestration (Coordinator) and complex visual reasoning. reasoning effort: `high`.
+- **Core Tier (GPT-5-mini)**: Specialists (Classifier, Verifier, Diagnosis). *Never use nano for classification.*
+- **Utility Tier (GPT-5-nano)**: MCP formatting and data distillation.
+### 3. Failure Taxonomy (OAVR "Reason")
 The `Failure Diagnosis Specialist` uses a structured taxonomy to recover from errors:
 - **PLANNING_ERROR**: Incorrect action choice. *Recovery*: Backtrack/Retry.
-- **PERCEPTION_ERROR**: UI misrepresented in model. *Recovery*: Wait/Re-scan.
+- **PERCEPTION_ERROR**: UI misrepresented. *Recovery*: Wait/Re-scan.
 - **ENVIRONMENT_ERROR**: App crash/Dialogs. *Recovery*: Handle OS dialog/Restart.
-- **EXECUTION_ERROR**: Click/Swipe failed to register. *Recovery*: Apply 5px jitter/Retry.
+- **EXECUTION_ERROR**: Click/Swipe failed. *Recovery*: Apply 5px jitter/Retry.
-### 3. Mobile MCP ADB Fallback
+### 4. Mobile MCP ADB Fallback
 Mobile MCP v0.0.36 fails if *any* device is offline. Our client implements a comprehensive ADB bridge:
 - **Fast Screenshot**: `exec-out screencap -p` (Base64 direct stream).
-- **Fast UI Dump**: `uiautomator dump /dev/tty` (No temp file I/O).
-- **Control**: Direct `input tap`, `input swipe`, and `am start -n` activity mapping.
-### 4. Golden Bug Metrics
-We measure agent reliability via a two-stage deterministic pipeline:
-- **Planning Judge**: Static analysis + LLM verification before device boot.
-- **Execution Judge**: Reproduction and AI verification of goal state.
-- **Output**: Precision, Recall, and F1 scores aggregated in `data/agent_runs/golden`.
+- **Fast UI Dump**: `uiautomator dump /dev/tty`.
+- **Control**: Direct `input tap`, `input swipe`, and `am start -n`.
 ---
@@ -59,8 +60,8 @@ We measure agent reliability via a two-stage deterministic pipeline:
 | **CRITICAL** | Bbox Misalignment | **RC**: 45% Scaling Delta. **Fix**: Apply `img_width / native_width` factor. |
 | **CRITICAL** | Async to_thread | **RC**: CORO vs CALL. **Fix**: Remove `async` from functions passed to `asyncio.to_thread`. |
 | **CRITICAL** | Race Condition | **RC**: Parallel sessions. **Fix**: `parallel_tool_calls=False` for sequential testing. |
-| **HIGH** | Simulation Leak | **RC**: Memory persistence. **Fix**: 24h/100-run auto-purge with `asyncio.Lock` safety. |
-| **HIGH** | Figma Rate Limit | **RC**: API 429 status. **Fix**: Direct CV overlay via Playwright + brightness detection. |
+| **HIGH** | Simulation Leak | **RC**: Memory persistence. **Fix**: 24h/100-run auto-purge with `asyncio.Lock`. |
+| **HIGH** | Figma Rate Limit | **RC**: API 429 status. **Fix**: Direct CV overlay via high-contrast brightness detection. |
 ---

package/dist/index.js CHANGED Viewed

@@ -14,7 +14,7 @@ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
 import { registerAllTools } from './tools/register-all.js';
 const server = new McpServer({
     name: 'ta-studio-mcp',
-    version: '1.2.0',
+    version: '1.2.1',
 }, {
     capabilities: {
         logging: {},

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "ta-studio-mcp",
-  "version": "1.2.0",
+  "version": "1.2.1",
   "description": "TA Studio MCP — Domain knowledge, patterns, bug fixes, and workflows for AI agents working on the TA Studio mobile test automation platform.",
   "type": "module",
   "bin": {