ta-studio-mcp 1.2.0 → 1.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/README.md +21 -20
  2. package/dist/index.js +1 -1
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -21,34 +21,35 @@ npx ta-studio-mcp
21
21
 
22
22
  ---
23
23
 
24
- ## 🧪 Methodologies & Deep Technical Lore
24
+ ## 🧠 Expert Knowledge & Deep Technical Lore
25
25
 
26
26
  This section documents the state-of-the-art implementations used by the TA Studio team.
27
27
 
28
- ### 1. Model Tiering (Jan 2026 Standard)
29
- We avoid "one size fits all" model selection. Models are tiered by "Thinking Budget":
30
- - **Thinking Tier (GPT-5.2)**: Used for high-level orchestration (Coordinator) and complex visual reasoning. reasoning effort: `high`.
31
- - **Core Tier (GPT-5-mini)**: Used for specialized specialists (Classifier, Verifier, Diagnosis). *Never use nano for classification.*
32
- - **Utility Tier (GPT-5-nano)**: Used for MCP tool formatting, data distillation, and search cleanup.
28
+ ### 1. Set-of-Mark (SoM) Screenshot Annotation
29
+ Based on OmniParser's SoM approach, we use color-coded, type-aware bounding boxes to provide visual anchors for agents.
30
+ - **Type-Aware Palette**: 9 distinct colors (e.g., **Dodger Blue** for buttons, **Orange** for inputs, **Purple** for toggles).
31
+ - **PIL Threading**: To prevent UI blocking during heavy drawing (50+ boxes), we use `asyncio.to_thread(_draw_bounding_boxes_threaded)`.
32
+ - **TOON Optimization**: **Token Optimized Object Notation** strips redundant XML metadata, reducing prompt tokens by 40% while maintaining coordinate precision.
33
+ - **Scaling Correction**: Screenshots are compressed to ~45% resolution. We apply `native_coord * (img_width / native_width)` to ensure pixel-perfect alignment.
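The scaling correction in the added README text can be illustrated with a minimal sketch. It assumes the screenshot is scaled uniformly, so one factor applies to both axes, and that boxes are `(left, top, right, bottom)` tuples in native device pixels. The function name `scale_bbox` and the rounding step are illustrative assumptions, not taken from the package.

```python
from typing import Tuple

BBox = Tuple[int, int, int, int]  # (left, top, right, bottom)


def scale_bbox(bbox: BBox, native_width: int, img_width: int) -> BBox:
    """Map a box from native device pixels to compressed-screenshot pixels.

    Applies native_coord * (img_width / native_width) to every coordinate.
    Assumes uniform scaling, so the same factor is valid for x and y.
    """
    if native_width <= 0 or img_width <= 0:
        raise ValueError("widths must be positive")
    factor = img_width / native_width
    left, top, right, bottom = (round(v * factor) for v in bbox)
    return (left, top, right, bottom)


# A 1080 px wide device captured at 45% resolution gives a 486 px wide image.
scaled = scale_bbox((100, 200, 300, 400), native_width=1080, img_width=486)
```

Drawing with the native coordinates on the 486 px image would place every box roughly 2.2 times too far from the origin, which matches the "Bbox Misalignment" entry later in the same README.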
33
34
 
34
- ### 2. Failure Taxonomy (OAVR "Reason")
35
+ ### 2. Model Tiering (Jan 2026 Standard)
36
+ Models are strictly tiered by "Thinking Budget":
37
+ - **Thinking Tier (GPT-5.2)**: Orchestration (Coordinator) and complex visual reasoning. reasoning effort: `high`.
38
+ - **Core Tier (GPT-5-mini)**: Specialists (Classifier, Verifier, Diagnosis). *Never use nano for classification.*
39
+ - **Utility Tier (GPT-5-nano)**: MCP formatting and data distillation.
40
+
41
+ ### 3. Failure Taxonomy (OAVR "Reason")
35
42
  The `Failure Diagnosis Specialist` uses a structured taxonomy to recover from errors:
36
43
  - **PLANNING_ERROR**: Incorrect action choice. *Recovery*: Backtrack/Retry.
37
- - **PERCEPTION_ERROR**: UI misrepresented in model. *Recovery*: Wait/Re-scan.
44
+ - **PERCEPTION_ERROR**: UI misrepresented. *Recovery*: Wait/Re-scan.
38
45
  - **ENVIRONMENT_ERROR**: App crash/Dialogs. *Recovery*: Handle OS dialog/Restart.
39
- - **EXECUTION_ERROR**: Click/Swipe failed to register. *Recovery*: Apply 5px jitter/Retry.
46
+ - **EXECUTION_ERROR**: Click/Swipe failed. *Recovery*: Apply 5px jitter/Retry.
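One way to read the taxonomy above is as a lookup from failure category to recovery action. The sketch below is hypothetical: the enum, the recovery labels, and the interpretation of "5px jitter" as a random offset of at most 5 pixels per axis are assumptions made for illustration, not the package's implementation.

```python
import random
from enum import Enum
from typing import Optional, Tuple


class FailureReason(Enum):
    PLANNING_ERROR = "PLANNING_ERROR"
    PERCEPTION_ERROR = "PERCEPTION_ERROR"
    ENVIRONMENT_ERROR = "ENVIRONMENT_ERROR"
    EXECUTION_ERROR = "EXECUTION_ERROR"


# Recovery labels are illustrative names for the actions listed in the README.
RECOVERY = {
    FailureReason.PLANNING_ERROR: "backtrack_or_retry",
    FailureReason.PERCEPTION_ERROR: "wait_and_rescan",
    FailureReason.ENVIRONMENT_ERROR: "handle_dialog_or_restart",
    FailureReason.EXECUTION_ERROR: "jitter_and_retry",
}


def jitter_tap(x: int, y: int, max_offset: int = 5,
               rng: Optional[random.Random] = None) -> Tuple[int, int]:
    """Return the tap point shifted by at most max_offset pixels on each axis."""
    rng = rng or random.Random()
    dx = rng.randint(-max_offset, max_offset)
    dy = rng.randint(-max_offset, max_offset)
    return (x + dx, y + dy)


action = RECOVERY[FailureReason.EXECUTION_ERROR]
retry_point = jitter_tap(540, 1200, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps retries reproducible in tests while leaving production retries randomized.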
40
47
 
41
- ### 3. Mobile MCP ADB Fallback
48
+ ### 4. Mobile MCP ADB Fallback
42
49
  Mobile MCP v0.0.36 fails if *any* device is offline. Our client implements a comprehensive ADB bridge:
43
50
  - **Fast Screenshot**: `exec-out screencap -p` (Base64 direct stream).
44
- - **Fast UI Dump**: `uiautomator dump /dev/tty` (No temp file I/O).
45
- - **Control**: Direct `input tap`, `input swipe`, and `am start -n` activity mapping.
46
-
47
- ### 4. Golden Bug Metrics
48
- We measure agent reliability via a two-stage deterministic pipeline:
49
- - **Planning Judge**: Static analysis + LLM verification before device boot.
50
- - **Execution Judge**: Reproduction and AI verification of goal state.
51
- - **Output**: Precision, Recall, and F1 scores aggregated in `data/agent_runs/golden`.
51
+ - **Fast UI Dump**: `uiautomator dump /dev/tty`.
52
+ - **Control**: Direct `input tap`, `input swipe`, and `am start -n`.
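The commands listed above can be assembled as argument lists for `subprocess.run`. This is a sketch under stated assumptions: the helper names are hypothetical, and pinning a device with adb's `-s <serial>` option is one plausible way to avoid depending on other attached devices, not necessarily what the TA Studio client does. Note that `adb exec-out screencap -p` streams raw PNG bytes, so any Base64 encoding would happen on the client side.

```python
from typing import List, Optional, Sequence


def adb_command(args: Sequence[str], serial: Optional[str] = None,
                adb_path: str = "adb") -> List[str]:
    """Build an adb argv list, optionally pinned to one device with -s."""
    cmd = [adb_path]
    if serial:
        cmd += ["-s", serial]
    return cmd + list(args)


def screenshot_cmd(serial: Optional[str] = None) -> List[str]:
    # Streams a PNG to stdout without writing a file on the device.
    return adb_command(["exec-out", "screencap", "-p"], serial)


def ui_dump_cmd(serial: Optional[str] = None) -> List[str]:
    # Writes the UI hierarchy XML to the terminal instead of a file.
    return adb_command(["shell", "uiautomator", "dump", "/dev/tty"], serial)


def tap_cmd(x: int, y: int, serial: Optional[str] = None) -> List[str]:
    return adb_command(["shell", "input", "tap", str(x), str(y)], serial)


def swipe_cmd(x1: int, y1: int, x2: int, y2: int, duration_ms: int = 300,
              serial: Optional[str] = None) -> List[str]:
    coords = [str(v) for v in (x1, y1, x2, y2, duration_ms)]
    return adb_command(["shell", "input", "swipe", *coords], serial)


def start_activity_cmd(component: str, serial: Optional[str] = None) -> List[str]:
    # component has the form "package/activity"; the caller supplies it.
    return adb_command(["shell", "am", "start", "-n", component], serial)
```

On a machine with adb installed, a screenshot could then be captured with `subprocess.run(screenshot_cmd(serial), capture_output=True, check=True).stdout`.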
52
53
 
53
54
  ---
54
55
 
@@ -59,8 +60,8 @@ We measure agent reliability via a two-stage deterministic pipeline:
59
60
  | **CRITICAL** | Bbox Misalignment | **RC**: 45% Scaling Delta. **Fix**: Apply `img_width / native_width` factor. |
60
61
  | **CRITICAL** | Async to_thread | **RC**: CORO vs CALL. **Fix**: Remove `async` from functions passed to `asyncio.to_thread`. |
61
62
  | **CRITICAL** | Race Condition | **RC**: Parallel sessions. **Fix**: `parallel_tool_calls=False` for sequential testing. |
62
- | **HIGH** | Simulation Leak | **RC**: Memory persistence. **Fix**: 24h/100-run auto-purge with `asyncio.Lock` safety. |
63
- | **HIGH** | Figma Rate Limit | **RC**: API 429 status. **Fix**: Direct CV overlay via Playwright + brightness detection. |
63
+ | **HIGH** | Simulation Leak | **RC**: Memory persistence. **Fix**: 24h/100-run auto-purge with `asyncio.Lock`. |
64
+ | **HIGH** | Figma Rate Limit | **RC**: API 429 status. **Fix**: Direct CV overlay via high-contrast brightness detection. |
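The "Async to_thread" row in the table above can be reproduced with a small, self-contained sketch (Python 3.9+). `asyncio.to_thread` calls the function it is given in a worker thread and returns that call's result. If the function is declared `async`, the call only creates a coroutine object and the body never runs. The function names here are hypothetical stand-ins for the drawing routine.

```python
import asyncio
import inspect


def draw_sync(count: int) -> int:
    """Stand-in for blocking drawing work, declared as a plain function."""
    return count * 2


async def draw_async(count: int) -> int:
    """Same body, but declared async; wrong for asyncio.to_thread."""
    return count * 2


async def main():
    # Correct: the plain function runs in a worker thread and returns 100.
    good = await asyncio.to_thread(draw_sync, 50)

    # Bug: the worker thread only *creates* the coroutine; its body never runs.
    bad = await asyncio.to_thread(draw_async, 50)
    got_coroutine = inspect.iscoroutine(bad)
    bad.close()  # suppress the "coroutine was never awaited" warning

    return good, got_coroutine


if __name__ == "__main__":
    print(asyncio.run(main()))  # (100, True)
```

Removing `async` from the function passed to `asyncio.to_thread`, as the table's fix says, turns the second case into the first.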
64
65
 
65
66
  ---
66
67
 
package/dist/index.js CHANGED
@@ -14,7 +14,7 @@ import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
14
14
  import { registerAllTools } from './tools/register-all.js';
15
15
  const server = new McpServer({
16
16
  name: 'ta-studio-mcp',
17
- version: '1.2.0',
17
+ version: '1.2.1',
18
18
  }, {
19
19
  capabilities: {
20
20
  logging: {},
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "ta-studio-mcp",
3
- "version": "1.2.0",
3
+ "version": "1.2.1",
4
4
  "description": "TA Studio MCP — Domain knowledge, patterns, bug fixes, and workflows for AI agents working on the TA Studio mobile test automation platform.",
5
5
  "type": "module",
6
6
  "bin": {