mstro-app 0.1.54 → 0.1.57

Files changed (57)
  1. package/bin/mstro.js +2 -1
  2. package/dist/server/cli/headless/claude-invoker.d.ts.map +1 -1
  3. package/dist/server/cli/headless/claude-invoker.js +151 -0
  4. package/dist/server/cli/headless/claude-invoker.js.map +1 -1
  5. package/dist/server/cli/headless/runner.d.ts.map +1 -1
  6. package/dist/server/cli/headless/runner.js +7 -1
  7. package/dist/server/cli/headless/runner.js.map +1 -1
  8. package/dist/server/cli/headless/stall-assessor.d.ts +30 -0
  9. package/dist/server/cli/headless/stall-assessor.d.ts.map +1 -0
  10. package/dist/server/cli/headless/stall-assessor.js +184 -0
  11. package/dist/server/cli/headless/stall-assessor.js.map +1 -0
  12. package/dist/server/cli/headless/types.d.ts +9 -1
  13. package/dist/server/cli/headless/types.d.ts.map +1 -1
  14. package/dist/server/cli/improvisation-session-manager.d.ts +21 -2
  15. package/dist/server/cli/improvisation-session-manager.d.ts.map +1 -1
  16. package/dist/server/cli/improvisation-session-manager.js +65 -5
  17. package/dist/server/cli/improvisation-session-manager.js.map +1 -1
  18. package/dist/server/index.js +4 -1
  19. package/dist/server/index.js.map +1 -1
  20. package/dist/server/mcp/bouncer-integration.d.ts.map +1 -1
  21. package/dist/server/mcp/bouncer-integration.js +32 -0
  22. package/dist/server/mcp/bouncer-integration.js.map +1 -1
  23. package/dist/server/services/platform.d.ts.map +1 -1
  24. package/dist/server/services/platform.js +8 -5
  25. package/dist/server/services/platform.js.map +1 -1
  26. package/dist/server/services/settings.d.ts +25 -0
  27. package/dist/server/services/settings.d.ts.map +1 -0
  28. package/dist/server/services/settings.js +72 -0
  29. package/dist/server/services/settings.js.map +1 -0
  30. package/dist/server/services/websocket/autocomplete.d.ts.map +1 -1
  31. package/dist/server/services/websocket/autocomplete.js +12 -15
  32. package/dist/server/services/websocket/autocomplete.js.map +1 -1
  33. package/dist/server/services/websocket/handler.d.ts +99 -2
  34. package/dist/server/services/websocket/handler.d.ts.map +1 -1
  35. package/dist/server/services/websocket/handler.js +618 -157
  36. package/dist/server/services/websocket/handler.js.map +1 -1
  37. package/dist/server/services/websocket/session-registry.d.ts +38 -0
  38. package/dist/server/services/websocket/session-registry.d.ts.map +1 -0
  39. package/dist/server/services/websocket/session-registry.js +154 -0
  40. package/dist/server/services/websocket/session-registry.js.map +1 -0
  41. package/dist/server/services/websocket/types.d.ts +2 -2
  42. package/dist/server/services/websocket/types.d.ts.map +1 -1
  43. package/package.json +2 -2
  44. package/server/cli/headless/RESEARCH.md +627 -0
  45. package/server/cli/headless/claude-invoker.ts +192 -1
  46. package/server/cli/headless/runner.ts +7 -1
  47. package/server/cli/headless/stall-assessor.ts +245 -0
  48. package/server/cli/headless/types.ts +9 -1
  49. package/server/cli/improvisation-session-manager.ts +73 -5
  50. package/server/index.ts +4 -1
  51. package/server/mcp/bouncer-integration.ts +32 -0
  52. package/server/services/platform.ts +8 -5
  53. package/server/services/settings.ts +89 -0
  54. package/server/services/websocket/autocomplete.ts +18 -14
  55. package/server/services/websocket/handler.ts +677 -170
  56. package/server/services/websocket/session-registry.ts +180 -0
  57. package/server/services/websocket/types.ts +31 -2
@@ -0,0 +1,627 @@
1
+ # Claude Code Headless API & Process Management: Comprehensive Research
2
+
3
+ **Date:** 2026-02-15
4
+ **Context:** Research for Mstro platform -- managing Claude Code subprocesses from the CLI relay component.
5
+
6
+ ---
7
+
8
+ ## Table of Contents
9
+
10
+ 1. [SDK & API Overview](#1-sdk--api-overview)
11
+ 2. [Headless Mode (CLI -p Flag)](#2-headless-mode-cli--p-flag)
12
+ 3. [TypeScript Agent SDK (Programmatic)](#3-typescript-agent-sdk-programmatic)
13
+ 4. [Streaming & Progress Detection](#4-streaming--progress-detection)
14
+ 5. [Timeout Management](#5-timeout-management)
15
+ 6. [AbortController & Process Termination](#6-abortcontroller--process-termination)
16
+ 7. [Known Stall/Hang Issues & Patterns](#7-known-stallhang-issues--patterns)
17
+ 8. [Best Practices for Remote/Automation Execution](#8-best-practices-for-remoteautomation-execution)
18
+ 9. [Recommendations for Mstro](#9-recommendations-for-mstro)
19
+
20
+ ---
21
+
22
+ ## 1. SDK & API Overview
23
+
24
+ ### Package Evolution
25
+
26
+ The SDK has gone through a naming transition:
27
+
28
+ - **Old:** `@anthropic-ai/claude-code` (npm) -- now deprecated for installation via npm
29
+ - **Current:** `@anthropic-ai/claude-agent-sdk` (npm, version 0.2.x as of Feb 2026)
30
+ - **Python:** `claude-agent-sdk` (PyPI, imported as `claude_agent_sdk`), formerly `claude-code-sdk`
31
+
32
+ The "headless mode" terminology has been deprecated in favor of "Agent SDK." The `-p` flag and all CLI options work the same way, but the SDK packages provide richer programmatic control.
33
+
34
+ ### Two Interface Generations
35
+
36
+ **V1 (Stable):** Async generator pattern via `query()` function.
37
+
38
+ ```typescript
39
+ import { query } from "@anthropic-ai/claude-agent-sdk";
40
+
41
+ for await (const message of query({
42
+ prompt: "Fix the bug",
43
+ options: { maxTurns: 10, abortController: controller }
44
+ })) {
45
+ // Handle messages
46
+ }
47
+ ```
48
+
49
+ **V2 (Unstable Preview):** Session-based `send()`/`stream()` pattern.
50
+
51
+ ```typescript
52
+ import { unstable_v2_createSession } from "@anthropic-ai/claude-agent-sdk";
53
+
54
+ await using session = unstable_v2_createSession({ model: "claude-opus-4-6" });
55
+ await session.send("Fix the bug");
56
+ for await (const msg of session.stream()) {
57
+ // Handle messages
58
+ }
59
+ ```
60
+
61
+ The V2 interface is explicitly marked as unstable. V1 is the production-ready interface.
62
+
63
+ ### Key Interfaces (V1)
64
+
65
+ The `query()` function returns a `Query` object that extends `AsyncGenerator<SDKMessage, void>` with:
66
+
67
+ - `interrupt()` -- Interrupts the query (streaming input mode only)
68
+ - `rewindFiles(userMessageUuid)` -- Restores files to a checkpoint
69
+ - `setPermissionMode(mode)` -- Changes permission mode at runtime
70
+ - `setModel(model)` -- Changes model at runtime
71
+
72
+ ---
73
+
74
+ ## 2. Headless Mode (CLI -p Flag)
75
+
76
+ ### Basic Usage
77
+
78
+ ```bash
79
+ claude -p "Find and fix the bug in auth.py" --allowedTools "Read,Edit,Bash"
80
+ ```
81
+
82
+ ### Output Formats
83
+
84
+ | Format | Flag | Description |
85
+ |--------|------|-------------|
86
+ | `text` | `--output-format text` | Plain text (default) |
87
+ | `json` | `--output-format json` | Structured JSON with `result`, `session_id`, metadata |
88
+ | `stream-json` | `--output-format stream-json` | Newline-delimited JSON (NDJSON), real-time streaming |
89
+
90
+ ### Streaming with stream-json
91
+
92
+ To get real-time token-by-token output:
93
+
94
+ ```bash
95
+ claude -p "Explain recursion" \
96
+ --output-format stream-json \
97
+ --verbose \
98
+ --include-partial-messages
99
+ ```
100
+
101
+ Each line is a JSON object. Filter for text deltas:
102
+
103
+ ```bash
104
+ claude -p "Write a poem" --output-format stream-json --verbose --include-partial-messages | \
105
+ jq -rj 'select(.type == "stream_event" and .event.delta.type? == "text_delta") | .event.delta.text'
106
+ ```
107
+
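For callers that consume the NDJSON stream from code rather than through `jq`, the same filter can be sketched in TypeScript. This is a minimal sketch, not the SDK's own parser: the `StreamLine` shape and the `extractTextDeltas` name are assumptions made for illustration, and only the `stream_event` / `text_delta` fields shown in the `jq` filter above are relied on.

```typescript
// Minimal local shape for the lines this sketch inspects.
// Assumption: mirrors only the fields used by the jq filter, not the full SDK type.
interface StreamLine {
  type?: string;
  event?: { delta?: { type?: string; text?: string } };
}

// Concatenate the text deltas found in NDJSON output.
// Blank lines and lines that are not valid JSON objects are skipped.
function extractTextDeltas(ndjson: string): string {
  let text = "";
  for (const line of ndjson.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // partial or non-JSON line
    }
    if (parsed === null || typeof parsed !== "object") continue;

    const candidate = parsed as StreamLine;
    const delta = candidate.event?.delta;
    if (
      candidate.type === "stream_event" &&
      delta?.type === "text_delta" &&
      typeof delta.text === "string"
    ) {
      text += delta.text;
    }
  }
  return text;
}
```

In a real relay the input would arrive in chunks, so the caller would need to buffer until a newline before handing complete lines to a function like this.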
108
+ ### Session Continuity
109
+
110
+ ```bash
111
+ # Capture session ID
112
+ session_id=$(claude -p "Start a review" --output-format json | jq -r '.session_id')
113
+ # Resume later
114
+ claude -p "Continue that review" --resume "$session_id"
115
+ ```
116
+
117
+ ### Known Limitation
118
+
119
+ Claude CLI in headless mode returns empty output when processing large stdin input (~7000+ characters). Smaller inputs (~2500 characters) work correctly. (GitHub issue #7263)
120
+
121
+ ---
122
+
123
+ ## 3. TypeScript Agent SDK (Programmatic)
124
+
125
+ ### Core Options
126
+
127
+ ```typescript
128
+ interface Options {
129
+ abortController?: AbortController; // For cancelling operations
130
+ allowedTools?: string[]; // Allowed tool names
131
+ cwd?: string; // Working directory
132
+ maxTurns?: number; // Maximum conversation turns
133
+ maxBudgetUsd?: number; // Maximum budget in USD
134
+ maxThinkingTokens?: number; // Max tokens for thinking
135
+ includePartialMessages?: boolean; // Enable streaming events
136
+ model?: string; // Claude model to use
137
+ permissionMode?: PermissionMode; // default | acceptEdits | bypassPermissions | plan
138
+ systemPrompt?: string | { type: 'preset'; preset: 'claude_code'; append?: string };
139
+ tools?: string[] | { type: 'preset'; preset: 'claude_code' };
140
+ env?: Dict<string>; // Environment variables
141
+ stderr?: (data: string) => void; // Callback for stderr output
142
+ hooks?: Partial<Record<HookEvent, HookCallbackMatcher[]>>;
143
+ fallbackModel?: string; // Model to use if primary fails
144
+ resume?: string; // Session ID to resume
145
+ settingSources?: SettingSource[]; // Which settings files to load
146
+ // ... many more options
147
+ }
148
+ ```
149
+
150
+ ### Message Types
151
+
152
+ The SDK yields these message types via the async generator:
153
+
154
+ | Type | Description |
155
+ |------|-------------|
156
+ | `SDKAssistantMessage` | Complete assistant response (after generation finishes) |
157
+ | `SDKUserMessage` | User input message |
158
+ | `SDKResultMessage` | Final result with duration, cost, usage, errors |
159
+ | `SDKSystemMessage` | System init message with tools, model, session info |
160
+ | `SDKPartialAssistantMessage` | Streaming partial (only with `includePartialMessages: true`) |
161
+ | `SDKCompactBoundaryMessage` | Conversation compaction boundary |
162
+
163
+ ### Result Message Structure (Critical for Detecting Completion)
164
+
165
+ ```typescript
166
+ type SDKResultMessage =
167
+ | {
168
+ type: "result";
169
+ subtype: "success";
170
+ duration_ms: number;
171
+ duration_api_ms: number;
172
+ is_error: boolean;
173
+ num_turns: number;
174
+ result: string;
175
+ total_cost_usd: number;
176
+ usage: NonNullableUsage;
177
+ }
178
+ | {
179
+ type: "result";
180
+ subtype: "error_max_turns"
181
+ | "error_during_execution"
182
+ | "error_max_budget_usd"
183
+ | "error_max_structured_output_retries";
184
+ errors: string[];
185
+ // ... same metadata fields
186
+ }
187
+ ```
188
+
189
+ **Key insight:** The `subtype` field distinguishes success from various error terminations. For Mstro, this is the definitive signal that Claude Code has finished.
190
+
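One way to act on the `subtype` field is a small helper that collapses a result message into a single outcome record. The sketch below is illustrative only: `ResultLike` is a trimmed local copy of the union shown above (metadata fields omitted), and `summarizeResult` and `Outcome` are hypothetical names, not SDK exports.

```typescript
// Trimmed local copy of the result union shown above (assumption: only the
// fields needed for classification are included).
type ResultLike =
  | { type: "result"; subtype: "success"; is_error: boolean; result: string }
  | {
      type: "result";
      subtype:
        | "error_max_turns"
        | "error_during_execution"
        | "error_max_budget_usd"
        | "error_max_structured_output_retries";
      errors: string[];
    };

// Hypothetical summary record for the caller.
interface Outcome {
  ok: boolean;
  detail: string;
}

// Classify a result message: success still checks is_error, and every
// error_* subtype is reported with its subtype name and joined errors.
function summarizeResult(msg: ResultLike): Outcome {
  if (msg.subtype === "success") {
    return { ok: !msg.is_error, detail: msg.result };
  }
  return { ok: false, detail: `${msg.subtype}: ${msg.errors.join("; ")}` };
}
```

Checking `is_error` on the success branch is a defensive choice in this sketch, since the type above declares it as a plain `boolean` rather than the literal `false`.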
191
+ ### Hook Events for Monitoring
192
+
193
+ The SDK supports hooks that fire at specific lifecycle points:
194
+
195
+ ```typescript
196
+ type HookEvent =
197
+ | "PreToolUse" // Before a tool executes
198
+ | "PostToolUse" // After a tool executes
199
+ | "PostToolUseFailure" // After a tool fails
200
+ | "Notification" // Claude sends a notification
201
+ | "SessionStart" // Session begins
202
+ | "SessionEnd" // Session ends
203
+ | "Stop" // Agent stops
204
+ | "SubagentStart" // Subagent launched
205
+ | "SubagentStop" // Subagent stopped
206
+ | "PreCompact" // Before context compaction
207
+ | "PermissionRequest"; // Permission needed
208
+ ```
209
+
210
+ ---
211
+
212
+ ## 4. Streaming & Progress Detection
213
+
214
+ ### How to Know Claude Code is Working
215
+
216
+ When `includePartialMessages: true` is set, the SDK emits `stream_event` messages containing raw Claude API streaming events. The event flow is:
217
+
218
+ ```
219
+ StreamEvent (message_start)
220
+ StreamEvent (content_block_start) - text block
221
+ StreamEvent (content_block_delta) - text chunks... <-- ACTIVE TEXT GENERATION
222
+ StreamEvent (content_block_stop)
223
+ StreamEvent (content_block_start) - tool_use block <-- TOOL CALL STARTING
224
+ StreamEvent (content_block_delta) - tool input chunks...
225
+ StreamEvent (content_block_stop)
226
+ StreamEvent (message_delta)
227
+ StreamEvent (message_stop)
228
+ AssistantMessage - complete message <-- TURN COMPLETE
229
+ ... tool executes (GAP WITH NO EVENTS) ... <-- TOOL EXECUTING
230
+ ... more streaming events for next turn ...
231
+ ResultMessage - final result <-- DONE
232
+ ```
233
+
234
+ ### Critical Gap: Tool Execution Silence
235
+
236
+ **The biggest challenge for stall detection:** Between when Claude decides to call a tool and when the tool result is returned, there are NO streaming events from the SDK. During a `Bash` command execution, for example, the process is alive and working but produces no output through the SDK's message stream.
237
+
238
+ ### Detecting Tool Calls in Progress
239
+
240
+ Use `content_block_start` with `type === "tool_use"` to detect when a tool call begins:
241
+
242
+ ```typescript
243
+ if (event.type === "content_block_start") {
244
+ if (event.content_block.type === "tool_use") {
245
+ currentTool = event.content_block.name;
246
+ // Mark: tool execution phase started
247
+ }
248
+ }
249
+ ```
250
+
251
+ ### The stderr Callback
252
+
253
+ The `stderr` option provides a callback for Claude Code's subprocess stderr output:
254
+
255
+ ```typescript
256
+ options: {
257
+ stderr: (data: string) => void; // Callback for stderr output
258
+ }
259
+ ```
260
+
261
+ This can potentially be used to detect process-level activity even when the SDK message stream is silent.
262
+
263
+ ### Known Limitation: Extended Thinking Blocks Streaming
264
+
265
+ When `maxThinkingTokens` is explicitly set, `StreamEvent` messages are NOT emitted. Only complete messages are yielded after each turn. Since thinking is disabled by default in the SDK, streaming works unless you enable extended thinking.
266
+
267
+ ---
268
+
269
+ ## 5. Timeout Management
270
+
271
+ ### Environment Variables
272
+
273
+ | Variable | Default | Purpose |
274
+ |----------|---------|---------|
275
+ | `BASH_DEFAULT_TIMEOUT_MS` | `120000` (2 min) | Default bash command timeout |
276
+ | `BASH_MAX_TIMEOUT_MS` | Not documented | Maximum allowable timeout |
277
+
278
+ ### Configuration Location
279
+
280
+ Configure in `~/.claude/settings.json`:
281
+
282
+ ```json
283
+ {
284
+ "env": {
285
+ "BASH_DEFAULT_TIMEOUT_MS": "1800000",
286
+ "BASH_MAX_TIMEOUT_MS": "7200000"
287
+ }
288
+ }
289
+ ```
290
+
291
+ **Important:** A full application restart is required for settings to take effect. Shell `export` alone does NOT work reliably.
292
+
293
+ ### Per-Tool Timeout
294
+
295
+ The `Bash` tool input supports a per-command timeout:
296
+
297
+ ```typescript
298
+ interface BashInput {
299
+ command: string;
300
+ timeout?: number; // Optional timeout in ms (max 600000 = 10 min)
301
+ }
302
+ ```
303
+
304
+ ### SDK-Level Controls
305
+
306
+ | Control | Description |
307
+ |---------|-------------|
308
+ | `maxTurns` | Hard cap on conversation turns |
309
+ | `maxBudgetUsd` | Hard cap on spending |
310
+ | `abortController` | Programmatic cancellation |
311
+
312
+ **There is no built-in overall wall-clock timeout in the SDK.** You must implement this yourself.
313
+
314
+ ### Known Issue: BASH_DEFAULT_TIMEOUT_MS Ignored
315
+
316
+ GitHub issue #3964 reports that bash commands ignore `BASH_DEFAULT_TIMEOUT_MS` and hang indefinitely. The timeout only works when explicitly specified in the Bash tool's `timeout` parameter. This is a significant reliability concern.
317
+
318
+ ---
319
+
320
+ ## 6. AbortController & Process Termination
321
+
322
+ ### Correct Usage (Critical)
323
+
324
+ The AbortController MUST be passed inside the `options` object, NOT as a top-level parameter:
325
+
326
+ ```typescript
327
+ // WRONG -- abort signal is ignored
328
+ query({
329
+ prompt: "...",
330
+ abortController: controller // BUG: not respected here
331
+ })
332
+
333
+ // CORRECT -- abort signal is respected
334
+ query({
335
+ prompt: "...",
336
+ options: {
337
+ abortController: controller // Works here
338
+ }
339
+ })
340
+ ```
341
+
342
+ This was a known bug (issue #2970) that was resolved by documentation clarification, not by fixing the top-level parameter.
343
+
344
+ ### AbortController Behavior
345
+
346
+ Even when correctly placed, there are limitations:
347
+
348
+ 1. **Queued tool calls may still complete.** The abort signal does not immediately kill in-flight tool executions.
349
+ 2. **Subagent cascade termination:** In v1.0.62+, all subagents share a single AbortController. One failure kills all subagents (issue #6594).
350
+ 3. **Post-abort session resume can fail:** Using abortController immediately after init causes subsequent `resume` calls to fail with "No conversation found" (claude-agent-sdk-typescript issue #69).
351
+
352
+ ### Process Cleanup Concerns
353
+
354
+ - Claude Code spawns child processes (Bash commands, subagents) that may not be properly cleaned up on abort.
355
+ - In Docker containers, background process termination can crash Claude Code because Claude Code and spawned processes share the same process group. Killing the process group kills Claude Code itself (issue #16135).
356
+ - On macOS with broken `pgrep`, Claude Code's child-process tracking loop can spawn thousands of zombie processes until hitting per-user process limits.
357
+
358
+ ### Recommended Abort Pattern
359
+
360
+ ```typescript
361
+ const controller = new AbortController();
362
+
363
+ // Set a wall-clock timeout
364
+ let timeout = setTimeout(() => {
365
+ controller.abort();
366
+ }, MAX_EXECUTION_MS);
367
+
368
+ try {
369
+ for await (const message of query({
370
+ prompt: taskPrompt,
371
+ options: {
372
+ abortController: controller,
373
+ maxTurns: 20,
374
+ includePartialMessages: true,
375
+ }
376
+ })) {
377
+ // Reset timeout on each message (activity detected)
378
+ clearTimeout(timeout);
379
+ timeout = setTimeout(() => controller.abort(), IDLE_TIMEOUT_MS);
380
+
381
+ handleMessage(message);
382
+ }
383
+ } finally {
384
+ clearTimeout(timeout);
385
+ }
386
+ ```
387
+
388
+ ---
389
+
390
+ ## 7. Known Stall/Hang Issues & Patterns
391
+
392
+ ### Issue Catalog
393
+
394
+ | Issue | Summary | Status |
395
+ |-------|---------|--------|
396
+ | #4744 | Agent execution timeout: persistent hanging during complex tasks (800-900s+) | Closed |
397
+ | #17711 | Interactive CLI degrades over time, leading to UI lag and repeated timeout errors | Open |
398
+ | #15945 | MCP server causes 16+ hour hang with no timeout or stuck detection | Open |
399
+ | #6857 | Bash command execution hangs and exceeds timeout limits | Open |
400
+ | #3964 | Bash commands ignore BASH_DEFAULT_TIMEOUT_MS and hang indefinitely | Open |
401
+ | #1554 | Hanging/freezing mid-work, hung indefinitely | Closed |
402
+ | #619 | CLI hangs or becomes unresponsive in WSL | Open |
403
+ | #2970 | AbortController not respected (fixed via docs clarification) | Closed |
404
+ | #6594 | Subagent termination bug: one failure kills all subagents | Open |
405
+ | #18532 | Complete freeze, 100% CPU, main thread stuck in infinite loop | Open |
406
+ | #15012 | Claude update detects its own spawned process as "another instance running" | Open |
407
+ | #7263 | Empty output with large stdin input in headless mode | Open |
408
+
409
+ ### Common Stall Patterns
410
+
411
+ 1. **MCP Server Hangs:** MCP servers that become unresponsive (e.g., waiting for sudo password) block Claude Code indefinitely with no timeout.
412
+ 2. **Subagent Recursive Spawning:** `Task()` tool can create infinite loops of subagents calling subagents.
413
+ 3. **Context Accumulation:** After many turns, context grows until the system becomes sluggish and eventually stalls.
414
+ 4. **Bash Command Hangs:** Long-running or interactive bash commands that ignore configured timeouts.
415
+ 5. **API Timeout Loops:** API request timeouts trigger retry loops that themselves get stuck.
416
+ 6. **Process Group Issues:** Killing a process group in Docker kills Claude Code itself.
417
+
418
+ ### What Happens When Claude Code Stalls
419
+
420
+ Based on the issues:
421
+ - **No error output.** Silent failures are common.
422
+ - **CPU may spike to 100%** or the process may sit idle at 0%.
423
+ - **No heartbeat mechanism.** There is no built-in liveness signal.
424
+ - **Process stays alive.** The Node.js process remains running; it just stops producing output.
425
+ - **Zombie children accumulate.** Child processes from bash commands pile up.
426
+
427
+ ---
428
+
429
+ ## 8. Best Practices for Remote/Automation Execution
430
+
431
+ ### From Official Documentation (GitHub Actions)
432
+
433
+ 1. **Set `--max-turns`** to cap conversation iterations (default: 10 in GitHub Actions).
434
+ 2. **Set workflow-level timeouts** (`timeout-minutes` in GitHub Actions).
435
+ 3. **Use `--allowedTools`** to restrict tool access.
436
+ 4. **Use `permissionMode: "acceptEdits"` or `"bypassPermissions"`** for unattended execution.
437
+ 5. **Monitor costs** via `maxBudgetUsd`.
438
+
439
+ ### From Community Experience
440
+
441
+ 1. **Implement your own wall-clock timeout.** The SDK has no built-in overall timeout. Use `AbortController` with `setTimeout`.
442
+ 2. **Implement idle detection.** Track time since last `stream_event` and abort if it exceeds a threshold.
443
+ 3. **Use `includePartialMessages: true`** to get real-time activity signals.
444
+ 4. **Cap process trees.** Wrap Claude Code in a process limiter (e.g., `CLAUDE_CODE_MAXPROC=1000`).
445
+ 5. **Clean up on exit.** Always kill child processes in a `finally` block. Consider `pkill -P <pid>` or process group cleanup.
446
+ 6. **Avoid MCP servers in automation** unless you control their timeout behavior.
447
+ 7. **Use `maxTurns`** as a safety net against infinite loops.
448
+
449
+ ### From Trigger.dev Integration
450
+
451
+ ```typescript
452
+ // AbortController integration with external cancellation
453
+ signal.addEventListener("abort", () => abortController.abort());
454
+
455
+ // Always use try/finally for cleanup
456
+ try {
457
+ for await (const message of query({ prompt, options })) {
458
+ // Process messages
459
+ }
460
+ } finally {
461
+ // Clean up temp directories, kill child processes
462
+ }
463
+ ```
464
+
465
+ ### Process Management in Remote Contexts
466
+
467
+ - Claude Code manages its own subprocess tree (bash shells, background processes).
468
+ - When running inside Docker/containers, be aware of process group issues (issue #16135).
469
+ - On Windows/WSL, use `CREATE_NEW_PROCESS_GROUP` for process tree management.
470
+ - Set `encoding="utf-8"` on Windows to avoid crashes on non-ASCII output.
471
+
472
+ ---
473
+
474
+ ## 9. Recommendations for Mstro
475
+
476
+ Based on this research, here are specific recommendations for the Mstro CLI component that manages Claude Code processes:
477
+
478
+ ### A. Use the TypeScript SDK, Not CLI Subprocess
479
+
480
+ Use `@anthropic-ai/claude-agent-sdk` with the V1 `query()` function rather than spawning `claude -p` as a child process. This provides:
481
+ - Typed message stream
482
+ - Proper AbortController integration
483
+ - Hook callbacks for lifecycle events
484
+ - No stdout/stderr parsing needed
485
+
486
+ ### B. Implement a Three-Layer Timeout System
487
+
488
+ ```
489
+ 1. IDLE TIMEOUT (per-message): Reset on every SDKMessage received.
490
+ If no message received for N seconds, assume stall.
491
+ Recommended: 120-300 seconds for complex tasks.
492
+
493
+ 2. TURN TIMEOUT (per-turn): Track time between AssistantMessage events.
494
+ Long tool executions (Bash, Task) can take minutes.
495
+ Recommended: 600 seconds per turn.
496
+
497
+ 3. WALL-CLOCK TIMEOUT (per-session): Total execution time cap.
498
+ Use AbortController with setTimeout.
499
+ Recommended: 1800 seconds (30 min) default, configurable per task.
500
+ ```
501
+
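The three layers above can be sketched as a small tracker that is fed message types and asked which limit, if any, has been exceeded. This is a sketch under stated assumptions rather than Mstro's implementation: `TimeoutTracker`, `TimeoutLimits`, and `TimeoutKind` are hypothetical names, timestamps are passed in explicitly so the logic can be tested without real timers, and the limits are whatever the caller chooses.

```typescript
// Hypothetical configuration for the three layers, all in milliseconds.
interface TimeoutLimits {
  idleMs: number;      // max silence between any two messages
  turnMs: number;      // max time between assistant messages
  wallClockMs: number; // max total session time
}

type TimeoutKind = "idle" | "turn" | "wall_clock";

class TimeoutTracker {
  private readonly limits: TimeoutLimits;
  private readonly startedAt: number;
  private lastMessageAt: number;
  private lastTurnAt: number;

  constructor(limits: TimeoutLimits, now: number) {
    this.limits = limits;
    this.startedAt = now;
    this.lastMessageAt = now;
    this.lastTurnAt = now;
  }

  // Record a message; every message resets the idle clock, and an
  // "assistant" message also resets the per-turn clock.
  onMessage(messageType: string, now: number): void {
    this.lastMessageAt = now;
    if (messageType === "assistant") this.lastTurnAt = now;
  }

  // Return the first exceeded limit, checking the broadest layer first.
  check(now: number): TimeoutKind | null {
    if (now - this.startedAt > this.limits.wallClockMs) return "wall_clock";
    if (now - this.lastTurnAt > this.limits.turnMs) return "turn";
    if (now - this.lastMessageAt > this.limits.idleMs) return "idle";
    return null;
  }
}
```

A caller would typically invoke `check(Date.now())` from a `setInterval` and call `abort()` on its `AbortController` when the result is non-null. Because tool execution produces no messages, an idle limit shorter than the longest expected tool run will fire on healthy sessions.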
502
+ ### C. Activity Detection via Streaming
503
+
504
+ Enable `includePartialMessages: true` and track:
505
+
506
+ ```typescript
507
+ let lastActivityTimestamp = Date.now();
508
+ let currentPhase: 'idle' | 'generating' | 'tool_executing' | 'done' = 'idle';
509
+
510
+ for await (const message of queryGenerator) {
511
+ lastActivityTimestamp = Date.now();
512
+
513
+ if (message.type === 'stream_event') {
514
+ const event = message.event;
515
+ if (event.type === 'content_block_start' && event.content_block?.type === 'tool_use') {
516
+ currentPhase = 'tool_executing';
517
+ // Relay tool name to web clients for status display
518
+ } else if (event.type === 'content_block_delta' && event.delta?.type === 'text_delta') {
519
+ currentPhase = 'generating';
520
+ // Relay text delta to web clients
521
+ }
522
+ } else if (message.type === 'assistant') {
523
+ currentPhase = 'tool_executing'; // Turn complete; any tool calls it contains run next
524
+ } else if (message.type === 'result') {
525
+ currentPhase = 'done';
526
+ }
527
+ }
528
+ ```
529
+
530
+ ### D. Watchdog Timer
531
+
532
+ Run a separate interval that checks `lastActivityTimestamp`:
533
+
534
+ ```typescript
535
+ const STALL_THRESHOLD_MS = 180_000; // 3 minutes with no messages
536
+
537
+ const watchdog = setInterval(() => {
538
+ const silentMs = Date.now() - lastActivityTimestamp;
539
+ if (silentMs > STALL_THRESHOLD_MS && currentPhase !== 'done') {
540
+ console.error(`Claude Code stall detected: ${silentMs}ms silent in phase ${currentPhase}`);
541
+ // Option 1: Abort and restart
542
+ abortController.abort();
543
+ // Option 2: Send notification to web clients
544
+ broadcastStallWarning(silentMs);
545
+ }
546
+ }, 30_000);
547
+ ```
548
+
549
+ ### E. Process Cleanup
550
+
551
+ ```typescript
552
+ // Ensure child process tree is cleaned up
553
+ process.on('exit', () => {
554
+ try {
555
+ // Kill the entire process group (a negative PID targets the group; this only works if the child was spawned with `detached: true`)
556
+ process.kill(-childProcess.pid, 'SIGKILL');
557
+ } catch (e) {
558
+ // Process may already be dead
559
+ }
560
+ });
561
+ ```
562
+
563
+ ### F. AbortController Placement
564
+
565
+ Always pass `abortController` inside `options`:
566
+
567
+ ```typescript
568
+ // CORRECT
569
+ query({
570
+ prompt: taskPrompt,
571
+ options: {
572
+ abortController: controller, // MUST be here
573
+ maxTurns: 20,
574
+ includePartialMessages: true,
575
+ }
576
+ })
577
+ ```
578
+
579
+ ### G. Relay Activity to Web Clients
580
+
581
+ Map SDK messages to WebSocket events for the web frontend:
582
+
583
+ | SDK Event | Web Client Message |
584
+ |-----------|-------------------|
585
+ | `stream_event` (text_delta) | `{ type: "text_delta", text: "..." }` |
586
+ | `stream_event` (content_block_start, tool_use) | `{ type: "tool_start", name: "Bash" }` |
587
+ | `stream_event` (content_block_stop) | `{ type: "tool_end" }` |
588
+ | `assistant` (complete) | `{ type: "assistant_turn_complete" }` |
589
+ | `result` (success) | `{ type: "task_complete", result: "..." }` |
590
+ | `result` (error_*) | `{ type: "task_error", errors: [...] }` |
591
+ | Watchdog stall detection | `{ type: "stall_warning", silentMs: N }` |
592
+
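The mapping in the table above could be sketched as a stateful translator. Everything here is an assumption made for illustration: `SdkLike` is a simplified local stand-in for the SDK message types, `WebClientMessage` follows the right-hand column of the table, and `createRelayMapper` is a hypothetical helper. The closure tracks whether the open content block is a `tool_use` block, because `content_block_stop` also fires for text blocks and should not be relayed as `tool_end` in that case.

```typescript
// Simplified stand-in for the SDK messages named in the table (assumption).
type SdkLike =
  | {
      type: "stream_event";
      event: {
        type: string;
        content_block?: { type?: string; name?: string };
        delta?: { type?: string; text?: string };
      };
    }
  | { type: "assistant" }
  | { type: "result"; subtype: string; result?: string; errors?: string[] };

// Web client messages, following the right-hand column of the table.
type WebClientMessage =
  | { type: "text_delta"; text: string }
  | { type: "tool_start"; name: string }
  | { type: "tool_end" }
  | { type: "assistant_turn_complete" }
  | { type: "task_complete"; result: string }
  | { type: "task_error"; errors: string[] };

// Build a mapper that remembers whether a tool_use block is currently open.
// Returns null for messages that have no row in the table.
function createRelayMapper(): (msg: SdkLike) => WebClientMessage | null {
  let toolBlockOpen = false;

  return (msg) => {
    if (msg.type === "assistant") return { type: "assistant_turn_complete" };
    if (msg.type === "result") {
      return msg.subtype === "success"
        ? { type: "task_complete", result: msg.result ?? "" }
        : { type: "task_error", errors: msg.errors ?? [msg.subtype] };
    }

    const { event } = msg;
    if (event.type === "content_block_start") {
      toolBlockOpen = event.content_block?.type === "tool_use";
      return toolBlockOpen
        ? { type: "tool_start", name: event.content_block?.name ?? "unknown" }
        : null;
    }
    if (event.type === "content_block_delta" && event.delta?.type === "text_delta") {
      return { type: "text_delta", text: event.delta.text ?? "" };
    }
    if (event.type === "content_block_stop" && toolBlockOpen) {
      toolBlockOpen = false;
      return { type: "tool_end" };
    }
    return null;
  };
}
```

Note that `tool_end` in this sketch marks the end of the tool input block in the stream, which is what the table specifies. It does not signal that the tool has finished executing.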
593
+ ### H. Configuration Recommendations
594
+
595
+ ```typescript
596
+ const DEFAULT_OPTIONS = {
597
+ maxTurns: 50, // Safety cap
598
+ maxBudgetUsd: 5.0, // Cost cap
599
+ includePartialMessages: true, // Enable streaming
600
+ permissionMode: "acceptEdits", // Auto-approve file edits
601
+ allowedTools: ["Read", "Write", "Edit", "Bash", "Grep", "Glob", "WebSearch", "WebFetch"],
602
+ settingSources: ["project"], // Load CLAUDE.md
603
+ systemPrompt: {
604
+ type: "preset",
605
+ preset: "claude_code",
606
+ append: "You are running inside Mstro. Report progress clearly."
607
+ }
608
+ };
609
+ ```
610
+
611
+ ---
612
+
613
+ ## Key Takeaways
614
+
615
+ 1. **There is no built-in heartbeat or liveness mechanism in Claude Code.** You must implement your own watchdog.
616
+
617
+ 2. **The streaming event gap during tool execution is the primary challenge** for stall detection. When a Bash command runs for 5 minutes, the SDK goes silent.
618
+
619
+ 3. **AbortController works but has caveats:** must be in `options`, may not immediately kill in-flight tools, and can cause cascade failures with subagents.
620
+
621
+ 4. **The bash timeout system is unreliable.** `BASH_DEFAULT_TIMEOUT_MS` is reportedly ignored in many cases. Per-command timeouts via the tool's `timeout` parameter are more reliable but are controlled by the model, not the caller.
622
+
623
+ 5. **Process cleanup is your responsibility.** Claude Code can leave zombie processes, especially in Docker/container environments.
624
+
625
+ 6. **The V2 SDK interface (send/stream) may be better for Mstro's session model** but is unstable. V1 query() is production-ready.
626
+
627
+ 7. **Max turns + wall-clock timeout + idle timeout is the defensive triad** for preventing runaway sessions.