@yemi33/squad 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,414 @@
1
+ # Auto-Discovery & Execution Pipeline
2
+
3
+ How the squad engine finds work and dispatches agents automatically.
4
+
5
+ ## The Tick Loop
6
+
7
+ The engine runs a tick every 60 seconds (configurable via `config.json` → `engine.tickInterval`). Each tick:
8
+
9
+ ```
10
+ tick()
11
+ 1. checkTimeouts() Kill stale/hung agents (>heartbeatTimeout)
12
+ 2. consolidateInbox() Merge learnings into notes.md (at 5+ inbox files)
13
+ 2.5 runCleanup() Periodic cleanup (every 10 ticks ≈ 5min)
14
+ 2.6 pollPrStatus() Poll ADO for build, review, merge status (every 6 ticks ≈ 3min)
15
+ 3. discoverWork() Scan ALL linked projects for new tasks
16
+ 4. updateSnapshot() Write identity/now.md
17
+ 5. dispatch Spawn agents for pending items (up to maxConcurrent)
18
+ ```
19
+
20
+ ## Work Discovery
21
+
22
+ `discoverWork()` iterates every project in `config.projects[]` and runs three core discovery sources. Results are prioritized: fixes > reviews > implements > work-items.
23
+
24
+ Before scanning, the engine materializes plans and specs into project work items (side-effect writes to `work-items.json`), so they're picked up by the work items source below.
25
+
26
+ ### Source 1: Pull Requests (`discoverFromPrs`)
27
+
28
+ **Reads:** `<project>/.squad/pull-requests.json`
29
+
30
+ | PR State | Action | Dispatch Type |
31
+ |----------|--------|---------------|
32
+ | Squad review pending/waiting | Queue a code review | `review` |
33
+ | Squad review `changes-requested` | Route back to author for fixes | `fix` |
34
+ | `buildStatus: "failing"` | Route to any agent for build fix | `fix` |
35
+ | No `buildTested` flag | Queue build & test verification | `test` |
36
+
37
+ Skips PRs where `status !== "active"`.
38
+
39
+ **Build & Test auto-dispatch:** When a PR is first created (synced from agent output), it has no `buildTested` field. The engine dispatches a test agent to check out the branch, build, run tests, and if it's a webapp, start a local dev server. If build/tests fail, the agent auto-files a high-priority fix work item. The PR is marked `buildTested: 'dispatched'` to prevent re-dispatch.
40
+
41
+ ### Source 2: PRD Gap Analysis (`discoverFromPrd`)
42
+
43
+ **Reads:** `<project>/docs/prd-gaps.json` (or custom path from `workSources.prd.path`)
44
+
45
+ | Item State | Action | Dispatch Type |
46
+ |------------|--------|---------------|
47
+ | `status: "missing"` or `"planned"` | Queue implementation | `implement` |
48
+ | `estimated_complexity: "large"` | Routes to `implement:large` (prefers Rebecca) | `implement:large` |
49
+
50
+ ### Source 3: Per-Project Work Items (`discoverFromWorkItems`)
51
+
52
+ **Reads:** `<project>/.squad/work-items.json`
53
+
54
+ | Item State | Action | Dispatch Type |
55
+ |------------|--------|---------------|
56
+ | `status: "queued"` or `"pending"` | Queue based on item's `type` field | Item's `type` (default: `implement`) |
57
+
58
+ After dispatching, the engine writes `status: "dispatched"` back to the item. The agent is scoped to the specific project directory.
59
+
60
+ ### Source 4: Central Work Items (`discoverCentralWorkItems`)
61
+
62
+ **Reads:** `~/.squad/work-items.json` (central, project-agnostic)
63
+
64
+ These are tasks where the agent decides which project to work in. The engine builds a prompt that includes a list of all linked projects with their paths and repo config. The agent then navigates to the appropriate project directory based on the task.
65
+
66
+ | Item State | Action | Dispatch Type |
67
+ |------------|--------|---------------|
68
+ | `status: "queued"` or `"pending"` | Queue based on item's `type` field | Item's `type` (default: `implement`) |
69
+
70
+ **How it differs from per-project work items:**
71
+ - No `cwd` is set — agent starts in the squad directory and navigates itself
72
+ - The prompt includes all project paths and **descriptions** so the agent can choose
73
+ - No branch/worktree is pre-created — agent handles this
74
+ - Useful for cross-project tasks, exploratory work, or when you don't know which repo is relevant
75
+
76
+ **Project descriptions drive routing.** Each project in `config.json` has a `description` field. The engine injects these into the central work item prompt:
77
+
78
+ ```
79
+ ### MyProject
80
+ - **Path:** C:/Users/you/MyProject
81
+ - **Repo:** org/project/MyProject
82
+ - **What it is:** Description from config.json...
83
+ ```
84
+
85
+ Better descriptions → better agent routing. Describe what each repo contains, what kind of work happens there, and what technologies it uses.
86
+
87
+ **Cross-repo tasks.** Central work items can span multiple repositories. The agent's prompt instructs it to:
88
+ 1. Determine all repos affected by the task
89
+ 2. Work on each sequentially (worktree → commit → push → PR per repo)
90
+ 3. Note cross-repo dependencies in PR descriptions (e.g., "Requires MyProject PR #456")
91
+ 4. Use the correct repo config (org, project, repoId) for each repo
92
+ 5. Document which repos were touched in the learnings file
93
+
94
+ This means a single work item like "Add telemetry to the document creation pipeline" can result in PRs across multiple repos if the agent determines the change touches shared modules in one repo and the frontend in another.
95
+
96
+ **Adding central work items:**
97
+ - Dashboard Command Center → type your intent (no `#project` = central queue)
98
+ - CLI: `node engine.js work "task title"` (defaults to central queue)
99
+ - Direct edit: `~/.squad/work-items.json`
100
+
101
+ ### Materialization: Specs and Plans → Work Items
102
+
103
+ Before the 3 core sources run, the engine materializes indirect sources into work items:
104
+
105
+ **Specs** (`materializeSpecsAsWorkItems`): When a PR merges that added/modified `.md` files under `docs/` (configurable via `workSources.specs.filePatterns`), the engine reads each doc and checks for `type: spec` in its frontmatter. Only docs with this marker are treated as actionable specs — regular documentation is ignored. For matching docs, it extracts title/summary/priority and creates implementation work items with `createdBy: 'engine:spec-discovery'`. State tracked in `.squad/spec-tracker.json` to avoid re-processing merged PRs.
106
+
107
+ Example spec frontmatter:
108
+ ```markdown
109
+ ---
110
+ type: spec
111
+ title: Add user authentication
112
+ priority: high
113
+ ---
114
+ # Add user authentication
115
+ ...
116
+ ```
117
+
118
+ **Plans** (`materializePlansAsWorkItems`): Scans `~/.squad/plans/*.json` for plan files with `missing`/`planned` items. Creates work items in the target project's queue with `createdBy: 'engine:plan-discovery'`. Deduped via `sourcePlan` + `sourcePlanItem` fields.
119
+
120
+ Both write to `work-items.json` and are picked up by Source 3 on the same or next tick.
121
+
122
+ ## ADO PR Status Polling (`pollPrStatus`)
123
+
124
+ **Runs:** Every 6 ticks (≈ 3 minutes), independently of work discovery. Replaces the retired agent-based `pr-sync`.
125
+
126
+ The engine directly polls the Azure DevOps REST API for **all** PR metadata: build/CI status, human review votes, and completion state. Two API calls per PR — no agent dispatch needed.
127
+
128
+ **Per PR:**
129
+ 1. `GET pullrequests/{id}` → `status` (active/completed/abandoned), `mergeStatus`, `reviewers[].vote`
130
+ 2. `GET pullrequests/{id}/statuses` → CI pipeline results
131
+
132
+ **Fields updated in `pull-requests.json`:**
133
+
134
+ | Field | Source | Values |
135
+ |-------|--------|--------|
136
+ | `status` | PR details | `active` / `merged` / `abandoned` |
137
+ | `reviewStatus` | `reviewers[].vote` | `approved` (vote ≥ 5) / `changes-requested` (-10) / `waiting` (-5) / `pending` (0) |
138
+ | `buildStatus` | PR statuses (codecoverage/deploy/build/ci contexts) | `passing` / `failing` / `running` / `none` |
139
+ | `buildFailReason` | Failed status description | Set on failure, cleared otherwise |
140
+
141
+ **Auth:** Bearer token via `azureauth ado token --output token` (cached 30 minutes).
142
+
143
+ This feeds `discoverFromPrs` — when `buildStatus` flips to `"failing"`, the next discovery tick dispatches a fix agent. When `status` becomes `"merged"`, the PR drops out of active polling.
144
+
145
+ ## Discovery Gates
146
+
147
+ Every discovered item passes through three gates before being queued:
148
+
149
+ ```
150
+ Item found
151
+
152
+ ├─ isAlreadyDispatched(key)? → skip if already in pending or active queue
153
+ │ Key format: <source>-<projectName>-<itemId>
154
+ │ e.g., "prd-MyProject-M001", "review-MyRepo-PR-123"
155
+
156
+ ├─ isOnCooldown(key)? → skip if dispatched within cooldown window
157
+ │ Default: 30min for PRD/PRs, 0 for work-items
158
+ │ Cooldowns are in-memory (reset on engine restart)
159
+
160
+ └─ resolveAgent(workType)? → skip if no idle agent available
161
+ Checks routing.md: preferred → fallback → any idle agent
162
+ ```
163
+
164
+ ## Agent Routing
165
+
166
+ `resolveAgent()` parses `routing.md` to pick the right agent:
167
+
168
+ ```
169
+ routing.md table:
170
+ implement → dallas (fallback: ralph)
171
+ implement:large → rebecca (fallback: dallas)
172
+ review → ripley (fallback: lambert)
173
+ fix → _author_ (fallback: dallas) ← routes to PR author
174
+ analyze → lambert (fallback: rebecca)
175
+ explore → ripley (fallback: rebecca)
176
+ test → dallas (fallback: ralph)
177
+ ```
178
+
179
+ Resolution order:
180
+ 1. Check if **preferred** agent is idle → use it
181
+ 2. Check if **fallback** agent is idle → use it
182
+ 3. Check **any** agent that's idle → use it
183
+ 4. If all busy → return null, item stays undiscovered until next tick
184
+
185
+ ## Dispatch Queue
186
+
187
+ Discovered items land in `engine/dispatch.json`:
188
+
189
+ ```json
190
+ {
191
+ "pending": [ ... ], // Waiting to be spawned
192
+ "active": [ ... ], // Currently running
193
+ "completed": [ ... ] // Finished (last 100 kept)
194
+ }
195
+ ```
196
+
197
+ Each tick, the engine checks available slots:
198
+
199
+ ```
200
+ slotsAvailable = maxConcurrent (3) - activeCount
201
+ ```
202
+
203
+ It takes up to `slotsAvailable` items from pending and spawns them. Items are processed in discovery-priority order (fixes first, then reviews, then implements, then work-items).
204
+
205
+ ## Execution: spawnAgent()
206
+
207
+ When an item is dispatched:
208
+
209
+ ### 1. Resolve Project Context
210
+ ```
211
+ meta.project.localPath → rootDir (the repo on disk)
212
+ ```
213
+
214
+ ### 2. Create Git Worktree (if task has a branch)
215
+ ```
216
+ git worktree add <rootDir>/../worktrees/<branch> -b <branch> <mainBranch>
217
+ ```
218
+ - Branch names are sanitized (alphanumeric, dots, hyphens, slashes only, max 200 chars)
219
+ - If worktree fails for implement/fix → item marked as error, moved to completed
220
+ - If worktree fails for review/analyze → falls back to rootDir (read-only tasks)
221
+
222
+ ### 3. Render Playbook
223
+ ```
224
+ playbooks/<type>.md → substitute {{variables}} → append notes.md → append learnings requirement
225
+ ```
226
+
227
+ Variables injected from config and item metadata:
228
+ - `{{project_name}}`, `{{ado_org}}`, `{{ado_project}}`, `{{repo_name}}` — from project config
229
+ - `{{agent_name}}`, `{{agent_id}}`, `{{agent_role}}` — from agent roster
230
+ - `{{item_id}}`, `{{item_name}}`, `{{branch_name}}`, `{{repo_id}}` — from work item
231
+ - `{{team_root}}` — path to central `.squad/` directory
232
+
233
+ ### 4. Build System Prompt
234
+ Combines:
235
+ - Agent identity (name, role, skills)
236
+ - Agent charter (`agents/<name>/charter.md`)
237
+ - Project context (repo name, repo host config, main branch)
238
+ - Critical rules (worktrees, MCP tools, PowerShell, learnings)
239
+ - Full `notes.md` content
240
+
241
+ ### 5. Spawn Claude CLI
242
+ ```bash
243
+ claude -p <prompt-file> --system-prompt <sysprompt-file> \
244
+ --output-format json --max-turns 100 --verbose \
245
+ --allowedTools Edit,Write,Read,Bash,Glob,Grep,Agent,WebFetch,WebSearch
246
+ ```
247
+
248
+ - Process runs in the worktree directory (or rootDir for reviews)
249
+ - stdout/stderr captured (capped at 1MB each)
250
+ - CLAUDECODE env vars stripped to allow nested sessions
251
+
252
+ ### 6. Track State
253
+ - Agent status → `working` in `agents/<name>/status.json`
254
+ - Dispatch item → moved from `pending` to `active` in `dispatch.json`
255
+ - Process tracked in `activeProcesses` Map for timeout monitoring
256
+
257
+ ## Post-Completion
258
+
259
+ When the claude process exits:
260
+
261
+ ```
262
+ proc.on('close')
263
+
264
+ ├─ Save output to agents/<name>/output.log
265
+
266
+ ├─ Set agent status to "done" (exit 0) or "error" (non-zero)
267
+
268
+ ├─ Move dispatch item: active → completed
269
+
270
+ ├─ Sync PRs from output (scan for PR URLs → pull-requests.json)
271
+
272
+ ├─ Post-completion hooks:
273
+ │ review → update PR squadReview in pull-requests.json, vote on ADO
274
+ │ fix → set PR squadReview back to "waiting"
275
+ │ build-test → (agent auto-files fix work items on failure)
276
+
277
+ ├─ Check for learnings in notes/inbox/
278
+ │ (warns if agent didn't write findings)
279
+
280
+ ├─ Update agent history and metrics
281
+
282
+ └─ Clean up temp prompt files from engine/
283
+ ```
284
+
285
+ ## Command Center (Dashboard Input)
286
+
287
+ The dashboard exposes a unified input box at `http://localhost:7331` that parses natural-language intent into structured work items, decisions, or PRD items.
288
+
289
+ **Syntax:**
290
+ | Token | Effect |
291
+ |-------|--------|
292
+ | `@agent` | Assigns to a specific agent (sets `item.agent`) |
293
+ | `@everyone` | Fan-out to all agents (sets `scope: 'fan-out'`) |
294
+ | `!high` / `!low` | Sets priority (default: medium) |
295
+ | `/decide` | Creates a decision (appended to `notes.md`) |
296
+ | `/prd` | Creates a PRD item (appended to `prd-gaps.json`) |
297
+ | `#project` | Targets a specific project queue |
298
+
299
+ Work type is auto-detected from keywords (fix, explore, test, review → implement as fallback). The `@agent` assignment flows through to the engine: `item.agent || resolveAgent(workType, config)`.
300
+
301
+ ## Data Flow Diagram
302
+
303
+ ```
304
+ Dashboard Command Center
305
+ (unified intent input)
306
+
307
+ ┌──────────┼──────────┐
308
+ ▼ ▼ ▼
309
+ work-items decisions prd-gaps
310
+ .json .md .json
311
+
312
+ Per-project sources: Central engine: Agents:
313
+
314
+ work-items.json ──┐
315
+ prd-gaps.json ────┤
316
+ pull-requests.json┤ discoverWork() dispatch.json
317
+ docs/**/*.md (specs)┤ (each tick) ┌──────────┐
318
+ │ │ │ pending │
319
+ ~/.squad/ │ │ │ active │
320
+ work-items.json ┤ │ │ completed │
321
+ plans/*.md ─────┘ ▼ └─────┬────┘
322
+ addToDispatch()─────────┘
323
+
324
+ ADO REST API ─── pollPrBuildStatus() ──► pull-requests.json
325
+ (every 3min) (buildStatus field) │
326
+ spawnAgent()
327
+
328
+ ┌────────────┼────────────┐
329
+ ▼ ▼ ▼
330
+ worktree claude CLI status.json
331
+ (in project (max 100 (working)
332
+ repo dir) turns)
333
+
334
+ on exit:
335
+
336
+ ┌───────────┬───────┼───────┬──────────┐
337
+ ▼ ▼ ▼ ▼ ▼
338
+ output.log notes/ PRs work-items localhost
339
+ (per agent) inbox/*.md .json .json (if webapp,
340
+ │ (auto-filed from build
341
+ consolidateInbox() on failure) & test)
342
+ (at 5+ files)
343
+
344
+
345
+ notes.md
346
+ (injected into
347
+ all future
348
+ playbooks)
349
+ ```
350
+
351
+ ## Timeout & Stale Detection
352
+
353
+ Two layers of protection:
354
+
355
+ **Agent timeout** (`engine.agentTimeout`, default 10min):
356
+ - Checks `activeProcesses` Map for elapsed time
357
+ - Sends SIGTERM, then SIGKILL after 5s
358
+
359
+ **Stale detection** (`engine.staleThreshold`, default 30min):
360
+ - Scans `dispatch.active` for items where `started_at` exceeds threshold
361
+ - Catches cases where the process exited but dispatch wasn't cleaned up
362
+ - Kills process if still tracked, marks dispatch as error, resets agent to idle
363
+
364
+ ## Cooldown Behavior
365
+
366
+ | Source | Default Cooldown | Behavior |
367
+ |--------|-----------------|----------|
368
+ | PRD items | 30 minutes | After dispatching M001, won't re-discover it for 30min |
369
+ | Pull requests | 30 minutes | After dispatching a review, won't re-queue for 30min |
370
+ | Work items | 0 (immediate) | No cooldown, but item status changes to "dispatched" |
371
+
372
+ Cooldowns are **in-memory only**. On engine restart, `isAlreadyDispatched()` still prevents duplicates by checking the pending/active queue in `dispatch.json`. The cooldown just prevents rapid re-discovery within a single engine session.
373
+
374
+ ## Configuration Reference
375
+
376
+ All discovery behavior is controlled via `config.json`:
377
+
378
+ ```json
379
+ {
380
+ "engine": {
381
+ "tickInterval": 60000, // ms between ticks
382
+ "maxConcurrent": 3, // max agents running at once
383
+ "agentTimeout": 600000, // 10min — kill hung processes
384
+ "staleThreshold": 1800000, // 30min — kill stale dispatches
385
+ "maxTurns": 100 // max claude CLI turns per agent
386
+ },
387
+ "projects": [
388
+ {
389
+ "name": "MyProject",
390
+ "localPath": "C:/Users/you/MyProject",
391
+ "workSources": {
392
+ "prd": {
393
+ "enabled": true,
394
+ "path": "docs/prd-gaps.json",
395
+ "itemFilter": { "status": ["missing", "planned"] },
396
+ "cooldownMinutes": 30
397
+ },
398
+ "pullRequests": {
399
+ "enabled": true,
400
+ "path": ".squad/pull-requests.json",
401
+ "cooldownMinutes": 30
402
+ },
403
+ "workItems": {
404
+ "enabled": true,
405
+ "path": ".squad/work-items.json",
406
+ "cooldownMinutes": 0
407
+ }
408
+ }
409
+ }
410
+ ]
411
+ }
412
+ ```
413
+
414
+ To disable a work source for a project, set `"enabled": false`. To change where the engine looks for PRD or PR files, change the `path` field (resolved relative to `localPath`).
@@ -0,0 +1,127 @@
1
+ # Blog: First Successful Agent Dispatch — The Spawn Debugging Saga
2
+
3
+ **Date:** 2026-03-12
4
+ **Author:** Squad Team + Claude Opus 4.6
5
+ **Result:** PR #4959092 created by Dallas on office-bohemia
6
+
7
+ ## The Task
8
+
9
+ Simple ask: add a `/cowork` route to the bebop webapp, referencing an existing commit as a pattern. Created via the dashboard Command Center as a central work item (auto-route, agent decides which project).
10
+
11
+ ## What Went Wrong (7 Failed Attempts)
12
+
13
+ ### Attempt 1-2: Shell Metacharacter Explosion
14
+
15
+ The engine spawned agents via bash:
16
+ ```bash
17
+ claude -p "$(cat 'prompt.md')" --system-prompt "$(cat 'sysprompt.md')" ...
18
+ ```
19
+
20
+ The prompt contained parentheses, backticks, and special characters:
21
+ ```
22
+ Each repo has its own ADO config (org, project, repoId) — use the correct one per repo
23
+ ```
24
+
25
+ Bash interpreted `(org, project, repoId)` as a subshell command. **Silent failure** — no error logged, agent process died instantly.
26
+
27
+ ### Attempt 3-4: Direct Spawn Without Shell
28
+
29
+ Switched to `spawn('claude', args)` — no bash wrapper. But on Windows, `claude` is a POSIX shell script (`#!/bin/sh`), not a binary. Node's `spawn` without `shell: true` couldn't execute it. **Process returned no PID**, error callback never fired.
30
+
31
+ ### Attempt 5: Shell True With Content Args
32
+
33
+ Added `shell: true` to the spawn. But now the `--system-prompt` arg (3KB of text with special chars) got shell-interpreted again. Same metacharacter problem, different location.
34
+
35
+ ### Attempt 6: Node Wrapper Script
36
+
37
+ Created `spawn-agent.js` — a Node wrapper that reads files and calls claude. Engine spawns `node spawn-agent.js <files> <args>`. Wrapper uses `shell: true` to find claude binary. **Same problem** — system prompt content as a CLI arg still gets shell-expanded.
38
+
39
+ ### Attempt 7: The Actual Fix
40
+
41
+ Realized the claude shell wrapper at `/c/.tools/.npm-global/claude` just calls:
42
+ ```sh
43
+ exec node "$basedir/node_modules/@anthropic-ai/claude-code/cli.js" "$@"
44
+ ```
45
+
46
+ **Solution:** Resolve the actual `cli.js` entry point and spawn it directly via `process.execPath` (the current node binary). No shell involved at all:
47
+
48
+ ```javascript
49
+ // spawn-agent.js
50
+ const proc = spawn(process.execPath, [claudeBin, ...cliArgs], {
51
+ stdio: ['pipe', 'pipe', 'pipe'],
52
+ env // CLAUDECODE vars already stripped
53
+ });
54
+ proc.stdin.write(prompt); // prompt via stdin — no shell interpretation
55
+ proc.stdin.end();
56
+ ```
57
+
58
+ ## Other Issues Discovered Along the Way
59
+
60
+ ### Output Format: json vs stream-json
61
+
62
+ `--output-format json` produces **one JSON blob at exit**. No streaming output during execution. This broke:
63
+ - Live output in dashboard (nothing to show until agent finishes)
64
+ - Heartbeat monitoring (no file writes to check mtime against)
65
+
66
+ Fix: switched to `--output-format stream-json` — streams events as they happen.
67
+
68
+ ### Permission Mode
69
+
70
+ Agents would hang waiting for permission prompts (invisible in headless mode). Added `--permission-mode bypassPermissions`.
71
+
72
+ ### CLAUDECODE Environment Variable
73
+
74
+ Claude Code sets `CLAUDECODE` env var to prevent nested sessions. Spawned agents inherit it and refuse to start. The engine strips it from `childEnv`, but the wrapper script was using `process.env` (which re-inherits from the parent). Fixed by stripping in both places.
75
+
76
+ ### Heartbeat vs Stale Detection
77
+
78
+ Original approach: kill agents after a fixed time threshold (staleThreshold). Problem: agents can legitimately run for hours on complex tasks.
79
+
80
+ New approach: heartbeat based on `live-output.log` mtime. As long as the agent produces output, it's alive. If silent for 5 minutes → declared dead. Catches orphaned processes (engine restart loses process handles) and hung agents.
81
+
82
+ ### Engine Restart Orphan Problem
83
+
84
+ When the engine restarts, the in-memory `activeProcesses` Map is lost. Active dispatch items stay in `dispatch.json` but the engine has no process handle. Old stale detection (6h threshold) was too slow to catch this. The heartbeat check catches it in 5 minutes.
85
+
86
+ ## The Successful Run
87
+
88
+ **Dispatch 1773292681199** — Dallas, central work item, auto-route:
89
+
90
+ 1. Engine spawns `node spawn-agent.js prompt.md sysprompt.md --output-format stream-json --verbose --permission-mode bypassPermissions --mcp-config mcp-servers.json`
91
+ 2. spawn-agent.js resolves `cli.js`, spawns `node cli.js -p --system-prompt <content> ...`
92
+ 3. Prompt piped via stdin — no shell interpretation
93
+ 4. MCP servers connect (azure-ado, azure-kusto, mobile, DevBox)
94
+ 5. Dallas reads the reference commit, explores office-bohemia's route structure
95
+ 6. Creates `apps/bebop/src/routes/_mainLayout.cowork.tsx`
96
+ 7. Updates route tree
97
+ 8. Creates git worktree, commits, pushes
98
+ 9. Creates **PR #4959092** via `mcp__azure-ado__repo_create_pull_request`
99
+ 10. Posts implementation notes as PR thread comment
100
+ 11. Writes learnings to `decisions/inbox/dallas-2026-03-12.md`
101
+ 12. Exits with code 0
102
+
103
+ **Time:** ~8 minutes from dispatch to PR created
104
+ **Output:** 70KB of stream-json events captured in `live-output.log`
105
+
106
+ ## Architecture Lessons
107
+
108
+ 1. **Never pass user content through shell expansion.** Use stdin or direct args via Node's `spawn` (without shell).
109
+ 2. **On Windows, npm-installed CLI tools are shell wrappers.** Resolve the actual `.js` entry point and spawn via `node`.
110
+ 3. **Streaming output format is essential** for monitoring long-running agents. One-shot JSON is useless for heartbeats.
111
+ 4. **Environment variable inheritance is tricky** with nested spawns. Strip at every level.
112
+ 5. **Heartbeats > timeouts** for agents that can run for hours. Check liveness, not elapsed time.
113
+
114
+ ## The Spawn Chain (Final Working Version)
115
+
116
+ ```
117
+ engine.js (tick loop)
118
+ → spawn(process.execPath, ['spawn-agent.js', prompt.md, sysprompt.md, ...args])
119
+ → spawn-agent.js reads files, resolves cli.js path
120
+ → spawn(process.execPath, ['cli.js', '-p', '--system-prompt', content, ...args])
121
+ → claude-code runs with prompt via stdin
122
+ → agent works, streams JSON events to stdout
123
+ → engine captures to live-output.log (heartbeat)
124
+ → dashboard polls /api/agent/:id/live (3s refresh)
125
+ ```
126
+
127
+ No bash. No shell. No metacharacter interpretation. Just Node spawning Node.