@yemi33/minions 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (54) hide show
  1. package/CHANGELOG.md +819 -0
  2. package/LICENSE +21 -0
  3. package/README.md +598 -0
  4. package/agents/dallas/charter.md +56 -0
  5. package/agents/lambert/charter.md +67 -0
  6. package/agents/ralph/charter.md +45 -0
  7. package/agents/rebecca/charter.md +57 -0
  8. package/agents/ripley/charter.md +47 -0
  9. package/bin/minions.js +467 -0
  10. package/config.template.json +28 -0
  11. package/dashboard.html +4822 -0
  12. package/dashboard.js +2623 -0
  13. package/docs/auto-discovery.md +416 -0
  14. package/docs/blog-first-successful-dispatch.md +128 -0
  15. package/docs/command-center.md +156 -0
  16. package/docs/demo/01-dashboard-overview.gif +0 -0
  17. package/docs/demo/02-command-center.gif +0 -0
  18. package/docs/demo/03-work-items.gif +0 -0
  19. package/docs/demo/04-plan-docchat.gif +0 -0
  20. package/docs/demo/05-prd-progress.gif +0 -0
  21. package/docs/demo/06-inbox-metrics.gif +0 -0
  22. package/docs/deprecated.json +83 -0
  23. package/docs/distribution.md +96 -0
  24. package/docs/engine-restart.md +92 -0
  25. package/docs/human-vs-automated.md +108 -0
  26. package/docs/index.html +221 -0
  27. package/docs/plan-lifecycle.md +140 -0
  28. package/docs/self-improvement.md +344 -0
  29. package/engine/ado-mcp-wrapper.js +42 -0
  30. package/engine/ado.js +383 -0
  31. package/engine/check-status.js +23 -0
  32. package/engine/cli.js +754 -0
  33. package/engine/consolidation.js +417 -0
  34. package/engine/github.js +331 -0
  35. package/engine/lifecycle.js +1113 -0
  36. package/engine/llm.js +116 -0
  37. package/engine/queries.js +677 -0
  38. package/engine/shared.js +397 -0
  39. package/engine/spawn-agent.js +151 -0
  40. package/engine.js +3227 -0
  41. package/minions.js +556 -0
  42. package/package.json +48 -0
  43. package/playbooks/ask.md +49 -0
  44. package/playbooks/build-and-test.md +155 -0
  45. package/playbooks/explore.md +64 -0
  46. package/playbooks/fix.md +57 -0
  47. package/playbooks/implement-shared.md +68 -0
  48. package/playbooks/implement.md +95 -0
  49. package/playbooks/plan-to-prd.md +104 -0
  50. package/playbooks/plan.md +99 -0
  51. package/playbooks/review.md +68 -0
  52. package/playbooks/test.md +75 -0
  53. package/playbooks/verify.md +190 -0
  54. package/playbooks/work-item.md +74 -0
@@ -0,0 +1,416 @@
1
+ # Auto-Discovery & Execution Pipeline
2
+
3
+ How the minions engine finds work and dispatches agents automatically.
4
+
5
+ ## The Tick Loop
6
+
7
+ The engine runs a tick every 60 seconds (configurable via `config.json` → `engine.tickInterval`). Each tick:
8
+
9
+ ```
10
+ tick()
11
+ 1. checkTimeouts() Kill stale/hung agents (>heartbeatTimeout)
12
+ 2. consolidateInbox() Merge learnings into notes.md (Haiku-powered)
13
+ 2.5 runCleanup() Periodic cleanup (every 10 ticks ≈ 10min)
14
+ 2.6 pollPrStatus() Poll ADO for build, review, merge status (every 6 ticks ≈ 6min)
15
+ 2.7 pollPrHumanComments() Poll PR threads for human @minions comments (every 12 ticks ≈ 12min)
16
+ 3. discoverWork() Scan ALL linked projects for new tasks
17
+ 4. updateSnapshot() Write identity/now.md
18
+ 5. dispatch Spawn agents for pending items (up to maxConcurrent)
19
+ ```
20
+
21
+ ## Work Discovery
22
+
23
+ `discoverWork()` iterates every project in `config.projects[]` and runs three core discovery sources. Results are prioritized: fixes > reviews > implements > work-items.
24
+
25
+ Before scanning, the engine materializes plans and specs into project work items (side-effect writes to `work-items.json`), so they're picked up by the work items source below.
26
+
27
+ ### Source 1: Pull Requests (`discoverFromPrs`)
28
+
29
+ **Reads:** `~/.minions/projects/<project>/pull-requests.json`
30
+
31
+ | PR State | Action | Dispatch Type |
32
+ |----------|--------|---------------|
33
+ | Minions review pending/waiting | Queue a code review | `review` |
34
+ | Minions review `changes-requested` | Route back to author for fixes | `fix` |
35
+ | `buildStatus: "failing"` | Route to any agent for build fix | `fix` |
36
+ Skips PRs where `status !== "active"`.
37
+
38
+ ### Source 2: PRD Gap Analysis (via `materializePlansAsWorkItems`)
39
+
40
+ PRD items flow through `materializePlansAsWorkItems()`, which scans `~/.minions/prd/*.json` for PRD files with `missing`/`planned` items and creates work items in the target project's queue.
41
+
42
+ **Reads:** `<project>/docs/prd-gaps.json` (or custom path from `workSources.prd.path`)
43
+
44
+ | Item State | Action | Dispatch Type |
45
+ |------------|--------|---------------|
46
+ | `status: "missing"` or `"planned"` | Queue implementation | `implement` |
47
+ | `estimated_complexity: "large"` | Routes to `implement:large` (prefers Rebecca) | `implement:large` |
48
+
49
+ ### Source 3: Per-Project Work Items (`discoverFromWorkItems`)
50
+
51
+ **Reads:** `~/.minions/projects/<project>/work-items.json`
52
+
53
+ | Item State | Action | Dispatch Type |
54
+ |------------|--------|---------------|
55
+ | `status: "queued"` or `"pending"` | Queue based on item's `type` field | Item's `type` (default: `implement`) |
56
+
57
+ After dispatching, the engine writes `status: "dispatched"` back to the item. The agent is scoped to the specific project directory.
58
+
59
+ ### Source 4: Central Work Items (`discoverCentralWorkItems`)
60
+
61
+ **Reads:** `~/.minions/work-items.json` (central, project-agnostic)
62
+
63
+ These are tasks where the agent decides which project to work in. The engine builds a prompt that includes a list of all linked projects with their paths and repo config. The agent then navigates to the appropriate project directory based on the task.
64
+
65
+ | Item State | Action | Dispatch Type |
66
+ |------------|--------|---------------|
67
+ | `status: "queued"` or `"pending"` | Queue based on item's `type` field | Item's `type` (default: `implement`) |
68
+
69
+ **How it differs from per-project work items:**
70
+ - No `cwd` is set — agent starts in the minions directory and navigates itself
71
+ - The prompt includes all project paths and **descriptions** so the agent can choose
72
+ - No branch/worktree is pre-created — agent handles this
73
+ - Useful for cross-project tasks, exploratory work, or when you don't know which repo is relevant
74
+
75
+ **Project descriptions drive routing.** Each project in `config.json` has a `description` field. The engine injects these into the central work item prompt:
76
+
77
+ ```
78
+ ### MyProject
79
+ - **Path:** C:/Users/you/MyProject
80
+ - **Repo:** org/project/MyProject
81
+ - **What it is:** Description from config.json...
82
+ ```
83
+
84
+ Better descriptions → better agent routing. Describe what each repo contains, what kind of work happens there, and what technologies it uses.
85
+
86
+ **Cross-repo tasks.** Central work items can span multiple repositories. The agent's prompt instructs it to:
87
+ 1. Determine all repos affected by the task
88
+ 2. Work on each sequentially (worktree → commit → push → PR per repo)
89
+ 3. Note cross-repo dependencies in PR descriptions (e.g., "Requires MyProject PR #456")
90
+ 4. Use the correct repo config (org, project, repoId) for each repo
91
+ 5. Document which repos were touched in the learnings file
92
+
93
+ This means a single work item like "Add telemetry to the document creation pipeline" can result in PRs across multiple repos if the agent determines the change touches shared modules in one repo and the frontend in another.
94
+
95
+ **Adding central work items:**
96
+ - Dashboard Command Center → type your intent (no `#project` = central queue)
97
+ - CLI: `node engine.js work "task title"` (defaults to central queue)
98
+ - Direct edit: `~/.minions/work-items.json`
99
+
100
+ ### Materialization: Specs and Plans → Work Items
101
+
102
+ Before the 3 core sources run, the engine materializes indirect sources into work items:
103
+
104
+ **Specs** (`materializeSpecsAsWorkItems`): When a PR merges that added/modified `.md` files under `docs/` (configurable via `workSources.specs.filePatterns`), the engine reads each doc and checks for `type: spec` in its frontmatter. Only docs with this marker are treated as actionable specs — regular documentation is ignored. For matching docs, it extracts title/summary/priority and creates implementation work items with `createdBy: 'engine:spec-discovery'`. State tracked in `.minions/spec-tracker.json` to avoid re-processing merged PRs.
105
+
106
+ Example spec frontmatter:
107
+ ```markdown
108
+ ---
109
+ type: spec
110
+ title: Add user authentication
111
+ priority: high
112
+ ---
113
+ # Add user authentication
114
+ ...
115
+ ```
116
+
117
+ **Plans** (`materializePlansAsWorkItems`): Scans `~/.minions/prd/*.json` for PRD files with `missing`/`planned` items. Creates work items in the target project's queue with `createdBy: 'engine:plan-discovery'`. Work item ID = PRD item ID (e.g. `P-43e5ac28`). Deduped by `id`.
118
+
119
+ Both write to `work-items.json` and are picked up by Source 3 on the same or next tick.
120
+
121
+ ## ADO PR Status Polling (`pollPrStatus`)
122
+
123
+ **Runs:** Every 6 ticks (≈ 6 minutes), independently of work discovery. Replaces the retired agent-based `pr-sync`.
124
+
125
+ The engine directly polls the Azure DevOps REST API for **all** PR metadata: build/CI status, human review votes, and completion state. Two API calls per PR — no agent dispatch needed.
126
+
127
+ **Per PR:**
128
+ 1. `GET pullrequests/{id}` → `status` (active/completed/abandoned), `mergeStatus`, `reviewers[].vote`
129
+ 2. `GET pullrequests/{id}/statuses` → CI pipeline results
130
+
131
+ **Fields updated in `pull-requests.json`:**
132
+
133
+ | Field | Source | Values |
134
+ |-------|--------|--------|
135
+ | `status` | PR details | `active` / `merged` / `abandoned` |
136
+ | `reviewStatus` | `reviewers[].vote` | `approved` (vote ≥ 5) / `changes-requested` (-10) / `waiting` (-5) / `pending` (0) |
137
+ | `buildStatus` | PR statuses (codecoverage/deploy/build/ci contexts) | `passing` / `failing` / `running` / `none` |
138
+ | `buildFailReason` | Failed status description | Set on failure, cleared otherwise |
139
+
140
+ **Auth:** Bearer token via `azureauth ado token --output token` (cached 30 minutes).
141
+
142
+ This feeds `discoverFromPrs` — when `buildStatus` flips to `"failing"`, the next discovery tick dispatches a fix agent. When `status` becomes `"merged"`, the PR drops out of active polling.
143
+
144
+ ## Discovery Gates
145
+
146
+ Every discovered item passes through three gates before being queued:
147
+
148
+ ```
149
+ Item found
150
+
151
+ ├─ isAlreadyDispatched(key)? → skip if already in pending or active queue
152
+ │ Key format: <source>-<projectName>-<itemId>
153
+ │ e.g., "prd-MyProject-M001", "review-MyRepo-PR-123"
154
+
155
+ ├─ isOnCooldown(key)? → skip if dispatched within cooldown window
156
+ │ Default: 30min for PRD/PRs, 0 for work-items
157
+ │ Cooldowns are in-memory (reset on engine restart)
158
+
159
+ └─ resolveAgent(workType)? → skip if no idle agent available
160
+ Checks routing.md: preferred → fallback → any idle agent
161
+ ```
162
+
163
+ ## Agent Routing
164
+
165
+ `resolveAgent()` parses `routing.md` to pick the right agent:
166
+
167
+ ```
168
+ routing.md table:
169
+ implement → dallas (fallback: ralph)
170
+ implement:large → rebecca (fallback: dallas)
171
+ review → ripley (fallback: lambert)
172
+ fix → _author_ (fallback: dallas) ← routes to PR author
173
+ analyze → lambert (fallback: rebecca)
174
+ explore → ripley (fallback: rebecca)
175
+ test → dallas (fallback: ralph)
176
+ ```
177
+
178
+ Resolution order:
179
+ 1. Check if **preferred** agent is idle → use it
180
+ 2. Check if **fallback** agent is idle → use it
181
+ 3. Check **any** agent that's idle → use it
182
+ 4. If all busy → return null, item stays undiscovered until next tick
183
+
184
+ ## Dispatch Queue
185
+
186
+ Discovered items land in `engine/dispatch.json`:
187
+
188
+ ```json
189
+ {
190
+ "pending": [ ... ], // Waiting to be spawned
191
+ "active": [ ... ], // Currently running
192
+ "completed": [ ... ] // Finished (last 100 kept)
193
+ }
194
+ ```
195
+
196
+ Each tick, the engine checks available slots:
197
+
198
+ ```
199
+ slotsAvailable = maxConcurrent (3) - activeCount
200
+ ```
201
+
202
+ It takes up to `slotsAvailable` items from pending and spawns them. Items are processed in discovery-priority order (fixes first, then reviews, then implements, then work-items).
203
+
204
+ ## Execution: spawnAgent()
205
+
206
+ When an item is dispatched:
207
+
208
+ ### 1. Resolve Project Context
209
+ ```
210
+ meta.project.localPath → rootDir (the repo on disk)
211
+ ```
212
+
213
+ ### 2. Create Git Worktree (if task has a branch)
214
+ ```
215
+ git worktree add <rootDir>/../worktrees/<branch> -b <branch> <mainBranch>
216
+ ```
217
+ - Branch names are sanitized (alphanumeric, dots, hyphens, slashes only, max 200 chars)
218
+ - If worktree fails for implement/fix → item marked as error, moved to completed
219
+ - If worktree fails for review/analyze → falls back to rootDir (read-only tasks)
220
+
221
+ ### 3. Render Playbook
222
+ ```
223
+ playbooks/<type>.md → substitute {{variables}} → append notes.md → append learnings requirement
224
+ ```
225
+
226
+ Variables injected from config and item metadata:
227
+ - `{{project_name}}`, `{{ado_org}}`, `{{ado_project}}`, `{{repo_name}}` — from project config
228
+ - `{{agent_name}}`, `{{agent_id}}`, `{{agent_role}}` — from agent roster
229
+ - `{{item_id}}`, `{{item_name}}`, `{{branch_name}}`, `{{repo_id}}` — from work item
230
+ - `{{team_root}}` — path to central `.minions/` directory
231
+
232
+ ### 4. Build System Prompt
233
+ Combines:
234
+ - Agent identity (name, role, skills)
235
+ - Agent charter (`agents/<name>/charter.md`)
236
+ - Project context (repo name, repo host config, main branch)
237
+ - Critical rules (worktrees, MCP tools, PowerShell, learnings)
238
+ - Full `notes.md` content
239
+
240
+ ### 5. Spawn Claude CLI
241
+ ```bash
242
+ claude -p <prompt-file> --system-prompt <sysprompt-file> \
243
+ --output-format json --max-turns 100 --verbose \
244
+ --allowedTools Edit,Write,Read,Bash,Glob,Grep,Agent,WebFetch,WebSearch
245
+ ```
246
+
247
+ - Process runs in the worktree directory (or rootDir for reviews)
248
+ - stdout/stderr captured (capped at 1MB each)
249
+ - CLAUDECODE env vars stripped to allow nested sessions
250
+
251
+ ### 6. Track State
252
+ - Agent status derived from dispatch queue (`engine/dispatch.json`)
253
+ - Dispatch item → moved from `pending` to `active` in `dispatch.json`
254
+ - Process tracked in `activeProcesses` Map for timeout monitoring
255
+
256
+ ## Post-Completion
257
+
258
+ When the claude process exits:
259
+
260
+ ```
261
+ proc.on('close')
262
+
263
+ ├─ Save output to agents/<name>/output.log
264
+
265
+ ├─ Dispatch completion determines visible agent status ("done"/"error")
266
+
267
+ ├─ Move dispatch item: active → completed
268
+
269
+ ├─ Sync PRs from output (scan for PR URLs → pull-requests.json)
270
+
271
+ ├─ Post-completion hooks:
272
+ │ review → update PR minionsReview in pull-requests.json, vote on ADO
273
+ │ fix → set PR minionsReview back to "waiting"
274
+ │ build-test → (agent auto-files fix work items on failure)
275
+
276
+ ├─ Check for learnings in notes/inbox/
277
+ │ (warns if agent didn't write findings)
278
+
279
+ ├─ Update agent history and metrics
280
+
281
+ └─ Clean up temp prompt files from engine/
282
+ ```
283
+
284
+ ## Command Center (Dashboard Input)
285
+
286
+ The dashboard exposes a unified input box at `http://localhost:7331` that parses natural-language intent into structured work items, decisions, or PRD items.
287
+
288
+ **Syntax:**
289
+ | Token | Effect |
290
+ |-------|--------|
291
+ | `@agent` | Assigns to a specific agent (sets `item.agent`) |
292
+ | `@everyone` | Fan-out to all agents (sets `scope: 'fan-out'`) |
293
+ | `!high` / `!low` | Sets priority (default: medium) |
294
+ | `/decide` | Creates a decision (appended to `notes.md`) |
295
+ | `/prd` | Creates a PRD item (appended to `prd-gaps.json`) |
296
+ | `#project` | Targets a specific project queue |
297
+
298
+ Work type is auto-detected from keywords (fix, explore, test, review → implement as fallback). The `@agent` assignment flows through to the engine: `item.agent || resolveAgent(workType, config)`.
299
+
300
+ ## Data Flow Diagram
301
+
302
+ ```
303
+ Dashboard Command Center
304
+ (unified intent input)
305
+
306
+ ┌──────────┼──────────┐
307
+ ▼ ▼ ▼
308
+ work-items decisions prd-gaps
309
+ .json .md .json
310
+
311
+ Per-project sources: Central engine: Agents:
312
+
313
+ work-items.json ──┐
314
+ prd-gaps.json ────┤
315
+ pull-requests.json┤ discoverWork() dispatch.json
316
+ docs/**/*.md (specs)┤ (each tick) ┌──────────┐
317
+ │ │ │ pending │
318
+ ~/.minions/ │ │ │ active │
319
+ work-items.json ┤ │ │ completed │
320
+ plans/*.md ─────┘ ▼ └─────┬────┘
321
+ addToDispatch()─────────┘
322
+
323
+ ADO REST API ─── pollPrBuildStatus() ──► pull-requests.json
324
+ (every 6min) (buildStatus field) │
325
+ spawnAgent()
326
+
327
+ ┌────────────┼────────────┐
328
+ ▼ ▼ ▼
329
+ worktree claude CLI dispatch.json
330
+ (in project (max 100 (active/completed)
331
+ repo dir) turns)
332
+
333
+ on exit:
334
+
335
+ ┌───────────┬───────┼───────┬──────────┐
336
+ ▼ ▼ ▼ ▼ ▼
337
+ output.log notes/ PRs work-items localhost
338
+ (per agent) inbox/*.md .json .json (if webapp,
339
+ │ (auto-filed from build
340
+ consolidateInbox() on failure) & test)
341
+ (at 5+ files)
342
+
343
+
344
+ notes.md
345
+ (injected into
346
+ all future
347
+ playbooks)
348
+ ```
349
+
350
+ ## Timeout & Stale Detection
351
+
352
+ Two layers of protection:
353
+
354
+ **Agent timeout** (`engine.agentTimeout`, default 5 hours / 18,000,000ms):
355
+ - Checks `activeProcesses` Map for elapsed time
356
+ - Sends SIGTERM, then SIGKILL after 5s
357
+
358
+ **Stale detection** (`engine.heartbeatTimeout`, default 5 min / 300,000ms):
359
+ - Scans `dispatch.active` for items where `started_at` exceeds threshold
360
+ - Catches cases where the process exited but dispatch wasn't cleaned up
361
+ - Kills process if still tracked, marks dispatch as error, resets agent to idle
362
+
363
+ ## Cooldown Behavior
364
+
365
+ | Source | Default Cooldown | Behavior |
366
+ |--------|-----------------|----------|
367
+ | PRD items | 30 minutes | After dispatching M001, won't re-discover it for 30min |
368
+ | Pull requests | 30 minutes | After dispatching a review, won't re-queue for 30min |
369
+ | Work items | 0 (immediate) | No cooldown, but item status changes to "dispatched" |
370
+
371
+ Cooldowns are persisted to `engine/cooldowns.json` and reloaded on engine startup. Entries older than 24 hours are pruned. `isAlreadyDispatched()` still provides an additional guard by checking pending/active and recently completed dispatches in `dispatch.json`.
372
+
373
+ ## Configuration Reference
374
+
375
+ All discovery behavior is controlled via `config.json`:
376
+
377
+ ```json
378
+ {
379
+ "engine": {
380
+ "tickInterval": 60000, // ms between ticks
381
+ "maxConcurrent": 3, // max agents running at once
382
+ "agentTimeout": 18000000, // 5 hours — kill hung processes
383
+ "heartbeatTimeout": 300000, // 5min — kill stale/silent agents
384
+ "maxTurns": 100, // max claude CLI turns per agent
385
+ "worktreeCreateTimeout": 300000, // timeout for git worktree add on large repos
386
+ "worktreeCreateRetries": 1 // retry count for transient add failures
387
+ },
388
+ "projects": [
389
+ {
390
+ "name": "MyProject",
391
+ "localPath": "C:/Users/you/MyProject",
392
+ "workSources": {
393
+ "prd": {
394
+ "enabled": true,
395
+ "path": "docs/prd-gaps.json",
396
+ "itemFilter": { "status": ["missing", "planned"] },
397
+ "cooldownMinutes": 30
398
+ },
399
+ "pullRequests": {
400
+ "enabled": true,
401
+ "path": "projects/<name>/pull-requests.json",
402
+ "cooldownMinutes": 30
403
+ },
404
+ "workItems": {
405
+ "enabled": true,
406
+ "path": "projects/<name>/work-items.json",
407
+ "cooldownMinutes": 0
408
+ }
409
+ }
410
+ }
411
+ ]
412
+ }
413
+ ```
414
+
415
+ To disable a work source for a project, set `"enabled": false`. To change where the engine looks for PRD or PR files, change the `path` field (resolved relative to `localPath`).
416
+
@@ -0,0 +1,128 @@
1
+ # Blog: First Successful Agent Dispatch — The Spawn Debugging Saga
2
+
3
+ **Date:** 2026-03-12
4
+ **Author:** Minions Team + Claude Opus 4.6
5
+ **Result:** PR #4959092 created by Dallas on office-bohemia
6
+
7
+ ## The Task
8
+
9
+ Simple ask: add a `/cowork` route to the bebop webapp, referencing an existing commit as a pattern. Created via the dashboard Command Center as a central work item (auto-route, agent decides which project).
10
+
11
+ ## What Went Wrong (7 Failed Attempts)
12
+
13
+ ### Attempt 1-2: Shell Metacharacter Explosion
14
+
15
+ The engine spawned agents via bash:
16
+ ```bash
17
+ claude -p "$(cat 'prompt.md')" --system-prompt "$(cat 'sysprompt.md')" ...
18
+ ```
19
+
20
+ The prompt contained parentheses, backticks, and special characters:
21
+ ```
22
+ Each repo has its own ADO config (org, project, repoId) — use the correct one per repo
23
+ ```
24
+
25
+ Bash interpreted `(org, project, repoId)` as a subshell command. **Silent failure** — no error logged, agent process died instantly.
26
+
27
+ ### Attempt 3-4: Direct Spawn Without Shell
28
+
29
+ Switched to `spawn('claude', args)` — no bash wrapper. But on Windows, `claude` is a POSIX shell script (`#!/bin/sh`), not a binary. Node's `spawn` without `shell: true` couldn't execute it. **Process returned no PID**, error callback never fired.
30
+
31
+ ### Attempt 5: Shell True With Content Args
32
+
33
+ Added `shell: true` to the spawn. But now the `--system-prompt` arg (3KB of text with special chars) got shell-interpreted again. Same metacharacter problem, different location.
34
+
35
+ ### Attempt 6: Node Wrapper Script
36
+
37
+ Created `spawn-agent.js` — a Node wrapper that reads files and calls claude. Engine spawns `node spawn-agent.js <files> <args>`. Wrapper uses `shell: true` to find claude binary. **Same problem** — system prompt content as a CLI arg still gets shell-expanded.
38
+
39
+ ### Attempt 7: The Actual Fix
40
+
41
+ Realized the claude shell wrapper at `/c/.tools/.npm-global/claude` just calls:
42
+ ```sh
43
+ exec node "$basedir/node_modules/@anthropic-ai/claude-code/cli.js" "$@"
44
+ ```
45
+
46
+ **Solution:** Resolve the actual `cli.js` entry point and spawn it directly via `process.execPath` (the current node binary). No shell involved at all:
47
+
48
+ ```javascript
49
+ // spawn-agent.js
50
+ const proc = spawn(process.execPath, [claudeBin, ...cliArgs], {
51
+ stdio: ['pipe', 'pipe', 'pipe'],
52
+ env // CLAUDECODE vars already stripped
53
+ });
54
+ proc.stdin.write(prompt); // prompt via stdin — no shell interpretation
55
+ proc.stdin.end();
56
+ ```
57
+
58
+ ## Other Issues Discovered Along the Way
59
+
60
+ ### Output Format: json vs stream-json
61
+
62
+ `--output-format json` produces **one JSON blob at exit**. No streaming output during execution. This broke:
63
+ - Live output in dashboard (nothing to show until agent finishes)
64
+ - Heartbeat monitoring (no file writes to check mtime against)
65
+
66
+ Fix: switched to `--output-format stream-json` — streams events as they happen.
67
+
68
+ ### Permission Mode
69
+
70
+ Agents would hang waiting for permission prompts (invisible in headless mode). Added `--permission-mode bypassPermissions`.
71
+
72
+ ### CLAUDECODE Environment Variable
73
+
74
+ Claude Code sets `CLAUDECODE` env var to prevent nested sessions. Spawned agents inherit it and refuse to start. The engine strips it from `childEnv`, but the wrapper script was using `process.env` (which re-inherits from the parent). Fixed by stripping in both places.
75
+
76
+ ### Heartbeat vs Stale Detection
77
+
78
+ Original approach: kill agents after a fixed time threshold (staleThreshold). Problem: agents can legitimately run for hours on complex tasks.
79
+
80
+ New approach: heartbeat based on `live-output.log` mtime. As long as the agent produces output, it's alive. If silent for 5 minutes → declared dead. Catches orphaned processes (engine restart loses process handles) and hung agents.
81
+
82
+ ### Engine Restart Orphan Problem
83
+
84
+ When the engine restarts, the in-memory `activeProcesses` Map is lost. Active dispatch items stay in `dispatch.json` but the engine has no process handle. Old stale detection (6h threshold) was too slow to catch this. The heartbeat check catches it in 5 minutes.
85
+
86
+ ## The Successful Run
87
+
88
+ **Dispatch 1773292681199** — Dallas, central work item, auto-route:
89
+
90
+ 1. Engine spawns `node spawn-agent.js prompt.md sysprompt.md --output-format stream-json --verbose --permission-mode bypassPermissions`
91
+ 2. spawn-agent.js resolves `cli.js`, spawns `node cli.js -p --system-prompt <content> ...`
92
+ 3. Prompt piped via stdin — no shell interpretation
93
+ 4. MCP servers connect (azure-ado, azure-kusto, mobile, DevBox)
94
+ 5. Dallas reads the reference commit, explores office-bohemia's route structure
95
+ 6. Creates `apps/bebop/src/routes/_mainLayout.cowork.tsx`
96
+ 7. Updates route tree
97
+ 8. Creates git worktree, commits, pushes
98
+ 9. Creates **PR #4959092** via `mcp__azure-ado__repo_create_pull_request`
99
+ 10. Posts implementation notes as PR thread comment
100
+ 11. Writes learnings to `decisions/inbox/dallas-2026-03-12.md`
101
+ 12. Exits with code 0
102
+
103
+ **Time:** ~8 minutes from dispatch to PR created
104
+ **Output:** 70KB of stream-json events captured in `live-output.log`
105
+
106
+ ## Architecture Lessons
107
+
108
+ 1. **Never pass user content through shell expansion.** Use stdin or direct args via Node's `spawn` (without shell).
109
+ 2. **On Windows, npm-installed CLI tools are shell wrappers.** Resolve the actual `.js` entry point and spawn via `node`.
110
+ 3. **Streaming output format is essential** for monitoring long-running agents. One-shot JSON is useless for heartbeats.
111
+ 4. **Environment variable inheritance is tricky** with nested spawns. Strip at every level.
112
+ 5. **Heartbeats > timeouts** for agents that can run for hours. Check liveness, not elapsed time.
113
+
114
+ ## The Spawn Chain (Final Working Version)
115
+
116
+ ```
117
+ engine.js (tick loop)
118
+ → spawn(process.execPath, ['spawn-agent.js', prompt.md, sysprompt.md, ...args])
119
+ → spawn-agent.js reads files, resolves cli.js path
120
+ → spawn(process.execPath, ['cli.js', '-p', '--system-prompt', content, ...args])
121
+ → claude-code runs with prompt via stdin
122
+ → agent works, streams JSON events to stdout
123
+ → engine captures to live-output.log (heartbeat)
124
+ → dashboard polls /api/agent/:id/live (3s refresh)
125
+ ```
126
+
127
+ No bash. No shell. No metacharacter interpretation. Just Node spawning Node.
128
+