@yemi33/squad 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 yemi33
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,483 @@
1
+ # Squad — Central AI Development Team
2
+
3
+ A multi-project AI dev team that runs from `~/.squad/`. Five autonomous agents share a single engine, dashboard, knowledge base, and MCP toolchain — working across any number of linked repos with self-improving workflows.
4
+
5
+ Inspired by and initially scaffolded from [Brady Gaster's Squad](https://bradygaster.github.io/squad/).
6
+
7
+ ## Prerequisites
8
+
9
+ - **Node.js** 18+ (LTS recommended)
10
+ - **Claude Code CLI** — install with `npm install -g @anthropic-ai/claude-code`
11
+ - **Anthropic API key** or Claude Max subscription (agents spawn Claude Code sessions)
12
+ - **Git** — agents create worktrees for all code changes
13
+
14
+ ## Installation
15
+
16
+ ```bash
17
+ # Install globally from npm
18
+ npm install -g @yemi33/squad
19
+
20
+ # Bootstrap ~/.squad/ with default config and agents
21
+ squad init
22
+
23
+ # Link your first project (interactive — auto-detects from git remote)
24
+ squad add ~/my-project
25
+ ```
26
+
27
+ Or try without installing:
28
+
29
+ ```bash
30
+ npx @yemi33/squad init
31
+ ```
32
+
33
+ No npm dependencies: Squad itself uses only Node.js built-in modules.
34
+
35
+ **Alternative: clone directly**
36
+ ```bash
37
+ git clone https://github.com/yemi33/squad.git ~/.squad
38
+ node ~/.squad/squad.js init
39
+ ```
40
+
41
+ ## Quick Start
42
+
43
+ ```bash
44
+ # 1. Link your projects (interactive — prompts for name, description, repo config)
45
+ squad add ~/repo1
46
+ squad add ~/repo2
47
+
48
+ # 2. Start the engine (runs in foreground, ticks every 60s)
49
+ squad start
50
+
51
+ # 3. Open the dashboard (separate terminal)
52
+ squad dash
53
+ # → http://localhost:7331
54
+ ```
55
+
56
+ ## Setup via Claude Code
57
+
58
+ If you use Claude Code as your daily driver, you can set up Squad by prompting Claude directly:
59
+
60
+ **First-time setup:**
61
+ ```
62
+ Install squad with `npm install -g @yemi33/squad`, run `squad init`,
63
+ then link my project at ~/my-project with `squad add ~/my-project` —
64
+ answer the interactive prompts using what you can auto-detect from the repo.
65
+ ```
66
+
67
+ **Give the squad work:**
68
+ ```
69
+ Add a work item to my squad: "Explore the codebase and document the architecture"
70
+ — run `squad work "Explore the codebase and document the architecture"`
71
+ ```
72
+
73
+ **Check status:**
74
+ ```
75
+ Run `squad status` and tell me what my squad is doing
76
+ ```
77
+
78
+ ### What happens on first run
79
+
80
+ 1. The engine starts ticking every 60 seconds
81
+ 2. It scans each linked project for work: PRs needing review, PRD gaps, queued work items
82
+ 3. If it finds work and an agent is idle, it spawns a Claude Code session with the right playbook
83
+ 4. You can watch progress on the dashboard or via `squad status`
84
+
85
+ To give the squad its first task, open the dashboard Command Center and add a work item, or use the CLI:
86
+ ```bash
87
+ squad work "Explore the codebase and document the architecture"
88
+ ```
89
+
90
+ ## CLI Reference
91
+
92
+ | Command | Description |
93
+ |---------|-------------|
94
+ | `squad init` | Bootstrap `~/.squad/` with default agents and config |
95
+ | `squad add <dir>` | Link a project (auto-detects settings from git, prompts to confirm) |
96
+ | `squad remove <dir>` | Unlink a project |
97
+ | `squad list` | List all linked projects with descriptions |
98
+ | `squad start` | Start engine daemon (ticks every 60s, auto-syncs MCP servers) |
99
+ | `squad stop` | Stop the engine |
100
+ | `squad status` | Show agents, projects, dispatch queue, quality metrics |
101
+ | `squad pause` / `resume` | Pause/resume dispatching |
102
+ | `squad dispatch` | Force a dispatch cycle |
103
+ | `squad discover` | Dry-run work discovery |
104
+ | `squad work <title> [opts]` | Add to central work queue |
105
+ | `squad spawn <agent> <prompt>` | Manually spawn an agent |
106
+ | `squad plan <file\|text> [proj]` | Run a plan |
107
+ | `squad cleanup` | Run cleanup manually (temp files, worktrees, zombies) |
108
+ | `squad dash` | Start web dashboard (default port 7331) |
109
+
110
+ You can also run scripts directly: `node ~/.squad/engine.js start`, `node ~/.squad/dashboard.js`, etc.
111
+
112
+ ## Architecture
113
+
114
+ ```
115
+ ┌───────────────────────────────┐
116
+ │ ~/.squad/ (central) │
117
+ │ │
118
+ │ engine.js ← tick 60s │
119
+ │ dashboard.js ← :7331 │
120
+ │ config.json ← projects │
121
+ │ mcp-servers.json ← auto-sync │
122
+ │ agents/ ← 5 agents │
123
+ │ playbooks/ ← templates │
124
+ │ skills/ ← workflows │
125
+ │ notes/ ← knowledge │
126
+ └──────┬────────────────────────┘
127
+ │ discovers work + dispatches agents
128
+ ┌────────────┼────────────────┐
129
+ ▼ ▼ ▼
130
+ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
131
+ │ Project A │ │ Project B │ │ Project C │
132
+ │ .squad/ │ │ .squad/ │ │ .squad/ │
133
+ │ work-items │ │ work-items │ │ work-items │
134
+ │ pull-reqs │ │ pull-reqs │ │ pull-reqs │
135
+ │ docs/ │ │ docs/ │ │ docs/ │
136
+ │ prd-gaps │ │ prd-gaps │ │ prd-gaps │
137
+ │ .claude/ │ │ .claude/ │ │ .claude/ │
138
+ │ skills/ │ │ skills/ │ │ skills/ │
139
+ └──────────────┘ └──────────────┘ └──────────────┘
140
+ ```
141
+
142
+ ## What It Does
143
+
144
+ - **Auto-discovers work** from PRD gaps, pull requests, and work queues across all linked projects
145
+ - **Dispatches AI agents** (Claude CLI) with full project context, git worktrees, and MCP server access
146
+ - **Routes intelligently** — fixes first, then reviews, then implementation, matched to agent strengths
147
+ - **Learns from itself** — agents write findings, engine consolidates into institutional knowledge
148
+ - **Tracks quality** — approval rates, error rates, and task metrics per agent
149
+ - **Shares workflows** — agents create reusable skills (Claude Code-compatible) that all other agents can follow
150
+ - **Supports cross-repo tasks** — a single work item can span multiple repositories
151
+ - **Fan-out dispatch** — broad tasks can be split across all idle agents in parallel, each assigned a project
152
+ - **Auto-syncs PRs** — engine scans agent output for PR URLs and updates project trackers automatically
153
+ - **Auto-extracts skills** — agents write ` ```skill ` blocks in output; engine auto-extracts them
154
+ - **Heartbeat monitoring** — detects dead/hung agents via output file activity, not just timeouts
155
+ - **Auto-cleanup** — stale temp files, orphaned worktrees, zombie processes cleaned every 10 minutes
156
+
157
+ ## Dashboard
158
+
159
+ The web dashboard at `http://localhost:7331` provides:
160
+
161
+ - **Projects bar** — all linked projects with descriptions (hover for full text)
162
+ - **Command Center** — add work items (per-project, auto-route, or fan-out), notes, and PRD items
163
+ - **Squad Members** — agent cards with status, click for charter/history/output detail panel
164
+ - **Live Output tab** — real-time streaming output for working agents (auto-refreshes every 3s)
165
+ - **Work Items** — paginated table with status, source, type, priority, assigned agent, linked PRs, fan-out badges, and retry button for failed items
166
+ - **PRD** — gap analysis stats + progress bar with item-level breakdown and linked PRs per item
167
+ - **Pull Requests** — paginated PR tracker sorted by date, with review/build/merge status
168
+ - **Skills** — agent-created reusable workflows (squad-wide + project-specific), click to view full content
169
+ - **Notes Inbox + Team Notes** — learnings and team rules
170
+ - **Dispatch Queue + Engine Log** — active/pending work and audit trail
171
+ - **Agent Metrics** — tasks completed, errors, PRs created/approved/rejected, approval rates
172
+
173
+ ## Project Config
174
+
175
+ When you run `squad add <dir>`, it prompts for project details and saves them to `config.json`. Each project entry looks like:
176
+
177
+ ```json
178
+ {
179
+ "name": "MyProject",
180
+ "description": "What this repo is for — agents read this to decide where to work",
181
+ "localPath": "C:/Users/you/MyProject",
182
+ "repoHost": "github",
183
+ "repositoryId": "",
184
+ "adoOrg": "your-github-org",
185
+ "adoProject": "",
186
+ "repoName": "MyProject",
187
+ "mainBranch": "main",
188
+ "workSources": {
189
+ "prd": { "enabled": true, "path": "docs/prd-gaps.json" },
190
+ "pullRequests": { "enabled": true, "path": ".squad/pull-requests.json" },
191
+ "workItems": { "enabled": true, "path": ".squad/work-items.json" }
192
+ }
193
+ }
194
+ ```
195
+
196
+ **Key fields:**
197
+ - `description` — critical for auto-routing. Agents read this to decide which repo to work in.
198
+ - `repoHost` — `"ado"` (Azure DevOps) or `"github"`. Controls which MCP tools agents use for PR creation, review comments, and status checks. Defaults to `"ado"`.
199
+ - `repositoryId` — required for ADO (the repo GUID), optional for GitHub.
200
+ - `adoOrg` — ADO organization or GitHub org/user.
201
+ - `adoProject` — ADO project name (leave blank for GitHub).
202
+ - `workSources` — toggle which work sources the engine scans for each project.
203
+
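The key-field rules above can be sketched as a small check. This is an illustrative sketch only: `normalizeProject` is a hypothetical helper, not part of Squad, and it encodes just the rules stated above (`repoHost` defaults to `"ado"`, `repositoryId` is required for ADO, `description` drives auto-routing).

```javascript
// Hypothetical helper (not Squad's code): applies the documented default and
// reports problems with a project entry instead of throwing.
function normalizeProject(entry) {
  const project = { ...entry, repoHost: entry.repoHost || 'ado' };
  const problems = [];
  if (!['ado', 'github'].includes(project.repoHost)) {
    problems.push(`unknown repoHost "${project.repoHost}"`);
  }
  if (project.repoHost === 'ado' && !project.repositoryId) {
    problems.push('repositoryId is required for ADO');
  }
  if (!project.description) {
    problems.push('description is empty; auto-routing relies on it');
  }
  return { project, problems };
}

const checked = normalizeProject({ name: 'MyProject', description: 'Demo repo' });
console.log(checked.project.repoHost, checked.problems);
```

Returning a list of problems rather than throwing lets a caller show every issue at once.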
204
+ `squad add` also creates `<project>/.squad/` with empty `work-items.json` and `pull-requests.json`.
205
+
206
+ ### Auto-Discovery
207
+
208
+ When you run `squad add`, the tool automatically detects what it can from the repo:
209
+
210
+ | What | How |
211
+ |------|-----|
212
+ | Main branch | `git symbolic-ref` |
213
+ | Repo host | Git remote URL (github.com → `github`, visualstudio.com/dev.azure.com → `ado`) |
214
+ | Org / project / repo | Parsed from git remote URL |
215
+ | Description | First non-heading line from `CLAUDE.md` or `README.md` |
216
+ | Project name | `name` field from `package.json` |
217
+
218
+ All detected values are shown as defaults in the interactive prompts — just press Enter to accept or type to override.
219
+
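As a rough illustration of the repo-host row in the table above, the sketch below maps a git remote URL to a `repoHost` value. `detectRepoHost` is a hypothetical name, and the handling of scp-style remotes (`git@host:path`) is an assumption; the README only states the hostname mapping.

```javascript
// Hypothetical helper (not Squad's code): maps a git remote URL to "github" or "ado".
function detectRepoHost(remoteUrl) {
  let host = '';
  try {
    host = new URL(remoteUrl).hostname;
  } catch {
    // Assumption: scp-style remotes such as git@github.com:org/repo.git
    const match = /^[^@\s]+@([^:\s]+):/.exec(remoteUrl);
    host = match ? match[1] : '';
  }
  host = host.toLowerCase();
  if (host === 'github.com') return 'github';
  if (host.endsWith('dev.azure.com') || host.endsWith('visualstudio.com')) return 'ado';
  return null; // unknown host: leave the decision to the interactive prompt
}

console.log(detectRepoHost('https://github.com/yemi33/squad.git'));
console.log(detectRepoHost('https://dev.azure.com/org/project/_git/repo'));
```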
220
+ ### Project Conventions (CLAUDE.md)
221
+
222
+ When dispatching agents, the engine reads each project's `CLAUDE.md` and injects it into the agent's system prompt as "Project Conventions". This means agents automatically follow repo-specific rules (logging, build commands, coding style, etc.) without needing to discover them each time. Each project can have different conventions.
223
+
224
+ ## MCP Server Integration
225
+
226
+ Agents need MCP tools to interact with your repo host (create PRs, post review comments, etc.). On engine start, MCP servers are auto-synced from `~/.claude.json` to `mcp-servers.json`.
227
+
228
+ **Example:** If you use Azure DevOps, configure the `azure-ado` MCP server in your Claude Code settings. If you use GitHub, configure the `github` MCP server. Agents will discover and use whichever tools are available.
229
+
230
+ Manually refresh with `node engine.js mcp-sync`.
231
+
232
+ ## Work Items
233
+
234
+ All work items use the shared `playbooks/work-item.md` template, which provides consistent branch naming, worktree workflow, PR creation steps, and status tracking.
235
+
236
+ **Per-project** — scoped to one repo. Select a project in the Command Center dropdown.
237
+
238
+ **Central (auto-route)** — agent gets all project descriptions and decides where to work. Use "Auto (agent decides)" in the dropdown, or `node engine.js work "title"`. Can span multiple repos.
239
+
240
+ ### Fan-Out (Parallel Multi-Agent)
241
+
242
+ Set Scope to "Fan-out (all agents)" in the Command Center, or add `"scope": "fan-out"` to the work item JSON.
243
+
244
+ The engine dispatches the task to **all idle agents simultaneously**, assigning each a project (round-robin). Each agent focuses on their assigned project and writes findings to the inbox.
245
+
246
+ ```
247
+ "Explore all codebases and write architecture doc" scope: fan-out
248
+
249
+ ├─→ Ripley → Project A
250
+ ├─→ Lambert → Project B
251
+ ├─→ Rebecca → Project C
252
+ └─→ Ralph → Project A (round-robin wraps)
253
+ ```
254
+
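The round-robin assignment in the diagram above can be sketched in a few lines. `assignFanOut` is a hypothetical name used for illustration; how the real engine picks idle agents is not shown here.

```javascript
// Hypothetical sketch: pair each idle agent with a project, wrapping around
// when there are more agents than projects.
function assignFanOut(idleAgents, projects) {
  if (projects.length === 0) return [];
  return idleAgents.map((agent, i) => ({
    agent,
    project: projects[i % projects.length],
  }));
}

const assignments = assignFanOut(
  ['Ripley', 'Lambert', 'Rebecca', 'Ralph'],
  ['Project A', 'Project B', 'Project C']
);
for (const { agent, project } of assignments) {
  console.log(`${agent} -> ${project}`);
}
```

With four agents and three projects, the fourth agent wraps back to the first project, matching the diagram.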
255
+ ### Failed Work Items
256
+
257
+ When an agent fails (timeout, crash, error), the engine marks the work item as `failed` with a reason. The dashboard shows a **Retry** button that resets it to `pending` for re-dispatch.
258
+
259
+ ## Auto-Discovery Pipeline
260
+
261
+ The engine discovers work from 5 sources, in priority order:
262
+
263
+ | Priority | Source | Dispatch Type |
264
+ |----------|--------|---------------|
265
+ | 1 | PRs with changes-requested | `fix` |
266
+ | 2 | PRs pending review | `review` |
267
+ | 3 | PRD items (missing/planned) | `implement` |
268
+ | 4 | Per-project work items | item's `type` |
269
+ | 5 | Central work items | item's `type` |
270
+
271
+ Each item passes through: dedup (checks pending, active, AND recently completed), cooldown, and agent availability gates. See `docs/auto-discovery.md` for the full pipeline.
272
+
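A minimal sketch of the ordering and dedup steps, assuming each discovered item carries a unique `key` and a `source` label. The labels and `orderCandidates` are hypothetical, and the cooldown and agent-availability gates are left out.

```javascript
// Hypothetical source labels; priorities follow the table above.
const SOURCE_PRIORITY = {
  'pr-changes-requested': 1, // dispatched as `fix`
  'pr-pending-review': 2,    // dispatched as `review`
  'prd-item': 3,             // dispatched as `implement`
  'project-work-item': 4,    // dispatched as the item's own type
  'central-work-item': 5,    // dispatched as the item's own type
};

// Drops items already known (pending, active, or recently completed) and
// duplicates within the batch, then sorts by source priority.
function orderCandidates(candidates, knownKeys) {
  const seen = new Set(knownKeys);
  const fresh = [];
  for (const candidate of candidates) {
    if (seen.has(candidate.key)) continue;
    seen.add(candidate.key);
    fresh.push(candidate);
  }
  return fresh.sort((a, b) => SOURCE_PRIORITY[a.source] - SOURCE_PRIORITY[b.source]);
}

const queue = orderCandidates(
  [
    { key: 'wi-7', source: 'central-work-item' },
    { key: 'pr-12', source: 'pr-pending-review' },
    { key: 'pr-9', source: 'pr-changes-requested' },
    { key: 'pr-12', source: 'pr-pending-review' }, // duplicate in the same batch
    { key: 'prd-3', source: 'prd-item' },
  ],
  ['prd-3'] // already dispatched earlier
);
console.log(queue.map((c) => c.key));
```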
273
+ ## Agent Execution
274
+
275
+ ### Spawn Chain
276
+
277
+ ```
278
+ engine.js
279
+ → spawn node spawn-agent.js <prompt.md> <sysprompt.md> <args...>
280
+ → spawn-agent.js resolves claude-code cli.js path
281
+ → spawn node cli.js -p --system-prompt <content> <args...>
282
+ → prompt piped via stdin (avoids shell metacharacter issues)
283
+ ```
284
+
285
+ No bash or shell involved — Node spawns Node directly. Prompts with special characters (parentheses, backticks, etc.) are safe.
286
+
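The stdin-piping idea can be sketched with Node's `child_process.spawn`. This is not Squad's `spawn-agent.js`: `runWithPrompt` is a hypothetical helper, and the child here is a stand-in that echoes stdin rather than the Claude Code CLI.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: starts a Node child without a shell and feeds the
// prompt through stdin, so shell metacharacters are never interpreted.
function runWithPrompt(nodeArgs, prompt) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, nodeArgs, {
      stdio: ['pipe', 'pipe', 'inherit'],
    });
    let out = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { out += chunk; });
    child.on('error', reject);
    child.on('close', (code) => resolve({ code, out }));
    child.stdin.end(prompt); // write the prompt, then close stdin
  });
}

// Stand-in child process that copies stdin to stdout.
const echoChild = ['-e', 'process.stdin.pipe(process.stdout)'];
const demo = runWithPrompt(echoChild, 'fix `foo()` && echo $(whoami)\n');
demo.then(({ out }) => console.log(out));
```

Because no shell is involved, the backticks and `$(...)` in the prompt reach the child unchanged.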
287
+ ### What Each Agent Gets
288
+
289
+ - **System prompt** — identity, charter, history, project context, critical rules, skill index, team notes
290
+ - **Task prompt** — rendered playbook with `{{variables}}` filled from config
291
+ - **Working directory** — project root (agent creates worktrees as needed)
292
+ - **MCP servers** — all servers from `~/.claude.json` via `--mcp-config`
293
+ - **Full tool access** — all built-in tools plus all MCP tools
294
+ - **Permission mode** — `bypassPermissions` (no interactive prompts)
295
+ - **Output format** — `stream-json` (real-time streaming for live dashboard + heartbeat)
296
+
297
+ ### Post-Completion
298
+
299
+ When an agent finishes:
300
+ 1. Output saved to `agents/<name>/output.log`
301
+ 2. Agent status updated (done/error)
302
+ 3. Work item status updated (done/failed)
303
+ 4. PRs auto-synced from output → project's `pull-requests.json`
304
+ 5. Agent history updated (last 20 tasks)
305
+ 6. Quality metrics updated
306
+ 7. Review feedback created for PR authors (if review task)
307
+ 8. Learnings checked in `notes/inbox/`
308
+ 9. Skills auto-extracted from ` ```skill ` blocks in output
309
+ 10. Temp files cleaned up
310
+
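Step 9 could look roughly like the sketch below, assuming a skill block opens with a fence line tagged `skill` and closes with a bare fence line. `extractSkillBlocks` is a hypothetical name; the engine's actual parsing rules are not documented here.

```javascript
// Build the fence at runtime so this example does not contain a literal fence line.
const FENCE = '`'.repeat(3);
const SKILL_BLOCK = new RegExp(
  `^${FENCE}skill[ \\t]*\\r?\\n([\\s\\S]*?)^${FENCE}[ \\t]*$`,
  'gm'
);

// Hypothetical helper: returns the body of every skill block in agent output.
function extractSkillBlocks(output) {
  return [...output.matchAll(SKILL_BLOCK)].map((match) => match[1].trimEnd());
}

const sampleOutput = [
  'Task complete.',
  `${FENCE}skill`,
  '---',
  'name: run-tests',
  '---',
  'Run the test suite before opening a PR.',
  FENCE,
  'Done.',
].join('\n');

const skills = extractSkillBlocks(sampleOutput);
console.log(skills.length);
```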
311
+ ## Team
312
+
313
+ | Agent | Role | Best for |
314
+ |-------|------|----------|
315
+ | Ripley | Lead / Explorer | Code review, architecture, exploration |
316
+ | Dallas | Engineer | Features, tests, UI |
317
+ | Lambert | Analyst | PRD generation, docs |
318
+ | Rebecca | Architect | Complex systems, CI/infra |
319
+ | Ralph | Engineer | Features, bug fixes |
320
+
321
+ Routing rules live in `routing.md`, and charters live in `agents/{name}/charter.md`. Both are editable, so you can customize agents and routing to fit your team's needs.
322
+
323
+ ## Playbooks
324
+
325
+ | Playbook | Purpose |
326
+ |----------|---------|
327
+ | `work-item.md` | Shared template for all work items (central + per-project) |
328
+ | `implement.md` | Build a PRD item in a git worktree, create PR |
329
+ | `review.md` | Review a PR, post findings to repo host |
330
+ | `fix.md` | Fix review feedback on existing PR branch |
331
+ | `analyze.md` | Generate PRD gap analysis in a worktree |
332
+ | `explore.md` | Read-only codebase exploration |
333
+ | `test.md` | Run tests and report results |
334
+
335
+ All playbooks use `{{template_variables}}` filled from project config. The `work-item.md` playbook uses `{{scope_section}}` to inject project-specific or multi-project context. Playbooks are fully customizable — edit them to match your workflow.
336
+
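A minimal sketch of `{{variable}}` substitution, for illustration only. `renderPlaybook` is a hypothetical name, and leaving unknown placeholders untouched is an assumption rather than documented engine behavior.

```javascript
// Hypothetical helper: replaces {{name}} placeholders with values from vars.
// Unknown placeholders are left as-is so missing variables stay visible.
function renderPlaybook(template, vars) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (placeholder, name) =>
    Object.prototype.hasOwnProperty.call(vars, name) ? String(vars[name]) : placeholder
  );
}

const rendered = renderPlaybook(
  'Branch off {{mainBranch}} in {{repoName}}. {{scope_section}}',
  { mainBranch: 'main', repoName: 'MyProject' }
);
console.log(rendered);
```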
337
+ ## Health Monitoring
338
+
339
+ ### Heartbeat Check (every tick)
340
+
341
+ Uses `live-output.log` file modification time as a heartbeat:
342
+ - **Process alive + recent output** → healthy, keep running
343
+ - **Process alive + silent >5min** → hung, kill and mark failed
344
+ - **No process + silent >5min** → orphaned (engine restarted), mark failed
345
+
346
+ Agents can run for hours as long as they're producing output. The `heartbeatTimeout` (default 5min) only triggers on silence.
347
+
348
+ ### Automated Cleanup (every 10 ticks)
349
+
350
+ | What | Condition |
351
+ |------|-----------|
352
+ | Temp prompt/sysprompt files | >1 hour old |
353
+ | `live-output.log` for idle agents | >1 hour old |
354
+ | Git worktrees for merged/abandoned PRs | PR status is `merged`/`abandoned`/`completed` |
355
+ | Orphaned worktrees | >24 hours old, no active dispatch references them |
356
+ | Zombie processes | In memory but no matching dispatch |
357
+
358
+ Manual cleanup: `node engine.js cleanup`
359
+
360
+ ## Self-Improvement Loop
361
+
362
+ Five mechanisms that make the squad get better over time:
363
+
364
+ ### 1. Learnings Inbox → notes.md
365
+ Agents write findings to `notes/inbox/`. Once the inbox holds 5 or more files, the engine consolidates them into `notes.md`, categorized (reviews, feedback, learnings, other) with one-line summaries. `notes.md` is auto-pruned at 50KB and injected into every future playbook.
366
+
367
+ ### 2. Per-Agent History
368
+ `agents/{name}/history.md` tracks last 20 tasks with timestamps, results, projects, and branches. Injected into the agent's system prompt so it remembers past work.
369
+
370
+ ### 3. Review Feedback Loop
371
+ When a reviewer flags issues, the engine creates `feedback-<author>-from-<reviewer>.md` in the inbox. The PR author sees the feedback in their next task.
372
+
373
+ ### 4. Quality Metrics
374
+ `engine/metrics.json` tracks per agent: tasks completed, errors, PRs created/approved/rejected, reviews done. Visible in CLI (`status`) and dashboard with color-coded approval rates.
375
+
376
+ ### 5. Skills
377
+ Agents save repeatable workflows to `skills/<name>.md` with Claude Code-compatible frontmatter. Engine builds an index injected into all prompts. Skills can also be stored per-project at `<project>/.claude/skills/<name>/SKILL.md` (requires a PR). Visible in dashboard alongside decisions.
378
+
379
+ See `docs/self-improvement.md` for the full breakdown.
380
+
381
+ ## Configuration Reference
382
+
383
+ Engine behavior is controlled via `config.json`. Key settings:
384
+
385
+ ```json
386
+ {
387
+ "engine": {
388
+ "tickInterval": 60000,
389
+ "maxConcurrent": 3,
390
+ "agentTimeout": 600000,
391
+ "staleThreshold": 1800000,
392
+ "maxTurns": 100,
393
+ "inboxConsolidateThreshold": 5
394
+ }
395
+ }
396
+ ```
397
+
398
+ | Setting | Default | Description |
399
+ |---------|---------|-------------|
400
+ | `tickInterval` | 60000 (1min) | Milliseconds between engine ticks |
401
+ | `maxConcurrent` | 3 | Max agents running simultaneously |
402
+ | `agentTimeout` | 600000 (10min) | Kill agents silent longer than this |
403
+ | `staleThreshold` | 1800000 (30min) | Kill stale dispatch entries |
404
+ | `maxTurns` | 100 | Max Claude CLI turns per agent session |
405
+ | `inboxConsolidateThreshold` | 5 | Inbox files needed before consolidation |
406
+
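One way to read the table above is as defaults that `config.json` overrides. The sketch below assumes a shallow merge; `resolveEngineConfig` is a hypothetical name, and whether the engine merges settings this way is not confirmed here.

```javascript
// Defaults copied from the table above.
const ENGINE_DEFAULTS = Object.freeze({
  tickInterval: 60000,       // 1 min between ticks
  maxConcurrent: 3,
  agentTimeout: 600000,      // 10 min
  staleThreshold: 1800000,   // 30 min
  maxTurns: 100,
  inboxConsolidateThreshold: 5,
});

// Hypothetical helper: values under "engine" in config.json win over defaults.
function resolveEngineConfig(config = {}) {
  return { ...ENGINE_DEFAULTS, ...(config.engine || {}) };
}

const engine = resolveEngineConfig({ engine: { maxConcurrent: 5 } });
console.log(engine.maxConcurrent, engine.tickInterval);
```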
407
+ ## Node.js Upgrade Caution
408
+
409
+ The engine and all spawned agents use the Node binary that started the engine (`process.execPath`). After upgrading Node, restart the engine:
410
+
411
+ ```powershell
412
+ node ~/.squad/engine.js stop
413
+ node ~/.squad/engine.js start
414
+ ```
415
+
416
+ ## Portability
417
+
418
+ **Portable (works on any machine):** Engine, dashboard, playbooks, charters, routing, notes, skills, docs, work items.
419
+
420
+ **Machine-specific (reconfigure per machine):**
421
+ - `config.json` — contains absolute paths to project directories. Re-link via `node squad.js add <dir>`.
422
+ - `mcp-servers.json` — auto-synced from `~/.claude.json` on engine start.
423
+
424
+ To move to a new machine: clone `~/.squad/`, delete `engine/control.json`, re-run `node squad.js add` for each project.
425
+
426
+ ## File Layout
427
+
428
+ ```
429
+ ~/.squad/
430
+ squad.js <- CLI: init, add, remove, list projects
431
+ engine.js <- Engine daemon
432
+ engine/
433
+ spawn-agent.js <- Agent spawn wrapper (resolves claude cli.js)
434
+ control.json <- running/paused/stopped
435
+ dispatch.json <- pending/active/completed queue
436
+ log.json <- Audit trail (capped at 500)
437
+ metrics.json <- Per-agent quality metrics
438
+ dashboard.js <- Web dashboard server
439
+ dashboard.html <- Dashboard UI
440
+ config.json <- projects[], agents, engine, claude settings
441
+ config.template.json <- Template for reference
442
+ mcp-servers.json <- MCP servers (auto-synced, gitignored)
443
+ routing.md <- Dispatch rules table (editable)
444
+ team.md <- Team roster
445
+ notes.md <- Team rules + consolidated learnings
446
+ work-items.json <- Central work queue (agent decides which project)
447
+ TODO.md <- Future improvements roadmap
448
+ playbooks/
449
+ work-item.md <- Shared work item template
450
+ implement.md <- Build a PRD item
451
+ review.md <- Review a PR
452
+ fix.md <- Fix review feedback
453
+ analyze.md <- Generate new PRD
454
+ explore.md <- Codebase exploration
455
+ test.md <- Run tests
456
+ skills/
457
+ README.md <- Skill format guide
458
+ <name>.md <- Agent-created reusable workflows
459
+ agents/
460
+ {name}/
461
+ charter.md <- Agent identity and boundaries (editable)
462
+ status.json <- Current state (runtime)
463
+ history.md <- Task history (last 20, runtime)
464
+ live-output.log <- Streaming output while working (runtime)
465
+ output.log <- Final output after completion (runtime)
466
+ identity/
467
+ now.md <- Engine-generated state snapshot
468
+ notes/
469
+ inbox/ <- Agent findings drop-box
470
+ archive/ <- Processed inbox files
471
+ docs/
472
+ auto-discovery.md <- Auto-discovery pipeline docs
473
+ self-improvement.md <- Self-improvement loop docs
474
+
475
+ Each linked project keeps locally:
476
+ <project>/.squad/
477
+ work-items.json <- Per-project work queue
478
+ pull-requests.json <- PR tracker
479
+ <project>/.claude/
480
+ skills/ <- Project-specific skills (requires PR)
481
+ <project>/docs/
482
+ prd-gaps.json <- PRD gap analysis
483
+ ```
package/TODO.md ADDED
@@ -0,0 +1,60 @@
1
+ # Squad — Future Improvements
2
+
3
+ Ordered by difficulty: quick wins first, larger efforts later.
4
+
5
+ ---
6
+
7
+ ## Quick Wins (< 1 hour each)
8
+
9
+ - [ ] **Output.log append, not overwrite** — keep all dispatch outputs, not just the last one. Rotate by dispatch ID.
10
+ - [ ] **Persistent cooldowns** — save cooldown state to disk so engine restarts don't re-dispatch everything
11
+ - [ ] **Worktree cleanup on merge/close** — when `pollPrStatus` detects a PR merged or abandoned, auto-remove its worktree. Currently only `runCleanup` catches old worktrees on a timer.
12
+ - [ ] **Discovery skip logging** — log why items were skipped during discovery (cooldown, already dispatched, no idle agent) so users can diagnose "why isn't my work item being picked up?"
13
+ - [ ] **Idle threshold alert** — if all agents are idle for >N minutes, notify via Teams/dashboard
14
+ - [ ] **Config validation at startup** — verify all project paths exist, all agents defined, all playbooks exist. Fail fast with clear errors instead of silently skipping.
15
+ - [ ] **macOS/Linux browser launch** — replace Windows `start` command with `open` (macOS) / `xdg-open` (Linux)
16
+ - [ ] **Health check endpoint** — `/api/health` returning engine state, project reachability, agent statuses for monitoring
17
+ - [ ] **Fan-out per-agent timeout** — when `@everyone` dispatches, set individual deadlines per agent instead of relying only on global `agentTimeout`
18
+
19
+ ## Small Effort (1–3 hours each)
20
+
21
+ - [ ] **Auto-retry with backoff** — when an agent errors, auto-retry with exponential backoff (5min, 15min, cap at 3 attempts) instead of requiring manual dashboard retry.
22
+ - [ ] **Auto-escalation** — if an agent errors 3 times in a row, pause their dispatch and alert via dashboard/Teams
23
+ - [ ] **Post-merge hooks** — when a PR merges, trigger configurable actions: clean up worktree, update PRD item status to `done`, notify Teams, update metrics
24
+ - [ ] **Pending dispatch explanation** — show in dashboard why each pending item hasn't been dispatched (no idle agent? cooldown? max concurrency?)
25
+ - [ ] **Work item editing** — edit title, description, type, priority, agent assignment from the dashboard UI (currently requires editing JSON)
26
+ - [ ] **Bulk operations** — retry/delete/reassign multiple work items at once
27
+ - [ ] **Per-dispatch artifact archive** — `artifacts/<agent>/<dispatch-id>/` preserving output.log, live-output.log, inbox findings. Never overwritten, indexed by dispatch ID.
28
+ - [ ] **Graceful shutdown** — on SIGTERM, wait for active agents to finish (with timeout) before exiting
29
+ - [ ] **Shell-agnostic playbooks** — remove PowerShell-specific instructions when running on non-Windows
30
+ - [ ] **Build failure notification to author** — when build-and-test files a fix work item, inject the error context into the original implementation agent's next prompt so they learn from the failure
31
+ - [ ] **Work item status updates** — show dispatched agent progress in dashboard, link to PRs created
32
+
33
+ ## Medium Effort (3–8 hours each)
34
+
35
+ - [ ] **Dispatch lifecycle timeline** — for each work item, show: created → queued → dispatched (agent, time) → completed/failed (duration, result). Requires tracking events per item.
36
+ - [ ] **Adaptive routing** — use quality metrics (approval rate, error rate) to adjust routing preferences. Deprioritize underperforming agents, promote high-performers.
37
+ - [ ] **Cascading dependency awareness** — if work item A blocks B, and A fails, mark B as `blocked`. Requires honoring `depends_on` field (already exists in plan-generated items).
38
+ - [ ] **Error pattern detection** — aggregate failed dispatches by type/project. If 3+ failures share the same error, surface an alert instead of dispatching more agents into the same wall.
39
+ - [ ] **Shared scratchpad** — in-progress workspace where agents working on related tasks can leave notes for each other without waiting for inbox consolidation
40
+ - [ ] **Proactive work discovery** — agents can propose work items by writing to `proposals/` inbox, engine reviews and promotes to work queue
41
+ - [ ] **Scheduled tasks** — cron-style recurring work (e.g., "every Monday Lambert regenerates the PRD", "every day Ripley explores recent commits")
42
+ - [ ] **Work item → PR → review chain view** — trace a work item through its full lifecycle in the dashboard UI
43
+ - [ ] **Auto-trigger new PRD analysis** — when all PRD items are complete/pr-created, automatically dispatch Lambert to generate the next version
44
+ - [ ] **Artifact query for agents** — inject recent artifact summaries into agent prompts so they can reference past investigations
45
+
46
+ ## Large Effort (1–2 days each)
47
+
48
+ - [ ] **Agent message board** — agents can post tagged messages to specific agents or all agents. Injected into recipient's next prompt. Requires message format, routing, expiry, and prompt injection.
49
+ - [ ] **Handoff protocol** — agent can mark a task as "blocked on X" or "ready for Y", engine picks up dependencies and sequences dispatch accordingly
50
+ - [ ] **Idle task system** — when all work sources are empty, engine assigns background tasks by agent role (Ripley: explore, Lambert: audit docs, Dallas: run tests, Rebecca: check infra)
51
+ - [ ] **Live agent output streaming** — real-time stdout/stderr while agents work, not just after completion. Requires WebSocket or SSE from engine to dashboard.
52
+ - [ ] **Task decomposition** — large work items auto-decompose into subtasks via analyst agent (similar to plan-to-prd but for individual work items)
53
+ - [ ] **Artifact browser in dashboard** — browse past dispatch artifacts, view reasoning chains, search across agent outputs
54
+ - [ ] **Temp agent support** — spawn short-lived agents for housekeeping without consuming a permanent squad slot. Minimal system prompt, low maxTurns, auto-cleanup.
55
+ - [ ] **Skill editor** — create/edit skills directly in the dashboard UI
56
+ - [ ] **PRD diffing** — show what changed between PRD versions in the dashboard
57
+
58
+ ## Ambitious (multi-day)
59
+
60
+ - [ ] **Dedicated ops agent** — permanent 6th agent scoped to housekeeping: cleanup, status checks, metric collection. Never assigned feature work.
@@ -0,0 +1,55 @@
1
+ # Dallas — Engineer
2
+
3
+ > Ships working code. Reads the instructions once, then executes — no over-engineering.
4
+
5
+ ## Identity
6
+
7
+ - **Name:** Dallas
8
+ - **Role:** Engineer
9
+ - **Expertise:** TypeScript/Node.js, agent architecture, monorepo builds (Yarn + lage), Docker, Claude SDK integration
10
+ - **Style:** Pragmatic. Implements what's specified. Minimal, clean code. Follows existing patterns.
11
+
12
+ ## What I Own
13
+
14
+ - Prototype implementation — building exactly what the codebase instructions specify
15
+ - Following agent architecture patterns from `agents/*/CLAUDE.md` and `docs/`
16
+ - Wiring up `src/registry.ts`, agent entry points, prompts, skills, and sub-agents per pattern
17
+ - TypeScript builds, test scaffolding, Docker integration
18
+
19
+ ## How I Work
20
+
21
+ - Wait for Ripley's exploration findings before writing any code
22
+ - Read the specific CLAUDE.md for the agent/area I'm building
23
+ - Follow the common agent pattern: `registry.ts` → main agent file → `src/prompts/{version}/`
24
+ - Use PowerShell for build commands on Windows if applicable
25
+ - Follow the project's logging and coding conventions (check CLAUDE.md)
26
+ - Write tests in `**/tests/**/*.test.ts` following existing Jest patterns
27
+
28
+ ## Build Commands (PowerShell only)
29
+
30
+ ```powershell
31
+ yarn build # Build packages + Docker image
32
+ yarn test # Run all tests
33
+ yarn lint # ESLint
34
+ # Check project's CLAUDE.md for build/test commands
35
+ ```
36
+
37
+ ## Boundaries
38
+
39
+ **I handle:** Prototype implementation, code scaffolding, TypeScript/Node.js, build config, Docker integration.
40
+
41
+ **I don't handle:** Codebase exploration (Ripley), product requirements (Lambert).
42
+
43
+ **When I'm unsure:** I read the relevant CLAUDE.md or docs/ file first, then proceed.
44
+
45
+ ## Model
46
+
47
+ - **Preferred:** auto
48
+
49
+ ## Voice
50
+
51
+ Ships clean code that follows the patterns already in the repo. Gets annoyed by reinventing wheels when a module already does it. Will call out if Ripley's findings are incomplete before starting.
52
+
53
+ ## Directives
54
+
55
+ **Before starting any work, read `.squad/notes.md` for team rules and constraints.**