@os-eco/overstory-cli 0.6.9 → 0.6.11

Files changed (49)
  1. package/README.md +161 -265
  2. package/agents/builder.md +6 -15
  3. package/agents/lead.md +13 -6
  4. package/agents/merger.md +5 -13
  5. package/agents/reviewer.md +2 -9
  6. package/package.json +1 -1
  7. package/src/agents/hooks-deployer.test.ts +105 -0
  8. package/src/agents/hooks-deployer.ts +26 -11
  9. package/src/agents/manifest.test.ts +1 -0
  10. package/src/agents/overlay.test.ts +235 -1
  11. package/src/agents/overlay.ts +107 -9
  12. package/src/commands/completions.test.ts +8 -20
  13. package/src/commands/completions.ts +7 -5
  14. package/src/commands/coordinator.ts +4 -4
  15. package/src/commands/doctor.ts +97 -48
  16. package/src/commands/ecosystem.ts +291 -0
  17. package/src/commands/feed.ts +2 -2
  18. package/src/commands/group.ts +4 -4
  19. package/src/commands/mail.test.ts +63 -1
  20. package/src/commands/mail.ts +18 -1
  21. package/src/commands/merge.ts +2 -2
  22. package/src/commands/monitor.ts +2 -2
  23. package/src/commands/sling.test.ts +174 -27
  24. package/src/commands/sling.ts +96 -12
  25. package/src/commands/status.ts +1 -1
  26. package/src/commands/supervisor.ts +4 -4
  27. package/src/commands/trace.ts +2 -2
  28. package/src/commands/upgrade.test.ts +46 -0
  29. package/src/commands/upgrade.ts +259 -0
  30. package/src/config.test.ts +22 -0
  31. package/src/config.ts +12 -0
  32. package/src/doctor/agents.test.ts +1 -0
  33. package/src/doctor/config-check.test.ts +1 -0
  34. package/src/doctor/consistency.test.ts +1 -0
  35. package/src/doctor/databases.test.ts +39 -0
  36. package/src/doctor/databases.ts +7 -10
  37. package/src/doctor/dependencies.test.ts +1 -0
  38. package/src/doctor/ecosystem.test.ts +308 -0
  39. package/src/doctor/ecosystem.ts +155 -0
  40. package/src/doctor/logs.test.ts +1 -0
  41. package/src/doctor/merge-queue.test.ts +99 -0
  42. package/src/doctor/merge-queue.ts +23 -0
  43. package/src/doctor/structure.test.ts +131 -1
  44. package/src/doctor/structure.ts +87 -1
  45. package/src/doctor/types.ts +5 -2
  46. package/src/doctor/version.test.ts +1 -0
  47. package/src/index.ts +29 -4
  48. package/src/types.ts +11 -0
  49. package/templates/overlay.md.tmpl +3 -1
package/README.md CHANGED
@@ -1,68 +1,40 @@
1
1
  # Overstory
2
2
 
3
- [![CI](https://img.shields.io/github/actions/workflow/status/jayminwest/overstory/ci.yml?branch=main)](https://github.com/jayminwest/overstory/actions/workflows/ci.yml)
4
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
5
- [![Bun](https://img.shields.io/badge/Bun-%E2%89%A51.0-orange)](https://bun.sh)
6
- [![GitHub release](https://img.shields.io/github/v/release/jayminwest/overstory)](https://github.com/jayminwest/overstory/releases)
3
+ Multi-agent orchestration for Claude Code.
7
4
 
8
- Project-agnostic swarm system for Claude Code agent orchestration. Overstory turns a single Claude Code session into a multi-agent team by spawning worker agents in git worktrees via tmux, coordinating them through a custom SQLite mail system, and merging their work back with tiered conflict resolution.
5
+ [![npm](https://img.shields.io/npm/v/@os-eco/overstory-cli)](https://www.npmjs.com/package/@os-eco/overstory-cli)
6
+ [![CI](https://github.com/jayminwest/overstory/actions/workflows/ci.yml/badge.svg)](https://github.com/jayminwest/overstory/actions/workflows/ci.yml)
7
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
9
8
 
10
- > **⚠️ Warning: Agent swarms are not a universal solution.** Do not deploy Overstory without understanding the risks of multi-agent orchestration: compounding error rates, cost amplification, debugging complexity, and merge conflicts are the normal case, not edge cases. Read [STEELMAN.md](STEELMAN.md) for a full risk analysis and the [Agentic Engineering Book](https://github.com/jayminwest/agentic-engineering-book) ([web version](https://jayminwest.com/agentic-engineering-book)) before using this tool in production.
9
+ Overstory turns a single Claude Code session into a multi-agent team by spawning worker agents in git worktrees via tmux, coordinating them through a custom SQLite mail system, and merging their work back with tiered conflict resolution.
11
10
 
12
- ## How It Works
13
-
14
- CLAUDE.md + hooks + the `ov` CLI turn your Claude Code session into a multi-agent orchestrator. A persistent coordinator agent manages task decomposition and dispatch, while a mechanical watchdog daemon monitors agent health in the background.
15
-
16
- ```
17
- Coordinator (persistent orchestrator at project root)
18
- --> Supervisor (per-project team lead, depth 1)
19
- --> Workers: Scout, Builder, Reviewer, Merger (depth 2)
20
- ```
21
-
22
- ### Agent Types
11
+ > **Warning: Agent swarms are not a universal solution.** Do not deploy Overstory without understanding the risks of multi-agent orchestration — compounding error rates, cost amplification, debugging complexity, and merge conflicts are the normal case, not edge cases. Read [STEELMAN.md](STEELMAN.md) for a full risk analysis and the [Agentic Engineering Book](https://github.com/jayminwest/agentic-engineering-book) ([web version](https://jayminwest.com/agentic-engineering-book)) before using this tool in production.
23
12
 
24
- | Agent | Role | Access |
25
- |-------|------|--------|
26
- | **Coordinator** | Persistent orchestrator — decomposes objectives, dispatches agents, tracks task groups | Read-only |
27
- | **Supervisor** | Per-project team lead — manages worker lifecycle, handles nudge/escalation | Read-only |
28
- | **Scout** | Read-only exploration and research | Read-only |
29
- | **Builder** | Implementation and code changes | Read-write |
30
- | **Reviewer** | Validation and code review | Read-only |
31
- | **Lead** | Team coordination, can spawn sub-workers | Read-write |
32
- | **Merger** | Branch merge specialist | Read-write |
33
- | **Monitor** | Tier 2 continuous fleet patrol — ongoing health monitoring | Read-only |
13
+ ## Install
34
14
 
35
- ### Key Architecture
15
+ Requires [Bun](https://bun.sh) v1.0+, [Claude Code](https://docs.anthropic.com/en/docs/claude-code), git, and tmux.
36
16
 
37
- - **Agent Definitions**: Two-layer system — base `.md` files define the HOW (workflow), per-task overlays define the WHAT (task scope). Base definition content is injected into spawned agent overlays automatically.
38
- - **Messaging**: Custom SQLite mail system with typed protocol — 8 message types (`worker_done`, `merge_ready`, `dispatch`, `escalation`, etc.) for structured agent coordination, plus broadcast messaging with group addresses (`@all`, `@builders`, etc.)
39
- - **Worktrees**: Each agent gets an isolated git worktree — no file conflicts between agents
40
- - **Merge**: FIFO merge queue (SQLite-backed) with 4-tier conflict resolution
41
- - **Watchdog**: Tiered health monitoring — Tier 0 mechanical daemon (tmux/pid liveness), Tier 1 AI-assisted failure triage, Tier 2 monitor agent for continuous fleet patrol
42
- - **Tool Enforcement**: PreToolUse hooks mechanically block file modifications for non-implementation agents and dangerous git operations for all agents
43
- - **Task Groups**: Batch coordination with auto-close when all member issues complete
44
- - **Session Lifecycle**: Checkpoint save/restore for compaction survivability, handoff orchestration for crash recovery
45
- - **Token Instrumentation**: Session metrics extracted from Claude Code transcript JSONL files
17
+ ```bash
18
+ bun install -g @os-eco/overstory-cli
19
+ ```
46
20
 
47
- ## Requirements
21
+ Or try without installing:
48
22
 
49
- - [Bun](https://bun.sh) (v1.0+)
50
- - [Claude Code](https://docs.anthropic.com/en/docs/claude-code)
51
- - git
52
- - tmux
23
+ ```bash
24
+ npx @os-eco/overstory-cli --help
25
+ ```
53
26
 
54
- ## Installation
27
+ ### Development
55
28
 
56
29
  ```bash
57
- # Clone the repository
58
30
  git clone https://github.com/jayminwest/overstory.git
59
31
  cd overstory
60
-
61
- # Install dev dependencies
62
32
  bun install
33
+ bun link # Makes 'ov' available globally
63
34
 
64
- # Link the CLI globally
65
- bun link
35
+ bun test # Run all tests
36
+ bun run lint # Biome check
37
+ bun run typecheck # tsc --noEmit
66
38
  ```
67
39
 
68
40
  ## Quick Start
@@ -94,223 +66,134 @@ ov nudge <agent-name>
94
66
  ov mail check --inject
95
67
  ```
96
68
 
97
- ## CLI Reference
98
-
99
- ```
100
- ov agents discover Discover agents by capability/state/parent
101
- --capability <type> Filter by capability type
102
- --state <state> Filter by agent state
103
- --parent <name> Filter by parent agent
104
- --json JSON output
105
-
106
- ov init Initialize .overstory/ in current project
107
- (deploys agent definitions automatically)
108
- --yes, -y Skip interactive prompts
109
- --name <name> Set project name (default: auto-detect)
110
-
111
- ov coordinator start Start persistent coordinator agent
112
- --attach / --no-attach TTY-aware tmux attach (default: auto)
113
- --watchdog Auto-start watchdog daemon with coordinator
114
- --monitor Auto-start Tier 2 monitor agent
115
- ov coordinator stop Stop coordinator
116
- ov coordinator status Show coordinator state
117
-
118
- ov supervisor start Start per-project supervisor agent
119
- --attach / --no-attach TTY-aware tmux attach (default: auto)
120
- ov supervisor stop Stop supervisor
121
- ov supervisor status Show supervisor state
122
-
123
- ov sling <task-id> Spawn a worker agent
124
- --capability <type> builder | scout | reviewer | lead | merger
125
- | coordinator | supervisor | monitor
126
- --name <name> Unique agent name
127
- --spec <path> Path to task spec file
128
- --files <f1,f2,...> Exclusive file scope
129
- --parent <agent-name> Parent (for hierarchy tracking)
130
- --depth <n> Current hierarchy depth
131
- --skip-scout Skip scout phase (passed to lead overlay)
132
- --skip-task-check Skip task existence validation
133
- --json JSON output
134
-
135
- ov stop <agent-name> Terminate a running agent
136
- --clean-worktree Remove the agent's worktree (best-effort)
137
- --json JSON output
138
-
139
- ov prime Load context for orchestrator/agent
140
- --agent <name> Per-agent priming
141
- --compact Restore from checkpoint (compaction)
142
-
143
- ov status Show all active agents, worktrees, tracker state
144
- --json JSON output
145
- --verbose Show detailed agent info
146
- --all Show all runs (default: current run only)
147
-
148
- ov dashboard Live TUI dashboard for agent monitoring
149
- --interval <ms> Refresh interval (default: 2000)
150
- --all Show all runs (default: current run only)
151
-
152
- ov hooks install Install orchestrator hooks to .claude/settings.local.json
153
- --force Overwrite existing hooks
154
- ov hooks uninstall Remove orchestrator hooks
155
- ov hooks status Check if hooks are installed
156
-
157
- ov mail send Send a message
158
- --to <agent> --subject <text> --body <text>
159
- --to @all | @builders | @scouts ... Broadcast to group addresses
160
- --type <status|question|result|error>
161
- --priority <low|normal|high|urgent> (urgent/high auto-nudges recipient)
162
-
163
- ov mail check Check inbox (unread messages)
164
- --agent <name> --inject --json
165
- --debounce <ms> Skip if checked within window
166
-
167
- ov mail list List messages with filters
168
- --from <name> --to <name> --unread
169
-
170
- ov mail read <id> Mark message as read
171
- ov mail reply <id> --body <text> Reply in same thread
172
-
173
- ov nudge <agent> [message] Send a text nudge to an agent
174
- --from <name> Sender name (default: orchestrator)
175
- --force Skip debounce check
176
- --json JSON output
177
-
178
- ov group create <name> Create a task group for batch tracking
179
- ov group status <name> Show group progress
180
- ov group add <name> <issue-id> Add issue to group
181
- ov group list List all groups
182
-
183
- ov merge Merge agent branches into canonical
184
- --branch <name> Specific branch
185
- --all All completed branches
186
- --into <branch> Target branch (default: session-branch.txt > canonicalBranch)
187
- --dry-run Check for conflicts only
188
-
189
- ov worktree list List worktrees with status
190
- ov worktree clean Remove completed worktrees
191
- --completed Only finished agents
192
- --all Force remove all
193
- --force Delete even if branches are unmerged
194
-
195
- ov monitor start Start Tier 2 monitor agent
196
- ov monitor stop Stop monitor agent
197
- ov monitor status Show monitor state
198
-
199
- ov log <event> Log a hook event
200
- ov watch Start watchdog daemon (Tier 0)
201
- --interval <ms> Health check interval
202
- --background Run as background process
203
- ov run list List orchestration runs
204
- ov run show <id> Show run details
205
- ov run complete <id> Mark a run complete
206
-
207
- ov trace View agent/bead timeline
208
- --agent <name> Filter by agent
209
- --run <id> Filter by run
210
-
211
- ov clean Clean up worktrees, sessions, artifacts
212
- --completed Only finished agents
213
- --all Force remove all
214
- --run <id> Clean a specific run
215
-
216
- ov doctor Run health checks on overstory setup
217
- --json JSON output
218
- --category <name> Run a specific check category only
219
-
220
- ov inspect <agent> Deep per-agent inspection
221
- --json JSON output
222
- --follow Polling mode (refreshes periodically)
223
- --interval <ms> Refresh interval for --follow
224
- --no-tmux Skip tmux capture
225
- --limit <n> Limit events shown
226
-
227
- ov spec write <task-id> Write a task specification
228
- --body <content> Spec content (or pipe via stdin)
229
-
230
- ov errors Aggregated error view across agents
231
- --agent <name> Filter by agent
232
- --run <id> Filter by run
233
- --since <ts> --until <ts> Time range filter
234
- --limit <n> --json
235
-
236
- ov replay Interleaved chronological replay
237
- --run <id> Filter by run
238
- --agent <name> Filter by agent(s)
239
- --since <ts> --until <ts> Time range filter
240
- --limit <n> --json
241
-
242
- ov feed [options] Unified real-time event stream across agents
243
- --follow, -f Continuously poll for new events
244
- --interval <ms> Polling interval (default: 2000)
245
- --agent <name> --run <id> Filter by agent or run
246
- --json JSON output
247
-
248
- ov logs [options] Query NDJSON logs across agents
249
- --agent <name> Filter by agent
250
- --level <level> Filter by log level (debug|info|warn|error)
251
- --since <ts> --until <ts> Time range filter
252
- --follow Tail logs in real time
253
- --json JSON output
254
-
255
- ov costs Token/cost analysis and breakdown
256
- --live Show real-time token usage for active agents
257
- --self Show cost for current orchestrator session
258
- --agent <name> Filter by agent
259
- --run <id> Filter by run
260
- --by-capability Group by capability type
261
- --last <n> --json
262
-
263
- ov metrics Show session metrics
264
- --last <n> Last N sessions
265
- --json JSON output
266
-
267
- Global Flags:
268
- --quiet, -q Suppress non-error output
269
- --completions <shell> Generate shell completions (bash, zsh, fish)
270
- ```
271
-
272
- ## Tech Stack
273
-
274
- - **Runtime**: Bun (TypeScript directly, no build step)
275
- - **Dependencies**: Minimal runtime — `chalk` (color output), `commander` (CLI framework), core I/O via Bun built-in APIs
276
- - **Database**: SQLite via `bun:sqlite` (WAL mode for concurrent access)
277
- - **Linting**: Biome (formatter + linter)
278
- - **Testing**: `bun test` (2186 tests across 77 files, colocated with source)
279
- - **External CLIs**: `bd` (beads) or `sd` (seeds), `mulch`, `git`, `tmux` — invoked as subprocesses
69
+ ## Commands
70
+
71
+ Every command supports `--json` where noted. Global flags: `-q`/`--quiet`, `--timing`. ANSI colors respect `NO_COLOR`.
72
+
73
+ ### Core Workflow
74
+
75
+ | Command | Description |
76
+ |---------|-------------|
77
+ | `ov init` | Initialize `.overstory/` in current project (`--yes`, `--name`) |
78
+ | `ov sling <task-id>` | Spawn a worker agent (`--capability`, `--name`, `--spec`, `--files`, `--parent`, `--depth`, `--skip-scout`, `--skip-review`, `--max-agents`, `--dispatch-max-agents`, `--skip-task-check`, `--json`) |
79
+ | `ov stop <agent-name>` | Terminate a running agent (`--clean-worktree`, `--json`) |
80
+ | `ov prime` | Load context for orchestrator/agent (`--agent`, `--compact`) |
81
+ | `ov spec write <task-id>` | Write a task specification (`--body`) |
82
+
83
+ ### Coordination
84
+
85
+ | Command | Description |
86
+ |---------|-------------|
87
+ | `ov coordinator start` | Start persistent coordinator agent (`--attach`/`--no-attach`, `--watchdog`, `--monitor`) |
88
+ | `ov coordinator stop` | Stop coordinator |
89
+ | `ov coordinator status` | Show coordinator state |
90
+ | `ov supervisor start` | Start per-project supervisor agent (`--attach`/`--no-attach`) |
91
+ | `ov supervisor stop` | Stop supervisor |
92
+ | `ov supervisor status` | Show supervisor state |
93
+
94
+ ### Messaging
95
+
96
+ | Command | Description |
97
+ |---------|-------------|
98
+ | `ov mail send` | Send a message (`--to`, `--subject`, `--body`, `--type`, `--priority`) |
99
+ | `ov mail check` | Check inbox — unread messages (`--agent`, `--inject`, `--debounce`, `--json`) |
100
+ | `ov mail list` | List messages with filters (`--from`, `--to`, `--unread`) |
101
+ | `ov mail read <id>` | Mark message as read |
102
+ | `ov mail reply <id>` | Reply in same thread (`--body`) |
103
+ | `ov nudge <agent> [message]` | Send a text nudge to an agent (`--from`, `--force`, `--json`) |
104
+
105
+ ### Task Groups
106
+
107
+ | Command | Description |
108
+ |---------|-------------|
109
+ | `ov group create <name>` | Create a task group for batch tracking |
110
+ | `ov group status <name>` | Show group progress |
111
+ | `ov group add <name> <issue-id>` | Add issue to group |
112
+ | `ov group list` | List all groups |
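The README elsewhere describes task groups as auto-closing once all member issues complete. A minimal sketch of that rule follows, assuming an in-memory status map; `TaskGroup`, `IssueStatus`, and `refreshGroup` are hypothetical names for illustration, not taken from the overstory source.

```typescript
// Hypothetical shapes for illustration; not the overstory data model.
type IssueStatus = "open" | "in_progress" | "closed";

interface TaskGroup {
  groupName: string;
  issueIds: string[];
  closed: boolean;
}

// Returns a copy of the group, marked closed once every member issue is closed.
// An empty group stays open so a freshly created group does not close immediately
// (this choice is an assumption of the sketch).
function refreshGroup(group: TaskGroup, statuses: Map<string, IssueStatus>): TaskGroup {
  const allDone =
    group.issueIds.length > 0 &&
    group.issueIds.every((id) => statuses.get(id) === "closed");
  return { ...group, closed: group.closed || allDone };
}

const issueStatuses = new Map<string, IssueStatus>([
  ["task-1", "closed"],
  ["task-2", "in_progress"],
]);
const authGroup: TaskGroup = {
  groupName: "auth-refactor",
  issueIds: ["task-1", "task-2"],
  closed: false,
};

const groupBefore = refreshGroup(authGroup, issueStatuses);
issueStatuses.set("task-2", "closed");
const groupAfter = refreshGroup(authGroup, issueStatuses);
console.log(groupBefore.closed, groupAfter.closed);
```

Returning a new object instead of mutating the group keeps the check safe to call repeatedly, for example on every status poll.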
113
+
114
+ ### Merge
115
+
116
+ | Command | Description |
117
+ |---------|-------------|
118
+ | `ov merge` | Merge agent branches into canonical (`--branch`, `--all`, `--into`, `--dry-run`, `--json`) |
119
+
120
+ ### Observability
121
+
122
+ | Command | Description |
123
+ |---------|-------------|
124
+ | `ov status` | Show all active agents, worktrees, tracker state (`--json`, `--verbose`, `--all`) |
125
+ | `ov dashboard` | Live TUI dashboard for agent monitoring (`--interval`, `--all`) |
126
+ | `ov inspect <agent>` | Deep per-agent inspection (`--follow`, `--interval`, `--no-tmux`, `--limit`, `--json`) |
127
+ | `ov trace` | View agent/task timeline (`--agent`, `--run`, `--since`, `--until`, `--limit`, `--json`) |
128
+ | `ov errors` | Aggregated error view across agents (`--agent`, `--run`, `--since`, `--until`, `--limit`, `--json`) |
129
+ | `ov replay` | Interleaved chronological replay (`--run`, `--agent`, `--since`, `--until`, `--limit`, `--json`) |
130
+ | `ov feed` | Unified real-time event stream (`--follow`, `--interval`, `--agent`, `--run`, `--json`) |
131
+ | `ov logs` | Query NDJSON logs across agents (`--agent`, `--level`, `--since`, `--until`, `--follow`, `--json`) |
132
+ | `ov costs` | Token/cost analysis and breakdown (`--live`, `--self`, `--agent`, `--run`, `--by-capability`, `--last`, `--json`) |
133
+ | `ov metrics` | Show session metrics (`--last`, `--json`) |
134
+ | `ov run list` | List orchestration runs (`--last`, `--json`) |
135
+ | `ov run show <id>` | Show run details |
136
+ | `ov run complete` | Mark current run as completed |
137
+
138
+ ### Infrastructure
139
+
140
+ | Command | Description |
141
+ |---------|-------------|
142
+ | `ov hooks install` | Install orchestrator hooks to `.claude/settings.local.json` (`--force`) |
143
+ | `ov hooks uninstall` | Remove orchestrator hooks |
144
+ | `ov hooks status` | Check if hooks are installed |
145
+ | `ov worktree list` | List worktrees with status |
146
+ | `ov worktree clean` | Remove completed worktrees (`--completed`, `--all`, `--force`) |
147
+ | `ov watch` | Start watchdog daemon — Tier 0 (`--interval`, `--background`) |
148
+ | `ov monitor start` | Start Tier 2 monitor agent |
149
+ | `ov monitor stop` | Stop monitor agent |
150
+ | `ov monitor status` | Show monitor state |
151
+ | `ov log <event>` | Log a hook event (`--agent`) |
152
+ | `ov clean` | Clean up worktrees, sessions, artifacts (`--completed`, `--all`, `--run`) |
153
+ | `ov doctor` | Run health checks on overstory setup (`--category`, `--fix`, `--json`) |
154
+ | `ov ecosystem` | Show os-eco tool versions and health (`--json`) |
155
+ | `ov upgrade` | Upgrade overstory to latest npm version (`--check`, `--all`, `--json`) |
156
+ | `ov agents discover` | Discover agents by capability/state/parent (`--capability`, `--state`, `--parent`, `--json`) |
157
+ | `ov completions <shell>` | Generate shell completions (bash, zsh, fish) |
158
+
159
+ ## Architecture
160
+
161
+ Overstory uses CLAUDE.md overlays and PreToolUse hooks to turn Claude Code sessions into orchestrated agents. Each agent runs in an isolated git worktree via tmux. Inter-agent messaging is handled by a custom SQLite mail system (WAL mode, ~1-5ms per query) with typed protocol messages and broadcast support. A FIFO merge queue with 4-tier conflict resolution merges agent branches back to canonical. A tiered watchdog system (Tier 0 mechanical daemon, Tier 1 AI-assisted triage, Tier 2 monitor agent) ensures fleet health. See [CLAUDE.md](CLAUDE.md) for full technical details.
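To make the FIFO ordering of the merge queue concrete, here is a small in-memory sketch. The README describes the actual queue as SQLite-backed, so the class below is only a stand-in; `FifoMergeQueue`, `MergeEntry`, and the branch names are hypothetical, and conflict resolution is deliberately left out.

```typescript
// In-memory stand-in for illustration; the README describes the real queue as SQLite-backed.
interface MergeEntry {
  branch: string;
  agent: string;
  seq: number; // increasing sequence number assigned at enqueue time
}

class FifoMergeQueue {
  private entries: MergeEntry[] = [];
  private nextSeq = 0;

  // Append a branch; arrival order alone decides merge order.
  enqueue(branch: string, agent: string): MergeEntry {
    const entry: MergeEntry = { branch, agent, seq: this.nextSeq++ };
    this.entries.push(entry);
    return entry;
  }

  // Remove and return the oldest entry, or undefined when the queue is empty.
  dequeue(): MergeEntry | undefined {
    return this.entries.shift();
  }

  get size(): number {
    return this.entries.length;
  }
}

const mergeQueue = new FifoMergeQueue();
mergeQueue.enqueue("overstory/builder-1/task-1", "builder-1");
mergeQueue.enqueue("overstory/builder-2/task-2", "builder-2");

const firstMerged = mergeQueue.dequeue();
const secondMerged = mergeQueue.dequeue();
console.log(firstMerged?.branch, secondMerged?.branch, mergeQueue.size);
```

The first branch enqueued is the first one handed to the merge step, regardless of which agent finished more work.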
280
162
 
281
- ## Development
282
-
283
- ```bash
284
- # Run tests (2186 tests across 77 files)
285
- bun test
286
-
287
- # Run a single test
288
- bun test src/config.test.ts
289
-
290
- # Lint + format check
291
- biome check .
163
+ ## How It Works
292
164
 
293
- # Type check
294
- tsc --noEmit
165
+ CLAUDE.md + hooks + the `ov` CLI turn your Claude Code session into a multi-agent orchestrator. A persistent coordinator agent manages task decomposition and dispatch, while a mechanical watchdog daemon monitors agent health in the background.
295
166
 
296
- # All quality gates
297
- bun test && biome check . && tsc --noEmit
167
+ ```
168
+ Coordinator (persistent orchestrator at project root)
169
+ --> Supervisor (per-project team lead, depth 1)
170
+ --> Workers: Scout, Builder, Reviewer, Merger (depth 2)
298
171
  ```
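The depth numbers in the diagram suggest a simple spawn guard. The sketch below assumes the coordinator sits at depth 0 and that depth 2 is the deepest allowed level; `MAX_DEPTH`, `childDepth`, and `canSpawn` are hypothetical names, and the actual limit enforced by `ov sling --depth` is not shown in this diff.

```typescript
// Depth values follow the diagram: coordinator 0 (assumed), supervisor 1, workers 2.
// MAX_DEPTH and canSpawn are hypothetical, not overstory APIs.
const MAX_DEPTH = 2;

// A spawned agent lives one level below its parent.
function childDepth(parentDepth: number): number {
  return parentDepth + 1;
}

// True when an agent at parentDepth may spawn a child without exceeding maxDepth.
function canSpawn(parentDepth: number, maxDepth: number = MAX_DEPTH): boolean {
  return (
    Number.isInteger(parentDepth) &&
    parentDepth >= 0 &&
    childDepth(parentDepth) <= maxDepth
  );
}

console.log(canSpawn(0), canSpawn(1), canSpawn(2));
```

Under these assumptions a coordinator and a supervisor may spawn, while a depth-2 worker is a leaf.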
299
172
 
300
- ### Versioning
301
-
302
- Version is maintained in two places that must stay in sync:
303
-
304
- 1. `package.json` — `"version"` field
305
- 2. `src/index.ts` — `VERSION` constant
173
+ ### Agent Types
306
174
 
307
- Use the bump script to update both:
175
+ | Agent | Role | Access |
176
+ |-------|------|--------|
177
+ | **Coordinator** | Persistent orchestrator — decomposes objectives, dispatches agents, tracks task groups | Read-only |
178
+ | **Supervisor** | Per-project team lead — manages worker lifecycle, handles nudge/escalation | Read-only |
179
+ | **Scout** | Read-only exploration and research | Read-only |
180
+ | **Builder** | Implementation and code changes | Read-write |
181
+ | **Reviewer** | Validation and code review | Read-only |
182
+ | **Lead** | Team coordination, can spawn sub-workers | Read-write |
183
+ | **Merger** | Branch merge specialist | Read-write |
184
+ | **Monitor** | Tier 2 continuous fleet patrol — ongoing health monitoring | Read-only |
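The Access column can be read as a lookup table that a tool-enforcement hook would consult. The sketch below mirrors the table; `ACCESS`, `WRITE_TOOLS`, and `isToolAllowed` are hypothetical names, and the set of file-modifying tool names is an assumption because the real hook matchers are not part of this README.

```typescript
type Capability =
  | "coordinator"
  | "supervisor"
  | "scout"
  | "builder"
  | "reviewer"
  | "lead"
  | "merger"
  | "monitor";
type Access = "read-only" | "read-write";

// Mirrors the Access column of the Agent Types table.
const ACCESS: Record<Capability, Access> = {
  coordinator: "read-only",
  supervisor: "read-only",
  scout: "read-only",
  builder: "read-write",
  reviewer: "read-only",
  lead: "read-write",
  merger: "read-write",
  monitor: "read-only",
};

// Assumed set of file-modifying tool names; the actual hook matchers are not shown here.
const WRITE_TOOLS = new Set<string>(["Write", "Edit"]);

// Non-writing tools are always allowed; writing tools require read-write access.
function isToolAllowed(capability: Capability, tool: string): boolean {
  if (!WRITE_TOOLS.has(tool)) return true;
  return ACCESS[capability] === "read-write";
}

console.log(isToolAllowed("scout", "Edit"), isToolAllowed("builder", "Edit"));
```

Keeping the policy in one table means adding a new agent type forces an explicit access decision at compile time.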
308
185
 
309
- ```bash
310
- bun run version:bump <major|minor|patch>
311
- ```
186
+ ### Key Architecture
312
187
 
313
- Git tags, npm publishing, and GitHub releases are handled automatically by the `publish.yml` workflow when a version bump is pushed to `main`.
188
+ - **Agent Definitions**: Two-layer system in which base `.md` files define the HOW (workflow) and per-task overlays define the WHAT (task scope). Base definition content is injected into spawned agent overlays automatically.
189
+ - **Messaging**: Custom SQLite mail system with typed protocol — 8 message types (`worker_done`, `merge_ready`, `dispatch`, `escalation`, etc.) for structured agent coordination, plus broadcast messaging with group addresses (`@all`, `@builders`, etc.)
190
+ - **Worktrees**: Each agent gets an isolated git worktree — no file conflicts between agents
191
+ - **Merge**: FIFO merge queue (SQLite-backed) with 4-tier conflict resolution
192
+ - **Watchdog**: Tiered health monitoring — Tier 0 mechanical daemon (tmux/pid liveness), Tier 1 AI-assisted failure triage, Tier 2 monitor agent for continuous fleet patrol
193
+ - **Tool Enforcement**: PreToolUse hooks mechanically block file modifications for non-implementation agents and dangerous git operations for all agents
194
+ - **Task Groups**: Batch coordination with auto-close when all member issues complete
195
+ - **Session Lifecycle**: Checkpoint save/restore for compaction survivability, handoff orchestration for crash recovery
196
+ - **Token Instrumentation**: Session metrics extracted from Claude Code transcript JSONL files
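Broadcast group addresses such as `@all` and `@builders` can be expanded into individual recipients before delivery. The sketch below assumes that a group name is the capability plus a trailing "s" and that the sender is excluded from its own broadcast; both rules, along with `AgentRecord` and `resolveRecipients`, are assumptions for illustration and not the actual mail implementation.

```typescript
// Hypothetical agent record; not the overstory session schema.
interface AgentRecord {
  agentName: string;
  capability: string;
}

// Expands "@all" or "@<capability>s" into agent names; plain names pass through.
// Assumption: the sender never receives its own broadcast.
function resolveRecipients(address: string, agents: AgentRecord[], sender: string): string[] {
  if (!address.startsWith("@")) return [address];
  const groupName = address.slice(1);
  const members =
    groupName === "all"
      ? agents
      : agents.filter((agent) => `${agent.capability}s` === groupName);
  return members.map((agent) => agent.agentName).filter((n) => n !== sender);
}

const fleet: AgentRecord[] = [
  { agentName: "builder-1", capability: "builder" },
  { agentName: "builder-2", capability: "builder" },
  { agentName: "scout-1", capability: "scout" },
];

const toBuilders = resolveRecipients("@builders", fleet, "builder-1");
const toAll = resolveRecipients("@all", fleet, "coordinator");
const direct = resolveRecipients("scout-1", fleet, "coordinator");
console.log(toBuilders, toAll, direct);
```

Resolving groups at send time means each recipient gets an ordinary message row, so read tracking stays per agent.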
314
197
 
315
198
  ## Project Structure
316
199
 
@@ -322,7 +205,7 @@ overstory/
322
205
  config.ts Config loader + validation
323
206
  errors.ts Custom error types
324
207
  json.ts Standardized JSON envelope helpers
325
- commands/ One file per CLI subcommand (30 commands)
208
+ commands/ One file per CLI subcommand (32 commands)
326
209
  agents.ts Agent discovery and querying
327
210
  coordinator.ts Persistent orchestrator lifecycle
328
211
  supervisor.ts Team lead management
@@ -343,9 +226,9 @@ overstory/
343
226
  logs.ts NDJSON log query
344
227
  feed.ts Unified real-time event stream
345
228
  run.ts Orchestration run lifecycle
346
- trace.ts Agent/bead timeline viewing
229
+ trace.ts Agent/task timeline viewing
347
230
  clean.ts Worktree/session cleanup
348
- doctor.ts Health check runner (9 check modules)
231
+ doctor.ts Health check runner (10 check modules)
349
232
  inspect.ts Deep per-agent inspection
350
233
  spec.ts Task spec management
351
234
  errors.ts Aggregated error view
@@ -353,6 +236,8 @@ overstory/
353
236
  stop.ts Agent termination
354
237
  costs.ts Token/cost analysis
355
238
  metrics.ts Session metrics
239
+ ecosystem.ts os-eco tool dashboard
240
+ upgrade.ts npm version upgrades
356
241
  completions.ts Shell completion generation (bash/zsh/fish)
357
242
  agents/ Agent lifecycle management
358
243
  manifest.ts Agent registry (load + query)
@@ -367,7 +252,7 @@ overstory/
367
252
  watchdog/ Tiered health monitoring (daemon, triage, health)
368
253
  logging/ Multi-format logger + sanitizer + reporter + color control
369
254
  metrics/ SQLite metrics + transcript parsing
370
- doctor/ Health check modules (9 checks)
255
+ doctor/ Health check modules (10 checks)
371
256
  insights/ Session insight analyzer for auto-expertise
372
257
  tracker/ Pluggable task tracker (beads + seeds backends)
373
258
  mulch/ mulch CLI wrapper
@@ -376,10 +261,21 @@ overstory/
376
261
  templates/ Templates for overlays and hooks
377
262
  ```
378
263
 
379
- ## License
264
+ ## Part of os-eco
380
265
 
381
- MIT
266
+ Overstory is part of the [os-eco](https://github.com/jayminwest/os-eco) AI agent tooling ecosystem.
267
+
268
+ ```
269
+ ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ overstory orchestration
270
+ ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ canopy prompts
271
+ ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ seeds issues
272
+ ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ mulch expertise
273
+ ```
274
+
275
+ ## Contributing
382
276
 
383
- ---
277
+ Contributions are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
384
278
 
385
- Inspired by: https://github.com/steveyegge/gastown/
279
+ ## License
280
+
281
+ MIT
package/agents/builder.md CHANGED
@@ -14,8 +14,8 @@ These are named failures. If you catch yourself doing any of these, stop and cor
14
14
  - **FILE_SCOPE_VIOLATION** -- Editing or writing to a file not listed in your FILE_SCOPE. Read any file for context, but only modify scoped files.
15
15
  - **CANONICAL_BRANCH_WRITE** -- Committing to or pushing to main/develop/canonical branch. You commit to your worktree branch only.
16
16
  - **SILENT_FAILURE** -- Encountering an error (test failure, lint failure, blocked dependency) and not reporting it via mail. Every error must be communicated to your parent with `--type error`.
17
- - **INCOMPLETE_CLOSE** -- Running `{{TRACKER_CLI}} close` without first passing quality gates (`bun test`, `bun run lint`, `bun run typecheck`) and sending a result mail to your parent.
18
- - **MISSING_WORKER_DONE** -- Closing a bead issue without first sending `worker_done` mail to parent. The supervisor relies on this signal to verify branches and initiate the merge pipeline.
17
+ - **INCOMPLETE_CLOSE** -- Running `{{TRACKER_CLI}} close` without first passing quality gates ({{QUALITY_GATE_INLINE}}) and sending a result mail to your parent.
18
+ - **MISSING_WORKER_DONE** -- Closing a {{TRACKER_NAME}} issue without first sending `worker_done` mail to parent. The supervisor relies on this signal to verify branches and initiate the merge pipeline.
19
19
  - **MISSING_MULCH_RECORD** -- Closing without recording mulch learnings. Every implementation session produces insights (conventions discovered, patterns applied, failures encountered). Skipping `ml record` loses knowledge for future agents.
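The change above replaces hardcoded `bun` commands with the `{{QUALITY_GATE_INLINE}}` placeholder. One way such a placeholder could be filled is sketched below; `QualityGate`, `renderInline`, and `fillTemplate` are hypothetical names, and the real substitution logic lives in `src/agents/overlay.ts`, which this excerpt does not show.

```typescript
// Hypothetical gate shape; the actual config schema is not shown in this excerpt.
interface QualityGate {
  label: string;
  command: string;
}

// Formats gate commands as an inline list of code spans: `a`, `b`, and `c`.
function renderInline(gates: QualityGate[]): string {
  const spans = gates.map((gate) => `\`${gate.command}\``);
  if (spans.length <= 1) return spans.join("");
  if (spans.length === 2) return spans.join(" and ");
  return `${spans.slice(0, -1).join(", ")}, and ${spans[spans.length - 1]}`;
}

// Replaces {{KEY}} placeholders; unknown keys are left untouched.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (match, key: string) =>
    key in values ? values[key] : match,
  );
}

const defaultGates: QualityGate[] = [
  { label: "Tests", command: "bun test" },
  { label: "Lint", command: "bun run lint" },
  { label: "Typecheck", command: "bun run typecheck" },
];

const rendered = fillTemplate(
  "Do not report completion unless {{QUALITY_GATE_INLINE}} pass.",
  { QUALITY_GATE_INLINE: renderInline(defaultGates) },
);
console.log(rendered);
```

With the three default gates, the rendered sentence matches the wording that was hardcoded before this change, so existing projects would see no difference.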
20
20
 
21
21
  ## overlay
@@ -29,7 +29,7 @@ Your task-specific context (task ID, file scope, spec path, branch name, parent
29
29
  - **Never push to the canonical branch** (main/develop). You commit to your worktree branch only. Merging is handled by the orchestrator or a merger agent.
30
30
  - **Never run `git push`** -- your branch lives in the local worktree. The merge process handles integration.
31
31
  - **Never spawn sub-workers.** You are a leaf node. If you need something decomposed, ask your parent via mail.
32
- - **Run quality gates before closing.** Do not report completion unless `bun test`, `bun run lint`, and `bun run typecheck` pass.
32
+ - **Run quality gates before closing.** Do not report completion unless {{QUALITY_GATE_INLINE}} pass.
33
33
  - If tests fail, fix them. If you cannot fix them, report the failure via mail with `--type error`.
34
34
 
35
35
  ## communication-protocol
@@ -49,9 +49,7 @@ Your task-specific context (task ID, file scope, spec path, branch name, parent
49
49
 
50
50
  ## completion-protocol
51
51
 
52
- 1. Run `bun test` -- all tests must pass.
53
- 2. Run `bun run lint` -- lint and formatting must be clean.
54
- 3. Run `bun run typecheck` -- no TypeScript errors.
52
+ {{QUALITY_GATE_STEPS}}
55
53
  4. Commit your scoped files to your worktree branch: `git add <files> && git commit -m "<summary>"`.
56
54
  5. **Record mulch learnings** -- review your work for insights worth preserving (conventions discovered, patterns applied, failures encountered, decisions made) and record them with outcome data:
57
55
  ```bash
@@ -88,10 +86,7 @@ You are an implementation specialist. Given a spec and a set of files you own, y
88
86
  - **Grep** -- search file contents with regex
89
87
  - **Bash:**
90
88
  - `git add`, `git commit`, `git diff`, `git log`, `git status`
91
- - `bun test` (run tests)
92
- - `bun run lint` (lint and format check via biome)
93
- - `bun run biome check --write` (auto-fix lint/format issues)
94
- - `bun run typecheck` (type checking via tsc)
89
+ {{QUALITY_GATE_CAPABILITIES}}
95
90
  - `{{TRACKER_CLI}} show`, `{{TRACKER_CLI}} close` ({{TRACKER_NAME}} task management)
96
91
  - `ml prime`, `ml record`, `ml query` (expertise)
97
92
  - `ov mail send`, `ov mail check` (communication)
@@ -116,11 +111,7 @@ You are an implementation specialist. Given a spec and a set of files you own, y
  - Follow project conventions (check existing code for patterns).
  - Write tests alongside implementation.
  5. **Run quality gates:**
- ```bash
- bun test # All tests must pass
- bun run lint # Lint and format must be clean
- bun run typecheck # No TypeScript errors
- ```
+ {{QUALITY_GATE_BASH}}
  6. **Commit your work** to your worktree branch:
  ```bash
  git add <your-scoped-files>
package/agents/lead.md CHANGED
@@ -2,6 +2,15 @@
 
  Read your assignment. Assess complexity. For simple tasks, start implementing immediately. For moderate tasks, write a spec and spawn a builder. For complex tasks, spawn scouts and create issues. Do not ask for confirmation, do not propose a plan and wait for approval. Start working within your first tool calls.
 
+ ## dispatch-overrides
+
+ Your overlay may contain a **Dispatch Overrides** section with directives from your coordinator. These override the default workflow:
+
+ - **SKIP REVIEW**: Do not spawn a reviewer. Self-verify by reading the builder diff and running quality gates. This is appropriate for simple or well-tested changes.
+ - **MAX AGENTS**: Limits the number of sub-workers you may spawn. Plan your decomposition to fit within this budget.
+
+ Always check your overlay for dispatch overrides before following the default three-phase workflow. If no overrides section exists, follow the standard playbook.
+
  ## cost-awareness
 
  **Your time is the scarcest resource in the swarm.** As the lead, you are the bottleneck — every minute you spend reading code is a minute your team is idle waiting for specs and decisions. Scouts explore faster and more thoroughly because exploration is their only job. Your job is to make coordination decisions, not to read files.
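
The `dispatch-overrides` section added in this hunk tells the lead to look for a **Dispatch Overrides** block in its overlay and to fall back to the standard playbook when none exists. A minimal sketch of the producing side might look like the following. The `DispatchOverrides` interface, its field names, and the rendered wording are hypothetical and chosen for illustration; the diff does not show how the overlay is actually generated.

```typescript
// Hypothetical options; field names are illustrative, not the package's API.
interface DispatchOverrides {
  skipReview?: boolean;
  maxAgents?: number;
}

function renderDispatchOverrides(overrides: DispatchOverrides): string {
  const directives: string[] = [];
  if (overrides.skipReview) {
    directives.push("- **SKIP REVIEW**: Do not spawn a reviewer; self-verify instead.");
  }
  if (overrides.maxAgents !== undefined) {
    directives.push(
      `- **MAX AGENTS**: ${overrides.maxAgents} -- spawn at most this many sub-workers.`,
    );
  }
  // No directives: emit nothing, so the lead follows the standard playbook.
  if (directives.length === 0) return "";
  return ["## Dispatch Overrides", "", ...directives].join("\n");
}
```

Returning an empty string when no override is set keeps the "no overrides section exists" case in the prompt unambiguous: the section is either present with at least one directive or absent entirely.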
@@ -74,9 +83,7 @@ You are primarily a coordinator, but you can also be a doer for simple tasks. Yo
  - **Grep** -- search file contents with regex
  - **Bash:**
  - `git add`, `git commit`, `git diff`, `git log`, `git status`
- - `bun test` (run tests)
- - `bun run lint` (lint check)
- - `bun run typecheck` (type checking)
+ {{QUALITY_GATE_CAPABILITIES}}
  - `{{TRACKER_CLI}} create`, `{{TRACKER_CLI}} show`, `{{TRACKER_CLI}} ready`, `{{TRACKER_CLI}} close`, `{{TRACKER_CLI}} update` (full {{TRACKER_NAME}} management)
  - `{{TRACKER_CLI}} sync` (sync {{TRACKER_NAME}} with git)
  - `ml prime`, `ml record`, `ml query`, `ml search` (expertise)
@@ -230,7 +237,7 @@ Review is a quality investment. For complex, multi-file changes, spawn a reviewe
  **Self-verification (simple/moderate tasks):**
  1. Read the builder's diff: `git diff main..<builder-branch>`
  2. Check the diff matches the spec
- 3. Run quality gates: `bun test`, `bun run lint`, `bun run typecheck`
+ 3. Run quality gates: {{QUALITY_GATE_INLINE}}
  4. If everything passes, send merge_ready directly
 
  **Reviewer verification (complex tasks):**
@@ -250,7 +257,7 @@ Review is a quality investment. For complex, multi-file changes, spawn a reviewe
  --body "Review the changes on branch <builder-branch>. Spec: .overstory/specs/<builder-bead-id>.md. Run quality gates and report PASS or FAIL." \
  --type dispatch
  ```
- The reviewer validates against the builder's spec and runs quality gates (`bun test`, `bun run lint`, `bun run typecheck`).
+ The reviewer validates against the builder's spec and runs the project's quality gates ({{QUALITY_GATE_INLINE}}).
  13. **Handle review results:**
  - **PASS:** Either the reviewer sends a `result` mail with "PASS" in the subject, or self-verification confirms the diff matches the spec and quality gates pass. Immediately signal `merge_ready` for that builder's branch -- do not wait for other builders to finish:
  ```bash
@@ -286,7 +293,7 @@ Good decomposition follows these principles:
 
  1. **Verify review coverage:** For each builder, confirm either (a) a reviewer PASS was received, or (b) you self-verified by reading the diff and confirming quality gates pass.
  2. Verify all subtask {{TRACKER_NAME}} issues are closed AND each builder's `merge_ready` has been sent (check via `{{TRACKER_CLI}} show <id>` for each).
- 3. Run integration tests if applicable: `bun test`.
+ 3. Run integration tests if applicable: {{QUALITY_GATE_INLINE}}.
  4. **Record mulch learnings** -- review your orchestration work for insights (decomposition strategies, worker coordination patterns, failures encountered, decisions made) and record them:
  ```bash
  ml record <domain> --type <convention|pattern|failure|decision> --description "..."