@yemi33/minions 0.1.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +819 -0
- package/LICENSE +21 -0
- package/README.md +598 -0
- package/agents/dallas/charter.md +56 -0
- package/agents/lambert/charter.md +67 -0
- package/agents/ralph/charter.md +45 -0
- package/agents/rebecca/charter.md +57 -0
- package/agents/ripley/charter.md +47 -0
- package/bin/minions.js +467 -0
- package/config.template.json +28 -0
- package/dashboard.html +4822 -0
- package/dashboard.js +2623 -0
- package/docs/auto-discovery.md +416 -0
- package/docs/blog-first-successful-dispatch.md +128 -0
- package/docs/command-center.md +156 -0
- package/docs/demo/01-dashboard-overview.gif +0 -0
- package/docs/demo/02-command-center.gif +0 -0
- package/docs/demo/03-work-items.gif +0 -0
- package/docs/demo/04-plan-docchat.gif +0 -0
- package/docs/demo/05-prd-progress.gif +0 -0
- package/docs/demo/06-inbox-metrics.gif +0 -0
- package/docs/deprecated.json +83 -0
- package/docs/distribution.md +96 -0
- package/docs/engine-restart.md +92 -0
- package/docs/human-vs-automated.md +108 -0
- package/docs/index.html +221 -0
- package/docs/plan-lifecycle.md +140 -0
- package/docs/self-improvement.md +344 -0
- package/engine/ado-mcp-wrapper.js +42 -0
- package/engine/ado.js +383 -0
- package/engine/check-status.js +23 -0
- package/engine/cli.js +754 -0
- package/engine/consolidation.js +417 -0
- package/engine/github.js +331 -0
- package/engine/lifecycle.js +1113 -0
- package/engine/llm.js +116 -0
- package/engine/queries.js +677 -0
- package/engine/shared.js +397 -0
- package/engine/spawn-agent.js +151 -0
- package/engine.js +3227 -0
- package/minions.js +556 -0
- package/package.json +48 -0
- package/playbooks/ask.md +49 -0
- package/playbooks/build-and-test.md +155 -0
- package/playbooks/explore.md +64 -0
- package/playbooks/fix.md +57 -0
- package/playbooks/implement-shared.md +68 -0
- package/playbooks/implement.md +95 -0
- package/playbooks/plan-to-prd.md +104 -0
- package/playbooks/plan.md +99 -0
- package/playbooks/review.md +68 -0
- package/playbooks/test.md +75 -0
- package/playbooks/verify.md +190 -0
- package/playbooks/work-item.md +74 -0
|
@@ -0,0 +1,156 @@
|
|
|
1
|
+
# Command Center
|
|
2
|
+
|
|
3
|
+
The Command Center (CC) is the minions's conversational AI brain. It powers the dashboard's chat panel, document editing modals, and plan steering — all through a single persistent Sonnet session with full minions awareness.
|
|
4
|
+
|
|
5
|
+
## Access
|
|
6
|
+
|
|
7
|
+
Click the **CC** button in the top-right header of the dashboard. A slide-out chat drawer opens on the right side.
|
|
8
|
+
|
|
9
|
+
## Persistent Sessions
|
|
10
|
+
|
|
11
|
+
CC maintains a true multi-turn session using Claude CLI's `--resume` flag. Unlike a typical stateless API call, each message resumes the same conversation — Claude retains full history including tool calls, intermediate reasoning, and file contents from prior turns.
|
|
12
|
+
|
|
13
|
+
**Session lifecycle:**
|
|
14
|
+
- **Created** on first message (or after expiry/new-session)
|
|
15
|
+
- **Resumed** on subsequent messages via `--resume <sessionId>`
|
|
16
|
+
- **Expires** after 2 hours of inactivity or 50 turns
|
|
17
|
+
- **Persisted** to `engine/cc-session.json` — survives dashboard restarts
|
|
18
|
+
- **Frontend messages** saved to `localStorage` — survive page refresh
|
|
19
|
+
|
|
20
|
+
Click **New Session** in the drawer header to start fresh.
|
|
21
|
+
|
|
22
|
+
### Fresh State Each Turn
|
|
23
|
+
|
|
24
|
+
The system prompt is baked into the session at creation (persona, rules, action format, tool access). Dynamic minions state is injected as a preamble in each user message, so CC always sees current data even in a resumed session.
|
|
25
|
+
|
|
26
|
+
```
|
|
27
|
+
Turn 1: system prompt (static) + [Current Minions State] + user message
|
|
28
|
+
Turn 2+: [Updated Minions State] + user message (system prompt already in session)
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
## What It Knows
|
|
32
|
+
|
|
33
|
+
Full minions context is injected every turn:
|
|
34
|
+
|
|
35
|
+
| Context | Details |
|
|
36
|
+
|---------|---------|
|
|
37
|
+
| Agents | Statuses, current tasks, full charters (expertise/roles) |
|
|
38
|
+
| Routing | Who's preferred/fallback for each work type |
|
|
39
|
+
| Config | Tick interval, max concurrent, timeouts, max turns |
|
|
40
|
+
| Work items | Active, pending, failed across all projects with failure reasons |
|
|
41
|
+
| Pull requests | Status, review status, build status, branch, URL |
|
|
42
|
+
| Plans | Full `.md` plan contents + PRD JSON summaries + archived plans |
|
|
43
|
+
| PRD items | All items with status, priority, dependencies |
|
|
44
|
+
| Dispatch | Active agents + last 15 completions with result summaries |
|
|
45
|
+
| Skills | All reusable agent workflows with triggers |
|
|
46
|
+
| Knowledge base | Recent entries (architecture, conventions, build reports, reviews) |
|
|
47
|
+
| Team notes | Recent consolidated notes |
|
|
48
|
+
|
|
49
|
+
## Tool Access
|
|
50
|
+
|
|
51
|
+
CC can use tools to look beyond the pre-loaded context:
|
|
52
|
+
|
|
53
|
+
- **Bash** — Run shell commands (git, build tools, scripts)
|
|
54
|
+
- **Read** — Open any file (agent output logs, code, config, plans)
|
|
55
|
+
- **Write** — Create or overwrite files
|
|
56
|
+
- **Edit** — Make targeted edits to existing files
|
|
57
|
+
- **Glob** — Find files by pattern (e.g., `agents/*/output.log`)
|
|
58
|
+
- **Grep** — Search file contents (find functions, search agent outputs)
|
|
59
|
+
- **WebFetch/WebSearch** — Look up external resources
|
|
60
|
+
|
|
61
|
+
## Unified Brain
|
|
62
|
+
|
|
63
|
+
All LLM-powered features in the dashboard route through the same CC session:
|
|
64
|
+
|
|
65
|
+
| Feature | Endpoint | How It Works |
|
|
66
|
+
|---------|----------|-------------|
|
|
67
|
+
| **CC chat panel** | `POST /api/command-center` | Direct CC interaction |
|
|
68
|
+
| **Doc chat modal** | `POST /api/doc-chat` | Routes through CC via `ccDocCall()` |
|
|
69
|
+
| **Doc steer** | `POST /api/steer-document` | Routes through CC via `ccDocCall()` |
|
|
70
|
+
| **Plan revision** | `POST /api/plans/revise` | Routes through CC via `ccDocCall()` |
|
|
71
|
+
|
|
72
|
+
This means:
|
|
73
|
+
- A question in the doc chat modal shares context with the CC panel
|
|
74
|
+
- CC remembers what you discussed in a document editing session
|
|
75
|
+
- Doc modals can cross-reference other minions knowledge, PRs, work items
|
|
76
|
+
- All turns accumulate in the same session
|
|
77
|
+
|
|
78
|
+
## Actions
|
|
79
|
+
|
|
80
|
+
When you ask CC to *do* something, it includes structured action blocks in its response.
|
|
81
|
+
|
|
82
|
+
| Action | What It Does | Example Prompt |
|
|
83
|
+
|--------|-------------|----------------|
|
|
84
|
+
| `dispatch` | Create a work item | "Have dallas fix the login bug" |
|
|
85
|
+
| `note` | Save a decision/reminder | "Remember we need to migrate to v3" |
|
|
86
|
+
| `plan` | Create an implementation plan | "Plan the GitHub integration feature" |
|
|
87
|
+
| `cancel` | Stop a running agent | "Cancel whatever ripley is doing" |
|
|
88
|
+
| `retry` | Retry failed work items | "Retry the three failed tasks" |
|
|
89
|
+
| `pause-plan` | Pause a PRD | "Pause the officeagent PRD" |
|
|
90
|
+
| `approve-plan` | Approve a PRD | "Approve the new plan" |
|
|
91
|
+
| `edit-prd-item` | Edit a PRD item | "Change P003's priority to high" |
|
|
92
|
+
| `remove-prd-item` | Remove a PRD item | "Remove P011 from the plan" |
|
|
93
|
+
| `delete-work-item` | Delete a work item | "Delete work item W025" |
|
|
94
|
+
|
|
95
|
+
## Error Handling
|
|
96
|
+
|
|
97
|
+
- **Frontend timeout**: 10-minute `AbortSignal` on the fetch — prevents infinite "thinking" spinner
|
|
98
|
+
- **Backend timeout**: 5-minute kill timer on the claude process
|
|
99
|
+
- **Resume failure**: If `--resume` fails (corrupted/deleted session), automatically retries with a fresh session
|
|
100
|
+
- **Concurrency guard**: Only one CC call at a time — concurrent requests get a 429 "CC is busy" response
|
|
101
|
+
- **Phase indicators**: Thinking indicator shows progressive phases ("Thinking..." → "Analyzing..." → "Timing out soon...")
|
|
102
|
+
|
|
103
|
+
## Architecture
|
|
104
|
+
|
|
105
|
+
```
|
|
106
|
+
User message (CC panel, doc modal, or steer)
|
|
107
|
+
│
|
|
108
|
+
▼
|
|
109
|
+
POST /api/command-center (or /api/doc-chat, /api/steer-document)
|
|
110
|
+
│
|
|
111
|
+
├── Validate session (expiry, turn limit)
|
|
112
|
+
├── Build dynamic state preamble (buildCCStatePreamble)
|
|
113
|
+
├── callLLM() with sessionId (resume) or without (new)
|
|
114
|
+
│ model: sonnet
|
|
115
|
+
│ maxTurns: 25 (CC) or 10 (doc-plan) or 1 (doc-other)
|
|
116
|
+
│ allowedTools: Bash, Read, Write, Edit, Glob, Grep, WebFetch, WebSearch
|
|
117
|
+
│ timeout: 600s (CC) or 300s (doc-plan) or 60s (doc-other)
|
|
118
|
+
│
|
|
119
|
+
▼
|
|
120
|
+
spawn-agent.js
|
|
121
|
+
│
|
|
122
|
+
├── If --resume: skip system prompt file, pass -p --resume <id>
|
|
123
|
+
├── If new: pass -p --system-prompt-file <file>
|
|
124
|
+
│
|
|
125
|
+
▼
|
|
126
|
+
claude CLI (persistent session on disk)
|
|
127
|
+
│
|
|
128
|
+
▼
|
|
129
|
+
Parse response
|
|
130
|
+
├── Extract sessionId from stream-json result
|
|
131
|
+
├── Extract ===ACTIONS=== block (CC)
|
|
132
|
+
├── Extract ---DOCUMENT--- block (doc chat/steer)
|
|
133
|
+
├── Update cc-session.json
|
|
134
|
+
│
|
|
135
|
+
▼
|
|
136
|
+
Frontend
|
|
137
|
+
├── Render chat message
|
|
138
|
+
├── Execute actions / apply doc edits
|
|
139
|
+
├── Save sessionId + messages to localStorage
|
|
140
|
+
```
|
|
141
|
+
|
|
142
|
+
## Key Files
|
|
143
|
+
|
|
144
|
+
| File | Role |
|
|
145
|
+
|------|------|
|
|
146
|
+
| `engine/llm.js` | `callLLM()` — single LLM function with optional `sessionId` for resume |
|
|
147
|
+
| `engine/spawn-agent.js` | Spawns claude CLI; skips system prompt on `--resume` |
|
|
148
|
+
| `engine/shared.js` | `parseStreamJsonOutput()` extracts `sessionId` from result |
|
|
149
|
+
| `engine/cc-session.json` | Persisted session state (sessionId, turnCount, timestamps) |
|
|
150
|
+
| `dashboard.js` | CC endpoint, `buildCCStatePreamble()`, `ccDocCall()`, `parseCCActions()` |
|
|
151
|
+
| `dashboard.html` | Frontend: localStorage persistence, session indicator, New Session button |
|
|
152
|
+
|
|
153
|
+
## Command Bar
|
|
154
|
+
|
|
155
|
+
The command bar at the top of the dashboard routes all input to the CC panel. Typing in the command bar opens the CC drawer and sends the message as a CC turn.
|
|
156
|
+
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
@@ -0,0 +1,83 @@
|
|
|
1
|
+
[
|
|
2
|
+
{
|
|
3
|
+
"id": "status-in-pr",
|
|
4
|
+
"summary": "in-pr status alias — backward compat shim for done",
|
|
5
|
+
"deprecated": "2026-03-21",
|
|
6
|
+
"reason": "Simplified status model: agents mark items done directly, no intermediate in-pr state",
|
|
7
|
+
"locations": [
|
|
8
|
+
"dashboard.html: CSS classes, progress bar, status labels, graph colors (~10 locations)",
|
|
9
|
+
"dashboard.js:1525 completedStatuses set",
|
|
10
|
+
"engine.js:1165 PRD_MET_STATUSES set",
|
|
11
|
+
"engine.js:1180 dependency check",
|
|
12
|
+
"engine.js:1974 completedStatuses set",
|
|
13
|
+
"engine/lifecycle.js:54,61 plan completion gate",
|
|
14
|
+
"engine/queries.js:466,580,597 status ordering and counting"
|
|
15
|
+
],
|
|
16
|
+
"cleanup": "Remove in-pr from all status checks, CSS, and display logic. Only keep done."
|
|
17
|
+
},
|
|
18
|
+
{
|
|
19
|
+
"id": "status-implemented",
|
|
20
|
+
"summary": "implemented status alias — legacy alias for done",
|
|
21
|
+
"deprecated": "2026-03-21",
|
|
22
|
+
"reason": "Canonical status is done. implemented was the original name before standardization.",
|
|
23
|
+
"locations": [
|
|
24
|
+
"dashboard.html: CSS, progress filters, status labels (~8 locations)",
|
|
25
|
+
"dashboard.js:1525 completedStatuses set",
|
|
26
|
+
"engine.js:1165,1974 status sets",
|
|
27
|
+
"engine/lifecycle.js:694 sets implemented on post-merge"
|
|
28
|
+
],
|
|
29
|
+
"cleanup": "Replace all implemented references with done. Update lifecycle.js post-merge to set done."
|
|
30
|
+
},
|
|
31
|
+
{
|
|
32
|
+
"id": "status-complete",
|
|
33
|
+
"summary": "complete status alias — another legacy alias for done",
|
|
34
|
+
"deprecated": "2026-03-21",
|
|
35
|
+
"reason": "Canonical status is done. complete appeared in early PRD schemas.",
|
|
36
|
+
"locations": [
|
|
37
|
+
"dashboard.html: progress filters, completion checks (~4 locations)",
|
|
38
|
+
"engine.js:1165 PRD_MET_STATUSES set"
|
|
39
|
+
],
|
|
40
|
+
"cleanup": "Remove complete from all status checks. Only keep done."
|
|
41
|
+
},
|
|
42
|
+
{
|
|
43
|
+
"id": "dead-plan-version-actions",
|
|
44
|
+
"summary": "showPlanVersionActions, qaJustSave, qaReplacePrd, qaNewPrd, qaDisablePrdButtons — dead code",
|
|
45
|
+
"deprecated": "2026-03-21",
|
|
46
|
+
"reason": "Doc-chat fork logic removed. These functions are unreachable.",
|
|
47
|
+
"locations": [
|
|
48
|
+
"dashboard.html:3775-3860 five function definitions (~80 lines)"
|
|
49
|
+
],
|
|
50
|
+
"cleanup": "Delete the five functions entirely."
|
|
51
|
+
},
|
|
52
|
+
{
|
|
53
|
+
"id": "dead-modal-original-plan",
|
|
54
|
+
"summary": "_modalOriginalPlan variable — dead code",
|
|
55
|
+
"deprecated": "2026-03-21",
|
|
56
|
+
"reason": "Tracked original plan for fork edits. Fork logic removed.",
|
|
57
|
+
"locations": [
|
|
58
|
+
"dashboard.html:3187 declaration",
|
|
59
|
+
"dashboard.html:2018 reset in closeModal"
|
|
60
|
+
],
|
|
61
|
+
"cleanup": "Delete the variable declaration and all references."
|
|
62
|
+
},
|
|
63
|
+
{
|
|
64
|
+
"id": "dead-revise-and-regenerate",
|
|
65
|
+
"summary": "/api/plans/revise-and-regenerate endpoint — disabled behind if(false)",
|
|
66
|
+
"deprecated": "2026-03-21",
|
|
67
|
+
"reason": "Plan versioning handled differently now. Endpoint was explicitly disabled.",
|
|
68
|
+
"locations": [
|
|
69
|
+
"dashboard.js:1782-1829 (~45 lines of dead code)"
|
|
70
|
+
],
|
|
71
|
+
"cleanup": "Delete the entire if(false) block."
|
|
72
|
+
},
|
|
73
|
+
{
|
|
74
|
+
"id": "dead-steer-btn-comments",
|
|
75
|
+
"summary": "// steer btn removed — unified send comments",
|
|
76
|
+
"deprecated": "2026-03-21",
|
|
77
|
+
"reason": "Steer button was removed long ago. Comments are noise.",
|
|
78
|
+
"locations": [
|
|
79
|
+
"dashboard.html: 4 occurrences"
|
|
80
|
+
],
|
|
81
|
+
"cleanup": "Delete the comments."
|
|
82
|
+
}
|
|
83
|
+
]
|
|
@@ -0,0 +1,96 @@
|
|
|
1
|
+
# Distribution & Publishing
|
|
2
|
+
|
|
3
|
+
Minions is distributed as an npm package (`@yemi33/minions`) from a sanitized copy of the main repo.
|
|
4
|
+
|
|
5
|
+
## Two-Repo Architecture
|
|
6
|
+
|
|
7
|
+
| Repo | Purpose | What's included |
|
|
8
|
+
|------|---------|----------------|
|
|
9
|
+
| **origin** (`yemishin_microsoft/minions`) | Full working repo with all session state | Everything — history, notes, decisions, work items, CLAUDE.md |
|
|
10
|
+
| **personal** (`yemi33/minions`) | Clean distribution for others | Engine, dashboard, playbooks, charters, skills, docs, npm package files |
|
|
11
|
+
|
|
12
|
+
## What Gets Stripped
|
|
13
|
+
|
|
14
|
+
These files are removed during sync to personal:
|
|
15
|
+
|
|
16
|
+
| Category | Pattern | Reason |
|
|
17
|
+
|----------|---------|--------|
|
|
18
|
+
| Agent history | `agents/*/history.md` | Session-specific task logs |
|
|
19
|
+
| Notes archive | `notes/archive/*` | Historical agent findings |
|
|
20
|
+
| Notes inbox | `notes/inbox/*` | Pending agent findings |
|
|
21
|
+
| Notes summary | `notes.md` | Consolidated knowledge (runtime) |
|
|
22
|
+
| Work items | `work-items.json` | Runtime dispatch tracking |
|
|
23
|
+
| Project instructions | `CLAUDE.md` | Org-specific context |
|
|
24
|
+
|
|
25
|
+
## npm Package
|
|
26
|
+
|
|
27
|
+
**Package:** `@yemi33/minions`
|
|
28
|
+
**Registry:** https://www.npmjs.com/package/@yemi33/minions
|
|
29
|
+
|
|
30
|
+
### What's in the package
|
|
31
|
+
|
|
32
|
+
Controlled by the `files` field in `package.json`:
|
|
33
|
+
- `bin/minions.js` — CLI entry point
|
|
34
|
+
- `engine.js`, `dashboard.js`, `dashboard.html`, `minions.js` — core scripts
|
|
35
|
+
- `engine/spawn-agent.js`, `engine/ado-mcp-wrapper.js` — engine helpers
|
|
36
|
+
- `agents/*/charter.md` — agent role definitions
|
|
37
|
+
- `playbooks/*.md` — task templates
|
|
38
|
+
- `config.template.json` — starter config
|
|
39
|
+
- `routing.md`, `team.md` — editable team config
|
|
40
|
+
- `skills/`, `docs/` — documentation and workflows
|
|
41
|
+
|
|
42
|
+
### How `minions init` works
|
|
43
|
+
|
|
44
|
+
1. Copies all package files from `node_modules/@yemi33/minions/` to `~/.minions/`
|
|
45
|
+
2. Creates `config.json` from `config.template.json` if it doesn't exist
|
|
46
|
+
3. Creates runtime directories (`engine/`, `notes/inbox/`, `notes/archive/`, etc.)
|
|
47
|
+
4. Runs `minions.js init` to populate config with default agents
|
|
48
|
+
5. On `--force`, overwrites `.js` and `.html` files but preserves user-modified `.md` files
|
|
49
|
+
|
|
50
|
+
### How updates work
|
|
51
|
+
|
|
52
|
+
- Users run `npm update -g @yemi33/minions` then `minions init --force` to update engine code
|
|
53
|
+
- `npx @yemi33/minions` always fetches the latest version
|
|
54
|
+
|
|
55
|
+
## Auto-Publishing
|
|
56
|
+
|
|
57
|
+
A GitHub Action on the personal repo auto-publishes to npm on every push to master.
|
|
58
|
+
|
|
59
|
+
### How it works
|
|
60
|
+
|
|
61
|
+
1. Push to `yemi33/minions` master triggers `.github/workflows/publish.yml`
|
|
62
|
+
2. Action queries npm for the current published version
|
|
63
|
+
3. Bumps patch version (e.g., `0.1.5` → `0.1.6`)
|
|
64
|
+
4. Publishes to npm with the new version
|
|
65
|
+
5. Commits the version bump back to the repo with `[skip ci]` to prevent loops
|
|
66
|
+
|
|
67
|
+
### Why version comes from npm, not the repo
|
|
68
|
+
|
|
69
|
+
The sync-to-personal workflow force-pushes, which overwrites any version bump commits from previous action runs. So the action reads the latest version from the npm registry and bumps from there.
|
|
70
|
+
|
|
71
|
+
### Setup requirements
|
|
72
|
+
|
|
73
|
+
- `NPM_TOKEN` secret on `yemi33/minions` — a granular access token with publish permissions and 2FA bypass enabled
|
|
74
|
+
- The workflow file (`.github/workflows/publish.yml`) is gitignored on the org repo and force-added during sync
|
|
75
|
+
|
|
76
|
+
## Sync Workflow
|
|
77
|
+
|
|
78
|
+
Run `/sync-to-personal` or manually:
|
|
79
|
+
|
|
80
|
+
```bash
|
|
81
|
+
# 1. Create dist branch, strip files, add workflow, force-push
|
|
82
|
+
git checkout -b dist-clean
|
|
83
|
+
git rm --cached agents/*/history.md notes.md work-items.json CLAUDE.md
|
|
84
|
+
git rm -r --cached notes/archive/ notes/inbox/ notes/
|
|
85
|
+
# ... add .gitkeep files, .gitignore entries, workflow file
|
|
86
|
+
git add -f .github/workflows/publish.yml
|
|
87
|
+
git commit -m "Strip for distribution"
|
|
88
|
+
git push personal dist-clean:master --force
|
|
89
|
+
|
|
90
|
+
# 2. Return to master
|
|
91
|
+
git checkout master
|
|
92
|
+
git branch -D dist-clean
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
The full workflow is documented in `.claude/skills/sync-to-personal/SKILL.md`.
|
|
96
|
+
|
|
@@ -0,0 +1,92 @@
|
|
|
1
|
+
# Engine Restart & Agent Survival
|
|
2
|
+
|
|
3
|
+
## The Problem
|
|
4
|
+
|
|
5
|
+
When the engine restarts, it loses its in-memory process handles (`activeProcesses` Map). Claude CLI agents spawned before the restart are still running as OS processes, but the engine can't monitor their stdout, detect exit codes, or manage their lifecycle. Without protection, the heartbeat check (5-min default) would kill these agents as "orphans."
|
|
6
|
+
|
|
7
|
+
## What's Persisted vs Lost
|
|
8
|
+
|
|
9
|
+
| State | Storage | Survives Restart |
|
|
10
|
+
|-------|---------|-----------------|
|
|
11
|
+
| Dispatch queue (pending/active/completed) | `engine/dispatch.json` | Yes |
|
|
12
|
+
| Agent status (working/idle/error) | Derived from `engine/dispatch.json` | Yes |
|
|
13
|
+
| Agent live output | `agents/*/live-output.log` | Yes (mtime used as heartbeat) |
|
|
14
|
+
| Process handles (`ChildProcess`) | In-memory Map | **No** |
|
|
15
|
+
| Cooldown timestamps | In-memory Map | **No** (repopulated from `engine/cooldowns.json`) |
|
|
16
|
+
|
|
17
|
+
## Protection Mechanisms
|
|
18
|
+
|
|
19
|
+
### 1. Grace Period on Startup (20 min default)
|
|
20
|
+
|
|
21
|
+
When the engine starts and finds active dispatches from a previous session, it sets `engineRestartGraceUntil` to `now + 20 minutes`. During this window, orphan detection is completely suppressed — agents won't be killed even if the engine has no process handle for them.
|
|
22
|
+
|
|
23
|
+
Configurable via `config.json`:
|
|
24
|
+
```json
|
|
25
|
+
{
|
|
26
|
+
"engine": {
|
|
27
|
+
"restartGracePeriod": 1200000
|
|
28
|
+
}
|
|
29
|
+
}
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
### 2. Blocking Tool Detection
|
|
33
|
+
|
|
34
|
+
Even after the grace period expires, the engine scans each agent's `live-output.log` for the most recent `tool_use` call. If the agent is in a known blocking tool:
|
|
35
|
+
|
|
36
|
+
- **`TaskOutput` with `block: true`** — timeout extended to the task's own timeout + 1 min
|
|
37
|
+
- **`Bash` with long timeout (>5 min)** — timeout extended to the bash timeout + 1 min
|
|
38
|
+
|
|
39
|
+
This works for both tracked processes and orphans (no process handle).
|
|
40
|
+
|
|
41
|
+
### 3. Stop Warning
|
|
42
|
+
|
|
43
|
+
`engine.js stop` checks for active dispatches and warns:
|
|
44
|
+
```
|
|
45
|
+
WARNING: 2 agent(s) are still working:
|
|
46
|
+
- Dallas: [office-bohemia] Build & test PR PR-4959092
|
|
47
|
+
- Rebecca: [office-bohemia] Review PR PR-4964594
|
|
48
|
+
|
|
49
|
+
These agents will continue running but the engine won't monitor them.
|
|
50
|
+
On next start, they'll get a 20-min grace period before being marked as orphans.
|
|
51
|
+
To kill them now, run: node engine.js kill
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
### 4. Exponential Backoff on Failures
|
|
55
|
+
|
|
56
|
+
If an agent is killed as an orphan and the work item retries, cooldowns use exponential backoff (2^failures, max 8x) to prevent spam-retrying broken tasks.
|
|
57
|
+
|
|
58
|
+
## Safe Restart Pattern
|
|
59
|
+
|
|
60
|
+
```bash
|
|
61
|
+
node engine.js stop # Check the warning — are agents working?
|
|
62
|
+
# If yes, decide: wait for them to finish, or accept the grace period
|
|
63
|
+
# Make your code changes
|
|
64
|
+
node engine.js start # Grace period kicks in for surviving agents
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
## What the Engine Cannot Do
|
|
68
|
+
|
|
69
|
+
- **Reattach to processes** — Node.js `child_process` doesn't support adopting external PIDs. Once the process handle is lost, the engine can only observe the agent indirectly via file output.
|
|
70
|
+
- **Guarantee completion** — An agent that finishes during a restart will have its output saved to `live-output.log`, but the engine won't run post-completion hooks (PR sync, metrics update, learnings check). These are picked up on the next tick via output file scanning.
|
|
71
|
+
- **Resume mid-task** — If an agent is killed (by orphan detection or timeout), the work item is marked failed. It can be retried but starts from scratch.
|
|
72
|
+
|
|
73
|
+
## Timeline of a Restart
|
|
74
|
+
|
|
75
|
+
```
|
|
76
|
+
T+0s engine.js stop (warns about active agents)
|
|
77
|
+
Engine process exits. Agents keep running as OS processes.
|
|
78
|
+
|
|
79
|
+
T+30s Code changes made. engine.js start.
|
|
80
|
+
Engine reads dispatch.json — finds 2 active items.
|
|
81
|
+
Sets grace period: 20 min from now.
|
|
82
|
+
Logs: "2 active dispatch(es) from previous session"
|
|
83
|
+
|
|
84
|
+
T+0-20m Ticks run. Orphan detection skipped (grace period).
|
|
85
|
+
If an agent finishes, output is written to live-output.log.
|
|
86
|
+
Engine detects completed output on next tick via file scan.
|
|
87
|
+
|
|
88
|
+
T+20m Grace period expires.
|
|
89
|
+
Heartbeat check resumes. Blocking tool detection still active.
|
|
90
|
+
Agent in TaskOutput block:true gets extended timeout.
|
|
91
|
+
Agent with no output for 5min+ and no blocking tool → orphaned.
|
|
92
|
+
```
|
|
@@ -0,0 +1,108 @@
|
|
|
1
|
+
# Human vs. Automated — What Requires You, What Doesn't
|
|
2
|
+
|
|
3
|
+
## Quick Reference
|
|
4
|
+
|
|
5
|
+
| Feature | Who starts it | Who runs it | Who decides | Who recovers |
|
|
6
|
+
|---------|--------------|-------------|-------------|-------------|
|
|
7
|
+
| Work items | You (dashboard) | Engine + agent | — | You (retry) |
|
|
8
|
+
| Plans | You (dashboard) | Agent writes plan | You (approve/reject) | You (revise) |
|
|
9
|
+
| PRD items | You (dashboard) | Engine dispatches | — | You (retry) |
|
|
10
|
+
| PR creation | Agent (auto) | Agent | — | — |
|
|
11
|
+
| PR review | Engine (auto-dispatch) | Agent reviewer | Human (vote to merge) | Human (comments → auto-fix) |
|
|
12
|
+
| Build failures | Engine (auto-detect) | Agent (auto-fix) | — | — |
|
|
13
|
+
| Notes | You (`/note`) or agent (findings) | Engine (consolidate) | You (promote to KB) | — |
|
|
14
|
+
| Cleanup | Engine (every 10 min) | Engine | — | — |
|
|
15
|
+
| Metrics | Engine (auto-collect) | Engine | You (view) | — |
|
|
16
|
+
| Error recovery | Engine (detect) | — | You (retry/delete) | You |
|
|
17
|
+
| Project linking | You (`minions add/scan`) | — | — | — |
|
|
18
|
+
| MCP servers | You (`~/.claude.json`) | Inherited by agents | — | — |
|
|
19
|
+
|
|
20
|
+
## The Two Human Gates
|
|
21
|
+
|
|
22
|
+
Minions is designed around **two approval gates** where humans make decisions. Everything else is automated.
|
|
23
|
+
|
|
24
|
+
### Gate 1: Plan Approval
|
|
25
|
+
|
|
26
|
+
When you submit a `/plan`, an agent creates a structured plan file. The plan sits in `awaiting-approval` status until you:
|
|
27
|
+
- **Approve** → engine auto-dispatches all plan items
|
|
28
|
+
- **Reject** → plan is archived
|
|
29
|
+
- **Revise** → feedback sent back to agent, plan is reworked
|
|
30
|
+
|
|
31
|
+
This is the only point where you decide *what* gets built.
|
|
32
|
+
|
|
33
|
+
### Gate 2: PR Review
|
|
34
|
+
|
|
35
|
+
When agents create PRs, they need human review votes before merging. You can:
|
|
36
|
+
- **Approve** → PR is merge-ready
|
|
37
|
+
- **Comment** → engine detects `@minions` mentions, auto-dispatches a fix task
|
|
38
|
+
- **Request changes** → same as comment, triggers auto-fix
|
|
39
|
+
|
|
40
|
+
This is the only point where you decide if the *quality* is good enough.
|
|
41
|
+
|
|
42
|
+
## Fully Automated (Zero Human Involvement)
|
|
43
|
+
|
|
44
|
+
These run continuously without you:
|
|
45
|
+
|
|
46
|
+
- **Work discovery** — engine scans all project queues every tick (~60s)
|
|
47
|
+
- **Agent dispatch** — engine picks the right agent, builds the prompt, spawns Claude
|
|
48
|
+
- **Worktree management** — create on dispatch, pull on shared-branch, clean after merge
|
|
49
|
+
- **PR status polling** — checks ADO for build status, review votes, merge state every ~6 min
|
|
50
|
+
- **Build failure detection** — auto-files fix tasks when CI fails
|
|
51
|
+
- **Inbox consolidation** — LLM-powered dedup and categorization when inbox hits threshold
|
|
52
|
+
- **Knowledge base classification** — auto-assigns category to consolidated notes
|
|
53
|
+
- **Heartbeat monitoring** — detects hung/dead agents, marks them failed
|
|
54
|
+
- **Blocking tool detection** — extends timeout when agent is in a long-running operation
|
|
55
|
+
- **Metrics collection** — tracks tasks, errors, PRs, approvals per agent
|
|
56
|
+
- **Dispatch priority** — fixes first, then reviews, then implementations
|
|
57
|
+
- **Cooldown & backoff** — prevents re-dispatching recently failed items
|
|
58
|
+
- **Zombie cleanup** — temp files, orphaned worktrees, stale processes every 10 min
|
|
59
|
+
- **Post-merge hooks** — worktree cleanup, PRD status update, metrics update
|
|
60
|
+
|
|
61
|
+
## Human-Triggered, Then Autonomous
|
|
62
|
+
|
|
63
|
+
You kick these off, then they run without you:
|
|
64
|
+
|
|
65
|
+
- **Work items** — type in dashboard, engine dispatches, agent executes, PR created
|
|
66
|
+
- **PRD items** — `/prd` in dashboard, engine discovers and dispatches implement tasks
|
|
67
|
+
- **Fan-out** — one task dispatched to all idle agents in parallel
|
|
68
|
+
- **Retry** — click retry on failed item, engine re-dispatches fresh
|
|
69
|
+
- **Notes** — `/note` in dashboard, flows through inbox → consolidation → team knowledge
|
|
70
|
+
- **KB promotion** — click "Add to Knowledge Base", pick category, done
|
|
71
|
+
- **Project linking** — `minions scan` or `minions add`, engine discovers work on next tick
|
|
72
|
+
|
|
73
|
+
## Human-in-the-Loop
|
|
74
|
+
|
|
75
|
+
These pause and wait for your input:
|
|
76
|
+
|
|
77
|
+
- **Plan approval** — agent writes plan, waits for approve/reject/revise
|
|
78
|
+
- **Plan discussion** — interactive Claude session where you refine the plan
|
|
79
|
+
- **PR merge** — agents can't merge their own PRs, humans must vote
|
|
80
|
+
- **PR feedback cycle** — human comments → auto-fix → human re-reviews (loop until approved)
|
|
81
|
+
|
|
82
|
+
## Manual Only
|
|
83
|
+
|
|
84
|
+
These are entirely on you:
|
|
85
|
+
|
|
86
|
+
- **Project setup** — `minions init`, `minions scan`, `minions add`
|
|
87
|
+
- **Agent customization** — edit `agents/*/charter.md`, `routing.md`
|
|
88
|
+
- **Config changes** — edit `config.json` (engine settings, projects)
|
|
89
|
+
- **MCP server setup** — add servers to `~/.claude.json`
|
|
90
|
+
- **Dashboard access** — open browser to `http://localhost:7331`
|
|
91
|
+
- **Engine start/stop** — `node engine.js start/stop`
|
|
92
|
+
|
|
93
|
+
## What Happens When You Walk Away
|
|
94
|
+
|
|
95
|
+
If you start the engine and dashboard, then leave:
|
|
96
|
+
|
|
97
|
+
1. Engine ticks every 60 seconds
|
|
98
|
+
2. Discovers pending work items, PRD gaps, PR reviews needed
|
|
99
|
+
3. Dispatches agents (up to max concurrent)
|
|
100
|
+
4. Agents create worktrees, write code, create PRs
|
|
101
|
+
5. Engine monitors for completion, hung agents, build failures
|
|
102
|
+
6. Successful work → PRs appear in your ADO/GitHub queue
|
|
103
|
+
7. Failed work → marked failed, waiting for your retry
|
|
104
|
+
8. Notes consolidated into team knowledge automatically
|
|
105
|
+
9. Worktrees cleaned up after PRs merge
|
|
106
|
+
|
|
107
|
+
**What blocks:** Plans waiting for approval. PRs waiting for your review vote. Failed tasks waiting for retry. Everything else keeps moving.
|
|
108
|
+
|