npm - @yemi33/minions - Versions diffs - 0.1.1884 → 0.1.1885 - Mend

@yemi33/minions 0.1.1884 → 0.1.1885

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (22) hide show

package/README.md +78 -64
package/docs/README.md +36 -0
package/docs/{distribution.md → archive/distribution.md} +2 -0
package/docs/auto-discovery.md +21 -8
package/docs/completion-reports.md +215 -0
package/docs/engine-restart.md +2 -0
package/docs/kb-sweep.md +173 -0
package/docs/onboarding.md +246 -0
package/docs/runtime-adapters.md +213 -0
package/docs/team-memory.md +116 -0
package/docs/teams-production.md +3 -1
package/docs/watches.md +201 -0
package/package.json +1 -1
package/playbooks/docs.md +1 -1
package/playbooks/fix.md +1 -12
package/playbooks/implement-shared.md +1 -12
package/playbooks/implement.md +2 -2
package/playbooks/plan-to-prd.md +3 -3
package/playbooks/review.md +1 -1
package/playbooks/shared-rules.md +1 -3
package/playbooks/verify.md +1 -1
package/prompts/cc-system.md +10 -2

package/docs/kb-sweep.md ADDED Viewed

@@ -0,0 +1,173 @@
+# Knowledge Base Sweep
+How the `knowledge/` directory stays curated, deduplicated, and small enough to keep injecting into agent prompts.
+## What the Sweep Does
+The KB sweep is a 3-pass cycle implemented in [`engine/kb-sweep.js`](../engine/kb-sweep.js) that prunes duplicate, stale, and oversized entries from `knowledge/`. It runs asynchronously, archives (rather than deletes) what it removes, and writes a `_swept` frontmatter flag so the expensive rewrite pass is idempotent.
+```
+Entries in knowledge/<category>/*.md
+  → Pass 1: Hash dedup       (cheap, no LLM)
+  → Pass 2: LLM batch sweep  (Haiku, batches of 30)
+  → Pass 3: Per-entry rewrite (Haiku, concurrency 5)
+  → Prune _swept/ archive    (>30 days)
+  → Invalidate KB cache, write engine/kb-swept.json
+```
+## The 3-Pass Cycle
+### Pass 1 — Hash Dedup
+Cheap, deterministic. No LLM call.
+For every entry, compute `sha256(normalize(content).slice(0,500) + ':' + content.length)` (source: [`engine/kb-sweep.js:29`](../engine/kb-sweep.js#L29)). Group by hash. When two or more entries collide, keep the most recent (by `date:` frontmatter, then mtime) and archive the rest into `knowledge/_swept/` with a `<!-- swept: ... | reason: hash-duplicate of ... -->` header (source: [`engine/kb-sweep.js:84-105`](../engine/kb-sweep.js#L84)).
+This catches near-identical re-postings across batches that the LLM pass might miss because they land in different batches.
+### Pass 2 — LLM Batch Sweep
+The remaining survivors are sent to Claude Haiku in batches of `LLM_BATCH_SIZE = 30` (source: [`engine/kb-sweep.js:25`](../engine/kb-sweep.js#L25)). Each batch prompt asks for three lists in JSON:
+| Field | What the LLM looks for |
+|-------|------------------------|
+| `duplicates` | Entries with substantially the same content. Returned as `{keep, remove[], reason}`. |
+| `reclassify` | Entries in the wrong category. Returned as `{index, from, to, reason}`. |
+| `remove`     | Stale or empty entries (boilerplate, "no changes needed", bail-out notes). Returned as `{index, reason}`. |
+Each action archives the file via the same `_archiveKbFile()` helper used by Pass 1; reclassification rewrites the `category:` frontmatter line and moves the file into the new category directory (source: [`engine/kb-sweep.js:243-279`](../engine/kb-sweep.js#L243)).
+Reclassification targets are validated against `shared.KB_CATEGORIES` (`architecture`, `conventions`, `project-notes`, `build-reports`, `reviews` — source: [`engine/shared.js:1012`](../engine/shared.js#L1012)); unknown categories are silently dropped.
+If a batch returns invalid JSON or the runtime is unavailable, that batch is skipped with a warning and the rest of the sweep continues (source: [`engine/kb-sweep.js:139-151`](../engine/kb-sweep.js#L139)).
+### Pass 3 — Per-Entry Rewrite
+Survivors of Passes 1 and 2 that are still on disk get rewritten one entry at a time, with up to `NORMALIZE_CONCURRENCY = 5` parallel workers (source: [`engine/kb-sweep.js:26`](../engine/kb-sweep.js#L26)). Each entry is sent to Haiku with a fixed template prompt that:
+- Caps output at ~800 words.
+- Forces the structure `## Summary` → `## Key Findings` → `## Action Items` (omit if none) → `## References` (omit if none).
+- Preserves all `file:line` references and code snippets.
+- Drops boilerplate (full dates, agent IDs in the body, narrative scaffolding).
+After a successful rewrite, the entry's frontmatter gains a `_swept: <ISO timestamp>` key (source: [`engine/kb-sweep.js:229`](../engine/kb-sweep.js#L229)).
+## The `_swept` Frontmatter Flag
+`_swept` makes Pass 3 idempotent across runs. Semantics:
+- **Set** to `new Date().toISOString()` whenever the rewrite pass successfully rewrites the body (source: [`engine/kb-sweep.js:229`](../engine/kb-sweep.js#L229)).
+- **Skipped** by Pass 3 on subsequent sweeps as long as the file's `mtimeMs` is `<= sweptAt + 1000` ms (1 s grace for FS timestamp jitter — source: [`engine/kb-sweep.js:196-202`](../engine/kb-sweep.js#L196)).
+- **Re-processed** if the file is edited after the sweep flag was set (mtime exceeds the grace window).
+Hash dedup (Pass 1) and the LLM sweep (Pass 2) ignore `_swept` — those passes act on metadata and similarity, not on whether the entry was previously rewritten.
+The flag key is exported as `SWEPT_FLAG_KEY` for tests (source: [`engine/kb-sweep.js:27,429`](../engine/kb-sweep.js#L27)).
+## 30-Day Archive Retention
+Archived entries land in `knowledge/_swept/` with their original filename (deduped via `shared.uniquePath()` if it collides). They're not deleted immediately — `_pruneOldSwept()` runs at the end of every sweep and removes any file whose `mtimeMs` is older than `SWEPT_RETENTION_MS = 30 * 24 * 60 * 60 * 1000` (30 days — source: [`engine/kb-sweep.js:23,69-81`](../engine/kb-sweep.js#L23)).
+This gives humans a recovery window: if a sweep archives something useful, you have ~30 days to copy it back out of `_swept/` before it's permanently pruned.
+## Manual Trigger
+The dashboard exposes two endpoints:
+| Method | Path | Purpose |
+|--------|------|---------|
+| `POST` | `/api/knowledge/sweep` | Start a sweep in the background. Returns `202 { ok: true, started: true }` (source: [`dashboard.js:7351`](../dashboard.js#L7351), [`dashboard.js:4285`](../dashboard.js#L4285)). |
+| `GET`  | `/api/knowledge/sweep/status` | Poll `{ inFlight, startedAt, lastResult, lastCompletedAt }` (source: [`dashboard.js:7352`](../dashboard.js#L7352), [`dashboard.js:4335`](../dashboard.js#L4335)). |
+POST body is optional. Pass `{ "pinnedKeys": ["knowledge/conventions/foo.md", ...] }` to add request-level pins on top of `pinned.md`. Pinned entries are excluded from every pass (source: [`engine/kb-sweep.js:345-356`](../engine/kb-sweep.js#L345)).
+```bash
+# Trigger a sweep
+curl -X POST http://localhost:7331/api/knowledge/sweep \
+  -H 'Content-Type: application/json' \
+  -d '{}'
+# Poll status
+curl http://localhost:7331/api/knowledge/sweep/status
+```
+### Concurrency Guard
+Only one sweep runs at a time. The POST handler refuses to start a second sweep with `{ ok: true, alreadyRunning: true, startedAt }` while one is in flight (source: [`dashboard.js:4306-4311`](../dashboard.js#L4306)).
+The guard auto-releases when the sweep has been running longer than `staleGuardMs(entryCount)` — `max(30 min, 1 s × entryCount)` — which protects against a dashboard process that died mid-sweep leaving the flag stuck (source: [`engine/kb-sweep.js:413-417`](../engine/kb-sweep.js#L413), [`dashboard.js:4286-4305`](../dashboard.js#L4286)).
+### Persistent State
+Sweep state is mirrored to `engine/kb-sweep-state.json` so the dashboard can recover the in-flight indicator across restarts:
+```jsonc
+// While running
+{ "status": "in-flight", "startedAt": 1715472000000, "startedAtIso": "2026-05-12T..." }
+// On success
+{ "status": "completed", "startedAt": ..., "completedAt": ..., "durationMs": ..., "summary": "...", "lastResult": { ... } }
+// On failure
+{ "status": "failed", "startedAt": ..., "completedAt": ..., "error": "..." }
+```
+Memory still wins when present; the disk file is a fallback (source: [`engine/kb-sweep.js:298-320`](../engine/kb-sweep.js#L298), [`dashboard.js:4296-4354`](../dashboard.js#L4296)). A separate `engine/kb-swept.json` is written after each successful sweep with the human-readable summary line shown by the dashboard's "swept N days ago" badge (source: [`engine/kb-sweep.js:407`](../engine/kb-sweep.js#L407)).
+## Dashboard UI
+The KB page surfaces sweep state in two places:
+1. **Trigger button** (`kbSweep()` in [`dashboard/js/render-kb.js:166`](../dashboard/js/render-kb.js#L166)) posts to `/api/knowledge/sweep` with the user's pinned keys and shows an immediate "queued" toast. If the response says `alreadyRunning`, the toast switches to "already running (Xm elapsed) — let it finish first".
+2. **`#kb-swept-time` indicator** renders `swept N days ago · now sweeping (Xm)` (source: [`dashboard/js/render-kb.js:72-82`](../dashboard/js/render-kb.js#L72)). The `sweepInFlight` and `sweepStartedAt` fields come from `GET /api/knowledge` (source: [`dashboard.js:4262-4267`](../dashboard.js#L4262)).
+### Polling
+The KB page is refreshed by the dashboard's main refresh loop **every third tick (~12 s)** in [`dashboard/js/refresh.js:121`](../dashboard/js/refresh.js#L121). There is no separate sweep-status polling timer and no auto-stop — the in-flight badge keeps refreshing for as long as the user has the dashboard open and the sweep is running. Sweeps can take many minutes for a large KB; the persistent `engine/kb-sweep-state.json` flag plus the indefinite refresh loop are what let the indicator survive dashboard restarts and arbitrarily long sweeps.
+## Result Summary
+`runKbSweep()` returns (and persists in `engine/kb-swept.json`):
+```js
+{
+  ok: true,
+  entriesBefore, entriesAfter,
+  bytesBefore,  bytesAfter,
+  hashDuplicatesArchived,  // Pass 1
+  llmDuplicatesArchived,   // Pass 2 (within-batch dupes)
+  staleRemoved,            // Pass 2 (LLM-flagged removals)
+  reclassified,            // Pass 2 (category moves)
+  rewritten,               // Pass 3
+  rewriteBytesBefore, rewriteBytesAfter,
+  sweptArchivePruned,      // Files purged from _swept/ (>30 days)
+  durationMs,
+  summary: '<n> hash-dup, <n> llm-dup, <n> stale, <n> reclassified, <n> rewritten (<bytes> saved)',
+}
+```
+After a non-dry-run sweep, `queries.invalidateKnowledgeBaseCache()` is called so the next `/api/knowledge` read sees the new state (source: [`engine/kb-sweep.js:408`](../engine/kb-sweep.js#L408)).
+## What NOT to Do
+> **Never write to `knowledge/` directly.** The sweep is the only writer. Every other code path — agents, dashboard handlers, scripts — should write findings to `notes/inbox/<agent>-<id>-<date>.md` and let the consolidation engine classify them into `knowledge/<category>/`.
+Why this matters:
+- Direct writes bypass the hash-dedup, classification, and `_swept` rewrite passes, so future sweeps see a "fresh" entry and may immediately rewrite or archive it.
+- Editing an entry resets its `mtime`, which makes the `_swept` flag stale (re-rewrite on next sweep) and silently changes the entry the next consolidation pass diffs against.
+- Manual edits collide with the LLM rewrite pass — your edit can be replaced wholesale by the next sweep.
+- Per the team-wide rule injected into every dispatched playbook: **"Never delete, move, or overwrite files in `knowledge/`."**
+If you spot an incorrect KB entry, write a note to `notes/inbox/` describing the issue. The next sweep will reclassify, merge, or remove it; humans can also pin entries (via `pinned.md` or the dashboard) to keep them out of the sweep entirely.
+The archive directory `knowledge/_swept/` is also off-limits to manual edits — touching it interferes with the 30-day retention timestamp.
+## Related Files
+- [`engine/kb-sweep.js`](../engine/kb-sweep.js) — implementation
+- [`engine/queries.js`](../engine/queries.js) — `getKnowledgeBaseEntries()`, `invalidateKnowledgeBaseCache()`
+- [`dashboard.js`](../dashboard.js) — `handleKnowledgeSweep`, `handleKnowledgeSweepStatus`, `handleKnowledgeList`
+- [`dashboard/js/render-kb.js`](../dashboard/js/render-kb.js) — UI trigger and indicator
+- [`dashboard/js/refresh.js`](../dashboard/js/refresh.js) — refresh cadence (every 3rd tick)
+- [`engine/shared.js`](../engine/shared.js) — `KB_CATEGORIES`, `getPinnedItems()`, `uniquePath()`

package/docs/onboarding.md ADDED Viewed

@@ -0,0 +1,246 @@
+# Minions Onboarding — Your First 30 Minutes
+A fast, hands-on walkthrough for new contributors. Follow the eight steps below in order. Each step is independent enough that you can stop, take a break, and resume without losing state.
+> **Prerequisites:** Node.js 18+, Git, and a working Claude Code CLI (`npm install -g @anthropic-ai/claude-code`) authenticated against your Anthropic API key or Claude Max subscription. See the top-level [README.md](../README.md#prerequisites) for the full list.
+> **Screenshots:** This walkthrough is intentionally text-only on the first pass. If you want annotated dashboard screenshots, run the [`capture-demos`](../README.md#dashboard) skill after Step 5; it drives Playwright over the running dashboard and writes images you can drop alongside each section.
+---
+## Step 1 — Clone the repo
+You only need to clone if you intend to modify Minions itself. End users normally `npm install -g @yemi33/minions` instead, but a checkout is the right starting point for contributors.
+```bash
+git clone https://github.com/yemi33/minions.git ~/minions-dev
+cd ~/minions-dev
+npm install     # installs dev tooling (Playwright); engine itself has zero deps
+```
+You should now have:
+- `engine.js` — the orchestrator entry point
+- `dashboard.js` — the HTTP/UI server (binds `:7331`)
+- `minions.js` — the CLI shim that maps `minions <cmd>` to the right script
+- `playbooks/`, `agents/`, `prompts/`, `docs/` — content the engine renders into agent prompts
+- `test/` — `unit.test.js`, the integration suite, and Playwright specs
+If you cloned to somewhere other than `~/.minions`, the CLI will still work — `minions init` writes runtime state under `~/.minions/` regardless of where the source lives. The `node` scripts under your checkout will use the checkout as the runtime root when invoked directly (e.g. `node ./engine.js start`).
+---
+## Step 2 — Run `minions doctor`
+`minions doctor` runs the same preflight checks the engine runs at startup. It is safe to run any time and never mutates state.
+```bash
+minions doctor
+```
+A healthy Mac/Linux run looks roughly like:
+```
+  Runtime:
+    ✓ node: v20.11.1 (>= 18 required)
+    ✓ git: git version 2.45.0
+    ✓ claude CLI: 1.0.x found on PATH
+  Authentication:
+    ✓ ANTHROPIC_API_KEY: set
+    ✓ gh auth: logged in as <you>
+  Filesystem:
+    ✓ ~/.minions writable
+    ✓ engine/ writable
+  All checks passed.
+```
+What to look for:
+- `✓` (green) means OK. `!` is a warning the engine can usually work around. `✗` (critical) will block dispatch — fix it before continuing.
+- The most common failure is `claude CLI: not found` — re-install with `npm install -g @anthropic-ai/claude-code` and re-run.
+- On Windows, runtime resolution prefers native `claude.exe` over the POSIX bash shim; if you see runtime-resolution warnings, `gh auth status` and `where.exe claude` are good first stops.
+If `doctor` reports any `✗`, stop and resolve it. The remaining steps assume a clean preflight.
+---
+## Step 3 — Bootstrap state with `minions init`
+`minions init` scaffolds `~/.minions/` with a default config, five agent charters, default playbooks, an empty knowledge base, and the engine state files.
+```bash
+minions init
+```
+After init you will have (under `~/.minions/`):
+- `config.json` — projects, agents, schedules
+- `agents/<name>/charter.md` — the personality + boundaries for each of the five default agents (`ripley`, `dallas`, `lambert`, `rebecca`, `ralph`)
+- `engine/dispatch.json`, `engine/control.json`, `engine/metrics.json` — empty queues, ready for first dispatch
+- `pinned.md`, `notes.md` — empty team-memory surfaces
+- `playbooks/`, `prompts/`, `routing.md` — agent prompt scaffolding
+`init` is safe to re-run; existing customizations to `config.json`, charters, notes, knowledge, and routing are preserved. To force overwrites after a major version bump, use `minions init --force`.
+---
+## Step 4 — Link your first project with `minions add`
+A "project" tells Minions which repo to discover work for and where to spawn worktrees. The CLI auto-detects host (GitHub/ADO), org, repo name, and main branch from the directory's git remote.
+```bash
+minions add ~/code/your-repo
+```
+You will be prompted to confirm or override the auto-detected fields, plus a one-line description. The description is **important** — agents read it during routing to decide whether a work item belongs to your repo. Keep it specific (e.g. "Internal billing API and webhook ingestion service") rather than generic ("Backend code").
+After `minions add` returns, `~/.minions/config.json` has a new `projects[]` entry, and `~/.minions/projects/<name>/` holds the per-project `work-items.json` + `pull-requests.json` trackers.
+To link several at once interactively, run `minions scan` (default: scans `~` to depth 3 for git repos and lets you multi-select).
+---
+## Step 5 — Start engine + dashboard with `minions restart`
+`minions restart` starts (or restarts) both the engine daemon and the dashboard server. Use `restart` rather than `start` whenever you have edited engine code or playbooks — it ensures the daemon picks up your changes.
+```bash
+minions restart
+```
+You should see logs like `Engine started (pid=…)` and `Dashboard listening on http://localhost:7331`. Open the dashboard:
+```bash
+minions dash    # opens http://localhost:7331 in your default browser
+```
+A quick tour of the panels you will use most often during onboarding:
+| Panel | What it shows | When to use it |
+|---|---|---|
+| **Projects bar** (top) | All linked projects with descriptions | Confirm Step 4 worked |
+| **Minions Members** | Five agent cards with status, current task, last result | Watch dispatches happen in real time |
+| **Live Output** (per agent) | Streaming `live-output.log` | Tail an agent while it runs — refreshes every 3 s |
+| **Command Center** (CC button, top-right) | Conversational chat with Claude over your full Minions state | Ask questions, queue work, draft plans |
+| **Work Items** | Paginated table of all work, filterable by project/status | Inspect / retry / archive |
+| **Plans & PRD** | Plan markdown drafts and approved PRDs | Approve, discuss-and-revise, or reject plans |
+| **Pull Requests** | Cached PR status (build, review, merge) | Track the PRs your agents create |
+| **Engine** | Dispatch queue, engine log, LLM call performance | Diagnose timing / token issues |
+The engine ticks every 60 seconds. If the dashboard says "Engine: stopped", run `minions restart` again — the dashboard auto-restarts a dead engine on its next health check, but a manual restart is faster.
+---
+## Step 6 — Your first dispatch via the Command Center
+The fastest way to give Minions a task is the Command Center (CC).
+1. Click the **CC** button in the top-right of the dashboard.
+2. In the slide-out chat, type a one-line description, e.g.:
+   ```
+   Add a brief comment to README.md explaining what minions doctor does
+   ```
+3. CC will propose an action (usually `POST /api/work-items` or `POST /api/plan`). Confirm.
+Within one tick (≤60 s), the engine will:
+1. Pick an idle agent that matches the work type (per `routing.md`).
+2. Create a git worktree for that agent under `<your-repo>/.worktrees/<work-item-id>/`.
+3. Render the appropriate playbook (`playbooks/implement.md` for an implement task) with your project context, charter, pinned notes, and recent team notes.
+4. Spawn `engine/spawn-agent.js`, which invokes the Claude CLI with the rendered prompt.
+Watch the **Minions Members** card for that agent flip to "working", and click into the **Live Output** tab to tail the streaming agent log.
+You can do the same thing from the CLI: `minions work "Add a brief comment to README.md…"`.
+---
+## Step 7 — Your first plan via `/plan`
+For multi-step work, use a plan instead of a one-shot work item. In the Command Center, type:
+```
+/plan Audit our README and propose three concrete improvements with examples
+```
+CC will queue a `plan` work item. The plan agent writes a markdown draft to `~/.minions/plans/<project>-<date>.md`, then chains automatically to a `plan-to-prd` agent that produces a structured PRD JSON in `~/.minions/prd/`. The PRD lands with `status: "awaiting-approval"`.
+Open **Plans & PRD** in the dashboard. You will see the new plan with three buttons:
+- **Approve** — materializes one work item per PRD entry, with `depends_on` honored.
+- **Discuss & Revise** — opens a doc-chat modal where you can iterate on the plan with Claude.
+- **Reject** — closes the plan with a reason.
+After approving, agents start picking up the materialized work items on the next tick. When all PRD items complete, the engine auto-dispatches a `verify` task that builds + tests the changes and writes a manual testing guide. Archiving (after verify) is a manual dashboard action — see [docs/plan-lifecycle.md](plan-lifecycle.md) for the full pipeline.
+---
+## Step 8 — Your first PR review
+When an `implement` agent finishes a work item, it pushes a branch and opens a PR. The engine polls PR status every `prPollStatusEvery` ticks (≈12 minutes by default) and dispatches a `review` agent automatically.
+To watch the cycle end-to-end on your first task:
+1. Open **Pull Requests** in the dashboard. Find the PR your agent just opened.
+2. Wait for the next status poll (or run `minions dispatch` to wake the daemon immediately). The PR's `reviewStatus` will flip from `pending` to `waiting`, then a review agent will spawn.
+3. Click the agent card to watch the live review. The reviewer reads the diff, runs targeted checks, then posts a comment that starts with `VERDICT: APPROVE` or `VERDICT: REQUEST_CHANGES`.
+4. If the verdict is `REQUEST_CHANGES`, the engine queues a `fix` task for the original author with a feedback note. The cycle repeats until approved or capped by the eval-loop config.
+You can also trigger reviews manually by commenting on the PR — see [docs/pr-review-fix-loop.md](pr-review-fix-loop.md).
+> **Note on self-approval:** Minions agents share a single GitHub identity, so `gh pr review --approve` is blocked by GitHub's self-approval rule. Reviewers post a `VERDICT:` **comment** instead, and the engine treats that comment as authoritative.
+---
+## Debugging — Where to look when something goes wrong
+When an agent gets stuck, hangs, or produces unexpected output, these files are your first stops. They live under `~/.minions/` (or wherever your runtime root is).
+| File | What it tells you |
+|---|---|
+| `agents/<id>/live-output.log` | Real-time streaming output from the agent's Claude CLI session — same content the dashboard's Live Output tab shows. Tail this with `tail -f ~/.minions/agents/<id>/live-output.log`. |
+| `agents/<id>/last-prompt.txt` | The most recent rendered prompt the agent received (playbook + system prompt + injected context). Inspect this to see exactly what the agent was asked to do, including pinned notes, team notes, charter, and dispatch context. If your install does not surface this file yet, the live equivalents are `engine/tmp/prompt-<dispatch-id>.md` (while the agent is running) and `engine/contexts/<dispatch-id>.md` (when the prompt was sidecarred because it exceeded `maxDispatchPromptBytes`). |
+| `agents/<id>/output.log` | The agent's final output after completion (post-mortem read; `live-output.log` is the during-run read). |
+| `agents/<id>/history.md` | Last 20 tasks for that agent with timestamps, results, projects, and branches. Useful when a single agent is misbehaving over multiple tasks. |
+| `engine/log.json` | Audit trail of engine actions (last 2500 entries, rotated to 2000). Filter with `jq` for a specific agent or work item. |
+| `engine/dispatch.json` | Pending / active / completed dispatch queue. If an agent appears idle on the dashboard but the queue says `active`, the engine likely lost process tracking — try `minions restart`. |
+| `engine/metrics.json` | Per-agent token usage, runtime, error counts. Useful for spotting agents that are silently retrying. |
+| `notes/inbox/` | Raw findings agents drop after each task. Look here for the latest learnings before the consolidator merges them into `notes.md`. |
+| `pinned.md` and `notes.md` | Team memory that gets injected into every agent prompt. If an agent keeps making the same mistake, pin a correction here. |
+Other quick debugging knobs:
+- `minions status` — text snapshot of agents, projects, dispatch queue, and quality metrics.
+- `minions queue` — focused view of the dispatch queue (pending, active, completed).
+- `minions kill` — kills all active agents and resets their dispatches to `pending`. Use when an agent is wedged.
+- `minions cleanup` — manually run the temp-file / worktree / zombie sweep that normally runs every 10 ticks.
+For deeper dives: [docs/engine-restart.md](engine-restart.md), [docs/auto-discovery.md](auto-discovery.md), and [docs/self-improvement.md](self-improvement.md).
+---
+## What's next
+You have:
+- Linked a project, started the engine, opened the dashboard.
+- Dispatched a one-shot task via Command Center.
+- Authored a plan, approved it, and watched the PRD materialize into work items.
+- Watched an agent open and review a PR.
+- Located the files you need to debug a stuck agent.
+From here, sensible next reads are:
+- [README.md](../README.md) — the full feature catalog and CLI reference.
+- [CLAUDE.md](../CLAUDE.md) — engineering conventions, tick cycle, locking rules. Read this before editing engine code.
+- [docs/plan-lifecycle.md](plan-lifecycle.md) — plan → PRD → implement → verify → archive in detail.
+- [docs/command-center.md](command-center.md) — full CC capabilities, including doc-chat and plan steering.
+- [docs/pr-review-fix-loop.md](pr-review-fix-loop.md) — how the review/fix cycle is bounded and steered.
+Welcome to the team.

package/docs/runtime-adapters.md ADDED Viewed

@@ -0,0 +1,213 @@
+# Runtime Adapters
+How Minions abstracts over CLI runtimes (Claude Code, GitHub Copilot CLI, future
+adapters). Engine code never special-cases a runtime by name — every CLI-specific
+behavior is hidden behind an adapter object resolved through `resolveRuntime()`.
+## What lives in `engine/runtimes/`
+| File | Role |
+|------|------|
+| `engine/runtimes/index.js` | Adapter registry. `resolveRuntime(name)`, `listRuntimes()`, `registerRuntime()`. Engine code MUST go through these. |
+| `engine/runtimes/claude.js` | Claude Code adapter. Owns binary probe, `--system-prompt-file`, JSONL parser, model shorthands, budget cap, bare mode. |
+| `engine/runtimes/copilot.js` | GitHub Copilot CLI adapter. Owns standalone-vs-`gh-copilot` resolution, stdin-only prompt delivery, `https://api.githubcopilot.com/models` discovery, effort `'max' → 'xhigh'` mapping. |
+`resolveRuntime(name)` throws when `name` is unknown so misconfigurations surface
+at dispatch time instead of producing silent fallbacks deep inside spawn logic.
+The default name is `'claude'` (matches `engine.defaultCli` fallback).
+## Adapter Interface
+Every adapter exports the following surface. The Claude adapter is the canonical
+implementation — copy it as the template for new runtimes and only override the
+methods that genuinely differ.
+| Member | Type / Returns | Purpose |
+|--------|----------------|---------|
+| `name` | string | Adapter identifier (matches the registry key). |
+| `capabilities` | flag object (see below) | Feature flags consumed by engine code. |
+| `resolveBinary({ env, config? })` | `{ bin, native, leadingArgs }` or null | Probe PATH, npm-globals, package overrides; cached to `engine/<name>-caps.json`. `leadingArgs` is `[]` for direct binaries and `['copilot']` for the `gh copilot` extension fallback. |
+| `capsFile` | absolute path | Cache file for `resolveBinary()` results. |
+| `installHint` | string | Surfaced when `resolveBinary()` returns null. Consumed by `engine/preflight.js` and `engine/spawn-agent.js`. |
+| `listModels()` | `Promise<{id,name,provider}[] | null>` | Model catalog or null when discovery isn't supported. |
+| `modelsCache` | absolute path | Cache file for `listModels()` results. |
+| `spawnScript` | absolute path | Path to the runtime-agnostic spawn wrapper (currently `engine/spawn-agent.js` for both runtimes). |
+| `buildArgs(opts)` | `string[]` | CLI args excluding the binary itself. |
+| `buildSpawnFlags(opts)` | `string[]` | Optional; engine-side flags consumed by `engine/spawn-agent.js` (overrides the default `_buildAgentSpawnFlags`). |
+| `buildPrompt(promptText, sysPromptText)` | string | Final prompt delivered to the CLI. Adapters that lack `--system-prompt-file` (Copilot) prepend a `<system>` block here. |
+| `getUserAssetDirs({ homeDir })` | `string[]` | Runtime-native global asset roots passed to spawn as `--add-dir` so worktrees still see them. |
+| `getSkillRoots({ homeDir, project? })` | `[{ scope, dir, projectName? }]` | Where `collectSkillFiles` looks for native + project skill markdown. |
+| `getSkillWriteTargets({ homeDir, project? })` | `{ personal, project }` | Where `extractSkillsFromOutput` writes auto-extracted skills. |
+| `getResumeSessionId(...)` | string or null | Pre-spawn resume probe; runtime-agnostic stamp prevents cross-runtime session-ID confusion. |
+| `saveSession(...)` | void | Writes `agents/<id>/session.json` after a successful turn. |
+| `resolveModel(input)` | string or undefined | Shorthand expansion / passthrough. Returning undefined causes the adapter to omit `--model` and the CLI uses its own default. |
+| `modelLooksFamiliar(model)` | boolean | Heuristic powering the preflight "stale model after CLI switch" warning. |
+| `parseOutput(raw)` | `{ text, usage, sessionId, model }` | Final-event parser. |
+| `parseStreamChunk(line)` | event object or null | Single JSONL line → typed event. |
+| `parseError(rawOutput)` | `{ message, code, retriable }` | Codes: `auth-failure`, `context-limit`, `budget-exceeded`, `crash`, null. |
+| `createStreamConsumer(ctx)` | consumer object | Stream accumulator used by `engine/llm.js`. |
+| `detectPermissionGate`, `getPromptDeliveryMode`, `usesSystemPromptFile`, `classifyFailure` | misc | Adapter-owned policy that engine code reads through accessors instead of branching on `runtime.name`. |
+The contract surface is documented inline at the top of `engine/runtimes/claude.js`
+(see the JSDoc block at the start of that file). Keep that block authoritative —
+this table is the human-readable mirror.
+## Capability Flags
+Engine code branches on flags, never on runtime names. Add a flag whenever a
+real behavioral split appears between adapters.
+| Flag | Claude | Copilot | Gates |
+|------|--------|---------|-------|
+| `streaming` | ✓ | ✓ | JSONL events on stdout. |
+| `sessionResume` | ✓ | ✓ | `--resume <id>`. |
+| `midRunSessionId` | ✓ | ✗ | Resumable session ID emitted before the terminal `result` event; when false, steering waits for a checkpoint. |
+| `systemPromptFile` | ✓ | ✗ | sysprompt via `--system-prompt-file` (else inlined into stdin by `buildPrompt`). |
+| `effortLevels` | ✓ | ✓ | `--effort low\|medium\|high\|xhigh`. Copilot maps `'max' → 'xhigh'` inside `resolveModel`/`buildArgs`; Claude leaves `'max'` alone. |
+| `costTracking` | ✓ | ✗ | USD + token counts in the result event. Copilot only emits `premiumRequests`. |
+| `modelShorthands` | ✓ | ✗ | Bare `sonnet`/`opus`/`haiku` accepted. Copilot expects full model IDs (`claude-sonnet-4.5`, `gpt-5.4`). |
+| `modelDiscovery` | ✗ | ✓ | `listModels()` returns a real catalog (Copilot reads `https://api.githubcopilot.com/models`). |
+| `promptViaArg` | ✗ | ✗ | If true, prompt goes via `--prompt <text>` instead of stdin. False on both because Windows ARG_MAX (~32 KB) breaks `-p "<long-prompt>"` outright. |
+| `budgetCap` | ✓ | ✗ | `--max-budget-usd <n>`. |
+| `bareMode` | ✓ | ✗ | `--bare` (suppresses CLAUDE.md auto-discovery). Closest Copilot equivalent is `--no-custom-instructions`, gated separately. |
+| `fallbackModel` | ✓ | ✗ | `--fallback-model <id>` on rate-limit. |
+| `sessionPersistenceControl` | ✓ | ✗ | Engine writes `session.json`. Copilot manages session state internally in `~/.copilot/session-state/`. |
+| `resumePromptCarryover` | ✗ | ✓ | CC resume turns prepend recent visible Q&A in stdin because Copilot's session store is opaque to Minions. |
+| `resumeBookkeepingTurn` | ✓ | — | Claude CLI injects a synthetic "Continue from where you left off." meta turn on `--resume`; CC prompts must tell the model not to treat it as user intent. |
+| `streamConsumer` | ✓ | ✓ | Adapter implements `createStreamConsumer(ctx)` — required by `engine/llm.js` accumulator. |
+When a behavior is genuinely uniform across all current adapters, it still gets
+a flag if a future runtime might disagree — flags are cheap.
+## CC vs Agent Runtime Independence
+Command Center (CC) / doc-chat and the agent fleet are two **independent**
+runtime paths. Setting `engine.ccCli: copilot` for CC alone must NOT switch the
+agent fleet to Copilot, and vice versa. Both paths fall through to
+`engine.defaultCli` (the fleet-wide default), but they never see each other's
+overrides.
+Six helpers in `engine/shared.js` are the single source of truth for runtime/model
+resolution. Engine code MUST go through them instead of reading config keys
+directly.
+| Helper | Chain |
+|--------|-------|
+| `resolveAgentCli(agent, engine)` | `agent.cli` → `engine.defaultCli` → `'claude'` |
+| `resolveCcCli(engine)` | `engine.ccCli` → `engine.defaultCli` → `'claude'` |
+| `resolveAgentModel(agent, engine)` | `agent.model` → `engine.defaultModel` → undefined |
+| `resolveCcModel(engine)` | `engine.ccModel` → `engine.defaultModel` → undefined |
+| `resolveAgentMaxBudget(agent, engine)` | `agent.maxBudgetUsd` → `engine.maxBudgetUsd`. Honors literal `0`. |
+| `resolveAgentBareMode(agent, engine)` | `agent.bareMode` → `engine.claudeBareMode` → false. Strict null check so per-agent `false` overrides engine `true`. |
+Agent dispatch resolves the runtime once at spawn time:
+```js
+// engine.js spawnAgent (~line 1158)
+const runtime = resolveRuntime(shared.resolveAgentCli(agentConfig, engineConfig));
+```
+CC / doc-chat resolves the runtime via `engine/llm.js` `_resolveRuntimeFor()`:
+```js
+// engine/llm.js (~line 247)
+if (!runtimeName && callOpts.engineConfig) runtimeName = resolveCcCli(callOpts.engineConfig);
+```
+Tests in `test/unit/runtime-fleet-helpers.test.js` enforce that
+`resolveAgentCli` does NOT fall through to `engine.ccCli` and that `resolveCcCli`
+does NOT inspect any agent settings.
+## `--cli` / `--model` / `--effort` Knobs
+Three CLI flags switch the fleet without dashboard interaction. They forward
+through `engine/cli.js` `parseRuntimeFleetFlags()` and persist into `config.json`.
+| Flag | Persists to | Notes |
+|------|-------------|-------|
+| `--cli <runtime>` | `engine.defaultCli` | Validated against the runtime registry; unknown runtimes exit non-zero with the registered list. Implicitly clears any explicit `engine.ccCli` (with a warning naming the prior value). |
+| `--model <id>` | `engine.defaultModel` | Empty-string sentinel (`--model ''`) **deletes** `engine.defaultModel` — never pin to a literal empty string. Compatibility with `defaultCli` is checked; mismatch emits an incompatibility warning suggesting `--model ''`. |
+| `--effort <level>` | dispatched as `opts.effort` | One of `low|medium|high|xhigh`. Copilot maps `'max' → 'xhigh'`; Claude passes `'max'` through unchanged. Capability-gated by `effortLevels`. |
+Examples:
+```bash
+minions start --cli copilot --model claude-sonnet-4.5
+minions restart --cli claude --model ''     # '' DELETES defaultModel
+minions config set-cli copilot --model gpt-5.4
+```
+Inside the adapters, `buildArgs(opts)` translates the resolved values into the
+CLI-specific argv:
+```js
+// engine/runtimes/copilot.js buildArgs
+if (model) args.push('--model', String(model));
+if (mappedEffort) args.push('--effort', mappedEffort);
+```
+When `resolveAgentModel` returns undefined the adapter omits `--model` from
+`buildArgs` entirely and the underlying CLI uses its own default.
+## Per-Runtime Skill Roots
+Skills live in runtime-native directories so a skill placed there is genuinely
+visible to that runtime when invoked outside Minions. The adapter's
+`getSkillRoots()` and `getSkillWriteTargets()` are the discovery and write
+contracts.
+| Adapter | Personal skill root | Project skill roots | Auto-extraction write targets |
+|---------|---------------------|---------------------|-------------------------------|
+| Claude  | `~/.claude/skills`  | `<repo>/.claude/skills` | personal: `~/.claude/skills`; project: `<repo>/.claude/skills` |
+| Copilot | `~/.copilot/skills` (+ `~/.agents/skills`) | `<repo>/.github/skills`, `<repo>/.agents/skills` | personal: `~/.copilot/skills`; project: `<repo>/.github/skills` |
+`getUserAssetDirs()` returns the union of runtime-native global roots and is
+passed to spawn as `--add-dir` so spawned agents read the same global assets the
+user sees outside Minions. The cross-runtime portable location
+`~/.agents/skills` is included by every adapter that opts into it (Copilot today;
+add it to new adapters when the runtime can read directly from there).
+## The "No Runtime-Name Branching" Rule
+The whole point of this layer: **engine code MUST gate behavior on
+`runtime.capabilities.*` flags, not on `runtime.name === 'claude'` comparisons.**
+If you find yourself wanting to write `if (runtime.name === 'copilot')` in
+engine code, the correct response is one of:
+1. Add a capability flag to the contract and set it on every adapter.
+2. Add a method to the adapter and call it through the runtime object.
+3. Move the special case into the adapter's `buildArgs` / `buildPrompt` /
+   `parseOutput` where it belongs.
+Tests in `test/unit/runtime-adapters.test.js` and `test/unit/copilot-adapter.test.js`
+enforce the contract; the source-inspection tests in `test/unit.test.js` look
+for `resolveRuntime(...)` call sites and reject runtime-name comparisons in new
+code.
+## Adding a New Runtime
+1. Create `engine/runtimes/<name>.js` implementing the full adapter contract.
+   Copy `claude.js` as the template and override only the members that differ.
+   Set a useful `installHint`.
+2. Register it in `engine/runtimes/index.js`:
+   ```js
+   registry.set('<name>', require('./<name>'));
+   ```
+3. Add unit coverage modeled on `test/unit/runtime-adapters.test.js`.
+Once registered, the dashboard `/api/runtimes` endpoint, the `--cli` flag
+validator, the agent CLI dropdown, the `engine/preflight.js` per-runtime binary
+check, and the model discovery cache all light up automatically. Use capability
+flags — never special-case in engine code.
+## See Also
+- `CLAUDE.md` → "Runtime Adapters" section — short reference table that mirrors
+  this doc and is enforced by `test/unit.test.js`.
+- `docs/copilot-cli-schema.md` — empirical Copilot CLI behavior captured during
+  the spike that produced the Copilot adapter; cite it when changing
+  Copilot-specific capability flags.
+- `engine/runtimes/claude.js` JSDoc header — authoritative adapter contract.
+- `engine/shared.js` `resolveAgentCli` / `resolveCcCli` etc. — the only
+  supported way to read runtime / model selection from config.