npm - pi-messenger - Versions diffs - 0.10.0 → 0.12.0 - Mend

pi-messenger 0.10.0 → 0.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (47) hide show

package/CHANGELOG.md +86 -0
package/README.md +102 -16
package/config.ts +6 -0
package/crew/agents/crew-plan-sync.md +1 -1
package/crew/agents/crew-planner.md +18 -2
package/crew/agents/crew-reviewer.md +1 -1
package/crew/agents/crew-worker.md +23 -1
package/crew/agents.ts +220 -44
package/crew/handlers/coordination.ts +275 -0
package/crew/handlers/plan.ts +328 -42
package/crew/handlers/review.ts +18 -13
package/crew/handlers/revise.ts +374 -0
package/crew/handlers/status.ts +91 -57
package/crew/handlers/sync.ts +6 -6
package/crew/handlers/task.ts +214 -92
package/crew/handlers/work.ts +171 -132
package/crew/index.ts +26 -49
package/crew/live-progress.ts +58 -0
package/crew/lobby.ts +470 -0
package/crew/prompt.ts +112 -0
package/crew/registry.ts +114 -0
package/crew/spawn.ts +103 -0
package/crew/state-autonomous.ts +134 -0
package/crew/state-planning.ts +263 -0
package/crew/state.ts +14 -82
package/crew/store.ts +154 -21
package/crew/task-actions.ts +123 -0
package/crew/types.ts +17 -2
package/crew/utils/config.ts +66 -14
package/crew/utils/discover.ts +26 -65
package/crew/utils/install.ts +39 -333
package/crew/utils/progress.ts +18 -5
package/feed.ts +88 -23
package/handlers.ts +66 -34
package/index.ts +350 -160
package/install.mjs +14 -63
package/overlay-actions.ts +511 -0
package/overlay-render.ts +650 -0
package/overlay.ts +614 -537
package/package.json +11 -4
package/skills/pi-messenger-crew/SKILL.md +194 -24
package/store.ts +17 -8
package/vitest.config.ts +10 -0
package/ARCHITECTURE.md +0 -220
package/crew/agents/crew-interview-generator.md +0 -79
package/crew/handlers/interview.ts +0 -211
package/crew-overlay.ts +0 -233

package/CHANGELOG.md CHANGED Viewed

@@ -1,5 +1,91 @@
 # Changelog
+## [0.12.0] - 2026-02-21
+### Added
+- **Thinking level support** - Agents support `thinking: <level>` in frontmatter (off, minimal, low, medium, high, xhigh). Config `thinking.<role>` overrides per-agent frontmatter. Applied via `--thinking` flag to spawned processes.
+- **Advisory dependencies** - Dependencies are now informational context, not scheduling blockers. All `todo` tasks are eligible for assignment regardless of dependency status. Workers see dependency completion state (done/in-progress/not started) in their prompt and coordinate via reservations and DMs. Configurable via `dependencies` config (`"advisory"` default, `"strict"` for blocking mode). Transitive dependencies pruned from plans automatically.
+- **Auto-work on plan completion** - Planning automatically starts autonomous work when tasks are created (default behavior). Pass `autoWork: false` to review the plan first. A steer message triggers the LLM to call `work { autonomous: true }` on its next turn.
+- **Continuous worker refill** - Overlay automatically spawns replacement workers when tasks complete, maintaining concurrency up to the configured limit throughout execution.
+- **Prompt-based planning** - `plan` action accepts an inline `prompt` parameter as the spec when no PRD file is available. The planner breaks down arbitrary requests into parallel tasks the same way it handles PRD files.
+- **Live feed with flash notifications** - Feed tracks last-seen timestamp and highlights new events. Significant events (task completions, messages, plan changes) trigger flash notifications in the status bar.
+- **Task revision** - `[p]` revises a single task's spec via the planner. `[P]` revises an entire subtree (target + transitive dependents), preserving done tasks and resetting revisable ones to `todo`. Re-plan with steering prompt injects user guidance into the planning-progress.md Notes section.
+- **Chat UX redesign** - Global `[m]` key activates chat input from any overlay state (vim-style command mode). Plain text broadcasts to all peers; `@name message` sends DMs. Tab autocompletes agent names after `@` (Shift+Tab cycles backwards). Feed gets 60/40 space allocation when workers active. `[f]` toggles feed-focus mode (tasks compress to 2-line summary). Visual improvements: system events dimmed, colored agent names, separator dots between groups.
+- **Auto-dismiss on completion** - Overlay auto-closes with snapshot handover 3 seconds after all tasks complete. Only triggers when the overlay witnessed tasks in progress. Any keypress cancels.
+- **Worker coordination & chatter** - Workers receive environmental context (concurrent tasks, dependency summaries, recent feed, ready tasks) and coordination instructions. Four configurable levels (`none`|`minimal`|`moderate`|`chatty`, default: `chatty`) control verbosity. At `chatty`, workers DM peers and broadcast announcements, creating a chat-room effect in the overlay. New `coordination` config field.
+- **Lobby workers** - `+` spawns workers: assigns ready tasks directly when available, otherwise pre-spawns idle lobby workers that join the mesh and chat. Lobby workers receive task assignments via steer message when tasks become available. Token budgets per coordination level cap spending while idle.
+- **Auto-spawn on plan completion** - Workers auto-spawn when planning finishes and ready tasks exist. Lobby workers assigned first, then fresh workers up to concurrency limit.
+- **TUI coordination level control** - `[v]` cycles coordination level at runtime (`chatty` → `none` → `minimal` → `moderate`).
+- **Configurable `concurrency.max`** - Crew config field (default: 10) lets the user lower the worker ceiling. `+`/`-` keys respect the configured max.
+- **Stuck task UX** - `[q]` Stop works on any `in_progress` task, not just those with live workers. `[s]` Start rejects when the agent is already busy. Detail view shows "Worker not running" warning for orphaned tasks.
+- **Chatter mitigation** - Per-worker outgoing message rate limiting via `messageBudgets` config. Coordination level and per-worker token count displayed in overlay header.
+- **Broadcast filtering** - Worker broadcasts now go to the feed only (no inbox delivery, no LLM turn interrupts). Eliminates O(N^2) token cascades between workers. DMs, user broadcasts, and planner broadcasts still deliver to inboxes in real-time. Detection via `PI_CREW_WORKER` env var set at spawn time. Worker name consistency also fixed in `agents.ts` via `PI_AGENT_NAME`.
+- **Worker exit feed notifications** - Lobby workers that exit without a task assignment now appear as `leave` events in the feed.
+- **2026-02-12 Overlay redesign** - Replaced tabbed `/messenger` UI with a unified crew dashboard layout (status, workers, tasks, feed, agents row, legend), added `Ctrl+T` snapshot transfer, `Ctrl+B` background/reattach support, empty-state system diagnostics, and inline `@all` / `@name` messaging in the overlay input flow.
+- **Task progress system** - Added per-task append-only progress logs (`tasks/task-N.progress.md`) with `task.progress`, system auto-entries (assignment, review verdicts, shutdown resets, crash/failure outcomes), and retry-context injection in worker prompts.
+- **Interactive Crew task manager** - Crew overlay now supports list/detail modes, task actions (reset/cascade-reset/unblock/start/block/delete), in-overlay confirmations, block-reason input, live worker messaging, context-aware key hints, and transient success/failure notifications.
+- **Live worker concurrency adjustment** - In the Crew overlay, `+`/`-` now adjusts worker concurrency live during execution (clamped to 1-10).
+- **Auto-open overlay on autonomous work** - The Crew overlay now opens automatically when autonomous mode starts, showing live worker progress without requiring `/messenger`. Configurable via `autoOverlay` (default: `true`). Escape dismisses; won't reopen until a new autonomous session.
+- **Auto-open overlay on planning work** - The Crew overlay now opens automatically when planning starts and when an in-progress planning run is restored on session start/switch/fork/tree. Configurable via `autoOverlayPlanning` (default: `true`). Uses per-run planning IDs to avoid reopen loops within the same run after dismissal.
+- **Task splitting** - New `task.split` action supports an inspect phase (returns spec/progress/deps/dependents) and an execute phase that creates subtasks, rewires downstream dependencies, converts the parent to a milestone, and auto-completes milestones when all subtasks are done. Milestones can't be started manually and are never dispatched to workers.
+- **Planning observability state** - Added first-class planning run state with pass/phase tracking and persistence at `.pi/messenger/crew/planning-state.json`. Status, Crew UI, and feed now expose planning progress (`plan.start`, `plan.pass.*`, `plan.review.*`, `plan.done`, `plan.failed`) so long planning runs are visible by default. Planning runs with no updates for 5 minutes are flagged as stalled in status and Crew UI, with periodic refresh so stalls surface without requiring user interaction.
+- **Planning cwd normalization** - Planning state now canonicalizes project paths (realpath) and compares by canonical path, fixing false negatives when sessions use `/tmp/...` while persisted state resolves to `/private/tmp/...`.
+- **Structured planning outline** - Planner prompts now require ordered sections (PRD understanding, reviewed resources, sequential steps, parallel task graph). Latest structured output is persisted to `.pi/messenger/crew/planning-outline.md` for audit/review.
+- **Lobby worker keep-alive** - Lobby workers now survive the full planning phase via a file-based keep-alive signal. The orchestrator creates a `.alive` file per lobby worker; the worker's `turn_end` hook checks the file and injects a minimal steer message to prevent the session from exiting. File is deleted on task assignment, planning completion, or shutdown. ~25 tokens per keep-alive cycle.
+- **`prompt` tool parameter** - The `prompt` parameter (used by `plan` and `task.revise`/`task.revise-tree`) is now declared in the tool schema so LLMs can discover it without the skill loaded.
+- **Dynamic overlay width** - Overlay adapts to terminal width (clamped 40-100) instead of a fixed size. Fits half-width and quarter-width windows on smaller screens.
+- **Live coordination level display** - Status bar hints now show the actual coordination level (`v:chatty`, `v:none`, etc.) instead of the opaque `v:Coord` label.
+### Changed
+- **Default `planning.maxPasses` reduced to 1** - Single-pass planning is now the default. Multi-pass planning with reviewer feedback is still available by setting `planning.maxPasses` in user or project config.
+- **Structural cleanup** - Extracted `crew/prompt.ts` (worker prompt builder), `crew/spawn.ts` (shared spawn logic), `crew/registry.ts` (unified worker registry replacing separate maps in agents.ts and lobby.ts). Split `crew/state.ts` into `state-autonomous.ts` and `state-planning.ts` with a barrel re-export.
+- **Crew agent locality** - Crew agents are now discovered from extension-local `crew/agents/` plus project overrides in `.pi/messenger/crew/agents/`. Removed auto-install/update copy machinery and converted `crew.install` / `--crew-install` to informational commands.
+### Removed
+- **Legacy key-based routing** - Removed the `join`, `claim`, `unclaim`, `complete`, `swarm`, `list`, `rename`, `reserve`, `release`, and `broadcast` tool parameters and the legacy routing block in `index.ts` that duplicated `crew/index.ts` action-based routing. All callers should use `action`-based syntax (e.g., `{ action: "join" }` instead of `{ join: true }`).
+- **Interview handler** - Removed `crew/handlers/interview.ts`, `crew-interview-generator.md` agent, and the `interview` action. The feature spawned an LLM to generate questions for the user but never worked (got stuck). Agent moved to deprecated list for cleanup on existing installs.
+### Fixed
+- **Overlay rendering corruption from multi-line bash args** - `extractArgsPreview` returned raw bash `command` strings containing newlines (comments, `&&` chains, heredocs). These embedded newlines broke overlay rows, cascading layout corruption through the workers section, task list, and feed. Newlines are now collapsed to spaces at the extraction point.
+- **Fire-and-forget dynamic import in `task-actions.ts`** - `killWorkerByTask` was imported via `import("./registry.js").then(...)` which swallowed errors silently. Replaced with a static import since no circular dependency exists.
+- **Stale legacy syntax in user-facing strings** - Five handler/index strings still referenced removed `{ join: true }` / `{ list: true }` / `{ to: "..." }` syntax. Updated to action-based equivalents.
+- **SIGKILL escalation never fired** - After `proc.kill("SIGTERM")`, Node.js immediately sets `proc.killed = true`. Both `killWorkerByTask` and `runAgent` graceful shutdown checked `!proc.killed` before escalating to SIGKILL, which was always false. Workers ignoring SIGTERM could hang indefinitely. Now checks only `exitCode === null`.
+- **Lobby worker assignment zombie state** - State was mutated before the disk write, so a failed write left the worker permanently marked as assigned but invisible to future assignment. Reordered to write first, mutate on success.
+- **Lobby assignment race in work handler** - If a lobby worker exited between availability check and assignment, the task got stuck as `in_progress` with no worker. Added return value check with rollback.
+- **Autonomous state checks not cwd-scoped** - Auto-dismiss, snapshot generation, and revision guards used a global flag instead of per-project checks, blocking operations for unrelated projects.
+- **Task flicker from lobby worker spawn** - Orchestrator and worker generated independent random names so `assigned_to` never matched. Task was also pre-set to `in_progress` but the worker expected `todo`. Fixed name propagation via env var, made `task.start` idempotent for the assigned agent, and added ownership guards in close handlers.
+- **`[q]` Stop couldn't kill lobby-spawned workers** - Stop only checked one worker registry. Lobby workers are tracked separately and now have their own kill path.
+- **Double-formatted message previews** - Overlay stored pre-formatted previews that the display layer formatted again. Now stores raw text; preview limit raised from 60 to 200 chars.
+- **Stale workersHeight in trim loop** - Worker-line overflow trimming used a pre-computed height, causing over-trim to 0 workers on small terminals.
+- **Artifact I/O crash** - Uncaught exceptions in child process event handlers crashed pi. Wrapped all artifact writes with try/catch.
+- **Input field scrolling** - Text was clipped at the field edge instead of scrolling to keep the cursor visible.
+- **Session shutdown orphans** - Workers now killed at the top of session shutdown, preventing orphan processes.
+- **Task list blank line padding** - Task list no longer pads with empty lines. Surplus space goes to the feed.
+- **`getLobbyWorkerCount` counted dead workers** - Registry function didn't filter exited processes, inflating the "N in lobby" status bar count.
+- **`models.analyst` config was dead** - Config type and docs existed but `interview.ts` and `sync.ts` never read it. Now wired to both analyst handler call sites.
+## [0.11.0] - 2026-02-08
+### Added
+- **Test suite** — 53 tests across 7 Vitest suites covering store CRUD, state machine, config merging, agent discovery, model resolution, graceful shutdown, and live progress (including cwd isolation). Includes test helpers for temp directories and mock contexts.
+- **Per-agent runtime config** — Model override with 4-level priority: per-task > per-wave `model` param > config `crew.models.worker` > agent `.md` frontmatter. Environment variable override via `crew.work.env` config (not exposed as tool param to prevent API keys in logs).
+- **Graceful shutdown** — `AbortSignal` threaded from tool execute through to spawned workers. On abort: discovers worker name via PID-based registry scan, writes shutdown message to worker inbox, waits 30s grace period, SIGTERM, waits 5s, SIGKILL. Tasks reset to `todo` for retry. Workers instructed to release reservations and exit without committing.
+- **Live crew progress** — In-memory pub/sub store fed by worker JSONL events. Overlay Crew tab shows Active Workers section with tool name, call count, tokens, and elapsed time updating every second. Status bar shows active worker count during autonomous mode (`🔨N`).
+- **`shutdownGracePeriodMs` config** — Configurable grace period before SIGTERM (default: 30000).
+### Changed
+- **Dynamic overlay height** — Content area scales with terminal size (8-25 lines) instead of hardcoded 10. On a standard 24-row terminal, visible content goes from 10 to 15 lines.
+- **Handler signatures simplified** — Removed unused `state` and `dirs` parameters from plan and review handlers.
+### Fixed
+- **`deepMerge` crash** — Merging config with `models` key crashed when the target didn't have the key. Hardened for undefined target keys.
+- **Result/task association** — Worker results now matched by `taskId` field instead of array index, since `spawnAgents` returns in completion order not submission order.
+### Removed
+- `attemptsPerTask` field from `AutonomousState` — declared but never populated by any code.
+- `ARCHITECTURE.md` from repo (moved to external docs).
 ## [0.10.0] - 2026-02-05
 ### Fixed

package/README.md CHANGED Viewed

@@ -4,7 +4,7 @@
 # Pi Messenger
-**What if multiple agents in different terminals sharing a folder could talk to each other like they're in a chat room?** Join, see who's online and what they're doing. Claim tasks, reserve files, send messages. Built on [Pi's](https://github.com/badlogic/pi-mono) extension system. No daemon, no server, just files.
+**What if multiple agents in different terminals sharing a folder could talk to each other like they're in a chat room?** Join, see who's online and what they're doing. Claim tasks, reserve files, send messages. An extension for [Pi coding agent](https://pi.dev/) — install it and go. No daemon, no server, just files.
 [![npm version](https://img.shields.io/npm/v/pi-messenger?style=for-the-badge)](https://www.npmjs.com/package/pi-messenger)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge)](LICENSE)
@@ -16,13 +16,15 @@
 pi install npm:pi-messenger
 ```
-Crew agents and the `pi-messenger-crew` skill are auto-installed on first use of `plan`, `work`, or `review`. To install them ahead of time:
+Crew agents ship with the extension (`crew/agents/*.md`) and are discovered automatically. The `pi-messenger-crew` skill is auto-loaded from the extension.
+To show available crew agents:
 ```bash
 npx pi-messenger --crew-install
 ```
-This copies `crew/agents/*.md` to `~/.pi/agent/agents/` and `skills/pi-messenger-crew/` to `~/.pi/agent/skills/`.
+To customize an agent for one project, copy it to `.pi/messenger/crew/agents/` and edit it.
 To remove the extension:
@@ -30,7 +32,7 @@ To remove the extension:
 npx pi-messenger --remove
 ```
-To also remove crew agents and skill:
+To remove stale crew agent copies from the shared legacy directory (`~/.pi/agent/agents/`):
 ```bash
 npx pi-messenger --crew-uninstall
@@ -96,7 +98,14 @@ Crew turns a PRD into a dependency graph of tasks, then executes them in paralle
 2. **Work** — Workers implement ready tasks (all dependencies met) in parallel waves. A single `work` call runs one wave. `autonomous: true` runs waves back-to-back until everything is done or blocked.
 3. **Review** — Reviewer checks each implementation: SHIP, NEEDS_WORK, or MAJOR_RETHINK.
-No special PRD format required — the planner auto-discovers `PRD.md`, `SPEC.md`, `DESIGN.md`, etc. in your project root and `docs/`.
+No special PRD format required — the planner auto-discovers `PRD.md`, `SPEC.md`, `DESIGN.md`, etc. in your project root and `docs/`. Or skip the file entirely:
+```typescript
+pi_messenger({ action: "plan", prompt: "Scan the codebase for bugs" })
+// Plan + auto-start autonomous work when planning completes
+pi_messenger({ action: "plan" })  // auto-starts workers (default)
+```
 ### Wave Execution
@@ -116,20 +125,94 @@ The planner structures tasks to maximize parallelism. Foundation work has no dep
 ### Crew Configuration
-Add to `~/.pi/agent/pi-messenger.json`:
+Crew spawns multiple LLM sessions in parallel — it can burn tokens fast. Start with a cheap worker model and scale up once you've seen the workflow. Add this to `~/.pi/agent/pi-messenger.json`:
+```json
+{ "crew": { "models": { "worker": "claude-haiku-4-5" } } }
+```
+The planner and reviewer keep their frontmatter defaults; only workers (the bulk of the spend) get the cheap model. Override per-role as needed:
+```json
+{
+  "crew": {
+    "models": {
+      "worker": "claude-haiku-4-5",
+      "planner": "claude-sonnet-4-6",
+      "reviewer": "claude-sonnet-4-6"
+    }
+  }
+}
+```
+Model strings accept `provider/model` format for explicit provider selection and `:level` suffix for inline thinking control. These work anywhere a model is specified — config, frontmatter, or per-task override:
+```json
+{
+  "crew": {
+    "models": {
+      "worker": "anthropic/claude-haiku-4-5",
+      "planner": "openrouter/anthropic/claude-sonnet-4:high"
+    }
+  }
+}
+```
+The `:level` suffix and the `thinking.<role>` config are independent — if both are set, the suffix takes precedence and the `--thinking` flag is skipped to avoid double-application.
+Full config reference (all fields optional — only set what you want to change):
 ```json
 {
   "crew": {
-    "concurrency": { "workers": 2 },
+    "concurrency": { "workers": 2, "max": 10 },
+    "coordination": "chatty",
+    "models": { "worker": "claude-haiku-4-5" },
     "review": { "enabled": true, "maxIterations": 3 },
-    "planning": { "maxPasses": 3 },
-    "work": { "maxAttemptsPerTask": 5, "maxWaves": 50 }
+    "planning": { "maxPasses": 1 },
+    "work": {
+      "maxAttemptsPerTask": 5,
+      "maxWaves": 50
+    }
   }
 }
 ```
-Crew agents (planner, worker, reviewer, interview-generator, plan-sync) are **auto-installed** on first use. Run `npx pi-messenger --crew-install` to manually install or update.
+| Setting | Description | Default |
+|---------|-------------|---------|
+| `concurrency.workers` | Default parallel workers per wave | `2` |
+| `concurrency.max` | Maximum workers allowed (hard ceiling is 10) | `10` |
+| `dependencies` | Dependency scheduling mode: `advisory` or `strict` | `"advisory"` |
+| `coordination` | Worker coordination level: `none`, `minimal`, `moderate`, `chatty` | `"chatty"` |
+| `messageBudgets` | Max outgoing messages per worker per level (sends rejected after limit) | `{ none: 0, minimal: 2, moderate: 5, chatty: 10 }` |
+| `models.planner` | Model for planner agent | `anthropic/claude-opus-4-6` |
+| `models.worker` | Model for workers (overridden by per-task or per-wave `model` param) | `anthropic/claude-haiku-4-5` |
+| `models.reviewer` | Model for reviewer agent | `anthropic/claude-opus-4-6` |
+| `models.analyst` | Model for analyst (plan-sync) agent | `anthropic/claude-haiku-4-5` |
+| `thinking.planner` | Thinking level for planner agent | (from frontmatter) |
+| `thinking.worker` | Thinking level for worker agents | (from frontmatter) |
+| `thinking.reviewer` | Thinking level for reviewer agents | (from frontmatter) |
+| `thinking.analyst` | Thinking level for analyst agents | (from frontmatter) |
+| `review.enabled` | Auto-review after task completion | `true` |
+| `review.maxIterations` | Max review/fix cycles per task | `3` |
+| `planning.maxPasses` | Max planner/reviewer refinement passes | `1` |
+| `work.maxAttemptsPerTask` | Auto-block after N failures | `5` |
+| `work.maxWaves` | Max autonomous waves | `50` |
+| `work.shutdownGracePeriodMs` | Grace period before SIGTERM on abort | `30000` |
+| `work.env` | Environment variables passed to spawned workers | `{}` |
+### Default Agent Models
+Each crew agent ships with a default model in its frontmatter. Override any of these via `crew.models.<role>` in config:
+| Agent | Role | Default Model |
+|-------|------|---------------|
+| `crew-planner` | planner | `anthropic/claude-opus-4-6` |
+| `crew-worker` | worker | `anthropic/claude-haiku-4-5` |
+| `crew-reviewer` | reviewer | `anthropic/claude-opus-4-6` |
+| `crew-plan-sync` | analyst | `anthropic/claude-haiku-4-5` |
+Agent definitions live in `crew/agents/` within the extension. To customize one for a project, copy it to `.pi/messenger/crew/agents/` and edit the frontmatter — project-level agents override extension defaults by name. Agents support `thinking: <level>` in frontmatter (off, minimal, low, medium, high, xhigh). Config `thinking.<role>` overrides the frontmatter value.
 ## API Reference
@@ -153,7 +236,7 @@ Crew agents (planner, worker, reviewer, interview-generator, plan-sync) are **au
 | Action | Description |
 |--------|-------------|
-| `plan` | Create plan from PRD (`prd` optional — auto-discovers if omitted) |
+| `plan` | Create plan from PRD or inline prompt (`prd`, `prompt` optional — auto-discovers PRD if omitted, auto-starts workers unless `autoWork: false`) |
 | `work` | Run ready tasks (`autonomous`, `concurrency` optional) |
 | `review` | Review implementation (`target` task ID required) |
 | `task.list` | List all tasks |
@@ -167,8 +250,8 @@ Crew agents (planner, worker, reviewer, interview-generator, plan-sync) are **au
 | `crew.status` | Overall crew status |
 | `crew.validate` | Validate plan dependencies |
 | `crew.agents` | List available crew agents |
-| `crew.install` | Install/update crew agents |
-| `crew.uninstall` | Remove crew agents and skill |
+| `crew.install` | Show discovered crew agents and their sources |
+| `crew.uninstall` | Remove stale shared-directory crew agent copies |
 ### Swarm (Spec-Based)
@@ -190,7 +273,8 @@ Create `~/.pi/agent/pi-messenger.json`:
   "scopeToFolder": false,
   "nameTheme": "default",
   "stuckThreshold": 900,
-  "stuckNotify": true
+  "stuckNotify": true,
+  "autoOverlayPlanning": true
 }
 ```
@@ -205,6 +289,8 @@ Create `~/.pi/agent/pi-messenger.json`:
 | `stuckThreshold` | Seconds of inactivity before stuck detection | `900` |
 | `stuckNotify` | Show notification when a peer appears stuck | `true` |
 | `autoStatus` | Auto-generate status messages from activity | `true` |
+| `autoOverlay` | Auto-open overlay when autonomous crew work starts | `true` |
+| `autoOverlayPlanning` | Auto-open Crew overlay when planning starts or is restored in-progress | `true` |
 | `crewEventsInFeed` | Include crew task events in activity feed | `true` |
 | `contextMode` | Context injection level: `full`, `minimal`, `none` | `"full"` |
@@ -216,9 +302,9 @@ Pi-messenger is a [pi extension](https://github.com/badlogic/pi-mono) that hooks
 Incoming messages wake the receiving agent via `pi.sendMessage()` with `triggerTurn: true` and `deliverAs: "steer"`, which injects the message as a steering prompt that resumes the agent. File reservations are enforced by returning `{ block: true }` from a `tool_call` hook on write/edit operations. The `/messenger` overlay uses `ctx.ui.custom()` for the chat TUI, and `ctx.ui.setStatus()` keeps the status bar updated with peer count and unread messages.
-Crew workers are spawned as `pi --mode json` subprocesses with the agent's system prompt, model, and tool restrictions from their `.md` definitions. Progress is tracked via JSONL streaming. The planner and reviewer work the same way — just pi instances with different agent configs.
+Crew workers are spawned as `pi --mode json` subprocesses with the agent's system prompt, model, and tool restrictions from their `.md` definitions. Progress is tracked via JSONL streaming — the overlay subscribes to a live progress store that shows each worker's current tool, call count, and token usage in real time. Aborting a work run triggers graceful shutdown: each worker receives an inbox message asking it to stop, followed by a grace period before SIGTERM. The planner and reviewer work the same way — just pi instances with different agent configs.
-All coordination is file-based, no daemon required. Global state (registry, inboxes, activity feed) lives in `~/.pi/agent/messenger/`. Per-project crew data (plan, tasks, artifacts) lives in `.pi/messenger/crew/` inside your project. Dead agents are detected via PID checks and cleaned up automatically.
+All coordination is file-based, no daemon required. Shared state (registry, inboxes, swarm claims/completions) lives in `~/.pi/agent/messenger/`. Activity feed and crew data are project-scoped under `.pi/messenger/` inside your project. Dead agents are detected via PID checks and cleaned up automatically.
 ## Credits

package/config.ts CHANGED Viewed

@@ -26,6 +26,8 @@ export interface MessengerConfig {
   stuckThreshold: number;
   stuckNotify: boolean;
   autoStatus: boolean;
+  autoOverlay: boolean;
+  autoOverlayPlanning: boolean;
   crewEventsInFeed: boolean;
 }
@@ -42,6 +44,8 @@ const DEFAULT_CONFIG: MessengerConfig = {
   stuckThreshold: 900,
   stuckNotify: true,
   autoStatus: true,
+  autoOverlay: true,
+  autoOverlayPlanning: true,
   crewEventsInFeed: true,
 };
@@ -155,6 +159,8 @@ export function loadConfig(cwd: string): MessengerConfig {
     stuckThreshold: typeof merged.stuckThreshold === "number" ? merged.stuckThreshold : DEFAULT_CONFIG.stuckThreshold,
     stuckNotify: merged.stuckNotify !== false,
     autoStatus: merged.autoStatus !== false,
+    autoOverlay: merged.autoOverlay !== false,
+    autoOverlayPlanning: merged.autoOverlayPlanning !== false,
     crewEventsInFeed: merged.crewEventsInFeed !== false,
   };

package/crew/agents/crew-plan-sync.md CHANGED Viewed

@@ -2,7 +2,7 @@
 name: crew-plan-sync
 description: Syncs downstream specs after task completion
 tools: read, write, bash, pi_messenger
-model: claude-opus-4-5
+model: anthropic/claude-haiku-4-5
 crewRole: analyst
 maxOutput: { bytes: 51200, lines: 500 }
 parallel: false

package/crew/agents/crew-planner.md CHANGED Viewed

@@ -2,7 +2,7 @@
 name: crew-planner
 description: Analyzes codebase and PRD to create a comprehensive task breakdown
 tools: read, bash, web_search, pi_messenger
-model: claude-opus-4-5
+model: anthropic/claude-opus-4-6
 crewRole: planner
 maxOutput: { bytes: 204800, lines: 5000 }
 parallel: false
@@ -66,7 +66,7 @@ Create a task breakdown following the exact output format below. Guidelines:
 - **4-8 tasks** typically (scale with complexity)
 - Each task should be **completable in one work session**
 - Group related work but keep tasks focused
-- End with testing and documentation tasks
+- Include tests in each task's acceptance criteria
 - Include specific files to create/modify when known
 - Include acceptance criteria in task descriptions
@@ -80,6 +80,22 @@ Tasks execute in waves — all tasks whose dependencies are met run concurrently
 - **Dependencies should reflect real data flow.** Task B depends on Task A only if B imports types, calls functions, or reads files that A creates. Conceptual ordering ("settings before server") isn't a dependency unless the server literally imports from the settings module.
 - **Front-load foundation tasks with no dependencies** so multiple streams can start from wave 1.
+### Anti-Patterns to Avoid
+- **No standalone "types" or "config" tasks.** If every other task depends on a types-only task, you've created a bottleneck. The first task that needs shared types should define them alongside its real work.
+- **No integration funnels.** A task that depends on everything (tasks 1-5 all as deps) forces serialization. Split integration so pieces can start as each module completes.
+- **No redundant transitive deps.** If B depends on A, and C depends on B, don't list A in C's dependencies — it's already satisfied through B.
+- **Tests belong in their task.** Include unit tests in each implementation task's acceptance criteria. A standalone "Testing" task at the end is always on the critical path.
+### Dependency Descriptions
+Be specific about what each dependency provides. Workers may start before dependencies complete and need to know what to expect:
+- Name the **files** each task creates or modifies
+- Name the **exported symbols** (functions, types, classes) other tasks will import
+- Describe **interface shapes** (function signatures, type fields) so workers can code against them even before the dependency is done
+Workers coordinate via file reservations and direct messages. They can define local interfaces and adapt when dependency outputs appear. Your dependency descriptions help them make good decisions.
 ## Output Format
 Your final output MUST include both formats:

package/crew/agents/crew-reviewer.md CHANGED Viewed

@@ -2,7 +2,7 @@
 name: crew-reviewer
 description: Reviews task implementations for quality and correctness
 tools: read, bash, pi_messenger
-model: openai/gpt-5.2-high
+model: anthropic/claude-opus-4-6
 crewRole: reviewer
 maxOutput: { bytes: 102400, lines: 2000 }
 parallel: true

package/crew/agents/crew-worker.md CHANGED Viewed

@@ -2,7 +2,7 @@
 name: crew-worker
 description: Implements a single crew task with mesh coordination
 tools: read, write, edit, bash, pi_messenger
-model: claude-opus-4-5
+model: anthropic/claude-haiku-4-5
 crewRole: worker
 maxOutput: { bytes: 204800, lines: 5000 }
 parallel: true
@@ -54,6 +54,14 @@ pi_messenger({ action: "reserve", paths: ["src/path/to/files/"], reason: "<TASK_
 3. Write tests if applicable
 4. Run tests to verify: `bash({ command: "npm test" })` or equivalent
+**Progress Logging:** After each significant step above, log what you did:
+```typescript
+pi_messenger({ action: "task.progress", id: "<TASK_ID>", message: "Added JWT validation to src/auth/middleware.ts" })
+```
+Keep entries concise — one line per step. This helps the next agent pick up where you left off if the task gets interrupted.
 ## Phase 5: Commit
 ```bash
@@ -85,6 +93,15 @@ pi_messenger({
 })
 ```
+## Shutdown Handling
+If you receive a message saying "SHUTDOWN REQUESTED":
+1. Stop what you're doing
+2. Release reservations: `pi_messenger({ action: "release" })`
+3. Do NOT mark the task as done — leave it as in_progress for retry
+4. Do NOT commit anything
+5. Exit immediately
 ## Important Rules
 - ALWAYS join first, before any other pi_messenger calls
@@ -93,3 +110,8 @@ pi_messenger({
 - ALWAYS release before completing
 - If you encounter a blocker, use `task.block` with a clear reason
 - Follow existing code patterns and conventions
+## Coordination
+Follow the coordination instructions in your task prompt's "Coordination" section.
+If no coordination section is present, do not send messages — focus on your task.