npm - @tmustier/pi-agent-teams - Versions diffs - 0.4.0-beta.3 → 0.5.0 - Mend

@tmustier/pi-agent-teams 0.4.0-beta.3 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (33) hide show

package/CHANGELOG.md +26 -0
package/README.md +72 -9
package/WORKFLOW.md +110 -0
package/docs/claude-parity.md +18 -13
package/docs/hook-contract.md +183 -0
package/docs/smoke-test-plan.md +26 -7
package/extensions/teams/activity-tracker.ts +296 -8
package/extensions/teams/cleanup.ts +216 -3
package/extensions/teams/hooks.ts +57 -5
package/extensions/teams/leader-attach-commands.ts +8 -4
package/extensions/teams/leader-inbox.ts +162 -4
package/extensions/teams/leader-info-commands.ts +105 -3
package/extensions/teams/leader-lifecycle-commands.ts +205 -3
package/extensions/teams/leader-messaging-commands.ts +19 -7
package/extensions/teams/leader-spawn-command.ts +5 -1
package/extensions/teams/leader-team-command.ts +51 -2
package/extensions/teams/leader-teams-tool.ts +387 -11
package/extensions/teams/leader.ts +126 -52
package/extensions/teams/mailbox.ts +6 -1
package/extensions/teams/model-policy.ts +117 -0
package/extensions/teams/spawn-types.ts +4 -0
package/extensions/teams/teammate-rpc.ts +14 -0
package/extensions/teams/teams-panel.ts +117 -19
package/extensions/teams/teams-ui-shared.ts +205 -2
package/extensions/teams/teams-widget.ts +67 -14
package/extensions/teams/worker.ts +18 -6
package/extensions/teams/worktree.ts +143 -0
package/package.json +4 -2
package/scripts/integration-cleanup-test.mts +419 -0
package/scripts/integration-hooks-remediation-test.mts +382 -0
package/scripts/integration-spawn-overrides-test.mts +10 -0
package/scripts/smoke-test.mts +701 -3
package/skills/agent-teams/SKILL.md +28 -7

package/CHANGELOG.md ADDED Viewed

@@ -0,0 +1,26 @@
+# Changelog
+## 0.5.0
+### Features
+- **DM routing to leader LLM context** — Teammate DMs are now injected into the leader's conversation via `sendLeaderLlmMessage` instead of only showing in the TUI. The leader can now act on DM requests autonomously. (Thanks **@davidsu** — #6, #29)
+- **Batch-complete auto-wake** — `DelegationTracker` tracks task ID batches from `delegate()` calls. When all tasks in a batch complete, the idle leader is automatically woken to review results and continue orchestrating. Quality-gate aware. (Thanks **@RensTillmann** — #7, #29)
+- **Worker completion messages in leader context** — Per-task completion/failure notifications injected into the leader LLM conversation with task subject, result summary, and progress counters. All-done detection warns when quality gates are still running. (#13)
+- **Ergonomic worker status** — Real-time time-in-state, stall detection (configurable via `PI_TEAMS_STALL_THRESHOLD_MS`), last message summary, and model-per-worker in panel detail view. `member_status` tool action for agent-driven orchestration. (#10)
+- **Tool call content in transcript** — Worker transcript view shows tool arguments inline: file paths, commands, patterns. Errors marked with ✗. (#18, #21, #23)
+- **`/team done` + auto-done detection** — `/team done` ends a run (stops teammates, hides widget, notifies with summary). Widget auto-detects when all tasks complete. (#16)
+- **Hook/model policy in panel** — Compact policy summary (hooks status, failure action, reopens, model inheritance) shown in widget and panel. (#20)
+- **Model, thinking, task in spawn/panel** — Visible in spawn output, panel detail, and transcript header. (#19)
+- **Urgent message interrupts** — `--urgent` flag on `/team dm` and `/team broadcast` interrupts active worker turns via steering. (#15)
+- **Hook contract versioning** — Formal versioning and compatibility policy for quality-gate hooks. (#24)
+### Fixes
+- **Worktree/branch auto-cleanup** — Stale team dirs, worktrees, and branches cleaned up on shutdown and session switch. (#14)
+- **`/team status` in README** — Added missing command to docs table. (#27)
+- **Activity tracker** — Added `extractStartSummary`/`extractEndSummary` for tool transcript display.
+## 0.4.0
+Initial public release.

package/README.md CHANGED Viewed

@@ -10,15 +10,16 @@ Core agent-teams primitives, matching Claude's design:
 - **Shared task list** — file-per-task on disk with three states (pending / in-progress / completed) and dependency tracking so blocked tasks stay blocked until their prerequisites finish.
 - **Auto-claim** — idle teammates automatically pick up the next unassigned, unblocked task. No manual dispatching required (disable with `PI_TEAMS_DEFAULT_AUTO_CLAIM=0`).
-- **Direct messages and broadcast** — send a message to one teammate or all of them at once, via file-based mailboxes.
+- **Direct messages and broadcast** — send a message to one teammate or all of them at once, via file-based mailboxes. Urgent messages can interrupt active turns via steering.
 - **Graceful lifecycle** — spawn, stop, shutdown (with handshake), or kill teammates. The leader tracks who's online, idle, or streaming.
 - **LLM-callable teams tool** — the model can spawn teammates, delegate tasks, mutate task assignment/status/dependencies, message teammates, and run lifecycle actions in tool calls (no slash commands needed).
-- **Team cleanup** — tear down all team artifacts (tasks, mailboxes, sessions, worktrees) when you're done.
+- **Team done + cleanup** — `/team done` ends a run (stops teammates, hides the widget, notifies with a summary); the widget auto-detects when all tasks are complete and shows a hint. `/team cleanup` tears down artifacts afterward.
 Additional Pi-specific capabilities:
 - **Git worktrees** — optionally give each teammate its own worktree so they work on isolated branches without conflicting edits.
 - **Session branching** — clone the leader's conversation context into a teammate so it starts with full awareness of the work so far, instead of from scratch.
+- **Completion notifications** — when a teammate finishes or fails a task, the leader LLM receives a structured `[Team]` message with task ID, subject, result summary, and progress counters so it can orchestrate autonomously without human intervention. When quality-gate hooks are active, the message warns that task states may still change.
 - **Hooks / quality gates** — optional leader-side hooks on idle / task completion to run scripts (opt-in).
 ## UI style (terminology + naming)
@@ -110,9 +111,13 @@ Or drive it manually:
 /tw                                        # open the interactive widget panel
+/team done                                 # end run: stop teammates + hide widget
+/team done --force                         # end run even with in-progress tasks
 /team shutdown alice                       # graceful shutdown (handshake)
 /team shutdown                             # stop all teammates (leader session remains active)
-/team cleanup                              # remove team artifacts when done
+/team cleanup                              # remove team artifacts (worktrees, branches, team dir)
+/team gc --dry-run                         # preview stale team dirs that would be removed
 ```
 Or let the model drive it with the delegate tool:
@@ -143,17 +148,21 @@ Or let the model drive it with the delegate tool:
 | `task_dep_add` | `taskId`, `depId` | Add dependency edge (`taskId` depends on `depId`). |
 | `task_dep_rm` | `taskId`, `depId` | Remove dependency edge. |
 | `task_dep_ls` | `taskId` | Inspect dependency/block graph for one task. |
-| `message_dm` | `name`, `message` | Send mailbox DM to one teammate. |
-| `message_broadcast` | `message` | Send mailbox message to all discovered workers. |
+| `message_dm` | `name`, `message` | Send mailbox DM to one teammate. Set `urgent=true` to interrupt their active turn. |
+| `message_broadcast` | `message` | Send mailbox message to all discovered workers. Set `urgent=true` to interrupt active turns. |
 | `message_steer` | `name`, `message` | Send steer instruction to a running RPC teammate. |
 | `member_spawn` | `name` | Spawn one teammate (supports context/workspace/model/thinking/plan options). |
+| `member_status` | optional `name` | Real-time worker status: activity, time in state, stall detection, tool use, tokens, last message. Omit name for all-worker summary. |
 | `member_shutdown` | `name` or `all=true` | Request graceful shutdown via mailbox handshake. |
 | `member_kill` | `name` | Force-stop one RPC teammate and unassign active tasks. |
 | `member_prune` | _(none)_ | Mark stale non-RPC workers offline (`all=true` to force). |
+| `team_done` | _(none)_ | End team run: stop all teammates, mark workers offline, hide widget (`all=true` to force with in-progress tasks). |
 | `plan_approve` | `name` | Approve pending plan for a plan-required teammate. |
 | `plan_reject` | `name` | Reject pending plan (`feedback` optional). |
 | `hooks_policy_get` | _(none)_ | Read team hooks policy (configured + effective with env fallback). |
 | `hooks_policy_set` | one or more of: `hookFailureAction`, `hookMaxReopensPerTask`, `hookFollowupOwner` | Update team hooks policy at runtime (`hooksPolicyReset=true` clears team overrides first). |
+| `model_policy_get` | _(none)_ | Inspect teammate model policy and default inheritance behavior for the current leader model. |
+| `model_policy_check` | optional `model` | Validate a model override before spawning (`<provider>/<modelId>` or `<modelId>`). |
 Example calls:
@@ -161,10 +170,14 @@ Example calls:
 { "action": "task_assign", "taskId": "12", "assignee": "alice" }
 { "action": "task_dep_add", "taskId": "12", "depId": "7" }
 { "action": "message_broadcast", "message": "Sync: finishing this milestone" }
+{ "action": "message_dm", "name": "alice", "message": "Stop using lib X, use Y instead", "urgent": true }
 { "action": "member_kill", "name": "alice" }
 { "action": "plan_approve", "name": "alice" }
 { "action": "hooks_policy_get" }
 { "action": "hooks_policy_set", "hookFailureAction": "reopen_followup", "hookMaxReopensPerTask": 2, "hookFollowupOwner": "member" }
+{ "action": "model_policy_get" }
+{ "action": "team_done" }
+{ "action": "model_policy_check", "model": "openai-codex/gpt-5.1-codex-mini" }
 ```
 ## Commands
@@ -185,6 +198,7 @@ All management commands live under `/team`.
 | --- | --- |
 | `/team spawn <name> [fresh\|branch] [shared\|worktree] [plan] [--model <provider>/<modelId>] [--thinking <level>]` | Start a teammate |
 | `/team list` | List teammates and their status |
+| `/team status [name]` | Real-time worker state: stall detection, time in state, activity (omit name for summary) |
 | `/team panel` | Interactive widget panel (same as `/tw`) |
 | `/team attach list` | Discover existing team workspaces under `<teamsRoot>` |
 | `/team attach <teamId> [--claim]` | Attach this session to an existing team workspace (`--claim` force-takes over an active claim) |
@@ -195,14 +209,16 @@ All management commands live under `/team`.
 | `/team style <name>` | Set style (built-in or custom) |
 | `/team send <name> <msg>` | Send a prompt over RPC |
 | `/team steer <name> <msg>` | Redirect an in-flight run |
-| `/team dm <name> <msg>` | Send a mailbox message |
-| `/team broadcast <msg>` | Message all teammates |
+| `/team dm <name> [--urgent] <msg>` | Send a mailbox message (`--urgent` interrupts active turns) |
+| `/team broadcast [--urgent] <msg>` | Message all teammates (`--urgent` interrupts active turns) |
 | `/team stop <name> [reason]` | Abort current work (resets task to pending) |
 | `/team shutdown <name> [reason]` | Graceful shutdown (handshake) |
 | `/team shutdown` | Stop all teammates (RPC + best-effort manual) (leader session remains active) |
 | `/team prune [--all]` | Mark stale manual teammates offline (hides them in widget) |
 | `/team kill <name>` | Force-terminate |
-| `/team cleanup [--force]` | Delete team artifacts |
+| `/team done [--force]` | End run: stop teammates + hide widget (auto-detects when all tasks complete) |
+| `/team cleanup [--force]` | Delete team artifacts (worktrees, branches, team dir) |
+| `/team gc [--dry-run] [--force] [--max-age-hours=N]` | Garbage-collect stale team dirs older than N hours (default: 24) |
 | `/team id` | Print team/task-list IDs and paths |
 | `/team env <name>` | Print env vars to start a manual teammate |
@@ -210,12 +226,36 @@ Model inheritance note:
 - If the leader is running a deprecated model id (e.g. Sonnet 4 non-4.5 variants), teammates will **not** inherit that id by default.
 - Explicit deprecated `--model` overrides are rejected.
+- Agents can introspect/check this at runtime via `teams` actions: `{ "action": "model_policy_get" }` and `{ "action": "model_policy_check", "model": "..." }`.
+### Policy visibility
+Both the persistent widget and the interactive panel (`/tw`) show a compact policy
+summary below the header:
+- **Hooks**: on/off status, `failureAction`, `maxReopensPerTask`, `followupOwner` (effective values from team config + env fallback)
+- **Model**: leader model with deprecation warning when applicable, teammate selection source (`inherit`/`override`/`default`)
+These values reflect the same resolution as the `hooks_policy_get` and `model_policy_get`
+tool actions, but are continuously visible — no extra tool calls needed.
+### Ergonomic worker status
+The widget and panel show real-time worker state at a glance:
+- **Time in state**: how long a worker has been in its current status (e.g. `3m12s`)
+- **Stall detection**: when a streaming worker hasn't emitted any agent event for > 5 minutes, status changes to `⚠ stalled` (configurable via `PI_TEAMS_STALL_THRESHOLD_MS`)
+- **Last message summary**: most recent assistant text (first 80–100 chars) visible in the panel's selected-worker detail section
+- **Model per worker**: shown in the panel detail view when available
+- **Current activity**: tool verb (e.g. `running…`, `editing…`) displayed inline
+The `member_status` tool action provides the same information programmatically for agent-driven orchestration — no need to parse JSONL files or check file modification times.
 ### Panel shortcuts (`/tw` / `/team panel`)
 - `↑/↓` or `w/s`: select teammate / scroll transcript
 - `1..9`: jump directly to teammate in overview
-- `enter`: open selected teammate transcript
+- `enter`: open selected teammate transcript (shows tool args inline: file paths, commands, patterns; errors marked with ✗)
 - `t` or `shift+t`: open selected teammate task list (task-centric view with deps/blocks); in task view, toggle back (`esc`/`t`/`shift+t`)
 - task view: `c` complete, `p` pending, `i` in-progress, `u` unassign, `r` reassign selected task
 - `m` or `d`: compose message to selected teammate
@@ -252,6 +292,7 @@ Model inheritance note:
 | `PI_TEAMS_HOOKS_MAX_REOPENS_PER_TASK` | Reopen cap per task when failure action includes `reopen` (`0` disables auto-reopen) | `3` |
 | `PI_TEAMS_HOOKS_FOLLOWUP_OWNER` | Follow-up owner policy: `member`, `lead`, `none` | `member` |
 | `PI_TEAMS_HOOKS_CREATE_TASK_ON_FAILURE` | Legacy shortcut for `PI_TEAMS_HOOKS_FAILURE_ACTION=followup` | `0` (off) |
+| `PI_TEAMS_STALL_THRESHOLD_MS` | Threshold (ms) before a streaming worker with no events is flagged as "stalled" | `300000` (5 min) |
 ## Storage layout
@@ -300,6 +341,8 @@ Hooks run with working directory = the **leader session cwd** and receive contex
 - `PI_TEAMS_MEMBER`
 - `PI_TEAMS_TASK_ID`, `PI_TEAMS_TASK_SUBJECT`, `PI_TEAMS_TASK_OWNER`, `PI_TEAMS_TASK_STATUS`
+See [`docs/hook-contract.md`](docs/hook-contract.md) for the full versioned contract, JSON schema, and compatibility policy.
 Hook policy can be controlled by agents at runtime via `teams` tool actions:
 - `{ "action": "hooks_policy_get" }`
@@ -368,6 +411,26 @@ node scripts/e2e-rpc-test.mjs
 Starts a leader in RPC mode, spawns a teammate, runs a shutdown handshake, verifies cleanup. Sets `PI_TEAMS_ROOT_DIR` to a temp directory so nothing touches `~/.pi/agent/teams`.
+### Integration: hooks remediation loop
+```bash
+npm run integration-hooks-remediation-test
+```
+Deterministic leader-side integration flow that verifies failed `on_task_completed` hook handling end-to-end:
+- task is reopened when policy includes `reopen`
+- follow-up task is created/assigned when policy includes `followup`
+- remediation mailbox nudge is emitted for the responsible teammate
+### Integration: cleanup and garbage collection
+```bash
+npm run integration-cleanup-test
+```
+Tests worktree/branch cleanup lifecycle, full team directory removal, GC age/activity filtering, and filesystem fallback when no git context is available.
 ### tmux dogfooding
 ```bash

package/WORKFLOW.md ADDED Viewed

@@ -0,0 +1,110 @@
+---
+tracker:
+  kind: linear
+  api_key: "$LINEAR_API_KEY"
+  team_key: "SYM"
+  project_slug: "db8e51049f48"
+  active_states:
+    - Todo
+    - In Progress
+    - In Review
+    - Merging
+    - Rework
+  terminal_states:
+    - Done
+    - Closed
+    - Cancelled
+    - Canceled
+    - Duplicate
+polling:
+  interval_ms: 30000
+workspace:
+  root: "$PI_SYMPHONY_WORKSPACE_ROOT"
+worker:
+  runtime: pi
+agent:
+  max_concurrent_agents: 4
+  max_turns: 8
+  max_retry_backoff_ms: 300000
+codex:
+  turn_timeout_ms: 3600000
+  read_timeout_ms: 5000
+  stall_timeout_ms: 300000
+pi:
+  command: pi
+  response_timeout_ms: 60000
+  session_dir_name: .pi-rpc-sessions
+  model:
+    provider: anthropic
+    model_id: claude-opus-4-6
+  thinking_level: xhigh
+  extension_paths:
+    - ../extensions/workspace-guard/index.ts
+    - ../extensions/proof/index.ts
+    - ../extensions/linear-graphql/index.ts
+  disable_extensions: true
+  disable_themes: true
+orchestration:
+  phase_store: workpad
+  default_phase: implementing
+  passive_phases:
+    - waiting_for_checks
+    - waiting_for_human
+    - blocked
+  max_rework_cycles: 3
+  ownership:
+    required_label: symphony
+    required_workpad_marker: "## Symphony Workpad"
+rollout:
+  mode: mutate
+  preflight_required: true
+  kill_switch_label: no-symphony-automation
+  kill_switch_file: /tmp/pi-symphony.pause
+pr:
+  auto_create: true
+  base_branch: main
+  repo_slug: tmustier/pi-agent-teams
+  reuse_branch_pr: true
+  closed_pr_policy: new_branch
+  attach_to_tracker: true
+  required_labels:
+    - symphony
+  review_comment_mode: upsert
+  review_comment_marker: "<!-- symphony-review -->"
+review:
+  enabled: true
+  agent: pr-reviewer
+  output_format: structured_markdown_v1
+  max_passes: 2
+  fix_consideration_severities:
+    - P0
+    - P1
+    - P2
+merge:
+  mode: disabled
+  executor: gh
+  method: squash
+  strategy: queue
+  max_rebase_attempts: 2
+  require_green_checks: true
+  require_head_match: true
+  require_human_approval: true
+  approval_states:
+    - Merging
+hooks:
+  timeout_ms: 60000
+observability:
+  dashboard_enabled: true
+  refresh_ms: 1000
+  render_interval_ms: 16
+server:
+  port: 4041
+  host: 127.0.0.1
+---
+Before starting implementation, read and apply the agent-friendly-design skill at:
+`/Users/thomasmustier/.pi/agent/skills/agent-friendly-design/SKILL.md`
+This project builds interfaces that AI agents operate (CLIs, tools, extensions, UI surfaces).
+Every change should follow agent-friendly design principles: structured output, progressive disclosure,
+clear error classification, token-budget awareness, and discoverability.

package/docs/claude-parity.md CHANGED Viewed

@@ -1,6 +1,6 @@
 # Claude Agent Teams parity roadmap (pi-agent-teams)
-Last updated: 2026-02-10
+Last updated: 2026-03-21
 This document tracks **feature parity gaps** between:
@@ -8,12 +8,12 @@ This document tracks **feature parity gaps** between:
   - https://code.claude.com/docs/en/agent-teams#control-your-agent-team
   - https://code.claude.com/docs/en/interactive-mode#task-list
-…and this repository’s implementation:
+...and this repository's implementation:
 - `pi-agent-teams` (Pi extension)
 > Terminology note: this extension supports `PI_TEAMS_STYLE=<style>`.
-> This doc often uses “comrade” as a generic stand-in for “worker/teammate”, but **styles can customize terminology, naming, and lifecycle copy**.
+> This doc often uses "comrade" as a generic stand-in for "worker/teammate", but **styles can customize terminology, naming, and lifecycle copy**.
 > Built-ins: `normal`, `soviet`, `pirate`. Custom styles live under `~/.pi/agent/teams/_styles/`.
 ## Scope / philosophy
@@ -30,8 +30,8 @@ This document tracks **feature parity gaps** between:
 These are intentional differences / additions:
-- **Configurable styles** (`/team style …`) for terminology + naming + lifecycle copy.
-- **Git worktrees** for isolation (`/team spawn <name> … worktree`).
+- **Configurable styles** (`/team style ...`) for terminology + naming + lifecycle copy.
+- **Git worktrees** for isolation (`/team spawn <name> ... worktree`).
 - **Session branching** (clone leader context into a teammate).
 - A **status widget + interactive panel** (`/tw`, `/team panel`).
@@ -41,19 +41,20 @@ Legend: ✅ implemented • 🟡 partial • ❌ missing
 | Area | Claude docs behavior | Pi Teams status | Notes / next step | Priority |
 | --- | --- | --- | --- | --- |
-| Enablement | `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` + settings | N/A | Pi extension is available when installed/loaded. | — |
+| Enablement | `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` + settings | N/A | Pi extension is available when installed/loaded. | - |
 | Team config | `~/.claude/teams/<team>/config.json` w/ members | ✅ | `extensions/teams/team-config.ts` (under `~/.pi/agent/teams/...` or `PI_TEAMS_ROOT_DIR`). | P0 |
 | Task list (shared) | `~/.claude/tasks/<taskListId>/` + states + deps | ✅ | File-per-task + deps (`blockedBy`/`blocks`); `/team task dep add|rm|ls`; self-claim skips blocked tasks. | P0 |
 | Self-claim | Comrades self-claim next unassigned, unblocked task; file locking | ✅ | `claimNextAvailableTask()` + locks; enabled by default (`PI_TEAMS_DEFAULT_AUTO_CLAIM=1`). | P0 |
 | Explicit assign | Lead assigns task to comrade | ✅ | `/team task assign` sets owner + pings via mailbox. | P0 |
-| “Message” vs “broadcast” | Send to one comrade or all comrades | ✅ | `/team dm` + `/team broadcast` use mailbox; `/team send` uses RPC. Recipients = config workers + RPC map + active task owners. | P0 |
-| Comrade↔comrade messaging | Comrades message each other directly | ✅ | Worker tool `team_message`; messages via mailbox + CC leader via `peer_dm_sent`. | P1 |
+| "Message" vs "broadcast" | Send to one comrade or all comrades | ✅ | `/team dm` + `/team broadcast` use mailbox; `/team send` uses RPC. Recipients = config workers + RPC map + active task owners. | P0 |
+| Comrade↔comrade messaging | Comrades message each other directly | ✅ | Worker tool `team_message` (with `urgent` flag for mid-turn interrupts); messages via mailbox + CC leader via `peer_dm_sent`. | P1 |
 | Display modes | In-process selection (Shift+Up/Down); split panes (tmux/iTerm) | ❌ | Pi has widget/panel + commands, but no terminal-level comrade navigation/panes. | P2 |
 | Delegate mode | Lead restricted to coordination-only tools | ✅ | `/team delegate [on|off]`; `tool_call` blocks `bash/edit/write`; widget shows `[delegate]`. | P1 |
-| Plan approval | Comrade can be “plan required” and needs lead approval to implement | ✅ | `/team spawn <name> plan` → read-only tools; sends `plan_approval_request`; `/team plan approve|reject`. | P1 |
-| Shutdown handshake | Lead requests shutdown; comrade can approve/reject | ✅ | Protocol: `shutdown_request` → `shutdown_approved` / `shutdown_rejected`. `/team shutdown <name>` (graceful), `/team kill <name>` (SIGTERM). Wording is style-controlled (e.g. “was asked to shut down”, “walked the plank”). | P1 |
-| Cleanup team | “Clean up the team” removes shared resources after comrades stopped | ✅ | `/team cleanup [--force]` deletes only `<teamsRoot>/<teamId>` after safety checks. | P1 |
+| Plan approval | Comrade can be "plan required" and needs lead approval to implement | ✅ | `/team spawn <name> plan` → read-only tools; sends `plan_approval_request`; `/team plan approve|reject`. | P1 |
+| Shutdown handshake | Lead requests shutdown; comrade can approve/reject | ✅ | Protocol: `shutdown_request` → `shutdown_approved` / `shutdown_rejected`. `/team shutdown <name>` (graceful), `/team kill <name>` (SIGTERM). Wording is style-controlled (e.g. "was asked to shut down", "walked the plank"). | P1 |
+| Cleanup team | "Clean up the team" removes shared resources after comrades stopped | ✅ | `/team done [--force]` ends run (stops teammates, hides widget, auto-detects completion). `/team cleanup [--force]` deletes artifacts. | P1 |
 | Hooks / quality gates | `ComradeIdle`, `TaskCompleted` hooks | 🟡 | Optional leader-side hook runner (idle/task-complete/task-fail) via `PI_TEAMS_HOOKS_ENABLED=1` + scripts under `_hooks/`; inline failure surfacing + failure-action policies (`warn`/`followup`/`reopen`/`reopen_followup`) implemented; stable hook context payload exposed via `PI_TEAMS_HOOK_CONTEXT_JSON` + auto-remediation flow (reopen cap / follow-up owner policy / teammate notification). Runtime policy changes are agent-invocable via `teams` actions (`hooks_policy_get` / `hooks_policy_set`). | P2 |
+| Widget liveliness | Status updates in near real-time | ✅ | Event-driven widget refresh on teammate tool start/end and turn completion; auto-done detection with `/team done` hint. | P2 |
 | Task list UX | Ctrl+T toggle; show all/clear tasks by asking | 🟡 | Widget + `/team task list` + `/team task show` + `/team task clear`; panel supports fast `t`/`shift+t` toggle into task-centric view (`Ctrl+T` is reserved by Pi for thinking blocks). | P0 |
 | Shared task list across sessions | `CLAUDE_CODE_TASK_LIST_ID=...` | ✅ | Worker env: `PI_TEAMS_TASK_LIST_ID` (manual workers). Leader: `/team task use <taskListId>` (persisted). Newly spawned workers inherit; existing workers need restart. | P1 |
 | Join/attach flow | Join existing team context from another running session | 🟡 | `/team attach list`, `/team attach <teamId> [--claim]`, `/team detach` plus claim heartbeat/takeover handshake added. Widget/panel now show attached-mode banner + detach hint. | P2 |
@@ -94,13 +95,14 @@ Legend: ✅ implemented • 🟡 partial • ❌ missing
    - `/team cleanup [--force]` deletes only `<teamsRoot>/<teamId>` after safety checks.
 8) **Peer-to-peer messaging** ✅
-   - Worker tool `team_message`
+   - Worker tool `team_message` with `urgent` flag for mid-turn interrupts
    - Mailbox transport; leader CC notifications
+   - Urgent messages delivered via `sendUserMessage({ deliverAs: "steer" })` to interrupt active turns
 9) **Shared task list across sessions** ✅
    - `/team task use <taskListId>` + persisted `config.json`
-### P2: UX + “product-level” parity
+### P2: UX + "product-level" parity
 10) **Hooks / quality gates** 🟡 (partial)
    - Implemented: optional leader-side hook runner (opt-in + timeout + logs).
@@ -109,6 +111,7 @@ Legend: ✅ implemented • 🟡 partial • ❌ missing
    - Implemented: runtime team-level policy overrides (via `teams` tool: `hooks_policy_get` / `hooks_policy_set`) layered over env defaults.
    - Implemented: standardized hook context contract (`PI_TEAMS_HOOK_CONTEXT_VERSION=1`, `PI_TEAMS_HOOK_CONTEXT_JSON`).
    - Implemented: autonomous remediation helpers (auto-reopen cap, follow-up owner policy, teammate remediation notification).
+   - Implemented: deterministic integration coverage (`scripts/integration-hooks-remediation-test.mts`) for failed hook -> reopen/follow-up/nudge flow.
    - Still missing: broader contract versioning story + richer policy visualization UX.
 11) **Better comrade interaction UX (within Pi constraints)** 🟡 (partial)
@@ -121,6 +124,8 @@ Legend: ✅ implemented • 🟡 partial • ❌ missing
    - Implemented: agent-invocable dependency/messaging actions via `teams` tool (`task_dep_add|rm|ls`, `message_dm|broadcast|steer`).
    - Implemented: agent-invocable lifecycle actions via `teams` tool (`member_spawn|shutdown|kill|prune`).
    - Implemented: agent-invocable governance actions via `teams` tool (`plan_approve|plan_reject`).
+   - Implemented: agent-invocable model policy introspection/check actions via `teams` tool (`model_policy_get|model_policy_check`) to validate spawn overrides before execution.
+   - Implemented: agent-invocable end-of-run via `teams` tool (`team_done`) with structured error classification (`status`/`reason`/`hint`).
    - Next: optional tmux split-pane integration and deeper dependency/task editing flows in panel.
 12) **Join/attach flow** 🟡 (partial)

package/docs/hook-contract.md ADDED Viewed

@@ -0,0 +1,183 @@
+# Hook Contract
+Versioned specification for the interface between pi-agent-teams and hook scripts.
+Hook scripts are external programs (`.js`, `.mjs`, `.sh`, or bare executables) that run
+on lifecycle events. This document defines what hooks receive and what pi-agent-teams
+guarantees across releases.
+## Current version
+**Contract version: 1**
+Exported as `HOOK_CONTRACT_VERSION` from `extensions/teams/hooks.ts`.
+## Contract surface
+Hooks receive context through two channels:
+| Channel | Format | Key / field |
+|---|---|---|
+| Environment variable | string | `PI_TEAMS_HOOK_CONTEXT_VERSION` — the contract version as a string |
+| Environment variable | JSON string | `PI_TEAMS_HOOK_CONTEXT_JSON` — structured payload (schema below) |
+| Environment variables | flat strings | `PI_TEAMS_HOOK_EVENT`, `PI_TEAMS_TEAM_ID`, etc. (convenience) |
+| Exit code | integer | `0` = pass, non-zero = fail |
+### Context JSON schema (v1)
+```jsonc
+{
+  "version": 1,                        // always matches PI_TEAMS_HOOK_CONTEXT_VERSION
+  "event": "task_completed",           // "idle" | "task_completed" | "task_failed"
+  "team": {
+    "id": "<teamId>",
+    "dir": "<absolute path>",
+    "taskListId": "<taskListId>",
+    "style": "normal"                  // current UI style id
+  },
+  "member": "<name>" | null,          // teammate that triggered the event
+  "timestamp": "<ISO 8601>" | null,
+  "task": {                            // null for "idle" events; may also be null
+                                       // for task_completed/task_failed if the task
+                                       // was cleared before the leader processed it
+    "id": "3",
+    "subject": "<text>",               // truncated to 1,000 chars
+    "description": "<text>",           // truncated to 8,000 chars
+    "owner": "<name>" | null,
+    "status": "completed",             // "pending" | "in_progress" | "completed"
+                                       // NOTE: for task_failed events the status is
+                                       // typically "pending" (reset before hook runs)
+    "blockedBy": ["1", "2"],           // max 200 entries
+    "blocks": ["5"],                   // max 200 entries
+    "metadata": {},                    // freeform key-value
+    "createdAt": "<ISO 8601>",
+    "updatedAt": "<ISO 8601>"
+  }
+}
+```
+### Flat environment variables (v1)
+| Variable | Always set | Description |
+|---|---|---|
+| `PI_TEAMS_HOOK_EVENT` | ✓ | Event name |
+| `PI_TEAMS_HOOK_CONTEXT_VERSION` | ✓ | Contract version (string) |
+| `PI_TEAMS_HOOK_CONTEXT_JSON` | ✓ | Full JSON payload |
+| `PI_TEAMS_TEAM_ID` | ✓ | Team identifier |
+| `PI_TEAMS_TEAM_DIR` | ✓ | Absolute path to team directory |
+| `PI_TEAMS_TASK_LIST_ID` | ✓ | Task list identifier |
+| `PI_TEAMS_STYLE` | ✓ | UI style id |
+| `PI_TEAMS_MEMBER` | when available | Teammate name |
+| `PI_TEAMS_EVENT_TIMESTAMP` | when available | ISO 8601 event timestamp |
+| `PI_TEAMS_TASK_ID` | when task exists | Task id |
+| `PI_TEAMS_TASK_SUBJECT` | when task exists | Task subject (untruncated) |
+| `PI_TEAMS_TASK_OWNER` | when task has owner | Task owner name |
+| `PI_TEAMS_TASK_STATUS` | when task exists | Task status |
+### Hook exit code semantics
+| Exit code | Meaning | Leader behavior |
+|---|---|---|
+| `0` | Pass | Record success in task metadata |
+| Non-zero | Fail | Apply failure policy (warn / followup / reopen / reopen_followup) |
+| Timeout (SIGTERM) | Fail | Same as non-zero exit |
+## Compatibility policy
+### Additive changes (no version bump)
+The following changes are **backward-compatible** and do NOT increment the contract version:
+- Adding new **optional** fields to the context JSON (hooks must tolerate unknown keys)
+- Adding new **environment variables** (hooks must tolerate new env vars)
+- Adding new **event types** (hooks only receive events matching their filename)
+- Extending `metadata` with new keys
+- Increasing truncation limits
+### Breaking changes (version bump required)
+The following changes **require incrementing** the contract version:
+- Removing or renaming existing JSON fields
+- Changing the type of an existing field
+- Changing the semantics of an existing field (e.g., status values)
+- Reducing truncation limits below current values
+- Changing exit code semantics
+- Removing environment variables
+### Version lifecycle
+1. **New version**: old version continues to be supported for at least one minor release
+2. **Deprecation**: the old version emits a warning in hook logs
+3. **Removal**: the old version is dropped in the next major release
+### Hook author guidelines
+Write hooks that are resilient to additive changes and race conditions:
+```js
+// ✅ Good: parse only the fields you need, ignore the rest
+const ctx = JSON.parse(process.env.PI_TEAMS_HOOK_CONTEXT_JSON);
+const taskId = ctx.task?.id;
+// ✅ Good: always guard task access — task can be null even for
+// task_completed/task_failed events (race: task cleared before leader reads it)
+if (!ctx.task) {
+  console.log("Task already cleared, skipping quality gate");
+  process.exit(0);
+}
+// ✅ Good: don't assume task.status matches the event name —
+// for task_failed events the status is typically "pending" (reset before hook runs)
+if (ctx.event === "task_failed") {
+  console.log(`Task #${ctx.task.id} failed (current status: ${ctx.task.status})`);
+}
+// ✅ Good: check the version for breaking changes
+const version = parseInt(process.env.PI_TEAMS_HOOK_CONTEXT_VERSION, 10);
+if (version > 1) {
+  console.error(`Unsupported hook contract version: ${version}`);
+  process.exit(1);
+}
+// ❌ Bad: assume exact shape, fail on new fields
+const { version, event, team, member, timestamp, task } = ctx;
+assert(Object.keys(ctx).length === 5); // breaks when fields are added
+// ❌ Bad: unconditionally access task fields
+const subject = ctx.task.subject; // crashes when task is null
+```
+## Hook log format
+Each hook invocation produces a log file in `<teamDir>/hook-logs/` with the structure:
+```jsonc
+{
+  "invocation": {
+    "event": "task_completed",
+    "teamId": "...",
+    // ... full invocation context
+  },
+  "result": {
+    "ran": true,
+    "hookPath": "/path/to/on_task_completed.js",
+    "command": ["node", "/path/to/on_task_completed.js"],
+    "exitCode": 0,
+    "timedOut": false,
+    "durationMs": 142,
+    "stdout": "...",
+    "stderr": "...",
+    "contractVersion": 1           // traces which version was used
+  }
+}
+```
+## Changelog
+### v1 (initial)
+- Context JSON with `version`, `event`, `team`, `member`, `timestamp`, `task`
+- Flat env vars: `PI_TEAMS_HOOK_EVENT`, `PI_TEAMS_HOOK_CONTEXT_VERSION`, `PI_TEAMS_HOOK_CONTEXT_JSON`, team/task fields
+- Exit code 0 = pass, non-zero = fail
+- Truncation: subject 1,000 chars, description 8,000 chars, blockedBy/blocks max 200 entries

package/docs/smoke-test-plan.md CHANGED Viewed

@@ -20,13 +20,22 @@ npx tsx scripts/smoke-test.mts
 | Module           | Coverage                                                        |
 |------------------|-----------------------------------------------------------------|
 | `names.ts`       | `sanitizeName` character replacement, edge cases                |
-| `fs-lock.ts`     | `withLock` returns value, cleans up lock file                   |
-| `mailbox.ts`     | `writeToMailbox`, `popUnreadMessages`, read-once semantics      |
+| `model-policy.ts`| Deprecated model detection, teammate model selection/override   |
+| `teams-ui-shared.ts` | `isTeamDone` (9 edge cases), display helpers                |
+| `fs-lock.ts`     | `withLock` returns value, cleans up lock file, stale locks      |
+| `mailbox.ts`     | `writeToMailbox`, `popUnreadMessages`, read-once, urgent flag   |
 | `task-store.ts`  | CRUD, `startAssignedTask`, `completeTask`, `claimNextAvailable`,|
 |                  | `unassignTasksForAgent`, dependencies, `clearTasks`             |
-| `team-config.ts` | `ensureTeamConfig` (idempotent), `upsertMember`, `setMemberStatus`, `loadTeamConfig` |
+| `team-config.ts` | `ensureTeamConfig` (idempotent), `upsertMember`, `setMemberStatus`, `loadTeamConfig`, hooks policy |
 | `protocol.ts`    | Structured message parsers (valid + invalid JSON + wrong type)  |
-| Pi CLI           | `pi --version` executes (skipped in CI if `pi` not on PATH)     |
+| `teams-style.ts` | Custom styles, naming rules, pool naming                        |
+| `hooks.ts`       | Hook execution, failure policies, follow-up/reopen actions      |
+| `team-attach-claim.ts` | Acquire/release/heartbeat claims, staleness detection     |
+| `team-discovery.ts` | Team listing, config parsing, online worker counts           |
+| `/team done`     | End-of-run cleanup, force-done with in-progress tasks, idempotency |
+| `isTeamDone`     | All branches: empty, completed, pending, streaming, stopped     |
+| docs drift guard | README, SKILL.md, help text consistency checks                  |
+| Pi CLI           | `pi --version` executes (skipped in CI if `pi` not on PATH)    |
 **Expected result:** `PASSED: <n>  FAILED: 0`
@@ -95,7 +104,17 @@ Ask the model:
 **Expected:** the `teams` tool is invoked, task created and assigned.
-### 3g. Shutdown
+### 3g. End of run
+```
+/team done
+```
+**Expected:** all teammates stopped, widget hidden, summary notification shows task counts. If tasks are still in progress, use `/team done --force`.
+After `/team done`, spawning a new teammate should automatically restore the widget.
+### 3h. Shutdown (alternative to done)
 ```
 /team shutdown agent1
@@ -138,9 +157,9 @@ pi --no-extensions -e extensions/teams/index.ts --mode rpc
 | # | Test                          | Method     | Status |
 |---|-------------------------------|------------|--------|
-| 1 | Unit primitives (60 asserts)  | Automated  | ✅     |
+| 1 | Unit primitives (200+ asserts)| Automated  | ✅     |
 | 2 | Extension loading             | CLI        | ✅     |
-| 3 | Interactive spawn/task/dm     | Manual     | 📋     |
+| 3 | Interactive spawn/task/dm/done| Manual     | 📋     |
 | 4 | Worker-side RPC               | Manual     | 📋     |
 ✅ = verified in this run, 📋 = documented for manual execution.