@tintinweb/pi-subagents 0.6.3 → 0.7.1

package/CHANGELOG.md CHANGED
@@ -7,6 +7,43 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  ## [Unreleased]
 
+ ## [0.7.1] - 2026-05-07
+
+ > **Heads-up — behavior change:**
+ > - `isolation: "worktree"` now fails loud (returns an error) instead of silently falling back to the main tree. Affects users running pi in a non-git directory or a fresh repo with no commits.
+
+ ### Changed
+ - **`isolation: "worktree"` now fails loud instead of silently falling back.** Previously when `createWorktree` returned undefined (not a git repo, no commits yet, or `git worktree add` failed), the agent ran in the main `cwd` with a `[WARNING: ...]` block prepended to its prompt — visible only to the LLM, never surfaced to the caller. Now the failure throws a structured error that propagates back to the `Agent` tool response; no agent record is created. Failed scheduled fires are recorded as `lastStatus: "error"` with the reason in the `subagents:scheduled` error event. Queued background spawns whose worktree creation fails when they dequeue are marked terminal-error and don't block the rest of the queue.
+
+ ### Fixed
+
+ - **Headless `pi --print` runs no longer hang or crash after background
+ subagents complete.** Cleanup timers no longer keep the process alive, and
+ stale completion notifications are treated as best-effort shutdown side
+ effects.
+
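The headless-exit fix can be pictured with Node's timer API: an armed `setInterval` normally keeps the event loop alive, while calling `unref()` on it lets the process exit once other work is done. The following is a minimal sketch, assuming a Node.js runtime; `startCleanupTimer` and `UnrefableTimer` are hypothetical names for illustration, not part of the package.

```typescript
// Hypothetical helper: schedule periodic cleanup without pinning the process.
type UnrefableTimer = { unref(): unknown; hasRef(): boolean };

function startCleanupTimer(cleanup: () => void, everyMs: number): UnrefableTimer {
  // In Node.js, setInterval returns a Timeout object exposing unref()/hasRef().
  const timer = setInterval(cleanup, everyMs) as unknown as UnrefableTimer;
  timer.unref(); // the event loop may now exit even though the timer is armed
  return timer;
}

const cleanupTimer = startCleanupTimer(() => {
  /* drop stale agent records here */
}, 60_000);

console.log(cleanupTimer.hasRef()); // false
```

Without the `unref()` call, a script like this would stay alive indefinitely, waiting for the next tick of the interval.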
+ ## [0.7.0] - 2026-05-04
+
+ > **Heads-up — behavior changes:**
+ > - `subagents:completed`/`failed` event `tokens.total` now excludes `cacheRead` (previously double-counted across turns) — see Fixed [#38].
+ > - Cron `?` is now a wildcard (same as `*`), not "current time value" — affects Quartz-style expressions only.
+
+ ### Changed
+ - **`@mariozechner/pi-{ai,coding-agent,tui}` moved to `peerDependencies` (`>=0.70.5`).** Avoids duplicate framework instances when the host loads this extension.
+ - **`@sinclair/typebox` pinned from `latest` to `^0.34.49`** so installs are reproducible.
+ - **`croner` bumped 8 → 10.** Heads-up: in cron strings, `?` now means wildcard (same as `*`) instead of "current time value" — affects Quartz-style expressions only.
+
+ ### Added
+ - **Master switch for scheduling** — new `schedulingEnabled` setting (default `true`) under `/agents → Settings → Scheduling`. When set to `false`: the `schedule` parameter and its guideline are stripped from the `Agent` tool spec at registration (zero LLM-context cost), the scheduler does not bind to the session, the `/agents → Scheduled jobs` menu entry is hidden, and any in-flight scheduler is stopped immediately. The schema-level removal applies on next pi session; the runtime kill (menu, fire path) takes effect immediately. Persisted at `<cwd>/.pi/subagents.json`.
+ - **Schedule subagent spawns** — the `Agent` tool now accepts an optional `schedule` parameter. When set, the spawn registers a job that fires later instead of running immediately. Three formats: 6-field cron (`"0 0 9 * * 1"` — 9am every Monday), interval (`"5m"`, `"1h"`), or one-shot (`"+10m"` or ISO timestamp). Returns the job ID. Schedules are session-scoped — they reset on `/new`, restore on `/resume` (mirrors the persistence model of pi-chonky-tasks). Storage at `<cwd>/.pi/subagent-schedules/<sessionId>.json`, with PID-based file locking + atomic temp+rename for concurrent-instance safety. **Result delivery is identical to today's background-spawn completions**: when the scheduled agent finishes, the existing `subagent-notification` followUp path emits the result to the conversation — no new delivery code, no new message types. **Concurrency**: scheduled fires bypass `maxConcurrent` so a 5-minute interval can't be deferred behind 4 long-running manual agents. **Management**: `/agents` → "Scheduled jobs" lists active jobs and lets you cancel any one of them. Creation is via the `Agent` tool only — no parallel manual-create wizard in this iteration. **Events**: `subagents:scheduled` ({ type: "added" | "removed" | "updated" | "fired" | "error", … }) and `subagents:scheduler_ready` for cross-extension consumers. **Restrictions**: `schedule` is incompatible with `inherit_context` (no parent at fire time) and `resume` (schedules create fresh agents); forces `run_in_background: true`. Scheduler engine mirrors `pi-cron-schedule` (`croner` for cron, `setInterval`/`setTimeout` for interval/once); past one-shot timestamps and invalid cron expressions are caught at create time.
+ - **Context-window utilization indicator in the subagent overlay** — token count is now followed by a colored `(NN%)` showing how full the subagent's context is right now (`estimateContextTokens(messages) / model.contextWindow * 100`, sourced from upstream `contextUsage.percent`). Threshold colors: <70% dim, 70–85% warning, ≥85% error. Gracefully omitted when the model has no `contextWindow` declared, or right after compaction before the next assistant turn (`tokens` is `null` in that window). The same annotation slot also surfaces a compaction count `↻N` when the agent has compacted at least once — e.g. `12.3k token (84% · ↻3)` (percent + compactions joined with `·`), `12.3k token (↻1)` (compactions only, immediately post-compaction while percent is still null). The compaction glyph stays dim regardless; the percent's threshold color carries the urgency signal. Two live overlays get the annotations (running stats line; inspect-overlay header); post-completion notifications and result/event payloads only get the count (the indicator is no longer actionable once the agent is done).
+ - **Token usage and context% exposed to the parent agent** at every interaction surface — `get_subagent_result` adds `Context: NN%` to its stats line; `steer_subagent` returns a `Current state: 12.3k token · 5 tool uses · context 72% full` line so the steering agent knows whether it has room before sending more context; `task-notification` XML adds `<context_percent>NN</context_percent>` (omitted when null). All plain-text, no ANSI codes — designed for LLM consumption, not human display.
+ - **New `subagents:compacted` lifecycle event** fires when a subagent's session successfully compacts. Payload: `{ id, type, description, reason: "manual" | "threshold" | "overflow", tokensBefore, compactionCount }` — `tokensBefore` is upstream's pre-compaction context size estimate; `compactionCount` is the running total for this agent (also persisted on `AgentRecord.compactionCount` and surfaced in `get_subagent_result` / `steer_subagent` / `task-notification` when > 0). Aborted compactions don't fire. Routed through a new manager-level `onCompact` constructor callback, matching the existing `onStart` / `onComplete` pattern.
+
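A consumer of the new `subagents:compacted` event might look roughly like the sketch below. The payload fields follow the changelog entry above, but the field types are assumptions, and `TinyBus` is a hypothetical stand-in for the host event bus because the subscription side of `pi.events` is not shown in this diff.

```typescript
// Payload shape as listed in the changelog entry; the field types are assumed.
type CompactedPayload = {
  id: string;
  type: string;
  description: string;
  reason: "manual" | "threshold" | "overflow";
  tokensBefore: number;
  compactionCount: number;
};

// Hypothetical stand-in for the host event bus (not the real pi.events API).
class TinyBus {
  private handlers = new Map<string, Array<(payload: unknown) => void>>();

  on(eventName: string, handler: (payload: unknown) => void): void {
    const list = this.handlers.get(eventName) ?? [];
    list.push(handler);
    this.handlers.set(eventName, list);
  }

  emit(eventName: string, payload: unknown): void {
    for (const handler of this.handlers.get(eventName) ?? []) handler(payload);
  }
}

// Track agents that compact repeatedly; they may be running short on context.
const heavyCompactors = new Set<string>();
const bus = new TinyBus();
bus.on("subagents:compacted", (payload) => {
  const p = payload as CompactedPayload;
  if (p.compactionCount >= 2) heavyCompactors.add(p.id);
});

const firstCompaction: CompactedPayload = {
  id: "a1",
  type: "Explore",
  description: "demo agent",
  reason: "threshold",
  tokensBefore: 90_000,
  compactionCount: 1,
};
bus.emit("subagents:compacted", firstCompaction);
bus.emit("subagents:compacted", { ...firstCompaction, compactionCount: 2 });

console.log([...heavyCompactors]); // [ 'a1' ]
```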
+ ### Fixed
+ - **Subagent token count was inflated 5–15× and reset mid-run** ([#38](https://github.com/tintinweb/pi-subagents/issues/38)). Two distinct bugs in the same field. (1) Upstream `getSessionStats().tokens.total` sums per-turn `cacheRead` across every assistant message — but each turn's `cacheRead` is the *cumulative* cached prefix re-read on that one API call, so summing N turns counts the prefix N times (quadratic inflation, very visible on long sessions). (2) Even with that fixed, anything derived from `session.state.messages` resets at compaction because upstream replaces the array via `this.agent.state.messages = sessionContext.messages`. Fix replaces all six display readers with a lifetime accumulator (`AgentRecord.lifetimeUsage` and `AgentActivity.lifetimeUsage` — `{ input, output, cacheWrite }`) fed by a new `onAssistantUsage` callback dispatched from `message_end` events in both `runAgent` and `resumeAgent`. The accumulator is independent of `state.messages` mutation, so it survives compaction; total = input + output + cacheWrite by construction (cacheRead deliberately excluded — same prefix-double-counting reason). The `subagents:completed`/`failed` event payload's `tokens` field is now also lifetime-accumulated for `input`, `output`, and `total` together (was: `total` lifetime, `input`/`output` session-derived → inconsistent after compaction).
+ - **ESC during a foreground `Agent` call now actually stops the subagent** ([#44](https://github.com/tintinweb/pi-subagents/pull/44) — thanks [@Zeng-Zer](https://github.com/Zeng-Zer)). Pi's interrupt path is `esc → agent.abort()` on the parent → `AbortSignal` delivered to every tool's `execute(toolCallId, params, signal, …)`, but the `Agent` tool dropped that signal on the floor: subagents ran on their own independent `AbortController` inside `AgentManager`, so the parent abort was invisible and the subagent kept running until natural completion or `max_turns`. Fix threads `signal` through `Agent.execute` → `manager.spawnAndWait()` → `SpawnOptions.signal`, and `AgentManager.startAgent()` now attaches an `{ once: true }` `"abort"` listener that calls `this.abort(id)` (which sets `status: "stopped"` and aborts the child controller). The listener is detached in both `.then` and `.catch` to avoid leaking on natural settle. **Scope:** foreground only — background agents intentionally outlive the parent tool call, so their spawn deliberately does not forward `signal`. Resume path (`AgentManager.resume()`) has the same blind spot and is tracked as a follow-up.
+
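The parent-to-child abort wiring from the ESC fix can be reduced to a few lines of standard `AbortController` code. This is a simplified sketch: `linkAbort` is a hypothetical helper, and the child controller is aborted directly here, whereas the package routes the abort through `this.abort(id)`.

```typescript
// Hypothetical miniature of the parent-to-child abort wiring described above.
function linkAbort(parent: AbortSignal, child: AbortController): () => void {
  const onParentAbort = () => child.abort();
  parent.addEventListener("abort", onParentAbort, { once: true });
  // Return a detach function so a naturally settled child does not leak the listener.
  return () => parent.removeEventListener("abort", onParentAbort);
}

const parentCtl = new AbortController();
const foregroundChild = new AbortController();
const settledChild = new AbortController();

linkAbort(parentCtl.signal, foregroundChild);
const detachSettled = linkAbort(parentCtl.signal, settledChild);
detachSettled(); // this child finished on its own before the parent was interrupted

parentCtl.abort(); // e.g. the user pressed ESC during the foreground call

console.log(foregroundChild.signal.aborted, settledChild.signal.aborted); // true false
```

Detaching on settle matters because the parent signal can outlive many tool calls; without it, each finished subagent would leave a listener behind.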
  ## [0.6.3] - 2026-04-28
 
  ### Fixed
package/README.md CHANGED
@@ -28,8 +28,9 @@ https://github.com/user-attachments/assets/8685261b-9338-4fea-8dfe-1c590d5df543
  - **Skill preloading** — inject named skill files from `.pi/skills/` into agent system prompts
  - **Tool denylist** — block specific tools via `disallowed_tools` frontmatter
  - **Styled completion notifications** — background agent results render as themed, compact notification boxes (icon, stats, result preview) instead of raw XML. Expandable to show full output. Group completions render each agent individually
- - **Event bus** — lifecycle events (`subagents:created`, `started`, `completed`, `failed`, `steered`) emitted via `pi.events`, enabling other extensions to react to sub-agent activity
+ - **Event bus** — lifecycle events (`subagents:created`, `started`, `completed`, `failed`, `steered`, `compacted`) emitted via `pi.events`, enabling other extensions to react to sub-agent activity
  - **Cross-extension RPC** — other pi extensions can spawn and stop subagents via the `pi.events` event bus (`subagents:rpc:ping`, `subagents:rpc:spawn`, `subagents:rpc:stop`). Standardized reply envelopes with protocol versioning. Emits `subagents:ready` on load
+ - **Schedule subagents** — pass `schedule` to the `Agent` tool to fire on cron / interval / one-shot. Session-scoped jobs with PID-locked persistence; results land via the same `subagent-notification` followUp path as manual background completions; manage via `/agents → Scheduled jobs`
 
  ## Install
 
@@ -58,29 +59,67 @@ Agent({
 
  Foreground agents block until complete and return results inline. Background agents return an ID immediately and notify you on completion.
 
+ ### Scheduling
+
+ Add a `schedule` field to register the agent to fire later instead of running now:
+
+ ```
+ Agent({
+ subagent_type: "Explore",
+ prompt: "Look at recent commits and summarize what changed since last week",
+ description: "Weekly commit review",
+ schedule: "0 0 9 * * 1", // 9am every Monday (6-field cron)
+ })
+ ```
+
+ Schedule formats:
+
+ - **Cron** — 6-field (`second minute hour day-of-month month day-of-week`), e.g. `"0 0 9 * * 1"` for 9am every Monday, `"0 */15 * * * *"` for every 15 minutes.
+ - **Interval** — `"5m"`, `"1h"`, `"30s"`, `"2d"`. Fires repeatedly at that interval.
+ - **One-shot relative** — `"+10m"`, `"+2h"`, `"+1d"`. Fires once at that future time.
+ - **One-shot absolute** — full ISO timestamp, e.g. `"2026-12-25T09:00:00.000Z"`.
+
+ When a schedule fires, the spawn runs in background and its completion notification arrives in the conversation through the same `subagent-notification` followUp path as a manually-spawned background agent — your parent agent reasons about the result the same way.
+
+ Schedules are **session-scoped**: they reset on `/new` and restore on `/resume`. List and cancel via `/agents → Scheduled jobs` (creation is the `Agent` tool's job — there is no parallel manual-create wizard). Storage at `<cwd>/.pi/subagent-schedules/<sessionId>.json` with PID-based file locking for cross-instance safety.
+
+ **Disable the feature entirely**: `/agents → Settings → Scheduling → disabled` removes `schedule` from the `Agent` tool spec (no LLM-context cost), hides the menu entry, and stops any active scheduler. The schema-level removal takes effect on the next pi session; the runtime kill is immediate. Re-enable from the same menu.
+
+ Restrictions:
+ - `schedule` cannot be combined with `inherit_context` (no parent conversation exists at fire time) or `resume` (schedules create fresh agents).
+ - `run_in_background` is forced to `true`.
+ - Scheduled fires bypass the `maxConcurrent` queue so a 5-minute interval cannot be deferred behind long-running manual agents.
+ - **Headless `pi -p` doesn't wait for scheduled subagents.**
+
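To make the four schedule formats concrete, the sketch below classifies a `schedule` string into cron, interval, or one-shot. It is an illustration only: `classifySchedule`, `ScheduleKind`, and `UNIT_MS` are hypothetical names, the package's real parser is not shown in this diff, and cron field syntax is deliberately not validated here (the package delegates that to `croner`).

```typescript
type ScheduleKind =
  | { kind: "cron"; expr: string }
  | { kind: "interval"; everyMs: number }
  | { kind: "once"; at: Date };

const UNIT_MS: Record<string, number> = { s: 1_000, m: 60_000, h: 3_600_000, d: 86_400_000 };

// Hypothetical classifier for the documented formats.
function classifySchedule(input: string, now: Date = new Date()): ScheduleKind {
  const s = input.trim();

  // "5m" is a repeating interval; "+5m" is a one-shot relative to now.
  const rel = /^(\+?)(\d+)([smhd])$/.exec(s);
  if (rel) {
    const ms = Number(rel[2]) * UNIT_MS[rel[3]];
    if (ms <= 0) throw new Error(`Duration must be positive: ${input}`);
    return rel[1] === "+"
      ? { kind: "once", at: new Date(now.getTime() + ms) }
      : { kind: "interval", everyMs: ms };
  }

  // Six whitespace-separated fields are treated as cron (syntax not checked here).
  if (s.split(/\s+/).length === 6) return { kind: "cron", expr: s };

  // Anything else must parse as an absolute timestamp in the future.
  const at = new Date(s);
  if (Number.isNaN(at.getTime())) throw new Error(`Unrecognized schedule: ${input}`);
  if (at.getTime() <= now.getTime()) throw new Error(`One-shot time is in the past: ${input}`);
  return { kind: "once", at };
}

const refTime = new Date("2026-01-01T00:00:00.000Z");
console.log(classifySchedule("5m", refTime));          // interval, everyMs 300000
console.log(classifySchedule("+10m", refTime));        // once, ten minutes after refTime
console.log(classifySchedule("0 0 9 * * 1", refTime)); // cron
```

Rejecting past timestamps at classification time mirrors the documented behavior that past one-shot timestamps are caught at create time rather than silently never firing.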
  ## UI
 
  The extension renders a persistent widget above the editor showing all active agents:
 
  ```
  ● Agents
- ├─ ⠹ Agent Refactor auth module · ⟳5≤30 · 5 tool uses · 33.8k token · 12.3s
+ ├─ ⠹ Agent Refactor auth module · ⟳5≤30 · 5 tool uses · 33.8k token (62%) · 12.3s
  │ ⎿ editing 2 files…
- ├─ ⠹ Explore Find auth files · ⟳3 · 3 tool uses · 12.4k token · 4.1s
+ ├─ ⠹ Explore Find auth files · ⟳3 · 3 tool uses · 12.4k token (8%) · 4.1s
  │ ⎿ searching…
+ ├─ ⠹ Agent Long-running task · ⟳42 · 38 tool uses · 91.0k token (84% · ↻2) · 2m17s
+ │ ⎿ reading…
  └─ 2 queued
  ```
 
+ The token field is annotated with two optional signals inside parens:
+ - **`NN%`** — context-window utilization (color-coded: <70% dim, 70–85% warning, ≥85% error). Omitted when the model has no declared `contextWindow`, or briefly right after compaction.
+ - **`↻N`** — number of times the session has compacted, when > 0. Stays dim; the percent's color carries urgency.
+
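The annotation rules above are small enough to sketch directly. The thresholds and the `·` separator come from the text; the function names and the rounding of the percent are assumptions made for this example, not the package's code.

```typescript
type Severity = "dim" | "warning" | "error";

// Thresholds as documented above: <70% dim, 70-85% warning, >=85% error.
function severityFor(percent: number): Severity {
  if (percent >= 85) return "error";
  if (percent >= 70) return "warning";
  return "dim";
}

// Hypothetical formatter for the parenthesized annotation; returns "" when
// neither signal is available (no contextWindow and no compactions yet).
function formatAnnotation(percent: number | null, compactions: number): string {
  const parts: string[] = [];
  if (percent !== null) parts.push(`${Math.round(percent)}%`);
  if (compactions > 0) parts.push(`↻${compactions}`);
  return parts.length > 0 ? ` (${parts.join(" · ")})` : "";
}

console.log(`91.0k token${formatAnnotation(84, 2)}`);   // 91.0k token (84% · ↻2)
console.log(`12.3k token${formatAnnotation(null, 1)}`); // 12.3k token (↻1)
console.log(severityFor(84));                            // warning
```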
  Individual agent results render Claude Code-style in the conversation:
 
  | State | Example |
  |-------|---------|
- | **Running** | `⠹ ⟳3≤30 · 3 tool uses · 12.4k token` / `⎿ searching, reading 3 files…` |
- | **Completed** | `✓ ⟳8 · 5 tool uses · 33.8k token · 12.3s` / `⎿ Done` |
- | **Wrapped up** | `✓ ⟳50≤50 · 50 tool uses · 89.1k token · 45.2s` / `⎿ Wrapped up (turn limit)` |
- | **Stopped** | `■ ⟳3 · 3 tool uses · 12.4k token` / `⎿ Stopped` |
- | **Error** | `✗ ⟳3 · 3 tool uses · 12.4k token` / `⎿ Error: timeout` |
- | **Aborted** | `✗ ⟳55≤50 · 55 tool uses · 102.3k token` / `⎿ Aborted (max turns exceeded)` |
+ | **Running** | `⠹ ⟳3≤30 · 3 tool uses · 12.4k token (8%)` / `⎿ searching, reading 3 files…` |
+ | **Completed** | `✓ ⟳8 · 5 tool uses · 33.8k token (62%) · 12.3s` / `⎿ Done` |
+ | **Wrapped up** | `✓ ⟳50≤50 · 50 tool uses · 89.1k token (84% · ↻2) · 45.2s` / `⎿ Wrapped up (turn limit)` |
+ | **Stopped** | `■ ⟳3 · 3 tool uses · 12.4k token (8%)` / `⎿ Stopped` |
+ | **Error** | `✗ ⟳3 · 3 tool uses · 12.4k token (8%)` / `⎿ Error: timeout` |
+ | **Aborted** | `✗ ⟳55≤50 · 55 tool uses · 102.3k token (95% · ↻3)` / `⎿ Aborted (max turns exceeded)` |
 
  Completed results can be expanded (ctrl+o in pi) to show the full agent output inline.
 
@@ -304,13 +343,18 @@ Agent lifecycle events are emitted via `pi.events.emit()` so other extensions ca
  |-------|------|------------|
  | `subagents:created` | Background agent registered | `id`, `type`, `description`, `isBackground` |
  | `subagents:started` | Agent transitions to running (including queued→running) | `id`, `type`, `description` |
- | `subagents:completed` | Agent finished successfully | `id`, `type`, `durationMs`, `tokens`, `toolUses`, `result` |
+ | `subagents:completed` | Agent finished successfully | `id`, `type`, `durationMs`, `tokens` (lifetime `{ input, output, total }`), `toolUses`, `result` |
  | `subagents:failed` | Agent errored, stopped, or aborted | same as completed + `error`, `status` |
  | `subagents:steered` | Steering message sent | `id`, `message` |
+ | `subagents:compacted` | Agent's session successfully compacted | `id`, `type`, `description`, `reason` (`"manual"` / `"threshold"` / `"overflow"`), `tokensBefore`, `compactionCount` |
+ | `subagents:scheduled` | Schedule lifecycle change | `{ type: "added" \| "removed" \| "updated" \| "fired" \| "error", … }` (job/agentId/error fields per type) |
+ | `subagents:scheduler_ready` | Scheduler bound to session, enabled jobs armed | `sessionId`, `jobCount` |
  | `subagents:ready` | Extension loaded and RPC handlers registered | — |
  | `subagents:settings_loaded` | Persisted settings applied at extension init | `settings` (merged global + project) |
  | `subagents:settings_changed` | `/agents` → Settings mutation was applied | `settings`, `persisted` (`boolean` — `false` on write failure) |
 
+ `tokens.total` = `input + output + cacheWrite`. `cacheRead` is excluded — each turn's `cacheRead` is the cumulative cached prefix re-read on that one API call, so summing per-message would over-count it. Use `contextUsage.percent` (surfaced as `(NN%)` in the widget) for current context size.
+
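A small worked example shows why `cacheRead` is left out. The per-turn numbers below are made up for illustration, and this `addUsage` is a guess at what the helper imported from `./usage.js` does, since its body is not part of this diff.

```typescript
type Usage = { input: number; output: number; cacheWrite: number };

// Hypothetical equivalent of the `addUsage` helper; its real body is not shown in the diff.
function addUsage(total: Usage, delta: Usage): void {
  total.input += delta.input;
  total.output += delta.output;
  total.cacheWrite += delta.cacheWrite;
}

// Made-up per-turn usage: every turn re-reads the whole cached prefix so far,
// so cacheRead grows with the conversation even though no new tokens are involved.
const turns = [
  { input: 100, output: 50, cacheWrite: 1_000, cacheRead: 0 },
  { input: 100, output: 50, cacheWrite: 150, cacheRead: 1_000 },
  { input: 100, output: 50, cacheWrite: 150, cacheRead: 1_150 },
];

const lifetime: Usage = { input: 0, output: 0, cacheWrite: 0 };
let naiveTotal = 0;
for (const turn of turns) {
  addUsage(lifetime, turn);
  naiveTotal += turn.input + turn.output + turn.cacheWrite + turn.cacheRead;
}
const lifetimeTotal = lifetime.input + lifetime.output + lifetime.cacheWrite;

console.log(lifetimeTotal, naiveTotal); // 1750 3900
```

With only three turns the naive sum is already more than double the lifetime total, and the gap widens with every further turn because the re-read prefix keeps growing.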
  ## Cross-Extension RPC
 
  Other pi extensions can spawn and stop subagents programmatically via the `pi.events` event bus, without importing this package directly.
@@ -409,7 +453,7 @@ The agent gets a full, isolated copy of the repository. On completion:
  - **No changes:** worktree is cleaned up automatically
  - **Changes made:** changes are committed to a new branch (`pi-agent-<id>`) and returned in the result
 
- If the worktree cannot be created (not a git repo, no commits), the agent falls back to the main working directory with a warning.
+ If the worktree cannot be created (not a git repo, no commits, or `git worktree add` fails), the `Agent` tool returns a clear error instead of running unisolated — `isolation: "worktree"` is a strict guarantee, not a hint. Initialize git and commit at least once, or omit `isolation`.
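The strict behavior can be sketched as a tiny manager that validates isolation before touching any state and removes the record if the start fails. `MiniManager` and its injected `createWorktree` function are hypothetical; injecting the function lets the sketch run without git and is not how the package is wired.

```typescript
type Worktree = { path: string };
type MiniRecord = { id: string; state: "queued" | "running"; worktree?: Worktree };

// Hypothetical miniature of the strict-isolation flow.
class MiniManager {
  readonly agents = new Map<string, MiniRecord>();
  private createWorktree: (id: string) => Worktree | undefined;

  constructor(createWorktree: (id: string) => Worktree | undefined) {
    this.createWorktree = createWorktree;
  }

  spawn(id: string, isolation?: "worktree"): string {
    const record: MiniRecord = { id, state: "queued" };
    this.agents.set(id, record);
    try {
      this.start(record, isolation);
    } catch (err) {
      this.agents.delete(id); // no orphan record after a failed start
      throw err;
    }
    return id;
  }

  private start(record: MiniRecord, isolation?: "worktree"): void {
    if (isolation === "worktree") {
      const wt = this.createWorktree(record.id);
      if (!wt) throw new Error('Cannot run with isolation: "worktree"');
      record.worktree = wt;
    }
    record.state = "running"; // mutated only after isolation succeeded
  }
}

// Simulate a directory where worktree creation always fails.
const noGit = new MiniManager(() => undefined);
let failure = "";
try {
  noGit.spawn("a1", "worktree");
} catch (err) {
  failure = err instanceof Error ? err.message : String(err);
}

console.log(failure);           // Cannot run with isolation: "worktree"
console.log(noGit.agents.size); // 0
```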
 
  ## Skill Preloading
 
@@ -11,6 +11,11 @@ import { type ToolActivity } from "./agent-runner.js";
  import type { AgentRecord, IsolationMode, SubagentType, ThinkingLevel } from "./types.js";
  export type OnAgentComplete = (record: AgentRecord) => void;
  export type OnAgentStart = (record: AgentRecord) => void;
+ export type OnAgentCompact = (record: AgentRecord, info: CompactionInfo) => void;
+ export type CompactionInfo = {
+ reason: "manual" | "threshold" | "overflow";
+ tokensBefore: number;
+ };
  interface SpawnOptions {
  description: string;
  model?: Model<any>;
@@ -19,8 +24,16 @@ interface SpawnOptions {
  inheritContext?: boolean;
  thinkingLevel?: ThinkingLevel;
  isBackground?: boolean;
+ /**
+ * Skip the maxConcurrent queue check for this spawn — start immediately even
+ * if the configured concurrency limit would otherwise queue it. Used by the
+ * scheduler so a fired job can't be deferred past its trigger window.
+ */
+ bypassQueue?: boolean;
  /** Isolation mode — "worktree" creates a temp git worktree for the agent. */
  isolation?: IsolationMode;
+ /** Parent abort signal — when aborted, the subagent is also stopped. */
+ signal?: AbortSignal;
  /** Called on tool start/end with activity info (for streaming progress to UI). */
  onToolActivity?: (activity: ToolActivity) => void;
  /** Called on streaming text deltas from the assistant response. */
@@ -29,18 +42,27 @@ interface SpawnOptions {
  onSessionCreated?: (session: AgentSession) => void;
  /** Called at the end of each agentic turn with the cumulative count. */
  onTurnEnd?: (turnCount: number) => void;
+ /** Called once per assistant message_end with that message's usage delta. */
+ onAssistantUsage?: (usage: {
+ input: number;
+ output: number;
+ cacheWrite: number;
+ }) => void;
+ /** Called when the session successfully compacts. */
+ onCompaction?: (info: CompactionInfo) => void;
  }
  export declare class AgentManager {
  private agents;
  private cleanupInterval;
  private onComplete?;
  private onStart?;
+ private onCompact?;
  private maxConcurrent;
  /** Queue of background agents waiting to start. */
  private queue;
  /** Number of currently running background agents. */
  private runningBackground;
- constructor(onComplete?: OnAgentComplete, maxConcurrent?: number, onStart?: OnAgentStart);
+ constructor(onComplete?: OnAgentComplete, maxConcurrent?: number, onStart?: OnAgentStart, onCompact?: OnAgentCompact);
  /** Update the max concurrent background agents limit. */
  setMaxConcurrent(n: number): void;
  getMaxConcurrent(): number;
@@ -7,6 +7,7 @@
  */
  import { randomUUID } from "node:crypto";
  import { resumeAgent, runAgent } from "./agent-runner.js";
+ import { addUsage } from "./usage.js";
  import { cleanupWorktree, createWorktree, pruneWorktrees, } from "./worktree.js";
  /** Default max concurrent background agents. */
  const DEFAULT_MAX_CONCURRENT = 4;
@@ -15,17 +16,20 @@ export class AgentManager {
  cleanupInterval;
  onComplete;
  onStart;
+ onCompact;
  maxConcurrent;
  /** Queue of background agents waiting to start. */
  queue = [];
  /** Number of currently running background agents. */
  runningBackground = 0;
- constructor(onComplete, maxConcurrent = DEFAULT_MAX_CONCURRENT, onStart) {
+ constructor(onComplete, maxConcurrent = DEFAULT_MAX_CONCURRENT, onStart, onCompact) {
  this.onComplete = onComplete;
  this.onStart = onStart;
+ this.onCompact = onCompact;
  this.maxConcurrent = maxConcurrent;
  // Cleanup completed agents after 10 minutes (but keep sessions for resume)
  this.cleanupInterval = setInterval(() => this.cleanup(), 60_000);
+ this.cleanupInterval.unref();
  }
  /** Update the max concurrent background agents limit. */
  setMaxConcurrent(n) {
@@ -51,40 +55,56 @@ export class AgentManager {
  toolUses: 0,
  startedAt: Date.now(),
  abortController,
+ lifetimeUsage: { input: 0, output: 0, cacheWrite: 0 },
+ compactionCount: 0,
  };
  this.agents.set(id, record);
  const args = { pi, ctx, type, prompt, options };
- if (options.isBackground && this.runningBackground >= this.maxConcurrent) {
+ if (options.isBackground && !options.bypassQueue && this.runningBackground >= this.maxConcurrent) {
  // Queue it — will be started when a running agent completes
  this.queue.push({ id, args });
  return id;
  }
- this.startAgent(id, record, args);
+ // startAgent can throw (e.g. strict worktree-isolation failure) — clean
+ // up the record so callers don't see an orphan in `listAgents()`.
+ try {
+ this.startAgent(id, record, args);
+ }
+ catch (err) {
+ this.agents.delete(id);
+ throw err;
+ }
  return id;
  }
  /** Actually start an agent (called immediately or from queue drain). */
  startAgent(id, record, { pi, ctx, type, prompt, options }) {
+ // Worktree isolation: try to create a temporary git worktree. Strict —
+ // fail loud if not possible (no silent fallback to main tree). Done
+ // BEFORE state mutation so a throw doesn't leave the record half-running.
+ let worktreeCwd;
+ if (options.isolation === "worktree") {
+ const wt = createWorktree(ctx.cwd, id);
+ if (!wt) {
+ throw new Error('Cannot run with isolation: "worktree" — not a git repo, no commits yet, or `git worktree add` failed. ' +
+ 'Initialize git and commit at least once, or omit `isolation`.');
+ }
+ record.worktree = wt;
+ worktreeCwd = wt.path;
+ }
  record.status = "running";
  record.startedAt = Date.now();
  if (options.isBackground)
  this.runningBackground++;
  this.onStart?.(record);
- // Worktree isolation: create a temporary git worktree if requested
- let worktreeCwd;
- let worktreeWarning = "";
- if (options.isolation === "worktree") {
- const wt = createWorktree(ctx.cwd, id);
- if (wt) {
- record.worktree = wt;
- worktreeCwd = wt.path;
- }
- else {
- worktreeWarning = "\n\n[WARNING: Worktree isolation was requested but failed (not a git repo, or no commits yet). Running in the main working directory instead.]";
- }
+ // Wire parent abort signal to stop the subagent when the parent is interrupted
+ let detachParentSignal;
+ if (options.signal) {
+ const onParentAbort = () => this.abort(id);
+ options.signal.addEventListener("abort", onParentAbort, { once: true });
+ detachParentSignal = () => options.signal.removeEventListener("abort", onParentAbort);
  }
- // Prepend worktree warning to prompt if isolation failed
- const effectivePrompt = worktreeWarning ? worktreeWarning + "\n\n" + prompt : prompt;
- const promise = runAgent(ctx, type, effectivePrompt, {
+ const detach = () => { detachParentSignal?.(); detachParentSignal = undefined; };
+ const promise = runAgent(ctx, type, prompt, {
  pi,
  model: options.model,
  maxTurns: options.maxTurns,
@@ -100,6 +120,15 @@ export class AgentManager {
  },
  onTurnEnd: options.onTurnEnd,
  onTextDelta: options.onTextDelta,
+ onAssistantUsage: (usage) => {
+ addUsage(record.lifetimeUsage, usage);
+ options.onAssistantUsage?.(usage);
+ },
+ onCompaction: (info) => {
+ record.compactionCount++;
+ this.onCompact?.(record, info);
+ options.onCompaction?.(info);
+ },
  onSessionCreated: (session) => {
  record.session = session;
  // Flush any steers that arrived before the session was ready
@@ -120,6 +149,7 @@ export class AgentManager {
  record.result = responseText;
  record.session = session;
  record.completedAt ??= Date.now();
+ detach();
  // Final flush of streaming output file
  if (record.outputCleanup) {
  try {
@@ -139,7 +169,10 @@ export class AgentManager {
  }
  if (options.isBackground) {
  this.runningBackground--;
- this.onComplete?.(record);
+ try {
+ this.onComplete?.(record);
+ }
+ catch { /* ignore completion side-effect errors */ }
  this.drainQueue();
  }
  return responseText;
@@ -151,6 +184,7 @@ export class AgentManager {
  }
  record.error = err instanceof Error ? err.message : String(err);
  record.completedAt ??= Date.now();
+ detach();
  // Final flush of streaming output file on error
  if (record.outputCleanup) {
  try {
@@ -183,7 +217,17 @@ export class AgentManager {
  const record = this.agents.get(next.id);
  if (!record || record.status !== "queued")
  continue;
- this.startAgent(next.id, record, next.args);
+ try {
+ this.startAgent(next.id, record, next.args);
+ }
+ catch (err) {
+ // Late failure (e.g. strict worktree-isolation) — surface on the record
+ // so the user/agent can see it via /agents, then keep draining.
+ record.status = "error";
+ record.error = err instanceof Error ? err.message : String(err);
+ record.completedAt = Date.now();
+ this.onComplete?.(record);
+ }
  }
  }
  /**
@@ -214,6 +258,13 @@ export class AgentManager {
  if (activity.type === "end")
  record.toolUses++;
  },
+ onAssistantUsage: (usage) => {
+ addUsage(record.lifetimeUsage, usage);
+ },
+ onCompaction: (info) => {
+ record.compactionCount++;
+ this.onCompact?.(record, info);
+ },
  signal,
  });
  record.status = "completed";
@@ -38,6 +38,24 @@ export interface RunOptions {
  onSessionCreated?: (session: AgentSession) => void;
  /** Called at the end of each agentic turn with the cumulative count. */
  onTurnEnd?: (turnCount: number) => void;
+ /**
+ * Called once per assistant message_end with that message's usage delta.
+ * Lets callers maintain a lifetime accumulator that survives compaction
+ * (which replaces session.state.messages and resets stats-derived sums).
+ */
+ onAssistantUsage?: (usage: {
+ input: number;
+ output: number;
+ cacheWrite: number;
+ }) => void;
+ /**
+ * Called when the session successfully compacts. `tokensBefore` is upstream's
+ * pre-compaction context size estimate. Aborted compactions don't fire.
+ */
+ onCompaction?: (info: {
+ reason: "manual" | "threshold" | "overflow";
+ tokensBefore: number;
+ }) => void;
  }
  export interface RunResult {
  responseText: string;
@@ -53,6 +71,15 @@ export declare function runAgent(ctx: ExtensionContext, type: SubagentType, prom
  */
  export declare function resumeAgent(session: AgentSession, prompt: string, options?: {
  onToolActivity?: (activity: ToolActivity) => void;
+ onAssistantUsage?: (usage: {
+ input: number;
+ output: number;
+ cacheWrite: number;
+ }) => void;
+ onCompaction?: (info: {
+ reason: "manual" | "threshold" | "overflow";
+ tokensBefore: number;
+ }) => void;
  signal?: AbortSignal;
  }): Promise<string>;
  /**
@@ -261,6 +261,18 @@ export async function runAgent(ctx, type, prompt, options) {
  if (event.type === "tool_execution_end") {
  options.onToolActivity?.({ type: "end", toolName: event.toolName });
  }
+ if (event.type === "message_end" && event.message.role === "assistant") {
+ const u = event.message.usage;
+ if (u)
+ options.onAssistantUsage?.({
+ input: u.input ?? 0,
+ output: u.output ?? 0,
+ cacheWrite: u.cacheWrite ?? 0,
+ });
+ }
+ if (event.type === "compaction_end" && !event.aborted && event.result) {
+ options.onCompaction?.({ reason: event.reason, tokensBefore: event.result.tokensBefore });
+ }
  });
  const collector = collectResponseText(session);
  const cleanupAbort = forwardAbortSignal(session, options.signal);
@@ -289,12 +301,24 @@ export async function runAgent(ctx, type, prompt, options) {
  export async function resumeAgent(session, prompt, options = {}) {
  const collector = collectResponseText(session);
  const cleanupAbort = forwardAbortSignal(session, options.signal);
- const unsubToolUse = options.onToolActivity
+ const unsubEvents = (options.onToolActivity || options.onAssistantUsage || options.onCompaction)
  ? session.subscribe((event) => {
  if (event.type === "tool_execution_start")
- options.onToolActivity({ type: "start", toolName: event.toolName });
+ options.onToolActivity?.({ type: "start", toolName: event.toolName });
  if (event.type === "tool_execution_end")
- options.onToolActivity({ type: "end", toolName: event.toolName });
+ options.onToolActivity?.({ type: "end", toolName: event.toolName });
+ if (event.type === "message_end" && event.message.role === "assistant") {
+ const u = event.message.usage;
+ if (u)
+ options.onAssistantUsage?.({
+ input: u.input ?? 0,
+ output: u.output ?? 0,
+ cacheWrite: u.cacheWrite ?? 0,
+ });
+ }
+ if (event.type === "compaction_end" && !event.aborted && event.result) {
+ options.onCompaction?.({ reason: event.reason, tokensBefore: event.result.tokensBefore });
+ }
  })
  : () => { };
  try {
@@ -302,7 +326,7 @@ export async function resumeAgent(session, prompt, options = {}) {
  }
  finally {
  collector.unsubscribe();
- unsubToolUse();
+ unsubEvents();
  cleanupAbort();
  }
  return collector.getText().trim() || getLastAssistantText(session);