npm - @melihmucuk/pi-crew - Versions diffs - 1.0.18 → 1.0.19 - Mend

@melihmucuk/pi-crew 1.0.18 → 1.0.19

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/agents/code-reviewer.md +16 -11
package/agents/quality-reviewer.md +8 -17
package/extension/crew.ts +1 -7
package/extension/index.ts +33 -15
package/extension/subagent-session.ts +1 -14
package/extension/ui.ts +0 -18
package/package.json +6 -6
package/prompts/pi-crew-review.md +25 -16
package/skills/pi-crew/SKILL.md +3 -1
package/extension/overflow-recovery.ts +0 -211

package/agents/code-reviewer.md CHANGED

@@ -1,26 +1,32 @@
 ---
 name: code-reviewer
-description: Reviews changed code for actionable bugs. Read-only.
+description: Reviews scoped code for actionable bugs. Read-only.
 model: openai-codex/gpt-5.2
 thinking: high
 tools: read, grep, find, ls, bash
 ---
-You are a read-only code reviewer. Your goal is not to find something; it is to decide whether the changed code contains realistic, actionable bugs. An empty review is a valid successful outcome. Reply in the user's language.
+You are a read-only code reviewer. Your goal is not to find something; it is to decide whether the reviewed scope contains realistic, actionable bugs. An empty review is a valid successful outcome. Reply in the user's language.
 Do not modify files. Use bash only for read-only inspection. Do not run builds, tests, typechecks, formatters, installers, or commands that may change project state.
 ## Scope
-Review the provided scope. If none is provided, review uncommitted changes. For commits, branches, PRs, files, or "latest" requests, inspect the corresponding diff. If "latest" is requested, review the last 5 commits unless a count is given.
+Review the provided scope. If none is provided, review uncommitted changes.
-For large or broad diffs, summarize coverage by area with brief risk notes, then deeply review only the highest-risk changed files: business logic, auth, data mutation, error handling, and public APIs. Avoid exhaustive file inventories.
+For commits, branches, PRs, files, directories, modules, or "latest" requests, inspect the corresponding diff or code. If "latest" is requested, review the last 5 commits unless a count is given.
-Review changed-code issues only. Pre-existing code is reportable only when the change triggers it or makes it relevant.
+If "full", "codebase", or whole-repo review is requested, perform a bounded bug audit: map the highest-risk areas, deeply inspect selected files, state coverage/skipped areas briefly, and do not imply exhaustive coverage.
+For large or broad scopes, prioritize highest-risk areas: business logic, auth/security, data mutation, persistence, external integrations, concurrency/async, error handling, and public APIs.
+For changed-code scopes, report pre-existing issues only when the change triggers or makes them relevant. For full-codebase scopes, report existing issues only when directly evidenced, realistically triggerable, and worth acting on now.
 ## Method
-Diffs are not enough. Before reporting a finding, read the full changed file involved. Trace direct callers/callees or nearby patterns only when needed. Check local conventions only when relevant. Stop expanding context when it stops adding evidence.
+Diffs are not enough. Before reporting a finding, read the full relevant file involved. Trace direct callers/callees or nearby patterns only when needed. Check local conventions only when relevant. Stop expanding context when it stops adding evidence.
+For full-codebase scopes, make findings only from files and paths you directly inspected; verify any caller, route, config, schema, or runtime assumption the finding depends on.
 Do not report findings from skipped or unreviewed files. A finding requires direct inspection of the relevant file or diff context; if a file was skipped, only mention it as skipped, not as evidence for a finding.
@@ -40,17 +46,15 @@ Report the same finding pattern at most twice, then list other affected location
 ## Severity
-- Critical: proven realistic security, data loss, or severe breakage.
-- Major: realistic bug likely to affect users, developers, or operations.
-- Minor: real non-blocking bug or high-risk coverage gap.
+- Critical: urgent, high-impact issue within this reviewer's scope that can cause severe user, data, security, operational, or near-term development breakage.
+- Major: realistic issue within this reviewer's scope likely to affect users, developers, operations, or maintainability enough to act on soon.
+- Minor: real but non-blocking issue within this reviewer's scope, localized maintenance friction, or high-risk coverage gap.
 ## Output
 If no findings:
 **No issues found.**
-Reviewed: [files]
-Overall confidence: [high/medium]
 For each finding:
@@ -58,6 +62,7 @@ For each finding:
 File: `path:line`
 Issue: what is wrong
 Evidence: what you verified
+Impact: concrete consequence
 Fix: suggested correction
 Be direct, concise, and unpadded.

package/agents/quality-reviewer.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: quality-reviewer
-description: Reviews changed code for maintainability, duplication, and complexity. Read-only.
+description: Reviews scoped code for maintainability, duplication, and complexity. Read-only.
 model: openai-codex/gpt-5.2
 thinking: high
 tools: read, grep, find, ls, bash
@@ -16,7 +16,7 @@ Do not modify files. Use bash only for read-only inspection. Do not run builds,
 Review the provided scope. If none is provided, review uncommitted changes. For files, directories, modules, commits, branches, PRs, or "latest" requests, inspect the corresponding code or diff. If "latest" is requested, review the last 5 commits unless a count is given.
-If "full" or "codebase" is requested, first produce a structural risk map, then deeply review only the highest-risk areas.
+If "full", "codebase", or whole-repo review is requested, first produce a structural risk map, then deeply review only the highest-risk areas, state coverage/skipped areas briefly, and do not imply exhaustive coverage.
 For large or broad scopes, summarize coverage by area with brief structural notes, then deeply review the highest-risk areas/files: large files, dependency-heavy files, widely imported files, or files crossing module boundaries. Avoid exhaustive file inventories; state skipped areas briefly.
@@ -48,32 +48,23 @@ Default stance: no new abstraction unless it reduces present-day duplication or
 ## Severity
-- High: structure will materially hinder near-term changes or debugging.
-- Medium: noticeable maintenance friction with concrete evidence.
-- Minor: small structural friction on a realistic future change/debug path.
+- Critical: urgent, high-impact issue within this reviewer's scope that can cause severe user, data, security, operational, or near-term development breakage.
+- Major: realistic issue within this reviewer's scope likely to affect users, developers, operations, or maintainability enough to act on soon.
+- Minor: real but non-blocking issue within this reviewer's scope, localized maintenance friction, or high-risk coverage gap.
 ## Output
 If no findings:
 **No issues found.**
-Reviewed: [files]
-Overall health: [brief assessment]
 For each finding:
 **[SEVERITY] Category: Title**
 File: `path:line`
-Issue: structural problem
-Impact: concrete future change/debug task made harder
+Issue: what is wrong
 Evidence: what you verified
-Fix: specific refactoring approach
-End with:
-**Quality Review Summary**
-Files reviewed: [count]
-Findings: [count by severity]
-Overall health: [one sentence]
+Impact: concrete consequence
+Fix: suggested correction
 Be direct, concise, and unpadded.

package/extension/crew.ts CHANGED

@@ -8,7 +8,6 @@ import {
 	type SendMessageFn,
 	type SteeringPayload,
 	type SubagentStatus,
-	sendRemainingNote,
 	sendSteeringMessage,
 } from "./ui.js";
@@ -36,7 +35,6 @@ export interface SubagentState {
 	model: string | undefined;
 	error?: string;
 	result?: string;
-	promptAbortController?: AbortController;
 	unsubscribe?: () => void;
 }
@@ -271,7 +269,6 @@ export class CrewRuntime {
 	private disposeAgent(state: SubagentState): void {
 		state.unsubscribe?.();
-		state.promptAbortController = undefined;
 		state.session?.dispose();
 		this.agents.delete(state.id);
 		this.refreshWidgetFor(state.ownerSessionId);
@@ -320,8 +317,6 @@ export class CrewRuntime {
 	private schedulePendingFlushFor(sessionId: string): void {
 		if (!this.pendingMessages.some((entry) => entry.ownerSessionId === sessionId)) return;
-		// Delay flush to next macrotask. session_start fires before pi-core reconnects the
-		// agent event listener; synchronous delivery can lose JSONL persistence.
 		this.flushScheduled = true;
 		this.scheduleFlush(() => {
 			this.flushScheduled = false;
@@ -370,10 +365,9 @@ export class CrewRuntime {
 		const remaining = this.countRunningForOwner(ownerSessionId, payload.id);
 		const isIdle = this.activeBinding.isIdle();
-		const triggerResultTurn = !(isIdle && remaining > 0);
+		const triggerResultTurn = payload.status === "waiting" || !(isIdle && remaining > 0);
 		sendSteeringMessage(payload, this.activeBinding.sendMessage, { isIdle, triggerTurn: triggerResultTurn });
-		sendRemainingNote(remaining, this.activeBinding.sendMessage, { isIdle, triggerTurn: isIdle && remaining > 0 });
 	}
 }
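
The `triggerResultTurn` change in the last hunk above can be read as a small truth table. The following is a minimal sketch, not code from the package: the helper names `shouldTriggerResultTurn` and `shouldTriggerResultTurnOld` are hypothetical, the status is typed as a plain `string`, and `"error"` is used only as an illustrative non-waiting status.

```typescript
// Hypothetical helper mirroring the 1.0.19 expression; not part of the package.
function shouldTriggerResultTurn(status: string, isIdle: boolean, remaining: number): boolean {
	// A "waiting" subagent always triggers a turn. Otherwise the turn is
	// suppressed only while the owner is idle and other subagents still run.
	return status === "waiting" || !(isIdle && remaining > 0);
}

// The 1.0.18 expression for comparison: the status was not consulted.
function shouldTriggerResultTurnOld(isIdle: boolean, remaining: number): boolean {
	return !(isIdle && remaining > 0);
}

console.log(shouldTriggerResultTurnOld(true, 2)); // false
console.log(shouldTriggerResultTurn("waiting", true, 2)); // true
console.log(shouldTriggerResultTurn("error", true, 2)); // false
console.log(shouldTriggerResultTurn("error", true, 0)); // true
```

Under these assumptions, the only input combination whose result differs between the two versions is a `"waiting"` payload arriving while the owner session is idle and other subagents are still running.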

package/extension/index.ts CHANGED

@@ -1,12 +1,25 @@
 import { dirname } from "node:path";
 import { fileURLToPath } from "node:url";
 import type { ExtensionAPI, ExtensionContext } from "@earendil-works/pi-coding-agent";
-import { crewRuntime } from "./crew.js";
+import { crewRuntime, type CrewRuntime } from "./crew.js";
 import { registerCrewTools } from "./tools.js";
 import { registerCrewMessageRenderers, updateWidget } from "./ui.js";
 const extensionDir = dirname(fileURLToPath(import.meta.url));
+interface ProcessHooks {
+	once(event: "SIGINT", listener: () => void): unknown;
+	on(event: "beforeExit", listener: () => void): unknown;
+	exit(code?: number): never;
+}
+interface RegisterPiCrewExtensionOptions {
+	crew?: CrewRuntime;
+	extensionDir?: string;
+	processHooks?: ProcessHooks;
+	processHooksSetupKey?: symbol;
+}
 // Process-level cleanup for subagents on exit
 const processHooksSetupKey = Symbol.for("pi-crew.processHooksSetup");
 const globalWithProcessHooks = globalThis as typeof globalThis & Record<
@@ -14,29 +27,30 @@ const globalWithProcessHooks = globalThis as typeof globalThis & Record<
 	boolean | undefined
 >;
-function setupProcessHooks() {
-	if (globalWithProcessHooks[processHooksSetupKey]) return;
-	globalWithProcessHooks[processHooksSetupKey] = true;
+function setupProcessHooks(crew: CrewRuntime, processHooks: ProcessHooks, setupKey: symbol) {
+	if (globalWithProcessHooks[setupKey]) return;
+	globalWithProcessHooks[setupKey] = true;
-	process.once("SIGINT", () => {
-		crewRuntime.abortAll();
-		process.exit(130);
+	processHooks.once("SIGINT", () => {
+		crew.abortAll();
+		processHooks.exit(130);
 	});
-	process.on("beforeExit", () => crewRuntime.abortAll());
+	processHooks.on("beforeExit", () => crew.abortAll());
 }
-export default function (pi: ExtensionAPI) {
+export function registerPiCrewExtension(pi: ExtensionAPI, options: RegisterPiCrewExtensionOptions = {}) {
+	const crew = options.crew ?? crewRuntime;
 	let currentCtx: ExtensionContext | undefined;
-	setupProcessHooks();
+	setupProcessHooks(crew, options.processHooks ?? process, options.processHooksSetupKey ?? processHooksSetupKey);
 	const refreshWidget = () => {
-		if (currentCtx) updateWidget(currentCtx, crewRuntime);
+		if (currentCtx) updateWidget(currentCtx, crew);
 	};
 	const activateSession = (ctx: ExtensionContext) => {
 		currentCtx = ctx;
-		crewRuntime.activateSession(
+		crew.activateSession(
 			{
 				sessionId: ctx.sessionManager.getSessionId(),
 				isIdle: () => ctx.isIdle(),
@@ -52,13 +66,17 @@ export default function (pi: ExtensionAPI) {
 	pi.on("session_shutdown", (event, ctx) => {
 		const sessionId = ctx.sessionManager.getSessionId();
-		crewRuntime.deactivateSession(sessionId);
+		crew.deactivateSession(sessionId);
 		if (event.reason === "quit") {
-			crewRuntime.abortAll();
+			crew.abortAll();
 		}
 	});
-	registerCrewTools(pi, crewRuntime, extensionDir);
+	registerCrewTools(pi, crew, options.extensionDir ?? extensionDir);
 	registerCrewMessageRenderers(pi);
 }
+export default function (pi: ExtensionAPI) {
+	registerPiCrewExtension(pi);
+}
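
The new `processHooks` and `processHooksSetupKey` options suggest the process-level cleanup can now be exercised without touching the real `process`. The sketch below is self-contained and only illustrates that idea: `setupProcessHooks` and `ProcessHooks` are re-declared locally from the diff, while `CrewLike`, `ExitCalled`, `fakeHooks`, `fakeCrew`, and `testKey` are hypothetical names introduced for the example.

```typescript
interface ProcessHooks {
	once(event: "SIGINT", listener: () => void): unknown;
	on(event: "beforeExit", listener: () => void): unknown;
	exit(code?: number): never;
}

// Hypothetical stand-in for CrewRuntime; only the method the hooks call is modelled.
interface CrewLike {
	abortAll(): void;
}

const registry = globalThis as typeof globalThis & Record<symbol, boolean | undefined>;

// Local copy of the 1.0.19 function shape, typed against CrewLike.
function setupProcessHooks(crew: CrewLike, hooks: ProcessHooks, setupKey: symbol): void {
	if (registry[setupKey]) return;
	registry[setupKey] = true;
	hooks.once("SIGINT", () => {
		crew.abortAll();
		hooks.exit(130);
	});
	hooks.on("beforeExit", () => crew.abortAll());
}

// exit() must never return, so the fake throws a marker error instead.
class ExitCalled extends Error {
	code: number | undefined;
	constructor(code: number | undefined) {
		super(`exit(${code})`);
		this.code = code;
	}
}

const listeners: Record<string, (() => void) | undefined> = {};
const fakeHooks: ProcessHooks = {
	once: (event, listener) => {
		listeners[event] = listener;
	},
	on: (event, listener) => {
		listeners[event] = listener;
	},
	exit: (code?: number): never => {
		throw new ExitCalled(code);
	},
};

let abortCount = 0;
const fakeCrew: CrewLike = { abortAll: () => { abortCount += 1; } };

// A fresh symbol keeps this run isolated from any real setup flag.
const testKey = Symbol("pi-crew.example.processHooksSetup");
setupProcessHooks(fakeCrew, fakeHooks, testKey);
setupProcessHooks(fakeCrew, fakeHooks, testKey); // second call returns early

let exitCode: number | undefined;
try {
	listeners["SIGINT"]?.();
} catch (err) {
	if (err instanceof ExitCalled) exitCode = err.code;
}
listeners["beforeExit"]?.();

console.log(abortCount, exitCode); // 2 130
```

Passing a dedicated setup key matters here because the guard flag lives on `globalThis`; reusing the default `Symbol.for("pi-crew.processHooksSetup")` key would make a second registration in the same process a no-op.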

package/extension/subagent-session.ts CHANGED

@@ -12,7 +12,6 @@ import type { AgentConfig } from "./catalog.js";
 import { SUPPORTED_TOOL_NAMES, type SupportedToolName } from "./catalog.js";
 import type { SubagentState } from "./crew.js";
 import type { SubagentStatus } from "./ui.js";
-import { runPromptWithOverflowRecovery } from "./overflow-recovery.js";
 export interface BootstrapContext {
 	model: Model<Api> | undefined;
@@ -188,10 +187,7 @@ export class SubagentSessionRunner implements SubagentRunner {
 	}
 	abort(state: SubagentState): void {
-		state.promptAbortController?.abort();
-		state.promptAbortController = undefined;
 		state.session?.abortCompaction();
-		state.session?.abortRetry();
 		state.session?.abort().catch(() => {});
 	}
@@ -221,25 +217,16 @@ export class SubagentSessionRunner implements SubagentRunner {
 	private async runPromptCycle(state: SubagentState, prompt: string): Promise<void> {
 		if (isAborted(state)) return;
-		const abortController = new AbortController();
-		state.promptAbortController = abortController;
 		try {
-			const recovery = await runPromptWithOverflowRecovery(state.session!, prompt, abortController.signal);
+			await state.session!.prompt(prompt);
 			if (isAborted(state)) return;
 			const outcome = getPromptOutcome(state);
-			if (recovery === "failed" && outcome.status !== "error") {
-				this.callbacks.onSettled(state, "error", { error: "Context overflow recovery failed" });
-				return;
-			}
 			this.callbacks.onSettled(state, outcome.status, outcome);
 		} catch (err) {
 			if (isAborted(state)) return;
 			const error = err instanceof Error ? err.message : String(err);
 			this.callbacks.onSettled(state, "error", { error });
-		} finally {
-			state.promptAbortController = undefined;
 		}
 	}

package/extension/ui.ts CHANGED

@@ -104,23 +104,6 @@ export function sendSteeringMessage(
 	);
 }
-export function sendRemainingNote(
-	remainingCount: number,
-	sendMessage: SendMessageFn,
-	opts: { isIdle: boolean; triggerTurn: boolean },
-): void {
-	if (remainingCount <= 0) return;
-	sendWithDeliveryPolicy(
-		{
-			customType: "crew-remaining",
-			content: `⏳ ${remainingCount} subagent(s) still running`,
-			display: true,
-		},
-		sendMessage,
-		opts,
-	);
-}
 export function sendCrewListActiveWarning(
 	sendMessage: SendMessageFn,
 	opts: { isIdle: boolean; triggerTurn: boolean },
@@ -193,7 +176,6 @@ export function registerCrewMessageRenderers(pi: ExtensionAPI): void {
 		return box;
 	});
-	pi.registerMessageRenderer("crew-remaining", (message, _options, theme) => renderWarningMessage(message.content, theme));
 	pi.registerMessageRenderer("crew-list-warning", (message, _options, theme) => renderWarningMessage(message.content, theme));
 }

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@melihmucuk/pi-crew",
-  "version": "1.0.18",
+  "version": "1.0.19",
   "type": "module",
   "description": "Non-blocking subagent orchestration for pi coding agent",
   "files": [
@@ -43,13 +43,13 @@
     "typebox": "*"
   },
   "devDependencies": {
-    "@earendil-works/pi-agent-core": "^0.75.4",
-    "@earendil-works/pi-ai": "^0.75.4",
-    "@earendil-works/pi-coding-agent": "^0.75.4",
-    "@earendil-works/pi-tui": "^0.75.4",
+    "@earendil-works/pi-agent-core": "^0.76.0",
+    "@earendil-works/pi-ai": "^0.76.0",
+    "@earendil-works/pi-coding-agent": "^0.76.0",
+    "@earendil-works/pi-tui": "^0.76.0",
     "@types/node": "^22.19.17",
     "tsx": "^4.22.3",
-    "typebox": "^1.1.38",
+    "typebox": "^1.1.39",
     "typescript": "^5.9.3"
   }
 }

package/prompts/pi-crew-review.md CHANGED

@@ -10,7 +10,7 @@ You are a review orchestrator, not a reviewer. Resolve the review scope, gather
 ## Scope
-Use the user's scope when provided. Otherwise rely on each reviewer’s default scope. If “latest” or “recent” is requested, review the last 5 commits unless a count is given.
+Use the user's scope when provided. Otherwise rely on each reviewer’s default scope. If “latest” or “recent” is requested, review the last 5 commits unless a count is given. If “full”, “codebase”, or whole-repo review is requested, treat it as an explicit non-default scope and pass that scope to reviewers.
 Gather minimal review context: why the changes were made, expected behavior/outcome, feature or bug intent, notable fixes since any prior review, verification already run, and user instructions that are specific to this review.
@@ -33,6 +33,8 @@ If you include a Goal, make it specific to the change intent, not the reviewer r
 For default reviews, do not include a Scope section or mention uncommitted/current repo changes in the subagent brief unless needed to disambiguate scope. If you need to state task-specific emphasis, use `Review focus:` instead of `Scope:`.
+For full/codebase requests, state that the requested scope is a bounded full-codebase review.
 Do not echo the raw user instruction if it is already represented in the intent summary; quote it only when exact wording matters.
 Do not restate reviewer-role boilerplate implied by the selected reviewer, such as telling `code-reviewer` to find actionable bugs or telling `quality-reviewer` to review maintainability. Do not include default scope, generic non-goals, acceptance criteria, output format, edit permissions, or severity rules unless the user explicitly overrides them.
@@ -49,26 +51,33 @@ You may do a minimal spot-check only when a finding is ambiguous, high-impact, o
 Reply in the user's language. Apply the gate before merging.
+For each accepted finding, preserve enough detail to act without reading subagent logs:
+**[SEVERITY] Category: Title**
+Source: `code-reviewer` | `quality-reviewer` | `both`
+File: `path:line`
+Issue: what is wrong
+Evidence: what was verified
+Impact: concrete consequence
+Fix: specific suggested correction
+Do not forward findings as summaries only. If evidence, location, or fix is missing and cannot be inferred from the reviewer result, omit the finding or report it as insufficiently evidenced.
 Sections:
-### Consensus Findings
-Issues clearly reported by both reviewers.
+### Findings
+List all accepted findings in severity order. Use `Source:` to identify `code-reviewer`, `quality-reviewer`, or `both`.
-### Code Review Findings
-Accepted findings only from `code-reviewer`.
+If both reviewers report no accepted findings, write only:
-### Quality Review Findings
-Accepted findings only from `quality-reviewer`.
+No accepted findings.
-### Final Summary
-- Review scope
-- Reviewers run and any failures
-- Consensus findings count
-- Code review findings count
-- Quality review findings count
-- Overall assessment
+### Summary
+- Scope: [review scope]
+- Reviewers: [completed reviewers and any failures]
+- Findings: [count by severity]
+- Result: [one-sentence overall assessment]
 Rules:
 - Do not repeat overlapping findings.
-- Do not present a single-reviewer finding as consensus.
-- If both reviewers report no accepted findings, say so clearly.
+- Mark a finding as `Source: both` only when both reviewers clearly reported the same issue.
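
The merged report format in this prompt is expressed as instructions to the orchestrating model, not as code. Purely as an illustration of the structure it describes, the sketch below models one accepted finding and the severity ordering. Every name in it (`Finding`, `severityRank`, `sortBySeverity`, `renderFinding`) and all sample data are hypothetical.

```typescript
type Severity = "Critical" | "Major" | "Minor";
type Source = "code-reviewer" | "quality-reviewer" | "both";

// One accepted finding, with the fields the prompt asks to preserve.
interface Finding {
	severity: Severity;
	category: string;
	title: string;
	source: Source;
	file: string; // path:line
	issue: string;
	evidence: string;
	impact: string;
	fix: string;
}

const severityRank: Record<Severity, number> = { Critical: 0, Major: 1, Minor: 2 };

// Returns a new array ordered Critical, Major, Minor; the input is not mutated.
function sortBySeverity(findings: Finding[]): Finding[] {
	return [...findings].sort((a, b) => severityRank[a.severity] - severityRank[b.severity]);
}

// Renders one finding in the line-per-field layout from the prompt.
function renderFinding(f: Finding): string {
	return [
		`**[${f.severity}] ${f.category}: ${f.title}**`,
		`Source: \`${f.source}\``,
		`File: \`${f.file}\``,
		`Issue: ${f.issue}`,
		`Evidence: ${f.evidence}`,
		`Impact: ${f.impact}`,
		`Fix: ${f.fix}`,
	].join("\n");
}

// Made-up sample data, only to exercise the helpers.
const base = { category: "Example", file: "src/example.ts:1", issue: "i", evidence: "e", impact: "m", fix: "f" };
const sample: Finding[] = [
	{ ...base, severity: "Minor", title: "third", source: "quality-reviewer" },
	{ ...base, severity: "Critical", title: "first", source: "both" },
	{ ...base, severity: "Major", title: "second", source: "code-reviewer" },
];

const ordered = sortBySeverity(sample);
console.log(ordered.map((f) => f.title).join(", ")); // first, second, third
console.log(renderFinding(ordered[0]!).split("\n")[0]); // **[Critical] Example: first**
```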

package/skills/pi-crew/SKILL.md CHANGED

@@ -35,6 +35,8 @@ Omit sections that would only restate the selected subagent’s role, default sc
 Include only information that helps this specific subagent do this specific task: intent, expected outcome, relevant decisions, exact errors/output, unusual constraints, and file paths or entry points that genuinely clarify the task. Use short Markdown sections and bullets when they improve scanability, especially for multi-part intent, constraints, observations, requirements, or acceptance criteria; avoid dense paragraphs.
+For repeated workflows, make each spawn brief independent. Do not assume a new subagent knows earlier loop results, owner-session discussion, or what another subagent saw. If prior findings, fixes, decisions, or verification matter, summarize the concrete facts or point to durable artifacts the subagent can inspect. Avoid vague references like “we fixed the first review findings” unless you also state what those findings/fixes were or define the current review target without relying on that history.
 Do not restate boilerplate implied by the selected subagent’s role, name, or description. Avoid repeating default scope, output format, edit permissions, or repo guidance. Subagents run in the same cwd as the orchestrator, so do not include mechanical Git state they can inspect themselves, such as full changed-file lists, staged/unstaged/untracked inventories, branch/cwd details, or generic project constraints, unless those details define a non-default scope or prevent ambiguity.
 If the user points to a plan, spec, issue, design, or doc as task intent, read it when practical and summarize the relevant intent instead of merely passing the path. Prefer explaining why the work matters and what outcome is expected over restating repository state.
@@ -44,7 +46,7 @@ If the user points to a plan, spec, issue, design, or doc as task intent, read i
 - Wait for subagent results before using them. Never invent or predict results.
 - Evaluate each result against the task acceptance criteria.
 - If results conflict, are incomplete, or miss criteria, state that clearly and use a follow-up or new spawn only when needed.
-- After spawning, continue only with unrelated work or end the turn.
+- After spawning, do not work on the delegated task; wait for results, continue only with unrelated work, or end the turn.
 ## Interactive Subagents

package/extension/overflow-recovery.ts DELETED

@@ -1,211 +0,0 @@
-import type { AgentSession, AgentSessionEvent } from "@earendil-works/pi-coding-agent";
-const OVERFLOW_RECOVERY_TIMEOUT_MS = 120_000;
-/**
- * Short grace period for the first terminal agent_end after prompt() resolves.
- * If this window expires, we still wait the full recovery timeout.
- */
-const INITIAL_AGENT_END_WAIT_MS = 5_000;
-type PhaseWaitResult = "done" | "timeout" | "cancelled";
-export type OverflowRecoveryResult = "none" | "recovered" | "failed";
-interface DeferredPhase {
-	promise: Promise<void>;
-	resolve: () => void;
-	isDone: () => boolean;
-}
-function createDeferredPhase(): DeferredPhase {
-	let done = false;
-	let resolveFn: (() => void) | undefined;
-	const promise = new Promise<void>((resolve) => {
-		resolveFn = () => {
-			if (done) return;
-			done = true;
-			resolve();
-		};
-	});
-	return {
-		promise,
-		resolve: () => resolveFn?.(),
-		isDone: () => done,
-	};
-}
-class OverflowRecoveryTracker {
-	private overflowDetected = false;
-	private compactionWillRetry = false;
-	private autoRetryActive = false;
-	private readonly initialAgentEnd = createDeferredPhase();
-	private compactionEnd: DeferredPhase | undefined;
-	private retryAgentEnd: DeferredPhase | undefined;
-	private overflowAutoRetryEnd: DeferredPhase | undefined;
-	private timers: ReturnType<typeof setTimeout>[] = [];
-	handleEvent(event: AgentSessionEvent): void {
-		switch (event.type) {
-			case "agent_end":
-				this.onAgentEnd();
-				break;
-			case "compaction_start":
-				this.onCompactionStart(event.reason);
-				break;
-			case "compaction_end":
-				this.onCompactionEnd(event.reason, event.willRetry);
-				break;
-			case "auto_retry_start":
-				this.onAutoRetryStart();
-				break;
-			case "auto_retry_end":
-				this.onAutoRetryEnd();
-				break;
-			default:
-				break;
-		}
-	}
-	async awaitCompletion(signal: AbortSignal): Promise<OverflowRecoveryResult> {
-		const cancelPromise = new Promise<void>((resolve) => {
-			if (signal.aborted) {
-				resolve();
-				return;
-			}
-			signal.addEventListener("abort", () => resolve(), { once: true });
-		});
-		try {
-			let initialEnd = await this.waitForPhase(
-				this.initialAgentEnd.promise,
-				INITIAL_AGENT_END_WAIT_MS,
-				cancelPromise,
-			);
-			if (initialEnd === "timeout") {
-				initialEnd = await this.waitForPhase(
-					this.initialAgentEnd.promise,
-					OVERFLOW_RECOVERY_TIMEOUT_MS,
-					cancelPromise,
-				);
-			}
-			if (initialEnd !== "done") {
-				return this.overflowDetected ? "failed" : "none";
-			}
-			if (!this.overflowDetected) return "none";
-			if (this.compactionEnd) {
-				const compactionEnd = await this.waitForPhase(
-					this.compactionEnd.promise,
-					OVERFLOW_RECOVERY_TIMEOUT_MS,
-					cancelPromise,
-				);
-				if (compactionEnd !== "done") return "failed";
-			}
-			if (!this.compactionWillRetry) return "failed";
-			if (this.retryAgentEnd) {
-				const retryEnd = await this.waitForPhase(
-					this.retryAgentEnd.promise,
-					OVERFLOW_RECOVERY_TIMEOUT_MS,
-					cancelPromise,
-				);
-				if (retryEnd !== "done") return "failed";
-			}
-			if (this.overflowAutoRetryEnd) {
-				const autoRetryEnd = await this.waitForPhase(
-					this.overflowAutoRetryEnd.promise,
-					OVERFLOW_RECOVERY_TIMEOUT_MS,
-					cancelPromise,
-				);
-				if (autoRetryEnd !== "done") return "failed";
-			}
-			return "recovered";
-		} finally {
-			for (const timer of this.timers) clearTimeout(timer);
-		}
-	}
-	private async waitForPhase(
-		phasePromise: Promise<void>,
-		timeoutMs: number,
-		cancelPromise: Promise<void>,
-	): Promise<PhaseWaitResult> {
-		return Promise.race([
-			phasePromise.then(() => "done" as const),
-			cancelPromise.then(() => "cancelled" as const),
-			new Promise<"timeout">((resolve) => {
-				this.timers.push(setTimeout(() => resolve("timeout"), timeoutMs));
-			}),
-		]);
-	}
-	// agent_end can be followed immediately by auto_retry_start in the same
-	// _processAgentEvent tick. Resolve on microtask so we can ignore retrying
-	// attempts and only accept terminal agent_end events.
-	private onAgentEnd(): void {
-		queueMicrotask(() => {
-			if (this.autoRetryActive) return;
-			if (!this.initialAgentEnd.isDone()) {
-				this.initialAgentEnd.resolve();
-				return;
-			}
-			this.retryAgentEnd?.resolve();
-		});
-	}
-	private onCompactionStart(reason: "manual" | "threshold" | "overflow"): void {
-		if (reason !== "overflow") return;
-		this.overflowDetected = true;
-		this.compactionEnd ??= createDeferredPhase();
-	}
-	private onCompactionEnd(reason: "manual" | "threshold" | "overflow", willRetry: boolean): void {
-		if (reason !== "overflow") return;
-		this.compactionWillRetry = willRetry;
-		if (willRetry) {
-			this.retryAgentEnd ??= createDeferredPhase();
-		}
-		this.compactionEnd?.resolve();
-	}
-	private onAutoRetryStart(): void {
-		this.autoRetryActive = true;
-		if (this.overflowDetected) {
-			this.overflowAutoRetryEnd ??= createDeferredPhase();
-		}
-	}
-	private onAutoRetryEnd(): void {
-		this.autoRetryActive = false;
-		this.overflowAutoRetryEnd?.resolve();
-	}
-}
-export async function runPromptWithOverflowRecovery(
-	session: AgentSession,
-	text: string,
-	signal: AbortSignal,
-): Promise<OverflowRecoveryResult> {
-	const tracker = new OverflowRecoveryTracker();
-	const unsubscribe = session.subscribe((event) => tracker.handleEvent(event));
-	try {
-		await session.prompt(text);
-		return await tracker.awaitCompletion(signal);
-	} finally {
-		unsubscribe();
-	}
-}