@melihmucuk/pi-crew 1.0.18 → 1.0.19

@@ -1,26 +1,32 @@
1
1
  ---
2
2
  name: code-reviewer
3
- description: Reviews changed code for actionable bugs. Read-only.
3
+ description: Reviews scoped code for actionable bugs. Read-only.
4
4
  model: openai-codex/gpt-5.2
5
5
  thinking: high
6
6
  tools: read, grep, find, ls, bash
7
7
  ---
8
8
 
9
- You are a read-only code reviewer. Your goal is not to find something; it is to decide whether the changed code contains realistic, actionable bugs. An empty review is a valid successful outcome. Reply in the user's language.
9
+ You are a read-only code reviewer. Your goal is not to find something; it is to decide whether the reviewed scope contains realistic, actionable bugs. An empty review is a valid successful outcome. Reply in the user's language.
10
10
 
11
11
  Do not modify files. Use bash only for read-only inspection. Do not run builds, tests, typechecks, formatters, installers, or commands that may change project state.
12
12
 
13
13
  ## Scope
14
14
 
15
- Review the provided scope. If none is provided, review uncommitted changes. For commits, branches, PRs, files, or "latest" requests, inspect the corresponding diff. If "latest" is requested, review the last 5 commits unless a count is given.
15
+ Review the provided scope. If none is provided, review uncommitted changes.
16
16
 
17
- For large or broad diffs, summarize coverage by area with brief risk notes, then deeply review only the highest-risk changed files: business logic, auth, data mutation, error handling, and public APIs. Avoid exhaustive file inventories.
17
+ For commits, branches, PRs, files, directories, modules, or "latest" requests, inspect the corresponding diff or code. If "latest" is requested, review the last 5 commits unless a count is given.
18
18
 
19
- Review changed-code issues only. Pre-existing code is reportable only when the change triggers it or makes it relevant.
19
+ If "full", "codebase", or whole-repo review is requested, perform a bounded bug audit: map the highest-risk areas, deeply inspect selected files, state coverage/skipped areas briefly, and do not imply exhaustive coverage.
20
+
21
+ For large or broad scopes, prioritize highest-risk areas: business logic, auth/security, data mutation, persistence, external integrations, concurrency/async, error handling, and public APIs.
22
+
23
+ For changed-code scopes, report pre-existing issues only when the change triggers or makes them relevant. For full-codebase scopes, report existing issues only when directly evidenced, realistically triggerable, and worth acting on now.
20
24
 
21
25
  ## Method
22
26
 
23
- Diffs are not enough. Before reporting a finding, read the full changed file involved. Trace direct callers/callees or nearby patterns only when needed. Check local conventions only when relevant. Stop expanding context when it stops adding evidence.
27
+ Diffs are not enough. Before reporting a finding, read the full relevant file involved. Trace direct callers/callees or nearby patterns only when needed. Check local conventions only when relevant. Stop expanding context when it stops adding evidence.
28
+
29
+ For full-codebase scopes, make findings only from files and paths you directly inspected; verify any caller, route, config, schema, or runtime assumption the finding depends on.
24
30
 
25
31
  Do not report findings from skipped or unreviewed files. A finding requires direct inspection of the relevant file or diff context; if a file was skipped, only mention it as skipped, not as evidence for a finding.
26
32
 
@@ -40,17 +46,15 @@ Report the same finding pattern at most twice, then list other affected location
40
46
 
41
47
  ## Severity
42
48
 
43
- - Critical: proven realistic security, data loss, or severe breakage.
44
- - Major: realistic bug likely to affect users, developers, or operations.
45
- - Minor: real non-blocking bug or high-risk coverage gap.
49
+ - Critical: urgent, high-impact issue within this reviewer's scope that can cause severe user, data, security, operational, or near-term development breakage.
50
+ - Major: realistic issue within this reviewer's scope likely to affect users, developers, operations, or maintainability enough to act on soon.
51
+ - Minor: real but non-blocking issue within this reviewer's scope, localized maintenance friction, or high-risk coverage gap.
46
52
 
47
53
  ## Output
48
54
 
49
55
  If no findings:
50
56
 
51
57
  **No issues found.**
52
- Reviewed: [files]
53
- Overall confidence: [high/medium]
54
58
 
55
59
  For each finding:
56
60
 
@@ -58,6 +62,7 @@ For each finding:
58
62
  File: `path:line`
59
63
  Issue: what is wrong
60
64
  Evidence: what you verified
65
+ Impact: concrete consequence
61
66
  Fix: suggested correction
62
67
 
63
68
  Be direct, concise, and unpadded.
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: quality-reviewer
3
- description: Reviews changed code for maintainability, duplication, and complexity. Read-only.
3
+ description: Reviews scoped code for maintainability, duplication, and complexity. Read-only.
4
4
  model: openai-codex/gpt-5.2
5
5
  thinking: high
6
6
  tools: read, grep, find, ls, bash
@@ -16,7 +16,7 @@ Do not modify files. Use bash only for read-only inspection. Do not run builds,
16
16
 
17
17
  Review the provided scope. If none is provided, review uncommitted changes. For files, directories, modules, commits, branches, PRs, or "latest" requests, inspect the corresponding code or diff. If "latest" is requested, review the last 5 commits unless a count is given.
18
18
 
19
- If "full" or "codebase" is requested, first produce a structural risk map, then deeply review only the highest-risk areas.
19
+ If "full", "codebase", or whole-repo review is requested, first produce a structural risk map, then deeply review only the highest-risk areas, state coverage/skipped areas briefly, and do not imply exhaustive coverage.
20
20
 
21
21
  For large or broad scopes, summarize coverage by area with brief structural notes, then deeply review the highest-risk areas/files: large files, dependency-heavy files, widely imported files, or files crossing module boundaries. Avoid exhaustive file inventories; state skipped areas briefly.
22
22
 
@@ -48,32 +48,23 @@ Default stance: no new abstraction unless it reduces present-day duplication or
48
48
 
49
49
  ## Severity
50
50
 
51
- - High: structure will materially hinder near-term changes or debugging.
52
- - Medium: noticeable maintenance friction with concrete evidence.
53
- - Minor: small structural friction on a realistic future change/debug path.
51
+ - Critical: urgent, high-impact issue within this reviewer's scope that can cause severe user, data, security, operational, or near-term development breakage.
52
+ - Major: realistic issue within this reviewer's scope likely to affect users, developers, operations, or maintainability enough to act on soon.
53
+ - Minor: real but non-blocking issue within this reviewer's scope, localized maintenance friction, or high-risk coverage gap.
54
54
 
55
55
  ## Output
56
56
 
57
57
  If no findings:
58
58
 
59
59
  **No issues found.**
60
- Reviewed: [files]
61
- Overall health: [brief assessment]
62
60
 
63
61
  For each finding:
64
62
 
65
63
  **[SEVERITY] Category: Title**
66
64
  File: `path:line`
67
- Issue: structural problem
68
- Impact: concrete future change/debug task made harder
65
+ Issue: what is wrong
69
66
  Evidence: what you verified
70
- Fix: specific refactoring approach
71
-
72
- End with:
73
-
74
- **Quality Review Summary**
75
- Files reviewed: [count]
76
- Findings: [count by severity]
77
- Overall health: [one sentence]
67
+ Impact: concrete consequence
68
+ Fix: suggested correction
78
69
 
79
70
  Be direct, concise, and unpadded.
package/extension/crew.ts CHANGED
@@ -8,7 +8,6 @@ import {
8
8
  type SendMessageFn,
9
9
  type SteeringPayload,
10
10
  type SubagentStatus,
11
- sendRemainingNote,
12
11
  sendSteeringMessage,
13
12
  } from "./ui.js";
14
13
 
@@ -36,7 +35,6 @@ export interface SubagentState {
36
35
  model: string | undefined;
37
36
  error?: string;
38
37
  result?: string;
39
- promptAbortController?: AbortController;
40
38
  unsubscribe?: () => void;
41
39
  }
42
40
 
@@ -271,7 +269,6 @@ export class CrewRuntime {
271
269
 
272
270
  private disposeAgent(state: SubagentState): void {
273
271
  state.unsubscribe?.();
274
- state.promptAbortController = undefined;
275
272
  state.session?.dispose();
276
273
  this.agents.delete(state.id);
277
274
  this.refreshWidgetFor(state.ownerSessionId);
@@ -320,8 +317,6 @@ export class CrewRuntime {
320
317
  private schedulePendingFlushFor(sessionId: string): void {
321
318
  if (!this.pendingMessages.some((entry) => entry.ownerSessionId === sessionId)) return;
322
319
 
323
- // Delay flush to next macrotask. session_start fires before pi-core reconnects the
324
- // agent event listener; synchronous delivery can lose JSONL persistence.
325
320
  this.flushScheduled = true;
326
321
  this.scheduleFlush(() => {
327
322
  this.flushScheduled = false;
@@ -370,10 +365,9 @@ export class CrewRuntime {
370
365
 
371
366
  const remaining = this.countRunningForOwner(ownerSessionId, payload.id);
372
367
  const isIdle = this.activeBinding.isIdle();
373
- const triggerResultTurn = !(isIdle && remaining > 0);
368
+ const triggerResultTurn = payload.status === "waiting" || !(isIdle && remaining > 0);
374
369
 
375
370
  sendSteeringMessage(payload, this.activeBinding.sendMessage, { isIdle, triggerTurn: triggerResultTurn });
376
- sendRemainingNote(remaining, this.activeBinding.sendMessage, { isIdle, triggerTurn: isIdle && remaining > 0 });
377
371
  }
378
372
  }
379
373
 
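Note on the `triggerResultTurn` change above: a minimal sketch, assuming the condition can be read in isolation from `CrewRuntime`. The helper names `shouldTriggerBefore` and `shouldTriggerAfter` are hypothetical and not part of the package; they only restate the 1.0.18 and 1.0.19 expressions so the cases can be compared. The `"error"` status is taken from elsewhere in this diff; the full set of `SubagentStatus` values is not shown here, so `status` is typed as a plain string.

```typescript
// Hypothetical helpers restating the two expressions from the hunk above.
// They are illustrations only and do not exist in @melihmucuk/pi-crew.

// 1.0.18: a result turn is skipped only when the owner is idle
// and other subagents are still running.
function shouldTriggerBefore(isIdle: boolean, remaining: number): boolean {
  return !(isIdle && remaining > 0);
}

// 1.0.19: same rule, except a "waiting" payload always triggers a turn.
function shouldTriggerAfter(status: string, isIdle: boolean, remaining: number): boolean {
  return status === "waiting" || !(isIdle && remaining > 0);
}

// The two versions differ only for: status "waiting", idle owner, remaining > 0.
const cases: Array<[string, boolean, number]> = [
  ["waiting", true, 2],
  ["error", true, 2],
  ["error", true, 0],
  ["error", false, 3],
];

for (const [status, isIdle, remaining] of cases) {
  const before = shouldTriggerBefore(isIdle, remaining);
  const after = shouldTriggerAfter(status, isIdle, remaining);
  console.log(`${status} idle=${isIdle} remaining=${remaining}: ${before} -> ${after}`);
}
```

Read this way, the removal of `sendRemainingNote` in the same hunk means the steering message is the only message sent on this path, and only the `"waiting"` case changes whether it triggers a turn.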
@@ -1,12 +1,25 @@
1
1
  import { dirname } from "node:path";
2
2
  import { fileURLToPath } from "node:url";
3
3
  import type { ExtensionAPI, ExtensionContext } from "@earendil-works/pi-coding-agent";
4
- import { crewRuntime } from "./crew.js";
4
+ import { crewRuntime, type CrewRuntime } from "./crew.js";
5
5
  import { registerCrewTools } from "./tools.js";
6
6
  import { registerCrewMessageRenderers, updateWidget } from "./ui.js";
7
7
 
8
8
  const extensionDir = dirname(fileURLToPath(import.meta.url));
9
9
 
10
+ interface ProcessHooks {
11
+ once(event: "SIGINT", listener: () => void): unknown;
12
+ on(event: "beforeExit", listener: () => void): unknown;
13
+ exit(code?: number): never;
14
+ }
15
+
16
+ interface RegisterPiCrewExtensionOptions {
17
+ crew?: CrewRuntime;
18
+ extensionDir?: string;
19
+ processHooks?: ProcessHooks;
20
+ processHooksSetupKey?: symbol;
21
+ }
22
+
10
23
  // Process-level cleanup for subagents on exit
11
24
  const processHooksSetupKey = Symbol.for("pi-crew.processHooksSetup");
12
25
  const globalWithProcessHooks = globalThis as typeof globalThis & Record<
@@ -14,29 +27,30 @@ const globalWithProcessHooks = globalThis as typeof globalThis & Record<
14
27
  boolean | undefined
15
28
  >;
16
29
 
17
- function setupProcessHooks() {
18
- if (globalWithProcessHooks[processHooksSetupKey]) return;
19
- globalWithProcessHooks[processHooksSetupKey] = true;
30
+ function setupProcessHooks(crew: CrewRuntime, processHooks: ProcessHooks, setupKey: symbol) {
31
+ if (globalWithProcessHooks[setupKey]) return;
32
+ globalWithProcessHooks[setupKey] = true;
20
33
 
21
- process.once("SIGINT", () => {
22
- crewRuntime.abortAll();
23
- process.exit(130);
34
+ processHooks.once("SIGINT", () => {
35
+ crew.abortAll();
36
+ processHooks.exit(130);
24
37
  });
25
- process.on("beforeExit", () => crewRuntime.abortAll());
38
+ processHooks.on("beforeExit", () => crew.abortAll());
26
39
  }
27
40
 
28
- export default function (pi: ExtensionAPI) {
41
+ export function registerPiCrewExtension(pi: ExtensionAPI, options: RegisterPiCrewExtensionOptions = {}) {
42
+ const crew = options.crew ?? crewRuntime;
29
43
  let currentCtx: ExtensionContext | undefined;
30
44
 
31
- setupProcessHooks();
45
+ setupProcessHooks(crew, options.processHooks ?? process, options.processHooksSetupKey ?? processHooksSetupKey);
32
46
 
33
47
  const refreshWidget = () => {
34
- if (currentCtx) updateWidget(currentCtx, crewRuntime);
48
+ if (currentCtx) updateWidget(currentCtx, crew);
35
49
  };
36
50
 
37
51
  const activateSession = (ctx: ExtensionContext) => {
38
52
  currentCtx = ctx;
39
- crewRuntime.activateSession(
53
+ crew.activateSession(
40
54
  {
41
55
  sessionId: ctx.sessionManager.getSessionId(),
42
56
  isIdle: () => ctx.isIdle(),
@@ -52,13 +66,17 @@ export default function (pi: ExtensionAPI) {
52
66
 
53
67
  pi.on("session_shutdown", (event, ctx) => {
54
68
  const sessionId = ctx.sessionManager.getSessionId();
55
- crewRuntime.deactivateSession(sessionId);
69
+ crew.deactivateSession(sessionId);
56
70
 
57
71
  if (event.reason === "quit") {
58
- crewRuntime.abortAll();
72
+ crew.abortAll();
59
73
  }
60
74
  });
61
75
 
62
- registerCrewTools(pi, crewRuntime, extensionDir);
76
+ registerCrewTools(pi, crew, options.extensionDir ?? extensionDir);
63
77
  registerCrewMessageRenderers(pi);
64
78
  }
79
+
80
+ export default function (pi: ExtensionAPI) {
81
+ registerPiCrewExtension(pi);
82
+ }
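Note on the `registerPiCrewExtension` options above: the new `processHooks` and `processHooksSetupKey` parameters look intended to let callers substitute the real `process` object, for example in tests. The sketch below shows that idea under stated assumptions. `AbortableCrew`, `FakeProcessHooks`, and the local `completedSetups` set are hypothetical stand-ins (the package guards on `globalThis` and takes a full `CrewRuntime`); only the `ProcessHooks` shape and the body of `setupProcessHooks` follow the diff.

```typescript
// Shape copied from the diff above.
interface ProcessHooks {
  once(event: "SIGINT", listener: () => void): unknown;
  on(event: "beforeExit", listener: () => void): unknown;
  exit(code?: number): never;
}

// Hypothetical stand-in for CrewRuntime; only abortAll() matters here.
interface AbortableCrew {
  abortAll(): void;
}

// Hypothetical local guard; the package stores this flag on globalThis instead.
const completedSetups = new Set<symbol>();

function setupProcessHooks(crew: AbortableCrew, hooks: ProcessHooks, setupKey: symbol): void {
  if (completedSetups.has(setupKey)) return; // install listeners once per key
  completedSetups.add(setupKey);

  hooks.once("SIGINT", () => {
    crew.abortAll();
    hooks.exit(130);
  });
  hooks.on("beforeExit", () => crew.abortAll());
}

// Hypothetical fake that records listeners and exit codes instead of
// touching the real process.
class FakeProcessHooks implements ProcessHooks {
  readonly registrations: string[] = [];
  readonly listeners = new Map<string, () => void>();
  readonly exitCodes: Array<number | undefined> = [];

  once(event: "SIGINT", listener: () => void): unknown {
    this.registrations.push(event);
    this.listeners.set(event, listener);
    return this;
  }

  on(event: "beforeExit", listener: () => void): unknown {
    this.registrations.push(event);
    this.listeners.set(event, listener);
    return this;
  }

  exit(code?: number): never {
    this.exitCodes.push(code);
    // Throwing satisfies the `never` return type without ending the process.
    throw new Error(`exit(${String(code)}) called`);
  }
}
```

A test could pass a fresh `Symbol()` as the setup key so each case installs its own listeners, then invoke the recorded `SIGINT` listener inside a `try`/`catch` and check that `abortAll()` ran before `exit(130)`.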
@@ -12,7 +12,6 @@ import type { AgentConfig } from "./catalog.js";
12
12
  import { SUPPORTED_TOOL_NAMES, type SupportedToolName } from "./catalog.js";
13
13
  import type { SubagentState } from "./crew.js";
14
14
  import type { SubagentStatus } from "./ui.js";
15
- import { runPromptWithOverflowRecovery } from "./overflow-recovery.js";
16
15
 
17
16
  export interface BootstrapContext {
18
17
  model: Model<Api> | undefined;
@@ -188,10 +187,7 @@ export class SubagentSessionRunner implements SubagentRunner {
188
187
  }
189
188
 
190
189
  abort(state: SubagentState): void {
191
- state.promptAbortController?.abort();
192
- state.promptAbortController = undefined;
193
190
  state.session?.abortCompaction();
194
- state.session?.abortRetry();
195
191
  state.session?.abort().catch(() => {});
196
192
  }
197
193
 
@@ -221,25 +217,16 @@ export class SubagentSessionRunner implements SubagentRunner {
221
217
  private async runPromptCycle(state: SubagentState, prompt: string): Promise<void> {
222
218
  if (isAborted(state)) return;
223
219
 
224
- const abortController = new AbortController();
225
- state.promptAbortController = abortController;
226
-
227
220
  try {
228
- const recovery = await runPromptWithOverflowRecovery(state.session!, prompt, abortController.signal);
221
+ await state.session!.prompt(prompt);
229
222
  if (isAborted(state)) return;
230
223
 
231
224
  const outcome = getPromptOutcome(state);
232
- if (recovery === "failed" && outcome.status !== "error") {
233
- this.callbacks.onSettled(state, "error", { error: "Context overflow recovery failed" });
234
- return;
235
- }
236
225
  this.callbacks.onSettled(state, outcome.status, outcome);
237
226
  } catch (err) {
238
227
  if (isAborted(state)) return;
239
228
  const error = err instanceof Error ? err.message : String(err);
240
229
  this.callbacks.onSettled(state, "error", { error });
241
- } finally {
242
- state.promptAbortController = undefined;
243
230
  }
244
231
  }
245
232
 
package/extension/ui.ts CHANGED
@@ -104,23 +104,6 @@ export function sendSteeringMessage(
104
104
  );
105
105
  }
106
106
 
107
- export function sendRemainingNote(
108
- remainingCount: number,
109
- sendMessage: SendMessageFn,
110
- opts: { isIdle: boolean; triggerTurn: boolean },
111
- ): void {
112
- if (remainingCount <= 0) return;
113
- sendWithDeliveryPolicy(
114
- {
115
- customType: "crew-remaining",
116
- content: `⏳ ${remainingCount} subagent(s) still running`,
117
- display: true,
118
- },
119
- sendMessage,
120
- opts,
121
- );
122
- }
123
-
124
107
  export function sendCrewListActiveWarning(
125
108
  sendMessage: SendMessageFn,
126
109
  opts: { isIdle: boolean; triggerTurn: boolean },
@@ -193,7 +176,6 @@ export function registerCrewMessageRenderers(pi: ExtensionAPI): void {
193
176
  return box;
194
177
  });
195
178
 
196
- pi.registerMessageRenderer("crew-remaining", (message, _options, theme) => renderWarningMessage(message.content, theme));
197
179
  pi.registerMessageRenderer("crew-list-warning", (message, _options, theme) => renderWarningMessage(message.content, theme));
198
180
  }
199
181
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@melihmucuk/pi-crew",
3
- "version": "1.0.18",
3
+ "version": "1.0.19",
4
4
  "type": "module",
5
5
  "description": "Non-blocking subagent orchestration for pi coding agent",
6
6
  "files": [
@@ -43,13 +43,13 @@
43
43
  "typebox": "*"
44
44
  },
45
45
  "devDependencies": {
46
- "@earendil-works/pi-agent-core": "^0.75.4",
47
- "@earendil-works/pi-ai": "^0.75.4",
48
- "@earendil-works/pi-coding-agent": "^0.75.4",
49
- "@earendil-works/pi-tui": "^0.75.4",
46
+ "@earendil-works/pi-agent-core": "^0.76.0",
47
+ "@earendil-works/pi-ai": "^0.76.0",
48
+ "@earendil-works/pi-coding-agent": "^0.76.0",
49
+ "@earendil-works/pi-tui": "^0.76.0",
50
50
  "@types/node": "^22.19.17",
51
51
  "tsx": "^4.22.3",
52
- "typebox": "^1.1.38",
52
+ "typebox": "^1.1.39",
53
53
  "typescript": "^5.9.3"
54
54
  }
55
55
  }
@@ -10,7 +10,7 @@ You are a review orchestrator, not a reviewer. Resolve the review scope, gather
10
10
 
11
11
  ## Scope
12
12
 
13
- Use the user's scope when provided. Otherwise rely on each reviewer’s default scope. If “latest” or “recent” is requested, review the last 5 commits unless a count is given.
13
+ Use the user's scope when provided. Otherwise rely on each reviewer’s default scope. If “latest” or “recent” is requested, review the last 5 commits unless a count is given. If “full”, “codebase”, or whole-repo review is requested, treat it as an explicit non-default scope and pass that scope to reviewers.
14
14
 
15
15
  Gather minimal review context: why the changes were made, expected behavior/outcome, feature or bug intent, notable fixes since any prior review, verification already run, and user instructions that are specific to this review.
16
16
 
@@ -33,6 +33,8 @@ If you include a Goal, make it specific to the change intent, not the reviewer r
33
33
 
34
34
  For default reviews, do not include a Scope section or mention uncommitted/current repo changes in the subagent brief unless needed to disambiguate scope. If you need to state task-specific emphasis, use `Review focus:` instead of `Scope:`.
35
35
 
36
+ For full/codebase requests, state that the requested scope is a bounded full-codebase review.
37
+
36
38
  Do not echo the raw user instruction if it is already represented in the intent summary; quote it only when exact wording matters.
37
39
 
38
40
  Do not restate reviewer-role boilerplate implied by the selected reviewer, such as telling `code-reviewer` to find actionable bugs or telling `quality-reviewer` to review maintainability. Do not include default scope, generic non-goals, acceptance criteria, output format, edit permissions, or severity rules unless the user explicitly overrides them.
@@ -49,26 +51,33 @@ You may do a minimal spot-check only when a finding is ambiguous, high-impact, o
49
51
 
50
52
  Reply in the user's language. Apply the gate before merging.
51
53
 
54
+ For each accepted finding, preserve enough detail to act without reading subagent logs:
55
+
56
+ **[SEVERITY] Category: Title**
57
+ Source: `code-reviewer` | `quality-reviewer` | `both`
58
+ File: `path:line`
59
+ Issue: what is wrong
60
+ Evidence: what was verified
61
+ Impact: concrete consequence
62
+ Fix: specific suggested correction
63
+
64
+ Do not forward findings as summaries only. If evidence, location, or fix is missing and cannot be inferred from the reviewer result, omit the finding or report it as insufficiently evidenced.
65
+
52
66
  Sections:
53
67
 
54
- ### Consensus Findings
55
- Issues clearly reported by both reviewers.
68
+ ### Findings
69
+ List all accepted findings in severity order. Use `Source:` to identify `code-reviewer`, `quality-reviewer`, or `both`.
56
70
 
57
- ### Code Review Findings
58
- Accepted findings only from `code-reviewer`.
71
+ If both reviewers report no accepted findings, write only:
59
72
 
60
- ### Quality Review Findings
61
- Accepted findings only from `quality-reviewer`.
73
+ No accepted findings.
62
74
 
63
- ### Final Summary
64
- - Review scope
65
- - Reviewers run and any failures
66
- - Consensus findings count
67
- - Code review findings count
68
- - Quality review findings count
69
- - Overall assessment
75
+ ### Summary
76
+ - Scope: [review scope]
77
+ - Reviewers: [completed reviewers and any failures]
78
+ - Findings: [count by severity]
79
+ - Result: [one-sentence overall assessment]
70
80
 
71
81
  Rules:
72
82
  - Do not repeat overlapping findings.
73
- - Do not present a single-reviewer finding as consensus.
74
- - If both reviewers report no accepted findings, say so clearly.
83
+ - Mark a finding as `Source: both` only when both reviewers clearly reported the same issue.
@@ -35,6 +35,8 @@ Omit sections that would only restate the selected subagent’s role, default sc
35
35
 
36
36
  Include only information that helps this specific subagent do this specific task: intent, expected outcome, relevant decisions, exact errors/output, unusual constraints, and file paths or entry points that genuinely clarify the task. Use short Markdown sections and bullets when they improve scanability, especially for multi-part intent, constraints, observations, requirements, or acceptance criteria; avoid dense paragraphs.
37
37
 
38
+ For repeated workflows, make each spawn brief independent. Do not assume a new subagent knows earlier loop results, owner-session discussion, or what another subagent saw. If prior findings, fixes, decisions, or verification matter, summarize the concrete facts or point to durable artifacts the subagent can inspect. Avoid vague references like “we fixed the first review findings” unless you also state what those findings/fixes were or define the current review target without relying on that history.
39
+
38
40
  Do not restate boilerplate implied by the selected subagent’s role, name, or description. Avoid repeating default scope, output format, edit permissions, or repo guidance. Subagents run in the same cwd as the orchestrator, so do not include mechanical Git state they can inspect themselves, such as full changed-file lists, staged/unstaged/untracked inventories, branch/cwd details, or generic project constraints, unless those details define a non-default scope or prevent ambiguity.
39
41
 
40
42
  If the user points to a plan, spec, issue, design, or doc as task intent, read it when practical and summarize the relevant intent instead of merely passing the path. Prefer explaining why the work matters and what outcome is expected over restating repository state.
@@ -44,7 +46,7 @@ If the user points to a plan, spec, issue, design, or doc as task intent, read i
44
46
  - Wait for subagent results before using them. Never invent or predict results.
45
47
  - Evaluate each result against the task acceptance criteria.
46
48
  - If results conflict, are incomplete, or miss criteria, state that clearly and use a follow-up or new spawn only when needed.
47
- - After spawning, continue only with unrelated work or end the turn.
49
+ - After spawning, do not work on the delegated task; wait for results, continue only with unrelated work, or end the turn.
48
50
 
49
51
  ## Interactive Subagents
50
52
 
@@ -1,211 +0,0 @@
1
- import type { AgentSession, AgentSessionEvent } from "@earendil-works/pi-coding-agent";
2
-
3
- const OVERFLOW_RECOVERY_TIMEOUT_MS = 120_000;
4
-
5
- /**
6
- * Short grace period for the first terminal agent_end after prompt() resolves.
7
- * If this window expires, we still wait the full recovery timeout.
8
- */
9
- const INITIAL_AGENT_END_WAIT_MS = 5_000;
10
-
11
- type PhaseWaitResult = "done" | "timeout" | "cancelled";
12
-
13
- export type OverflowRecoveryResult = "none" | "recovered" | "failed";
14
-
15
- interface DeferredPhase {
16
- promise: Promise<void>;
17
- resolve: () => void;
18
- isDone: () => boolean;
19
- }
20
-
21
- function createDeferredPhase(): DeferredPhase {
22
- let done = false;
23
- let resolveFn: (() => void) | undefined;
24
-
25
- const promise = new Promise<void>((resolve) => {
26
- resolveFn = () => {
27
- if (done) return;
28
- done = true;
29
- resolve();
30
- };
31
- });
32
-
33
- return {
34
- promise,
35
- resolve: () => resolveFn?.(),
36
- isDone: () => done,
37
- };
38
- }
39
-
40
- class OverflowRecoveryTracker {
41
- private overflowDetected = false;
42
- private compactionWillRetry = false;
43
-
44
- private autoRetryActive = false;
45
- private readonly initialAgentEnd = createDeferredPhase();
46
- private compactionEnd: DeferredPhase | undefined;
47
- private retryAgentEnd: DeferredPhase | undefined;
48
- private overflowAutoRetryEnd: DeferredPhase | undefined;
49
- private timers: ReturnType<typeof setTimeout>[] = [];
50
-
51
- handleEvent(event: AgentSessionEvent): void {
52
- switch (event.type) {
53
- case "agent_end":
54
- this.onAgentEnd();
55
- break;
56
- case "compaction_start":
57
- this.onCompactionStart(event.reason);
58
- break;
59
- case "compaction_end":
60
- this.onCompactionEnd(event.reason, event.willRetry);
61
- break;
62
- case "auto_retry_start":
63
- this.onAutoRetryStart();
64
- break;
65
- case "auto_retry_end":
66
- this.onAutoRetryEnd();
67
- break;
68
- default:
69
- break;
70
- }
71
- }
72
-
73
- async awaitCompletion(signal: AbortSignal): Promise<OverflowRecoveryResult> {
74
- const cancelPromise = new Promise<void>((resolve) => {
75
- if (signal.aborted) {
76
- resolve();
77
- return;
78
- }
79
- signal.addEventListener("abort", () => resolve(), { once: true });
80
- });
81
-
82
- try {
83
- let initialEnd = await this.waitForPhase(
84
- this.initialAgentEnd.promise,
85
- INITIAL_AGENT_END_WAIT_MS,
86
- cancelPromise,
87
- );
88
-
89
- if (initialEnd === "timeout") {
90
- initialEnd = await this.waitForPhase(
91
- this.initialAgentEnd.promise,
92
- OVERFLOW_RECOVERY_TIMEOUT_MS,
93
- cancelPromise,
94
- );
95
- }
96
-
97
- if (initialEnd !== "done") {
98
- return this.overflowDetected ? "failed" : "none";
99
- }
100
-
101
- if (!this.overflowDetected) return "none";
102
-
103
- if (this.compactionEnd) {
104
- const compactionEnd = await this.waitForPhase(
105
- this.compactionEnd.promise,
106
- OVERFLOW_RECOVERY_TIMEOUT_MS,
107
- cancelPromise,
108
- );
109
- if (compactionEnd !== "done") return "failed";
110
- }
111
-
112
- if (!this.compactionWillRetry) return "failed";
113
-
114
- if (this.retryAgentEnd) {
115
- const retryEnd = await this.waitForPhase(
116
- this.retryAgentEnd.promise,
117
- OVERFLOW_RECOVERY_TIMEOUT_MS,
118
- cancelPromise,
119
- );
120
- if (retryEnd !== "done") return "failed";
121
- }
122
-
123
- if (this.overflowAutoRetryEnd) {
124
- const autoRetryEnd = await this.waitForPhase(
125
- this.overflowAutoRetryEnd.promise,
126
- OVERFLOW_RECOVERY_TIMEOUT_MS,
127
- cancelPromise,
128
- );
129
- if (autoRetryEnd !== "done") return "failed";
130
- }
131
-
132
- return "recovered";
133
- } finally {
134
- for (const timer of this.timers) clearTimeout(timer);
135
- }
136
- }
137
-
138
- private async waitForPhase(
139
- phasePromise: Promise<void>,
140
- timeoutMs: number,
141
- cancelPromise: Promise<void>,
142
- ): Promise<PhaseWaitResult> {
143
- return Promise.race([
144
- phasePromise.then(() => "done" as const),
145
- cancelPromise.then(() => "cancelled" as const),
146
- new Promise<"timeout">((resolve) => {
147
- this.timers.push(setTimeout(() => resolve("timeout"), timeoutMs));
148
- }),
149
- ]);
150
- }
151
-
152
- // agent_end can be followed immediately by auto_retry_start in the same
153
- // _processAgentEvent tick. Resolve on microtask so we can ignore retrying
154
- // attempts and only accept terminal agent_end events.
155
- private onAgentEnd(): void {
156
- queueMicrotask(() => {
157
- if (this.autoRetryActive) return;
158
-
159
- if (!this.initialAgentEnd.isDone()) {
160
- this.initialAgentEnd.resolve();
161
- return;
162
- }
163
-
164
- this.retryAgentEnd?.resolve();
165
- });
166
- }
167
-
168
- private onCompactionStart(reason: "manual" | "threshold" | "overflow"): void {
169
- if (reason !== "overflow") return;
170
- this.overflowDetected = true;
171
- this.compactionEnd ??= createDeferredPhase();
172
- }
173
-
174
- private onCompactionEnd(reason: "manual" | "threshold" | "overflow", willRetry: boolean): void {
175
- if (reason !== "overflow") return;
176
-
177
- this.compactionWillRetry = willRetry;
178
- if (willRetry) {
179
- this.retryAgentEnd ??= createDeferredPhase();
180
- }
181
- this.compactionEnd?.resolve();
182
- }
183
-
184
- private onAutoRetryStart(): void {
185
- this.autoRetryActive = true;
186
- if (this.overflowDetected) {
187
- this.overflowAutoRetryEnd ??= createDeferredPhase();
188
- }
189
- }
190
-
191
- private onAutoRetryEnd(): void {
192
- this.autoRetryActive = false;
193
- this.overflowAutoRetryEnd?.resolve();
194
- }
195
- }
196
-
197
- export async function runPromptWithOverflowRecovery(
198
- session: AgentSession,
199
- text: string,
200
- signal: AbortSignal,
201
- ): Promise<OverflowRecoveryResult> {
202
- const tracker = new OverflowRecoveryTracker();
203
- const unsubscribe = session.subscribe((event) => tracker.handleEvent(event));
204
-
205
- try {
206
- await session.prompt(text);
207
- return await tracker.awaitCompletion(signal);
208
- } finally {
209
- unsubscribe();
210
- }
211
- }