@poncho-ai/harness 0.50.3 → 0.50.5

@@ -1,5 +1,5 @@
 
- > @poncho-ai/harness@0.50.3 build /home/runner/work/poncho-ai/poncho-ai/packages/harness
+ > @poncho-ai/harness@0.50.5 build /home/runner/work/poncho-ai/poncho-ai/packages/harness
  > node scripts/embed-docs.js && tsup src/index.ts --format esm --dts
 
  [embed-docs] Generated poncho-docs.ts with 4 topics
@@ -8,9 +8,9 @@
  CLI tsup v8.5.1
  CLI Target: es2022
  ESM Build start
- ESM dist/index.js 530.79 KB
- ESM dist/isolate-BNQ6P3HI.js 51.41 KB
- ESM ⚡️ Build success in 229ms
+ ESM dist/index.js 535.57 KB
+ ESM dist/isolate-F2PPSUL6.js 53.82 KB
+ ESM ⚡️ Build success in 240ms
  DTS Build start
- DTS ⚡️ Build success in 7430ms
- DTS dist/index.d.ts 89.60 KB
+ DTS ⚡️ Build success in 7598ms
+ DTS dist/index.d.ts 91.35 KB
package/CHANGELOG.md CHANGED
@@ -1,5 +1,29 @@
  # @poncho-ai/harness
 
+ ## 0.50.5
+
+ ### Patch Changes
+
+ - [`991a4b9`](https://github.com/cesr/poncho-ai/commit/991a4b98d6683c105c7aae50551d30b16080d618) Thanks [@cesr](https://github.com/cesr)! - harness: subagents survive the wall-clock timeout, and can be given a longer budget than the foreground turn.
+
+ Previously a subagent that hit its hard `timeout` (vs. `maxSteps`) emitted `run:error` with no `runResult`, so the orchestrator dropped everything it had gathered: the parent received a bare `(no result)`, the subagent was falsely marked `completed`, and the work — often dozens of completed searches, just short of the write step — was lost.
+ - **Graceful timeout/error delivery.** When a subagent run ends abnormally (timeout or model error) with no `runResult`, the orchestrator now recovers its real output (run response → streamed draft → transcript walk-back, discarding the synthetic `[Error: …]` placeholder), and delivers it tagged so the parent knows it didn't finish — it may not have written its files — with a concrete recovery hint (use the partial work, send a write-only `message_subagent` follow-up, or `read_subagent(…, mode:"full")`). The subagent is marked `status: "error"` (not a fake `completed`) and carries the failure in its `error` field. Applied to both the spawn and continuation paths.
+ - **`runTimeoutSecOverride` (HarnessOptions).** A constructor-level override for the per-run hard wall-clock timeout, taking precedence over the agent definition's `limits.timeout`. Lets a platform give background subagents a longer budget (e.g. 1h) than a foreground turn (5 min) without forking the agent definition. `0` disables the hard timeout.
+
+ ## 0.50.4
+
+ ### Patch Changes
+
+ - [`9a39327`](https://github.com/cesr/poncho-ai/commit/9a393274d8a8061371d268fa81db3501cb0a8308) Thanks [@cesr](https://github.com/cesr)! - harness: fix three `run_code` / cancellation bugs.
+ - **Timers polyfill never fired delayed callbacks.** `setTimeout(fn, ms)` only ran the callback when `ms === 0`; any non-zero delay was stored and never invoked, so `await new Promise(r => setTimeout(r, 50))` (the standard sleep) hung forever. The polyfill now drains pending timers on the microtask queue in delay order against a virtual clock, so sleeps resolve and `setInterval`/`clearInterval` work.
+ - **No wall-clock bound on `run_code`.** isolated-vm's `timeout` only bounds synchronous execution; a script that returns a never-settling promise hung the whole turn indefinitely. `runtime.execute` now races the eval against a host timer that disposes the isolate, so `isolate.timeLimit` bounds total execution and returns a `TimeoutError`.
+ - **Stopping a turn mid-tool-call dropped the assistant turn from canonical history.** On cancellation the in-flight assistant message (its text + tool calls) lives only in step-local state — it's pushed to `messages` together with the tool results, which never arrive when stopped. The cancellation snapshot now re-attaches that turn with a synthesized "cancelled by user" tool result for each pending tool call, so the next request keeps a valid record instead of showing the model back-to-back user messages.
+
+ - [`c604fd6`](https://github.com/cesr/poncho-ai/commit/c604fd6b41dfd06600af85daa892ab4fd3852bad) Thanks [@cesr](https://github.com/cesr)! - harness: harden subagent → parent result delivery so a step-exhausted subagent stops surfacing as `(no response)`.
+ - **Force a closing text turn on the final step.** On the last permitted step (`step === maxSteps`) the run loop now strips the tools and appends a one-shot "summarize now, no tools" nudge to that model request, so a run that hits its step ceiling produces a real text summary instead of terminating on a dangling tool call. Previously such a run ended on a tool-call turn with no final text — common in subagents doing many tool calls — and the parent received an empty result. `maxSteps` itself is unchanged; the nudge is request-only and never written to history.
+ - **Content-shape-robust result extraction.** Pulling a subagent's response no longer requires the last assistant message to be a plain `string`. The new `lastAssistantText` helper handles `string`, `ContentPart[]`, and the run loop's `{"text":...,"tool_calls":[...]}` envelope, and walks backwards to the last non-empty assistant text — so a transcript that ends on a text-less tool turn still yields the prose produced just before it.
+ - **Actionable empty-result sentinel.** When a subagent genuinely produced no summary, the injected parent message now says how many steps ran and points at `read_subagent(<id>, mode:"assistant")` to recover the work, instead of a dead-end `(no response)`.
+
  ## 0.50.3
 
  ### Patch Changes
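
The 0.50.4 timers fix in the changelog above can be pictured with a small virtual-clock scheduler. This is a minimal sketch under stated assumptions, not the package's polyfill: `VirtualTimers`, `drain`, `clear`, and `currentTime` are hypothetical names, and draining is triggered explicitly here, whereas the changelog says the real polyfill drains on the microtask queue.

```typescript
// Hypothetical sketch of "drain pending timers in delay order against a virtual clock".
type Timer = { id: number; due: number; seq: number; fn: () => void; every: number | undefined };

class VirtualTimers {
  private now = 0; // virtual milliseconds; never tied to real time
  private nextId = 1;
  private seq = 0; // tie-breaker so equal delays fire in registration order
  private pending = new Map<number, Timer>();

  setTimeout(fn: () => void, ms = 0): number {
    return this.add(fn, ms, undefined);
  }

  setInterval(fn: () => void, ms = 0): number {
    return this.add(fn, ms, Math.max(ms, 1));
  }

  clear(id: number): void {
    this.pending.delete(id);
  }

  currentTime(): number {
    return this.now;
  }

  private add(fn: () => void, ms: number, every: number | undefined): number {
    const id = this.nextId++;
    this.pending.set(id, { id, due: this.now + Math.max(ms, 0), seq: this.seq++, fn, every });
    return id;
  }

  // Run timers earliest-first, jumping the clock forward instead of waiting.
  drain(maxCallbacks = 1000): number {
    let ran = 0;
    while (this.pending.size > 0 && ran < maxCallbacks) {
      const next = [...this.pending.values()].sort((a, b) => a.due - b.due || a.seq - b.seq)[0];
      if (!next) break;
      this.now = Math.max(this.now, next.due);
      if (next.every === undefined) {
        this.pending.delete(next.id);
      } else {
        next.due = this.now + next.every; // intervals reschedule themselves
        next.seq = this.seq++;
      }
      next.fn();
      ran += 1;
    }
    return ran;
  }
}

const timers = new VirtualTimers();
const order: string[] = [];
timers.setTimeout(() => order.push("slow"), 50);
timers.setTimeout(() => order.push("fast"), 10);
const tick = timers.setInterval(() => {
  order.push("tick");
  if (order.filter((entry) => entry === "tick").length === 2) timers.clear(tick);
}, 20);
const callbacksRun = timers.drain();
```

In this run the callbacks fire in delay order (`fast`, `tick`, `tick`, `slow`) and the virtual clock ends at 50, without any real waiting. The `maxCallbacks` cap is a design choice of the sketch so that an interval nobody clears cannot spin forever.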
package/dist/index.d.ts CHANGED
@@ -1261,6 +1261,14 @@ interface HarnessOptions {
  * should also be browsable in the VFS. Empty by default.
  */
  systemSkillPaths?: string[];
+ /**
+ * Override the per-run hard wall-clock timeout, in seconds, taking
+ * precedence over the agent definition's `limits.timeout`. Platforms use
+ * this to give background subagents a longer budget than the foreground
+ * agent without forking the agent definition (e.g. a 1h research subagent
+ * vs. a 5-min foreground turn). `0` disables the hard timeout.
+ */
+ runTimeoutSecOverride?: number;
  }
  interface HarnessRunOutput {
  runId: string;
@@ -1280,6 +1288,7 @@ interface ArchivedToolResult {
  declare class AgentHarness {
  private readonly workingDir;
  private readonly environment;
+ private readonly runTimeoutSecOverride?;
  private modelProvider;
  private readonly modelProviderInjected;
  private readonly dispatcher;
@@ -1981,6 +1990,35 @@ declare const MAX_SUBAGENT_CALLBACK_COUNT = 20;
  declare const CALLBACK_LOCK_STALE_MS: number;
  declare const STALE_SUBAGENT_THRESHOLD_MS: number;
 
+ /**
+ * Find the last non-empty assistant text in a subagent transcript. Walking
+ * backwards (rather than reading only the final message) means a subagent
+ * that ended on a tool-call turn still yields the prose it produced just
+ * before — instead of surfacing to the parent as an empty result.
+ */
+ declare const lastAssistantText: (messages: Message[]) => string;
+ /**
+ * The run loop stuffs a synthetic `[Error: ...]` placeholder into the draft /
+ * persisted assistant text when a run ends on `run:error` (e.g. a timeout).
+ * That placeholder is not real model output — strip it so we don't surface it
+ * to the parent as the subagent's "response".
+ */
+ declare const realResponseText: (text: string | undefined) => string;
+ /**
+ * Build the result text delivered to the parent when a subagent ended
+ * abnormally (timeout / error) with no RunResult. We never drop the work it
+ * gathered, and the parent is told it didn't finish — e.g. it may not have
+ * written its output files — plus how to recover (use what's here, send a
+ * write-only follow-up, or read the full transcript).
+ */
+ declare const abnormalEndResponse: (opts: {
+ subagentId: string;
+ gathered: string;
+ runError?: {
+ code?: string;
+ message?: string;
+ };
+ }) => string;
  type ActiveConversationRun = {
  ownerId: string;
  abortController: AbortController;
@@ -2143,4 +2181,4 @@ interface RunConversationTurnResult {
  }
  declare const runConversationTurn: (opts: RunConversationTurnOpts) => Promise<RunConversationTurnResult>;
 
- export { type ActiveConversationRun, type ActiveSubagentRun, type AgentFrontmatter, AgentHarness, type AgentIdentity, type AgentLimitsConfig, type AgentModelConfig, AgentOrchestrator, type ApprovalEventItem, type ArchivedToolResult$1 as ArchivedToolResult, type BashConfig, BashEnvironmentManager, type BashExecutionLimits, type BuiltInToolToggles, CALLBACK_LOCK_STALE_MS, type CompactMessagesOptions, type CompactResult, type CompactionConfig, type ContinuationHooks, type Conversation, type ConversationCreateInit, type ConversationState, type ConversationStatusSnapshot, type ConversationStore, type ConversationSummary, type CreateSkillToolsOptions, type CronJobConfig, DEFAULT_AGENT_DESCRIPTION, DEFAULT_AGENT_NAME, DEFAULT_MAX_STEPS, DEFAULT_MODEL_NAME, DEFAULT_MODEL_PROVIDER, DEFAULT_TEMPERATURE, DEFAULT_TIMEOUT, type DefaultAgentDefinitionOptions, type EventSink, type ExecuteTurnResult, type HarnessOptions, type HarnessRunOutput, type HistorySource, InMemoryConversationStore, InMemoryEngine, InMemoryStateStore, type IsolateBinding, type IsolateConfig, LocalMcpBridge, LocalUploadStore, MAX_CONCURRENT_SUBAGENTS, MAX_CONTINUATION_COUNT, MAX_SUBAGENT_CALLBACK_COUNT, MAX_SUBAGENT_NESTING, type MainMemory, type McpConfig, type MemoryConfig, type MemoryStore, type MessagingChannelConfig, type ModelProviderFactory, type MountProvider, type NetworkConfig, OPENAI_CODEX_CLIENT_ID, type OpenAICodexAuthConfig, type OpenAICodexDeviceAuthRequest, type OpenAICodexSession, type OrchestratorHooks, type OrchestratorOptions, type OtlpConfig, type OtlpOption, PONCHO_UPLOAD_SCHEME, type ParsedAgent, type PendingSubagentApproval, type PendingSubagentResult, type PendingToolCall, type PonchoConfig, PonchoFsAdapter, PostgresEngine, type ProviderConfig, type Recurrence, type RecurrenceType, type Reminder, type ReminderCreateInput, type ReminderStatus, type ReminderStore, type RemoteMcpServerConfig, type RunConversationTurnOpts, type RunConversationTurnResult, type RunOutcome, type 
RunRequest, type RuntimeRenderContext, S3UploadStore, STALE_SUBAGENT_THRESHOLD_MS, STORAGE_SCHEMA_VERSION, type SecretsStore, type SkillContextEntry, type SkillMetadata, type SkillSource, SqliteEngine, type StateConfig, type StateProviderName, type StateStore, type StorageConfig, type StorageEngine, type StorageFactoryOptions, type StorageProvider, type StoredApproval, type SubagentManager, type SubagentResult, type SubagentSpawnResult, type SubagentSummary, type SubagentTranscript, type SubagentTranscriptMode, TOOL_RESULT_ARCHIVE_PARAM, type TelemetryConfig, TelemetryEmitter, type TenantTokenPayload, type ToolAccess, type ToolCall, ToolDispatcher, type ToolExecutionResult, type TurnDraftState, type TurnResultMetadata, type TurnSection, type UploadStore, type UploadsConfig, VFS_SCHEME, VercelBlobUploadStore, type VfsDirEntry, type VfsStat, type VirtualMount, applyTurnMetadata, buildAgentDirectoryName, buildApprovalCheckpoints, buildAssistantMetadata, buildSkillContextWindow, buildToolCompletedText, cloneSections, compactMessages, completeOpenAICodexDeviceAuth, computeNextOccurrence, createBashTool, createConversationStore, createConversationStoreFromEngine, createDefaultTools, createDeleteDirectoryTool, createDeleteTool, createEditTool, createMemoryStore, createMemoryStoreFromEngine, createMemoryTools, createModelProvider, createReminderStore, createReminderStoreFromEngine, createReminderTools, createSearchTools, createSecretsStore, createSkillTools, createStateStore, createStorageEngine, createSubagentTools, createTodoStoreFromEngine, createTurnDraftState, createUploadStore, createWriteTool, decodeFileInputData, defaultAgentDefinition, deleteOpenAICodexSession, deriveUploadKey, ensureAgentIdentity, estimateTokens, estimateTotalTokens, executeConversationTurn, findSafeSplitPoint, flushTurnDraft, generateAgentId, getAgentStoreDirectory, getModelContextWindow, getOpenAICodexAccessToken, getOpenAICodexAuthFilePath, getOpenAICodexRequiredScopes, getPonchoStoreRoot, 
isMessageArray, jsonSchemaToZod, loadCanonicalHistory, loadPonchoConfig, loadRunHistory, loadSkillContext, loadSkillInstructions, loadSkillMetadata, loadSkillMetadataFromDirs, loadVfsSkillMetadata, mergeSkills, normalizeApprovalCheckpoint, normalizeOtlp, normalizeScriptPolicyPath, normalizeToolAccess, parseAgentFile, parseAgentMarkdown, parseSkillFrontmatter, ponchoDocsTool, readOpenAICodexSession, readSkillResource, recordStandardTurnEvent, renderAgentPrompt, resolveAgentIdentity, resolveCompactionConfig, resolveEnv, resolveMemoryConfig, resolveRunRequest, resolveSkillDirs, resolveStateConfig, runConversationTurn, slugifyStorageComponent, startOpenAICodexDeviceAuth, verifyTenantToken, withToolResultArchiveParam, writeOpenAICodexSession };
+ export { type ActiveConversationRun, type ActiveSubagentRun, type AgentFrontmatter, AgentHarness, type AgentIdentity, type AgentLimitsConfig, type AgentModelConfig, AgentOrchestrator, type ApprovalEventItem, type ArchivedToolResult$1 as ArchivedToolResult, type BashConfig, BashEnvironmentManager, type BashExecutionLimits, type BuiltInToolToggles, CALLBACK_LOCK_STALE_MS, type CompactMessagesOptions, type CompactResult, type CompactionConfig, type ContinuationHooks, type Conversation, type ConversationCreateInit, type ConversationState, type ConversationStatusSnapshot, type ConversationStore, type ConversationSummary, type CreateSkillToolsOptions, type CronJobConfig, DEFAULT_AGENT_DESCRIPTION, DEFAULT_AGENT_NAME, DEFAULT_MAX_STEPS, DEFAULT_MODEL_NAME, DEFAULT_MODEL_PROVIDER, DEFAULT_TEMPERATURE, DEFAULT_TIMEOUT, type DefaultAgentDefinitionOptions, type EventSink, type ExecuteTurnResult, type HarnessOptions, type HarnessRunOutput, type HistorySource, InMemoryConversationStore, InMemoryEngine, InMemoryStateStore, type IsolateBinding, type IsolateConfig, LocalMcpBridge, LocalUploadStore, MAX_CONCURRENT_SUBAGENTS, MAX_CONTINUATION_COUNT, MAX_SUBAGENT_CALLBACK_COUNT, MAX_SUBAGENT_NESTING, type MainMemory, type McpConfig, type MemoryConfig, type MemoryStore, type MessagingChannelConfig, type ModelProviderFactory, type MountProvider, type NetworkConfig, OPENAI_CODEX_CLIENT_ID, type OpenAICodexAuthConfig, type OpenAICodexDeviceAuthRequest, type OpenAICodexSession, type OrchestratorHooks, type OrchestratorOptions, type OtlpConfig, type OtlpOption, PONCHO_UPLOAD_SCHEME, type ParsedAgent, type PendingSubagentApproval, type PendingSubagentResult, type PendingToolCall, type PonchoConfig, PonchoFsAdapter, PostgresEngine, type ProviderConfig, type Recurrence, type RecurrenceType, type Reminder, type ReminderCreateInput, type ReminderStatus, type ReminderStore, type RemoteMcpServerConfig, type RunConversationTurnOpts, type RunConversationTurnResult, type RunOutcome, type 
RunRequest, type RuntimeRenderContext, S3UploadStore, STALE_SUBAGENT_THRESHOLD_MS, STORAGE_SCHEMA_VERSION, type SecretsStore, type SkillContextEntry, type SkillMetadata, type SkillSource, SqliteEngine, type StateConfig, type StateProviderName, type StateStore, type StorageConfig, type StorageEngine, type StorageFactoryOptions, type StorageProvider, type StoredApproval, type SubagentManager, type SubagentResult, type SubagentSpawnResult, type SubagentSummary, type SubagentTranscript, type SubagentTranscriptMode, TOOL_RESULT_ARCHIVE_PARAM, type TelemetryConfig, TelemetryEmitter, type TenantTokenPayload, type ToolAccess, type ToolCall, ToolDispatcher, type ToolExecutionResult, type TurnDraftState, type TurnResultMetadata, type TurnSection, type UploadStore, type UploadsConfig, VFS_SCHEME, VercelBlobUploadStore, type VfsDirEntry, type VfsStat, type VirtualMount, abnormalEndResponse, applyTurnMetadata, buildAgentDirectoryName, buildApprovalCheckpoints, buildAssistantMetadata, buildSkillContextWindow, buildToolCompletedText, cloneSections, compactMessages, completeOpenAICodexDeviceAuth, computeNextOccurrence, createBashTool, createConversationStore, createConversationStoreFromEngine, createDefaultTools, createDeleteDirectoryTool, createDeleteTool, createEditTool, createMemoryStore, createMemoryStoreFromEngine, createMemoryTools, createModelProvider, createReminderStore, createReminderStoreFromEngine, createReminderTools, createSearchTools, createSecretsStore, createSkillTools, createStateStore, createStorageEngine, createSubagentTools, createTodoStoreFromEngine, createTurnDraftState, createUploadStore, createWriteTool, decodeFileInputData, defaultAgentDefinition, deleteOpenAICodexSession, deriveUploadKey, ensureAgentIdentity, estimateTokens, estimateTotalTokens, executeConversationTurn, findSafeSplitPoint, flushTurnDraft, generateAgentId, getAgentStoreDirectory, getModelContextWindow, getOpenAICodexAccessToken, getOpenAICodexAuthFilePath, getOpenAICodexRequiredScopes, 
getPonchoStoreRoot, isMessageArray, jsonSchemaToZod, lastAssistantText, loadCanonicalHistory, loadPonchoConfig, loadRunHistory, loadSkillContext, loadSkillInstructions, loadSkillMetadata, loadSkillMetadataFromDirs, loadVfsSkillMetadata, mergeSkills, normalizeApprovalCheckpoint, normalizeOtlp, normalizeScriptPolicyPath, normalizeToolAccess, parseAgentFile, parseAgentMarkdown, parseSkillFrontmatter, ponchoDocsTool, readOpenAICodexSession, readSkillResource, realResponseText, recordStandardTurnEvent, renderAgentPrompt, resolveAgentIdentity, resolveCompactionConfig, resolveEnv, resolveMemoryConfig, resolveRunRequest, resolveSkillDirs, resolveStateConfig, runConversationTurn, slugifyStorageComponent, startOpenAICodexDeviceAuth, verifyTenantToken, withToolResultArchiveParam, writeOpenAICodexSession };
package/dist/index.js CHANGED
@@ -8626,6 +8626,7 @@ var now = () => Date.now();
  var FIRST_CHUNK_TIMEOUT_MS = 9e4;
  var MAX_TRANSIENT_STEP_RETRIES = 1;
  var COMPACTION_CHECK_INTERVAL_STEPS = 3;
+ var FINAL_STEP_SUMMARY_PROMPT = "You have reached the maximum number of steps for this run and cannot call any more tools. Do NOT attempt any tool calls. Using only the work you have already done, write your final response now: summarize what you found or accomplished, include any concrete results, and flag anything left unfinished.";
  var TOOL_RESULT_ARCHIVE_PARAM = "__toolResultArchive";
  var TOOL_RESULT_TRUNCATED_PREFIX = "[TRUNCATED_TOOL_RESULT]";
  var TOOL_RESULT_PREVIEW_CHARS = 700;
@@ -9169,6 +9170,7 @@ function extractMediaFromToolOutput(output) {
  var AgentHarness = class _AgentHarness {
  workingDir;
  environment;
+ runTimeoutSecOverride;
  modelProvider;
  modelProviderInjected;
  dispatcher = new ToolDispatcher();
@@ -9374,6 +9376,7 @@ var AgentHarness = class _AgentHarness {
  constructor(options = {}) {
  this.workingDir = options.workingDir ?? process.cwd();
  this.environment = options.environment ?? "development";
+ this.runTimeoutSecOverride = options.runTimeoutSecOverride;
  this.modelProviderInjected = !!options.modelProvider;
  this.modelProvider = options.modelProvider ?? createModelProvider("anthropic");
  this.uploadStore = options.uploadStore;
@@ -9951,7 +9954,7 @@ var AgentHarness = class _AgentHarness {
  this.registerIfMissing(createEditFileTool(getFs));
  this.registerIfMissing(createWriteFileTool(getFs));
  if (config?.isolate) {
- const { createRunCodeTool, buildRunCodeDescription, bundleLibraries } = await import("./isolate-BNQ6P3HI.js");
+ const { createRunCodeTool, buildRunCodeDescription, bundleLibraries } = await import("./isolate-F2PPSUL6.js");
  let libraryPreamble = null;
  if (config.isolate.libraries?.length) {
  libraryPreamble = await bundleLibraries(config.isolate.libraries, this.workingDir);
@@ -10236,7 +10239,7 @@ var AgentHarness = class _AgentHarness {
  const runId = `run_${randomUUID5()}`;
  const start = now();
  const maxSteps = agent.frontmatter.limits?.maxSteps ?? 20;
- const configuredTimeout = agent.frontmatter.limits?.timeout;
+ const configuredTimeout = this.runTimeoutSecOverride ?? agent.frontmatter.limits?.timeout;
  const timeoutMs = this.environment === "development" && configuredTimeout == null ? 0 : (configuredTimeout ?? 300) * 1e3;
  const platformMaxDurationSec = Number(process.env.PONCHO_MAX_DURATION) || 0;
  const softDeadlineMs = input.disableSoftDeadline || platformMaxDurationSec <= 0 ? 0 : platformMaxDurationSec * 800;
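
The precedence introduced in the hunk above (override first, then the agent's `limits.timeout`, then a 300 s default, with development runs unbounded when nothing is configured) can be restated as a standalone function. This is a sketch that mirrors the two changed lines; `resolveTimeoutMs` and `TimeoutInputs` are hypothetical names, and it assumes, as the `index.d.ts` doc comment states, that a resulting `0` means the hard timeout is disabled.

```typescript
// Hypothetical restatement of the timeout resolution shown in the diff.
interface TimeoutInputs {
  environment: string; // e.g. "development" or "production"
  runTimeoutSecOverride?: number; // constructor-level override, in seconds
  agentTimeoutSec?: number; // the agent definition's limits.timeout, in seconds
}

function resolveTimeoutMs(inputs: TimeoutInputs): number {
  // `??` only falls through on null/undefined, so an override of 0 is respected.
  const configured = inputs.runTimeoutSecOverride ?? inputs.agentTimeoutSec;
  if (inputs.environment === "development" && configured == null) return 0;
  return (configured ?? 300) * 1000;
}

const foregroundMs = resolveTimeoutMs({ environment: "production", agentTimeoutSec: 300 });
const backgroundMs = resolveTimeoutMs({
  environment: "production",
  agentTimeoutSec: 300,
  runTimeoutSecOverride: 3600,
});
const disabledMs = resolveTimeoutMs({
  environment: "production",
  agentTimeoutSec: 300,
  runTimeoutSecOverride: 0,
});
const devDefaultMs = resolveTimeoutMs({ environment: "development" });
```

Because the fallback uses nullish coalescing rather than `||`, an explicit `0` is not replaced by the agent's limit or the default, which is what lets `0` act as "no hard timeout".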
@@ -10327,7 +10330,7 @@ Examples:${this.environment !== "production" ? `
  Files in the VFS are accessible to the user via \`/api/vfs/{path}\`. For example, a file at \`/downloads/report.pdf\` can be linked as \`/api/vfs/downloads/report.pdf\`. Use this to share downloadable files with the user.` : "";
  let isolateContext = "";
  if (this.loadedConfig?.isolate && this.dispatcher.get("run_code")) {
- const { generateIsolateTypeStubs } = await import("./isolate-BNQ6P3HI.js");
+ const { generateIsolateTypeStubs } = await import("./isolate-F2PPSUL6.js");
  const typeStubs = generateIsolateTypeStubs(this.loadedConfig.isolate);
  isolateContext = `
 
@@ -10374,10 +10377,40 @@ ${this.skillFingerprint}`;
  };
  const isCancelled = () => input.abortSignal?.aborted === true;
  let cancellationEmitted = false;
+ let inflightTurn = null;
  const emitCancellation = () => {
  cancellationEmitted = true;
- const snapshot = trimToValidPrefix([...messages]);
- return pushEvent({ type: "run:cancelled", runId, messages: snapshot });
+ const snapshot = [...messages];
+ if (inflightTurn && (inflightTurn.text.length > 0 || inflightTurn.toolCalls.length > 0)) {
+ const hasToolCalls = inflightTurn.toolCalls.length > 0;
+ const assistantContent = hasToolCalls ? JSON.stringify({
+ text: inflightTurn.text,
+ tool_calls: inflightTurn.toolCalls.map((tc) => ({
+ id: tc.id,
+ name: tc.name,
+ input: tc.input
+ }))
+ }) : inflightTurn.text;
+ snapshot.push({
+ role: "assistant",
+ content: assistantContent,
+ metadata: { timestamp: now(), id: randomUUID5(), runId }
+ });
+ if (hasToolCalls) {
+ const cancelledResults = inflightTurn.toolCalls.map((tc) => ({
+ type: "tool_result",
+ tool_use_id: tc.id,
+ tool_name: tc.name,
+ content: "Tool execution cancelled by user."
+ }));
+ snapshot.push({
+ role: "tool",
+ content: JSON.stringify(cancelledResults),
+ metadata: { timestamp: now(), id: randomUUID5(), runId }
+ });
+ }
+ }
+ return pushEvent({ type: "run:cancelled", runId, messages: trimToValidPrefix(snapshot) });
  };
  const resolvedModelName = agent.frontmatter.model?.name ?? "claude-opus-4-5";
  const contextWindow = agent.frontmatter.model?.contextWindow ?? getModelContextWindow(resolvedModelName);
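
The new `emitCancellation` body in the hunk above re-attaches the in-flight assistant turn and pairs every pending tool call with a synthetic result. The following is a simplified, self-contained sketch of that behaviour. `snapshotOnCancel`, `ChatMessage`, `InflightTurn`, and `PendingCall` are hypothetical names, and the sketch leaves out the per-message `metadata` and the `trimToValidPrefix` pass that the real code applies.

```typescript
// Hypothetical, simplified model of the cancellation snapshot.
interface PendingCall { id: string; name: string; input: unknown }
interface InflightTurn { text: string; toolCalls: PendingCall[] }
interface ChatMessage { role: "user" | "assistant" | "tool"; content: string }

function snapshotOnCancel(messages: ChatMessage[], inflight: InflightTurn | null): ChatMessage[] {
  const snapshot = [...messages]; // never mutate the canonical history
  if (!inflight || (inflight.text.length === 0 && inflight.toolCalls.length === 0)) {
    return snapshot;
  }
  const hasToolCalls = inflight.toolCalls.length > 0;
  // With tool calls, the assistant turn uses the {"text", "tool_calls"} envelope.
  snapshot.push({
    role: "assistant",
    content: hasToolCalls
      ? JSON.stringify({ text: inflight.text, tool_calls: inflight.toolCalls })
      : inflight.text,
  });
  if (hasToolCalls) {
    // Every pending call gets a result, so no tool call is left unanswered.
    const cancelled = inflight.toolCalls.map((tc) => ({
      type: "tool_result",
      tool_use_id: tc.id,
      tool_name: tc.name,
      content: "Tool execution cancelled by user.",
    }));
    snapshot.push({ role: "tool", content: JSON.stringify(cancelled) });
  }
  return snapshot;
}

const history: ChatMessage[] = [{ role: "user", content: "Find the latest report" }];
const stopped = snapshotOnCancel(history, {
  text: "Searching now.",
  toolCalls: [{ id: "call_1", name: "search", input: { q: "latest report" } }],
});
const stoppedRoles = stopped.map((m) => m.role);
```

Here `stoppedRoles` is `user`, `assistant`, `tool`, and `history` still holds only the original user message. Without the re-attached turn, the next request would show the model two user messages in a row.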
@@ -10460,6 +10493,7 @@ ${this.skillFingerprint}`;
  let cachedCoreMessages = [];
  let convertedUpTo = 0;
  for (let step = 1; step <= maxSteps; step += 1) {
+ inflightTurn = null;
  try {
  yield* drainBrowserEvents();
  if (isCancelled()) {
@@ -10817,11 +10851,14 @@ ${textContent}` };
  ...cachedMessages
  ] : cachedMessages;
  const telemetryEnabled = this.loadedConfig?.telemetry?.enabled !== false;
+ const isFinalStep = step === maxSteps;
+ const toolsForStep = isFinalStep ? {} : tools;
+ const messagesForStep = isFinalStep ? [...finalMessages, { role: "user", content: FINAL_STEP_SUMMARY_PROMPT }] : finalMessages;
  const result = await streamText({
  model: modelInstance,
  ...useStaticCache ? {} : { system: systemPrompt },
- messages: finalMessages,
- tools,
+ messages: messagesForStep,
+ tools: toolsForStep,
  temperature,
  abortSignal: input.abortSignal,
  ...typeof maxTokens === "number" ? { maxTokens } : {},
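
The final-step shaping in the hunk above changes only the request for the last permitted step: the tools are removed and a one-shot nudge is appended, while the stored history stays as it was. A minimal sketch of that idea follows. `shapeStepRequest` and `ChatMessage` are hypothetical names, and the prompt string is an abbreviated stand-in for `FINAL_STEP_SUMMARY_PROMPT`.

```typescript
// Hypothetical sketch of the request-only final-step nudge.
interface ChatMessage { role: "user" | "assistant" | "tool"; content: string }

// Abbreviated stand-in for the FINAL_STEP_SUMMARY_PROMPT constant in the diff.
const SUMMARY_NUDGE = "You have reached the maximum number of steps. Do NOT call tools; write your final response now.";

function shapeStepRequest<T>(
  step: number,
  maxSteps: number,
  history: ChatMessage[],
  tools: Record<string, T>,
): { messages: ChatMessage[]; tools: Record<string, T> } {
  const isFinalStep = step === maxSteps;
  return {
    // A new array is built on the final step, so `history` itself never holds the nudge.
    messages: isFinalStep ? [...history, { role: "user", content: SUMMARY_NUDGE }] : history,
    tools: isFinalStep ? {} : tools,
  };
}

const stepHistory: ChatMessage[] = [{ role: "user", content: "Research topic X" }];
const availableTools = { search: { description: "web search" } };
const midRun = shapeStepRequest(3, 20, stepHistory, availableTools);
const lastStep = shapeStepRequest(20, 20, stepHistory, availableTools);
```

With no tools offered, the model can only answer in text, which is how a run at its step ceiling ends on a summary rather than a dangling tool call.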
@@ -10950,6 +10987,7 @@ ${textContent}` };
  yield pushEvent({ type: "run:completed", runId, result: result_ });
  return;
  }
+ inflightTurn = { text: fullText, toolCalls: [] };
  if (isCancelled()) {
  yield emitCancellation();
  return;
@@ -11036,6 +11074,7 @@ ${textContent}` };
  name: tc.toolName,
  input: tc.input
  }));
+ if (inflightTurn) inflightTurn.toolCalls = toolCalls;
  if (toolCalls.length === 0) {
  if (fullText.length === 0) {
  const isExpectedEmpty = finishReason === "stop";
@@ -11416,6 +11455,7 @@ ${textContent}` };
  content: JSON.stringify(toolResultsForModel),
  metadata: toolMsgMeta
  });
+ inflightTurn = null;
  if (softDeadlineMs > 0 && now() - start > softDeadlineMs) {
  const result_ = {
  status: "completed",
@@ -12282,6 +12322,38 @@ var CALLBACK_LOCK_STALE_MS = 5 * 60 * 1e3;
  var STALE_SUBAGENT_THRESHOLD_MS = 5 * 60 * 1e3;
 
  // src/orchestrator/orchestrator.ts
+ import { getTextContent as getTextContent3 } from "@poncho-ai/sdk";
+ var assistantMessageText = (message) => {
+ const raw = getTextContent3(message).trim();
+ if (raw.startsWith("{") && raw.includes('"tool_calls"')) {
+ try {
+ const parsed = JSON.parse(raw);
+ if (typeof parsed.text === "string") return parsed.text.trim();
+ } catch {
+ }
+ }
+ return raw;
+ };
+ var lastAssistantText = (messages) => {
+ for (let i = messages.length - 1; i >= 0; i -= 1) {
+ if (messages[i].role !== "assistant") continue;
+ const text = assistantMessageText(messages[i]);
+ if (text) return text;
+ }
+ return "";
+ };
+ var realResponseText = (text) => {
+ const t = (text ?? "").trim();
+ return t.startsWith("[Error:") ? "" : t;
+ };
+ var abnormalEndResponse = (opts) => {
+ const timedOut = opts.runError?.code === "TIMEOUT";
+ const head = timedOut ? "[Subagent hit its time limit before finishing \u2014 it may not have written its output files.]" : `[Subagent ended before finishing${opts.runError?.message ? `: ${opts.runError.message}` : ""}.]`;
+ const recover = opts.gathered ? "Partial work it gathered is below \u2014 write the files yourself from it, or send a tight write-only follow-up with message_subagent." : `Use read_subagent("${opts.subagentId}", mode:"full") to recover what it gathered.`;
+ return opts.gathered ? `${head} ${recover}
+
+ ${opts.gathered}` : `${head} ${recover}`;
+ };
  var AgentOrchestrator = class {
  harness;
  conversationStore;
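
The helpers added in the hunk above feed a recovery chain: run response, then streamed draft, then a backwards walk over the transcript, with the synthetic `[Error: …]` placeholder discarded at each stage. The sketch below re-creates that chain on plain string content. It is an approximation under stated assumptions: the `Sketch`-suffixed functions and `ChatMessage` are hypothetical, and the real `assistantMessageText` reads content through `getTextContent` from `@poncho-ai/sdk`, which this sketch replaces with a direct string read.

```typescript
// Hypothetical string-only re-creation of the recovery helpers.
interface ChatMessage { role: "user" | "assistant" | "tool"; content: string }

const assistantTextSketch = (message: ChatMessage): string => {
  const raw = message.content.trim();
  // Unwrap the run loop's {"text": ..., "tool_calls": [...]} envelope when present.
  if (raw.startsWith("{") && raw.includes('"tool_calls"')) {
    try {
      const parsed = JSON.parse(raw) as { text?: unknown };
      if (typeof parsed.text === "string") return parsed.text.trim();
    } catch {
      // Not valid JSON after all: fall through and treat it as plain text.
    }
  }
  return raw;
};

const lastAssistantTextSketch = (messages: ChatMessage[]): string => {
  for (let i = messages.length - 1; i >= 0; i -= 1) {
    const message = messages[i];
    if (!message || message.role !== "assistant") continue;
    const text = assistantTextSketch(message);
    if (text) return text; // skip text-less tool turns and keep walking back
  }
  return "";
};

const realResponseTextSketch = (text: string | undefined): string => {
  const t = (text ?? "").trim();
  return t.startsWith("[Error:") ? "" : t;
};

// A run that timed out: no run result, a placeholder draft, and a transcript
// that ends on a tool-call turn with empty text.
const transcript: ChatMessage[] = [
  { role: "user", content: "Research topic X" },
  { role: "assistant", content: "Found three relevant sources." },
  { role: "assistant", content: JSON.stringify({ text: "", tool_calls: [{ id: "t1", name: "search", input: {} }] }) },
];
const draftText = "[Error: run timed out]";
const gathered =
  realResponseTextSketch(undefined) ||
  realResponseTextSketch(draftText) ||
  realResponseTextSketch(lastAssistantTextSketch(transcript));
```

In this example `gathered` ends up as the prose from the second message, since the missing run response and the placeholder draft both reduce to an empty string and the final tool turn carries no text.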
@@ -12807,6 +12879,7 @@ var AgentOrchestrator = class {
  const draft = createTurnDraftState();
  let latestRunId = "";
  let runResult;
+ let runError;
  try {
  const conversation = await this.conversationStore.getWithArchive(childConversationId);
  if (!conversation) throw new Error("Subagent conversation not found");
@@ -12943,6 +13016,7 @@ var AgentOrchestrator = class {
  }
  }
  if (event.type === "run:error") {
+ runError = { code: event.error.code, message: event.error.message };
  draft.assistantResponse = draft.assistantResponse || `[Error: ${event.error.message}]`;
  }
  await this.eventSink(childConversationId, event);
@@ -12990,7 +13064,12 @@ var AgentOrchestrator = class {
  }
  return;
  }
- conv.subagentMeta = { ...conv.subagentMeta, status: "completed" };
+ const abnormalEnd = !runResult;
+ conv.subagentMeta = {
+ ...conv.subagentMeta,
+ status: abnormalEnd ? "error" : "completed",
+ ...abnormalEnd ? { error: { code: runError?.code ?? "SUBAGENT_INCOMPLETE", message: runError?.message ?? "subagent ended without a result" } } : {}
+ };
  await this.conversationStore.update(conv);
  }
  this.hooks?.onStreamEnd?.(childConversationId);
@@ -12999,21 +13078,25 @@ var AgentOrchestrator = class {
  subagentId: childConversationId,
  conversationId: childConversationId
  });
- let subagentResponse = runResult?.response ?? draft.assistantResponse;
- if (!subagentResponse) {
+ let gathered = realResponseText(runResult?.response) || realResponseText(draft.assistantResponse);
+ if (!gathered) {
  const freshSubConv = await this.conversationStore.get(childConversationId);
- if (freshSubConv) {
- const lastAssistant = [...freshSubConv.messages].reverse().find((m) => m.role === "assistant");
- if (lastAssistant && typeof lastAssistant.content === "string") {
- subagentResponse = lastAssistant.content;
- }
- }
+ if (freshSubConv) gathered = realResponseText(lastAssistantText(freshSubConv.messages));
  }
+ const abnormal = !runResult;
+ const subagentResponse = abnormal ? abnormalEndResponse({ subagentId: childConversationId, gathered, runError }) : gathered;
  const pendingResult = {
  subagentId: childConversationId,
  task,
- status: "completed",
- result: runResult ? { status: runResult.status, response: subagentResponse, steps: runResult.steps, tokens: { input: 0, output: 0, cached: 0 }, duration: runResult.duration } : void 0,
+ status: abnormal ? "error" : "completed",
+ result: {
+ status: runResult?.status ?? "error",
+ response: subagentResponse,
+ steps: runResult?.steps ?? 0,
+ tokens: { input: 0, output: 0, cached: 0 },
+ duration: runResult?.duration ?? 0
+ },
+ ...abnormal ? { error: { code: runError?.code ?? "SUBAGENT_INCOMPLETE", message: runError?.message ?? "subagent ended without a result" } } : {},
  timestamp: Date.now()
  };
  await this.conversationStore.appendSubagentResult(parentConversationId, pendingResult);
@@ -13095,8 +13178,10 @@ var AgentOrchestrator = class {
  const callbackCount = (conversation.subagentCallbackCount ?? 0) + 1;
  conversation.subagentCallbackCount = callbackCount;
  for (const pr of pendingResults) {
+ const responseText = (pr.result?.response ?? "").trim();
+ const responseLine = responseText || `(subagent produced no final summary after ${pr.result?.steps ?? 0} step(s); its work may be incomplete. Call read_subagent with subagent_id "${pr.subagentId}" and mode "assistant" to retrieve what it did.)`;
  const resultBody = pr.result ? `Status: ${pr.result.status}
- Response: ${pr.result.response ?? "(no response)"}
+ Response: ${responseLine}
  Steps: ${pr.result.steps}, Duration: ${pr.result.duration}ms` : pr.error ? `Error: ${pr.error.message}` : "(no result)";
  conversation.messages.push({
  role: "user",
@@ -13259,6 +13344,7 @@ ${resultBody}`,
13259
13344
  this.activeSubagentRuns.set(conversationId, { abortController: childAbortController, harness: childHarness, parentConversationId });
13260
13345
  const draft = createTurnDraftState();
13261
13346
  let runResult;
13347
+ let runError;
13262
13348
  try {
13263
13349
  const recallParams = this.hooks?.buildRecallParams?.({ ownerId, tenantId: conversation.tenantId, excludeConversationId: conversationId }) ?? {};
13264
13350
  for await (const event of childHarness.runWithTelemetry({
@@ -13291,6 +13377,7 @@ ${resultBody}`,
13291
13377
  }
13292
13378
  }
13293
13379
  if (event.type === "run:error") {
13380
+ runError = { code: event.error.code, message: event.error.message };
13294
13381
  draft.assistantResponse = draft.assistantResponse || `[Error: ${event.error.message}]`;
13295
13382
  }
13296
13383
  await this.eventSink(conversationId, event);
@@ -13339,7 +13426,12 @@ ${resultBody}`,
13339
13426
  }
13340
13427
  return;
13341
13428
  }
13342
- conv.subagentMeta = { ...conv.subagentMeta, status: "completed" };
13429
+ const abnormalEnd = !runResult;
13430
+ conv.subagentMeta = {
13431
+ ...conv.subagentMeta,
13432
+ status: abnormalEnd ? "error" : "completed",
13433
+ ...abnormalEnd ? { error: { code: runError?.code ?? "SUBAGENT_INCOMPLETE", message: runError?.message ?? "subagent ended without a result" } } : {}
13434
+ };
13343
13435
  await this.conversationStore.update(conv);
13344
13436
  }
13345
13437
  this.activeSubagentRuns.delete(conversationId);
@@ -13348,23 +13440,21 @@ ${resultBody}`,
13348
13440
  subagentId: conversationId,
13349
13441
  conversationId
13350
13442
  });
13351
- let subagentResponse = runResult?.response ?? draft.assistantResponse;
13352
- if (!subagentResponse) {
13443
+ let gathered = realResponseText(runResult?.response) || realResponseText(draft.assistantResponse);
13444
+ if (!gathered) {
13353
13445
  const freshSubConv = await this.conversationStore.get(conversationId);
13354
- if (freshSubConv) {
13355
- const lastAssistant = [...freshSubConv.messages].reverse().find((m) => m.role === "assistant");
13356
- if (lastAssistant) {
13357
- subagentResponse = typeof lastAssistant.content === "string" ? lastAssistant.content : "";
13358
- }
13359
- }
13446
+ if (freshSubConv) gathered = realResponseText(lastAssistantText(freshSubConv.messages));
13360
13447
  }
13448
+ const abnormal = !runResult;
13449
+ const subagentResponse = abnormal ? abnormalEndResponse({ subagentId: conversationId, gathered, runError }) : gathered;
13361
13450
  const parentConv = await this.conversationStore.get(parentConversationId);
13362
13451
  if (parentConv) {
13363
13452
  const result = {
13364
13453
  subagentId: conversationId,
13365
13454
  task,
13366
- status: "completed",
13367
- result: { status: "completed", response: subagentResponse, steps: runResult?.steps ?? 0, tokens: { input: 0, output: 0, cached: 0 }, duration: runResult?.duration ?? 0 },
13455
+ status: abnormal ? "error" : "completed",
13456
+ result: { status: runResult?.status ?? "error", response: subagentResponse, steps: runResult?.steps ?? 0, tokens: { input: 0, output: 0, cached: 0 }, duration: runResult?.duration ?? 0 },
13457
+ ...abnormal ? { error: { code: runError?.code ?? "SUBAGENT_INCOMPLETE", message: runError?.message ?? "subagent ended without a result" } } : {},
13368
13458
  timestamp: Date.now()
13369
13459
  };
13370
13460
  await this.conversationStore.appendSubagentResult(parentConversationId, result);
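The recovery order used in the two hunks above (run response, then streamed draft, then the stored transcript) can be sketched in isolation. This is a minimal illustration, not the package's implementation: the bodies of `realResponseText` and `lastAssistantText` are not shown in this diff, so `realText`, `lastAssistant`, and `recoverOutput` below are hypothetical stand-ins. The sketch assumes the synthetic placeholder always has the shape `[Error: …]`, as written by the `run:error` handler.

```javascript
// Hypothetical stand-ins for the helpers referenced in the diff; their real
// bodies are not shown here, so this behaviour is an assumption.

// Matches the synthetic placeholder written on run:error, e.g. "[Error: timeout]".
const ERROR_PLACEHOLDER = /^\[Error: [\s\S]*\]$/;

// Returns usable text, or "" when the input is missing or only the placeholder.
function realText(text) {
  if (typeof text !== "string") return "";
  const trimmed = text.trim();
  return ERROR_PLACEHOLDER.test(trimmed) ? "" : trimmed;
}

// Walks the transcript backwards to the most recent assistant message
// whose content is a plain string.
function lastAssistant(messages) {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m.role === "assistant" && typeof m.content === "string") return m.content;
  }
  return "";
}

// Fallback chain: run response, then streamed draft, then transcript.
function recoverOutput({ runResponse, draftResponse, transcript }) {
  return (
    realText(runResponse) ||
    realText(draftResponse) ||
    realText(lastAssistant(transcript ?? []))
  );
}
```

Because each stage is filtered through the same placeholder check, a draft that contains only `[Error: …]` no longer masks real work that is still recoverable from the transcript.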
@@ -13994,6 +14084,7 @@ export {
13994
14084
  ToolDispatcher,
13995
14085
  VFS_SCHEME,
13996
14086
  VercelBlobUploadStore,
14087
+ abnormalEndResponse,
13997
14088
  applyTurnMetadata,
13998
14089
  buildAgentDirectoryName,
13999
14090
  buildApprovalCheckpoints,
@@ -14048,6 +14139,7 @@ export {
14048
14139
  getPonchoStoreRoot,
14049
14140
  isMessageArray,
14050
14141
  jsonSchemaToZod,
14142
+ lastAssistantText,
14051
14143
  loadCanonicalHistory,
14052
14144
  loadPonchoConfig,
14053
14145
  loadRunHistory,
@@ -14067,6 +14159,7 @@ export {
14067
14159
  ponchoDocsTool,
14068
14160
  readOpenAICodexSession,
14069
14161
  readSkillResource,
14162
+ realResponseText,
14070
14163
  recordStandardTurnEvent,
14071
14164
  renderAgentPrompt,
14072
14165
  resolveAgentIdentity,
@@ -89,6 +89,8 @@ function createIsolateRuntime(config) {
89
89
  }
90
90
  const t0 = performance.now();
91
91
  let context;
92
+ let timedOut = false;
93
+ let wallTimer;
92
94
  try {
93
95
  context = await isolate.createContext();
94
96
  const jail = context.global;
@@ -121,12 +123,29 @@ function createIsolateRuntime(config) {
121
123
  const wrapped = `(async () => {
122
124
  ${code}
123
125
  })()`;
124
- const rawResult = await context.eval(wrapped, {
126
+ const evalPromise = context.eval(wrapped, {
125
127
  filename: "<user-code>",
126
128
  promise: true,
127
129
  copy: true,
128
130
  timeout: config.timeout
129
131
  });
132
+ const rawResult = config.timeout > 0 ? await Promise.race([
133
+ evalPromise,
134
+ new Promise((_resolve, reject) => {
135
+ wallTimer = setTimeout(() => {
136
+ timedOut = true;
137
+ try {
138
+ isolate.dispose();
139
+ } catch {
140
+ }
141
+ reject(new Error("Execution timed out"));
142
+ }, config.timeout);
143
+ })
144
+ ]) : await evalPromise;
145
+ if (wallTimer) {
146
+ clearTimeout(wallTimer);
147
+ wallTimer = void 0;
148
+ }
130
149
  const stdout = await context.eval("__stdout.join('\\n')", { copy: true });
131
150
  const stderr = await context.eval("__stderr.join('\\n')", { copy: true });
132
151
  let result;
@@ -151,6 +170,17 @@ ${code}
151
170
  executionTimeMs: elapsed
152
171
  };
153
172
  }
173
+ if (timedOut) {
174
+ return {
175
+ stdout: "",
176
+ stderr: "",
177
+ error: {
178
+ message: `Execution timed out after ${config.timeout}ms`,
179
+ name: "TimeoutError"
180
+ },
181
+ executionTimeMs: elapsed
182
+ };
183
+ }
154
184
  let stdout = "";
155
185
  let stderr = "";
156
186
  if (context) {
@@ -169,6 +199,7 @@ ${code}
169
199
  executionTimeMs: elapsed
170
200
  };
171
201
  } finally {
202
+ if (wallTimer) clearTimeout(wallTimer);
172
203
  if (abortHandler && signal) {
173
204
  signal.removeEventListener("abort", abortHandler);
174
205
  }
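The hunks above race the evaluation promise against a host-side timer, so that a run whose promise never settles is still bounded. The same pattern can be sketched as a standalone helper. The name `withWallClockTimeout` and the `onTimeout` callback are hypothetical and not part of the package; in the diff, the cleanup step is `isolate.dispose()` and the timer handle is kept in `wallTimer`.

```javascript
// Minimal sketch of a wall-clock timeout built on Promise.race.
// `withWallClockTimeout` and `onTimeout` are illustrative names (assumptions).
async function withWallClockTimeout(work, timeoutMs, onTimeout) {
  // A non-positive timeout means "no limit", mirroring the config.timeout > 0 check.
  if (!(timeoutMs > 0)) return await work;

  let timer;
  try {
    return await Promise.race([
      work,
      new Promise((_resolve, reject) => {
        timer = setTimeout(() => {
          // Best-effort cleanup; a failing cleanup must not hide the timeout.
          try { onTimeout?.(); } catch { /* ignore cleanup failures */ }
          reject(new Error(`Execution timed out after ${timeoutMs}ms`));
        }, timeoutMs);
      }),
    ]);
  } finally {
    // Always clear the timer so a fast result does not leave it pending.
    clearTimeout(timer);
  }
}
```

Clearing the timer in `finally` corresponds to the `if (wallTimer) clearTimeout(wallTimer);` line added to the `finally` block in the diff: without it, a run that finishes early would keep the host timer alive until it fires.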
@@ -927,50 +958,79 @@ var POLYFILL_FETCH_STUB = `
927
958
  `;
928
959
  var POLYFILL_TIMERS = `
929
960
  // --- Timers polyfill ---
961
+ //
962
+ // The isolate has no host event loop, so real wall-clock delays can't be
963
+ // honoured. What we *can* do is drain pending timers on the microtask queue
964
+ // (which isolated-vm does pump while resolving the run's promise), firing
965
+ // them in order of their requested delay against a virtual clock. This makes
966
+ // the overwhelmingly common pattern \u2014 \`await new Promise(r => setTimeout(r, n))\`
967
+ // as a sleep \u2014 actually resolve instead of hanging the whole run forever.
968
+ // Delays collapse to "as soon as possible, in delay order"; that's the right
969
+ // trade for a sandbox with no real time. A runaway setInterval is bounded by
970
+ // __MAX_FIRES here and, ultimately, by the host-side wall-clock timeout.
930
971
  (function() {
931
972
  let __timerId = 0;
932
- const __timers = new Map();
973
+ const __timers = new Map(); // id -> { fn, due, type }
974
+ const __intervals = new Set(); // ids that should reschedule
975
+ let __vclock = 0; // virtual clock (ms)
976
+ let __draining = false;
977
+ let __fired = 0;
978
+ const __MAX_FIRES = 1000000; // backstop against a runaway interval
979
+
980
+ function __schedule(fn, delayMs, type, id) {
981
+ __timers.set(id, { fn, due: __vclock + delayMs, type });
982
+ if (!__draining) __drain();
983
+ return id;
984
+ }
985
+
986
+ function __drain() {
987
+ __draining = true;
988
+ const step = function() {
989
+ if (__timers.size === 0) { __draining = false; return; }
990
+ // Pick the earliest-due timer (ties broken by insertion id for FIFO).
991
+ let pick = null;
992
+ for (const [id, t] of __timers) {
993
+ if (pick === null || t.due < pick.t.due || (t.due === pick.t.due && id < pick.id)) {
994
+ pick = { id, t };
995
+ }
996
+ }
997
+ __timers.delete(pick.id);
998
+ if (pick.t.due > __vclock) __vclock = pick.t.due;
999
+ __fired++;
1000
+ try { pick.t.fn(); } catch (e) { /* host timers swallow callback throws */ }
1001
+ if (__fired > __MAX_FIRES) { __draining = false; return; }
1002
+ Promise.resolve().then(step);
1003
+ };
1004
+ Promise.resolve().then(step);
1005
+ }
933
1006
 
934
1007
  globalThis.setTimeout = function(fn, delay) {
935
1008
  const id = ++__timerId;
936
1009
  const ms = Math.max(0, Number(delay) || 0);
937
- const start = Date.now();
938
- __timers.set(id, { fn, ms, start, type: "timeout" });
939
- // In the isolate, setTimeout returns the id but the callback is
940
- // executed via a polling mechanism in the async wrapper.
941
- // For simple cases (delay=0), we can use a microtask.
942
- if (ms === 0) {
943
- Promise.resolve().then(() => {
944
- if (__timers.has(id)) {
945
- __timers.delete(id);
946
- fn();
947
- }
948
- });
949
- }
950
- return id;
1010
+ return __schedule(typeof fn === "function" ? fn : function() {}, ms, "timeout", id);
951
1011
  };
952
1012
 
953
1013
  globalThis.clearTimeout = function(id) {
954
1014
  __timers.delete(id);
1015
+ __intervals.delete(id);
955
1016
  };
956
1017
 
957
1018
  globalThis.setInterval = function(fn, delay) {
958
1019
  const id = ++__timerId;
959
1020
  const ms = Math.max(1, Number(delay) || 1);
960
- const wrapper = () => {
961
- if (!__timers.has(id)) return;
962
- fn();
963
- if (__timers.has(id)) {
964
- globalThis.setTimeout(wrapper, ms);
1021
+ __intervals.add(id);
1022
+ const tick = function() {
1023
+ if (!__intervals.has(id)) return;
1024
+ try { fn(); } finally {
1025
+ if (__intervals.has(id)) __schedule(tick, ms, "interval", id);
965
1026
  }
966
1027
  };
967
- __timers.set(id, { fn: wrapper, ms, type: "interval" });
968
- globalThis.setTimeout(wrapper, ms);
969
- return id;
1028
+ return __schedule(tick, ms, "interval", id);
970
1029
  };
971
1030
 
972
1031
  globalThis.clearInterval = function(id) {
973
1032
  __timers.delete(id);
1033
+ __intervals.delete(id);
974
1034
  };
975
1035
 
976
1036
  // queueMicrotask if not available
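The virtual-clock drain described in the polyfill comment can be tried outside the isolate. The sketch below is a simplified version under stated assumptions: it covers only `setTimeout` and `clearTimeout`, omits the interval rescheduling and the `__MAX_FIRES` backstop, and uses the hypothetical name `createVirtualTimers` instead of patching `globalThis`.

```javascript
// Simplified sketch of microtask-driven timers against a virtual clock.
// `createVirtualTimers` is an illustrative name, not a package export.
function createVirtualTimers() {
  let nextId = 0;
  let clock = 0;          // virtual time in ms; never tied to Date.now()
  let draining = false;
  const timers = new Map(); // id -> { fn, due }

  function drain() {
    draining = true;
    const step = () => {
      if (timers.size === 0) { draining = false; return; }
      // Pick the earliest-due timer; ties go to the lower (older) id.
      let pickId = null;
      let pick = null;
      for (const [id, t] of timers) {
        if (pick === null || t.due < pick.due || (t.due === pick.due && id < pickId)) {
          pickId = id;
          pick = t;
        }
      }
      timers.delete(pickId);
      if (pick.due > clock) clock = pick.due; // jump the clock forward
      try { pick.fn(); } catch { /* a throwing callback must not stop the drain */ }
      Promise.resolve().then(step); // one timer per microtask turn
    };
    Promise.resolve().then(step);
  }

  return {
    setTimeout(fn, delay) {
      const id = ++nextId;
      const ms = Math.max(0, Number(delay) || 0);
      timers.set(id, { fn, due: clock + ms });
      if (!draining) drain();
      return id;
    },
    clearTimeout(id) { timers.delete(id); },
    now() { return clock; },
  };
}
```

With this shape, `await new Promise((r) => timers.setTimeout(r, 300))` resolves after the pending timers with shorter delays have fired, without any real waiting. Callbacks run in delay order rather than call order, which is the trade-off the polyfill comment describes.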
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@poncho-ai/harness",
3
- "version": "0.50.3",
3
+ "version": "0.50.5",
4
4
  "description": "Agent execution runtime - conversation loop, tool dispatch, streaming",
5
5
  "repository": {
6
6
  "type": "git",