npm - @poncho-ai/harness - Versions diffs - 0.50.3 → 0.50.5 - Mend

@poncho-ai/harness 0.50.3 → 0.50.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

package/.turbo/turbo-build.log +6 -6
package/CHANGELOG.md +24 -0
package/dist/index.d.ts +39 -1
package/dist/index.js +123 -30
package/dist/{isolate-BNQ6P3HI.js → isolate-F2PPSUL6.js} +84 -24
package/package.json +1 -1
package/src/harness.ts +99 -8
package/src/isolate/polyfills.ts +52 -23
package/src/isolate/runtime.ts +45 -1
package/src/orchestrator/index.ts +3 -0
package/src/orchestrator/orchestrator.ts +143 -25
package/test/isolate.test.ts +75 -0
package/test/orchestrator.test.ts +112 -0

package/.turbo/turbo-build.log CHANGED

@@ -1,5 +1,5 @@
-> @poncho-ai/harness@0.50.3 build /home/runner/work/poncho-ai/poncho-ai/packages/harness
+> @poncho-ai/harness@0.50.5 build /home/runner/work/poncho-ai/poncho-ai/packages/harness
 > node scripts/embed-docs.js && tsup src/index.ts --format esm --dts
 [embed-docs] Generated poncho-docs.ts with 4 topics
@@ -8,9 +8,9 @@
 CLI tsup v8.5.1
 CLI Target: es2022
 ESM Build start
-ESM dist/index.js            530.79 KB
-ESM dist/isolate-BNQ6P3HI.js 51.41 KB
-ESM ⚡️ Build success in 229ms
+ESM dist/index.js            535.57 KB
+ESM dist/isolate-F2PPSUL6.js 53.82 KB
+ESM ⚡️ Build success in 240ms
 DTS Build start
-DTS ⚡️ Build success in 7430ms
-DTS dist/index.d.ts 89.60 KB
+DTS ⚡️ Build success in 7598ms
+DTS dist/index.d.ts 91.35 KB

package/CHANGELOG.md CHANGED

@@ -1,5 +1,29 @@
 # @poncho-ai/harness
+## 0.50.5
+### Patch Changes
+- [`991a4b9`](https://github.com/cesr/poncho-ai/commit/991a4b98d6683c105c7aae50551d30b16080d618) Thanks [@cesr](https://github.com/cesr)! - harness: subagents survive the wall-clock timeout, and can be given a longer budget than the foreground turn.
+  Previously a subagent that hit its hard `timeout` (vs. `maxSteps`) emitted `run:error` with no `runResult`, so the orchestrator dropped everything it had gathered: the parent received a bare `(no result)`, the subagent was falsely marked `completed`, and the work — often dozens of completed searches, just short of the write step — was lost.
+  - **Graceful timeout/error delivery.** When a subagent run ends abnormally (timeout or model error) with no `runResult`, the orchestrator now recovers its real output (run response → streamed draft → transcript walk-back, discarding the synthetic `[Error: …]` placeholder), and delivers it tagged so the parent knows it didn't finish — it may not have written its files — with a concrete recovery hint (use the partial work, send a write-only `message_subagent` follow-up, or `read_subagent(…, mode:"full")`). The subagent is marked `status: "error"` (not a fake `completed`) and carries the failure in its `error` field. Applied to both the spawn and continuation paths.
+  - **`runTimeoutSecOverride` (HarnessOptions).** A constructor-level override for the per-run hard wall-clock timeout, taking precedence over the agent definition's `limits.timeout`. Lets a platform give background subagents a longer budget (e.g. 1h) than a foreground turn (5 min) without forking the agent definition. `0` disables the hard timeout.
+## 0.50.4
+### Patch Changes
+- [`9a39327`](https://github.com/cesr/poncho-ai/commit/9a393274d8a8061371d268fa81db3501cb0a8308) Thanks [@cesr](https://github.com/cesr)! - harness: fix three `run_code` / cancellation bugs.
+  - **Timers polyfill never fired delayed callbacks.** `setTimeout(fn, ms)` only ran the callback when `ms === 0`; any non-zero delay was stored and never invoked, so `await new Promise(r => setTimeout(r, 50))` (the standard sleep) hung forever. The polyfill now drains pending timers on the microtask queue in delay order against a virtual clock, so sleeps resolve and `setInterval`/`clearInterval` work.
+  - **No wall-clock bound on `run_code`.** isolated-vm's `timeout` only bounds synchronous execution; a script that returns a never-settling promise hung the whole turn indefinitely. `runtime.execute` now races the eval against a host timer that disposes the isolate, so `isolate.timeLimit` bounds total execution and returns a `TimeoutError`.
+  - **Stopping a turn mid-tool-call dropped the assistant turn from canonical history.** On cancellation the in-flight assistant message (its text + tool calls) lives only in step-local state — it's pushed to `messages` together with the tool results, which never arrive when stopped. The cancellation snapshot now re-attaches that turn with a synthesized "cancelled by user" tool result for each pending tool call, so the next request keeps a valid record instead of showing the model back-to-back user messages.
+- [`c604fd6`](https://github.com/cesr/poncho-ai/commit/c604fd6b41dfd06600af85daa892ab4fd3852bad) Thanks [@cesr](https://github.com/cesr)! - harness: harden subagent → parent result delivery so a step-exhausted subagent stops surfacing as `(no response)`.
+  - **Force a closing text turn on the final step.** On the last permitted step (`step === maxSteps`) the run loop now strips the tools and appends a one-shot "summarize now, no tools" nudge to that model request, so a run that hits its step ceiling produces a real text summary instead of terminating on a dangling tool call. Previously such a run ended on a tool-call turn with no final text — common in subagents doing many tool calls — and the parent received an empty result. `maxSteps` itself is unchanged; the nudge is request-only and never written to history.
+  - **Content-shape-robust result extraction.** Pulling a subagent's response no longer requires the last assistant message to be a plain `string`. The new `lastAssistantText` helper handles `string`, `ContentPart[]`, and the run loop's `{"text":...,"tool_calls":[...]}` envelope, and walks backwards to the last non-empty assistant text — so a transcript that ends on a text-less tool turn still yields the prose produced just before it.
+  - **Actionable empty-result sentinel.** When a subagent genuinely produced no summary, the injected parent message now says how many steps ran and points at `read_subagent(<id>, mode:"assistant")` to recover the work, instead of a dead-end `(no response)`.
 ## 0.50.3
 ### Patch Changes
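
The 0.50.4 timers fix above describes draining pending timers in delay order against a virtual clock. The sketch below illustrates that idea only; it is not the package's polyfill. `VirtualTimers`, `drain`, and `maxFires` are hypothetical names, and the drain is called explicitly here, whereas the changelog says the real polyfill schedules it on the microtask queue.

```typescript
// Hypothetical sketch of a virtual-clock timer queue (not the package's code).
type Timer = { id: number; due: number; seq: number; fn: () => void; interval?: number };

class VirtualTimers {
  private now = 0; // virtual clock, in ms; never tied to real time
  private nextId = 1;
  private seq = 0; // tie-breaker so timers due at the same time keep insertion order
  private pending: Timer[] = [];

  setTimeout(fn: () => void, ms = 0): number {
    return this.add(fn, ms);
  }

  setInterval(fn: () => void, ms: number): number {
    // Clamp to 1 ms so a zero-delay interval still advances the virtual clock.
    const period = Math.max(1, ms);
    return this.add(fn, period, period);
  }

  clear(id: number): void {
    this.pending = this.pending.filter((t) => t.id !== id);
  }

  // Fire pending timers in due-time order, jumping the clock forward instead of waiting.
  // maxFires is a safety cap for intervals that are never cleared.
  drain(maxFires = 1000): void {
    for (let fired = 0; fired < maxFires && this.pending.length > 0; fired += 1) {
      const next = this.pending.slice().sort((a, b) => a.due - b.due || a.seq - b.seq)[0];
      if (!next) break;
      this.now = Math.max(this.now, next.due);
      if (next.interval !== undefined) {
        // Re-arm before running so the callback can clear its own interval.
        next.due = this.now + next.interval;
        next.seq = this.seq++;
      } else {
        this.clear(next.id);
      }
      next.fn();
    }
  }

  private add(fn: () => void, ms: number, interval?: number): number {
    const id = this.nextId++;
    this.pending.push({ id, due: this.now + Math.max(0, ms), seq: this.seq++, fn, interval });
    return id;
  }
}

// Usage: a non-zero delay fires, and an interval can clear itself.
const timers = new VirtualTimers();
const order: string[] = [];
timers.setTimeout(() => order.push("slow"), 50);
timers.setTimeout(() => order.push("fast"), 10);
let ticks = 0;
const intervalId = timers.setInterval(() => {
  ticks += 1;
  order.push(`tick${ticks}`);
  if (ticks === 2) timers.clear(intervalId);
}, 20);
timers.drain();
console.log(order.join(",")); // fast,tick1,tick2,slow
```

Because the clock is virtual, a 50 ms sleep resolves immediately; only the relative ordering of callbacks is preserved, which is the behaviour the changelog entry describes.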

package/dist/index.d.ts CHANGED

@@ -1261,6 +1261,14 @@ interface HarnessOptions {
      * should also be browsable in the VFS. Empty by default.
      */
     systemSkillPaths?: string[];
+    /**
+     * Override the per-run hard wall-clock timeout, in seconds, taking
+     * precedence over the agent definition's `limits.timeout`. Platforms use
+     * this to give background subagents a longer budget than the foreground
+     * agent without forking the agent definition (e.g. a 1h research subagent
+     * vs. a 5-min foreground turn). `0` disables the hard timeout.
+     */
+    runTimeoutSecOverride?: number;
 }
 interface HarnessRunOutput {
     runId: string;
@@ -1280,6 +1288,7 @@ interface ArchivedToolResult {
 declare class AgentHarness {
     private readonly workingDir;
     private readonly environment;
+    private readonly runTimeoutSecOverride?;
     private modelProvider;
     private readonly modelProviderInjected;
     private readonly dispatcher;
@@ -1981,6 +1990,35 @@ declare const MAX_SUBAGENT_CALLBACK_COUNT = 20;
 declare const CALLBACK_LOCK_STALE_MS: number;
 declare const STALE_SUBAGENT_THRESHOLD_MS: number;
+/**
+ * Find the last non-empty assistant text in a subagent transcript. Walking
+ * backwards (rather than reading only the final message) means a subagent
+ * that ended on a tool-call turn still yields the prose it produced just
+ * before — instead of surfacing to the parent as an empty result.
+ */
+declare const lastAssistantText: (messages: Message[]) => string;
+/**
+ * The run loop stuffs a synthetic `[Error: ...]` placeholder into the draft /
+ * persisted assistant text when a run ends on `run:error` (e.g. a timeout).
+ * That placeholder is not real model output — strip it so we don't surface it
+ * to the parent as the subagent's "response".
+ */
+declare const realResponseText: (text: string | undefined) => string;
+/**
+ * Build the result text delivered to the parent when a subagent ended
+ * abnormally (timeout / error) with no RunResult. We never drop the work it
+ * gathered, and the parent is told it didn't finish — e.g. it may not have
+ * written its output files — plus how to recover (use what's here, send a
+ * write-only follow-up, or read the full transcript).
+ */
+declare const abnormalEndResponse: (opts: {
+    subagentId: string;
+    gathered: string;
+    runError?: {
+        code?: string;
+        message?: string;
+    };
+}) => string;
 type ActiveConversationRun = {
     ownerId: string;
     abortController: AbortController;
@@ -2143,4 +2181,4 @@ interface RunConversationTurnResult {
 }
 declare const runConversationTurn: (opts: RunConversationTurnOpts) => Promise<RunConversationTurnResult>;
-export { type ActiveConversationRun, type ActiveSubagentRun, type AgentFrontmatter, AgentHarness, type AgentIdentity, type AgentLimitsConfig, type AgentModelConfig, AgentOrchestrator, type ApprovalEventItem, type ArchivedToolResult$1 as ArchivedToolResult, type BashConfig, BashEnvironmentManager, type BashExecutionLimits, type BuiltInToolToggles, CALLBACK_LOCK_STALE_MS, type CompactMessagesOptions, type CompactResult, type CompactionConfig, type ContinuationHooks, type Conversation, type ConversationCreateInit, type ConversationState, type ConversationStatusSnapshot, type ConversationStore, type ConversationSummary, type CreateSkillToolsOptions, type CronJobConfig, DEFAULT_AGENT_DESCRIPTION, DEFAULT_AGENT_NAME, DEFAULT_MAX_STEPS, DEFAULT_MODEL_NAME, DEFAULT_MODEL_PROVIDER, DEFAULT_TEMPERATURE, DEFAULT_TIMEOUT, type DefaultAgentDefinitionOptions, type EventSink, type ExecuteTurnResult, type HarnessOptions, type HarnessRunOutput, type HistorySource, InMemoryConversationStore, InMemoryEngine, InMemoryStateStore, type IsolateBinding, type IsolateConfig, LocalMcpBridge, LocalUploadStore, MAX_CONCURRENT_SUBAGENTS, MAX_CONTINUATION_COUNT, MAX_SUBAGENT_CALLBACK_COUNT, MAX_SUBAGENT_NESTING, type MainMemory, type McpConfig, type MemoryConfig, type MemoryStore, type MessagingChannelConfig, type ModelProviderFactory, type MountProvider, type NetworkConfig, OPENAI_CODEX_CLIENT_ID, type OpenAICodexAuthConfig, type OpenAICodexDeviceAuthRequest, type OpenAICodexSession, type OrchestratorHooks, type OrchestratorOptions, type OtlpConfig, type OtlpOption, PONCHO_UPLOAD_SCHEME, type ParsedAgent, type PendingSubagentApproval, type PendingSubagentResult, type PendingToolCall, type PonchoConfig, PonchoFsAdapter, PostgresEngine, type ProviderConfig, type Recurrence, type RecurrenceType, type Reminder, type ReminderCreateInput, type ReminderStatus, type ReminderStore, type RemoteMcpServerConfig, type RunConversationTurnOpts, type RunConversationTurnResult, type RunOutcome, type 
RunRequest, type RuntimeRenderContext, S3UploadStore, STALE_SUBAGENT_THRESHOLD_MS, STORAGE_SCHEMA_VERSION, type SecretsStore, type SkillContextEntry, type SkillMetadata, type SkillSource, SqliteEngine, type StateConfig, type StateProviderName, type StateStore, type StorageConfig, type StorageEngine, type StorageFactoryOptions, type StorageProvider, type StoredApproval, type SubagentManager, type SubagentResult, type SubagentSpawnResult, type SubagentSummary, type SubagentTranscript, type SubagentTranscriptMode, TOOL_RESULT_ARCHIVE_PARAM, type TelemetryConfig, TelemetryEmitter, type TenantTokenPayload, type ToolAccess, type ToolCall, ToolDispatcher, type ToolExecutionResult, type TurnDraftState, type TurnResultMetadata, type TurnSection, type UploadStore, type UploadsConfig, VFS_SCHEME, VercelBlobUploadStore, type VfsDirEntry, type VfsStat, type VirtualMount, applyTurnMetadata, buildAgentDirectoryName, buildApprovalCheckpoints, buildAssistantMetadata, buildSkillContextWindow, buildToolCompletedText, cloneSections, compactMessages, completeOpenAICodexDeviceAuth, computeNextOccurrence, createBashTool, createConversationStore, createConversationStoreFromEngine, createDefaultTools, createDeleteDirectoryTool, createDeleteTool, createEditTool, createMemoryStore, createMemoryStoreFromEngine, createMemoryTools, createModelProvider, createReminderStore, createReminderStoreFromEngine, createReminderTools, createSearchTools, createSecretsStore, createSkillTools, createStateStore, createStorageEngine, createSubagentTools, createTodoStoreFromEngine, createTurnDraftState, createUploadStore, createWriteTool, decodeFileInputData, defaultAgentDefinition, deleteOpenAICodexSession, deriveUploadKey, ensureAgentIdentity, estimateTokens, estimateTotalTokens, executeConversationTurn, findSafeSplitPoint, flushTurnDraft, generateAgentId, getAgentStoreDirectory, getModelContextWindow, getOpenAICodexAccessToken, getOpenAICodexAuthFilePath, getOpenAICodexRequiredScopes, getPonchoStoreRoot, 
isMessageArray, jsonSchemaToZod, loadCanonicalHistory, loadPonchoConfig, loadRunHistory, loadSkillContext, loadSkillInstructions, loadSkillMetadata, loadSkillMetadataFromDirs, loadVfsSkillMetadata, mergeSkills, normalizeApprovalCheckpoint, normalizeOtlp, normalizeScriptPolicyPath, normalizeToolAccess, parseAgentFile, parseAgentMarkdown, parseSkillFrontmatter, ponchoDocsTool, readOpenAICodexSession, readSkillResource, recordStandardTurnEvent, renderAgentPrompt, resolveAgentIdentity, resolveCompactionConfig, resolveEnv, resolveMemoryConfig, resolveRunRequest, resolveSkillDirs, resolveStateConfig, runConversationTurn, slugifyStorageComponent, startOpenAICodexDeviceAuth, verifyTenantToken, withToolResultArchiveParam, writeOpenAICodexSession };
+export { type ActiveConversationRun, type ActiveSubagentRun, type AgentFrontmatter, AgentHarness, type AgentIdentity, type AgentLimitsConfig, type AgentModelConfig, AgentOrchestrator, type ApprovalEventItem, type ArchivedToolResult$1 as ArchivedToolResult, type BashConfig, BashEnvironmentManager, type BashExecutionLimits, type BuiltInToolToggles, CALLBACK_LOCK_STALE_MS, type CompactMessagesOptions, type CompactResult, type CompactionConfig, type ContinuationHooks, type Conversation, type ConversationCreateInit, type ConversationState, type ConversationStatusSnapshot, type ConversationStore, type ConversationSummary, type CreateSkillToolsOptions, type CronJobConfig, DEFAULT_AGENT_DESCRIPTION, DEFAULT_AGENT_NAME, DEFAULT_MAX_STEPS, DEFAULT_MODEL_NAME, DEFAULT_MODEL_PROVIDER, DEFAULT_TEMPERATURE, DEFAULT_TIMEOUT, type DefaultAgentDefinitionOptions, type EventSink, type ExecuteTurnResult, type HarnessOptions, type HarnessRunOutput, type HistorySource, InMemoryConversationStore, InMemoryEngine, InMemoryStateStore, type IsolateBinding, type IsolateConfig, LocalMcpBridge, LocalUploadStore, MAX_CONCURRENT_SUBAGENTS, MAX_CONTINUATION_COUNT, MAX_SUBAGENT_CALLBACK_COUNT, MAX_SUBAGENT_NESTING, type MainMemory, type McpConfig, type MemoryConfig, type MemoryStore, type MessagingChannelConfig, type ModelProviderFactory, type MountProvider, type NetworkConfig, OPENAI_CODEX_CLIENT_ID, type OpenAICodexAuthConfig, type OpenAICodexDeviceAuthRequest, type OpenAICodexSession, type OrchestratorHooks, type OrchestratorOptions, type OtlpConfig, type OtlpOption, PONCHO_UPLOAD_SCHEME, type ParsedAgent, type PendingSubagentApproval, type PendingSubagentResult, type PendingToolCall, type PonchoConfig, PonchoFsAdapter, PostgresEngine, type ProviderConfig, type Recurrence, type RecurrenceType, type Reminder, type ReminderCreateInput, type ReminderStatus, type ReminderStore, type RemoteMcpServerConfig, type RunConversationTurnOpts, type RunConversationTurnResult, type RunOutcome, type 
RunRequest, type RuntimeRenderContext, S3UploadStore, STALE_SUBAGENT_THRESHOLD_MS, STORAGE_SCHEMA_VERSION, type SecretsStore, type SkillContextEntry, type SkillMetadata, type SkillSource, SqliteEngine, type StateConfig, type StateProviderName, type StateStore, type StorageConfig, type StorageEngine, type StorageFactoryOptions, type StorageProvider, type StoredApproval, type SubagentManager, type SubagentResult, type SubagentSpawnResult, type SubagentSummary, type SubagentTranscript, type SubagentTranscriptMode, TOOL_RESULT_ARCHIVE_PARAM, type TelemetryConfig, TelemetryEmitter, type TenantTokenPayload, type ToolAccess, type ToolCall, ToolDispatcher, type ToolExecutionResult, type TurnDraftState, type TurnResultMetadata, type TurnSection, type UploadStore, type UploadsConfig, VFS_SCHEME, VercelBlobUploadStore, type VfsDirEntry, type VfsStat, type VirtualMount, abnormalEndResponse, applyTurnMetadata, buildAgentDirectoryName, buildApprovalCheckpoints, buildAssistantMetadata, buildSkillContextWindow, buildToolCompletedText, cloneSections, compactMessages, completeOpenAICodexDeviceAuth, computeNextOccurrence, createBashTool, createConversationStore, createConversationStoreFromEngine, createDefaultTools, createDeleteDirectoryTool, createDeleteTool, createEditTool, createMemoryStore, createMemoryStoreFromEngine, createMemoryTools, createModelProvider, createReminderStore, createReminderStoreFromEngine, createReminderTools, createSearchTools, createSecretsStore, createSkillTools, createStateStore, createStorageEngine, createSubagentTools, createTodoStoreFromEngine, createTurnDraftState, createUploadStore, createWriteTool, decodeFileInputData, defaultAgentDefinition, deleteOpenAICodexSession, deriveUploadKey, ensureAgentIdentity, estimateTokens, estimateTotalTokens, executeConversationTurn, findSafeSplitPoint, flushTurnDraft, generateAgentId, getAgentStoreDirectory, getModelContextWindow, getOpenAICodexAccessToken, getOpenAICodexAuthFilePath, getOpenAICodexRequiredScopes, 
getPonchoStoreRoot, isMessageArray, jsonSchemaToZod, lastAssistantText, loadCanonicalHistory, loadPonchoConfig, loadRunHistory, loadSkillContext, loadSkillInstructions, loadSkillMetadata, loadSkillMetadataFromDirs, loadVfsSkillMetadata, mergeSkills, normalizeApprovalCheckpoint, normalizeOtlp, normalizeScriptPolicyPath, normalizeToolAccess, parseAgentFile, parseAgentMarkdown, parseSkillFrontmatter, ponchoDocsTool, readOpenAICodexSession, readSkillResource, realResponseText, recordStandardTurnEvent, renderAgentPrompt, resolveAgentIdentity, resolveCompactionConfig, resolveEnv, resolveMemoryConfig, resolveRunRequest, resolveSkillDirs, resolveStateConfig, runConversationTurn, slugifyStorageComponent, startOpenAICodexDeviceAuth, verifyTenantToken, withToolResultArchiveParam, writeOpenAICodexSession };
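
The `runTimeoutSecOverride` option declared above takes precedence over the agent definition's `limits.timeout`. A minimal sketch of that precedence follows, based on the one-line expression in the `dist/index.js` hunk of this diff. `resolveTimeoutMs`, `TimeoutInputs`, and `DEFAULT_TIMEOUT_SEC` are hypothetical names introduced for illustration; the package computes this inline rather than through a helper.

```typescript
// Hypothetical helper mirroring the timeout precedence shown in this diff.
interface TimeoutInputs {
  environment: string; // e.g. "development" or "production"
  runTimeoutSecOverride?: number; // HarnessOptions override, in seconds
  agentTimeoutSec?: number; // agent definition's limits.timeout, in seconds
}

const DEFAULT_TIMEOUT_SEC = 300; // fallback used when nothing is configured

function resolveTimeoutMs(inputs: TimeoutInputs): number {
  // The override wins whenever it is set, including an explicit 0.
  const configured = inputs.runTimeoutSecOverride ?? inputs.agentTimeoutSec;
  // In development, an unconfigured timeout resolves to 0.
  if (inputs.environment === "development" && configured == null) return 0;
  return (configured ?? DEFAULT_TIMEOUT_SEC) * 1000;
}

// Usage: a background subagent gets 1h while the agent definition says 5 min.
const background = resolveTimeoutMs({ environment: "production", runTimeoutSecOverride: 3600, agentTimeoutSec: 300 });
const foreground = resolveTimeoutMs({ environment: "production" });
const devDefault = resolveTimeoutMs({ environment: "development" });
const disabled = resolveTimeoutMs({ environment: "production", runTimeoutSecOverride: 0, agentTimeoutSec: 300 });
console.log(background, foreground, devDefault, disabled); // 3600000 300000 0 0
```

Using `??` rather than `||` is what lets an explicit `0` survive as the resolved value, matching the documented "`0` disables the hard timeout" behaviour.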

package/dist/index.js CHANGED

@@ -8626,6 +8626,7 @@ var now = () => Date.now();
 var FIRST_CHUNK_TIMEOUT_MS = 9e4;
 var MAX_TRANSIENT_STEP_RETRIES = 1;
 var COMPACTION_CHECK_INTERVAL_STEPS = 3;
+var FINAL_STEP_SUMMARY_PROMPT = "You have reached the maximum number of steps for this run and cannot call any more tools. Do NOT attempt any tool calls. Using only the work you have already done, write your final response now: summarize what you found or accomplished, include any concrete results, and flag anything left unfinished.";
 var TOOL_RESULT_ARCHIVE_PARAM = "__toolResultArchive";
 var TOOL_RESULT_TRUNCATED_PREFIX = "[TRUNCATED_TOOL_RESULT]";
 var TOOL_RESULT_PREVIEW_CHARS = 700;
@@ -9169,6 +9170,7 @@ function extractMediaFromToolOutput(output) {
 var AgentHarness = class _AgentHarness {
   workingDir;
   environment;
+  runTimeoutSecOverride;
   modelProvider;
   modelProviderInjected;
   dispatcher = new ToolDispatcher();
@@ -9374,6 +9376,7 @@ var AgentHarness = class _AgentHarness {
   constructor(options = {}) {
     this.workingDir = options.workingDir ?? process.cwd();
     this.environment = options.environment ?? "development";
+    this.runTimeoutSecOverride = options.runTimeoutSecOverride;
     this.modelProviderInjected = !!options.modelProvider;
     this.modelProvider = options.modelProvider ?? createModelProvider("anthropic");
     this.uploadStore = options.uploadStore;
@@ -9951,7 +9954,7 @@ var AgentHarness = class _AgentHarness {
     this.registerIfMissing(createEditFileTool(getFs));
     this.registerIfMissing(createWriteFileTool(getFs));
     if (config?.isolate) {
-      const { createRunCodeTool, buildRunCodeDescription, bundleLibraries } = await import("./isolate-BNQ6P3HI.js");
+      const { createRunCodeTool, buildRunCodeDescription, bundleLibraries } = await import("./isolate-F2PPSUL6.js");
       let libraryPreamble = null;
       if (config.isolate.libraries?.length) {
         libraryPreamble = await bundleLibraries(config.isolate.libraries, this.workingDir);
@@ -10236,7 +10239,7 @@ var AgentHarness = class _AgentHarness {
     const runId = `run_${randomUUID5()}`;
     const start = now();
     const maxSteps = agent.frontmatter.limits?.maxSteps ?? 20;
-    const configuredTimeout = agent.frontmatter.limits?.timeout;
+    const configuredTimeout = this.runTimeoutSecOverride ?? agent.frontmatter.limits?.timeout;
     const timeoutMs = this.environment === "development" && configuredTimeout == null ? 0 : (configuredTimeout ?? 300) * 1e3;
     const platformMaxDurationSec = Number(process.env.PONCHO_MAX_DURATION) || 0;
     const softDeadlineMs = input.disableSoftDeadline || platformMaxDurationSec <= 0 ? 0 : platformMaxDurationSec * 800;
@@ -10327,7 +10330,7 @@ Examples:${this.environment !== "production" ? `
 Files in the VFS are accessible to the user via \`/api/vfs/{path}\`. For example, a file at \`/downloads/report.pdf\` can be linked as \`/api/vfs/downloads/report.pdf\`. Use this to share downloadable files with the user.` : "";
     let isolateContext = "";
     if (this.loadedConfig?.isolate && this.dispatcher.get("run_code")) {
-      const { generateIsolateTypeStubs } = await import("./isolate-BNQ6P3HI.js");
+      const { generateIsolateTypeStubs } = await import("./isolate-F2PPSUL6.js");
       const typeStubs = generateIsolateTypeStubs(this.loadedConfig.isolate);
       isolateContext = `
@@ -10374,10 +10377,40 @@ ${this.skillFingerprint}`;
     };
     const isCancelled = () => input.abortSignal?.aborted === true;
     let cancellationEmitted = false;
+    let inflightTurn = null;
     const emitCancellation = () => {
       cancellationEmitted = true;
-      const snapshot = trimToValidPrefix([...messages]);
-      return pushEvent({ type: "run:cancelled", runId, messages: snapshot });
+      const snapshot = [...messages];
+      if (inflightTurn && (inflightTurn.text.length > 0 || inflightTurn.toolCalls.length > 0)) {
+        const hasToolCalls = inflightTurn.toolCalls.length > 0;
+        const assistantContent = hasToolCalls ? JSON.stringify({
+          text: inflightTurn.text,
+          tool_calls: inflightTurn.toolCalls.map((tc) => ({
+            id: tc.id,
+            name: tc.name,
+            input: tc.input
+          }))
+        }) : inflightTurn.text;
+        snapshot.push({
+          role: "assistant",
+          content: assistantContent,
+          metadata: { timestamp: now(), id: randomUUID5(), runId }
+        });
+        if (hasToolCalls) {
+          const cancelledResults = inflightTurn.toolCalls.map((tc) => ({
+            type: "tool_result",
+            tool_use_id: tc.id,
+            tool_name: tc.name,
+            content: "Tool execution cancelled by user."
+          }));
+          snapshot.push({
+            role: "tool",
+            content: JSON.stringify(cancelledResults),
+            metadata: { timestamp: now(), id: randomUUID5(), runId }
+          });
+        }
+      }
+      return pushEvent({ type: "run:cancelled", runId, messages: trimToValidPrefix(snapshot) });
     };
     const resolvedModelName = agent.frontmatter.model?.name ?? "claude-opus-4-5";
     const contextWindow = agent.frontmatter.model?.contextWindow ?? getModelContextWindow(resolvedModelName);
@@ -10460,6 +10493,7 @@ ${this.skillFingerprint}`;
       let cachedCoreMessages = [];
       let convertedUpTo = 0;
       for (let step = 1; step <= maxSteps; step += 1) {
+        inflightTurn = null;
         try {
           yield* drainBrowserEvents();
           if (isCancelled()) {
@@ -10817,11 +10851,14 @@ ${textContent}` };
             ...cachedMessages
           ] : cachedMessages;
           const telemetryEnabled = this.loadedConfig?.telemetry?.enabled !== false;
+          const isFinalStep = step === maxSteps;
+          const toolsForStep = isFinalStep ? {} : tools;
+          const messagesForStep = isFinalStep ? [...finalMessages, { role: "user", content: FINAL_STEP_SUMMARY_PROMPT }] : finalMessages;
           const result = await streamText({
             model: modelInstance,
             ...useStaticCache ? {} : { system: systemPrompt },
-            messages: finalMessages,
-            tools,
+            messages: messagesForStep,
+            tools: toolsForStep,
             temperature,
             abortSignal: input.abortSignal,
             ...typeof maxTokens === "number" ? { maxTokens } : {},
@@ -10950,6 +10987,7 @@ ${textContent}` };
             yield pushEvent({ type: "run:completed", runId, result: result_ });
             return;
           }
+          inflightTurn = { text: fullText, toolCalls: [] };
           if (isCancelled()) {
             yield emitCancellation();
             return;
@@ -11036,6 +11074,7 @@ ${textContent}` };
             name: tc.toolName,
             input: tc.input
           }));
+          if (inflightTurn) inflightTurn.toolCalls = toolCalls;
           if (toolCalls.length === 0) {
             if (fullText.length === 0) {
               const isExpectedEmpty = finishReason === "stop";
@@ -11416,6 +11455,7 @@ ${textContent}` };
             content: JSON.stringify(toolResultsForModel),
             metadata: toolMsgMeta
           });
+          inflightTurn = null;
           if (softDeadlineMs > 0 && now() - start > softDeadlineMs) {
             const result_ = {
               status: "completed",
@@ -12282,6 +12322,38 @@ var CALLBACK_LOCK_STALE_MS = 5 * 60 * 1e3;
 var STALE_SUBAGENT_THRESHOLD_MS = 5 * 60 * 1e3;
 // src/orchestrator/orchestrator.ts
+import { getTextContent as getTextContent3 } from "@poncho-ai/sdk";
+var assistantMessageText = (message) => {
+  const raw = getTextContent3(message).trim();
+  if (raw.startsWith("{") && raw.includes('"tool_calls"')) {
+    try {
+      const parsed = JSON.parse(raw);
+      if (typeof parsed.text === "string") return parsed.text.trim();
+    } catch {
+    }
+  }
+  return raw;
+};
+var lastAssistantText = (messages) => {
+  for (let i = messages.length - 1; i >= 0; i -= 1) {
+    if (messages[i].role !== "assistant") continue;
+    const text = assistantMessageText(messages[i]);
+    if (text) return text;
+  }
+  return "";
+};
+var realResponseText = (text) => {
+  const t = (text ?? "").trim();
+  return t.startsWith("[Error:") ? "" : t;
+};
+var abnormalEndResponse = (opts) => {
+  const timedOut = opts.runError?.code === "TIMEOUT";
+  const head = timedOut ? "[Subagent hit its time limit before finishing \u2014 it may not have written its output files.]" : `[Subagent ended before finishing${opts.runError?.message ? `: ${opts.runError.message}` : ""}.]`;
+  const recover = opts.gathered ? "Partial work it gathered is below \u2014 write the files yourself from it, or send a tight write-only follow-up with message_subagent." : `Use read_subagent("${opts.subagentId}", mode:"full") to recover what it gathered.`;
+  return opts.gathered ? `${head} ${recover}
+${opts.gathered}` : `${head} ${recover}`;
+};
 var AgentOrchestrator = class {
   harness;
   conversationStore;
@@ -12807,6 +12879,7 @@ var AgentOrchestrator = class {
     const draft = createTurnDraftState();
     let latestRunId = "";
     let runResult;
+    let runError;
     try {
       const conversation = await this.conversationStore.getWithArchive(childConversationId);
       if (!conversation) throw new Error("Subagent conversation not found");
@@ -12943,6 +13016,7 @@ var AgentOrchestrator = class {
           }
         }
         if (event.type === "run:error") {
+          runError = { code: event.error.code, message: event.error.message };
           draft.assistantResponse = draft.assistantResponse || `[Error: ${event.error.message}]`;
         }
         await this.eventSink(childConversationId, event);
@@ -12990,7 +13064,12 @@ var AgentOrchestrator = class {
           }
           return;
         }
-        conv.subagentMeta = { ...conv.subagentMeta, status: "completed" };
+        const abnormalEnd = !runResult;
+        conv.subagentMeta = {
+          ...conv.subagentMeta,
+          status: abnormalEnd ? "error" : "completed",
+          ...abnormalEnd ? { error: { code: runError?.code ?? "SUBAGENT_INCOMPLETE", message: runError?.message ?? "subagent ended without a result" } } : {}
+        };
         await this.conversationStore.update(conv);
       }
       this.hooks?.onStreamEnd?.(childConversationId);
@@ -12999,21 +13078,25 @@ var AgentOrchestrator = class {
         subagentId: childConversationId,
         conversationId: childConversationId
       });
-      let subagentResponse = runResult?.response ?? draft.assistantResponse;
-      if (!subagentResponse) {
+      let gathered = realResponseText(runResult?.response) || realResponseText(draft.assistantResponse);
+      if (!gathered) {
         const freshSubConv = await this.conversationStore.get(childConversationId);
-        if (freshSubConv) {
-          const lastAssistant = [...freshSubConv.messages].reverse().find((m) => m.role === "assistant");
-          if (lastAssistant && typeof lastAssistant.content === "string") {
-            subagentResponse = lastAssistant.content;
-          }
-        }
+        if (freshSubConv) gathered = realResponseText(lastAssistantText(freshSubConv.messages));
       }
+      const abnormal = !runResult;
+      const subagentResponse = abnormal ? abnormalEndResponse({ subagentId: childConversationId, gathered, runError }) : gathered;
       const pendingResult = {
         subagentId: childConversationId,
         task,
-        status: "completed",
-        result: runResult ? { status: runResult.status, response: subagentResponse, steps: runResult.steps, tokens: { input: 0, output: 0, cached: 0 }, duration: runResult.duration } : void 0,
+        status: abnormal ? "error" : "completed",
+        result: {
+          status: runResult?.status ?? "error",
+          response: subagentResponse,
+          steps: runResult?.steps ?? 0,
+          tokens: { input: 0, output: 0, cached: 0 },
+          duration: runResult?.duration ?? 0
+        },
+        ...abnormal ? { error: { code: runError?.code ?? "SUBAGENT_INCOMPLETE", message: runError?.message ?? "subagent ended without a result" } } : {},
         timestamp: Date.now()
       };
       await this.conversationStore.appendSubagentResult(parentConversationId, pendingResult);
@@ -13095,8 +13178,10 @@ var AgentOrchestrator = class {
     const callbackCount = (conversation.subagentCallbackCount ?? 0) + 1;
     conversation.subagentCallbackCount = callbackCount;
     for (const pr of pendingResults) {
+      const responseText = (pr.result?.response ?? "").trim();
+      const responseLine = responseText || `(subagent produced no final summary after ${pr.result?.steps ?? 0} step(s); its work may be incomplete. Call read_subagent with subagent_id "${pr.subagentId}" and mode "assistant" to retrieve what it did.)`;
       const resultBody = pr.result ? `Status: ${pr.result.status}
-Response: ${pr.result.response ?? "(no response)"}
+Response: ${responseLine}
 Steps: ${pr.result.steps}, Duration: ${pr.result.duration}ms` : pr.error ? `Error: ${pr.error.message}` : "(no result)";
       conversation.messages.push({
         role: "user",
@@ -13259,6 +13344,7 @@ ${resultBody}`,
     this.activeSubagentRuns.set(conversationId, { abortController: childAbortController, harness: childHarness, parentConversationId });
     const draft = createTurnDraftState();
     let runResult;
+    let runError;
     try {
       const recallParams = this.hooks?.buildRecallParams?.({ ownerId, tenantId: conversation.tenantId, excludeConversationId: conversationId }) ?? {};
       for await (const event of childHarness.runWithTelemetry({
@@ -13291,6 +13377,7 @@ ${resultBody}`,
           }
         }
         if (event.type === "run:error") {
+          runError = { code: event.error.code, message: event.error.message };
           draft.assistantResponse = draft.assistantResponse || `[Error: ${event.error.message}]`;
         }
         await this.eventSink(conversationId, event);
@@ -13339,7 +13426,12 @@ ${resultBody}`,
           }
           return;
         }
-        conv.subagentMeta = { ...conv.subagentMeta, status: "completed" };
+        const abnormalEnd = !runResult;
+        conv.subagentMeta = {
+          ...conv.subagentMeta,
+          status: abnormalEnd ? "error" : "completed",
+          ...abnormalEnd ? { error: { code: runError?.code ?? "SUBAGENT_INCOMPLETE", message: runError?.message ?? "subagent ended without a result" } } : {}
+        };
         await this.conversationStore.update(conv);
       }
       this.activeSubagentRuns.delete(conversationId);
@@ -13348,23 +13440,21 @@ ${resultBody}`,
         subagentId: conversationId,
         conversationId
       });
-      let subagentResponse = runResult?.response ?? draft.assistantResponse;
-      if (!subagentResponse) {
+      let gathered = realResponseText(runResult?.response) || realResponseText(draft.assistantResponse);
+      if (!gathered) {
         const freshSubConv = await this.conversationStore.get(conversationId);
-        if (freshSubConv) {
-          const lastAssistant = [...freshSubConv.messages].reverse().find((m) => m.role === "assistant");
-          if (lastAssistant) {
-            subagentResponse = typeof lastAssistant.content === "string" ? lastAssistant.content : "";
-          }
-        }
+        if (freshSubConv) gathered = realResponseText(lastAssistantText(freshSubConv.messages));
       }
+      const abnormal = !runResult;
+      const subagentResponse = abnormal ? abnormalEndResponse({ subagentId: conversationId, gathered, runError }) : gathered;
       const parentConv = await this.conversationStore.get(parentConversationId);
       if (parentConv) {
         const result = {
           subagentId: conversationId,
           task,
-          status: "completed",
-          result: { status: "completed", response: subagentResponse, steps: runResult?.steps ?? 0, tokens: { input: 0, output: 0, cached: 0 }, duration: runResult?.duration ?? 0 },
+          status: abnormal ? "error" : "completed",
+          result: { status: runResult?.status ?? "error", response: subagentResponse, steps: runResult?.steps ?? 0, tokens: { input: 0, output: 0, cached: 0 }, duration: runResult?.duration ?? 0 },
+          ...abnormal ? { error: { code: runError?.code ?? "SUBAGENT_INCOMPLETE", message: runError?.message ?? "subagent ended without a result" } } : {},
           timestamp: Date.now()
         };
         await this.conversationStore.appendSubagentResult(parentConversationId, result);
@@ -13994,6 +14084,7 @@ export {
   ToolDispatcher,
   VFS_SCHEME,
   VercelBlobUploadStore,
+  abnormalEndResponse,
   applyTurnMetadata,
   buildAgentDirectoryName,
   buildApprovalCheckpoints,
@@ -14048,6 +14139,7 @@ export {
   getPonchoStoreRoot,
   isMessageArray,
   jsonSchemaToZod,
+  lastAssistantText,
   loadCanonicalHistory,
   loadPonchoConfig,
   loadRunHistory,
@@ -14067,6 +14159,7 @@ export {
   ponchoDocsTool,
   readOpenAICodexSession,
   readSkillResource,
+  realResponseText,
   recordStandardTurnEvent,
   renderAgentPrompt,
   resolveAgentIdentity,
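
Note on the subagent result hunks above: both call sites now build the same record, whether the run produced a `runResult` or not. The sketch below restates that object literal as a standalone function so the two outcomes can be compared side by side. `buildPendingResult` is a hypothetical name used only for illustration and is not part of the package. The `response` argument stands in for whatever `abnormalEndResponse` or `realResponseText` would return, since their bodies are not shown in these hunks.

```javascript
// Hypothetical helper: mirrors the object literal from the diff, nothing more.
function buildPendingResult({ subagentId, task, runResult, runError, response }) {
  // A missing runResult is what the diff treats as an abnormal end.
  const abnormal = !runResult;
  return {
    subagentId,
    task,
    status: abnormal ? "error" : "completed",
    // `result` is always present now, with zeroed counters when the run died.
    result: {
      status: runResult?.status ?? "error",
      response,
      steps: runResult?.steps ?? 0,
      tokens: { input: 0, output: 0, cached: 0 },
      duration: runResult?.duration ?? 0,
    },
    // `error` is attached only on the abnormal path.
    ...(abnormal
      ? {
          error: {
            code: runError?.code ?? "SUBAGENT_INCOMPLETE",
            message: runError?.message ?? "subagent ended without a result",
          },
        }
      : {}),
    timestamp: Date.now(),
  };
}

// Illustrative inputs only; the ids and numbers are made up.
const ok = buildPendingResult({
  subagentId: "sub-1",
  task: "demo",
  runResult: { status: "completed", steps: 3, duration: 120 },
  response: "done",
});
const failed = buildPendingResult({
  subagentId: "sub-2",
  task: "demo",
  runResult: undefined,
  runError: undefined,
  response: "",
});
```

Under these assumptions, a consumer can branch on the top-level `status` and still read `result.steps` safely in both cases.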

package/dist/{isolate-BNQ6P3HI.js → isolate-F2PPSUL6.js} RENAMED

@@ -89,6 +89,8 @@ function createIsolateRuntime(config) {
       }
       const t0 = performance.now();
       let context;
+      let timedOut = false;
+      let wallTimer;
       try {
         context = await isolate.createContext();
         const jail = context.global;
@@ -121,12 +123,29 @@ function createIsolateRuntime(config) {
         const wrapped = `(async () => {
 ${code}
 })()`;
-        const rawResult = await context.eval(wrapped, {
+        const evalPromise = context.eval(wrapped, {
           filename: "<user-code>",
           promise: true,
           copy: true,
           timeout: config.timeout
         });
+        const rawResult = config.timeout > 0 ? await Promise.race([
+          evalPromise,
+          new Promise((_resolve, reject) => {
+            wallTimer = setTimeout(() => {
+              timedOut = true;
+              try {
+                isolate.dispose();
+              } catch {
+              }
+              reject(new Error("Execution timed out"));
+            }, config.timeout);
+          })
+        ]) : await evalPromise;
+        if (wallTimer) {
+          clearTimeout(wallTimer);
+          wallTimer = void 0;
+        }
         const stdout = await context.eval("__stdout.join('\\n')", { copy: true });
         const stderr = await context.eval("__stderr.join('\\n')", { copy: true });
         let result;
@@ -151,6 +170,17 @@ ${code}
             executionTimeMs: elapsed
           };
         }
+        if (timedOut) {
+          return {
+            stdout: "",
+            stderr: "",
+            error: {
+              message: `Execution timed out after ${config.timeout}ms`,
+              name: "TimeoutError"
+            },
+            executionTimeMs: elapsed
+          };
+        }
         let stdout = "";
         let stderr = "";
         if (context) {
@@ -169,6 +199,7 @@ ${code}
           executionTimeMs: elapsed
         };
       } finally {
+        if (wallTimer) clearTimeout(wallTimer);
         if (abortHandler && signal) {
           signal.removeEventListener("abort", abortHandler);
         }
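
Note on the timeout hunks above: the change races the eval promise against a host-side timer, so a run that never settles is still rejected after `config.timeout` milliseconds. A minimal sketch of the same pattern, outside isolated-vm, might look like the following. `raceWithWallClock` and its `onTimeout` callback are hypothetical names; in the diff the callback's role is played by `isolate.dispose()`.

```javascript
// Hypothetical helper illustrating the Promise.race wall-clock guard.
function raceWithWallClock(work, timeoutMs, onTimeout) {
  // Like the diff, skip the guard entirely when no positive timeout is set.
  if (!(timeoutMs > 0)) return work;
  let timer;
  const deadline = new Promise((_resolve, reject) => {
    timer = setTimeout(() => {
      // Best-effort cleanup; failures here must not mask the timeout itself.
      try {
        onTimeout?.();
      } catch {
        // ignored on purpose
      }
      reject(new Error("Execution timed out"));
    }, timeoutMs);
  });
  // Clear the timer on every exit path so it cannot fire after the work settles.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}

// Usage sketch: a promise that never settles is rejected by the deadline.
// raceWithWallClock(new Promise(() => {}), 100, () => console.log("cleanup"))
//   .catch((err) => console.error(err.message));
```

The sketch clears the timer in a single `finally`, whereas the diff clears it both after the race and in the outer `finally` block; the intent is the same.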
@@ -927,50 +958,79 @@ var POLYFILL_FETCH_STUB = `
 `;
 var POLYFILL_TIMERS = `
 // --- Timers polyfill ---
+//
+// The isolate has no host event loop, so real wall-clock delays can't be
+// honoured. What we *can* do is drain pending timers on the microtask queue
+// (which isolated-vm does pump while resolving the run's promise), firing
+// them in order of their requested delay against a virtual clock. This makes
+// the overwhelmingly common pattern \u2014 \`await new Promise(r => setTimeout(r, n))\`
+// as a sleep \u2014 actually resolve instead of hanging the whole run forever.
+// Delays collapse to "as soon as possible, in delay order"; that's the right
+// trade for a sandbox with no real time. A runaway setInterval is bounded by
+// __MAX_FIRES here and, ultimately, by the host-side wall-clock timeout.
 (function() {
   let __timerId = 0;
-  const __timers = new Map();
+  const __timers = new Map();   // id -> { fn, due, type }
+  const __intervals = new Set(); // ids that should reschedule
+  let __vclock = 0;             // virtual clock (ms)
+  let __draining = false;
+  let __fired = 0;
+  const __MAX_FIRES = 1000000;  // backstop against a runaway interval
+  function __schedule(fn, delayMs, type, id) {
+    __timers.set(id, { fn, due: __vclock + delayMs, type });
+    if (!__draining) __drain();
+    return id;
+  }
+  function __drain() {
+    __draining = true;
+    const step = function() {
+      if (__timers.size === 0) { __draining = false; return; }
+      // Pick the earliest-due timer (ties broken by insertion id for FIFO).
+      let pick = null;
+      for (const [id, t] of __timers) {
+        if (pick === null || t.due < pick.t.due || (t.due === pick.t.due && id < pick.id)) {
+          pick = { id, t };
+        }
+      }
+      __timers.delete(pick.id);
+      if (pick.t.due > __vclock) __vclock = pick.t.due;
+      __fired++;
+      try { pick.t.fn(); } catch (e) { /* host timers swallow callback throws */ }
+      if (__fired > __MAX_FIRES) { __draining = false; return; }
+      Promise.resolve().then(step);
+    };
+    Promise.resolve().then(step);
+  }
   globalThis.setTimeout = function(fn, delay) {
     const id = ++__timerId;
     const ms = Math.max(0, Number(delay) || 0);
-    const start = Date.now();
-    __timers.set(id, { fn, ms, start, type: "timeout" });
-    // In the isolate, setTimeout returns the id but the callback is
-    // executed via a polling mechanism in the async wrapper.
-    // For simple cases (delay=0), we can use a microtask.
-    if (ms === 0) {
-      Promise.resolve().then(() => {
-        if (__timers.has(id)) {
-          __timers.delete(id);
-          fn();
-        }
-      });
-    }
-    return id;
+    return __schedule(typeof fn === "function" ? fn : function() {}, ms, "timeout", id);
   };
   globalThis.clearTimeout = function(id) {
     __timers.delete(id);
+    __intervals.delete(id);
   };
   globalThis.setInterval = function(fn, delay) {
     const id = ++__timerId;
     const ms = Math.max(1, Number(delay) || 1);
-    const wrapper = () => {
-      if (!__timers.has(id)) return;
-      fn();
-      if (__timers.has(id)) {
-        globalThis.setTimeout(wrapper, ms);
+    __intervals.add(id);
+    const tick = function() {
+      if (!__intervals.has(id)) return;
+      try { fn(); } finally {
+        if (__intervals.has(id)) __schedule(tick, ms, "interval", id);
       }
     };
-    __timers.set(id, { fn: wrapper, ms, type: "interval" });
-    globalThis.setTimeout(wrapper, ms);
-    return id;
+    return __schedule(tick, ms, "interval", id);
   };
   globalThis.clearInterval = function(id) {
     __timers.delete(id);
+    __intervals.delete(id);
   };
   // queueMicrotask if not available
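
Note on the timers hunk above: the new polyfill fires callbacks in order of requested delay against a virtual clock, draining one timer per microtask. The reduced sketch below shows only that ordering behaviour for `setTimeout` and `clearTimeout`; it leaves out intervals and the `__MAX_FIRES` backstop. `createVirtualTimers` and `now` are hypothetical names introduced for illustration.

```javascript
// Hypothetical, reduced model of the virtual-clock drain from the diff.
function createVirtualTimers() {
  let nextId = 0;
  let vclock = 0; // virtual clock in ms; never tied to real time
  let draining = false;
  const timers = new Map(); // id -> { fn, due }

  function drain() {
    draining = true;
    const step = () => {
      if (timers.size === 0) {
        draining = false;
        return;
      }
      // Pick the earliest-due timer; ties go to the lower id (FIFO).
      let pick = null;
      for (const [id, t] of timers) {
        if (pick === null || t.due < pick.t.due || (t.due === pick.t.due && id < pick.id)) {
          pick = { id, t };
        }
      }
      timers.delete(pick.id);
      if (pick.t.due > vclock) vclock = pick.t.due;
      try {
        pick.t.fn();
      } catch {
        // callback errors are swallowed, as in the diff
      }
      // One timer per microtask, so promise continuations can interleave.
      Promise.resolve().then(step);
    };
    Promise.resolve().then(step);
  }

  return {
    setTimeout(fn, delay) {
      const id = ++nextId;
      const ms = Math.max(0, Number(delay) || 0);
      timers.set(id, { fn, due: vclock + ms });
      if (!draining) drain();
      return id;
    },
    clearTimeout(id) {
      timers.delete(id);
    },
    now: () => vclock,
  };
}

// Timers are registered out of order; one is cancelled before the drain runs.
const vt = createVirtualTimers();
const order = [];
vt.setTimeout(() => order.push("slow"), 500);
vt.setTimeout(() => order.push("fast"), 10);
const cancelled = vt.setTimeout(() => order.push("never"), 100);
vt.clearTimeout(cancelled);
// The common "sleep" pattern resolves once the earlier timers have fired.
const done = new Promise((resolve) => vt.setTimeout(resolve, 1000));
```

No real time passes in this model: the 1000 ms sleep resolves after a handful of microtasks, which is the trade-off the comment block in the hunk describes.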

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@poncho-ai/harness",
-  "version": "0.50.3",
+  "version": "0.50.5",
   "description": "Agent execution runtime - conversation loop, tool dispatch, streaming",
   "repository": {
     "type": "git",