@mastra/mcp-docs-server 1.1.29-alpha.8 → 1.1.29

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
1
1
  # ![Fireworks AI logo](https://models.dev/logos/fireworks-ai.svg)Fireworks AI
2
2
 
3
- Access 18 Fireworks AI models through Mastra's model router. Authentication is handled automatically using the `FIREWORKS_API_KEY` environment variable.
3
+ Access 19 Fireworks AI models through Mastra's model router. Authentication is handled automatically using the `FIREWORKS_API_KEY` environment variable.
4
4
 
5
5
  Learn more in the [Fireworks AI documentation](https://fireworks.ai/docs/).
6
6
 
@@ -36,6 +36,7 @@ for await (const chunk of stream) {
36
36
  | --------------------------------------------------------- | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
37
37
  | `fireworks-ai/accounts/fireworks/models/deepseek-v3p1` | 164K | | | | | | $0.56 | $2 |
38
38
  | `fireworks-ai/accounts/fireworks/models/deepseek-v3p2` | 160K | | | | | | $0.56 | $2 |
39
+ | `fireworks-ai/accounts/fireworks/models/deepseek-v4-pro` | 1.0M | | | | | | $2 | $3 |
39
40
  | `fireworks-ai/accounts/fireworks/models/glm-4p5` | 131K | | | | | | $0.55 | $2 |
40
41
  | `fireworks-ai/accounts/fireworks/models/glm-4p5-air` | 131K | | | | | | $0.22 | $0.88 |
41
42
  | `fireworks-ai/accounts/fireworks/models/glm-4p7` | 198K | | | | | | $0.60 | $2 |
@@ -1,6 +1,6 @@
1
1
  # ![Kilo Gateway logo](https://models.dev/logos/kilo.svg)Kilo Gateway
2
2
 
3
- Access 335 Kilo Gateway models through Mastra's model router. Authentication is handled automatically using the `KILO_API_KEY` environment variable.
3
+ Access 337 Kilo Gateway models through Mastra's model router. Authentication is handled automatically using the `KILO_API_KEY` environment variable.
4
4
 
5
5
  Learn more in the [Kilo Gateway documentation](https://kilo.ai).
6
6
 
@@ -357,6 +357,8 @@ for await (const chunk of stream) {
357
357
  | `kilo/xiaomi/mimo-v2-flash` | 262K | | | | | | $0.09 | $0.29 |
358
358
  | `kilo/xiaomi/mimo-v2-omni` | 262K | | | | | | $0.40 | $2 |
359
359
  | `kilo/xiaomi/mimo-v2-pro` | 1.0M | | | | | | $1 | $3 |
360
+ | `kilo/xiaomi/mimo-v2.5` | 1.0M | | | | | | $0.40 | $2 |
361
+ | `kilo/xiaomi/mimo-v2.5-pro` | 1.0M | | | | | | $1 | $3 |
360
362
  | `kilo/z-ai/glm-4-32b` | 128K | | | | | | $0.10 | $0.10 |
361
363
  | `kilo/z-ai/glm-4.5` | 131K | | | | | | $0.60 | $2 |
362
364
  | `kilo/z-ai/glm-4.5-air` | 131K | | | | | | $0.13 | $0.85 |
@@ -151,6 +151,30 @@ for await (const part of uiMessageStream) {
151
151
  }
152
152
  ```
153
153
 
154
+ ### `streamUntilIdle()`
155
+
156
+ Stream a response and keep the stream open until every [background task](https://mastra.ai/docs/agents/background-tasks) dispatched during the run completes. The server re-enters the agentic loop on each task completion so the LLM can react to results in the same call. Requires background tasks to be [enabled on the Mastra instance](https://mastra.ai/reference/configuration) and a memory thread; otherwise the call falls through to a plain `stream()`.
157
+
158
+ ```typescript
159
+ const response = await agent.streamUntilIdle('Research solana for me', {
160
+ memory: {
161
+ thread: 'thread-1',
162
+ resource: 'resource-1',
163
+ },
164
+ maxIdleMs: 5 * 60_000,
165
+ })
166
+
167
+ response.processDataStream({
168
+ onChunk: async chunk => {
169
+ if (chunk.type === 'background-task-completed') {
170
+ console.log('task complete:', chunk.payload.taskId)
171
+ }
172
+ },
173
+ })
174
+ ```
175
+
176
+ The stream emits the same chunk types as `stream()`, plus `background-task-*` chunks for task lifecycle events. Visit [`Agent.streamUntilIdle()`](https://mastra.ai/reference/streaming/agents/streamUntilIdle) for the full server-side API and [background task chunks](https://mastra.ai/reference/streaming/ChunkType) for the payload shapes.
177
+
154
178
  ### `getTool()`
155
179
 
156
180
  Retrieve information about a specific tool available to the agent:
@@ -36,6 +36,69 @@ export const mastra = new Mastra({
36
36
  })
37
37
  ```
38
38
 
39
+ ### backgroundTasks
40
+
41
+ **Type:** `BackgroundTaskManagerConfig`
42
+
43
+ Enables and configures the background task manager. When enabled, agents can dispatch long-running tool calls (including subagent invocations) to run asynchronously while the agentic loop continues. Tasks are persisted, so a configured `storage` backend is required.
44
+
45
+ Visit the [Background tasks documentation](https://mastra.ai/docs/agents/background-tasks) to learn more.
46
+
47
+ ```typescript
48
+ import { Mastra } from '@mastra/core'
49
+ import { LibSQLStore } from '@mastra/libsql'
50
+
51
+ export const mastra = new Mastra({
52
+ storage: new LibSQLStore({
53
+ id: 'mastra-storage',
54
+ url: 'file:./mastra.db',
55
+ }),
56
+ backgroundTasks: {
57
+ enabled: true,
58
+ globalConcurrency: 10,
59
+ perAgentConcurrency: 5,
60
+ backpressure: 'queue',
61
+ defaultTimeoutMs: 300_000,
62
+ },
63
+ })
64
+ ```
65
+
66
+ **enabled** (`boolean`): Whether background tasks are enabled. The manager only initializes when this is true and a storage backend is configured. (Default: `false`)
67
+
68
+ **globalConcurrency** (`number`): Maximum number of background tasks running concurrently across all agents. (Default: `10`)
69
+
70
+ **perAgentConcurrency** (`number`): Maximum number of background tasks running concurrently for a single agent. (Default: `5`)
71
+
72
+ **backpressure** (`'queue' | 'reject' | 'fallback-sync'`): Behavior when a concurrency limit is reached. 'queue' waits for a slot, 'reject' throws on enqueue, 'fallback-sync' runs the tool synchronously in the agentic loop instead. (Default: `'queue'`)
73
+
74
+ **defaultTimeoutMs** (`number`): Default per-task timeout in milliseconds. Can be overridden per-tool or per-call. (Default: `300000`)
75
+
76
+ **defaultRetries** (`RetryConfig`): Default retry policy applied to tasks that fail.
77
+
78
+ **defaultRetries.maxRetries** (`number`): Maximum retry attempts before the task is marked failed.
79
+
80
+ **defaultRetries.retryDelayMs** (`number`): Delay between retries in milliseconds.
81
+
82
+ **defaultRetries.backoffMultiplier** (`number`): Multiplier applied to retryDelayMs on each subsequent attempt.
83
+
84
+ **defaultRetries.maxRetryDelayMs** (`number`): Upper bound on the retry delay regardless of backoff.
85
+
86
+ **defaultRetries.retryableErrors** (`(error: Error) => boolean`): Predicate that decides whether a given error should be retried. Default: retry all errors.
87
+
88
+ **cleanup** (`CleanupConfig`): Controls how long task records are kept and how often the cleanup process runs.
89
+
90
+ **cleanup.completedTtlMs** (`number`): How long to keep completed task records, in milliseconds. Default: 1 hour.
91
+
92
+ **cleanup.failedTtlMs** (`number`): How long to keep failed task records, in milliseconds. Default: 24 hours.
93
+
94
+ **cleanup.cleanupIntervalMs** (`number`): How often the cleanup process runs, in milliseconds. Default: 1 minute.
95
+
96
+ **waitTimeoutMs** (`number`): How long the agentic loop waits for a background task to complete before moving on. If a task has not finished within this time, the loop proceeds without setting isContinued. Default: undefined (do not wait). Can be overridden per-agent or per-tool.
97
+
98
+ **onTaskComplete** (`(task: BackgroundTask) => void | Promise<void>`): Global callback invoked when any background task completes successfully. Fires in addition to per-tool and per-agent callbacks.
99
+
100
+ **onTaskFailed** (`(task: BackgroundTask) => void | Promise<void>`): Global callback invoked when any background task fails. Fires in addition to per-tool and per-agent callbacks.
101
+
39
102
  ### deployer
40
103
 
41
104
  **Type:** `MastraDeployer`
@@ -90,6 +90,8 @@ await harness.sendMessage({ content: 'Hello!' })
90
90
 
91
91
  **subagents.stopWhen** (`LoopOptions['stopWhen']`): Optional stop condition for the spawned subagent.
92
92
 
93
+ **subagents.forked** (`boolean`): When \`true\`, calls to this subagent default to forked mode: the subagent runs on a clone of the parent thread, reusing the parent agent’s instructions, tools, and model so the prompt-cache prefix stays intact. Requires \`memory\` to be configured. The subagent definition’s own \`instructions\`, \`tools\`, \`allowedHarnessTools\`, \`allowedWorkspaceTools\`, \`defaultModelId\`, \`maxSteps\`, and \`stopWhen\` are ignored in forked mode. Callers can still override per-invocation via \`forked: false\` in the \`subagent\` tool input. See the \[Forked subagents]\(#forked-subagents) section below for full semantics.
94
+
93
95
  **resolveModel** (`(modelId: string) => MastraLanguageModel`): Converts a model ID string (e.g., \`"anthropic/claude-sonnet-4"\`) to a language model instance. Used by subagents and observational memory model resolution.
94
96
 
95
97
  **omConfig** (`HarnessOMConfig`): Default configuration for observational memory (observer/reflector model IDs and thresholds).
@@ -286,16 +288,21 @@ await harness.switchThread({ threadId: 'thread-abc123' })
286
288
 
287
289
  #### `listThreads(options?)`
288
290
 
289
- List threads from storage. By default, only threads for the current resource are returned.
291
+ List threads from storage. By default, only threads for the current resource are returned, and transient [forked subagent](#forked-subagents) threads are hidden so they don’t appear in user-facing thread pickers / startup flows.
290
292
 
291
293
  ```typescript
292
- // List threads for current resource
294
+ // List threads for current resource (forks hidden)
293
295
  const threads = await harness.listThreads()
294
296
 
295
- // List all threads across resources
297
+ // List all threads across resources (forks still hidden)
296
298
  const allThreads = await harness.listThreads({ allResources: true })
299
+
300
+ // Include forked subagent fork threads (debug / admin tooling only)
301
+ const everything = await harness.listThreads({ includeForkedSubagents: true })
297
302
  ```
298
303
 
304
+ Fork threads are tagged with `metadata.forkedSubagent === true` (and `metadata.parentThreadId`) by the harness. Set `includeForkedSubagents: true` to opt back into seeing them — e.g. for a debug panel.
305
+
299
306
  #### `renameThread({ title })`
300
307
 
301
308
  Update the title of the current thread.
@@ -677,6 +684,42 @@ await harness.setSubagentModelId({ modelId: 'anthropic/claude-sonnet-4-6' })
677
684
  await harness.setSubagentModelId({ modelId: 'anthropic/claude-haiku-3.5', agentType: 'explore' })
678
685
  ```
679
686
 
687
+ ### Forked subagents
688
+
689
+ By default, a subagent runs with a fresh context — it doesn't see the parent conversation. **Forked subagents** opt into a different model: the subagent runs on a clone of the parent thread and reuses the parent agent's full configuration. This is useful when the subagent needs the full context of the conversation so far (e.g., recalling earlier user-supplied facts), and when prompt-cache hit rates matter.
690
+
691
+ #### Enabling forked mode
692
+
693
+ Set `forked: true` either on the [`HarnessSubagent` definition](#configuration) (per-type default) or on each `subagent` tool call (per-invocation override):
694
+
695
+ ```typescript
696
+ // Per-type default — every call to this subagent forks unless overridden.
697
+ const subagents: HarnessSubagent[] = [
698
+ {
699
+ id: 'collaborator',
700
+ name: 'Collaborator',
701
+ description: 'Continues the conversation in a fork to try a different angle.',
702
+ instructions: '...',
703
+ forked: true,
704
+ },
705
+ ]
706
+ ```
707
+
708
+ The model can also pass `forked: true` (or `forked: false`) per-invocation in the `subagent` tool input; the per-invocation value wins.
709
+
710
+ #### Semantics and constraints
711
+
712
+ - **Memory required.** Forked mode calls `memory.cloneThread` to create the fork, so the harness must have `memory` configured and an active parent thread. Calls without those return a structured error rather than throwing.
713
+ - **Parent agent reused.** The fork runs through the parent agent's `stream(...)` call. The parent's instructions, tools, model, `maxSteps`, and `stopWhen` apply. The subagent definition's `instructions`, `tools`, `allowedHarnessTools`, `allowedWorkspaceTools`, `defaultModelId`, `maxSteps`, and `stopWhen` are ignored in forked mode — this is what preserves the prompt-cache prefix.
714
+ - **Toolsets inherited, recursive forks blocked at runtime.** Forks inherit the parent's toolsets verbatim (`ask_user`, `submit_plan`, user-configured harness tools, _including the `subagent` tool itself_) so the LLM request prefix — system prompt + tool list + tool schemas + tool descriptions — stays byte-identical to the parent's. This is what preserves the prompt cache. The `subagent` entry is kept on the model side but its `execute` is replaced inside the fork with a stub that returns a non-error "tool unavailable inside a forked subagent" message: nested forks are blocked at the runtime layer without perturbing the cached prefix.
715
+ - **Fork threads are tagged.** Each fork thread is created with `metadata.forkedSubagent === true` and `metadata.parentThreadId === <parent>`. By default, [`listThreads`](#listthreadsoptions) hides these so they don't show up in user-facing thread pickers / startup flows. Pass `includeForkedSubagents: true` to see them in admin / debug tooling.
716
+ - **Save-queue flushed before clone.** The agent stream batches message saves through a debounced `SaveQueueManager`, so the parent's latest user / assistant turn may not be on disk yet when the subagent tool call fires. The fork tool flushes pending saves first via the `flushMessages` callback on `AgentToolExecutionContext` before cloning, so the fork actually carries the latest turn. Flush failures are non-fatal — the clone still runs.
717
+ - **Parent thread untouched.** All subagent activity (messages, OM writes) lands on the fork. The parent thread is never appended to during a forked subagent run.
718
+
719
+ #### When to prefer non-forked mode
720
+
721
+ Forked mode trades isolation for context inheritance. If the subagent should run with a strictly smaller toolset, a different system prompt, or a cheaper model, use the default (non-forked) mode and pass any required context explicitly in the `task` description.
722
+
680
723
  ### Events
681
724
 
682
725
  #### `subscribe(listener)`
@@ -753,13 +796,13 @@ The harness emits events through registered listeners. The following table lists
753
796
 
754
797
  The harness provides built-in tools to agents in every mode:
755
798
 
756
- | Tool | Description |
757
- | ------------- | ------------------------------------------------------------------------------------------------------------------------- |
758
- | `ask_user` | Ask the user a question and wait for their response. Supports free text, single-select choices, and multi-select choices. |
759
- | `submit_plan` | Submit a plan for user review and approval. |
760
- | `task_write` | Create or update a structured task list for tracking progress. |
761
- | `task_check` | Check the completion status of the current task list. |
762
- | `subagent` | Spawn a focused subagent with constrained tools (only available when `subagents` is configured). |
799
+ | Tool | Description |
800
+ | ------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
801
+ | `ask_user` | Ask the user a question and wait for their response. Supports free text, single-select choices, and multi-select choices. |
802
+ | `submit_plan` | Submit a plan for user review and approval. |
803
+ | `task_write` | Create or update a structured task list for tracking progress. |
804
+ | `task_check` | Check the completion status of the current task list. |
805
+ | `subagent` | Spawn a focused subagent with constrained tools (only available when `subagents` is configured). Pass `forked: true` to inherit the parent conversation — see [Forked subagents](#forked-subagents). |
763
806
 
764
807
  ### `ask_user` selections
765
808
 
@@ -169,6 +169,7 @@ The Reference section provides documentation of Mastra's API, including paramete
169
169
  - [PromptInjectionDetector](https://mastra.ai/reference/processors/prompt-injection-detector)
170
170
  - [SemanticRecall](https://mastra.ai/reference/processors/semantic-recall-processor)
171
171
  - [SkillSearchProcessor](https://mastra.ai/reference/processors/skill-search-processor)
172
+ - [StreamErrorRetryProcessor](https://mastra.ai/reference/processors/stream-error-retry-processor)
172
173
  - [SystemPromptScrubber](https://mastra.ai/reference/processors/system-prompt-scrubber)
173
174
  - [TokenLimiterProcessor](https://mastra.ai/reference/processors/token-limiter-processor)
174
175
  - [ToolCallFilter](https://mastra.ai/reference/processors/tool-call-filter)
@@ -209,6 +210,7 @@ The Reference section provides documentation of Mastra's API, including paramete
209
210
  - [MastraModelOutput](https://mastra.ai/reference/streaming/agents/MastraModelOutput)
210
211
  - [.stream()](https://mastra.ai/reference/streaming/agents/stream)
211
212
  - [.streamLegacy()](https://mastra.ai/reference/streaming/agents/streamLegacy)
213
+ - [.streamUntilIdle()](https://mastra.ai/reference/streaming/agents/streamUntilIdle)
212
214
  - [.observeStream()](https://mastra.ai/reference/streaming/workflows/observeStream)
213
215
  - [.resumeStream()](https://mastra.ai/reference/streaming/workflows/resumeStream)
214
216
  - [.stream()](https://mastra.ai/reference/streaming/workflows/stream)
@@ -126,6 +126,21 @@ interface ObservabilityExporter {
126
126
  /** Initialize exporter with tracing configuration and/or access to Mastra */
127
127
  init?(options: InitExporterOptions): void
128
128
 
129
+ /** Handle tracing events */
130
+ onTracingEvent?(event: TracingEvent): void | Promise<void>
131
+
132
+ /** Handle log events */
133
+ onLogEvent?(event: LogEvent): void | Promise<void>
134
+
135
+ /** Handle metric events */
136
+ onMetricEvent?(event: MetricEvent): void | Promise<void>
137
+
138
+ /** Handle score events */
139
+ onScoreEvent?(event: ScoreEvent): void | Promise<void>
140
+
141
+ /** Handle feedback events */
142
+ onFeedbackEvent?(event: FeedbackEvent): void | Promise<void>
143
+
129
144
  /** Export tracing events */
130
145
  exportTracingEvent(event: TracingEvent): Promise<void>
131
146
 
@@ -154,6 +169,8 @@ interface ObservabilityExporter {
154
169
  }
155
170
  ```
156
171
 
172
+ Event callback payloads use observability event bus envelopes: `TracingEvent` carries span lifecycle events with `exportedSpan`, `LogEvent` wraps `ExportedLog` in `log`, `MetricEvent` wraps `ExportedMetric` in `metric`, `ScoreEvent` wraps `ExportedScore` in `score`, and `FeedbackEvent` wraps `ExportedFeedback` in `feedback`. For Cloud exporter behavior for these callbacks, see [CloudExporter](https://mastra.ai/reference/observability/tracing/exporters/cloud-exporter).
173
+
157
174
  ### `SpanOutputProcessor`
158
175
 
159
176
  Interface for span output processors.
@@ -0,0 +1,54 @@
1
+ # StreamErrorRetryProcessor
2
+
3
+ `StreamErrorRetryProcessor` is an **error processor** that retries transient errors emitted after an LLM stream starts. It includes built-in matching for OpenAI Responses stream errors and supports additional matchers for other provider-specific stream error shapes.
4
+
5
+ The processor isn't enabled by default in core. Add it to `errorProcessors` for agents that need stream-error retry handling.
6
+
7
+ ## Usage example
8
+
9
+ Add `StreamErrorRetryProcessor` to `errorProcessors`:
10
+
11
+ ```typescript
12
+ import { Agent } from '@mastra/core/agent'
13
+ import { StreamErrorRetryProcessor } from '@mastra/core/processors'
14
+
15
+ export const agent = new Agent({
16
+ name: 'openai-agent',
17
+ instructions: 'You are a helpful assistant.',
18
+ model: 'openai/gpt-5',
19
+ errorProcessors: [new StreamErrorRetryProcessor()],
20
+ })
21
+ ```
22
+
23
+ ## How it works
24
+
25
+ The processor checks the error and its cause chain for:
26
+
27
+ - Provider retry metadata: `isRetryable === true`
28
+ - Built-in OpenAI Responses stream error matching
29
+ - Matcher results: Any configured matcher that returns `true`
30
+
31
+ When the error is retryable, the processor returns `{ retry: true }`. It doesn't mutate messages.
32
+
33
+ ## Default OpenAI Responses matcher
34
+
35
+ `isRetryableOpenAIResponsesStreamError` matches OpenAI Responses stream error chunks with `type: 'error'` or `type: 'response.failed'`. It retries known transient OpenAI error codes and, as a fallback, errors with explicit retry guidance such as `You can retry your request`.
36
+
37
+ `StreamErrorRetryProcessor` includes this matcher by default. You can also import it and reuse it in custom retry logic.
38
+
39
+ ## Constructor parameters
40
+
41
+ **options** (`StreamErrorRetryProcessorOptions`): Configuration for retry handling.
42
+
43
+ ## Properties
44
+
45
+ **id** (`'stream-error-retry-processor'`): Processor identifier.
46
+
47
+ **name** (`'Stream Error Retry Processor'`): Processor display name.
48
+
49
+ **processAPIError** (`(args: ProcessAPIErrorArgs) => ProcessAPIErrorResult | void`): Retries stream errors up to the configured retry limit.
50
+
51
+ ## Related
52
+
53
+ - [Processor interface](https://mastra.ai/reference/processors/processor-interface)
54
+ - [Processors](https://mastra.ai/docs/agents/processors)
@@ -398,6 +398,146 @@ Contains output from workflow step execution, used primarily for usage tracking
398
398
 
399
399
  **payload.output** (`ChunkType`): Nested chunk data from step execution, typically containing finish events or other step results
400
400
 
401
+ ## Background task chunks
402
+
403
+ Emitted when a tool call is dispatched as a [background task](https://mastra.ai/docs/agents/background-tasks) and `streamUntilIdle()` is used.
404
+
405
+ ### background-task-started
406
+
407
+ Emitted when a tool call is enqueued as a background task and assigned a `taskId`.
408
+
409
+ **type** (`"background-task-started"`): Chunk type identifier
410
+
411
+ **payload** (`BackgroundTaskStartedPayload`): Identifies the newly enqueued task
412
+
413
+ **payload.taskId** (`string`): Unique identifier for the background task
414
+
415
+ **payload.toolName** (`string`): Name of the tool being executed
416
+
417
+ **payload.toolCallId** (`string`): Tool-call ID from the originating LLM tool call
418
+
419
+ ### background-task-running
420
+
421
+ Emitted when a worker picks up the task and execution begins.
422
+
423
+ **type** (`"background-task-running"`): Chunk type identifier
424
+
425
+ **payload** (`BackgroundTaskRunningPayload`): Details about the running task
426
+
427
+ **payload.taskId** (`string`): Unique identifier for the background task
428
+
429
+ **payload.toolName** (`string`): Name of the tool being executed
430
+
431
+ **payload.toolCallId** (`string`): Tool-call ID from the originating LLM tool call
432
+
433
+ **payload.runId** (`string`): Run ID of the agent that dispatched the task
434
+
435
+ **payload.agentId** (`string`): ID of the agent that dispatched the task
436
+
437
+ **payload.startedAt** (`Date`): Timestamp at which execution started
438
+
439
+ **payload.args** (`Record<string, unknown>`): Arguments passed to the tool's execute function
440
+
441
+ ### background-task-progress
442
+
443
+ Periodic snapshot of how many background tasks are currently running across the agent.
444
+
445
+ **type** (`"background-task-progress"`): Chunk type identifier
446
+
447
+ **payload** (`BackgroundTaskProgressPayload`): Aggregate progress for all running tasks
448
+
449
+ **payload.taskIds** (`string[]`): IDs of all currently running background tasks
450
+
451
+ **payload.runningCount** (`number`): Number of background tasks currently running
452
+
453
+ **payload.elapsedMs** (`number`): Milliseconds elapsed since the agent run started
454
+
455
+ ### background-task-output
456
+
457
+ A streamed output chunk emitted by the task's `execute` function. Wraps an inner [`tool-output`](#tool-output) chunk.
458
+
459
+ **type** (`"background-task-output"`): Chunk type identifier
460
+
461
+ **payload** (`BackgroundTaskOutputPayload`): Streamed output from the running task
462
+
463
+ **payload.taskId** (`string`): Unique identifier for the background task
464
+
465
+ **payload.toolName** (`string`): Name of the tool being executed
466
+
467
+ **payload.toolCallId** (`string`): Tool-call ID from the originating LLM tool call
468
+
469
+ **payload.runId** (`string`): Run ID of the agent that dispatched the task
470
+
471
+ **payload.agentId** (`string`): ID of the agent that dispatched the task
472
+
473
+ **payload.payload** (`ToolOutputChunk`): Inner tool-output chunk produced by the task
474
+
475
+ ### background-task-completed
476
+
477
+ Emitted when the task finishes successfully. Triggers a continuation turn when consumed by [`Agent.streamUntilIdle()`](https://mastra.ai/reference/streaming/agents/streamUntilIdle).
478
+
479
+ **type** (`"background-task-completed"`): Chunk type identifier
480
+
481
+ **payload** (`BackgroundTaskResultPayload`): The completed task's result
482
+
483
+ **payload.taskId** (`string`): Unique identifier for the background task
484
+
485
+ **payload.toolName** (`string`): Name of the tool that was executed
486
+
487
+ **payload.toolCallId** (`string`): Tool-call ID from the originating LLM tool call
488
+
489
+ **payload.agentId** (`string`): ID of the agent that dispatched the task
490
+
491
+ **payload.runId** (`string`): Run ID of the agent that dispatched the task
492
+
493
+ **payload.result** (`unknown`): The tool's resolved return value
494
+
495
+ **payload.completedAt** (`Date`): Timestamp at which the task completed
496
+
497
+ **payload.isError** (`boolean`): True when the tool returned an error result rather than throwing
498
+
499
+ ### background-task-failed
500
+
501
+ Emitted when the task throws or times out. Triggers a continuation turn when consumed by [`Agent.streamUntilIdle()`](https://mastra.ai/reference/streaming/agents/streamUntilIdle).
502
+
503
+ **type** (`"background-task-failed"`): Chunk type identifier
504
+
505
+ **payload** (`BackgroundTaskFailedPayload`): Failure details for the task
506
+
507
+ **payload.taskId** (`string`): Unique identifier for the background task
508
+
509
+ **payload.toolName** (`string`): Name of the tool that was executed
510
+
511
+ **payload.toolCallId** (`string`): Tool-call ID from the originating LLM tool call
512
+
513
+ **payload.runId** (`string`): Run ID of the agent that dispatched the task
514
+
515
+ **payload.agentId** (`string`): ID of the agent that dispatched the task
516
+
517
+ **payload.error** (`{ message: string }`): Error details thrown by the task
518
+
519
+ **payload.completedAt** (`Date`): Timestamp at which the task failed
520
+
521
+ ### background-task-cancelled
522
+
523
+ Emitted when the task is cancelled before completing. Triggers a continuation turn when consumed by [`Agent.streamUntilIdle()`](https://mastra.ai/reference/streaming/agents/streamUntilIdle).
524
+
525
+ **type** (`"background-task-cancelled"`): Chunk type identifier
526
+
527
+ **payload** (`BackgroundTaskCancelledPayload`): Cancellation details for the task
528
+
529
+ **payload.taskId** (`string`): Unique identifier for the background task
530
+
531
+ **payload.toolName** (`string`): Name of the tool that was executed
532
+
533
+ **payload.toolCallId** (`string`): Tool-call ID from the originating LLM tool call
534
+
535
+ **payload.runId** (`string`): Run ID of the agent that dispatched the task
536
+
537
+ **payload.agentId** (`string`): ID of the agent that dispatched the task
538
+
539
+ **payload.completedAt** (`Date`): Timestamp at which the task was cancelled
540
+
401
541
  ## Metadata and special chunks
402
542
 
403
543
  ### response-metadata
@@ -0,0 +1,94 @@
1
+ # Agent.streamUntilIdle()
2
+
3
+ **Added in:** `@mastra/core@1.28.0`
4
+
5
+ `streamUntilIdle()` streams an agent's response and keeps the stream open until every background task dispatched during the run completes. When a task finishes, its result is written to memory and the agentic loop re-enters automatically so the LLM can react to it. The stream closes once no tasks are running and no completions are queued.
6
+
7
+ Use it when the agent dispatches background tasks (typically long-running tools or subagents) and you want a single stream that spans the initial response **plus** every continuation triggered by a task completion. For foreground-only runs, or if you prefer to manage the continuation yourself (i.e., prompt the agent again to process the result), use [`Agent.stream()`](https://mastra.ai/reference/streaming/agents/stream).
8
+
9
+ ## Usage example
10
+
11
+ ```ts
12
+ const stream = await agent.streamUntilIdle('Research solana for me', {
13
+ memory: { thread: 't1', resource: 'u1' },
14
+ })
15
+
16
+ for await (const chunk of stream.fullStream) {
17
+ // chunks from the initial turn AND any continuation turns triggered by
18
+ // background task completions flow through here
19
+ }
20
+ ```
21
+
22
+ > **Info:** `streamUntilIdle()` requires both a [`BackgroundTaskManager`](https://mastra.ai/reference/configuration) and a [memory](https://mastra.ai/docs/memory/overview) backend. Without either, it falls through to a plain `agent.stream()` call.
23
+
24
+ ## Parameters
25
+
26
+ **messages** (`string | string[] | CoreMessage[] | AiMessageType[] | UIMessageWithMetadata[]`): The messages to send to the agent. Can be a single string, array of strings, or structured message objects.
27
+
28
+ **options** (`AgentExecutionOptions<Output> & { maxIdleMs?: number }`): Accepts every option that `Agent.stream()` accepts, plus `maxIdleMs`. See the [`Agent.stream()`](https://mastra.ai/reference/streaming/agents/stream) reference for the full list.
29
+
30
+ **options.maxIdleMs** (`number`): Closes the outer stream after this many ms of idleness between turns. The timer only runs while the wrapper is between turns, so a slow first token does not close the stream. Default: 5 minutes.
31
+
32
+ **options.memory** (`{ thread?: string | { id: string }; resource?: string }`): Memory thread and resource for the run. Required for continuations to write background task results back into the conversation.
33
+
34
+ **options.structuredOutput** (`PublicStructuredOutputOptions<Output>`): Schema-based structured output. Same shape as `Agent.stream()`. Note that aggregate properties resolve against the first turn only.
35
+
36
+ For every other option (`maxSteps`, `modelSettings`, `toolChoice`, `outputProcessors`, `onFinish`, `onChunk`, etc.), see the [`Agent.stream()` parameters](https://mastra.ai/reference/streaming/agents/stream). `streamUntilIdle()` forwards them to the initial turn.
37
+
38
+ ## Returns
39
+
40
+ **stream** (`MastraModelOutput<Output>`): A MastraModelOutput where fullStream spans the initial turn plus every auto-continuation. Aggregate properties (text, toolCalls, toolResults, finishReason, messageList, getFullOutput()) resolve against the first turn only.
41
+
42
+ ### Aggregate properties caveat
43
+
44
+ `streamUntilIdle()` returns a proxy over the first turn's `MastraModelOutput`. Only `fullStream` is replaced with a combined stream that spans every continuation. Every other property — `text`, `toolCalls`, `toolResults`, `finishReason`, `messageList`, `getFullOutput()` — resolves against the **first turn's** internal buffer.
45
+
46
+ If you need an aggregate view across all continuations, consume `fullStream` yourself and accumulate.
47
+
48
+ ## Continuation behavior
49
+
50
+ Internally, `streamUntilIdle()`:
51
+
52
+ 1. Runs the initial turn via `agent.stream(...)` and pipes its `fullStream` into the outer stream.
53
+ 2. Subscribes to background-task completion events for the resolved memory scope.
54
+ 3. Queues each terminal event (`background-task-completed`, `background-task-failed`, `background-task-cancelled`) and, when the outer wrapper is idle between turns, re-invokes `agent.stream([], ...)` with a directive listing the just-completed `toolCallId`s. The continuation turn flows into the same outer stream.
55
+ 4. Closes the outer stream once no tasks are running and no completions are queued.
56
+
57
+ ## Extended usage example
58
+
59
+ ### Cap idle time between turns
60
+
61
+ ```ts
62
+ const stream = await agent.streamUntilIdle('Kick off the long jobs', {
63
+ memory: { thread: 't1', resource: 'u1' },
64
+ maxIdleMs: 60_000, // close the stream after 1 minute of idleness between turns
65
+ })
66
+
67
+ for await (const chunk of stream.fullStream) {
68
+ if (chunk.type === 'background-task-completed') {
69
+ console.log('Task complete:', chunk.payload.taskId)
70
+ }
71
+ }
72
+ ```
73
+
74
+ ### Aggregate text across continuations
75
+
76
+ ```ts
77
+ const stream = await agent.streamUntilIdle('Research and summarize', {
78
+ memory: { thread: 't1', resource: 'u1' },
79
+ })
80
+
81
+ let fullText = ''
82
+ for await (const chunk of stream.fullStream) {
83
+ if (chunk.type === 'text-delta') {
84
+ fullText += chunk.payload.text
85
+ }
86
+ }
87
+ ```
88
+
89
+ ## Related
90
+
91
+ - [Background tasks](https://mastra.ai/docs/agents/background-tasks)
92
+ - [`Agent.stream()` reference](https://mastra.ai/reference/streaming/agents/stream)
93
+ - [backgroundTasks configuration reference](https://mastra.ai/reference/configuration)
94
+ - [Stream chunk types](https://mastra.ai/reference/streaming/ChunkType)