npm - @trigger.dev/sdk - Versions diffs - 4.5.0-rc.5 → 4.5.0-rc.7 - Mend

@trigger.dev/sdk 4.5.0-rc.5 → 4.5.0-rc.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (213)

package/dist/commonjs/v3/ai.d.ts +178 -5
package/dist/commonjs/v3/ai.js +603 -119
package/dist/commonjs/v3/ai.js.map +1 -1
package/dist/commonjs/v3/chat-client.js +3 -0
package/dist/commonjs/v3/chat-client.js.map +1 -1
package/dist/commonjs/v3/chat-react.js +10 -7
package/dist/commonjs/v3/chat-react.js.map +1 -1
package/dist/commonjs/v3/chat-server.d.ts +8 -0
package/dist/commonjs/v3/chat-server.js +32 -10
package/dist/commonjs/v3/chat-server.js.map +1 -1
package/dist/commonjs/v3/chat-server.test.js +51 -0
package/dist/commonjs/v3/chat-server.test.js.map +1 -1
package/dist/commonjs/v3/chat.js +34 -6
package/dist/commonjs/v3/chat.js.map +1 -1
package/dist/commonjs/v3/chat.test.js +53 -0
package/dist/commonjs/v3/chat.test.js.map +1 -1
package/dist/commonjs/v3/createStartSessionAction.test.js +30 -0
package/dist/commonjs/v3/createStartSessionAction.test.js.map +1 -1
package/dist/commonjs/v3/sessions.d.ts +11 -6
package/dist/commonjs/v3/sessions.js +10 -5
package/dist/commonjs/v3/sessions.js.map +1 -1
package/dist/commonjs/v3/test/mock-chat-agent.d.ts +6 -0
package/dist/commonjs/v3/test/mock-chat-agent.js +1 -0
package/dist/commonjs/v3/test/mock-chat-agent.js.map +1 -1
package/dist/commonjs/version.js +1 -1
package/dist/esm/v3/ai.d.ts +178 -5
package/dist/esm/v3/ai.js +603 -120
package/dist/esm/v3/ai.js.map +1 -1
package/dist/esm/v3/chat-client.js +3 -0
package/dist/esm/v3/chat-client.js.map +1 -1
package/dist/esm/v3/chat-react.js +10 -7
package/dist/esm/v3/chat-react.js.map +1 -1
package/dist/esm/v3/chat-server.d.ts +8 -0
package/dist/esm/v3/chat-server.js +32 -10
package/dist/esm/v3/chat-server.js.map +1 -1
package/dist/esm/v3/chat-server.test.js +51 -0
package/dist/esm/v3/chat-server.test.js.map +1 -1
package/dist/esm/v3/chat.js +34 -6
package/dist/esm/v3/chat.js.map +1 -1
package/dist/esm/v3/chat.test.js +53 -0
package/dist/esm/v3/chat.test.js.map +1 -1
package/dist/esm/v3/createStartSessionAction.test.js +30 -0
package/dist/esm/v3/createStartSessionAction.test.js.map +1 -1
package/dist/esm/v3/sessions.d.ts +11 -6
package/dist/esm/v3/sessions.js +10 -5
package/dist/esm/v3/sessions.js.map +1 -1
package/dist/esm/v3/test/mock-chat-agent.d.ts +6 -0
package/dist/esm/v3/test/mock-chat-agent.js +1 -0
package/dist/esm/v3/test/mock-chat-agent.js.map +1 -1
package/dist/esm/version.js +1 -1
package/docs/ai/prompts.mdx +430 -0
package/docs/ai-chat/actions.mdx +115 -0
package/docs/ai-chat/anatomy.mdx +71 -0
package/docs/ai-chat/backend.mdx +817 -0
package/docs/ai-chat/background-injection.mdx +221 -0
package/docs/ai-chat/changelog.mdx +850 -0
package/docs/ai-chat/chat-local.mdx +174 -0
package/docs/ai-chat/client-protocol.mdx +1081 -0
package/docs/ai-chat/compaction.mdx +411 -0
package/docs/ai-chat/custom-agents.mdx +364 -0
package/docs/ai-chat/error-handling.mdx +415 -0
package/docs/ai-chat/fast-starts.mdx +672 -0
package/docs/ai-chat/frontend.mdx +580 -0
package/docs/ai-chat/how-it-works.mdx +230 -0
package/docs/ai-chat/lifecycle-hooks.mdx +530 -0
package/docs/ai-chat/mcp.mdx +101 -0
package/docs/ai-chat/overview.mdx +90 -0
package/docs/ai-chat/patterns/branching-conversations.mdx +284 -0
package/docs/ai-chat/patterns/code-sandbox.mdx +126 -0
package/docs/ai-chat/patterns/database-persistence.mdx +414 -0
package/docs/ai-chat/patterns/human-in-the-loop.mdx +275 -0
package/docs/ai-chat/patterns/large-payloads.mdx +169 -0
package/docs/ai-chat/patterns/oom-resilience.mdx +120 -0
package/docs/ai-chat/patterns/persistence-and-replay.mdx +211 -0
package/docs/ai-chat/patterns/recovery-boot.mdx +230 -0
package/docs/ai-chat/patterns/skills.mdx +221 -0
package/docs/ai-chat/patterns/sub-agents.mdx +383 -0
package/docs/ai-chat/patterns/tool-result-auditing.mdx +148 -0
package/docs/ai-chat/patterns/trusted-edge-signals.mdx +337 -0
package/docs/ai-chat/patterns/version-upgrades.mdx +172 -0
package/docs/ai-chat/pending-messages.mdx +343 -0
package/docs/ai-chat/prompt-caching.mdx +206 -0
package/docs/ai-chat/quick-start.mdx +161 -0
package/docs/ai-chat/reference.mdx +909 -0
package/docs/ai-chat/server-chat.mdx +263 -0
package/docs/ai-chat/sessions.mdx +333 -0
package/docs/ai-chat/testing.mdx +682 -0
package/docs/ai-chat/tools.mdx +191 -0
package/docs/ai-chat/types.mdx +242 -0
package/docs/ai-chat/upgrade-guide.mdx +515 -0
package/docs/apikeys.mdx +54 -0
package/docs/building-with-ai.mdx +261 -0
package/docs/bulk-actions.mdx +49 -0
package/docs/changelog.mdx +6 -0
package/docs/cli-deploy-commands.mdx +9 -0
package/docs/cli-dev-commands.mdx +9 -0
package/docs/cli-dev.mdx +8 -0
package/docs/cli-init-commands.mdx +58 -0
package/docs/cli-introduction.mdx +25 -0
package/docs/cli-list-profiles-commands.mdx +42 -0
package/docs/cli-login-commands.mdx +33 -0
package/docs/cli-logout-commands.mdx +33 -0
package/docs/cli-preview-archive.mdx +59 -0
package/docs/cli-promote-commands.mdx +9 -0
package/docs/cli-switch.mdx +43 -0
package/docs/cli-update-commands.mdx +42 -0
package/docs/cli-whoami-commands.mdx +33 -0
package/docs/community.mdx +6 -0
package/docs/config/config-file.mdx +602 -0
package/docs/config/extensions/additionalFiles.mdx +38 -0
package/docs/config/extensions/additionalPackages.mdx +40 -0
package/docs/config/extensions/aptGet.mdx +34 -0
package/docs/config/extensions/audioWaveform.mdx +20 -0
package/docs/config/extensions/custom.mdx +380 -0
package/docs/config/extensions/emitDecoratorMetadata.mdx +29 -0
package/docs/config/extensions/esbuildPlugin.mdx +31 -0
package/docs/config/extensions/ffmpeg.mdx +45 -0
package/docs/config/extensions/lightpanda.mdx +56 -0
package/docs/config/extensions/overview.mdx +67 -0
package/docs/config/extensions/playwright.mdx +195 -0
package/docs/config/extensions/prismaExtension.mdx +1014 -0
package/docs/config/extensions/puppeteer.mdx +30 -0
package/docs/config/extensions/pythonExtension.mdx +182 -0
package/docs/config/extensions/syncEnvVars.mdx +291 -0
package/docs/context.mdx +235 -0
package/docs/database-connections.mdx +213 -0
package/docs/deploy-environment-variables.mdx +435 -0
package/docs/deployment/atomic-deployment.mdx +172 -0
package/docs/deployment/overview.mdx +257 -0
package/docs/deployment/preview-branches.mdx +224 -0
package/docs/errors-retrying.mdx +379 -0
package/docs/github-actions.mdx +222 -0
package/docs/github-integration.mdx +136 -0
package/docs/github-repo.mdx +8 -0
package/docs/help-email.mdx +6 -0
package/docs/help-slack.mdx +11 -0
package/docs/hidden-tasks.mdx +56 -0
package/docs/how-it-works.mdx +454 -0
package/docs/how-to-reduce-your-spend.mdx +217 -0
package/docs/idempotency.mdx +504 -0
package/docs/introduction.mdx +223 -0
package/docs/limits.mdx +241 -0
package/docs/logging.mdx +195 -0
package/docs/machines.mdx +952 -0
package/docs/manual-setup.mdx +632 -0
package/docs/mcp-agent-rules.mdx +41 -0
package/docs/mcp-introduction.mdx +385 -0
package/docs/mcp-tools.mdx +273 -0
package/docs/migrating-from-v3.mdx +334 -0
package/docs/observability/dashboards.mdx +102 -0
package/docs/observability/query.mdx +585 -0
package/docs/open-source-contributing.mdx +16 -0
package/docs/open-source-self-hosting.mdx +541 -0
package/docs/private-networking/aws-console-setup.mdx +304 -0
package/docs/private-networking/overview.mdx +144 -0
package/docs/private-networking/troubleshooting.mdx +78 -0
package/docs/queue-concurrency.mdx +354 -0
package/docs/quick-start.mdx +97 -0
package/docs/realtime/auth.mdx +208 -0
package/docs/realtime/backend/overview.mdx +45 -0
package/docs/realtime/backend/streams.mdx +418 -0
package/docs/realtime/backend/subscribe.mdx +225 -0
package/docs/realtime/how-it-works.mdx +94 -0
package/docs/realtime/overview.mdx +63 -0
package/docs/realtime/react-hooks/overview.mdx +73 -0
package/docs/realtime/react-hooks/streams.mdx +449 -0
package/docs/realtime/react-hooks/subscribe.mdx +674 -0
package/docs/realtime/react-hooks/swr.mdx +87 -0
package/docs/realtime/react-hooks/triggering.mdx +194 -0
package/docs/realtime/react-hooks/use-wait-token.mdx +34 -0
package/docs/realtime/run-object.mdx +174 -0
package/docs/replaying.mdx +72 -0
package/docs/request-feature.mdx +6 -0
package/docs/roadmap.mdx +6 -0
package/docs/run-tests.mdx +20 -0
package/docs/run-usage.mdx +113 -0
package/docs/runs/heartbeats.mdx +38 -0
package/docs/runs/max-duration.mdx +139 -0
package/docs/runs/metadata.mdx +734 -0
package/docs/runs/priority.mdx +31 -0
package/docs/runs.mdx +396 -0
package/docs/self-hosting/docker.mdx +458 -0
package/docs/self-hosting/env/supervisor.mdx +74 -0
package/docs/self-hosting/env/webapp.mdx +276 -0
package/docs/self-hosting/kubernetes.mdx +601 -0
package/docs/self-hosting/overview.mdx +108 -0
package/docs/skills.mdx +85 -0
package/docs/tags.mdx +120 -0
package/docs/tasks/overview.mdx +697 -0
package/docs/tasks/scheduled.mdx +382 -0
package/docs/tasks/schemaTask.mdx +413 -0
package/docs/tasks/streams.mdx +884 -0
package/docs/triggering.mdx +1320 -0
package/docs/troubleshooting-alerts.mdx +385 -0
package/docs/troubleshooting-debugging-in-vscode.mdx +8 -0
package/docs/troubleshooting-github-issues.mdx +6 -0
package/docs/troubleshooting-uptime-status.mdx +6 -0
package/docs/troubleshooting.mdx +398 -0
package/docs/upgrading-packages.mdx +80 -0
package/docs/vercel-integration.mdx +207 -0
package/docs/versioning.mdx +56 -0
package/docs/video-walkthrough.mdx +23 -0
package/docs/wait-for-token.mdx +540 -0
package/docs/wait-for.mdx +42 -0
package/docs/wait-until.mdx +53 -0
package/docs/wait.mdx +18 -0
package/docs/writing-tasks-introduction.mdx +33 -0
package/package.json +10 -6
package/skills/trigger-authoring-chat-agent/SKILL.md +296 -0
package/skills/trigger-authoring-tasks/SKILL.md +254 -0
package/skills/trigger-chat-agent-advanced/SKILL.md +368 -0
package/skills/trigger-cost-savings/SKILL.md +116 -0
package/skills/trigger-realtime-and-frontend/SKILL.md +276 -0

package/docs/ai-chat/patterns/large-payloads.mdx ADDED

@@ -0,0 +1,169 @@
+---
+title: "Large payloads in chat.agent"
+sidebarTitle: "Large payloads"
+description: "Why a single chunk on the chat stream is capped at ~1 MiB, what error you'll see, and how to work around it with ID references."
+---
+import RcBanner from "/snippets/ai-chat-rc-banner.mdx";
+<RcBanner />
+The realtime stream that backs `chat.agent` enforces a **per-record cap of ~1 MiB** (`1048576` bytes minus a small envelope reserve). Anything written through the chat output — auto-piped LLM chunks, `chat.response.write`, custom `writer.write` parts — counts as one record per chunk and is rejected if it crosses the cap.
+This is a platform-level limit and cannot be raised per project or per stream.
+## What you'll see
+When a chunk crosses the cap, the run fails with a typed [`ChatChunkTooLargeError`](/ai-chat/error-handling):
+```
+ChatChunkTooLargeError: chat.agent chunk of type "tool-output-available" is 2000126 bytes,
+over the realtime stream's per-record cap of 1047552 bytes. For oversized payloads
+(e.g. large tool outputs), write the value to your own store and emit only an id/url
+through the chat stream — see https://trigger.dev/docs/ai-chat/patterns/large-payloads.
+```
+The error includes:
+- `chunkType` — discriminant on the chunk that failed (e.g. `tool-output-available`, `data-handover`, `text-delta`).
+- `chunkSize` — UTF-8 byte count of the JSON-serialized record.
+- `maxSize` — the effective cap.
+You can catch the error, log it, and re-throw it explicitly:
+```ts
+import { isChatChunkTooLargeError, logger } from "@trigger.dev/sdk";
+try {
+  await someWrite();
+} catch (err) {
+  if (isChatChunkTooLargeError(err)) {
+    logger.error("Oversized chunk", { type: err.chunkType, size: err.chunkSize });
+  }
+  throw err;
+}
+```
+## Most common cause: large tool outputs
+If you return a `streamText` result from `run()`, the AI SDK auto-pipes its `UIMessageStream` into the chat output. A tool whose result object is large (a fetched HTML body, a CSV blob, an image as base64, a deep DB row dump) gets emitted as one `tool-output-available` chunk — and that's the chunk that overruns.
+**Diagnose first**: log tool sizes during development.
+```ts
+const fetchPage = tool({
+  inputSchema: z.object({ url: z.string().url() }),
+  execute: async ({ url }) => {
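+    const html = await (await fetch(url)).text();
+    // Measure UTF-8 bytes, not UTF-16 code units: the cap is enforced on bytes.
+    const bytes = Buffer.byteLength(html, "utf8");
+    if (bytes > 500_000) {
+      logger.warn("Large tool output", { tool: "fetchPage", bytes });
+    }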
+    return { html };
+  },
+});
+```
+If the size is unbounded by input, fix the tool — not the stream.
+## ID-reference pattern
+Store the large value in your own database (or object store) and emit only an identifier through the chat stream. The frontend fetches the full payload separately on demand.
+This keeps the chat stream small, predictable, and resumable, and lets you reuse the value across turns or sessions without re-streaming it.
+<CodeGroup>
+```ts task.ts
+import { chat } from "@trigger.dev/sdk/ai";
+import { tool } from "ai";
+import { z } from "zod";
+const fetchPage = tool({
+  description: "Fetch a URL and store the HTML for later inspection.",
+  inputSchema: z.object({ url: z.string().url() }),
+  execute: async ({ url }) => {
+    const html = await (await fetch(url)).text();
+    // UTF-8 byte count; html.length would count UTF-16 code units instead.
+    const byteSize = Buffer.byteLength(html, "utf8");
+    const docId = await db.documents.create({
+      data: { url, html, byteSize },
+    });
+    // Tool result is small: just an id and metadata.
+    // The model and the UI both work with this lightweight handle.
+    return {
+      docId,
+      url,
+      byteSize,
+      preview: html.slice(0, 500),
+    };
+  },
+});
+```
+```ts api/document/[id]/route.ts
+// Frontend fetches the full document on demand.
+import { currentUser } from "@/lib/auth";
+export async function GET(_req: Request, { params }: { params: { id: string } }) {
+  const user = await currentUser();
+  const doc = await db.documents.findUniqueOrThrow({
+    where: { id: params.id, userId: user.id },
+  });
+  return new Response(doc.html, { headers: { "content-type": "text/html" } });
+}
+```
+```tsx component.tsx
+function ToolResultCard({ part }: { part: ToolUIPart<"fetchPage"> }) {
+  const { docId, url, byteSize, preview } = part.output;
+  return (
+    <div>
+      <p>{url} — {(byteSize / 1024).toFixed(0)} KB</p>
+      <pre>{preview}…</pre>
+      <a href={`/api/document/${docId}`}>Open full HTML</a>
+    </div>
+  );
+}
+```
+</CodeGroup>
+The same pattern works for `chat.response.write` — push the heavy value to your DB, then emit a small data part with the id:
+```ts
+const id = await db.attachments.create({ data: { content: hugeReport } });
+chat.response.write({ type: "data-report", data: { id, summary: shortSummary } });
+```
+<Tip>
+  Persist the large value **before** you emit the id chunk. If the chunk reaches the UI before the row is written, the frontend gets a 404 on the follow-up fetch.
+</Tip>
+## Transient UI parts
+For progress indicators or status data that should stream to the UI but not persist into the response message, use `chat.response.write` with `transient: true`. The chunk still travels on the chat stream (so the 1 MiB per-record cap still applies), but it never lands in `responseMessage` or `uiMessages`:
+```ts
+chat.response.write({
+  type: "data-progress",
+  data: { percent: 50 },
+  transient: true,
+});
+```
+For genuinely high-volume diagnostic data (per-token traces, large debug dumps), don't try to ship it through the realtime stream at all. Log to your own store (DB, object storage, OTel logger) and surface it through a separate UI route that isn't tied to the chat session.
+## What does **not** trigger the cap
+These calls don't go through the realtime stream and have no per-record cap:
+- [`chat.history.set` / `slice` / `replace` / `remove`](/ai-chat/backend#chat-history) — locals-only mutations on the in-memory message list.
+- [`chat.inject`](/ai-chat/background-injection#chat-inject) — appends to the run's pending message queue, not the stream.
+- [`chat.defer`](/ai-chat/background-injection#chat-defer-standalone) — promise registry; awaited at turn boundaries, never serialized to the stream.
+The control markers `chat.agent` emits internally (`trigger:turn-complete`, `trigger:upgrade-required`) are tiny by construction.
+## See also
+- [Error handling](/ai-chat/error-handling) — how `ChatChunkTooLargeError` flows through the layers.
+- [Database persistence](/ai-chat/patterns/database-persistence) — your own store as the durable backing for ID references.
+- [Client protocol](/ai-chat/client-protocol) — chunk shapes that travel on the chat stream.
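
The per-record cap described in large-payloads.mdx can also be checked before a write is attempted. The following is a minimal sketch, assuming the cap applies to the UTF-8 size of the JSON-serialized chunk and reusing the 1047552-byte figure quoted in the error message above. `jsonByteSize` and `checkChunkSize` are hypothetical helpers, not SDK exports, and the real record may carry extra envelope bytes, so treat the result as an approximation.

```typescript
// Hypothetical pre-flight size check; not part of @trigger.dev/sdk.
// Assumed cap: the 1047552-byte value quoted in the ChatChunkTooLargeError message.
const ASSUMED_MAX_RECORD_BYTES = 1_047_552;

/** UTF-8 byte length of the JSON-serialized value. */
function jsonByteSize(value: unknown): number {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

type SizeCheck =
  | { ok: true; bytes: number }
  | { ok: false; bytes: number; overBy: number };

/** Compares the serialized size of a chunk against the assumed cap. */
function checkChunkSize(
  chunk: unknown,
  maxBytes: number = ASSUMED_MAX_RECORD_BYTES,
): SizeCheck {
  const bytes = jsonByteSize(chunk);
  return bytes <= maxBytes
    ? { ok: true, bytes }
    : { ok: false, bytes, overBy: bytes - maxBytes };
}
```

A tool could call `checkChunkSize(result)` before returning and fall back to the ID-reference pattern when `ok` is `false`, rather than letting the run fail on the write.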

package/docs/ai-chat/patterns/oom-resilience.mdx ADDED

@@ -0,0 +1,120 @@
+---
+title: "OOM resilience"
+sidebarTitle: "OOM resilience"
+description: "Recover from out-of-memory errors mid-turn by automatically retrying the failed turn on a larger machine — without losing the in-flight user message or re-processing completed turns."
+---
+import RcBanner from "/snippets/ai-chat-rc-banner.mdx";
+<RcBanner />
+When a `chat.agent` turn runs out of memory, the worker process dies and everything in it is gone: the in-flight LLM call, the accumulator, any tool execution mid-flight. By default, Trigger.dev surfaces the OOM as a run failure.
+Setting `oomMachine` opts the agent into automatic recovery: the failed turn re-runs on a larger machine, picks up the user message that triggered the OOM (without re-processing earlier completed turns), and produces a normal response.
+## Setup
+```ts
+import { chat } from "@trigger.dev/sdk/ai";
+export const myChat = chat.agent({
+  id: "my-chat",
+  machine: "small-1x",         // default machine
+  oomMachine: "medium-2x",     // fallback on OOM
+  run: async ({ messages, signal }) =>
+    streamText({ model, messages, abortSignal: signal }),
+});
+```
+That's the entire opt-in. With `oomMachine` set, the agent gets:
+- **`retry.maxAttempts: 2`** internally — one retry for OOM only; non-OOM errors don't retry.
+- **`retry.outOfMemory.machine: oomMachine`** — the fresh attempt boots on the larger machine.
+- **`session.in` cursor recovery** — the new attempt skips records belonging to turns that already completed on the prior attempt and only re-runs the OOM'd turn.
+`chat.agent` does not expose generic `retry` options. OOM recovery is the only retry path because retrying an LLM-driven loop on non-OOM errors tends to be expensive and side-effecting. Drop down to a [raw `task()` with chat primitives](/ai-chat/custom-agents) if you need richer retry semantics.
+## How recovery works
+The recovery doesn't need any customer-side persistence to avoid duplicate processing. It uses two pieces of durable state Trigger already maintains for every chat:
+- **`session.out`** — the durable response stream. Every successful turn writes a `trigger:turn-complete` chunk here.
+- **`session.in`** — the durable input stream. Every user message after the first turn lands here as a record with a server-assigned timestamp.
+On retry boot, the SDK:
+1. Scans `session.out` for the latest `trigger:turn-complete` chunk and reads its timestamp. Call this `T_last_complete`.
+2. Sets a per-stream filter on `session.in` so any record with `timestamp <= T_last_complete` is dropped before it reaches the turn loop.
+3. Begins normal processing. The first record that passes the filter is the message that triggered the OOM (or any newer message that arrived during the retry window).
+Result: turns 1..N-1 are not re-processed, turn N runs on the larger machine, and the conversation continues.
+```mermaid
+sequenceDiagram
+  participant User
+  participant Run as chat.agent run
+  participant SessionIn as session.in
+  participant SessionOut as session.out
+  User->>SessionIn: u2 (turn 2)
+  Run->>SessionIn: read u2
+  Run->>SessionOut: turn-complete (T1)
+  User->>SessionIn: u3 (turn 3)
+  Run->>SessionIn: read u3
+  Run->>SessionOut: turn-complete (T2)
+  User->>SessionIn: u4 (turn 4)
+  Run->>SessionIn: read u4
+  Note over Run: OOM mid-turn
+  Run->>Run: ⚠️ killed
+  Note over Run: Attempt 2 boots on oomMachine
+  Run->>SessionOut: scan → T_last_complete = T2
+  Run->>SessionIn: read with filter (ts > T2)
+  SessionIn-->>Run: u2 (filtered, ts <= T2)
+  SessionIn-->>Run: u3 (filtered, ts <= T2)
+  SessionIn-->>Run: u4 (passes — the OOM'd turn)
+  Run->>SessionOut: turn 4 complete
+```
+The scan on `session.out` is streaming and bounded in memory: each chunk is inspected and discarded one at a time, so a long-running chat doesn't bloat the retry-boot worker. Bandwidth scales linearly with `session.out` size, but only on the OOM-retry path — a rare event.
+## With `hydrateMessages`
+If your agent uses [`hydrateMessages`](/ai-chat/lifecycle-hooks#hydratemessages) to load the durable conversation history per turn, the OOM'd turn re-runs against the full prior accumulator: the model sees `[u1, a1, u2, a2, ..., u_N]` and responds in context. This is the recommended pattern for production chats.
+## Without `hydrateMessages`
+Recovery boot reconstructs context automatically. The boot reads both the durable snapshot (settled turns) and the `session.out` tail past the snapshot cursor (the partial assistant chunks the OOM'd turn streamed before dying). When the new attempt processes the OOM'd user message, the model sees the full prior conversation **plus** the partial assistant message that was cut off, so a "keep going" follow-up continues naturally, and any other follow-up has the same context the original turn had.
+`hydrateMessages` is still the right choice if you want a single source of truth in your own database (branching conversations, message-level access control, etc.). It's no longer required for OOM continuity.
+For full control over recovery — drop the partial, synthesize tool results for an interrupted tool call, emit a recovery banner to the UI — register [`onRecoveryBoot`](/ai-chat/patterns/recovery-boot).
+## Tool execute idempotency
+If an OOM hits mid-tool-execution, the new attempt re-runs the entire turn — including the tool call. Make tool `execute` functions idempotent or checkpoint their progress externally. Trigger doesn't roll back side effects automatically.
+```ts
+import { tool } from "ai";
+import { z } from "zod";
+export const sendEmail = tool({
+  description: "Send an email",
+  inputSchema: z.object({ to: z.string(), idempotencyKey: z.string() }),
+  execute: async ({ to, idempotencyKey }) => {
+    // Stripe-style: dedupe at the side-effect layer with a customer-supplied key.
+    return await mailer.send({ to, idempotencyKey });
+  },
+});
+```
+## Limitations
+- **One OOM retry per run.** `chat.agent` sets `maxAttempts: 2`. If attempt 2 also OOMs, the run fails. Use a sufficiently large `oomMachine` to avoid this.
+- **Single fallback tier.** Only one `oomMachine`. There's no "tiered retry" (small → medium → large). If you need that, drop down to a [raw `task()` with chat primitives](/ai-chat/custom-agents) and configure `retry` directly.
+- **Non-OOM errors don't retry.** Schema errors, model-call rejections, tool throws, etc. fail the run as before. Out-of-memory is the only retry trigger.
+- **Tools mid-execution are not checkpointed.** A partially-run tool re-runs from scratch on the new attempt. Make them idempotent.
+## See also
+- [Recovery boot](/ai-chat/patterns/recovery-boot) — the underlying hook + smart default that gives OOM recovery its full-context behavior
+- [Lifecycle hooks](/ai-chat/lifecycle-hooks) — `onChatResume` fires on every retry attempt with `phase: "preload"` or `"turn"`
+- [Database persistence](/ai-chat/patterns/database-persistence) — the `hydrateMessages` pattern for branching, ACL, and DB-as-source-of-truth scenarios
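
The cursor recovery described under "How recovery works" in oom-resilience.mdx comes down to a high-water mark and a filter. The sketch below illustrates that logic only. The `OutChunk` and `InRecord` shapes and both function names are assumptions made for illustration, and in-memory arrays stand in for the durable streams; this is not the SDK's implementation.

```typescript
// Illustrative model of the session.in cursor filter; all names are hypothetical.
interface OutChunk {
  type: string;
  timestamp: number; // server-assigned, milliseconds
}

interface InRecord {
  id: string;
  timestamp: number; // server-assigned, milliseconds
}

/** Scans session.out one chunk at a time and keeps only the latest turn-complete timestamp. */
function lastTurnCompleteTimestamp(out: Iterable<OutChunk>): number | undefined {
  let last: number | undefined;
  for (const chunk of out) {
    if (chunk.type === "trigger:turn-complete") {
      last = last === undefined ? chunk.timestamp : Math.max(last, chunk.timestamp);
    }
  }
  return last;
}

/** Drops every session.in record at or before the cutoff; keeps everything when no turn completed. */
function recordsToProcess(records: InRecord[], cutoff: number | undefined): InRecord[] {
  if (cutoff === undefined) return records;
  return records.filter((record) => record.timestamp > cutoff);
}

// Mirrors the sequence diagram: turns 2 and 3 completed, turn 4 hit the OOM.
const sessionOut: OutChunk[] = [
  { type: "text-delta", timestamp: 12 },
  { type: "trigger:turn-complete", timestamp: 15 },
  { type: "text-delta", timestamp: 22 },
  { type: "trigger:turn-complete", timestamp: 25 },
  { type: "text-delta", timestamp: 32 },
];
const sessionIn: InRecord[] = [
  { id: "u2", timestamp: 10 },
  { id: "u3", timestamp: 20 },
  { id: "u4", timestamp: 30 },
];

const cutoff = lastTurnCompleteTimestamp(sessionOut);
const pending = recordsToProcess(sessionIn, cutoff);
```

Because the scan keeps a single number rather than the chunks themselves, memory stays constant regardless of how long the chat is, which matches the bounded-memory behavior the page describes.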

package/docs/ai-chat/patterns/persistence-and-replay.mdx ADDED

@@ -0,0 +1,211 @@
+---
+title: "Persistence and replay"
+sidebarTitle: "Persistence and replay"
+description: "How chat.agent rebuilds conversation history at run boot — durable JSON snapshot in object storage plus session.out replay, with a hydrateMessages short-circuit for backend-owned history."
+---
+import RcBanner from "/snippets/ai-chat-rc-banner.mdx";
+<RcBanner />
+`chat.agent` runs are processes — they boot, stream a turn, and either suspend (waiting for the next message) or exit. When the next message arrives at a session whose previous run already exited, a **fresh** run boots with no in-memory state. Something has to rebuild the conversation history before that turn can produce a coherent response.
+This page walks through the **snapshot + replay** model the runtime uses by default, and the [`hydrateMessages`](/ai-chat/lifecycle-hooks#hydratemessages) short-circuit that turns the whole thing off when the customer owns history.
+## Why a snapshot at all
+The wire is delta-only: each `.in/append` carries at most one new `UIMessage` (see [Client Protocol](/ai-chat/client-protocol#chattaskwirepayload)). A long conversation might be 50 turns deep with megabytes of tool results — the wire never carries that. So when run #2 boots to handle turn 51, the wire alone tells it almost nothing about turns 1–50.
+Two existing pieces of durable state already capture everything that happened:
+- **`session.in`** — every user message and tool-approval response ever sent.
+- **`session.out`** — every assistant token, tool call, and tool result the agent emitted, ordered.
+Replaying `session.out` from the beginning is correct but expensive — bandwidth scales with chat length, and parsing N megabytes of streamed chunks at every boot adds latency. So the runtime writes a **snapshot** after every turn and reads it on the next boot. Replay only covers the gap between the snapshot's cursor and now.
+## The model end-to-end
+```mermaid
+sequenceDiagram
+  participant User
+  participant Run1 as Run 1 (turn 1)
+  participant Snapshot as Object storage
+  participant SessionOut as session.out
+  participant Run2 as Run 2 (turn 2+)
+  User->>Run1: u1
+  Run1->>SessionOut: assistant chunks for a1
+  Run1->>Run1: onTurnComplete
+  Run1->>Snapshot: write { messages: [u1, a1], lastOutEventId, lastOutTimestamp }
+  Note over Run1: idle suspend (or exit)
+  User->>Run2: u2 (delta only)
+  Run2->>Snapshot: read snapshot
+  Run2->>SessionOut: subscribe(lastEventId, wait=0)
+  SessionOut-->>Run2: (empty — nothing since snapshot)
+  Note over Run2: accumulator = [u1, a1]
+  Run2->>Run2: append u2 from wire
+  Run2->>SessionOut: assistant chunks for a2
+  Run2->>Run2: onTurnComplete
+  Run2->>Snapshot: write { messages: [u1, a1, u2, a2], ... }
+```
+### Run 1 — first turn
+The accumulator starts empty. The wire delivers `u1`. After the model finishes, `onTurnComplete` fires, then the runtime serializes the full accumulator and writes:
+```json
+{
+  "version": 1,
+  "savedAt": 1715180400000,
+  "messages": [u1, a1],
+  "lastOutEventId": "42",
+  "lastOutTimestamp": 1715180399000
+}
+```
+The key is `packets/{projectRef}/{envSlug}/sessions/{sessionId}/snapshot.json` — overwritten every turn, never appended. The write is **awaited**, not fire-and-forget — if the run idle-suspends immediately after, in-flight promises don't reliably complete and the snapshot would be lost.
+### Run 2 — boot
+A new run boots when the user sends `u2`. Run 1 has long since exited. Run 2 has no in-memory state. The boot sequence:
+<Steps>
+  <Step title="Read the snapshot">
+    GET the JSON blob. On a 404 (no snapshot yet, as on the first-ever turn), a read error, or a version mismatch, treat the snapshot as empty and continue. Snapshot misses are non-fatal because replay alone may still be sufficient.
+  </Step>
+  <Step title="Replay session.out tail">
+    Subscribe to `session.out` with `wait=0` starting from `snapshot.lastOutEventId`. Drain whatever's there and close. Returns:
+    - **Settled messages** — closed assistant turns past the snapshot cursor (the chunks of a turn that completed after the snapshot was written but before the run exited cleanly).
+    - **A partial assistant** — the trailing message if its stream never received a `finish` chunk. The dead run was mid-response when it died. `cleanupAbortedParts` has already stripped streaming-in-progress fragments.
+    In the steady state this returns empty. In recovery, it returns whatever the dead run was in the middle of.
+  </Step>
+  <Step title="Replay session.in tail">
+    GET `session.in` records past the last `turn-complete`'s `session-in-event-id` cursor. Returns the user messages the dead run hadn't acknowledged — typically the message that triggered the cancelled / crashed turn, plus anything the customer typed after.
+  </Step>
+  <Step title="Reconstruct the chain (smart default)">
+    Snapshot messages merge with the settled replay (replay wins on `id` collision). Then:
+    - If there's a partial assistant **and** at least one in-flight user message, splice `[firstInFlightUser, partialAssistant]` onto the end of the chain. The model sees the prior turn's incomplete attempt and can continue, abandon, or pivot based on the next user message.
+    - Remaining in-flight users dispatch as fresh turns after the recovered first one.
+    - If there's no partial OR no in-flight users, the chain is just the settled chain and any in-flight users dispatch normally.
+    Customers can override this entirely via [`onRecoveryBoot`](/ai-chat/patterns/recovery-boot).
+  </Step>
+  <Step title="Append the new wire message">
+    Append `u2` from the wire payload, exactly as on turn 1.
+  </Step>
+</Steps>
+The model now sees `[u1, a1, u2]` and produces `a2`. After `onTurnComplete`, the runtime overwrites the snapshot with `[u1, a1, u2, a2]` and the cycle repeats.
+### Crash mid-turn — replay carries the load
+Suppose Run 1's turn 1 streams partial assistant chunks to `session.out` and then crashes (OOM, exception, server-side cancel) before `onTurnComplete` fires. No snapshot was written. The next run boots and:
+1. Snapshot read returns 404 → empty.
+2. `session.out` tail replay picks up the partial assistant chunks emitted before the crash. `cleanupAbortedParts` strips streaming-in-progress fragments but keeps the cleaned trailing message as the `partialAssistant`.
+3. `session.in` tail replay finds the user message the dead run was answering (no `turn-complete` was written, so the cursor never advanced past it).
+4. Smart default splices `[firstInFlightUser, partialAssistant]` onto the chain. Any later user messages (including the customer's follow-up) dispatch as fresh turns.
+5. The model sees full prior context and responds in kind — continuing a cut-off essay on "keep going", answering a fresh question on "actually, what's 7+8?", abandoning the prior work on "scrap that, do X instead".
+Replay carries the conversation across the crash boundary with zero customer code. For policies different from "preserve context" — drop the partial entirely, synthesize tool results for an interrupted tool call, write a recovery banner to the UI — register [`onRecoveryBoot`](/ai-chat/patterns/recovery-boot).
+## OOM-retry interaction
+The runtime already had an OOM-retry path that scans `session.out` for the latest `trigger:turn-complete` timestamp to use as a cutoff for `session.in` (so the retry doesn't re-process completed turns — see [OOM resilience](/ai-chat/patterns/oom-resilience)). The snapshot includes a `lastOutTimestamp` field that is exactly that high-water mark.
+When a snapshot exists, the OOM-retry path reads `lastOutTimestamp` directly instead of scanning `session.out`, which saves one stream subscription per retry at no extra cost.
+If no snapshot exists (first turn, or `hydrateMessages` registered), the path falls back to the scan.
+## Action turns — no snapshot write
+[Action turns](/ai-chat/actions) (`trigger: "action"`) don't fire `onTurnComplete` — they fire `onAction` only. The snapshot write site is gated on `onTurnComplete`, so action turns don't snapshot.
+If `onAction` mutates `chat.history.*` and the run then crashes before the next regular turn, the mutation is lost and the user has to re-fire the action. This matches `chat.history` semantics in general: mutations are persisted at turn boundaries, not action boundaries.
+## The `hydrateMessages` short-circuit
+When the customer registers a [`hydrateMessages`](/ai-chat/lifecycle-hooks#hydratemessages) hook, the runtime trusts the hook to be the source of truth for history. Snapshot read and replay are **skipped entirely** at boot. The hook fires per turn, returns the canonical chain from the customer's database, and the accumulator is set to whatever the hook returned.
+```ts
+import { chat, upsertIncomingMessage } from "@trigger.dev/sdk/ai";
+import { streamText } from "ai";
+import { anthropic } from "@ai-sdk/anthropic";
+import { db } from "@/lib/db";
+export const myChat = chat.agent({
+  id: "my-chat",
+  hydrateMessages: async ({ chatId, trigger, incomingMessages }) => {
+    const stored = (await db.chat.findUnique({ where: { id: chatId } }))?.messages ?? [];
+    // See lifecycle-hooks for the full upsert pattern + rationale:
+    // /ai-chat/lifecycle-hooks#hydratemessages
+    if (upsertIncomingMessage(stored, { trigger, incomingMessages })) {
+      // Upsert, not update: head-start first turns run without a preload
+      // to create the row.
+      await db.chat.upsert({
+        where: { id: chatId },
+        create: { id: chatId, messages: stored },
+        update: { messages: stored },
+      });
+    }
+    return stored;
+  },
+  onTurnComplete: async ({ chatId, uiMessages }) => {
+    await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } });
+  },
+  run: async ({ messages, signal }) => {
+    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
+  },
+});
+```
+What you gain:
+- **Zero object-store traffic per turn.** No snapshot read, no snapshot write, no replay subscription. `OBJECT_STORE_*` env vars don't have to be set.
+- **Branching, undo, edit, abuse prevention** — patterns that need a backend-side single source of truth work naturally because the customer mediates every read.
+What you give up:
+- **You own persistence end-to-end.** A bug in `hydrateMessages` that returns the wrong chain corrupts the conversation visible to the model.
+- **OOM-retry needs a `session.out` scan again** because there's no snapshot to short-circuit it. (Same as the pre-snapshot baseline — not a regression, just a missed optimization.)
+The runtime's snapshot+replay is the safer default. `hydrateMessages` is the right choice when you already have authoritative storage for messages and want one consistent persistence path.
+## When neither is configured
+If `hydrateMessages` is not registered **and** no object store is configured, conversations don't survive run boundaries. A continuation boots empty. The runtime logs a warning at agent registration time so you see this at deploy time, not at user-traffic time.
+For local development this is sometimes fine — you're not testing continuations. For production it isn't. Configure one of:
+- **Object store** (`OBJECT_STORE_*` env vars on your webapp) — easiest, default behavior.
+- **`hydrateMessages` + your own database** — stronger control, suits multi-tenant apps with audit needs.
+## Snapshot key & lifecycle
+| Field | Value |
+|---|---|
+| Bucket | Whatever `OBJECT_STORE_BASE_URL` points to |
+| Key prefix | `packets/{projectRef}/{envSlug}/` (server-prefixed) |
+| Key suffix | `sessions/{sessionId}/snapshot.json` |
+| Final key | `packets/{projectRef}/{envSlug}/sessions/{sessionId}/snapshot.json` |
+| Size | Tens of KB typical, capped only by object-store limits |
+| Cadence | Overwritten after every successful `onTurnComplete` |
+Snapshots accumulate per-session forever unless you set a lifecycle policy on the bucket. A 90-day expiry on `packets/*/sessions/*/snapshot.json` is a reasonable default if your chats don't typically resume after that window. Closed sessions are not auto-cleaned today.
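For auditing or cleanup scripts, the key layout in the table can be reproduced with a small helper. This is a sketch built only from the layout shown above: `snapshotKey` and `isSnapshotKey` are hypothetical helpers, not SDK exports, and the assumption that no segment contains a `/` is ours.

```typescript
// Hypothetical helpers mirroring the documented key layout:
// packets/{projectRef}/{envSlug}/sessions/{sessionId}/snapshot.json
function snapshotKey(projectRef: string, envSlug: string, sessionId: string): string {
  const prefix = `packets/${projectRef}/${envSlug}/`; // server-prefixed part
  const suffix = `sessions/${sessionId}/snapshot.json`;
  return prefix + suffix;
}

// Matches keys of that shape, assuming no segment contains a slash.
const SNAPSHOT_KEY_PATTERN = /^packets\/[^/]+\/[^/]+\/sessions\/[^/]+\/snapshot\.json$/;

function isSnapshotKey(key: string): boolean {
  return SNAPSHOT_KEY_PATTERN.test(key);
}
```

Some stores filter lifecycle rules by plain prefix rather than by wildcard. In that case a matcher like this can drive a scheduled cleanup job instead.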
+### MinIO and S3-compatible stores
+Snapshot read/write reuses the same object-store layer as Trigger.dev's existing large-payload routes. Anything that already works for large payloads — AWS S3, MinIO (self-host or local development), Cloudflare R2, Tigris, Backblaze B2 — works for snapshots too. `OBJECT_STORE_DEFAULT_PROTOCOL` controls the routing (`s3`, `minio`, etc.) and the SDK picks the right driver automatically. No snapshot-specific config.
+For local development against `pnpm run docker`, the bundled MinIO container is enough — set `OBJECT_STORE_DEFAULT_PROTOCOL=minio` and the standard MinIO env vars on the webapp, and continuations work end-to-end against a local stack.
+## See also
+- [Client Protocol](/ai-chat/client-protocol#how-history-is-rebuilt) — the wire-level view of the same model
+- [`hydrateMessages`](/ai-chat/lifecycle-hooks#hydratemessages) — the short-circuit hook
+- [OOM resilience](/ai-chat/patterns/oom-resilience) — how `session.in` cutoffs interact with snapshots
+- [Database persistence](/ai-chat/patterns/database-persistence) — the canonical persistence pattern using `onTurnComplete`
+- [v4.5 upgrade guide](/ai-chat/upgrade-guide#v45-wire-format-change) — when this model landed and what changed