@trigger.dev/sdk 4.5.0-rc.5 → 4.5.0-rc.7
- package/dist/commonjs/v3/ai.d.ts +178 -5
- package/dist/commonjs/v3/ai.js +603 -119
- package/dist/commonjs/v3/ai.js.map +1 -1
- package/dist/commonjs/v3/chat-client.js +3 -0
- package/dist/commonjs/v3/chat-client.js.map +1 -1
- package/dist/commonjs/v3/chat-react.js +10 -7
- package/dist/commonjs/v3/chat-react.js.map +1 -1
- package/dist/commonjs/v3/chat-server.d.ts +8 -0
- package/dist/commonjs/v3/chat-server.js +32 -10
- package/dist/commonjs/v3/chat-server.js.map +1 -1
- package/dist/commonjs/v3/chat-server.test.js +51 -0
- package/dist/commonjs/v3/chat-server.test.js.map +1 -1
- package/dist/commonjs/v3/chat.js +34 -6
- package/dist/commonjs/v3/chat.js.map +1 -1
- package/dist/commonjs/v3/chat.test.js +53 -0
- package/dist/commonjs/v3/chat.test.js.map +1 -1
- package/dist/commonjs/v3/createStartSessionAction.test.js +30 -0
- package/dist/commonjs/v3/createStartSessionAction.test.js.map +1 -1
- package/dist/commonjs/v3/sessions.d.ts +11 -6
- package/dist/commonjs/v3/sessions.js +10 -5
- package/dist/commonjs/v3/sessions.js.map +1 -1
- package/dist/commonjs/v3/test/mock-chat-agent.d.ts +6 -0
- package/dist/commonjs/v3/test/mock-chat-agent.js +1 -0
- package/dist/commonjs/v3/test/mock-chat-agent.js.map +1 -1
- package/dist/commonjs/version.js +1 -1
- package/dist/esm/v3/ai.d.ts +178 -5
- package/dist/esm/v3/ai.js +603 -120
- package/dist/esm/v3/ai.js.map +1 -1
- package/dist/esm/v3/chat-client.js +3 -0
- package/dist/esm/v3/chat-client.js.map +1 -1
- package/dist/esm/v3/chat-react.js +10 -7
- package/dist/esm/v3/chat-react.js.map +1 -1
- package/dist/esm/v3/chat-server.d.ts +8 -0
- package/dist/esm/v3/chat-server.js +32 -10
- package/dist/esm/v3/chat-server.js.map +1 -1
- package/dist/esm/v3/chat-server.test.js +51 -0
- package/dist/esm/v3/chat-server.test.js.map +1 -1
- package/dist/esm/v3/chat.js +34 -6
- package/dist/esm/v3/chat.js.map +1 -1
- package/dist/esm/v3/chat.test.js +53 -0
- package/dist/esm/v3/chat.test.js.map +1 -1
- package/dist/esm/v3/createStartSessionAction.test.js +30 -0
- package/dist/esm/v3/createStartSessionAction.test.js.map +1 -1
- package/dist/esm/v3/sessions.d.ts +11 -6
- package/dist/esm/v3/sessions.js +10 -5
- package/dist/esm/v3/sessions.js.map +1 -1
- package/dist/esm/v3/test/mock-chat-agent.d.ts +6 -0
- package/dist/esm/v3/test/mock-chat-agent.js +1 -0
- package/dist/esm/v3/test/mock-chat-agent.js.map +1 -1
- package/dist/esm/version.js +1 -1
- package/docs/ai/prompts.mdx +430 -0
- package/docs/ai-chat/actions.mdx +115 -0
- package/docs/ai-chat/anatomy.mdx +71 -0
- package/docs/ai-chat/backend.mdx +817 -0
- package/docs/ai-chat/background-injection.mdx +221 -0
- package/docs/ai-chat/changelog.mdx +850 -0
- package/docs/ai-chat/chat-local.mdx +174 -0
- package/docs/ai-chat/client-protocol.mdx +1081 -0
- package/docs/ai-chat/compaction.mdx +411 -0
- package/docs/ai-chat/custom-agents.mdx +364 -0
- package/docs/ai-chat/error-handling.mdx +415 -0
- package/docs/ai-chat/fast-starts.mdx +672 -0
- package/docs/ai-chat/frontend.mdx +580 -0
- package/docs/ai-chat/how-it-works.mdx +230 -0
- package/docs/ai-chat/lifecycle-hooks.mdx +530 -0
- package/docs/ai-chat/mcp.mdx +101 -0
- package/docs/ai-chat/overview.mdx +90 -0
- package/docs/ai-chat/patterns/branching-conversations.mdx +284 -0
- package/docs/ai-chat/patterns/code-sandbox.mdx +126 -0
- package/docs/ai-chat/patterns/database-persistence.mdx +414 -0
- package/docs/ai-chat/patterns/human-in-the-loop.mdx +275 -0
- package/docs/ai-chat/patterns/large-payloads.mdx +169 -0
- package/docs/ai-chat/patterns/oom-resilience.mdx +120 -0
- package/docs/ai-chat/patterns/persistence-and-replay.mdx +211 -0
- package/docs/ai-chat/patterns/recovery-boot.mdx +230 -0
- package/docs/ai-chat/patterns/skills.mdx +221 -0
- package/docs/ai-chat/patterns/sub-agents.mdx +383 -0
- package/docs/ai-chat/patterns/tool-result-auditing.mdx +148 -0
- package/docs/ai-chat/patterns/trusted-edge-signals.mdx +337 -0
- package/docs/ai-chat/patterns/version-upgrades.mdx +172 -0
- package/docs/ai-chat/pending-messages.mdx +343 -0
- package/docs/ai-chat/prompt-caching.mdx +206 -0
- package/docs/ai-chat/quick-start.mdx +161 -0
- package/docs/ai-chat/reference.mdx +909 -0
- package/docs/ai-chat/server-chat.mdx +263 -0
- package/docs/ai-chat/sessions.mdx +333 -0
- package/docs/ai-chat/testing.mdx +682 -0
- package/docs/ai-chat/tools.mdx +191 -0
- package/docs/ai-chat/types.mdx +242 -0
- package/docs/ai-chat/upgrade-guide.mdx +515 -0
- package/docs/apikeys.mdx +54 -0
- package/docs/building-with-ai.mdx +261 -0
- package/docs/bulk-actions.mdx +49 -0
- package/docs/changelog.mdx +6 -0
- package/docs/cli-deploy-commands.mdx +9 -0
- package/docs/cli-dev-commands.mdx +9 -0
- package/docs/cli-dev.mdx +8 -0
- package/docs/cli-init-commands.mdx +58 -0
- package/docs/cli-introduction.mdx +25 -0
- package/docs/cli-list-profiles-commands.mdx +42 -0
- package/docs/cli-login-commands.mdx +33 -0
- package/docs/cli-logout-commands.mdx +33 -0
- package/docs/cli-preview-archive.mdx +59 -0
- package/docs/cli-promote-commands.mdx +9 -0
- package/docs/cli-switch.mdx +43 -0
- package/docs/cli-update-commands.mdx +42 -0
- package/docs/cli-whoami-commands.mdx +33 -0
- package/docs/community.mdx +6 -0
- package/docs/config/config-file.mdx +602 -0
- package/docs/config/extensions/additionalFiles.mdx +38 -0
- package/docs/config/extensions/additionalPackages.mdx +40 -0
- package/docs/config/extensions/aptGet.mdx +34 -0
- package/docs/config/extensions/audioWaveform.mdx +20 -0
- package/docs/config/extensions/custom.mdx +380 -0
- package/docs/config/extensions/emitDecoratorMetadata.mdx +29 -0
- package/docs/config/extensions/esbuildPlugin.mdx +31 -0
- package/docs/config/extensions/ffmpeg.mdx +45 -0
- package/docs/config/extensions/lightpanda.mdx +56 -0
- package/docs/config/extensions/overview.mdx +67 -0
- package/docs/config/extensions/playwright.mdx +195 -0
- package/docs/config/extensions/prismaExtension.mdx +1014 -0
- package/docs/config/extensions/puppeteer.mdx +30 -0
- package/docs/config/extensions/pythonExtension.mdx +182 -0
- package/docs/config/extensions/syncEnvVars.mdx +291 -0
- package/docs/context.mdx +235 -0
- package/docs/database-connections.mdx +213 -0
- package/docs/deploy-environment-variables.mdx +435 -0
- package/docs/deployment/atomic-deployment.mdx +172 -0
- package/docs/deployment/overview.mdx +257 -0
- package/docs/deployment/preview-branches.mdx +224 -0
- package/docs/errors-retrying.mdx +379 -0
- package/docs/github-actions.mdx +222 -0
- package/docs/github-integration.mdx +136 -0
- package/docs/github-repo.mdx +8 -0
- package/docs/help-email.mdx +6 -0
- package/docs/help-slack.mdx +11 -0
- package/docs/hidden-tasks.mdx +56 -0
- package/docs/how-it-works.mdx +454 -0
- package/docs/how-to-reduce-your-spend.mdx +217 -0
- package/docs/idempotency.mdx +504 -0
- package/docs/introduction.mdx +223 -0
- package/docs/limits.mdx +241 -0
- package/docs/logging.mdx +195 -0
- package/docs/machines.mdx +952 -0
- package/docs/manual-setup.mdx +632 -0
- package/docs/mcp-agent-rules.mdx +41 -0
- package/docs/mcp-introduction.mdx +385 -0
- package/docs/mcp-tools.mdx +273 -0
- package/docs/migrating-from-v3.mdx +334 -0
- package/docs/observability/dashboards.mdx +102 -0
- package/docs/observability/query.mdx +585 -0
- package/docs/open-source-contributing.mdx +16 -0
- package/docs/open-source-self-hosting.mdx +541 -0
- package/docs/private-networking/aws-console-setup.mdx +304 -0
- package/docs/private-networking/overview.mdx +144 -0
- package/docs/private-networking/troubleshooting.mdx +78 -0
- package/docs/queue-concurrency.mdx +354 -0
- package/docs/quick-start.mdx +97 -0
- package/docs/realtime/auth.mdx +208 -0
- package/docs/realtime/backend/overview.mdx +45 -0
- package/docs/realtime/backend/streams.mdx +418 -0
- package/docs/realtime/backend/subscribe.mdx +225 -0
- package/docs/realtime/how-it-works.mdx +94 -0
- package/docs/realtime/overview.mdx +63 -0
- package/docs/realtime/react-hooks/overview.mdx +73 -0
- package/docs/realtime/react-hooks/streams.mdx +449 -0
- package/docs/realtime/react-hooks/subscribe.mdx +674 -0
- package/docs/realtime/react-hooks/swr.mdx +87 -0
- package/docs/realtime/react-hooks/triggering.mdx +194 -0
- package/docs/realtime/react-hooks/use-wait-token.mdx +34 -0
- package/docs/realtime/run-object.mdx +174 -0
- package/docs/replaying.mdx +72 -0
- package/docs/request-feature.mdx +6 -0
- package/docs/roadmap.mdx +6 -0
- package/docs/run-tests.mdx +20 -0
- package/docs/run-usage.mdx +113 -0
- package/docs/runs/heartbeats.mdx +38 -0
- package/docs/runs/max-duration.mdx +139 -0
- package/docs/runs/metadata.mdx +734 -0
- package/docs/runs/priority.mdx +31 -0
- package/docs/runs.mdx +396 -0
- package/docs/self-hosting/docker.mdx +458 -0
- package/docs/self-hosting/env/supervisor.mdx +74 -0
- package/docs/self-hosting/env/webapp.mdx +276 -0
- package/docs/self-hosting/kubernetes.mdx +601 -0
- package/docs/self-hosting/overview.mdx +108 -0
- package/docs/skills.mdx +85 -0
- package/docs/tags.mdx +120 -0
- package/docs/tasks/overview.mdx +697 -0
- package/docs/tasks/scheduled.mdx +382 -0
- package/docs/tasks/schemaTask.mdx +413 -0
- package/docs/tasks/streams.mdx +884 -0
- package/docs/triggering.mdx +1320 -0
- package/docs/troubleshooting-alerts.mdx +385 -0
- package/docs/troubleshooting-debugging-in-vscode.mdx +8 -0
- package/docs/troubleshooting-github-issues.mdx +6 -0
- package/docs/troubleshooting-uptime-status.mdx +6 -0
- package/docs/troubleshooting.mdx +398 -0
- package/docs/upgrading-packages.mdx +80 -0
- package/docs/vercel-integration.mdx +207 -0
- package/docs/versioning.mdx +56 -0
- package/docs/video-walkthrough.mdx +23 -0
- package/docs/wait-for-token.mdx +540 -0
- package/docs/wait-for.mdx +42 -0
- package/docs/wait-until.mdx +53 -0
- package/docs/wait.mdx +18 -0
- package/docs/writing-tasks-introduction.mdx +33 -0
- package/package.json +10 -6
- package/skills/trigger-authoring-chat-agent/SKILL.md +296 -0
- package/skills/trigger-authoring-tasks/SKILL.md +254 -0
- package/skills/trigger-chat-agent-advanced/SKILL.md +368 -0
- package/skills/trigger-cost-savings/SKILL.md +116 -0
- package/skills/trigger-realtime-and-frontend/SKILL.md +276 -0
|
@@ -0,0 +1,230 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: "How it works"
|
|
3
|
+
sidebarTitle: "How it works"
|
|
4
|
+
description: "End-to-end mechanics of a chat.agent turn: the two durable channels per session, the long-lived task that reads and writes them, and how a chat survives refreshes, deploys, and idle gaps."
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
import RcBanner from "/snippets/ai-chat-rc-banner.mdx";
|
|
8
|
+
|
|
9
|
+
<RcBanner />
|
|
10
|
+
|
|
11
|
+
This page explains how `chat.agent` is put together, what each piece does on a single turn, and how a chat survives across turns. It is not an API tour — for that, see [Backend](/ai-chat/backend), [Frontend](/ai-chat/frontend), and the [Reference](/ai-chat/reference). For the byte-level wire format, see [Client Protocol](/ai-chat/client-protocol).
|
|
12
|
+
|
|
13
|
+
<Note>
|
|
14
|
+
**What you don't have to think about**: SSE reconnects, WebSocket backpressure, container cold starts, whether a worker is currently running, or how to re-deliver chunks the client missed during a reload. The platform handles those. **What you do have to think about**: idempotency in your `run()` function, and how much state you keep in memory between turns versus persist in your own database.
|
|
15
|
+
</Note>
|
|
16
|
+
|
|
17
|
+
## The primary noun: the chat session
|
|
18
|
+
|
|
19
|
+
A **chat session** is a stateful execution of an agent: two-way streaming plus durable compute, able to span multiple runs. It is the unit `chat.agent` owns, and it is three things bound together:
|
|
20
|
+
|
|
21
|
+
- An **inbox** channel called `.in` — every user message lands here as a record.
|
|
22
|
+
- An **outbox** channel called `.out` — every assistant chunk leaves through here.
|
|
23
|
+
- A long-lived **agent task** that reads from `.in` and writes to `.out`.
|
|
24
|
+
|
|
25
|
+
Both channels are S2 ([s2.dev](https://s2.dev)) durable append-only streams, keyed by the session. Think of them as a pair of per-session topics on a tiny Kafka: records have monotonically increasing sequence numbers, readers resume from a cursor, writers append to the tail. We chose S2 because reads are resumable from an offset — so a browser reload can replay the response stream without re-running the LLM, and a crashed run can rejoin mid-conversation by reading from where it left off.
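
The resumable-read property described above can be illustrated with a small in-memory model. This is only a sketch: `AppendOnlyLogSketch`, `append`, and `readAfter` are hypothetical names invented for illustration and are not the S2 client API.

```typescript
// Hypothetical in-memory model of one durable channel; not the S2 client API.
interface LogRecord<T> {
  seq: number; // monotonically increasing, assigned on append
  body: T;
}

class AppendOnlyLogSketch<T> {
  private records: LogRecord<T>[] = [];
  private nextSeq = 0;

  // Writers append to the tail; the log assigns the sequence number.
  append(body: T): number {
    const seq = this.nextSeq++;
    this.records.push({ seq, body });
    return seq;
  }

  // Readers resume from a cursor: every record strictly after `afterSeq`.
  // Omitting the cursor reads from the start of the log.
  readAfter(afterSeq?: number): LogRecord<T>[] {
    if (afterSeq === undefined) return this.records.slice();
    return this.records.filter((r) => r.seq > afterSeq);
  }
}

const outLog = new AppendOnlyLogSketch<string>();
outLog.append("Hel");
const seenUpTo = outLog.append("lo"); // the reader's cursor before a reload
outLog.append(", world");

// After the reload, only the record the reader missed is replayed.
const missedBodies = outLog.readAfter(seenUpTo).map((r) => r.body);
```

Because the reader supplies its own cursor, the writer never needs to know that a reader disconnected, which is the property that lets a reload replay the response without re-running the model.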
|
|
26
|
+
|
|
27
|
+
A chat ID identifies the session for the lifetime of the conversation. The same session can be served by **many runs**: one run handles a turn (or several), goes idle, eventually exits, and the next user message triggers a fresh continuation run on the same session. Sessions are the durable identity; runs are the ephemeral compute.
|
|
28
|
+
|
|
29
|
+
## The lifecycle states
|
|
30
|
+
|
|
31
|
+
A run moves through a small state machine over its lifetime. Each state is named below, with the trigger that moves it to the next.
|
|
32
|
+
|
|
33
|
+
### Cold start
|
|
34
|
+
|
|
35
|
+
There is no run yet for this session. The frontend's first `sendMessage` posts to the session's `.in` channel; the server sees no live `currentRunId` and triggers a fresh `chat.agent` run with `continuation: false`. Moves to **Streaming** as soon as the task wakes and begins consuming `.in`.
|
|
36
|
+
|
|
37
|
+
### Streaming
|
|
38
|
+
|
|
39
|
+
The agent task is running. It reads the new message off `.in`, fires `onTurnStart`, runs your `run()` function, and pipes `streamText()` chunks onto `.out`. The browser is SSE-subscribed to `.out` and renders chunks as they land. When `streamText()` ends, the task writes a `trigger:turn-complete` control record (an S2 record with an empty body and a special header) and immediately trims `.out` back to the *previous* turn's completion marker — keeping the outbox bounded to roughly one turn of chunks at steady state. Moves to **Idle** after `onTurnComplete` runs and the post-turn snapshot is written.
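
To make the trimming rule concrete, here is a minimal sketch of an outbox that writes a turn-complete marker and then drops everything before the previous marker. `OutboxSketch` and its methods are hypothetical, and the sketch assumes the trim keeps the previous marker itself; the real S2 trim semantics may differ in detail.

```typescript
const TURN_COMPLETE_HEADER = "trigger:turn-complete";

interface OutRecordSketch {
  seq: number;
  header?: string; // set only on control records
  body: string; // empty for the turn-complete control record
}

// Hypothetical model of the `.out` channel for one session.
class OutboxSketch {
  records: OutRecordSketch[] = [];
  private nextSeq = 0;
  private lastTurnCompleteSeq: number | undefined = undefined;

  appendChunk(body: string): void {
    this.records.push({ seq: this.nextSeq++, body });
  }

  // Write the new marker, then trim everything before the previous marker.
  completeTurn(): void {
    const previousMarker = this.lastTurnCompleteSeq;
    const markerSeq = this.nextSeq++;
    this.records.push({ seq: markerSeq, header: TURN_COMPLETE_HEADER, body: "" });
    this.lastTurnCompleteSeq = markerSeq;
    if (previousMarker !== undefined) {
      this.records = this.records.filter((r) => r.seq >= previousMarker);
    }
  }
}

const outbox = new OutboxSketch();
outbox.appendChunk("It is "); // seq 0
outbox.appendChunk("sunny."); // seq 1
outbox.completeTurn(); // seq 2, nothing to trim yet
outbox.appendChunk("Tomorrow: "); // seq 3
outbox.appendChunk("rain."); // seq 4
outbox.completeTurn(); // seq 5, trims seq 0 and 1

const retainedSeqs = outbox.records.map((r) => r.seq); // 2, 3, 4, 5
```

Trimming to the previous marker rather than the new one means a browser that is still reading the turn that just finished can always find its chunks.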
|
|
40
|
+
|
|
41
|
+
### Idle (awaiting next message)
|
|
42
|
+
|
|
43
|
+
The turn is over. The task is alive but not doing work — it is parked in a waitpoint on `.in`, waiting for the next user message. If one arrives, it goes back to **Streaming** for the next turn. If `idleTimeoutInSeconds` (30 seconds by default) passes with no new message, it moves to **Suspended**.
|
|
44
|
+
|
|
45
|
+
### Suspended
|
|
46
|
+
|
|
47
|
+
The task fires `onChatSuspend`, then the engine **checkpoints** the run's whole process state and frees the compute. The session is still live (the row exists, the `.out` stream is still readable, the chat ID still works), but no machine is dedicated to it. This is the same Checkpoint-Resume System that powers every Trigger.dev task — covered in detail at [How it works → Checkpoint-Resume](/how-it-works#the-checkpoint-resume-system). Moves to **Resuming** when the next message lands in `.in`.
|
|
48
|
+
|
|
49
|
+
### Resuming
|
|
50
|
+
|
|
51
|
+
The engine restores the suspended run from its checkpoint. The same JS process picks up exactly where it parked — `chat.local` values, the accumulator, in-flight promises, in-memory caches all preserved as they were. `onChatResume` fires immediately after the restore, then the task transitions to **Streaming**. No boot work, no snapshot read, no SDK reinitialization. This is the cheap path.
|
|
52
|
+
|
|
53
|
+
### Continuation (after exit)
|
|
54
|
+
|
|
55
|
+
If the run has fully exited (because it hit `maxTurns`, your code called `chat.endRun()` or `chat.requestUpgrade()`, or it was cancelled or crashed), the next user message can't resume it: there is nothing to resume. Instead, the server triggers a brand-new run with `continuation: true`. The new run does a cold boot, reads the prior conversation's S3 snapshot, replays any `.out` chunks after the snapshot cursor, and replays any `.in` records past the last `turn-complete` cursor (the user messages the dead run never acknowledged). If the predecessor died mid-stream and left a partial assistant response in `.out`, the smart default splices `[firstInFlightUser, partialAssistant]` onto the chain so any follow-up has full context; see [Recovery boot](/ai-chat/patterns/recovery-boot). The new run then enters **Streaming** with `turn === 0` for the new run but `messageCount > 0`.
|
|
56
|
+
|
|
57
|
+
### Closed
|
|
58
|
+
|
|
59
|
+
`POST /api/v1/sessions/:id/close` flips `closedAt` on the session row. Future appends are rejected. Reads still work for transcript viewing. The session is terminal.
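
The states above can be summarized as a transition table. The following is a sketch, not the engine's implementation: the event names are invented for illustration, and `exited` is a helper state added here (an assumption) to represent a run that has ended before the next message triggers a continuation.

```typescript
type ChatState =
  | "cold-start" | "streaming" | "idle" | "suspended"
  | "resuming" | "exited" | "continuation" | "closed";

// Hypothetical event names; the page describes triggers only in prose.
type ChatEvent =
  | "task-started" | "turn-complete" | "message-arrived" | "idle-timeout"
  | "checkpoint-restored" | "run-exited" | "continuation-booted" | "session-closed";

const chatTransitions: Record<ChatState, Partial<Record<ChatEvent, ChatState>>> = {
  "cold-start": { "task-started": "streaming" },
  streaming: { "turn-complete": "idle", "run-exited": "exited" },
  idle: { "message-arrived": "streaming", "idle-timeout": "suspended", "run-exited": "exited" },
  suspended: { "message-arrived": "resuming", "run-exited": "exited" },
  resuming: { "checkpoint-restored": "streaming" },
  exited: { "message-arrived": "continuation" },
  continuation: { "continuation-booted": "streaming" },
  closed: {}, // terminal: no event leaves this state
};

function nextChatState(state: ChatState, event: ChatEvent): ChatState {
  if (event === "session-closed") return "closed";
  const next = chatTransitions[state][event];
  if (next === undefined) {
    throw new Error("invalid transition: " + state + " on " + event);
  }
  return next;
}

// Walk the suspend/resume path: cold start, one turn, idle timeout, new message.
const eventsSeen: ChatEvent[] = [
  "task-started", "turn-complete", "idle-timeout", "message-arrived", "checkpoint-restored",
];
let currentState: ChatState = "cold-start";
const visited: ChatState[] = [currentState];
for (const e of eventsSeen) {
  currentState = nextChatState(currentState, e);
  visited.push(currentState);
}
```

Encoding the transitions as data makes it easy to see that a message arriving in **Idle** and a message arriving in **Suspended** take different paths back to **Streaming**.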
|
|
60
|
+
|
|
61
|
+
## One turn, end to end
|
|
62
|
+
|
|
63
|
+
Here is a typical cold turn — user opens the page, types "What's the weather?", reads the response — traced through every component.
|
|
64
|
+
|
|
65
|
+
<Steps>
|
|
66
|
+
<Step title="Browser: useChat calls transport.sendMessages">
|
|
67
|
+
The Vercel AI SDK's `useChat` hook serializes the user's message into the slim wire format: `{ chatId, trigger: "submit-message", message, metadata }`. Only the new message goes on the wire, not the full history.
|
|
68
|
+
</Step>
|
|
69
|
+
<Step title="Browser: transport posts to /append">
|
|
70
|
+
The transport calls `POST /realtime/v1/sessions/:chatId/in/append`, authenticated with the session's public access token. The body is one S2 record.
|
|
71
|
+
</Step>
|
|
72
|
+
<Step title="Server: route ensures a run exists">
|
|
73
|
+
The append route resolves the session, then calls `ensureRunForSession()`. The session's `currentRunId` is null (cold start), so it triggers a new `chat.agent` run on the project's dev/prod environment and atomically claims the slot via an optimistic version counter.
|
|
74
|
+
</Step>
|
|
75
|
+
<Step title="Server: route appends the record to S2 .in">
|
|
76
|
+
The route writes the message to `s2://sessions/:chatId/in` as a single record. S2 assigns a sequence number. Any waitpoints registered on this channel fire, which would wake an existing run — but there is no run waiting yet, so this is a no-op for now.
|
|
77
|
+
</Step>
|
|
78
|
+
<Step title="Browser: transport opens an SSE subscription to .out">
|
|
79
|
+
In parallel with the send, the transport opens `GET /realtime/v1/sessions/:chatId/out` (server-sent events). It passes its `lastEventId` if it has one cached; on a brand-new chat it does not. Any chunks the agent writes from now on will be delivered to this stream.
|
|
80
|
+
</Step>
|
|
81
|
+
<Step title="Task: agent run boots">
|
|
82
|
+
The newly-triggered run starts. `onBoot` fires once per worker process. Because this is a fresh chat, no snapshot is read.
|
|
83
|
+
</Step>
|
|
84
|
+
<Step title="Task: enters the turn loop, reads the message from .in">
|
|
85
|
+
The agent reads the pending record off `.in` via a waitpoint. `onChatStart` fires (once per chat lifetime). `onTurnStart` fires (every turn).
|
|
86
|
+
</Step>
|
|
87
|
+
<Step title="Task: runs your run() function, streams chunks to .out">
|
|
88
|
+
Your code calls `streamText({ model, messages })`. Each `UIMessageChunk` it produces is appended to `s2://sessions/:chatId/out` as a record. The browser sees them arrive on the SSE stream and the AI SDK renders them.
|
|
89
|
+
</Step>
|
|
90
|
+
<Step title="Task: writes the turn-complete control record">
|
|
91
|
+
When `streamText()` finishes, the agent writes a record with header `trigger:turn-complete` and an empty body. The browser transport sees this header and closes the per-turn readable stream.
|
|
92
|
+
</Step>
|
|
93
|
+
<Step title="Task: trims .out back to the previous turn-complete">
|
|
94
|
+
Immediately after writing the new turn-complete marker, the agent issues an S2 trim command targeting the *previous* turn-complete's sequence number. This bounds the stream's storage to roughly one turn of chunks plus the latest control record.
|
|
95
|
+
</Step>
|
|
96
|
+
<Step title="Task: fires onTurnComplete, writes snapshot to S3">
|
|
97
|
+
`onTurnComplete` runs (your hook for persistence). Then the agent writes `ChatSnapshotV1` — `{ version: 1, messages, lastOutEventId, lastOutTimestamp }` — to S3 at `sessions/:chatId/snapshot.json`. This write is awaited, not fire-and-forget, so the next run is guaranteed to find it.
|
|
98
|
+
</Step>
|
|
99
|
+
<Step title="Task: goes idle, then suspends">
|
|
100
|
+
The agent re-enters the waitpoint on `.in`. After `idleTimeoutInSeconds` passes with no new message, `onChatSuspend` fires and the engine checkpoints the run. Compute is freed.
|
|
101
|
+
</Step>
|
|
102
|
+
</Steps>
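
The "atomically claims the slot via an optimistic version counter" step in the walkthrough above can be sketched as a compare-and-swap on the session row. The row shape and `claimRunSlot` are hypothetical; in a real database this would typically be a conditional update rather than an in-memory mutation.

```typescript
// Hypothetical session row; only the fields needed for the claim are modeled.
interface SessionRowSketch {
  currentRunId: string | null;
  version: number; // bumped on every successful claim
}

// Compare-and-swap: the claim succeeds only if nobody else changed the row
// since the caller read `expectedVersion`.
function claimRunSlot(
  row: SessionRowSketch,
  expectedVersion: number,
  runId: string
): boolean {
  if (row.version !== expectedVersion || row.currentRunId !== null) {
    return false;
  }
  row.currentRunId = runId;
  row.version += 1;
  return true;
}

const sessionRow: SessionRowSketch = { currentRunId: null, version: 0 };

// Two append requests race on a cold session and both read version 0.
const observedVersion = sessionRow.version;
const firstWins = claimRunSlot(sessionRow, observedVersion, "run_a");
const secondWins = claimRunSlot(sessionRow, observedVersion, "run_b");
```

Only one of the two racing requests ends up owning the session, so a burst of messages on a cold session should not leave two runs serving the same chat.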
|
|
103
|
+
|
|
104
|
+
## Three layers of persistence
|
|
105
|
+
|
|
106
|
+
`chat.agent` survives idle gaps, deploys, refreshes, and crashes because three separate persistence mechanisms work at three different layers of the stack. They're orthogonal: each protects against a different failure mode, and conflating them is a common source of bugs.
|
|
107
|
+
|
|
108
|
+
### Layer 1: the engine checkpoint (compute)
|
|
109
|
+
|
|
110
|
+
When a run enters the Suspended state, the engine **checkpoints** the running process — its memory, CPU registers, and open file descriptors — and frees the compute. Today this is done via [CRIU](https://criu.org/) (Checkpoint/Restore in Userspace), the same mechanism that powers every Trigger.dev task's suspend/resume. On the new microVM compute runtime (currently in [private beta](/compute-private-beta)), it becomes a full Firecracker VM snapshot: every byte of memory plus filesystem state plus every kernel object inside the VM.
|
|
111
|
+
|
|
112
|
+
When the next message arrives, the engine **restores** the checkpoint. The same JS process picks up at the exact instruction it parked on. From your code's perspective, the line right after the `messagesInput.wait()` waitpoint just continues executing. Anything in process memory survives: `chat.local`, the message accumulator, in-flight Promises, in-memory caches, open DB connections. The runId is unchanged.
|
|
113
|
+
|
|
114
|
+
This is what lets you write `run()` as a single long-lived function with stateful closures, even though the underlying compute actually goes through checkpoint/restore cycles between turns. `onChatSuspend` fires immediately before the checkpoint; `onChatResume` fires immediately after the restore.
|
|
115
|
+
|
|
116
|
+
### Layer 2: the chat snapshot (S3)
|
|
117
|
+
|
|
118
|
+
After every turn the agent writes a `ChatSnapshotV1` blob to S3 — full accumulated `UIMessage[]` plus the current `lastOutEventId` cursor. This is chat-specific and lives one layer above the engine. It has nothing to do with CRIU or Firecracker.
|
|
119
|
+
|
|
120
|
+
The chat snapshot bridges run *boundaries*. If a run exits for any reason (it hit `maxTurns`, called `chat.endRun()` or `chat.requestUpgrade()`, was cancelled, crashed, or got bumped to a new version after a deploy), the engine checkpoint is gone with it. When the next user message arrives, the server triggers a fresh run with `continuation: true`. That new run reads the S3 snapshot, replays any post-snapshot chunks from `.out`, merges by message ID, and starts its first turn with the full conversation history already in memory.
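
A minimal sketch of the "merge by message ID" step follows. The message shape, the field types on the snapshot, and `mergeByMessageId` are assumptions made for illustration; the SDK's actual merge logic is not shown on this page.

```typescript
// Simplified stand-in for a UIMessage; the real type carries more fields.
interface UiMessageSketch {
  id: string;
  role: "user" | "assistant";
  text: string;
}

// Field names follow the snapshot shape described on this page; types are assumed.
interface ChatSnapshotV1Sketch {
  version: 1;
  messages: UiMessageSketch[];
  lastOutEventId?: string;
  lastOutTimestamp?: number;
}

// Replayed messages replace snapshot entries with the same id and are
// appended in order when the id is new.
function mergeByMessageId(
  base: UiMessageSketch[],
  replayed: UiMessageSketch[]
): UiMessageSketch[] {
  const merged = base.slice();
  const indexById: { [id: string]: number } = Object.create(null);
  merged.forEach((m, i) => {
    indexById[m.id] = i;
  });
  for (const m of replayed) {
    const existing = indexById[m.id];
    if (existing === undefined) {
      indexById[m.id] = merged.length;
      merged.push(m);
    } else {
      merged[existing] = m;
    }
  }
  return merged;
}

const snapshotSketch: ChatSnapshotV1Sketch = {
  version: 1,
  messages: [
    { id: "u1", role: "user", text: "What's the weather?" },
    { id: "a1", role: "assistant", text: "It is" },
  ],
  lastOutEventId: "7",
};

// Illustrative replay from `.out`: a newer version of a1 plus a new message.
const replayedFromOut: UiMessageSketch[] = [
  { id: "a1", role: "assistant", text: "It is sunny." },
  { id: "u2", role: "user", text: "What about tomorrow?" },
];

const mergedHistory = mergeByMessageId(snapshotSketch.messages, replayedFromOut);
```

Keying on the message ID keeps the merge idempotent: replaying the same tail twice produces the same history.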
|
|
121
|
+
|
|
122
|
+
The chat snapshot carries only message history — not process memory. `chat.local`, in-memory caches, open connections all need to be reinitialized on a continuation. This is why `onBoot` (every fresh worker) is the right place to initialize `chat.local`, not `onChatStart` (only the very first turn of the chat). See [Persistence and replay](/ai-chat/patterns/persistence-and-replay) for the full snapshot model.
|
|
123
|
+
|
|
124
|
+
If your task registers a `hydrateMessages` hook, the chat snapshot is skipped entirely — your hook is the single source of truth for history.
|
|
125
|
+
|
|
126
|
+
### Layer 3: the `lastEventId` cursor (browser)
|
|
127
|
+
|
|
128
|
+
The transport stores `lastEventId` — the S2 sequence number of the most recent chunk it processed — in its session state. On page reload, it reopens the SSE stream with `Last-Event-ID: <cursor>` as a header. S2 resumes from that cursor; chunks the browser already saw are not redelivered. If the agent was mid-turn when the browser reloaded, the rest of the turn streams in. If the turn had already completed, the stream closes immediately via an `X-Session-Settled` header so the client doesn't long-poll for nothing.
|
|
129
|
+
|
|
130
|
+
Unlike the other two layers, this one is client-side. The server doesn't even need to know the browser refreshed — the agent run keeps running (or stays suspended) regardless.
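
The cursor handling can be sketched as a small helper that remembers the last processed event ID and builds the headers for reopening the stream. `OutCursorSketch` is hypothetical; `Last-Event-ID` is the standard server-sent events request header. How the cursor is persisted across a page reload is left out of this sketch.

```typescript
// Hypothetical client-side cursor for the `.out` SSE stream.
class OutCursorSketch {
  private lastEventId: string | undefined = undefined;

  // Call once per processed chunk with that chunk's event id.
  observe(eventId: string): void {
    this.lastEventId = eventId;
  }

  // Headers for opening (or reopening) the stream. The resume header is
  // only sent when a cursor exists, so a brand-new chat reads from the start.
  resumeHeaders(): { [name: string]: string } {
    const headers: { [name: string]: string } = { Accept: "text/event-stream" };
    if (this.lastEventId !== undefined) {
      headers["Last-Event-ID"] = this.lastEventId;
    }
    return headers;
  }
}

const outCursor = new OutCursorSketch();
const firstOpenHeaders = outCursor.resumeHeaders(); // no cursor yet
outCursor.observe("41");
outCursor.observe("42");
const reopenHeaders = outCursor.resumeHeaders(); // resumes after event 42
```

The returned object can be passed as the `headers` option of a `fetch` call or an equivalent SSE client.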
|
|
131
|
+
|
|
132
|
+
### Which layer covers which failure mode
|
|
133
|
+
|
|
134
|
+
| What happened | Recovery layer | Same run? | In-memory state preserved? |
| --- | --- | --- | --- |
| Idle gap mid-conversation (suspend → resume) | Engine checkpoint | Yes | Yes |
| Run exited cleanly (`endRun`, `requestUpgrade`, `maxTurns`) | Chat snapshot | No (fresh continuation run) | No |
| Run crashed mid-turn (OOM, exception) | Chat snapshot + `.out` tail replay | (retried as a new attempt) | No |
| Browser tab reloaded mid-stream | `lastEventId` cursor on `.out` | (run unaffected) | (n/a) |
| Deploy rolled out a new version mid-chat | Chat snapshot, via `requestUpgrade` flow | No | No |
|
|
141
|
+
|
|
142
|
+
No single layer covers every case. The engine checkpoint alone can't survive a run exit (there's nothing to restore). The chat snapshot alone can't survive a tab refresh mid-turn (chunks already streamed would be lost). The `lastEventId` cursor alone can't bridge run boundaries (the new run wouldn't know the history). Together they cover every failure mode in the table above.
|
|
143
|
+
|
|
144
|
+
## Warm vs cold: same chat, three different timings
|
|
145
|
+
|
|
146
|
+
Take the same conversation ("What's the weather?" then "What about tomorrow?") and look at three ways the second turn can land.
|
|
147
|
+
|
|
148
|
+
**Warm second turn (within a few seconds).** The first turn finished, the agent is parked on the `.in` waitpoint, status is **Idle**. The new message hits `/append`, the waitpoint fires, the agent wakes inside the same run with all memory intact, runs `onTurnStart` for turn 2, streams the response. No checkpoint involved — the process never went to sleep. Latency to first chunk: dominated by the LLM, not the platform.
|
|
149
|
+
|
|
150
|
+
**Resumed second turn (a few minutes later).** The first turn finished and the agent suspended — the engine checkpoint is stored, compute is freed. The new message hits `/append`. The engine restores the checkpoint, fires `onChatResume`, and the task picks up exactly where it parked — all in-memory state preserved (`chat.local`, the accumulator, the lot). Latency to first chunk: the engine's restore overhead, then the LLM.
|
|
151
|
+
|
|
152
|
+
**Continuation second turn (an hour later, or after a deploy).** The first turn finished and the run eventually exited. The new message hits `/append`, the server triggers a fresh run with `continuation: true`. The new run boots cold, `onBoot` fires, the agent reads the S3 chat snapshot, replays the `.out` tail, then enters the turn loop with the full conversation already accumulated. The previous run's in-memory state is gone — anything in `chat.local` has to be re-initialized in `onBoot`. Latency to first chunk: cold start plus snapshot read, then the LLM.
|
|
153
|
+
|
|
154
|
+
All three look identical to the browser. Only the agent task knows which path it took, via `payload.continuation` and `ctx.attempt.number`.
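
As a rough illustration of how a task might branch on those two fields at boot, here is a sketch. The flattened input shape, the `BootPath` labels, and the rule that an attempt number above 1 indicates a retry are all assumptions, not confirmed SDK behavior. Warm and resumed turns do not boot a new run, so they are not represented here.

```typescript
// Hypothetical flattened view of `payload.continuation` and `ctx.attempt.number`.
interface BootInfoSketch {
  continuation: boolean;
  attemptNumber: number; // assumed to start at 1 for the first attempt
}

type BootPath = "fresh-chat" | "continuation" | "retry-attempt";

function classifyBoot(info: BootInfoSketch): BootPath {
  // Assumption: a retried attempt takes precedence over the continuation flag.
  if (info.attemptNumber > 1) return "retry-attempt";
  return info.continuation ? "continuation" : "fresh-chat";
}

const coldStartPath = classifyBoot({ continuation: false, attemptNumber: 1 });
const continuationPath = classifyBoot({ continuation: true, attemptNumber: 1 });
const retryPath = classifyBoot({ continuation: false, attemptNumber: 2 });
```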
|
|
155
|
+
|
|
156
|
+
## Lifecycle hooks: where you plug in
|
|
157
|
+
|
|
158
|
+
| Hook | When it fires | Typical use |
| --- | --- | --- |
| `onBoot` | Once per worker process, before any chat work | Initialize `chat.local` resources |
| `onPreload` | Once per chat lifetime, if the chat was preloaded before the first message | Warm caches, fetch the user's profile |
| `onChatStart` | Once per chat lifetime, on the first turn of a fresh chat (not on continuation) | First-message persistence, system-prompt setup |
| `onValidateMessages` | Every turn, before merging the incoming message | Reject or transform user input |
| `hydrateMessages` | Every turn, instead of snapshot+replay | Use your DB as the source of truth |
| `onTurnStart` | Every turn, before `run()` | Compact history, persist the user message |
| `onBeforeTurnComplete` | Every turn, after streaming, before the turn-complete record | Emit a final custom chunk |
| `onTurnComplete` | Every turn, after the turn-complete record is written | Persist the assistant message and `lastEventId` |
| `onChatSuspend` / `onChatResume` | At the idle → suspend / suspend → wake transitions | Release/reacquire expensive resources |
|
|
169
|
+
|
|
170
|
+
See [Lifecycle hooks](/ai-chat/lifecycle-hooks) for the full signatures and firing order.
|
|
171
|
+
|
|
172
|
+
## When chat.agent is the right primitive
|
|
173
|
+
|
|
174
|
+
**Good fit**:
|
|
175
|
+
- Multi-turn conversational agents where the user is expected to come back later.
|
|
176
|
+
- Long-running agent loops with tool calls, where a single turn can take a minute or more.
|
|
177
|
+
- Cases where you want page reloads to resume the in-flight response without re-running the model.
|
|
178
|
+
- Cases where you can't predict idle gaps — humans go to lunch.
|
|
179
|
+
|
|
180
|
+
**Not a good fit**:
|
|
181
|
+
- Single-shot completions where you don't need durability or resume. Call your model directly.
|
|
182
|
+
- Workflows where you control both ends and want a custom protocol. Use a [raw `task()` with chat primitives](/ai-chat/custom-agents) directly without the `chat.agent` wrapper.
|
|
183
|
+
- High-fanout broadcasting (one source, many subscribers). Use Trigger.dev realtime streams against a regular task instead.
|
|
184
|
+
|
|
185
|
+
## Putting it together
|
|
186
|
+
|
|
187
|
+
```mermaid
sequenceDiagram
    participant Browser
    participant API as Trigger.dev API
    participant S2_in as S2 .in
    participant S2_out as S2 .out
    participant Agent as chat.agent task
    participant S3 as S3 snapshot

    Note over Agent: Cold start
    Browser->>API: POST /sessions/:id/in/append
    API->>S2_in: append(message)
    API->>Agent: trigger run (continuation: false)
    Browser->>API: GET /sessions/:id/out (SSE)
    API->>S2_out: read stream
    Agent->>S2_in: read message (waitpoint)
    Agent->>S2_out: append chunk(s)
    S2_out-->>Browser: SSE chunks
    Agent->>S2_out: append turn-complete (control)
    Agent->>S2_out: trim < previous turn-complete
    Agent->>S3: write snapshot
    Note over Agent: Idle on waitpoint

    Note over Agent: ...time passes...
    Note over Agent: Suspended

    Browser->>API: POST /sessions/:id/in/append
    API->>S2_in: append(message)
    API->>Agent: restore from suspend
    Agent->>S2_in: read message
    Agent->>S2_out: append chunk(s)
    S2_out-->>Browser: SSE chunks
    Agent->>S2_out: append turn-complete
    Agent->>S3: write snapshot
    Note over Agent: Idle again
```
|
|
223
|
+
|
|
224
|
+
## Where to go next
|
|
225
|
+
|
|
226
|
+
- [Quick start](/ai-chat/quick-start) — get a chat running in a few minutes.
|
|
227
|
+
- [Backend](/ai-chat/backend) — the `chat.agent()` API in detail.
|
|
228
|
+
- [Lifecycle hooks](/ai-chat/lifecycle-hooks) — every hook, what fires when.
|
|
229
|
+
- [Persistence and replay](/ai-chat/patterns/persistence-and-replay) — deeper on the snapshot model.
|
|
230
|
+
- [Client protocol](/ai-chat/client-protocol) — wire format if you're writing a custom transport.
|