@mastra/mcp-docs-server 1.1.39-alpha.6 → 1.1.39

Files changed (38)
  1. package/.docs/docs/agents/acp.md +238 -0
  2. package/.docs/docs/agents/agent-approval.md +2 -0
  3. package/.docs/docs/agents/background-tasks.md +9 -6
  4. package/.docs/docs/agents/response-caching.md +2 -0
  5. package/.docs/docs/agents/signals.md +29 -3
  6. package/.docs/docs/evals/evals-with-memory.md +146 -0
  7. package/.docs/docs/evals/running-in-ci.md +1 -0
  8. package/.docs/docs/memory/multi-user-threads.md +206 -0
  9. package/.docs/docs/memory/observational-memory.md +53 -17
  10. package/.docs/docs/memory/overview.md +1 -0
  11. package/.docs/docs/memory/working-memory.md +1 -1
  12. package/.docs/docs/server/auth/fga.md +55 -10
  13. package/.docs/models/gateways/netlify.md +2 -1
  14. package/.docs/models/gateways/openrouter.md +2 -1
  15. package/.docs/models/index.md +1 -1
  16. package/.docs/models/providers/deepinfra.md +2 -2
  17. package/.docs/models/providers/fireworks-ai.md +23 -22
  18. package/.docs/models/providers/google.md +29 -46
  19. package/.docs/models/providers/llmgateway.md +186 -191
  20. package/.docs/models/providers/novita-ai.md +7 -7
  21. package/.docs/models/providers/opencode.md +1 -1
  22. package/.docs/models/providers/orcarouter.md +2 -2
  23. package/.docs/models/providers/poe.md +2 -1
  24. package/.docs/models/providers/routing-run.md +94 -0
  25. package/.docs/models/providers/the-grid-ai.md +15 -9
  26. package/.docs/models/providers/xai.md +2 -1
  27. package/.docs/models/providers.md +1 -0
  28. package/.docs/reference/agents/agent.md +13 -5
  29. package/.docs/reference/agents/channels.md +4 -2
  30. package/.docs/reference/client-js/agents.md +1 -1
  31. package/.docs/reference/configuration.md +1 -1
  32. package/.docs/reference/memory/observational-memory.md +5 -3
  33. package/.docs/reference/server/register-api-route.md +1 -1
  34. package/.docs/reference/storage/convex.md +74 -12
  35. package/.docs/reference/tools/mcp-client.md +27 -2
  36. package/.docs/reference/vectors/convex.md +129 -7
  37. package/CHANGELOG.md +73 -0
  38. package/package.json +6 -6
@@ -0,0 +1,238 @@
1
+ # ACP (Agent Client Protocol)
2
+
3
+ Mastra supports the [Agent Client Protocol (ACP)](https://agentclientprotocol.com/overview/introduction) for running ACP-compatible coding agents from a Mastra agent. Use `@mastra/acp` to wrap a coding agent process as a Mastra tool or as a subagent.
4
+
5
+ ACP is useful for coding agents such as Claude Code, Amp, Codex, or any other executable that implements the Agent Client Protocol over standard input and output.
6
+
7
+ ## When to use ACP
8
+
9
+ - A Mastra agent should delegate code inspection, editing, or repository tasks to an external coding agent.
10
+ - An ACP-compatible agent process should stay alive across calls so it can keep session context.
11
+ - A parent agent needs real-time output from a coding agent while the task runs.
12
+ - An ACP-compatible agent needs permission prompts before it reads files, writes files, or runs actions.
13
+ - File access should go through Mastra's workspace abstraction instead of direct process-only file access.
14
+
15
+ ## How ACP works
16
+
17
+ `@mastra/acp` starts the configured ACP agent command as a child process and communicates with it using newline-delimited JSON over standard input and output.
18
+
19
+ The flow is:
20
+
21
+ 1. Configure `command`, `args`, and optional connection settings.
22
+ 2. `@mastra/acp` spawns the ACP agent process on first use.
23
+ 3. The client sends ACP `initialize` and `session/new` requests.
24
+ 4. Mastra sends the user task to the ACP agent with `session/prompt`.
25
+ 5. The ACP agent streams session updates and message chunks back to Mastra.
26
+ 6. Mastra returns the buffered output, emits streaming chunks, or suspends for permission input.
27
+ 7. The ACP process stays alive by default, or stops after the prompt when `persistSession` is `false`.
28
+
29
+ During execution, the ACP client also handles permission requests and file operations. File reads and writes go through Mastra's `Workspace`, so the ACP agent operates inside the workspace you provide.
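
The request sequence in steps 3 and 4 can be sketched as newline-delimited JSON frames. This is a minimal illustration, assuming JSON-RPC 2.0 style envelopes. The `encodeFrame` and `decodeFrames` helpers and the parameter payloads are hypothetical and are not part of `@mastra/acp`; only the three method names come from the text above.

```typescript
// Hypothetical helpers for illustration; not exported by @mastra/acp.
interface RpcRequest {
  jsonrpc: '2.0'
  id: number
  method: string
  params: Record<string, unknown>
}

// Serialize one message as a single line terminated by "\n".
// JSON.stringify escapes embedded newlines, so one message is always one line.
function encodeFrame(message: RpcRequest): string {
  return JSON.stringify(message) + '\n'
}

// Split a buffer into complete messages and keep any trailing partial line.
function decodeFrames(buffer: string): { messages: RpcRequest[]; rest: string } {
  const lines = buffer.split('\n')
  const rest = lines.pop() ?? ''
  const messages = lines
    .filter(line => line.trim() !== '')
    .map(line => JSON.parse(line) as RpcRequest)
  return { messages, rest }
}

// Parameter payloads are placeholders, not the full ACP schema.
const handshake: RpcRequest[] = [
  { jsonrpc: '2.0', id: 1, method: 'initialize', params: { protocolVersion: 1 } },
  { jsonrpc: '2.0', id: 2, method: 'session/new', params: { cwd: '/repo', mcpServers: [] } },
  { jsonrpc: '2.0', id: 3, method: 'session/prompt', params: { sessionId: 'placeholder', prompt: [] } },
]

// Simulate a read that ends in the middle of a fourth message.
const wire = handshake.map(encodeFrame).join('')
const { messages, rest } = decodeFrames(wire + '{"jsonrpc":"2.0","id":4')
console.log(messages.map(m => m.method), JSON.stringify(rest))
```

Keeping the trailing partial line in `rest` matters because a read from standard output can end in the middle of a message; the remainder is prepended to the next read.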
30
+
31
+ ## Getting started
32
+
33
+ Install `@mastra/acp` in a project that already uses `@mastra/core`:
34
+
35
+ **npm**:
36
+
37
+ ```bash
38
+ npm install @mastra/acp
39
+ ```
40
+
41
+ **pnpm**:
42
+
43
+ ```bash
44
+ pnpm add @mastra/acp
45
+ ```
46
+
47
+ **Yarn**:
48
+
49
+ ```bash
50
+ yarn add @mastra/acp
51
+ ```
52
+
53
+ **Bun**:
54
+
55
+ ```bash
56
+ bun add @mastra/acp
57
+ ```
58
+
59
+ `@mastra/acp` exports two APIs:
60
+
61
+ - `createACPTool`: Create a Mastra tool that sends a `task` string to an ACP agent and returns an `output` string.
62
+ - `AcpAgent`: Wrap an ACP agent as a Mastra subagent with `generate()` and `stream()` support.
63
+
64
+ The package requires `@mastra/core` version `1.34.0` or later.
65
+
66
+ ## Use ACP as a subagent
67
+
68
+ Use `AcpAgent` when a parent Mastra agent should delegate directly to an ACP-compatible coding agent as a subagent. Create the ACP agent, then register it in the parent agent's `agents` map.
69
+
70
+ ```typescript
71
+ import { AcpAgent } from '@mastra/acp'
72
+ import { Agent } from '@mastra/core/agent'
73
+
74
+ const codeAgent = new AcpAgent({
75
+ id: 'code-agent',
76
+ name: 'Code Agent',
77
+ description: 'An ACP-compatible coding agent that can inspect and edit files',
78
+ command: 'acp-agent',
79
+ args: ['--stdio'],
80
+ cwd: process.cwd(),
81
+ })
82
+
83
+ export const codeSupervisor = new Agent({
84
+ id: 'code-supervisor',
85
+ name: 'Code Supervisor',
86
+ instructions: 'Delegate code editing tasks to the code-agent subagent.',
87
+ model: 'openai/gpt-5.4',
88
+ agents: {
89
+ codeAgent,
90
+ },
91
+ })
92
+ ```
93
+
94
+ `AcpAgent.generate()` buffers the ACP response and returns it as text. `AcpAgent.stream()` emits Mastra `text-delta` chunks as ACP `agent_message_chunk` updates arrive.
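
The difference between the two modes can be sketched as follows. The update and chunk shapes are simplified, hypothetical stand-ins (the real ACP and Mastra types carry more fields), and `streamUpdates` and `bufferUpdates` are illustrative names rather than `@mastra/acp` functions.

```typescript
// Simplified, hypothetical shapes for illustration only.
type AcpUpdate =
  | { kind: 'agent_message_chunk'; text: string }
  | { kind: 'tool_call'; title: string }

interface TextDelta {
  type: 'text-delta'
  text: string
}

// Streaming style: forward each message chunk to the caller as soon as it is seen.
function streamUpdates(updates: AcpUpdate[], emit: (chunk: TextDelta) => void): void {
  for (const update of updates) {
    if (update.kind === 'agent_message_chunk') {
      emit({ type: 'text-delta', text: update.text })
    }
  }
}

// Buffered style: collect the same chunks and return one string at the end.
function bufferUpdates(updates: AcpUpdate[]): string {
  const parts: string[] = []
  streamUpdates(updates, chunk => parts.push(chunk.text))
  return parts.join('')
}

const updates: AcpUpdate[] = [
  { kind: 'agent_message_chunk', text: 'Hello, ' },
  { kind: 'tool_call', title: 'read file' },
  { kind: 'agent_message_chunk', text: 'world' },
]

const deltas: TextDelta[] = []
streamUpdates(updates, chunk => deltas.push(chunk))
console.log(deltas.length, bufferUpdates(updates))
```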
95
+
96
+ ## Use ACP as a tool
97
+
98
+ Use `createACPTool` when the parent Mastra agent should decide when to call the ACP agent as a tool. The following example creates a code editing tool and registers it on a parent agent:
99
+
100
+ ```typescript
101
+ import { createACPTool } from '@mastra/acp'
102
+ import { Agent } from '@mastra/core/agent'
103
+
104
+ const codeAgentTool = createACPTool({
105
+ id: 'code-agent',
106
+ description: 'Use an ACP-compatible coding agent to inspect and edit code',
107
+ command: 'acp-agent',
108
+ args: ['--stdio'],
109
+ cwd: process.cwd(),
110
+ })
111
+
112
+ export const codeSupervisor = new Agent({
113
+ id: 'code-supervisor',
114
+ name: 'Code Supervisor',
115
+ instructions: 'Use the code-agent tool when a task requires repository inspection or code edits.',
116
+ model: 'openai/gpt-5.4',
117
+ tools: {
118
+ codeAgentTool,
119
+ },
120
+ })
121
+ ```
122
+
123
+ Use the `command` and `args` required by the ACP-compatible agent you run. The tool input schema has a single `task` string, and the output schema returns the final ACP response as `output`.
124
+
125
+ If the ACP agent requests permission, the tool can suspend and resume through Mastra's tool suspension flow. Use `onPermissionRequest` when you need custom permission behavior.
126
+
127
+ ## Options reference
128
+
129
+ `createACPTool` and `AcpAgent` accept the same ACP connection options. `AcpAgent` also accepts `name` to set the display name used during agent delegation.
130
+
131
+ | Option | Type | Description |
132
+ | --------------------- | -------------------------------- | --------------------------------------------------------------------------------------- |
133
+ | `id` | `string` | Unique tool or subagent identifier. |
134
+ | `description` | `string` | Description shown to the model when it can call the tool or delegate to the subagent. |
135
+ | `command` | `string` | ACP agent executable to spawn. |
136
+ | `args` | `string[]` | Arguments passed to the ACP agent executable. |
137
+ | `env` | `Record<string, string>` | Environment variables to merge with the current process environment. |
138
+ | `cwd` | `string` | Working directory for the ACP process, ACP session, and default workspace. |
139
+ | `session` | `Partial<NewSessionRequest>` | ACP session creation options. Defaults to `cwd` or `process.cwd()` and no MCP servers. |
140
+ | `initialize` | `Partial<InitializeRequest>` | ACP initialization options. Defaults to Mastra client information and protocol version. |
141
+ | `authMethodId` | `string` | ACP authentication method ID to invoke after initialization. |
142
+ | `persistSession` | `boolean` | Keep the ACP process alive after execution. Defaults to `true`. |
143
+ | `onPermissionRequest` | `(request) => Promise<Response>` | Callback for ACP permission requests. Defaults to selecting the first option. |
144
+ | `workspace` | `Workspace` | Workspace used for ACP file reads and writes. |
145
+
146
+ ## Session lifecycle
147
+
148
+ `createACPTool` and `AcpAgent` start the configured command on first use and create an ACP session. By default, `persistSession` is `true`, so the child process stays alive across calls.
149
+
150
+ Use the default persistent session when:
151
+
152
+ - The ACP agent benefits from keeping conversation or repository context.
153
+ - Startup is expensive and repeated calls should reuse the same process.
154
+ - A parent agent may delegate several related tasks to the same coding agent.
155
+
156
+ Set `persistSession: false` when each prompt should run in an isolated process:
157
+
158
+ ```typescript
159
+ import { AcpAgent } from '@mastra/acp'
160
+
161
+ export const codeAgent = new AcpAgent({
162
+ id: 'code-agent',
163
+ description: 'Run one isolated ACP coding task',
164
+ command: 'acp-agent',
165
+ args: ['--stdio'],
166
+ cwd: process.cwd(),
167
+ persistSession: false,
168
+ })
169
+ ```
170
+
171
+ With `persistSession: false`, `@mastra/acp` stops the ACP process after each prompt completes.
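
The lifecycle rule can be modeled with a small counter-based sketch. `SessionRunner` is a hypothetical stand-in that only tracks spawn and stop events; it does not start a real process and is not how `@mastra/acp` is implemented.

```typescript
// Hypothetical model of the lifecycle described above; no real process is spawned.
class SessionRunner {
  private running = false
  private readonly persistSession: boolean
  spawnCount = 0
  stopCount = 0

  constructor(persistSession: boolean = true) {
    this.persistSession = persistSession
  }

  prompt(task: string): string {
    // Spawn lazily on first use, or again after a previous stop.
    if (!this.running) {
      this.running = true
      this.spawnCount += 1
    }
    const output = `handled: ${task}`
    // With persistSession disabled, stop once the prompt completes.
    if (!this.persistSession) {
      this.running = false
      this.stopCount += 1
    }
    return output
  }
}

const persistent = new SessionRunner()
persistent.prompt('inspect repo')
persistent.prompt('apply fix')

const isolated = new SessionRunner(false)
isolated.prompt('inspect repo')
isolated.prompt('apply fix')

console.log(persistent.spawnCount, persistent.stopCount, isolated.spawnCount, isolated.stopCount)
```

In this model the persistent runner spawns once for both prompts, while the isolated runner spawns and stops once per prompt.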
172
+
173
+ ## Permission handling
174
+
175
+ ACP agents may ask the client to choose a permission option before they continue. By default, `@mastra/acp` selects the first option returned by the ACP agent.
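
As a rough sketch, the default behavior is equivalent to a handler like the one below. The types only mirror the fields used on this page (`options`, `optionId`, `name`, and the nested `outcome` object); `selectFirstOption` is a hypothetical name, and cancelling when no options are offered is an assumption, not documented behavior.

```typescript
// Reduced shapes for illustration; the real ACP types have more fields.
interface PermissionOption {
  optionId: string
  name: string
}

interface PermissionRequest {
  options: PermissionOption[]
}

type PermissionResponse =
  | { outcome: { outcome: 'selected'; optionId: string } }
  | { outcome: { outcome: 'cancelled' } }

// Assumed equivalent of the default: pick the first option the agent offers.
function selectFirstOption(request: PermissionRequest): PermissionResponse {
  const first = request.options[0]
  if (!first) {
    // Assumption: cancel when the agent offers no options at all.
    return { outcome: { outcome: 'cancelled' } }
  }
  return { outcome: { outcome: 'selected', optionId: first.optionId } }
}

const decision = selectFirstOption({
  options: [
    { optionId: 'allow-once', name: 'Allow' },
    { optionId: 'reject-once', name: 'Reject' },
  ],
})
console.log(JSON.stringify(decision))
```

Because the first option is chosen regardless of its meaning, the outcome depends on the order in which the ACP agent lists its options.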
176
+
177
+ Pass `onPermissionRequest` to inspect the request and return the selected option yourself:
178
+
179
+ ```typescript
180
+ import { createACPTool } from '@mastra/acp'
181
+
182
+ export const codeAgentTool = createACPTool({
183
+ id: 'code-agent',
184
+ description: 'Use an ACP-compatible coding agent',
185
+ command: 'acp-agent',
186
+ args: ['--stdio'],
187
+ async onPermissionRequest(request) {
188
+ const allowOption = request.options.find(option => option.name === 'Allow')
189
+
190
+ if (!allowOption) {
191
+ return { outcome: { outcome: 'cancelled' } }
192
+ }
193
+
194
+ return {
195
+ outcome: {
196
+ outcome: 'selected',
197
+ optionId: allowOption.optionId,
198
+ },
199
+ }
200
+ },
201
+ })
202
+ ```
203
+
204
+ Use this callback to enforce local policy, inspect the permission title, or route the decision to your own approval flow.
205
+
206
+ ## Workspace integration
207
+
208
+ ACP file operations go through Mastra's workspace abstraction. If you don't pass `workspace`, `@mastra/acp` creates a `Workspace` backed by `LocalFilesystem` and uses `cwd` as the filesystem root.
209
+
210
+ Pass a custom `Workspace` when the ACP agent should read and write through a specific filesystem implementation:
211
+
212
+ ```typescript
213
+ import { AcpAgent } from '@mastra/acp'
214
+ import { LocalFilesystem, Workspace } from '@mastra/core/workspace'
215
+
216
+ const workspace = new Workspace({
217
+ filesystem: new LocalFilesystem({
218
+ root: process.cwd(),
219
+ }),
220
+ })
221
+
222
+ export const codeAgent = new AcpAgent({
223
+ id: 'code-agent',
224
+ description: 'Run coding tasks in a controlled workspace',
225
+ command: 'acp-agent',
226
+ args: ['--stdio'],
227
+ workspace,
228
+ })
229
+ ```
230
+
231
+ Use `cwd` and `workspace` together when the ACP process should start in one directory but file operations should use an explicitly configured workspace root.
232
+
233
+ ## Related
234
+
235
+ - [Agent reference](https://mastra.ai/reference/agents/agent)
236
+ - [Subagents](https://mastra.ai/docs/agents/supervisor-agents)
237
+ - [Agent Client Protocol introduction](https://agentclientprotocol.com/overview/introduction)
238
+ - [Agent Client Protocol schema](https://agentclientprotocol.com/protocol/schema)
@@ -92,6 +92,8 @@ A tool can also pause _during_ its `execute` function by calling `suspend()`. Th
92
92
 
93
93
  The stream emits a `tool-call-suspended` chunk with a custom payload defined by the tool's `suspendSchema`. You resume by calling `resumeStream()` with data matching the tool's `resumeSchema`.
94
94
 
95
+ > **Note:** `suspend()` does not throw — return immediately after calling it (e.g. `return await suspend({ ... })`). Code after `await suspend(...)` still runs before the tool pauses.
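
The effect of omitting the `return` can be illustrated with a stand-in. The `suspend` function below is a hypothetical fake that records its payload and resolves without throwing, which is the behavior the note describes; it is not the function Mastra passes to a tool.

```typescript
// Records what happened, in order.
const trace: string[] = []

// Hypothetical stand-in: resolves instead of throwing, like the note describes.
async function suspend(payload: { reason: string }): Promise<void> {
  trace.push(`suspended: ${payload.reason}`)
}

// Problematic: execution continues past `await suspend(...)`.
async function executeWithoutReturn(): Promise<string | void> {
  await suspend({ reason: 'needs approval' })
  trace.push('side effect ran')
  return 'finished'
}

// Recommended: return the suspend call so nothing after it runs.
async function executeWithReturn(): Promise<string | void> {
  return await suspend({ reason: 'needs approval' })
}

const finished = executeWithoutReturn()
  .then(() => executeWithReturn())
  .then(() => console.log(trace))
```

In the first function the side effect is recorded even though the tool asked to suspend; in the second, the function exits as soon as `suspend` resolves.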
96
+
95
97
  ## Tool approval with `generate()`
96
98
 
97
99
  Tool approval also works with `generate()` for non-streaming use cases. When a tool requires approval, `generate()` returns immediately with `finishReason: 'suspended'`, a `suspendPayload` containing the tool call details (`toolCallId`, `toolName`, `args`), and a `runId`:
@@ -40,11 +40,12 @@ The full set of options is listed in the [backgroundTasks configuration referenc
40
40
 
41
41
  ## Run a tool in the background
42
42
 
43
- Enabling the manager doesn't run anything in the background by itself as every tool defaults to foreground execution. You can run a tool in the background at one of three layers, in priority order:
43
+ Enabling the manager doesn't run anything in the background by itself as every tool defaults to foreground execution. Tools opt in at one of two layers:
44
44
 
45
- 1. **LLM per-call override**: the model decides it should run in the background and includes a `_background` field in the tool arguments.
45
+ 1. **Tool-level config**: the tool itself declares it as background-eligible.
46
46
  2. **Agent-level config**: the agent declares which of its tools are background-eligible.
47
- 3. **Tool-level config**: the tool itself declares it as background-eligible.
47
+
48
+ Once a tool has opted in, the LLM can optionally include a `_background` field in the tool arguments to override the resolved config for a specific call (timeout, retries, or to flip the call back to foreground).
48
49
 
49
50
  ### Tool-level
50
51
 
@@ -103,13 +104,15 @@ When a tool is registered on an agent that has background tasks enabled, the mod
103
104
  }
104
105
  ```
105
106
 
107
+ The `_background` override is a _modifier_ on tools the developer has already opted in at the tool or agent layer — it is not a standalone opt-in. If a tool hasn't been opted in, `_background.enabled: true` from the model is ignored and the tool runs in the foreground. This keeps deterministic, foreground-only tools (calculators, lookups, schema validators) from being silently dispatched as tasks.
108
+
106
109
  ### Resolution order
107
110
 
108
111
  When a tool call is dispatched, the resolved background config is computed in this priority order:
109
112
 
110
- 1. LLM `_background` override (if present in the call's arguments).
111
- 2. Agent-level `backgroundTasks.tools` entry for the tool.
112
- 3. Tool-level `backgroundTasks` config.
113
+ 1. Agent-level `backgroundTasks.tools` entry for the tool.
114
+ 2. Tool-level `backgroundTasks` config.
115
+ 3. LLM `_background.enabled` override (only used to enable background dispatch when the tool was opted in at one of the layers above).
113
116
  4. Manager defaults (`defaultTimeoutMs`, `defaultRetries`).
114
117
 
115
118
  If the agent has `backgroundTasks.disabled: true`, every tool call runs synchronously regardless of the layers above.
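
The opt-in gating described above can be sketched as a single decision function. This is one reading of the rules, not the manager's implementation: the `enabled` field on the agent-level and tool-level entries, the `runsInBackground` name, and the assumption that an opted-in tool runs in the background unless the model flips it back are all hypothetical. Timeout and retry merging is left out.

```typescript
// Hypothetical config shape; only `_background.enabled` is named in the docs.
interface BackgroundFlag {
  enabled?: boolean
}

interface DispatchInput {
  agentDisabled?: boolean // agent has backgroundTasks.disabled: true
  agentTool?: BackgroundFlag // agent-level backgroundTasks.tools entry
  tool?: BackgroundFlag // tool-level backgroundTasks config
  llm?: BackgroundFlag // `_background` field in the call arguments
}

function runsInBackground(input: DispatchInput): boolean {
  // A disabled agent forces every call to run synchronously.
  if (input.agentDisabled) return false

  // The agent-level entry takes priority over the tool-level config.
  const optedIn = (input.agentTool?.enabled ?? input.tool?.enabled) === true

  // Without a developer opt-in, `_background.enabled: true` from the model is ignored.
  if (!optedIn) return false

  // Assumption: an opted-in call runs in the background unless the model flips it back.
  return input.llm?.enabled ?? true
}

console.log(
  runsInBackground({ llm: { enabled: true } }),
  runsInBackground({ tool: { enabled: true } }),
  runsInBackground({ tool: { enabled: true }, llm: { enabled: false } }),
)
```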
@@ -1,5 +1,7 @@
1
1
  # Response Caching
2
2
 
3
+ > **Experimental:** This feature is in alpha. Breaking changes may occur without a major version bump until the API is stable.
4
+
3
5
  Response caching skips the LLM call and replays a previously cached response when an agent receives an identical request. Use it to reduce latency and avoid paying for repeated calls.
4
6
 
5
7
  Caching is implemented as the [`ResponseCache`](https://mastra.ai/reference/processors/response-cache) input processor. Mastra doesn't provide an agent-level option. To enable caching, register the processor explicitly. This keeps the API surface small while Mastra collects feedback; per-call overrides flow through `RequestContext`.
@@ -1,6 +1,6 @@
1
1
  # Signals
2
2
 
3
- > **Experimental:** Agent signals are experimental. Breaking changes may occur without a major version bump until the API is stable.
3
+ > **Experimental:** This feature is in alpha. Breaking changes may occur without a major version bump until the API is stable.
4
4
 
5
5
  Signals are a way to interact with an agent through a thread. Instead of starting every interaction with `agent.stream()`, subscribe to a thread and send signals. Mastra either wakes the agent when the thread is idle or drops the signal into the running agent loop.
6
6
 
@@ -86,6 +86,32 @@ agent.sendSignal(
86
86
  )
87
87
  ```
88
88
 
89
+ ## Identify users with attributes
90
+
91
+ Use `attributes` to tag each signal with user identity. The signal type and attributes are rendered as XML so the model can distinguish who said what in a multi-user thread:
92
+
93
+ ```typescript
94
+ agent.sendSignal(
95
+ {
96
+ type: 'user',
97
+ contents: 'Can we simplify the API surface?',
98
+ attributes: { name: 'Devin', from: 'slack' },
99
+ },
100
+ {
101
+ resourceId: 'user_123',
102
+ threadId: 'thread_456',
103
+ },
104
+ )
105
+ ```
106
+
107
+ The model receives:
108
+
109
+ ```xml
110
+ <user name="Devin" from="slack">Can we simplify the API surface?</user>
111
+ ```
112
+
113
+ The UI sees just the message contents but can also read `attributes` and `metadata` off the signal message for custom rendering (e.g. showing user names, avatars, or platform badges).
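
The rendering step can be approximated with a short function. This is a sketch under assumptions: `renderSignal` and `escapeXml` are hypothetical names, and the escaping rules are a guess, since the docs show only the rendered shape and not how special characters are handled.

```typescript
interface SignalInput {
  type: string
  contents: string
  attributes?: Record<string, string>
}

// Assumed escaping; the docs do not state how Mastra escapes values.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// Render the signal type as the element name and each attribute in insertion order.
function renderSignal(signal: SignalInput): string {
  const attributes = signal.attributes ?? {}
  const renderedAttributes = Object.keys(attributes)
    .map(key => ` ${key}="${escapeXml(attributes[key] ?? '')}"`)
    .join('')
  return `<${signal.type}${renderedAttributes}>${escapeXml(signal.contents)}</${signal.type}>`
}

const renderedSignal = renderSignal({
  type: 'user',
  contents: 'Can we simplify the API surface?',
  attributes: { name: 'Devin', from: 'slack' },
})
console.log(renderedSignal)
```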
114
+
89
115
  ## Send external event context
90
116
 
91
117
  Use custom signal types for system-generated context. Non-user signal types are rendered as XML-style user-role context so they can appear inside conversation history without looking like assistant output.
@@ -96,7 +122,7 @@ agent.sendSignal(
96
122
  type: 'system-reminder',
97
123
  contents: 'User X has left a new PR comment asking for a smaller API surface.',
98
124
  attributes: {
99
- type: 'github',
125
+ source: 'github',
100
126
  pr: '123',
101
127
  },
102
128
  },
@@ -110,7 +136,7 @@ agent.sendSignal(
110
136
  The model receives the custom signal as context like this:
111
137
 
112
138
  ```xml
113
- <system-reminder type="github" pr="123">User X has left a new PR comment asking for a smaller API surface.</system-reminder>
139
+ <system-reminder source="github" pr="123">User X has left a new PR comment asking for a smaller API surface.</system-reminder>
114
140
  ```
115
141
 
116
142
  Use XML-safe signal type names and attribute names. Signal type names and attribute names can contain letters, numbers, underscores, periods, and hyphens. They must start with a letter or underscore.
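
The naming rule can be checked before sending a signal with a pattern like the one below. `isXmlSafeName` is a hypothetical helper, and restricting "letters" to ASCII is an assumption; the docs do not say whether non-ASCII letters are accepted.

```typescript
// First character: letter or underscore. Remaining characters: letters, digits,
// underscores, periods, or hyphens. ASCII-only is an assumption of this sketch.
const XML_SAFE_NAME = /^[A-Za-z_][A-Za-z0-9_.-]*$/

function isXmlSafeName(candidate: string): boolean {
  return XML_SAFE_NAME.test(candidate)
}

const candidates = ['system-reminder', 'pr.number', '_internal', '1st', '-flag', 'has space']
for (const candidate of candidates) {
  console.log(candidate, isXmlSafeName(candidate))
}
```

The first three candidates satisfy the rule; the last three fail because they start with a digit, start with a hyphen, or contain a space.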
@@ -0,0 +1,146 @@
1
+ # Evals with memory
2
+
3
+ Agents that use memory in `thread` scope — including observational memory — require a thread ID at run time. When an eval invokes the agent without one, you'll see:
4
+
5
+ ```text
6
+ ObservationalMemory (scope: 'thread') requires a threadId, but none was found in RequestContext or MessageList.
7
+ ```
8
+
9
+ This page covers the three working patterns for running Mastra evals against memory-enabled agents, what each path supports, and which one to pick. A complete runnable repro for all three approaches lives in [`examples/evals-with-memory`](https://github.com/mastra-ai/mastra/tree/main/examples/evals-with-memory).
10
+
11
+ ## When to use which approach
12
+
13
+ | Goal | Approach |
14
+ | ----------------------------------------------- | ----------------------------------------------------------------------------------------- |
15
+ | One shared conversation across every item | [`runEvals` with global `targetOptions.memory`](#shared-thread-with-runevals) |
16
+ | One independent thread per item, simple CI loop | [`runEvals` per item](#per-item-threads-with-runevals) |
17
+ | Per-item threads driven by a stored `Dataset` | [`dataset.startExperiment` with an inline task](#dataset-experiments-with-an-inline-task) |
18
+
19
+ Pre-seeding `RequestContext` with `MastraMemory` is **not** a supported way to drive memory into an agent. Thread resolution reads `args.memory.thread` — `RequestContext.MastraMemory` is populated by `prepare-memory-step` after the agent has already resolved its thread.
20
+
21
+ ## Shared thread with `runEvals`
22
+
23
+ `runEvals` accepts `targetOptions`, which is forwarded to `agent.generate()`. Passing `memory: { thread, resource }` runs every data item against the same thread — useful for testing recall across a multi-turn conversation.
24
+
25
+ ```typescript
26
+ import { runEvals } from '@mastra/core/evals'
27
+ import { supportAgent } from './support-agent'
28
+ import { recallScorer } from '../scorers/recall-scorer'
29
+
30
+ const memory = await supportAgent.getMemory()
31
+ await memory!.createThread({ threadId: 'eval-thread', resourceId: 'ci-user' })
32
+
33
+ const result = await runEvals({
34
+ target: supportAgent,
35
+ scorers: [recallScorer],
36
+ targetOptions: {
37
+ memory: { thread: 'eval-thread', resource: 'ci-user' },
38
+ },
39
+ data: [
40
+ { input: 'My order number is 12345' },
41
+ { input: 'What is my order number?', groundTruth: '12345' },
42
+ ],
43
+ })
44
+ ```
45
+
46
+ `targetOptions` is **global per call**. There is no per-item override on `RunEvalsDataItem` today.
47
+
48
+ ## Per-item threads with `runEvals`
49
+
50
+ When each data item needs its own thread (the common CI shape), call `runEvals` once per item with a unique `targetOptions.memory` and aggregate the scores yourself.
51
+
52
+ ```typescript
53
+ import { randomUUID } from 'node:crypto'
54
+ import { runEvals } from '@mastra/core/evals'
55
+ import { supportAgent } from './support-agent'
56
+ import { recallScorer } from '../scorers/recall-scorer'
57
+
58
+ const memory = await supportAgent.getMemory()
59
+ const resourceId = 'ci-user'
60
+
61
+ const items = [
62
+ { input: 'Cats are mammals', groundTruth: 'mammals' },
63
+ { input: 'Dogs are mammals too', groundTruth: 'mammals' },
64
+ ]
65
+
66
+ // `runEvals` returns `{ scores: Record<string, number>; summary: { totalItems } }`.
67
+ const scores: number[] = []
68
+ for (const item of items) {
69
+ const threadId = `eval-${randomUUID()}`
70
+ await memory!.createThread({ threadId, resourceId, title: item.input })
71
+
72
+ const result = await runEvals({
73
+ target: supportAgent,
74
+ scorers: [recallScorer],
75
+ targetOptions: { memory: { thread: threadId, resource: resourceId } },
76
+ data: [item],
77
+ })
78
+
79
+ scores.push(result.scores[recallScorer.id])
80
+ }
81
+
82
+ const average = scores.reduce((a, b) => a + b, 0) / scores.length
83
+ ```
84
+
85
+ > **Note:** Create the thread before running the eval. Observational memory in `thread` scope reads from a record that must already exist.
86
+
87
+ ## Dataset experiments with an inline task
88
+
89
+ `dataset.startExperiment({ target: agent })` does **not** forward a `memory` option to the agent — only `requestContext`. To run a stored dataset against a memory-enabled agent, use an inline `task` function and stash `{ threadId, resourceId }` in each item's `metadata`. The scorer pipeline still runs as normal.
90
+
91
+ ```typescript
92
+ import { randomUUID } from 'node:crypto'
93
+ import { mastra } from '../index'
94
+ import { supportAgent } from '../agents/support-agent'
95
+ import { recallScorer } from '../scorers/recall-scorer'
96
+
97
+ const memory = await supportAgent.getMemory()
98
+ const resourceId = 'ci-user'
99
+
100
+ const items = [
101
+ { input: 'Cats are mammals', groundTruth: 'mammals', thread: `ds-${randomUUID()}` },
102
+ { input: 'Dogs are mammals too', groundTruth: 'mammals', thread: `ds-${randomUUID()}` },
103
+ ]
104
+
105
+ for (const it of items) {
106
+ await memory!.createThread({ threadId: it.thread, resourceId, title: it.input })
107
+ }
108
+
109
+ const dataset = await mastra.datasets.create({
110
+ name: 'support-recall',
111
+ description: 'Per-item memory via inline task + item metadata',
112
+ })
113
+
114
+ await dataset.addItems({
115
+ items: items.map(it => ({
116
+ input: it.input,
117
+ groundTruth: it.groundTruth,
118
+ metadata: { threadId: it.thread, resourceId },
119
+ })),
120
+ })
121
+
122
+ const summary = await dataset.startExperiment({
123
+ scorers: [recallScorer],
124
+ task: async ({ input, metadata }) => {
125
+ const { threadId, resourceId: rid } = (metadata ?? {}) as {
126
+ threadId: string
127
+ resourceId: string
128
+ }
129
+ const result = await supportAgent.generate(input as string, {
130
+ memory: { thread: threadId, resource: rid },
131
+ })
132
+ return result.text
133
+ },
134
+ })
135
+ ```
136
+
137
+ The inline `task` receives the item's `metadata`, so each row can drive its own thread without changing the agent or any scorer.
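
The example above casts `metadata` directly. A stricter variant could validate it first, so that a row without thread information fails with a clear message instead of passing `undefined` to the agent. The helper names below are hypothetical and are not part of Mastra.

```typescript
interface ThreadMetadata {
  threadId: string
  resourceId: string
}

// Type guard: true only when both fields are present and are strings.
function isThreadMetadata(value: unknown): value is ThreadMetadata {
  if (typeof value !== 'object' || value === null) return false
  const record = value as Record<string, unknown>
  return typeof record.threadId === 'string' && typeof record.resourceId === 'string'
}

// Throws early when a dataset item lacks the fields the task depends on.
function requireThreadMetadata(metadata: unknown): ThreadMetadata {
  if (!isThreadMetadata(metadata)) {
    throw new Error('Dataset item metadata must include string threadId and resourceId')
  }
  return metadata
}

const parsedMetadata = requireThreadMetadata({ threadId: 'ds-123', resourceId: 'ci-user' })
console.log(parsedMetadata.threadId, parsedMetadata.resourceId)
```

Inside the inline `task`, `requireThreadMetadata(metadata)` could replace the type assertion before the values are passed to `memory: { thread, resource }`.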
138
+
139
+ > **Note:** Visit [runEvals reference](https://mastra.ai/reference/evals/run-evals) and [Dataset reference](https://mastra.ai/reference/datasets/dataset) for full configuration.
140
+
141
+ ## Related
142
+
143
+ - [Running scorers in CI](https://mastra.ai/docs/evals/running-in-ci)
144
+ - [Running experiments](https://mastra.ai/docs/evals/datasets/running-experiments)
145
+ - [Observational memory](https://mastra.ai/docs/memory/observational-memory)
146
+ - [runEvals API reference](https://mastra.ai/reference/evals/run-evals)
@@ -121,4 +121,5 @@ describe('Weather Agent Tests', () => {
121
121
 
122
122
  - Learn about [creating custom scorers](https://mastra.ai/docs/evals/custom-scorers)
123
123
  - Explore [built-in scorers](https://mastra.ai/docs/evals/built-in-scorers)
124
+ - Run scorers against [memory-enabled agents](https://mastra.ai/docs/evals/evals-with-memory)
124
125
  - Read the [runEvals API reference](https://mastra.ai/reference/evals/run-evals)