@mastra/libsql 1.6.1 → 1.6.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +22 -0
- package/dist/docs/SKILL.md +50 -0
- package/dist/docs/assets/SOURCE_MAP.json +6 -0
- package/dist/docs/references/docs-agents-agent-approval.md +558 -0
- package/dist/docs/references/docs-agents-agent-memory.md +209 -0
- package/dist/docs/references/docs-agents-network-approval.md +275 -0
- package/dist/docs/references/docs-agents-networks.md +299 -0
- package/dist/docs/references/docs-memory-memory-processors.md +314 -0
- package/dist/docs/references/docs-memory-message-history.md +260 -0
- package/dist/docs/references/docs-memory-overview.md +45 -0
- package/dist/docs/references/docs-memory-semantic-recall.md +272 -0
- package/dist/docs/references/docs-memory-storage.md +261 -0
- package/dist/docs/references/docs-memory-working-memory.md +400 -0
- package/dist/docs/references/docs-observability-overview.md +70 -0
- package/dist/docs/references/docs-observability-tracing-exporters-default.md +209 -0
- package/dist/docs/references/docs-rag-retrieval.md +515 -0
- package/dist/docs/references/docs-workflows-snapshots.md +238 -0
- package/dist/docs/references/guides-agent-frameworks-ai-sdk.md +140 -0
- package/dist/docs/references/reference-core-getMemory.md +50 -0
- package/dist/docs/references/reference-core-listMemory.md +56 -0
- package/dist/docs/references/reference-core-mastra-class.md +66 -0
- package/dist/docs/references/reference-memory-memory-class.md +147 -0
- package/dist/docs/references/reference-storage-composite.md +235 -0
- package/dist/docs/references/reference-storage-dynamodb.md +282 -0
- package/dist/docs/references/reference-storage-libsql.md +135 -0
- package/dist/docs/references/reference-vectors-libsql.md +305 -0
- package/dist/index.cjs +14 -3
- package/dist/index.cjs.map +1 -1
- package/dist/index.js +14 -3
- package/dist/index.js.map +1 -1
- package/dist/storage/domains/memory/index.d.ts.map +1 -1
- package/dist/vector/index.d.ts.map +1 -1
- package/package.json +5 -5
@@ -0,0 +1,260 @@

# Message History

Message history is the most basic and important form of memory. It gives the LLM a view of recent messages in the context window, enabling your agent to reference earlier exchanges and respond coherently.

You can also retrieve message history to display past conversations in your UI.

> **Info:** Each message belongs to a thread (the conversation) and a resource (the user or entity it's associated with). See [Threads and resources](https://mastra.ai/docs/memory/storage) for more detail.

## Getting started

Install the Mastra memory module along with a [storage adapter](https://mastra.ai/docs/memory/storage) for your database. The examples below use `@mastra/libsql`, which stores data locally in a `mastra.db` file.

**npm**:

```bash
npm install @mastra/memory@latest @mastra/libsql@latest
```

**pnpm**:

```bash
pnpm add @mastra/memory@latest @mastra/libsql@latest
```

**Yarn**:

```bash
yarn add @mastra/memory@latest @mastra/libsql@latest
```

**Bun**:

```bash
bun add @mastra/memory@latest @mastra/libsql@latest
```

Message history requires a storage adapter to persist conversations. Configure storage on your Mastra instance if you haven't already:

```typescript
import { Mastra } from '@mastra/core'
import { LibSQLStore } from '@mastra/libsql'

export const mastra = new Mastra({
  storage: new LibSQLStore({
    id: 'mastra-storage',
    url: 'file:./mastra.db',
  }),
})
```

Give your agent a `Memory`:

```typescript
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'

export const agent = new Agent({
  id: 'test-agent',
  memory: new Memory({
    options: {
      lastMessages: 10,
    },
  }),
})
```

When you call the agent, messages are automatically saved to the database. You can specify a `threadId`, `resourceId`, and optional `metadata`:

**Generate**:

```typescript
await agent.generate('Hello', {
  memory: {
    thread: {
      id: 'thread-123',
      title: 'Support conversation',
      metadata: { category: 'billing' },
    },
    resource: 'user-456',
  },
})
```

**Stream**:

```typescript
await agent.stream('Hello', {
  memory: {
    thread: {
      id: 'thread-123',
      title: 'Support conversation',
      metadata: { category: 'billing' },
    },
    resource: 'user-456',
  },
})
```

> **Info:** Threads and messages are created automatically when you call `agent.generate()` or `agent.stream()`, but you can also create them manually with [`createThread()`](https://mastra.ai/reference/memory/createThread) and [`saveMessages()`](https://mastra.ai/reference/memory/memory-class).

There are two ways to use this history:

- **Automatic inclusion** - Mastra automatically fetches and includes recent messages in the context window. By default, it includes the last 10 messages, keeping agents grounded in the conversation. You can adjust this number with `lastMessages`, but in most cases you don't need to think about it.
- [**Manual querying**](#querying) - For more control, use the `recall()` function to query threads and messages directly. This lets you choose exactly which memories are included in the context window, or fetch messages to render conversation history in your UI.
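To build intuition for automatic inclusion, `lastMessages` behaves like a sliding window over the thread. A minimal sketch, illustrative only and not Mastra's actual implementation:

```typescript
type Msg = { role: 'user' | 'assistant'; content: string }

// Keep only the `limit` most recent messages, preserving order --
// roughly what `lastMessages: 10` contributes to the context window.
function windowMessages(history: Msg[], limit: number): Msg[] {
  return history.length <= limit ? history : history.slice(-limit)
}
```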
## Accessing Memory

To access memory functions for querying, cloning, or deleting threads and messages, call `getMemory()` on an agent:

```typescript
const agent = mastra.getAgent('weatherAgent')
const memory = await agent.getMemory()
```

The `Memory` instance gives you access to functions for listing threads, recalling messages, cloning conversations, and more.

## Querying

Use these methods to fetch threads and messages for displaying conversation history in your UI or for custom memory retrieval logic.

> **Warning:** The memory system does not enforce access control. Before running any query, verify in your application logic that the current user is authorized to access the `resourceId` being queried.

### Threads

Use [`listThreads()`](https://mastra.ai/reference/memory/listThreads) to retrieve threads for a resource:

```typescript
const result = await memory.listThreads({
  filter: { resourceId: 'user-123' },
  perPage: false,
})
```

Paginate through threads:

```typescript
const result = await memory.listThreads({
  filter: { resourceId: 'user-123' },
  page: 0,
  perPage: 10,
})

console.log(result.threads) // thread objects
console.log(result.hasMore) // more pages available?
```
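The `page`/`perPage`/`hasMore` shape supports a simple fetch-all loop. A sketch against a stubbed pager, where `fetchPage` stands in for a call like `memory.listThreads` and the result shape is simplified:

```typescript
type Page<T> = { items: T[]; hasMore: boolean }

// Request page 0, 1, 2, ... until hasMore is false, collecting all items.
async function fetchAll<T>(
  fetchPage: (page: number, perPage: number) => Promise<Page<T>>,
  perPage = 10,
): Promise<T[]> {
  const all: T[] = []
  for (let page = 0; ; page++) {
    const { items, hasMore } = await fetchPage(page, perPage)
    all.push(...items)
    if (!hasMore) return all
  }
}
```

In a real app the stub would be replaced by `memory.listThreads({ filter, page, perPage })`, reading `threads` and `hasMore` from each result.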
You can also filter by metadata and control sort order:

```typescript
const result = await memory.listThreads({
  filter: {
    resourceId: 'user-123',
    metadata: { status: 'active' },
  },
  orderBy: { field: 'createdAt', direction: 'DESC' },
})
```

To fetch a single thread by ID, use [`getThreadById()`](https://mastra.ai/reference/memory/getThreadById):

```typescript
const thread = await memory.getThreadById({ threadId: 'thread-123' })
```

### Messages

Once you have a thread, use [`recall()`](https://mastra.ai/reference/memory/recall) to retrieve its messages. It supports pagination, date filtering, and [semantic search](https://mastra.ai/docs/memory/semantic-recall).

Basic recall returns all messages from a thread:

```typescript
const { messages } = await memory.recall({
  threadId: 'thread-123',
  perPage: false,
})
```

Paginate through messages:

```typescript
const { messages } = await memory.recall({
  threadId: 'thread-123',
  page: 0,
  perPage: 50,
})
```

Filter by date range:

```typescript
const { messages } = await memory.recall({
  threadId: 'thread-123',
  filter: {
    dateRange: {
      start: new Date('2025-01-01'),
      end: new Date('2025-06-01'),
    },
  },
})
```

Fetch a single message by ID:

```typescript
const { messages } = await memory.recall({
  threadId: 'thread-123',
  include: [{ id: 'msg-123' }],
})
```

Fetch multiple messages by ID with surrounding context:

```typescript
const { messages } = await memory.recall({
  threadId: 'thread-123',
  include: [
    { id: 'msg-123' },
    {
      id: 'msg-456',
      withPreviousMessages: 3,
      withNextMessages: 1,
    },
  ],
})
```

Search by meaning (see [Semantic recall](https://mastra.ai/docs/memory/semantic-recall) for setup):

```typescript
const { messages } = await memory.recall({
  threadId: 'thread-123',
  vectorSearchString: 'project deadline discussion',
  threadConfig: {
    semanticRecall: true,
  },
})
```

### UI format

Message queries return `MastraDBMessage[]` format. To display messages in a frontend, you may need to convert them to a format your UI library expects. For example, [`toAISdkV5Messages`](https://mastra.ai/reference/ai-sdk/to-ai-sdk-v5-messages) converts messages to AI SDK UI format.
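If your UI library has no ready-made adapter, a hand-rolled mapping is often enough. A sketch using simplified, hypothetical shapes (real `MastraDBMessage` objects carry more fields than shown here):

```typescript
type DBMessage = { id: string; role: 'user' | 'assistant'; content: string }
type UIMessage = { key: string; author: string; text: string }

// Map stored messages to whatever shape your component tree renders.
function toUIMessages(messages: DBMessage[]): UIMessage[] {
  return messages.map(m => ({
    key: m.id,
    author: m.role === 'user' ? 'You' : 'Agent',
    text: m.content,
  }))
}
```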
## Thread cloning

Thread cloning creates a copy of an existing thread with its messages. This is useful for branching conversations, creating checkpoints before a potentially destructive operation, or testing variations of a conversation.

```typescript
const { thread, clonedMessages } = await memory.cloneThread({
  sourceThreadId: 'thread-123',
  title: 'Branched conversation',
})
```

You can filter which messages get cloned (by count or date range), specify custom thread IDs, and use utility methods to inspect clone relationships.

See [`cloneThread()`](https://mastra.ai/reference/memory/cloneThread) and [clone utilities](https://mastra.ai/reference/memory/clone-utilities) for the full API.

## Deleting messages

To remove messages from a thread, use [`deleteMessages()`](https://mastra.ai/reference/memory/deleteMessages). You can delete by message ID or clear all messages from a thread.
@@ -0,0 +1,45 @@

# Memory

Memory enables your agent to remember user messages, agent replies, and tool results across interactions, giving it the context it needs to stay consistent, maintain conversation flow, and produce better answers over time.

Mastra supports four complementary memory types:

- [**Message history**](https://mastra.ai/docs/memory/message-history) - keeps recent messages from the current conversation so they can be rendered in the UI and used to maintain short-term continuity within the exchange.
- [**Working memory**](https://mastra.ai/docs/memory/working-memory) - stores persistent, structured user data such as names, preferences, and goals.
- [**Semantic recall**](https://mastra.ai/docs/memory/semantic-recall) - retrieves relevant messages from older conversations based on semantic meaning rather than exact keywords, mirroring how humans recall information by association. Requires a [vector database](https://mastra.ai/docs/memory/semantic-recall) and an [embedding model](https://mastra.ai/docs/memory/semantic-recall).
- [**Observational memory**](https://mastra.ai/docs/memory/observational-memory) - uses background Observer and Reflector agents to maintain a dense observation log that replaces raw message history as it grows, keeping the context window small while preserving long-term memory across conversations.

If the combined memory exceeds the model's context limit, [memory processors](https://mastra.ai/docs/memory/memory-processors) can filter, trim, or prioritize content so the most relevant information is preserved.
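The trimming step can be pictured as dropping the oldest content until an estimate fits the budget. A deliberately simplified sketch; Mastra's processors are configurable and considerably more sophisticated:

```typescript
type Entry = { text: string; tokens: number }

// Drop the oldest entries until the total token estimate fits the budget.
function fitToBudget(entries: Entry[], budget: number): Entry[] {
  const kept = [...entries]
  let total = kept.reduce((sum, e) => sum + e.tokens, 0)
  while (kept.length > 0 && total > budget) {
    total -= kept.shift()!.tokens
  }
  return kept
}
```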
## Getting started

Choose a memory option to get started:

- [Message history](https://mastra.ai/docs/memory/message-history)
- [Working memory](https://mastra.ai/docs/memory/working-memory)
- [Semantic recall](https://mastra.ai/docs/memory/semantic-recall)
- [Observational memory](https://mastra.ai/docs/memory/observational-memory)

## Storage

Before enabling memory, you must first configure a storage adapter. Mastra supports several databases including PostgreSQL, MongoDB, libSQL, and [more](https://mastra.ai/docs/memory/storage).

Storage can be configured at the [instance level](https://mastra.ai/docs/memory/storage) (shared across all agents) or at the [agent level](https://mastra.ai/docs/memory/storage) (dedicated per agent).

For semantic recall, you can use a separate vector database like Pinecone alongside your primary storage.

See the [Storage](https://mastra.ai/docs/memory/storage) documentation for configuration options, supported providers, and examples.

## Debugging memory

When [tracing](https://mastra.ai/docs/observability/tracing/overview) is enabled, you can inspect exactly which messages the agent uses for context in each request. The trace output shows all memory included in the agent's context window - both recent message history and messages recalled via semantic recall.

This visibility helps you understand why an agent made specific decisions and verify that memory retrieval is working as expected.

## Next steps

- Learn more about [Storage](https://mastra.ai/docs/memory/storage) providers and configuration options
- Add [Message history](https://mastra.ai/docs/memory/message-history), [Working memory](https://mastra.ai/docs/memory/working-memory), [Semantic recall](https://mastra.ai/docs/memory/semantic-recall), or [Observational memory](https://mastra.ai/docs/memory/observational-memory)
- Visit the [Memory configuration reference](https://mastra.ai/reference/memory/memory-class) for all available options
@@ -0,0 +1,272 @@

# Semantic Recall

If you ask a friend what they did last weekend, they search their memory for events associated with "last weekend" and then tell you what they did. Semantic recall in Mastra works in a similar way.

> **Watch 📹:** What semantic recall is, how it works, and how to configure it in Mastra → [YouTube (5 minutes)](https://youtu.be/UVZtK8cK8xQ)

## How Semantic Recall Works

Semantic recall is RAG-based search that helps agents maintain context across longer interactions when messages are no longer within [recent message history](https://mastra.ai/docs/memory/message-history).

It uses vector embeddings of messages for similarity search, integrates with various vector stores, and has configurable context windows around retrieved messages.

When it's enabled, new messages are used to query a vector DB for semantically similar messages.

After getting a response from the LLM, all new messages (user, assistant, and tool calls/results) are inserted into the vector DB to be recalled in later interactions.
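Under the hood, that query is a nearest-neighbor search over stored embeddings. To build intuition, a toy sketch that ranks vectors by cosine similarity; real vector stores use approximate indexes rather than a linear scan:

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb))
}

// Return the ids of the k stored vectors most similar to the query.
function topK(query: number[], store: { id: string; vec: number[] }[], k: number): string[] {
  return [...store]
    .sort((x, y) => cosine(query, y.vec) - cosine(query, x.vec))
    .slice(0, k)
    .map(e => e.id)
}
```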
## Quick Start

Semantic recall is enabled by default, so if you give your agent memory it will be included:

```typescript
import { Agent } from '@mastra/core/agent'
import { Memory } from '@mastra/memory'

const agent = new Agent({
  id: 'support-agent',
  name: 'SupportAgent',
  instructions: 'You are a helpful support agent.',
  model: 'openai/gpt-5.1',
  memory: new Memory(),
})
```

## Using the recall() Method

While `listMessages` retrieves messages by thread ID with basic pagination, [`recall()`](https://mastra.ai/reference/memory/recall) adds support for **semantic search**. When you need to find messages by meaning rather than just recency, use `recall()` with a `vectorSearchString`:

```typescript
const memory = await agent.getMemory()

// Basic recall - similar to listMessages
const { messages } = await memory!.recall({
  threadId: 'thread-123',
  perPage: 50,
})

// Semantic recall - find messages by meaning
const { messages: relevantMessages } = await memory!.recall({
  threadId: 'thread-123',
  vectorSearchString: 'What did we discuss about the project deadline?',
  threadConfig: {
    semanticRecall: true,
  },
})
```

## Storage configuration

Semantic recall relies on a [storage and vector db](https://mastra.ai/reference/memory/memory-class) to store messages and their embeddings.

```ts
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'
import { LibSQLStore, LibSQLVector } from '@mastra/libsql'

const agent = new Agent({
  memory: new Memory({
    // this is the default storage db if omitted
    storage: new LibSQLStore({
      id: 'agent-storage',
      url: 'file:./local.db',
    }),
    // this is the default vector db if omitted
    vector: new LibSQLVector({
      id: 'agent-vector',
      url: 'file:./local.db',
    }),
  }),
})
```

Each vector store page below includes installation instructions, configuration parameters, and usage examples:

- [Astra](https://mastra.ai/reference/vectors/astra)
- [Chroma](https://mastra.ai/reference/vectors/chroma)
- [Cloudflare Vectorize](https://mastra.ai/reference/vectors/vectorize)
- [Convex](https://mastra.ai/reference/vectors/convex)
- [Couchbase](https://mastra.ai/reference/vectors/couchbase)
- [DuckDB](https://mastra.ai/reference/vectors/duckdb)
- [Elasticsearch](https://mastra.ai/reference/vectors/elasticsearch)
- [LanceDB](https://mastra.ai/reference/vectors/lance)
- [libSQL](https://mastra.ai/reference/vectors/libsql)
- [MongoDB](https://mastra.ai/reference/vectors/mongodb)
- [OpenSearch](https://mastra.ai/reference/vectors/opensearch)
- [Pinecone](https://mastra.ai/reference/vectors/pinecone)
- [PostgreSQL](https://mastra.ai/reference/vectors/pg)
- [Qdrant](https://mastra.ai/reference/vectors/qdrant)
- [S3 Vectors](https://mastra.ai/reference/vectors/s3vectors)
- [Turbopuffer](https://mastra.ai/reference/vectors/turbopuffer)
- [Upstash](https://mastra.ai/reference/vectors/upstash)

## Recall configuration

The three main parameters that control semantic recall behavior are:

1. **topK**: How many semantically similar messages to retrieve
2. **messageRange**: How much surrounding context to include with each match
3. **scope**: Whether to search within the current thread or across all threads owned by a resource (the default is resource scope)

```typescript
const agent = new Agent({
  memory: new Memory({
    options: {
      semanticRecall: {
        topK: 3, // Retrieve 3 most similar messages
        messageRange: 2, // Include 2 messages before and after each match
        scope: 'resource', // Search across all threads for this user (default setting if omitted)
      },
    },
  }),
})
```
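The interplay of `topK` and `messageRange` can be sketched as index expansion around each match. Illustrative only, assuming messages are ordered oldest-first:

```typescript
// Expand each matched message index by `range` on both sides, clamp to the
// thread bounds, deduplicate, and return indices in chronological order.
function expandMatches(total: number, matches: number[], range: number): number[] {
  const keep = new Set<number>()
  for (const m of matches) {
    const lo = Math.max(0, m - range)
    const hi = Math.min(total - 1, m + range)
    for (let i = lo; i <= hi; i++) keep.add(i)
  }
  return [...keep].sort((a, b) => a - b)
}
```

With `topK: 3` the vector search would supply up to three entries in `matches`, and `messageRange: 2` corresponds to `range = 2`.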
## Embedder configuration

Semantic recall relies on an [embedding model](https://mastra.ai/reference/memory/memory-class) to convert messages into embeddings. Mastra supports embedding models through the model router using `provider/model` strings, or you can use any [embedding model](https://sdk.vercel.ai/docs/ai-sdk-core/embeddings) compatible with the AI SDK.

### Using the Model Router (Recommended)

The simplest way is to use a `provider/model` string with autocomplete support:

```ts
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'

const agent = new Agent({
  memory: new Memory({
    embedder: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  }),
})
```

Supported embedding models:

- **OpenAI**: `text-embedding-3-small`, `text-embedding-3-large`, `text-embedding-ada-002`
- **Google**: `gemini-embedding-001`

The model router automatically handles API key detection from environment variables (`OPENAI_API_KEY`, `GOOGLE_GENERATIVE_AI_API_KEY`).

### Using AI SDK Packages

You can also use AI SDK embedding models directly. For example, with the `@ai-sdk/openai` package:

```ts
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'
import { openai } from '@ai-sdk/openai'

const agent = new Agent({
  memory: new Memory({
    embedder: openai.embedding('text-embedding-3-small'),
  }),
})
```

### Using FastEmbed (Local)

To use FastEmbed (a local embedding model), install `@mastra/fastembed`:

**npm**:

```bash
npm install @mastra/fastembed@latest
```

**pnpm**:

```bash
pnpm add @mastra/fastembed@latest
```

**Yarn**:

```bash
yarn add @mastra/fastembed@latest
```

**Bun**:

```bash
bun add @mastra/fastembed@latest
```

Then configure it in your memory:

```ts
import { Memory } from '@mastra/memory'
import { Agent } from '@mastra/core/agent'
import { fastembed } from '@mastra/fastembed'

const agent = new Agent({
  memory: new Memory({
    embedder: fastembed,
  }),
})
```

## PostgreSQL Index Optimization

When using PostgreSQL as your vector store, you can optimize semantic recall performance by configuring the vector index. This is particularly important for large-scale deployments with thousands of messages.

PostgreSQL supports both IVFFlat and HNSW indexes. By default, Mastra creates an IVFFlat index, but HNSW indexes typically provide better performance, especially with OpenAI embeddings, which use inner product distance.

```typescript
import { Agent } from '@mastra/core/agent'
import { Memory } from '@mastra/memory'
import { PgStore, PgVector } from '@mastra/pg'

const agent = new Agent({
  memory: new Memory({
    storage: new PgStore({
      id: 'agent-storage',
      connectionString: process.env.DATABASE_URL,
    }),
    vector: new PgVector({
      id: 'agent-vector',
      connectionString: process.env.DATABASE_URL,
    }),
    options: {
      semanticRecall: {
        topK: 5,
        messageRange: 2,
        indexConfig: {
          type: 'hnsw', // Use HNSW for better performance
          metric: 'dotproduct', // Best for OpenAI embeddings
          m: 16, // Number of bi-directional links (default: 16)
          efConstruction: 64, // Size of candidate list during construction (default: 64)
        },
      },
    },
  }),
})
```

For detailed information about index configuration options and performance tuning, see the [PgVector configuration guide](https://mastra.ai/reference/vectors/pg).

## Disabling

There is a performance impact to using semantic recall. New messages are converted into embeddings and used to query a vector database before they are sent to the LLM.

Semantic recall is enabled by default but can be disabled when not needed:

```typescript
const agent = new Agent({
  memory: new Memory({
    options: {
      semanticRecall: false,
    },
  }),
})
```

You might want to disable semantic recall in scenarios like:

- When message history provides sufficient context for the current conversation.
- In performance-sensitive applications, like realtime two-way audio, where the added latency of creating embeddings and running vector queries is noticeable.

## Viewing Recalled Messages

When tracing is enabled, any messages retrieved via semantic recall will appear in the agent's trace output, alongside recent message history (if configured).