@mastra/memory 1.0.0-beta.1 → 1.0.0-beta.11
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +355 -0
- package/dist/_types/@internal_ai-sdk-v4/dist/index.d.ts +7549 -0
- package/dist/chunk-DGUM43GV.js +10 -0
- package/dist/chunk-DGUM43GV.js.map +1 -0
- package/dist/chunk-JEQ2X3Z6.cjs +12 -0
- package/dist/chunk-JEQ2X3Z6.cjs.map +1 -0
- package/dist/chunk-KMQS2YEC.js +79 -0
- package/dist/chunk-KMQS2YEC.js.map +1 -0
- package/dist/chunk-MMUHFOCG.js +79 -0
- package/dist/chunk-MMUHFOCG.js.map +1 -0
- package/dist/chunk-QY6BZOPJ.js +250 -0
- package/dist/chunk-QY6BZOPJ.js.map +1 -0
- package/dist/chunk-SG3GRV3O.cjs +84 -0
- package/dist/chunk-SG3GRV3O.cjs.map +1 -0
- package/dist/chunk-W72AYUIF.cjs +252 -0
- package/dist/chunk-W72AYUIF.cjs.map +1 -0
- package/dist/chunk-WC4XBMZT.js +250 -0
- package/dist/chunk-WC4XBMZT.js.map +1 -0
- package/dist/chunk-YMNW6DEN.cjs +252 -0
- package/dist/chunk-YMNW6DEN.cjs.map +1 -0
- package/dist/chunk-ZUQPUTTO.cjs +84 -0
- package/dist/chunk-ZUQPUTTO.cjs.map +1 -0
- package/dist/docs/README.md +36 -0
- package/dist/docs/SKILL.md +42 -0
- package/dist/docs/SOURCE_MAP.json +31 -0
- package/dist/docs/agents/01-agent-memory.md +160 -0
- package/dist/docs/agents/02-networks.md +236 -0
- package/dist/docs/agents/03-agent-approval.md +317 -0
- package/dist/docs/core/01-reference.md +114 -0
- package/dist/docs/memory/01-overview.md +76 -0
- package/dist/docs/memory/02-storage.md +181 -0
- package/dist/docs/memory/03-working-memory.md +386 -0
- package/dist/docs/memory/04-semantic-recall.md +235 -0
- package/dist/docs/memory/05-memory-processors.md +319 -0
- package/dist/docs/memory/06-reference.md +617 -0
- package/dist/docs/processors/01-reference.md +81 -0
- package/dist/docs/storage/01-reference.md +972 -0
- package/dist/docs/vectors/01-reference.md +929 -0
- package/dist/index.cjs +14845 -115
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.ts +145 -5
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +14807 -119
- package/dist/index.js.map +1 -1
- package/dist/token-6GSAFR2W-JV3TZR4M.cjs +63 -0
- package/dist/token-6GSAFR2W-JV3TZR4M.cjs.map +1 -0
- package/dist/token-6GSAFR2W-K2BTU23I.js +61 -0
- package/dist/token-6GSAFR2W-K2BTU23I.js.map +1 -0
- package/dist/token-6GSAFR2W-VLY2XUPA.js +61 -0
- package/dist/token-6GSAFR2W-VLY2XUPA.js.map +1 -0
- package/dist/token-6GSAFR2W-YCB5SK2Z.cjs +63 -0
- package/dist/token-6GSAFR2W-YCB5SK2Z.cjs.map +1 -0
- package/dist/token-util-NEHG7TUY-7IL6JUVY.cjs +10 -0
- package/dist/token-util-NEHG7TUY-7IL6JUVY.cjs.map +1 -0
- package/dist/token-util-NEHG7TUY-HF7KBP2H.cjs +10 -0
- package/dist/token-util-NEHG7TUY-HF7KBP2H.cjs.map +1 -0
- package/dist/token-util-NEHG7TUY-KSXDO2NO.js +8 -0
- package/dist/token-util-NEHG7TUY-KSXDO2NO.js.map +1 -0
- package/dist/token-util-NEHG7TUY-TIJ3LMSH.js +8 -0
- package/dist/token-util-NEHG7TUY-TIJ3LMSH.js.map +1 -0
- package/dist/tools/working-memory.d.ts +10 -2
- package/dist/tools/working-memory.d.ts.map +1 -1
- package/package.json +19 -25
- package/dist/processors/index.cjs +0 -165
- package/dist/processors/index.cjs.map +0 -1
- package/dist/processors/index.d.ts +0 -3
- package/dist/processors/index.d.ts.map +0 -1
- package/dist/processors/index.js +0 -158
- package/dist/processors/index.js.map +0 -1
- package/dist/processors/token-limiter.d.ts +0 -32
- package/dist/processors/token-limiter.d.ts.map +0 -1
- package/dist/processors/tool-call-filter.d.ts +0 -20
- package/dist/processors/tool-call-filter.d.ts.map +0 -1
package/dist/docs/memory/04-semantic-recall.md
@@ -0,0 +1,235 @@

> Learn how to use semantic recall in Mastra to retrieve relevant messages from past conversations using vector search and embeddings.

# Semantic Recall

If you ask your friend what they did last weekend, they will search in their memory for events associated with "last weekend" and then tell you what they did. That's sort of like how semantic recall works in Mastra.

> **Watch 📹**

What semantic recall is, how it works, and how to configure it in Mastra → [YouTube (5 minutes)](https://youtu.be/UVZtK8cK8xQ)

## How Semantic Recall Works

Semantic recall is RAG-based search that helps agents maintain context across longer interactions when messages are no longer within [recent message history](./message-history).

It uses vector embeddings of messages for similarity search, integrates with various vector stores, and has configurable context windows around retrieved messages.



When it's enabled, new messages are used to query a vector DB for semantically similar messages.

After getting a response from the LLM, all new messages (user, assistant, and tool calls/results) are inserted into the vector DB to be recalled in later interactions.

## Quick Start

Semantic recall is enabled by default, so if you give your agent memory it will be included:

```typescript {9}
import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";

const agent = new Agent({
  id: "support-agent",
  name: "SupportAgent",
  instructions: "You are a helpful support agent.",
  model: "openai/gpt-5.1",
  memory: new Memory(),
});
```
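
Semantic recall searches messages that were stored under a thread and resource, so calls to the agent need those identifiers. A minimal sketch, assuming the `memory: { thread, resource }` option on `generate` (adjust to the API of your Mastra version; the IDs are placeholders):

```typescript
// Hypothetical IDs; recall searches past messages stored for this resource
const response = await agent.generate("What did we decide about the pricing page?", {
  memory: {
    thread: "thread-123",
    resource: "user-456",
  },
});

console.log(response.text);
```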

## Storage configuration

Semantic recall relies on a [storage and vector db](https://mastra.ai/reference/v1/memory/memory-class) to store messages and their embeddings.

```ts {8-16}
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";
import { LibSQLStore, LibSQLVector } from "@mastra/libsql";

const agent = new Agent({
  memory: new Memory({
    // this is the default storage db if omitted
    storage: new LibSQLStore({
      id: 'agent-storage',
      url: "file:./local.db",
    }),
    // this is the default vector db if omitted
    vector: new LibSQLVector({
      id: 'agent-vector',
      connectionUrl: "file:./local.db",
    }),
  }),
});
```

Each vector store page below includes installation instructions, configuration parameters, and usage examples:

- [Astra](https://mastra.ai/reference/v1/vectors/astra)
- [Chroma](https://mastra.ai/reference/v1/vectors/chroma)
- [Cloudflare Vectorize](https://mastra.ai/reference/v1/vectors/vectorize)
- [Convex](https://mastra.ai/reference/v1/vectors/convex)
- [Couchbase](https://mastra.ai/reference/v1/vectors/couchbase)
- [DuckDB](https://mastra.ai/reference/v1/vectors/duckdb)
- [Elasticsearch](https://mastra.ai/reference/v1/vectors/elasticsearch)
- [LanceDB](https://mastra.ai/reference/v1/vectors/lance)
- [libSQL](https://mastra.ai/reference/v1/vectors/libsql)
- [MongoDB](https://mastra.ai/reference/v1/vectors/mongodb)
- [OpenSearch](https://mastra.ai/reference/v1/vectors/opensearch)
- [Pinecone](https://mastra.ai/reference/v1/vectors/pinecone)
- [PostgreSQL](https://mastra.ai/reference/v1/vectors/pg)
- [Qdrant](https://mastra.ai/reference/v1/vectors/qdrant)
- [S3 Vectors](https://mastra.ai/reference/v1/vectors/s3vectors)
- [Turbopuffer](https://mastra.ai/reference/v1/vectors/turbopuffer)
- [Upstash](https://mastra.ai/reference/v1/vectors/upstash)

## Recall configuration

The three main parameters that control semantic recall behavior are:

1. **topK**: How many semantically similar messages to retrieve
2. **messageRange**: How much surrounding context to include with each match
3. **scope**: Whether to search within the current thread or across all threads owned by a resource (the default is resource scope).

```typescript {5-7}
const agent = new Agent({
  memory: new Memory({
    options: {
      semanticRecall: {
        topK: 3, // Retrieve 3 most similar messages
        messageRange: 2, // Include 2 messages before and after each match
        scope: "resource", // Search across all threads for this user (default setting if omitted)
      },
    },
  }),
});
```
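
As a rough sizing example: with `topK: 3` and `messageRange: 2`, each match can pull in up to 1 + 2 + 2 = 5 messages, so up to 3 × 5 = 15 recalled messages may be added to the context per request (fewer when matches overlap or sit at the edges of a thread), in addition to recent message history.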

## Embedder configuration

Semantic recall relies on an [embedding model](https://mastra.ai/reference/v1/memory/memory-class) to convert messages into embeddings. Mastra supports embedding models through the model router using `provider/model` strings, or you can use any [embedding model](https://sdk.vercel.ai/docs/ai-sdk-core/embeddings) compatible with the AI SDK.

#### Using the Model Router (Recommended)

The simplest way is to use a `provider/model` string with autocomplete support:

```ts {7}
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";
import { ModelRouterEmbeddingModel } from "@mastra/core/llm";

const agent = new Agent({
  memory: new Memory({
    embedder: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  }),
});
```

Supported embedding models:

- **OpenAI**: `text-embedding-3-small`, `text-embedding-3-large`, `text-embedding-ada-002`
- **Google**: `gemini-embedding-001`, `text-embedding-004`

The model router automatically handles API key detection from environment variables (`OPENAI_API_KEY`, `GOOGLE_GENERATIVE_AI_API_KEY`).
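
For example, it is enough to export the key for your provider before starting the app (illustrative placeholder values):

```bash
# Use whichever variable matches the provider in your provider/model string
export OPENAI_API_KEY=sk-...
# or, for Google embedding models:
export GOOGLE_GENERATIVE_AI_API_KEY=...
```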

#### Using AI SDK Packages

You can also use AI SDK embedding models directly:

```ts {2,7}
import { Memory } from "@mastra/memory";
import { openai } from "@ai-sdk/openai";
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  memory: new Memory({
    embedder: openai.embedding("text-embedding-3-small"),
  }),
});
```

#### Using FastEmbed (Local)

To use FastEmbed (a local embedding model), install `@mastra/fastembed`:

```bash npm2yarn
npm install @mastra/fastembed@beta
```

Then configure it in your memory:

```ts {3,7}
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";
import { fastembed } from "@mastra/fastembed";

const agent = new Agent({
  memory: new Memory({
    embedder: fastembed,
  }),
});
```

## PostgreSQL Index Optimization

When using PostgreSQL as your vector store, you can optimize semantic recall performance by configuring the vector index. This is particularly important for large-scale deployments with thousands of messages.

PostgreSQL supports both IVFFlat and HNSW indexes. By default, Mastra creates an IVFFlat index, but HNSW indexes typically provide better performance, especially with OpenAI embeddings which use inner product distance.

```typescript {19-24}
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";
import { PgStore, PgVector } from "@mastra/pg";

const agent = new Agent({
  memory: new Memory({
    storage: new PgStore({
      id: 'agent-storage',
      connectionString: process.env.DATABASE_URL,
    }),
    vector: new PgVector({
      id: 'agent-vector',
      connectionString: process.env.DATABASE_URL,
    }),
    options: {
      semanticRecall: {
        topK: 5,
        messageRange: 2,
        indexConfig: {
          type: "hnsw", // Use HNSW for better performance
          metric: "dotproduct", // Best for OpenAI embeddings
          m: 16, // Number of bi-directional links (default: 16)
          efConstruction: 64, // Size of candidate list during construction (default: 64)
        },
      },
    },
  }),
});
```

For detailed information about index configuration options and performance tuning, see the [PgVector configuration guide](https://mastra.ai/reference/v1/vectors/pg#index-configuration-guide).

## Disabling

There is a performance impact to using semantic recall. New messages are converted into embeddings and used to query a vector database before new messages are sent to the LLM.

Semantic recall is enabled by default but can be disabled when not needed:

```typescript {4}
const agent = new Agent({
  memory: new Memory({
    options: {
      semanticRecall: false,
    },
  }),
});
```

You might want to disable semantic recall in scenarios like:

- When message history provides sufficient context for the current conversation.
- In performance-sensitive applications, like realtime two-way audio, where the added latency of creating embeddings and running vector queries is noticeable.

## Viewing Recalled Messages

When tracing is enabled, any messages retrieved via semantic recall will appear in the agent's trace output, alongside recent message history (if configured).

For more info on viewing message traces, see [Viewing Retrieved Messages](./overview#viewing-retrieved-messages).
package/dist/docs/memory/05-memory-processors.md
@@ -0,0 +1,319 @@

> Learn how to use memory processors in Mastra to filter, trim, and transform messages before they are sent to the language model.

# Memory Processors

Memory processors transform and filter messages as they pass through an agent with memory enabled. They manage context window limits, remove unnecessary content, and optimize the information sent to the language model.

When memory is enabled on an agent, Mastra adds memory processors to the agent's processor pipeline. These processors retrieve message history, working memory, and semantically relevant messages, then persist new messages after the model responds.

Memory processors are [processors](https://mastra.ai/docs/v1/agents/processors) that operate specifically on memory-related messages and state.

## Built-in Memory Processors

Mastra automatically adds these processors when memory is enabled:

### MessageHistory

Retrieves message history and persists new messages.

**When you configure:**

```typescript
memory: new Memory({
  lastMessages: 10,
});
```

**Mastra internally:**

1. Creates a `MessageHistory` processor with `limit: 10`
2. Adds it to the agent's input processors (runs before the LLM)
3. Adds it to the agent's output processors (runs after the LLM)

**What it does:**

- **Input**: Fetches the last 10 messages from storage and prepends them to the conversation
- **Output**: Persists new messages to storage after the model responds

**Example:**

```typescript
import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";
import { LibSQLStore } from "@mastra/libsql";
import { openai } from "@ai-sdk/openai";

const agent = new Agent({
  id: "test-agent",
  name: "Test Agent",
  instructions: "You are a helpful assistant",
  model: 'openai/gpt-4o',
  memory: new Memory({
    storage: new LibSQLStore({
      id: "memory-store",
      url: "file:memory.db",
    }),
    lastMessages: 10, // MessageHistory processor automatically added
  }),
});
```

### SemanticRecall

Retrieves semantically relevant messages based on the current input and creates embeddings for new messages.

**When you configure:**

```typescript
memory: new Memory({
  semanticRecall: { enabled: true },
  vector: myVectorStore,
  embedder: myEmbedder,
});
```

**Mastra internally:**

1. Creates a `SemanticRecall` processor
2. Adds it to the agent's input processors (runs before the LLM)
3. Adds it to the agent's output processors (runs after the LLM)
4. Requires both a vector store and embedder to be configured

**What it does:**

- **Input**: Performs vector similarity search to find relevant past messages and prepends them to the conversation
- **Output**: Creates embeddings for new messages and stores them in the vector store for future retrieval

**Example:**

```typescript
import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";
import { LibSQLStore } from "@mastra/libsql";
import { PineconeVector } from "@mastra/pinecone";
import { OpenAIEmbedder } from "@mastra/openai";
import { openai } from "@ai-sdk/openai";

const agent = new Agent({
  name: "semantic-agent",
  instructions: "You are a helpful assistant with semantic memory",
  model: 'openai/gpt-4o',
  memory: new Memory({
    storage: new LibSQLStore({
      id: "memory-store",
      url: "file:memory.db",
    }),
    vector: new PineconeVector({
      id: "memory-vector",
      apiKey: process.env.PINECONE_API_KEY!,
      environment: "us-east-1",
    }),
    embedder: new OpenAIEmbedder({
      model: "text-embedding-3-small",
      apiKey: process.env.OPENAI_API_KEY!,
    }),
    semanticRecall: { enabled: true }, // SemanticRecall processor automatically added
  }),
});
```

### WorkingMemory

Manages working memory state across conversations.

**When you configure:**

```typescript
memory: new Memory({
  workingMemory: { enabled: true },
});
```

**Mastra internally:**

1. Creates a `WorkingMemory` processor
2. Adds it to the agent's input processors (runs before the LLM)
3. Requires a storage adapter to be configured

**What it does:**

- **Input**: Retrieves working memory state for the current thread and prepends it to the conversation
- **Output**: No output processing

**Example:**

```typescript
import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";
import { LibSQLStore } from "@mastra/libsql";
import { openai } from "@ai-sdk/openai";

const agent = new Agent({
  name: "working-memory-agent",
  instructions: "You are an assistant with working memory",
  model: 'openai/gpt-4o',
  memory: new Memory({
    storage: new LibSQLStore({
      id: "memory-store",
      url: "file:memory.db",
    }),
    workingMemory: { enabled: true }, // WorkingMemory processor automatically added
  }),
});
```
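
Working memory can also be seeded with a template describing what the agent should track. A minimal sketch, assuming the `template` option on `workingMemory` (see the working memory guide for the exact shape; the fields below are placeholders):

```typescript
memory: new Memory({
  storage: new LibSQLStore({ id: "memory-store", url: "file:memory.db" }),
  workingMemory: {
    enabled: true,
    // Hypothetical template: the agent fills in these fields as it learns them
    template: `# User Profile
- Name:
- Timezone:
- Current goal:
`,
  },
});
```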

## Manual Control and Deduplication

If you manually add a memory processor to `inputProcessors` or `outputProcessors`, Mastra will **not** automatically add it. This gives you full control over processor ordering:

```typescript
import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";
import { MessageHistory } from "@mastra/memory/processors";
import { TokenLimiter } from "@mastra/core/processors";
import { LibSQLStore } from "@mastra/libsql";
import { openai } from "@ai-sdk/openai";

// Custom MessageHistory with different configuration
const customMessageHistory = new MessageHistory({
  storage: new LibSQLStore({ id: "memory-store", url: "file:memory.db" }),
  lastMessages: 20,
});

const agent = new Agent({
  name: "custom-memory-agent",
  instructions: "You are a helpful assistant",
  model: 'openai/gpt-4o',
  memory: new Memory({
    storage: new LibSQLStore({ id: "memory-store", url: "file:memory.db" }),
    lastMessages: 10, // This would normally add MessageHistory(10)
  }),
  inputProcessors: [
    customMessageHistory, // Your custom one is used instead
    new TokenLimiter({ limit: 4000 }), // Runs after your custom MessageHistory
  ],
});
```

## Processor Execution Order

Understanding the execution order is important when combining guardrails with memory:

### Input Processors

```
[Memory Processors] → [Your inputProcessors]
```

1. **Memory processors run FIRST**: `WorkingMemory`, `MessageHistory`, `SemanticRecall`
2. **Your input processors run AFTER**: guardrails, filters, validators

This means memory loads message history before your processors can validate or filter the input.
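
A quick way to observe this is a pass-through input processor that only logs what it receives; because memory processors run first, the logged messages already include loaded history. A minimal sketch modeled on the guardrail processors shown below (the exact processor interface may vary by version):

```typescript
import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";

// Pass-through processor: logs how many messages memory has already loaded, changes nothing
const historyLogger = {
  id: "history-logger",
  processInput: async ({ messages }) => {
    console.log(`Input processors see ${messages.length} messages (loaded history + new input)`);
    return messages;
  },
};

const agent = new Agent({
  name: "ordered-agent",
  instructions: "You are a helpful assistant",
  model: 'openai/gpt-4o',
  memory: new Memory({ lastMessages: 10 }),
  inputProcessors: [historyLogger], // runs after the built-in memory processors
});
```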

### Output Processors

```
[Your outputProcessors] → [Memory Processors]
```

1. **Your output processors run FIRST**: guardrails, filters, validators
2. **Memory processors run AFTER**: `SemanticRecall` (embeddings), `MessageHistory` (persistence)

This ordering is designed to be **safe by default**: if your output guardrail calls `abort()`, the memory processors never run and **no messages are saved**.

## Guardrails and Memory

The default execution order provides safe guardrail behavior:

### Output guardrails (recommended)

Output guardrails run **before** memory processors save messages. If a guardrail aborts:

- The tripwire is triggered
- Memory processors are skipped
- **No messages are persisted to storage**

```typescript
import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";
import { openai } from "@ai-sdk/openai";

// Output guardrail that blocks inappropriate content
const contentBlocker = {
  id: "content-blocker",
  processOutputResult: async ({ messages, abort }) => {
    const hasInappropriateContent = messages.some((msg) =>
      containsBadContent(msg)
    );
    if (hasInappropriateContent) {
      abort("Content blocked by guardrail");
    }
    return messages;
  },
};

const agent = new Agent({
  name: "safe-agent",
  instructions: "You are a helpful assistant",
  model: 'openai/gpt-4o',
  memory: new Memory({ lastMessages: 10 }),
  // Your guardrail runs BEFORE memory saves
  outputProcessors: [contentBlocker],
});

// If the guardrail aborts, nothing is saved to memory
const result = await agent.generate("Hello");
if (result.tripwire) {
  console.log("Blocked:", result.tripwireReason);
  // Memory is empty - no messages were persisted
}
```

### Input guardrails

Input guardrails run **after** memory processors load history. If a guardrail aborts:

- The tripwire is triggered
- The LLM is never called
- Output processors (including memory persistence) are skipped
- **No messages are persisted to storage**

```typescript
// Input guardrail that validates user input
const inputValidator = {
  id: "input-validator",
  processInput: async ({ messages, abort }) => {
    const lastUserMessage = messages.findLast((m) => m.role === "user");
    if (isInvalidInput(lastUserMessage)) {
      abort("Invalid input detected");
    }
    return messages;
  },
};

const agent = new Agent({
  name: "validated-agent",
  instructions: "You are a helpful assistant",
  model: 'openai/gpt-4o',
  memory: new Memory({ lastMessages: 10 }),
  // Your guardrail runs AFTER memory loads history
  inputProcessors: [inputValidator],
});
```

### Summary

| Guardrail Type | When it runs               | If it aborts                  |
| -------------- | -------------------------- | ----------------------------- |
| Input          | After memory loads history | LLM not called, nothing saved |
| Output         | Before memory saves        | Nothing saved to storage      |

Both scenarios are safe: guardrails prevent inappropriate content from being persisted to memory.

## Related documentation

- [Processors](https://mastra.ai/docs/v1/agents/processors) - General processor concepts and custom processor creation
- [Guardrails](https://mastra.ai/docs/v1/agents/guardrails) - Security and validation processors
- [Memory Overview](https://mastra.ai/docs/v1/memory/overview) - Memory types and configuration

When creating custom processors, avoid mutating the input `messages` array or its objects directly.
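For instance, a filtering processor can return a new array instead of splicing the one it received. A minimal sketch, following the processor shape used in the guardrail examples above (`isInternalNote` is a hypothetical helper):

```typescript
const internalNoteFilter = {
  id: "internal-note-filter",
  processInput: async ({ messages }) => {
    // Build and return a new array rather than mutating `messages` in place
    return messages.filter((message) => !isInternalNote(message));
  },
};
```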