@mastra/memory 1.1.0 → 1.2.0-alpha.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (95) hide show
  1. package/CHANGELOG.md +70 -0
  2. package/dist/_types/@internal_ai-sdk-v4/dist/index.d.ts +30 -17
  3. package/dist/{chunk-6TXUWFIU.js → chunk-5YW6JV6Y.js} +1958 -321
  4. package/dist/chunk-5YW6JV6Y.js.map +1 -0
  5. package/dist/{chunk-FQJWVCDF.cjs → chunk-7SCXX4S7.cjs} +1957 -320
  6. package/dist/chunk-7SCXX4S7.cjs.map +1 -0
  7. package/dist/chunk-EQ4M72KU.js +439 -0
  8. package/dist/chunk-EQ4M72KU.js.map +1 -0
  9. package/dist/{chunk-O3CS4UGX.cjs → chunk-IDRQZVB4.cjs} +4 -4
  10. package/dist/{chunk-O3CS4UGX.cjs.map → chunk-IDRQZVB4.cjs.map} +1 -1
  11. package/dist/{chunk-YF4R74L2.js → chunk-RC6RZVYE.js} +4 -4
  12. package/dist/{chunk-YF4R74L2.js.map → chunk-RC6RZVYE.js.map} +1 -1
  13. package/dist/chunk-ZD3BKU5O.cjs +441 -0
  14. package/dist/chunk-ZD3BKU5O.cjs.map +1 -0
  15. package/dist/docs/SKILL.md +51 -50
  16. package/dist/docs/{SOURCE_MAP.json → assets/SOURCE_MAP.json} +22 -22
  17. package/dist/docs/{agents/03-agent-approval.md → references/docs-agents-agent-approval.md} +19 -19
  18. package/dist/docs/references/docs-agents-agent-memory.md +212 -0
  19. package/dist/docs/{agents/04-network-approval.md → references/docs-agents-network-approval.md} +13 -12
  20. package/dist/docs/{agents/02-networks.md → references/docs-agents-networks.md} +10 -12
  21. package/dist/docs/{memory/06-memory-processors.md → references/docs-memory-memory-processors.md} +6 -8
  22. package/dist/docs/{memory/03-message-history.md → references/docs-memory-message-history.md} +31 -20
  23. package/dist/docs/references/docs-memory-observational-memory.md +169 -0
  24. package/dist/docs/{memory/01-overview.md → references/docs-memory-overview.md} +8 -8
  25. package/dist/docs/{memory/05-semantic-recall.md → references/docs-memory-semantic-recall.md} +33 -17
  26. package/dist/docs/{memory/02-storage.md → references/docs-memory-storage.md} +29 -39
  27. package/dist/docs/{memory/04-working-memory.md → references/docs-memory-working-memory.md} +16 -27
  28. package/dist/docs/references/reference-core-getMemory.md +50 -0
  29. package/dist/docs/references/reference-core-listMemory.md +56 -0
  30. package/dist/docs/references/reference-memory-clone-utilities.md +199 -0
  31. package/dist/docs/references/reference-memory-cloneThread.md +130 -0
  32. package/dist/docs/references/reference-memory-createThread.md +68 -0
  33. package/dist/docs/references/reference-memory-getThreadById.md +24 -0
  34. package/dist/docs/references/reference-memory-listThreads.md +145 -0
  35. package/dist/docs/references/reference-memory-memory-class.md +147 -0
  36. package/dist/docs/references/reference-memory-observational-memory.md +219 -0
  37. package/dist/docs/{processors/01-reference.md → references/reference-processors-token-limiter-processor.md} +25 -12
  38. package/dist/docs/references/reference-storage-dynamodb.md +282 -0
  39. package/dist/docs/references/reference-storage-libsql.md +135 -0
  40. package/dist/docs/references/reference-storage-mongodb.md +262 -0
  41. package/dist/docs/references/reference-storage-postgresql.md +529 -0
  42. package/dist/docs/references/reference-storage-upstash.md +160 -0
  43. package/dist/docs/references/reference-vectors-libsql.md +305 -0
  44. package/dist/docs/references/reference-vectors-mongodb.md +295 -0
  45. package/dist/docs/references/reference-vectors-pg.md +408 -0
  46. package/dist/docs/references/reference-vectors-upstash.md +294 -0
  47. package/dist/index.cjs +919 -507
  48. package/dist/index.cjs.map +1 -1
  49. package/dist/index.d.ts.map +1 -1
  50. package/dist/index.js +914 -502
  51. package/dist/index.js.map +1 -1
  52. package/dist/{observational-memory-3Q42SITP.cjs → observational-memory-G3HACXHE.cjs} +14 -14
  53. package/dist/{observational-memory-3Q42SITP.cjs.map → observational-memory-G3HACXHE.cjs.map} +1 -1
  54. package/dist/{observational-memory-VXLHOSDZ.js → observational-memory-LI6QFTRE.js} +3 -3
  55. package/dist/{observational-memory-VXLHOSDZ.js.map → observational-memory-LI6QFTRE.js.map} +1 -1
  56. package/dist/processors/index.cjs +12 -12
  57. package/dist/processors/index.js +1 -1
  58. package/dist/processors/observational-memory/index.d.ts +1 -1
  59. package/dist/processors/observational-memory/index.d.ts.map +1 -1
  60. package/dist/processors/observational-memory/observational-memory.d.ts +283 -1
  61. package/dist/processors/observational-memory/observational-memory.d.ts.map +1 -1
  62. package/dist/processors/observational-memory/observer-agent.d.ts +3 -1
  63. package/dist/processors/observational-memory/observer-agent.d.ts.map +1 -1
  64. package/dist/processors/observational-memory/reflector-agent.d.ts +10 -3
  65. package/dist/processors/observational-memory/reflector-agent.d.ts.map +1 -1
  66. package/dist/processors/observational-memory/types.d.ts +243 -19
  67. package/dist/processors/observational-memory/types.d.ts.map +1 -1
  68. package/dist/{token-6GSAFR2W-WGTMOPEU.js → token-APYSY3BW-2DN6RAUY.js} +11 -11
  69. package/dist/token-APYSY3BW-2DN6RAUY.js.map +1 -0
  70. package/dist/{token-6GSAFR2W-2B4WM6AQ.cjs → token-APYSY3BW-ZQ7TMBY7.cjs} +14 -14
  71. package/dist/token-APYSY3BW-ZQ7TMBY7.cjs.map +1 -0
  72. package/dist/token-util-RMHT2CPJ-6TGPE335.cjs +10 -0
  73. package/dist/token-util-RMHT2CPJ-6TGPE335.cjs.map +1 -0
  74. package/dist/token-util-RMHT2CPJ-RJEA3FAN.js +8 -0
  75. package/dist/token-util-RMHT2CPJ-RJEA3FAN.js.map +1 -0
  76. package/dist/tools/working-memory.d.ts.map +1 -1
  77. package/package.json +5 -6
  78. package/dist/chunk-6TXUWFIU.js.map +0 -1
  79. package/dist/chunk-FQJWVCDF.cjs.map +0 -1
  80. package/dist/chunk-WM6IIUQW.js +0 -250
  81. package/dist/chunk-WM6IIUQW.js.map +0 -1
  82. package/dist/chunk-ZSBBXHNM.cjs +0 -252
  83. package/dist/chunk-ZSBBXHNM.cjs.map +0 -1
  84. package/dist/docs/README.md +0 -36
  85. package/dist/docs/agents/01-agent-memory.md +0 -166
  86. package/dist/docs/core/01-reference.md +0 -114
  87. package/dist/docs/memory/07-reference.md +0 -687
  88. package/dist/docs/storage/01-reference.md +0 -1218
  89. package/dist/docs/vectors/01-reference.md +0 -942
  90. package/dist/token-6GSAFR2W-2B4WM6AQ.cjs.map +0 -1
  91. package/dist/token-6GSAFR2W-WGTMOPEU.js.map +0 -1
  92. package/dist/token-util-NEHG7TUY-TV2H7N56.js +0 -8
  93. package/dist/token-util-NEHG7TUY-TV2H7N56.js.map +0 -1
  94. package/dist/token-util-NEHG7TUY-WJZIPNNX.cjs +0 -10
  95. package/dist/token-util-NEHG7TUY-WJZIPNNX.cjs.map +0 -1
@@ -0,0 +1,145 @@
1
+ # Memory.listThreads()
2
+
3
+ The `listThreads()` method retrieves threads with pagination support and optional filtering by `resourceId`, `metadata`, or both.
4
+
5
+ ## Usage Examples
6
+
7
+ ### List all threads with pagination
8
+
9
+ ```typescript
10
+ const result = await memory.listThreads({
11
+ page: 0,
12
+ perPage: 10,
13
+ });
14
+ ```
15
+
16
+ ### Fetch all threads without pagination
17
+
18
+ Use `perPage: false` to retrieve all matching threads at once.
19
+
20
+ > **Warning:** Generally speaking, it's recommended to use pagination, especially for large datasets. Use this option cautiously.
21
+
22
+ ```typescript
23
+ const result = await memory.listThreads({
24
+ filter: { resourceId: "user-123" },
25
+ perPage: false,
26
+ });
27
+ ```
28
+
29
+ ### Filter by resourceId
30
+
31
+ ```typescript
32
+ const result = await memory.listThreads({
33
+ filter: { resourceId: "user-123" },
34
+ page: 0,
35
+ perPage: 10,
36
+ });
37
+ ```
38
+
39
+ ### Filter by metadata
40
+
41
+ ```typescript
42
+ const result = await memory.listThreads({
43
+ filter: { metadata: { category: "support", priority: "high" } },
44
+ page: 0,
45
+ perPage: 10,
46
+ });
47
+ ```
48
+
49
+ ### Combined filter (resourceId & metadata)
50
+
51
+ ```typescript
52
+ const result = await memory.listThreads({
53
+ filter: {
54
+ resourceId: "user-123",
55
+ metadata: { status: "active" },
56
+ },
57
+ page: 0,
58
+ perPage: 10,
59
+ });
60
+ ```
61
+
62
+ ## Parameters
63
+
64
+ **filter?:** (`{ resourceId?: string; metadata?: Record<string, unknown> }`): Optional filter object. `resourceId` filters threads by resource ID; `metadata` filters threads by metadata key-value pairs (AND logic — all specified pairs must match).
65
+
66
+ **page?:** (`number`): Page number (0-indexed) to retrieve
67
+
68
+ **perPage?:** (`number | false`): Maximum number of threads to return per page, or false to fetch all
69
+
70
+ **orderBy?:** (`{ field: 'createdAt' | 'updatedAt', direction: 'ASC' | 'DESC' }`): Sort configuration with field and direction (defaults to { field: 'createdAt', direction: 'DESC' })
71
+
72
+ ## Returns
73
+
74
+ **result:** (`Promise<StorageListThreadsOutput>`): A promise that resolves to paginated thread results with metadata
75
+
76
+ The return object contains:
77
+
78
+ - `threads`: Array of thread objects
79
+ - `total`: Total number of threads matching the filter
80
+ - `page`: Current page number (same as the input `page` parameter)
81
+ - `perPage`: Items per page (same as the input `perPage` parameter)
82
+ - `hasMore`: Boolean indicating if more results are available
83
+
84
+ ## Extended usage example
85
+
86
+ ```typescript
87
+ import { mastra } from "./mastra";
88
+
89
+ const agent = mastra.getAgent("agent");
90
+ const memory = await agent.getMemory();
91
+
92
+ let currentPage = 0;
93
+ const perPage = 25;
94
+ let hasMorePages = true;
95
+
96
+ // Fetch all active threads for a user, sorted by creation date
97
+ while (hasMorePages) {
98
+ const result = await memory?.listThreads({
99
+ filter: {
100
+ resourceId: "user-123",
101
+ metadata: { status: "active" },
102
+ },
103
+ page: currentPage,
104
+ perPage: perPage,
105
+ orderBy: { field: "createdAt", direction: "ASC" },
106
+ });
107
+
108
+ if (!result) {
109
+ console.log("No threads");
110
+ break;
111
+ }
112
+
113
+ result.threads.forEach((thread) => {
114
+ console.log(`Thread: ${thread.id}, Created: ${thread.createdAt}`);
115
+ });
116
+
117
+ hasMorePages = result.hasMore;
118
+ currentPage++; // Move to next page
119
+ }
120
+ ```
121
+
122
+ ## Metadata Filtering
123
+
124
+ The metadata filter uses AND logic - all specified key-value pairs must match for a thread to be included in the results:
125
+
126
+ ```typescript
127
+ // This will only return threads where BOTH conditions are true:
128
+ // - category === 'support'
129
+ // - priority === 'high'
130
+ await memory.listThreads({
131
+ filter: {
132
+ metadata: {
133
+ category: "support",
134
+ priority: "high",
135
+ },
136
+ },
137
+ });
138
+ ```
139
+
140
+ ## Related
141
+
142
+ - [Memory Class Reference](https://mastra.ai/reference/memory/memory-class)
143
+ - [Getting Started with Memory](https://mastra.ai/docs/memory/overview)
144
+ - [createThread](https://mastra.ai/reference/memory/createThread)
145
+ - [getThreadById](https://mastra.ai/reference/memory/getThreadById)
@@ -0,0 +1,147 @@
1
+ # Memory Class
2
+
3
+ The `Memory` class provides a robust system for managing conversation history and thread-based message storage in Mastra. It enables persistent storage of conversations, semantic search capabilities, and efficient message retrieval. You must configure a storage provider for conversation history, and if you enable semantic recall you will also need to provide a vector store and embedder.
4
+
5
+ ## Usage example
6
+
7
+ ```typescript
8
+ import { Memory } from "@mastra/memory";
9
+ import { Agent } from "@mastra/core/agent";
10
+
11
+ export const agent = new Agent({
12
+ name: "test-agent",
13
+ instructions: "You are an agent with memory.",
14
+ model: "openai/gpt-5.1",
15
+ memory: new Memory({
16
+ options: {
17
+ workingMemory: {
18
+ enabled: true,
19
+ },
20
+ },
21
+ }),
22
+ });
23
+ ```
24
+
25
+ > To enable `workingMemory` on an agent, you’ll need a storage provider configured on your main Mastra instance. See [Mastra class](https://mastra.ai/reference/core/mastra-class) for more information.
26
+
27
+ ## Constructor parameters
28
+
29
+ **storage?:** (`MastraCompositeStore`): Storage implementation for persisting memory data. Defaults to \`new DefaultStorage({ config: { url: "file:memory.db" } })\` if not provided.
30
+
31
+ **vector?:** (`MastraVector | false`): Vector store for semantic search capabilities. Set to \`false\` to disable vector operations.
32
+
33
+ **embedder?:** (`EmbeddingModel<string> | EmbeddingModelV2<string>`): Embedder instance for vector embeddings. Required when semantic recall is enabled.
34
+
35
+ **options?:** (`MemoryConfig`): Memory configuration options.
36
+
37
+ ### Options parameters
38
+
39
+ **lastMessages?:** (`number | false`): Number of most recent messages to retrieve. Set to false to disable. (Default: `10`)
40
+
41
+ **readOnly?:** (`boolean`): When true, prevents memory from saving new messages and provides working memory as read-only context (without the updateWorkingMemory tool). Useful for read-only operations like previews, internal routing agents, or sub agents that should reference but not modify memory. (Default: `false`)
42
+
43
+ **semanticRecall?:** (`boolean | { topK: number; messageRange: number | { before: number; after: number }; scope?: 'thread' | 'resource' }`): Enable semantic search in message history. Can be a boolean or an object with configuration options. When enabled, requires both vector store and embedder to be configured. Default topK is 4, default messageRange is {before: 1, after: 1}. (Default: `false`)
44
+
45
+ **workingMemory?:** (`WorkingMemory`): Configuration for working memory feature. Can be \`{ enabled: boolean; template?: string; schema?: ZodObject\<any> | JSONSchema7; scope?: 'thread' | 'resource' }\` to configure it, or \`{ enabled: false }\` to disable. (Default: `{ enabled: false, template: '# User Information\n- **First Name**:\n- **Last Name**:\n...' }`)
46
+
47
+ **observationalMemory?:** (`boolean | ObservationalMemoryOptions`): Enable Observational Memory for long-context agentic memory. Set to \`true\` for defaults, or pass a config object to customize token budgets, models, and scope. See \[Observational Memory reference]\(/reference/memory/observational-memory) for configuration details. (Default: `false`)
48
+
49
+ **generateTitle?:** (`boolean | { model: DynamicArgument<MastraLanguageModel>; instructions?: DynamicArgument<string> }`): Controls automatic thread title generation from the user's first message. Can be a boolean or an object with custom model and instructions. (Default: `false`)
50
+
51
+ ## Returns
52
+
53
+ **memory:** (`Memory`): A new Memory instance with the specified configuration.
54
+
55
+ ## Extended usage example
56
+
57
+ ```typescript
58
+ import { Memory } from "@mastra/memory";
59
+ import { Agent } from "@mastra/core/agent";
60
+ import { LibSQLStore, LibSQLVector } from "@mastra/libsql";
61
+
62
+ export const agent = new Agent({
63
+ name: "test-agent",
64
+ instructions: "You are an agent with memory.",
65
+ model: "openai/gpt-5.1",
66
+ memory: new Memory({
67
+ storage: new LibSQLStore({
68
+ id: 'test-agent-storage',
69
+ url: "file:./working-memory.db",
70
+ }),
71
+ vector: new LibSQLVector({
72
+ id: 'test-agent-vector',
73
+ url: "file:./vector-memory.db",
74
+ }),
75
+ options: {
76
+ lastMessages: 10,
77
+ semanticRecall: {
78
+ topK: 3,
79
+ messageRange: 2,
80
+ scope: "resource",
81
+ },
82
+ workingMemory: {
83
+ enabled: true,
84
+ },
85
+ generateTitle: true,
86
+ },
87
+ }),
88
+ });
89
+ ```
90
+
91
+ ## PostgreSQL with index configuration
92
+
93
+ ```typescript
94
+ import { Memory } from "@mastra/memory";
95
+ import { Agent } from "@mastra/core/agent";
96
+ import { ModelRouterEmbeddingModel } from "@mastra/core/llm";
97
+ import { PgStore, PgVector } from "@mastra/pg";
98
+
99
+ export const agent = new Agent({
100
+ name: "pg-agent",
101
+ instructions: "You are an agent with optimized PostgreSQL memory.",
102
+ model: "openai/gpt-5.1",
103
+ memory: new Memory({
104
+ storage: new PgStore({
105
+ id: 'pg-agent-storage',
106
+ connectionString: process.env.DATABASE_URL,
107
+ }),
108
+ vector: new PgVector({
109
+ id: 'pg-agent-vector',
110
+ connectionString: process.env.DATABASE_URL,
111
+ }),
112
+ embedder: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
113
+ options: {
114
+ lastMessages: 20,
115
+ semanticRecall: {
116
+ topK: 5,
117
+ messageRange: 3,
118
+ scope: "resource",
119
+ indexConfig: {
120
+ type: "hnsw", // Use HNSW for better performance
121
+ metric: "dotproduct", // Optimal for OpenAI embeddings
122
+ m: 16, // Number of bi-directional links
123
+ efConstruction: 64, // Construction-time candidate list size
124
+ },
125
+ },
126
+ workingMemory: {
127
+ enabled: true,
128
+ },
129
+ },
130
+ }),
131
+ });
132
+ ```
133
+
134
+ ### Related
135
+
136
+ - [Getting Started with Memory](https://mastra.ai/docs/memory/overview)
137
+ - [Semantic Recall](https://mastra.ai/docs/memory/semantic-recall)
138
+ - [Working Memory](https://mastra.ai/docs/memory/working-memory)
139
+ - [Observational Memory](https://mastra.ai/docs/memory/observational-memory)
140
+ - [Memory Processors](https://mastra.ai/docs/memory/memory-processors)
141
+ - [createThread](https://mastra.ai/reference/memory/createThread)
142
+ - [recall](https://mastra.ai/reference/memory/recall)
143
+ - [getThreadById](https://mastra.ai/reference/memory/getThreadById)
144
+ - [listThreads](https://mastra.ai/reference/memory/listThreads)
145
+ - [deleteMessages](https://mastra.ai/reference/memory/deleteMessages)
146
+ - [cloneThread](https://mastra.ai/reference/memory/cloneThread)
147
+ - [Clone Utility Methods](https://mastra.ai/reference/memory/clone-utilities)
@@ -0,0 +1,219 @@
1
+ # Observational Memory
2
+
3
+ **Added in:** `@mastra/memory@1.1.0`
4
+
5
+ Observational Memory (OM) is Mastra's memory system for long-context agentic memory. Two background agents — an **Observer** that watches conversations and creates observations, and a **Reflector** that restructures observations by combining related items, reflecting on overarching patterns, and condensing where possible — maintain an observation log that replaces raw message history as it grows.
6
+
7
+ ## Usage
8
+
9
+ ```typescript
10
+ import { Memory } from "@mastra/memory";
11
+ import { Agent } from "@mastra/core/agent";
12
+
13
+ export const agent = new Agent({
14
+ name: "my-agent",
15
+ instructions: "You are a helpful assistant.",
16
+ model: "openai/gpt-5-mini",
17
+ memory: new Memory({
18
+ options: {
19
+ observationalMemory: true,
20
+ },
21
+ }),
22
+ });
23
+ ```
24
+
25
+ ## Configuration
26
+
27
+ The `observationalMemory` option accepts `true`, `false`, or a configuration object.
28
+
29
+ Setting `observationalMemory: true` enables it with all defaults. Setting `observationalMemory: false` or omitting it disables it.
30
+
31
+ **enabled?:** (`boolean`): Enable or disable Observational Memory. When omitted from a config object, defaults to \`true\`. Only \`enabled: false\` explicitly disables it. (Default: `true`)
32
+
33
+ **model?:** (`string | LanguageModel | DynamicModel | ModelWithRetries[]`): Model for both the Observer and Reflector agents. Sets the model for both at once. Cannot be used together with \`observation.model\` or \`reflection.model\` — an error will be thrown if both are set. (Default: `'google/gemini-2.5-flash'`)
34
+
35
+ **scope?:** (`'resource' | 'thread'`): Memory scope for observations. \`'thread'\` keeps observations per-thread. \`'resource'\` shares observations across all threads for a resource, enabling cross-conversation memory. (Default: `'thread'`)
36
+
37
+ **shareTokenBudget?:** (`boolean`): Share the token budget between messages and observations. When enabled, the total budget is \`observation.messageTokens + reflection.observationTokens\`. Messages can use more space when observations are small, and vice versa. This maximizes context usage through flexible allocation. (Default: `false`)
38
+
39
+ **observation?:** (`ObservationalMemoryObservationConfig`): Configuration for the observation step. Controls when the Observer agent runs and how it behaves.
40
+
41
+ **reflection?:** (`ObservationalMemoryReflectionConfig`): Configuration for the reflection step. Controls when the Reflector agent runs and how it behaves.
42
+
43
+ ### Observation config
44
+
45
+ **model?:** (`string | LanguageModel | DynamicModel | ModelWithRetries[]`): Model for the Observer agent. Cannot be set if a top-level \`model\` is also provided. (Default: `'google/gemini-2.5-flash'`)
46
+
47
+ **messageTokens?:** (`number`): Token count of unobserved messages that triggers observation. When unobserved message tokens exceed this threshold, the Observer agent is called. (Default: `30000`)
48
+
49
+ **maxTokensPerBatch?:** (`number`): Maximum tokens per batch when observing multiple threads in resource scope. Threads are chunked into batches of this size and processed in parallel. Lower values mean more parallelism but more API calls. (Default: `10000`)
50
+
51
+ **modelSettings?:** (`ObservationalMemoryModelSettings`): Model settings for the Observer agent. (Default: `{ temperature: 0.3, maxOutputTokens: 100_000 }`)
52
+
53
+ ### Reflection config
54
+
55
+ **model?:** (`string | LanguageModel | DynamicModel | ModelWithRetries[]`): Model for the Reflector agent. Cannot be set if a top-level \`model\` is also provided. (Default: `'google/gemini-2.5-flash'`)
56
+
57
+ **observationTokens?:** (`number`): Token count of observations that triggers reflection. When observation tokens exceed this threshold, the Reflector agent is called to condense them. (Default: `40000`)
58
+
59
+ **modelSettings?:** (`ObservationalMemoryModelSettings`): Model settings for the Reflector agent. (Default: `{ temperature: 0, maxOutputTokens: 100_000 }`)
60
+
61
+ ### Model settings
62
+
63
+ **temperature?:** (`number`): Temperature for generation. Lower values produce more consistent output. (Default: `0.3`)
64
+
65
+ **maxOutputTokens?:** (`number`): Maximum output tokens. Set high to prevent truncation of observations. (Default: `100000`)
66
+
67
+ ## Examples
68
+
69
+ ### Resource scope with custom thresholds
70
+
71
+ ```typescript
72
+ import { Memory } from "@mastra/memory";
73
+ import { Agent } from "@mastra/core/agent";
74
+
75
+ export const agent = new Agent({
76
+ name: "my-agent",
77
+ instructions: "You are a helpful assistant.",
78
+ model: "openai/gpt-5-mini",
79
+ memory: new Memory({
80
+ options: {
81
+ observationalMemory: {
82
+ scope: "resource",
83
+ observation: {
84
+ messageTokens: 20_000,
85
+ },
86
+ reflection: {
87
+ observationTokens: 60_000,
88
+ },
89
+ },
90
+ },
91
+ }),
92
+ });
93
+ ```
94
+
95
+ ### Shared token budget
96
+
97
+ ```typescript
98
+ import { Memory } from "@mastra/memory";
99
+ import { Agent } from "@mastra/core/agent";
100
+
101
+ export const agent = new Agent({
102
+ name: "my-agent",
103
+ instructions: "You are a helpful assistant.",
104
+ model: "openai/gpt-5-mini",
105
+ memory: new Memory({
106
+ options: {
107
+ observationalMemory: {
108
+ shareTokenBudget: true,
109
+ observation: {
110
+ messageTokens: 20_000,
111
+ },
112
+ reflection: {
113
+ observationTokens: 80_000,
114
+ },
115
+ },
116
+ },
117
+ }),
118
+ });
119
+ ```
120
+
121
+ When `shareTokenBudget` is enabled, the total budget is `observation.messageTokens + reflection.observationTokens` (100k in this example). If observations only use 30k tokens, messages can expand to use up to 70k. If messages are short, observations have more room before triggering reflection.
122
+
123
+ ### Custom model
124
+
125
+ ```typescript
126
+ import { Memory } from "@mastra/memory";
127
+ import { Agent } from "@mastra/core/agent";
128
+
129
+ export const agent = new Agent({
130
+ name: "my-agent",
131
+ instructions: "You are a helpful assistant.",
132
+ model: "openai/gpt-5-mini",
133
+ memory: new Memory({
134
+ options: {
135
+ observationalMemory: {
136
+ model: "openai/gpt-4o-mini",
137
+ },
138
+ },
139
+ }),
140
+ });
141
+ ```
142
+
143
+ ### Different models per agent
144
+
145
+ ```typescript
146
+ import { Memory } from "@mastra/memory";
147
+ import { Agent } from "@mastra/core/agent";
148
+
149
+ export const agent = new Agent({
150
+ name: "my-agent",
151
+ instructions: "You are a helpful assistant.",
152
+ model: "openai/gpt-5-mini",
153
+ memory: new Memory({
154
+ options: {
155
+ observationalMemory: {
156
+ observation: {
157
+ model: "google/gemini-2.5-flash",
158
+ },
159
+ reflection: {
160
+ model: "openai/gpt-4o-mini",
161
+ },
162
+ },
163
+ },
164
+ }),
165
+ });
166
+ ```
167
+
168
+ ## Standalone usage
169
+
170
+ Most users should use the `Memory` class above. Using `ObservationalMemory` directly is mainly useful for benchmarking, experimentation, or when you need to control processor ordering with other processors (like [guardrails](https://mastra.ai/docs/agents/guardrails)).
171
+
172
+ ```typescript
173
+ import { ObservationalMemory } from "@mastra/memory/processors";
174
+ import { Agent } from "@mastra/core/agent";
175
+ import { LibSQLStore } from "@mastra/libsql";
176
+
177
+ const storage = new LibSQLStore({
178
+ id: "my-storage",
179
+ url: "file:./memory.db",
180
+ });
181
+
182
+ const om = new ObservationalMemory({
183
+ storage: storage.stores.memory,
184
+ model: "google/gemini-2.5-flash",
185
+ scope: "resource",
186
+ observation: {
187
+ messageTokens: 20_000,
188
+ },
189
+ reflection: {
190
+ observationTokens: 60_000,
191
+ },
192
+ });
193
+
194
+ export const agent = new Agent({
195
+ name: "my-agent",
196
+ instructions: "You are a helpful assistant.",
197
+ model: "openai/gpt-5-mini",
198
+ inputProcessors: [om],
199
+ outputProcessors: [om],
200
+ });
201
+ ```
202
+
203
+ ### Standalone config
204
+
205
+ The standalone `ObservationalMemory` class accepts all the same options as the `observationalMemory` config object above, plus the following:
206
+
207
+ **storage:** (`MemoryStorage`): Storage adapter for persisting observations. Must be a MemoryStorage instance (from \`MastraStorage.stores.memory\`).
208
+
209
+ **onDebugEvent?:** (`(event: ObservationDebugEvent) => void`): Debug callback for observation events. Called whenever observation-related events occur. Useful for debugging and understanding the observation flow.
210
+
211
+ **obscureThreadIds?:** (`boolean`): When enabled, thread IDs are hashed before being included in observation context. This prevents the LLM from recognizing patterns in thread identifiers. Automatically enabled when using resource scope through the Memory class. (Default: `false`)
212
+
213
+ ### Related
214
+
215
+ - [Observational Memory](https://mastra.ai/docs/memory/observational-memory)
216
+ - [Memory Overview](https://mastra.ai/docs/memory/overview)
217
+ - [Memory Class](https://mastra.ai/reference/memory/memory-class)
218
+ - [Memory Processors](https://mastra.ai/docs/memory/memory-processors)
219
+ - [Processors](https://mastra.ai/docs/agents/processors)
@@ -1,13 +1,4 @@
1
- # Processors API Reference
2
-
3
- > API reference for processors - 1 entries
4
-
5
-
6
- ---
7
-
8
- ## Reference: Token Limiter Processor
9
-
10
- > Documentation for the TokenLimiterProcessor in Mastra, which limits the number of tokens in messages.
1
+ # TokenLimiterProcessor
11
2
 
12
3
  The `TokenLimiterProcessor` limits the number of tokens in messages. It can be used as both an input and output processor:
13
4
 
@@ -28,10 +19,32 @@ const processor = new TokenLimiterProcessor({
28
19
 
29
20
  ## Constructor parameters
30
21
 
22
+ **options:** (`number | Options`): Either a simple number for token limit, or configuration options object
23
+
31
24
  ### Options
32
25
 
26
+ **limit:** (`number`): Maximum number of tokens to allow in the response
27
+
28
+ **encoding?:** (`TiktokenBPE`): Optional encoding to use. Defaults to o200k\_base, which is used by gpt-5.1.
29
+
30
+ **strategy?:** (`'truncate' | 'abort'`): Strategy when the token limit is reached: `'truncate'` stops emitting chunks; `'abort'` calls `abort()` to stop the stream
31
+
32
+ **countMode?:** (`'cumulative' | 'part'`): Whether to count tokens from the beginning of the stream or just the current part: `'cumulative'` counts all tokens from the start; `'part'` only counts tokens in the current part
33
+
33
34
  ## Returns
34
35
 
36
+ **id:** (`string`): Processor identifier set to 'token-limiter'
37
+
38
+ **name?:** (`string`): Optional processor display name
39
+
40
+ **processInput:** (`(args: { messages: MastraDBMessage[]; abort: (reason?: string) => never }) => Promise<MastraDBMessage[]>`): Filters input messages to fit within token limit, prioritizing recent messages while preserving system messages
41
+
42
+ **processOutputStream:** (`(args: { part: ChunkType; streamParts: ChunkType[]; state: Record<string, any>; abort: (reason?: string) => never }) => Promise<ChunkType | null>`): Processes streaming output parts to limit token count during streaming
43
+
44
+ **processOutputResult:** (`(args: { messages: MastraDBMessage[]; abort: (reason?: string) => never }) => Promise<MastraDBMessage[]>`): Processes final output results to limit token count in non-streaming scenarios
45
+
46
+ **getMaxTokens:** (`() => number`): Get the maximum token limit
47
+
35
48
  ## Error behavior
36
49
 
37
50
  When used as an input processor, `TokenLimiterProcessor` throws a `TripWire` error in the following cases:
@@ -57,7 +70,7 @@ try {
57
70
 
58
71
  Use `inputProcessors` to limit historical messages sent to the model, which helps stay within context window limits:
59
72
 
60
- ```typescript title="src/mastra/agents/context-limited-agent.ts"
73
+ ```typescript
61
74
  import { Agent } from "@mastra/core/agent";
62
75
  import { Memory } from "@mastra/memory";
63
76
  import { TokenLimiterProcessor } from "@mastra/core/processors";
@@ -77,7 +90,7 @@ export const agent = new Agent({
77
90
 
78
91
  Use `outputProcessors` to limit the length of generated responses:
79
92
 
80
- ```typescript title="src/mastra/agents/response-limited-agent.ts"
93
+ ```typescript
81
94
  import { Agent } from "@mastra/core/agent";
82
95
  import { TokenLimiterProcessor } from "@mastra/core/processors";
83
96