@mastra/pg 1.6.1 → 1.7.0

Files changed (27)
  1. package/CHANGELOG.md +74 -0
  2. package/dist/docs/SKILL.md +40 -0
  3. package/dist/docs/assets/SOURCE_MAP.json +6 -0
  4. package/dist/docs/references/docs-memory-semantic-recall.md +272 -0
  5. package/dist/docs/references/docs-memory-storage.md +261 -0
  6. package/dist/docs/references/docs-memory-working-memory.md +400 -0
  7. package/dist/docs/references/docs-rag-overview.md +72 -0
  8. package/dist/docs/references/docs-rag-retrieval.md +515 -0
  9. package/dist/docs/references/docs-rag-vector-databases.md +645 -0
  10. package/dist/docs/references/reference-memory-memory-class.md +147 -0
  11. package/dist/docs/references/reference-processors-message-history-processor.md +85 -0
  12. package/dist/docs/references/reference-processors-semantic-recall-processor.md +117 -0
  13. package/dist/docs/references/reference-processors-working-memory-processor.md +152 -0
  14. package/dist/docs/references/reference-rag-metadata-filters.md +216 -0
  15. package/dist/docs/references/reference-storage-composite.md +235 -0
  16. package/dist/docs/references/reference-storage-dynamodb.md +282 -0
  17. package/dist/docs/references/reference-storage-postgresql.md +526 -0
  18. package/dist/docs/references/reference-tools-vector-query-tool.md +459 -0
  19. package/dist/docs/references/reference-vectors-pg.md +408 -0
  20. package/dist/index.cjs +62 -5
  21. package/dist/index.cjs.map +1 -1
  22. package/dist/index.js +62 -5
  23. package/dist/index.js.map +1 -1
  24. package/dist/storage/db/index.d.ts.map +1 -1
  25. package/dist/storage/domains/memory/index.d.ts.map +1 -1
  26. package/dist/vector/index.d.ts.map +1 -1
  27. package/package.json +5 -5
@@ -0,0 +1,216 @@
1
+ # Metadata Filters
2
+
3
+ Mastra provides a unified metadata filtering syntax across all vector stores, based on MongoDB/Sift query syntax. Each vector store translates these filters into their native format.
4
+
5
+ ## Basic Example
6
+
7
+ ```typescript
8
+ import { PgVector } from '@mastra/pg'
9
+
10
+ const store = new PgVector({
11
+ id: 'pg-vector',
12
+ connectionString,
13
+ })
14
+
15
+ const results = await store.query({
16
+ indexName: 'my_index',
17
+ queryVector: queryVector,
18
+ topK: 10,
19
+ filter: {
20
+ category: 'electronics', // Simple equality
21
+ price: { $gt: 100 }, // Numeric comparison
22
+ tags: { $in: ['sale', 'new'] }, // Array membership
23
+ },
24
+ })
25
+ ```
26
+
27
+ ## Supported Operators
28
+
29
+ ### Basic Comparison
30
+
31
+ | Operator | Description | Example | Supported by |
+ | -------- | ----------- | ------- | ------------ |
+ | `$eq` | Matches values equal to specified value | `{ age: { $eq: 25 } }` | All except Couchbase |
+ | `$ne` | Matches values not equal | `{ status: { $ne: 'inactive' } }` | All except Couchbase |
+ | `$gt` | Greater than | `{ price: { $gt: 100 } }` | All except Couchbase |
+ | `$gte` | Greater than or equal | `{ rating: { $gte: 4.5 } }` | All except Couchbase |
+ | `$lt` | Less than | `{ stock: { $lt: 20 } }` | All except Couchbase |
+ | `$lte` | Less than or equal | `{ priority: { $lte: 3 } }` | All except Couchbase |
32
+
33
+ ### Array Operators
34
+
35
+ | Operator | Description | Example | Supported by |
+ | -------- | ----------- | ------- | ------------ |
+ | `$in` | Matches any value in array | `{ category: { $in: ["A", "B"] } }` | All except Couchbase |
+ | `$nin` | Matches none of the values | `{ status: { $nin: ["deleted", "archived"] } }` | All except Couchbase |
+ | `$all` | Matches arrays containing all elements | `{ tags: { $all: ["urgent", "high"] } }` | Astra, Pinecone, Upstash, MongoDB |
+ | `$elemMatch` | Matches array elements meeting criteria | `{ scores: { $elemMatch: { $gt: 80 } } }` | libSQL, PgVector, MongoDB |
36
+
37
+ ### Logical Operators
38
+
39
+ | Operator | Description | Example | Supported by |
+ | -------- | ----------- | ------- | ------------ |
+ | `$and` | Logical AND | `{ $and: [{ price: { $gt: 100 } }, { stock: { $gt: 0 } }] }` | All except Vectorize, Couchbase |
+ | `$or` | Logical OR | `{ $or: [{ status: "active" }, { priority: "high" }] }` | All except Vectorize, Couchbase |
+ | `$not` | Logical NOT | `{ price: { $not: { $lt: 100 } } }` | Astra, Qdrant, Upstash, PgVector, libSQL, MongoDB |
+ | `$nor` | Logical NOR | `{ $nor: [{ status: "deleted" }, { archived: true }] }` | Qdrant, Upstash, PgVector, libSQL, MongoDB |
40
+
41
+ ### Element Operators
42
+
43
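+ | Operator | Description | Example | Supported by |
+ | -------- | ----------- | ------- | ------------ |
+ | `$exists` | Matches documents with field | `{ rating: { $exists: true } }` | All except Vectorize, Chroma, Couchbase |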
44
+
45
+ ### Custom Operators
46
+
47
+ | Operator | Description | Example | Supported by |
+ | -------- | ----------- | ------- | ------------ |
+ | `$contains` | Text contains substring | `{ description: { $contains: "sale" } }` | Upstash, libSQL, PgVector |
+ | `$regex` | Regular expression match | `{ name: { $regex: "^test" } }` | Qdrant, PgVector, Upstash, MongoDB |
+ | `$size` | Array length check | `{ tags: { $size: { $gt: 2 } } }` | Astra, libSQL, PgVector, MongoDB |
+ | `$geo` | Geospatial query | `{ location: { $geo: { type: "radius", ... } } }` | Qdrant |
+ | `$datetime` | Datetime range query | `{ created: { $datetime: { range: { gt: "2024-01-01" } } } }` | Qdrant |
+ | `$hasId` | Vector ID existence check | `{ $hasId: ["id1", "id2"] }` | Qdrant |
+ | `$hasVector` | Vector existence check | `{ $hasVector: true }` | Qdrant |
48
+
49
+ ## Common Rules and Restrictions
50
+
51
+ 1. Field names cannot:
52
+
53
+ - Contain dots (.) unless referring to nested fields
54
+ - Start with $ or contain null characters
55
+ - Be empty strings
56
+
57
+ 2. Values must be:
58
+
59
+ - Valid JSON types (string, number, boolean, object, array)
60
+ - Not undefined
61
+ - Properly typed for the operator (e.g., numbers for numeric comparisons)
62
+
63
+ 3. Logical operators:
64
+
65
+ - Must contain valid conditions
66
+ - Cannot be empty
67
+ - Must be properly nested
68
+ - Can only be used at top level or nested within other logical operators
69
+ - Cannot be used at field level or nested inside a field
70
+ - Cannot be used inside an operator
71
+ - Valid: `{ "$and": [{ "field": { "$gt": 100 } }] }`
72
+ - Valid: `{ "$or": [{ "$and": [{ "field": { "$gt": 100 } }] }] }`
73
+ - Invalid: `{ "field": { "$and": [{ "$gt": 100 }] } }`
74
+ - Invalid: `{ "field": { "$gt": { "$and": [{...}] } } }`
75
+
76
+ 4. $not operator:
77
+
78
+ - Must be an object
79
+ - Cannot be empty
80
+ - Can be used at field level or top level
81
+ - Valid: `{ "$not": { "field": "value" } }`
82
+ - Valid: `{ "field": { "$not": { "$eq": "value" } } }`
83
+
84
+ 5. Operator nesting:
85
+
86
+ - Logical operators must contain field conditions, not direct operators
87
+ - Valid: `{ "$and": [{ "field": { "$gt": 100 } }] }`
88
+ - Invalid: `{ "$and": [{ "$gt": 100 }] }`
89
+
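The placement rules in items 3 to 5 above can be sketched as a small recursive check. This is an illustrative helper only, not Mastra's validator: the names `isValidFilter`, `containsLogical`, and `LOGICAL` are hypothetical, and store-specific top-level operators such as Qdrant's `$hasId` are deliberately not handled.

```typescript
// Hypothetical sketch of the placement rules above; not part of any Mastra package.
type Filter = Record<string, unknown>;

const LOGICAL = new Set(['$and', '$or', '$nor']);

function isPlainObject(value: unknown): value is Filter {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// True if a logical operator appears anywhere inside a field's value.
function containsLogical(value: unknown): boolean {
  if (Array.isArray(value)) return value.some(containsLogical);
  if (!isPlainObject(value)) return false;
  return Object.entries(value).some(
    ([key, inner]) => LOGICAL.has(key) || containsLogical(inner),
  );
}

function isValidFilter(filter: unknown): boolean {
  if (!isPlainObject(filter)) return false;
  return Object.entries(filter).every(([key, value]) => {
    if (LOGICAL.has(key)) {
      // Rule 3: a non-empty array whose entries are non-empty, valid conditions.
      return (
        Array.isArray(value) &&
        value.length > 0 &&
        value.every(
          (c) => isPlainObject(c) && Object.keys(c).length > 0 && isValidFilter(c),
        )
      );
    }
    if (key === '$not') {
      // Rule 4: $not must be a non-empty object.
      return isPlainObject(value) && Object.keys(value).length > 0 && isValidFilter(value);
    }
    // Rule 5: a bare operator such as { $gt: 100 } is not a field condition.
    if (key.startsWith('$')) return false;
    // Rule 3: logical operators cannot appear inside a field or inside an operator.
    return !containsLogical(value);
  });
}
```

Under these assumptions, `isValidFilter({ field: { $and: [{ $gt: 100 }] } })` is rejected because `$and` sits inside a field, while `{ field: { $not: { $eq: 'value' } } }` is accepted because `$not` is allowed at field level.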
90
+ ## Store-Specific Notes
91
+
92
+ ### Astra
93
+
94
+ - Nested field queries are supported using dot notation
95
+ - Array fields must be explicitly defined as arrays in the metadata
96
+ - Metadata values are case-sensitive
97
+
98
+ ### ChromaDB
99
+
100
+ - Where filters only return results where the filtered field exists in metadata
101
+ - Empty metadata fields are not included in filter results
102
+ - Metadata fields must be present for negative matches (e.g., $ne won't match documents missing the field)
103
+
104
+ ### Cloudflare Vectorize
105
+
106
+ - Requires explicit metadata indexing before filtering can be used
107
+ - Use `createMetadataIndex()` to index fields you want to filter on
108
+ - Up to 10 metadata indexes per Vectorize index
109
+ - String values are indexed up to first 64 bytes (truncated on UTF-8 boundaries)
110
+ - Number values use float64 precision
111
+ - Filter JSON must be under 2048 bytes
112
+ - Field names cannot contain dots (.) or start with $
113
+ - Field names limited to 512 characters
114
+ - Vectors must be re-upserted after creating new metadata indexes to be included in filtered results
115
+ - Range queries may have reduced accuracy with very large datasets (~10M+ vectors)
116
+
117
+ ### libSQL
118
+
119
+ - Supports nested object queries with dot notation
120
+ - Array fields are validated to ensure they contain valid JSON arrays
121
+ - Numeric comparisons maintain proper type handling
122
+ - Empty arrays in conditions are handled gracefully
123
+ - Metadata is stored in a JSONB column for efficient querying
124
+
125
+ ### PgVector
126
+
127
+ - Full support for PostgreSQL's native JSON querying capabilities
128
+ - Efficient handling of array operations using native array functions
129
+ - Proper type handling for numbers, strings, and booleans
130
+ - Nested field queries use PostgreSQL's JSON path syntax internally
131
+ - Metadata is stored in a JSONB column for efficient indexing
132
+
133
+ ### Pinecone
134
+
135
+ - Metadata field names are limited to 512 characters
136
+ - Numeric values must be within the range of ±1e38
137
+ - Arrays in metadata are limited to 64KB total size
138
+ - Nested objects are flattened with dot notation
139
+ - Metadata updates replace the entire metadata object
140
+
141
+ ### Qdrant
142
+
143
+ - Supports advanced filtering with nested conditions
144
+ - Payload (metadata) fields must be explicitly indexed for filtering
145
+ - Use `createPayloadIndex()` to index fields you want to filter on:
146
+
147
+ ```typescript
148
+ // Index a field before filtering on it
149
+ await store.createPayloadIndex({
150
+ indexName: 'my_index',
151
+ fieldName: 'source',
152
+ fieldSchema: 'keyword', // 'keyword' | 'integer' | 'float' | 'geo' | 'text' | 'bool' | 'datetime' | 'uuid'
153
+ })
154
+
155
+ // Now filtering works
156
+ const results = await store.query({
157
+ indexName: 'my_index',
158
+ queryVector: queryVector,
159
+ filter: { source: 'document-a' },
160
+ })
161
+ ```
162
+
163
+ - Efficient handling of geo-spatial queries
164
+ - Special handling for null and empty values
165
+ - Vector-specific filtering capabilities
166
+ - Datetime values must be in RFC 3339 format
167
+
168
+ ### Upstash
169
+
170
+ - 512-character limit for metadata field keys
171
+ - Query size is limited (avoid large IN clauses)
172
+ - No support for null/undefined values in filters
173
+ - Translates to SQL-like syntax internally
174
+ - Case-sensitive string comparisons
175
+ - Metadata updates are atomic
176
+
177
+ ### MongoDB
178
+
179
+ - Full support for MongoDB/Sift query syntax for metadata filters
180
+ - Supports all standard comparison, array, logical, and element operators
181
+ - Supports nested fields and arrays in metadata
182
+ - Filtering can be applied to both `metadata` and the original document content using the `filter` and `documentFilter` options, respectively
183
+ - `filter` applies to the metadata object; `documentFilter` applies to the original document fields
184
+ - No artificial limits on filter size or complexity (subject to MongoDB query limits)
185
+ - Indexing metadata fields is recommended for optimal performance
186
+
187
+ ### Couchbase
188
+
189
+ - Currently does not have support for metadata filters. Filtering must be done client-side after retrieving results or by using the Couchbase SDK's Search capabilities directly for more complex queries.
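
Client-side filtering of the kind described for Couchbase might look like the sketch below. The `Result` shape (`id`, `score`, optional `metadata`) and the helper `filterClientSide` are assumptions made for illustration; the idea is to over-fetch from the store, apply a predicate to each result's metadata, and then trim to the desired count.

```typescript
// Hypothetical result shape; adjust to whatever the vector store actually returns.
interface Result {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

// Keep only results whose metadata satisfies the predicate, preserving the
// original ranking, then trim to topK.
function filterClientSide(
  results: Result[],
  predicate: (metadata: Record<string, unknown>) => boolean,
  topK: number,
): Result[] {
  return results.filter((r) => predicate(r.metadata ?? {})).slice(0, topK);
}

// Usage: request more results than needed, then filter locally.
const fetched: Result[] = [
  { id: 'a', score: 0.9, metadata: { category: 'electronics' } },
  { id: 'b', score: 0.8, metadata: { category: 'books' } },
  { id: 'c', score: 0.7 },
  { id: 'd', score: 0.6, metadata: { category: 'electronics' } },
];
const electronics = filterClientSide(fetched, (m) => m.category === 'electronics', 2);
```

Because filtering happens after retrieval, fewer than `topK` results may remain, so the initial query usually needs a larger limit than the final one.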
190
+
191
+ ### Amazon S3 Vectors
192
+
193
+ - Equality values must be primitives (string/number/boolean). `null`/`undefined`, arrays, objects, and Date are not allowed for equality. Range operators accept numbers or Date (Dates are normalized to epoch ms).
194
+ - `$in`/`$nin` require **non-empty arrays of primitives**; Date elements are allowed and normalized to epoch ms. **Array equality** is not supported.
195
+ - Implicit AND is canonicalized (`{a:1,b:2}` → `{$and:[{a:1},{b:2}]}`). Logical operators must contain field conditions, use non-empty arrays, and appear only at the root or within other logical operators (not inside field values).
196
+ - Keys listed in `nonFilterableMetadataKeys` at index creation are stored but not filterable; this setting is immutable.
197
+ - $exists requires a boolean value.
198
+ - undefined/null/empty filters are treated as no filter.
199
+ - Each metadata key name limited to 63 characters.
200
+ - Total metadata per vector: Up to 40 KB (filterable + non-filterable)
201
+ - Total metadata keys per vector: Up to 10
202
+ - Filterable metadata per vector: Up to 2 KB
203
+ - Non-filterable metadata keys per vector index: Up to 10
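
The implicit-AND canonicalization noted above (`{a:1,b:2}` → `{$and:[{a:1},{b:2}]}`) can be sketched as follows. This is a minimal illustration, not the store's actual code: the function name `canonicalizeImplicitAnd` is hypothetical, and leaving single-key filters unchanged is an assumption.

```typescript
type Filter = Record<string, unknown>;

// Rewrite a multi-key filter into an explicit $and of single-key conditions.
// Assumption: filters with zero or one key are returned unchanged.
function canonicalizeImplicitAnd(filter: Filter): Filter {
  const keys = Object.keys(filter);
  if (keys.length <= 1) return filter;
  return { $and: keys.map((key) => ({ [key]: filter[key] })) };
}

const canonical = canonicalizeImplicitAnd({ a: 1, b: 2 });
```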
204
+
205
+ ## Related
206
+
207
+ - [Astra](https://mastra.ai/reference/vectors/astra)
208
+ - [Chroma](https://mastra.ai/reference/vectors/chroma)
209
+ - [Cloudflare Vectorize](https://mastra.ai/reference/vectors/vectorize)
210
+ - [libSQL](https://mastra.ai/reference/vectors/libsql)
211
+ - [MongoDB](https://mastra.ai/reference/vectors/mongodb)
212
+ - [PgVector](https://mastra.ai/reference/vectors/pg)
213
+ - [Pinecone](https://mastra.ai/reference/vectors/pinecone)
214
+ - [Qdrant](https://mastra.ai/reference/vectors/qdrant)
215
+ - [Upstash](https://mastra.ai/reference/vectors/upstash)
216
+ - [Amazon S3 Vectors](https://mastra.ai/reference/vectors/s3vectors)
@@ -0,0 +1,235 @@
1
+ # Composite Storage
2
+
3
+ `MastraCompositeStore` can compose storage domains from different providers. Use it when you need different databases for different purposes. For example, use LibSQL for memory and PostgreSQL for workflows.
4
+
5
+ ## Installation
6
+
7
+ `MastraCompositeStore` is included in `@mastra/core`:
8
+
9
+ **npm**:
10
+
11
+ ```bash
12
+ npm install @mastra/core@latest
13
+ ```
14
+
15
+ **pnpm**:
16
+
17
+ ```bash
18
+ pnpm add @mastra/core@latest
19
+ ```
20
+
21
+ **Yarn**:
22
+
23
+ ```bash
24
+ yarn add @mastra/core@latest
25
+ ```
26
+
27
+ **Bun**:
28
+
29
+ ```bash
30
+ bun add @mastra/core@latest
31
+ ```
32
+
33
+ You'll also need to install the storage providers you want to compose:
34
+
35
+ **npm**:
36
+
37
+ ```bash
38
+ npm install @mastra/pg@latest @mastra/libsql@latest
39
+ ```
40
+
41
+ **pnpm**:
42
+
43
+ ```bash
44
+ pnpm add @mastra/pg@latest @mastra/libsql@latest
45
+ ```
46
+
47
+ **Yarn**:
48
+
49
+ ```bash
50
+ yarn add @mastra/pg@latest @mastra/libsql@latest
51
+ ```
52
+
53
+ **Bun**:
54
+
55
+ ```bash
56
+ bun add @mastra/pg@latest @mastra/libsql@latest
57
+ ```
58
+
59
+ ## Storage domains
60
+
61
+ Mastra organizes storage into five specialized domains, each handling a specific type of data. Each domain can be backed by a different storage adapter, and domain classes are exported from each storage package.
62
+
63
+ | Domain | Description |
64
+ | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
65
+ | `memory` | Conversation persistence for agents. Stores threads (conversation sessions), messages, resources (user identities), and working memory (persistent context across conversations). |
66
+ | `workflows` | Workflow execution state. When workflows suspend for human input, external events, or scheduled resumption, their state is persisted here to enable resumption after server restarts. |
67
+ | `scores` | Evaluation results from Mastra's evals system. Scores and metrics are persisted here for analysis and comparison over time. |
68
+ | `observability` | Telemetry data including traces and spans. Agent interactions, tool calls, and LLM requests generate spans collected into traces for debugging and performance analysis. |
69
+ | `agents` | Agent configurations for stored agents. Enables agents to be defined and updated at runtime without code deployments. |
70
+
71
+ ## Usage
72
+
73
+ ### Basic composition
74
+
75
+ Import domain classes directly from each store package and compose them:
76
+
77
+ ```typescript
78
+ import { MastraCompositeStore } from '@mastra/core/storage'
79
+ import { WorkflowsPG, ScoresPG } from '@mastra/pg'
80
+ import { MemoryLibSQL } from '@mastra/libsql'
81
+ import { Mastra } from '@mastra/core'
82
+
83
+ export const mastra = new Mastra({
84
+ storage: new MastraCompositeStore({
85
+ id: 'composite',
86
+ domains: {
87
+ memory: new MemoryLibSQL({ url: 'file:./local.db' }),
88
+ workflows: new WorkflowsPG({ connectionString: process.env.DATABASE_URL }),
89
+ scores: new ScoresPG({ connectionString: process.env.DATABASE_URL }),
90
+ },
91
+ }),
92
+ })
93
+ ```
94
+
95
+ ### With a default storage
96
+
97
+ Use `default` to specify a fallback storage, then override specific domains:
98
+
99
+ ```typescript
100
+ import { MastraCompositeStore } from '@mastra/core/storage'
101
+ import { PostgresStore } from '@mastra/pg'
102
+ import { MemoryLibSQL } from '@mastra/libsql'
103
+ import { Mastra } from '@mastra/core'
104
+
105
+ const pgStore = new PostgresStore({
106
+ id: 'pg',
107
+ connectionString: process.env.DATABASE_URL,
108
+ })
109
+
110
+ export const mastra = new Mastra({
111
+ storage: new MastraCompositeStore({
112
+ id: 'composite',
113
+ default: pgStore,
114
+ domains: {
115
+ memory: new MemoryLibSQL({ url: 'file:./local.db' }),
116
+ },
117
+ }),
118
+ })
119
+ ```
120
+
121
+ ## Options
122
+
123
+ **id:** (`string`): Unique identifier for this storage instance.
124
+
125
+ **default?:** (`MastraCompositeStore`): Default storage adapter. Domains not explicitly specified in `domains` will use this storage's domains as fallbacks.
126
+
127
+ **domains?:** (`object`): Individual domain overrides. Each domain can come from a different storage adapter. These take precedence over the default storage.
128
+
129
+ **domains.memory?:** (`MemoryStorage`): Storage for threads, messages, and resources.
130
+
131
+ **domains.workflows?:** (`WorkflowsStorage`): Storage for workflow snapshots.
132
+
133
+ **domains.scores?:** (`ScoresStorage`): Storage for evaluation scores.
134
+
135
+ **domains.observability?:** (`ObservabilityStorage`): Storage for traces and spans.
136
+
137
+ **domains.agents?:** (`AgentsStorage`): Storage for stored agent configurations.
138
+
139
+ **disableInit?:** (`boolean`): When true, automatic initialization is disabled. You must call init() explicitly.
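
The precedence between `default` and `domains` described in these options can be modeled with a short sketch. The types and the `resolveDomains` function are hypothetical stand-ins for illustration and do not reflect how `MastraCompositeStore` is implemented internally; they only show that an explicit domain wins and the default fills the gaps.

```typescript
// Hypothetical model of domain resolution; not the MastraCompositeStore source.
const DOMAIN_NAMES = ['memory', 'workflows', 'scores', 'observability', 'agents'] as const;
type DomainName = (typeof DOMAIN_NAMES)[number];

interface DomainStore {
  provider: string; // stand-in for a real domain storage instance
}
type DomainStores = Partial<Record<DomainName, DomainStore>>;

// Explicit overrides take precedence; anything missing falls back to the default.
function resolveDomains(defaults: DomainStores, overrides: DomainStores): DomainStores {
  const resolved: DomainStores = {};
  for (const name of DOMAIN_NAMES) {
    const store = overrides[name] ?? defaults[name];
    if (store) resolved[name] = store;
  }
  return resolved;
}

// A PostgreSQL-like default that provides three domains, with memory overridden.
const resolved = resolveDomains(
  {
    memory: { provider: 'pg' },
    workflows: { provider: 'pg' },
    scores: { provider: 'pg' },
  },
  { memory: { provider: 'libsql' } },
);
```

In this model `resolved.memory` comes from the override, `resolved.workflows` and `resolved.scores` come from the default, and domains that neither side provides stay unset.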
140
+
141
+ ## Initialization
142
+
143
+ `MastraCompositeStore` initializes each configured domain independently. When passed to the Mastra class, `init()` is called automatically:
144
+
145
+ ```typescript
146
+ import { MastraCompositeStore } from '@mastra/core/storage'
147
+ import { MemoryPG, WorkflowsPG, ScoresPG } from '@mastra/pg'
148
+ import { Mastra } from '@mastra/core'
149
+
150
+ const storage = new MastraCompositeStore({
151
+ id: 'composite',
152
+ domains: {
153
+ memory: new MemoryPG({ connectionString: process.env.DATABASE_URL }),
154
+ workflows: new WorkflowsPG({ connectionString: process.env.DATABASE_URL }),
155
+ scores: new ScoresPG({ connectionString: process.env.DATABASE_URL }),
156
+ },
157
+ })
158
+
159
+ export const mastra = new Mastra({
160
+ storage, // init() called automatically
161
+ })
162
+ ```
163
+
164
+ If using storage directly, call `init()` explicitly:
165
+
166
+ ```typescript
167
+ import { MastraCompositeStore } from '@mastra/core/storage'
168
+ import { MemoryPG } from '@mastra/pg'
169
+
170
+ const storage = new MastraCompositeStore({
171
+ id: 'composite',
172
+ domains: {
173
+ memory: new MemoryPG({ connectionString: process.env.DATABASE_URL }),
174
+ },
175
+ })
176
+
177
+ await storage.init()
178
+
179
+ // Access domain-specific stores via getStore()
180
+ const memoryStore = await storage.getStore('memory')
181
+ const thread = await memoryStore?.getThreadById({ threadId: '...' })
182
+ ```
183
+
184
+ ## Use cases
185
+
186
+ ### Separate databases for different workloads
187
+
188
+ Use a local database for development while keeping production data in a managed service:
189
+
190
+ ```typescript
191
+ import { MastraCompositeStore } from '@mastra/core/storage'
192
+ import { MemoryPG, WorkflowsPG, ScoresPG } from '@mastra/pg'
193
+ import { MemoryLibSQL } from '@mastra/libsql'
194
+
195
+ const storage = new MastraCompositeStore({
196
+ id: 'composite',
197
+ domains: {
198
+ // Use local SQLite for development, PostgreSQL for production
199
+ memory:
200
+ process.env.NODE_ENV === 'development'
201
+ ? new MemoryLibSQL({ url: 'file:./dev.db' })
202
+ : new MemoryPG({ connectionString: process.env.DATABASE_URL }),
203
+ workflows: new WorkflowsPG({ connectionString: process.env.DATABASE_URL }),
204
+ scores: new ScoresPG({ connectionString: process.env.DATABASE_URL }),
205
+ },
206
+ })
207
+ ```
208
+
209
+ ### Specialized storage for observability
210
+
211
+ Observability data can quickly overwhelm general-purpose databases in production. A single agent interaction can generate hundreds of spans, and high-traffic applications can produce thousands of traces per day.
212
+
213
+ **ClickHouse** is recommended for production observability because it's optimized for high-volume, write-heavy analytics workloads. Use composite storage to route observability to ClickHouse while keeping other data in your primary database:
214
+
215
+ ```typescript
216
+ import { MastraCompositeStore } from '@mastra/core/storage'
217
+ import { MemoryPG, WorkflowsPG, ScoresPG } from '@mastra/pg'
218
+ import { ObservabilityStorageClickhouse } from '@mastra/clickhouse'
219
+
220
+ const storage = new MastraCompositeStore({
221
+ id: 'composite',
222
+ domains: {
223
+ memory: new MemoryPG({ connectionString: process.env.DATABASE_URL }),
224
+ workflows: new WorkflowsPG({ connectionString: process.env.DATABASE_URL }),
225
+ scores: new ScoresPG({ connectionString: process.env.DATABASE_URL }),
226
+ observability: new ObservabilityStorageClickhouse({
227
+ url: process.env.CLICKHOUSE_URL,
228
+ username: process.env.CLICKHOUSE_USERNAME,
229
+ password: process.env.CLICKHOUSE_PASSWORD,
230
+ }),
231
+ },
232
+ })
233
+ ```
234
+
235
+ > **Info:** This approach is also required when using storage providers that don't support observability (like Convex, DynamoDB, or Cloudflare). See the [DefaultExporter documentation](https://mastra.ai/docs/observability/tracing/exporters/default) for the full list of supported providers.