@mastra/rag 2.1.1 → 2.1.2-alpha.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -1,5 +1,14 @@
  # @mastra/rag

+ ## 2.1.2-alpha.0
+
+ ### Patch Changes
+
+ - Improved token-based chunking performance in `token` and `semantic-markdown` strategies. Markdown knowledge bases now chunk significantly faster with lower tokenization overhead. ([#13495](https://github.com/mastra-ai/mastra/pull/13495))
+
+ - Updated dependencies [[`df170fd`](https://github.com/mastra-ai/mastra/commit/df170fd139b55f845bfd2de8488b16435bd3d0da), [`ae55343`](https://github.com/mastra-ai/mastra/commit/ae5534397fc006fd6eef3e4f80c235bcdc9289ef), [`c290cec`](https://github.com/mastra-ai/mastra/commit/c290cec5bf9107225de42942b56b487107aa9dce), [`f03e794`](https://github.com/mastra-ai/mastra/commit/f03e794630f812b56e95aad54f7b1993dc003add), [`aa4a5ae`](https://github.com/mastra-ai/mastra/commit/aa4a5aedb80d8d6837bab8cbb2e301215d1ba3e9), [`de3f584`](https://github.com/mastra-ai/mastra/commit/de3f58408752a8d80a295275c7f23fc306cf7f4f), [`d3fb010`](https://github.com/mastra-ai/mastra/commit/d3fb010c98f575f1c0614452667396e2653815f6), [`702ee1c`](https://github.com/mastra-ai/mastra/commit/702ee1c41be67cc532b4dbe89bcb62143508f6f0), [`f495051`](https://github.com/mastra-ai/mastra/commit/f495051eb6496a720f637fc85b6d69941c12554c), [`e622f1d`](https://github.com/mastra-ai/mastra/commit/e622f1d3ab346a8e6aca6d1fe2eac99bd961e50b), [`861f111`](https://github.com/mastra-ai/mastra/commit/861f11189211b20ddb70d8df81a6b901fc78d11e), [`00f43e8`](https://github.com/mastra-ai/mastra/commit/00f43e8e97a80c82b27d5bd30494f10a715a1df9), [`1b6f651`](https://github.com/mastra-ai/mastra/commit/1b6f65127d4a0d6c38d0a1055cb84527db529d6b), [`96a1702`](https://github.com/mastra-ai/mastra/commit/96a1702ce362c50dda20c8b4a228b4ad1a36a17a), [`cb9f921`](https://github.com/mastra-ai/mastra/commit/cb9f921320913975657abb1404855d8c510f7ac5), [`114e7c1`](https://github.com/mastra-ai/mastra/commit/114e7c146ac682925f0fb37376c1be70e5d6e6e5), [`1b6f651`](https://github.com/mastra-ai/mastra/commit/1b6f65127d4a0d6c38d0a1055cb84527db529d6b), [`72df4a8`](https://github.com/mastra-ai/mastra/commit/72df4a8f9bf1a20cfd3d9006a4fdb597ad56d10a)]:
+   - @mastra/core@1.8.0-alpha.0
+
  ## 2.1.1

  ### Patch Changes
@@ -0,0 +1,38 @@
+ ---
+ name: mastra-rag
+ description: Documentation for @mastra/rag. Use when working with @mastra/rag APIs, configuration, or implementation.
+ metadata:
+   package: "@mastra/rag"
+   version: "2.1.2-alpha.0"
+ ---
+
+ ## When to use
+
+ Use this skill whenever you are working with @mastra/rag to obtain domain-specific knowledge.
+
+ ## How to use
+
+ Read the individual reference documents for detailed explanations and code examples.
+
+ ### Docs
+
+ - [Chunking and Embedding Documents](references/docs-rag-chunking-and-embedding.md) - Guide on chunking and embedding documents in Mastra for efficient processing and retrieval.
+ - [GraphRAG](references/docs-rag-graph-rag.md) - Guide on graph-based retrieval in Mastra's RAG systems for documents with complex relationships.
+ - [RAG (Retrieval-Augmented Generation) in Mastra](references/docs-rag-overview.md) - Overview of Retrieval-Augmented Generation (RAG) in Mastra, detailing its capabilities for enhancing LLM outputs with relevant context.
+ - [Retrieval, Semantic Search, Reranking](references/docs-rag-retrieval.md) - Guide on retrieval processes in Mastra's RAG systems, including semantic search, filtering, and re-ranking.
+
+ ### Reference
+
+ - [Reference: .chunk()](references/reference-rag-chunk.md) - Documentation for the chunk function in Mastra, which splits documents into smaller segments using various strategies.
+ - [Reference: DatabaseConfig](references/reference-rag-database-config.md) - API reference for database-specific configuration types used with vector query tools in Mastra RAG systems.
+ - [Reference: MDocument](references/reference-rag-document.md) - Documentation for the MDocument class in Mastra, which handles document processing and chunking.
+ - [Reference: ExtractParams](references/reference-rag-extract-params.md) - Documentation for metadata extraction configuration in Mastra.
+ - [Reference: GraphRAG](references/reference-rag-graph-rag.md) - Documentation for the GraphRAG class in Mastra, which implements a graph-based approach to retrieval augmented generation.
+ - [Reference: rerank()](references/reference-rag-rerank.md) - Documentation for the rerank function in Mastra, which provides advanced reranking capabilities for vector search results.
+ - [Reference: rerankWithScorer()](references/reference-rag-rerankWithScorer.md) - Documentation for the rerankWithScorer function in Mastra, which provides advanced reranking capabilities for vector search results.
+ - [Reference: createDocumentChunkerTool()](references/reference-tools-document-chunker-tool.md) - Documentation for the Document Chunker Tool in Mastra, which splits documents into smaller chunks for efficient processing and retrieval.
+ - [Reference: createGraphRAGTool()](references/reference-tools-graph-rag-tool.md) - Documentation for the GraphRAG Tool in Mastra, which enhances RAG by building a graph of semantic relationships between documents.
+ - [Reference: createVectorQueryTool()](references/reference-tools-vector-query-tool.md) - Documentation for the Vector Query Tool in Mastra, which facilitates semantic search over vector stores with filtering and reranking capabilities.
+
+ Read [assets/SOURCE_MAP.json](assets/SOURCE_MAP.json) for source code references.
@@ -0,0 +1,6 @@
+ {
+   "version": "2.1.2-alpha.0",
+   "package": "@mastra/rag",
+   "exports": {},
+   "modules": {}
+ }
@@ -0,0 +1,183 @@
+ # Chunking and Embedding Documents
+
+ Before processing, create an MDocument instance from your content. You can initialize it from various formats:
+
+ ```ts
+ const docFromText = MDocument.fromText('Your plain text content...')
+ const docFromHTML = MDocument.fromHTML('<html>Your HTML content...</html>')
+ const docFromMarkdown = MDocument.fromMarkdown('# Your Markdown content...')
+ const docFromJSON = MDocument.fromJSON(`{ "key": "value" }`)
+ ```
+
+ ## Step 1: Document Processing
+
+ Use `chunk` to split documents into manageable pieces. Mastra supports multiple chunking strategies optimized for different document types:
+
+ - `recursive`: Smart splitting based on content structure
+ - `character`: Simple character-based splits
+ - `token`: Token-aware splitting
+ - `markdown`: Markdown-aware splitting
+ - `semantic-markdown`: Markdown splitting based on related header families
+ - `html`: HTML structure-aware splitting
+ - `json`: JSON structure-aware splitting
+ - `latex`: LaTeX structure-aware splitting
+ - `sentence`: Sentence-aware splitting
+
+ > **Note:** Each strategy accepts different parameters optimized for its chunking approach.
+
+ Here's an example of how to use the `recursive` strategy:
+
+ ```ts
+ const chunks = await doc.chunk({
+   strategy: 'recursive',
+   maxSize: 512,
+   overlap: 50,
+   separators: ['\n'],
+   extract: {
+     metadata: true, // Optionally extract metadata
+   },
+ })
+ ```
+
+ For text where preserving sentence structure is important, here's an example of how to use the `sentence` strategy:
+
+ ```ts
+ const chunks = await doc.chunk({
+   strategy: 'sentence',
+   maxSize: 450,
+   minSize: 50,
+   overlap: 0,
+   sentenceEnders: ['.'],
+ })
+ ```
+
+ For markdown documents where preserving the semantic relationships between sections is important, here's an example of how to use the `semantic-markdown` strategy:
+
+ ```ts
+ const chunks = await doc.chunk({
+   strategy: 'semantic-markdown',
+   joinThreshold: 500,
+   modelName: 'gpt-3.5-turbo',
+ })
+ ```
+
+ > **Note:** Metadata extraction may use LLM calls, so ensure your API key is set.
+
+ We go deeper into chunking strategies in our [`chunk()` reference documentation](https://mastra.ai/reference/rag/chunk).
+
+ ## Step 2: Embedding Generation
+
+ Transform chunks into embeddings using your preferred provider. Mastra supports embedding models through the model router.
+
+ ### Using the Model Router
+
+ The simplest way is to use Mastra's model router with `provider/model` strings:
+
+ ```ts
+ import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
+ import { embedMany } from 'ai'
+
+ const { embeddings } = await embedMany({
+   model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
+   values: chunks.map(chunk => chunk.text),
+ })
+ ```
+
+ Mastra supports OpenAI and Google embedding models. For a complete list of supported embedding models, see the [embeddings reference](https://mastra.ai/reference/rag/embeddings).
+
+ The model router automatically handles API key detection from environment variables.
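For example, with the OpenAI model shown above, the router looks for the provider's standard environment variable; the key value below is a placeholder:

```shell
# Set before starting your app; the model router picks this up automatically
export OPENAI_API_KEY="sk-..."
```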
+
+ The embedding functions return vectors, arrays of numbers representing the semantic meaning of your text, ready for similarity searches in your vector database.
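As a concrete illustration of "ready for similarity searches": vector databases typically compare embeddings with cosine similarity. This standalone sketch is not part of @mastra/rag, just the underlying math:

```ts
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|)
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Vectors pointing the same way score 1; orthogonal vectors score 0
console.log(cosineSimilarity([1, 0], [2, 0])) // 1
console.log(cosineSimilarity([1, 0], [0, 1])) // 0
```

Text with similar meaning produces embeddings that score close to 1, which is what `topK` queries rank by.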
+
+ ### Configuring Embedding Dimensions
+
+ Embedding models typically output vectors with a fixed number of dimensions (e.g., 1536 for OpenAI's `text-embedding-3-small`). Some models support reducing this dimensionality, which can help:
+
+ - Decrease storage requirements in vector databases
+ - Reduce computational costs for similarity searches
+
+ Here are some supported models:
+
+ OpenAI (text-embedding-3 models):
+
+ ```ts
+ import { embedMany } from 'ai'
+ import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
+
+ const { embeddings } = await embedMany({
+   model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
+   options: {
+     dimensions: 256, // Only supported in text-embedding-3 and later
+   },
+   values: chunks.map(chunk => chunk.text),
+ })
+ ```
+
+ Google (gemini-embedding-001):
+
+ ```ts
+ import { embedMany } from 'ai'
+ import { google } from '@ai-sdk/google'
+
+ const { embeddings } = await embedMany({
+   model: google('gemini-embedding-001', {
+     outputDimensionality: 256, // Truncates excessive values from the end
+   }),
+   values: chunks.map(chunk => chunk.text),
+ })
+ ```
+
+ > **Vector Database Compatibility:** When storing embeddings, the vector database index must be configured to match the output size of your embedding model. If the dimensions do not match, you may get errors or data corruption.
+
+ ## Example: Complete Pipeline
+
+ Here's an example showing document processing and embedding generation with both providers:
+
+ ```ts
+ import { embedMany } from 'ai'
+ import { MDocument } from '@mastra/rag'
+ import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
+
+ // Initialize document
+ const doc = MDocument.fromText(`
+   Climate change poses significant challenges to global agriculture.
+   Rising temperatures and changing precipitation patterns affect crop yields.
+ `)
+
+ // Create chunks
+ const chunks = await doc.chunk({
+   strategy: 'recursive',
+   maxSize: 256,
+   overlap: 50,
+ })
+
+ // Generate embeddings with OpenAI
+ const { embeddings } = await embedMany({
+   model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
+   values: chunks.map(chunk => chunk.text),
+ })
+
+ // OR generate embeddings with Cohere
+ // const { embeddings } = await embedMany({
+ //   model: 'cohere/embed-english-v3.0',
+ //   values: chunks.map(chunk => chunk.text),
+ // })
+
+ // Store embeddings in your vector database
+ await vectorStore.upsert({
+   indexName: 'embeddings',
+   vectors: embeddings,
+ })
+ ```
+
+ For more examples of different chunking strategies and embedding configurations, see:
+
+ - [Chunk Reference](https://mastra.ai/reference/rag/chunk)
+ - [Embeddings Reference](https://mastra.ai/reference/rag/embeddings)
+
+ For more details on vector databases and embeddings, see:
+
+ - [Vector Databases](https://mastra.ai/docs/rag/vector-databases)
+ - [Embedding API Reference](https://mastra.ai/reference/rag/embeddings)
@@ -0,0 +1,215 @@
+ # GraphRAG
+
+ Graph-based retrieval enhances traditional vector search by following relationships between chunks of information. This approach is useful when information is spread across multiple documents or when documents reference each other.
+
+ ## When to use GraphRAG
+
+ GraphRAG is particularly effective when:
+
+ - Information is spread across multiple documents
+ - Documents reference each other
+ - You need to traverse relationships to find complete answers
+ - Understanding connections between concepts is important
+ - Simple vector similarity misses important contextual relationships
+
+ For straightforward semantic search without relationship traversal, use [standard retrieval methods](https://mastra.ai/docs/rag/retrieval).
+
+ ## How GraphRAG works
+
+ GraphRAG combines vector similarity with knowledge graph traversal:
+
+ 1. Initial vector search retrieves relevant chunks based on semantic similarity
+ 2. A knowledge graph is constructed from the retrieved chunks
+ 3. The graph is traversed to find connected information
+ 4. Results include both directly relevant chunks and related content
+
+ This process helps surface information that might not be semantically similar to the query but is contextually relevant through connections.
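The graph-building and traversal steps above can be sketched in miniature. This is an illustrative approximation only; the chunk shape and the `buildGraph`/`traverse` helpers are hypothetical, not Mastra's actual GraphRAG implementation:

```ts
type Chunk = { id: number; text: string; embedding: number[] }

// Cosine similarity between two embedding vectors
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0)
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0))
  return dot / (norm(a) * norm(b))
}

// Step 2: connect chunks whose embedding similarity exceeds the threshold
function buildGraph(chunks: Chunk[], threshold: number): Map<number, number[]> {
  const edges = new Map<number, number[]>()
  for (const c of chunks) edges.set(c.id, [])
  for (let i = 0; i < chunks.length; i++) {
    for (let j = i + 1; j < chunks.length; j++) {
      if (cosine(chunks[i].embedding, chunks[j].embedding) >= threshold) {
        edges.get(chunks[i].id)!.push(chunks[j].id)
        edges.get(chunks[j].id)!.push(chunks[i].id)
      }
    }
  }
  return edges
}

// Steps 3-4: starting from the top vector-search hits, walk the graph
// breadth-first to collect connected chunks as additional context
function traverse(start: number[], edges: Map<number, number[]>): Set<number> {
  const seen = new Set<number>(start)
  const queue = [...start]
  while (queue.length > 0) {
    const id = queue.shift()!
    for (const next of edges.get(id) ?? []) {
      if (!seen.has(next)) {
        seen.add(next)
        queue.push(next)
      }
    }
  }
  return seen
}
```

Note how the `threshold` here plays the same role as `graphOptions.threshold` below: raising it removes edges and keeps traversal close to the original hits.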
27
+
28
+ ## Creating a graph query tool
29
+
30
+ The Graph Query Tool provides agents with the ability to perform graph-based retrieval:
31
+
32
+ ```ts
33
+ import { createGraphRAGTool } from '@mastra/rag'
34
+ import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
35
+
36
+ const graphQueryTool = createGraphRAGTool({
37
+ vectorStoreName: 'pgVector',
38
+ indexName: 'embeddings',
39
+ model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
40
+ graphOptions: {
41
+ threshold: 0.7,
42
+ },
43
+ })
44
+ ```
45
+
46
+ ### Configuration options
47
+
48
+ The `graphOptions` parameter controls how the knowledge graph is built and traversed:
49
+
50
+ - `threshold`: Similarity threshold (0-1) for determining which chunks are related. Higher values create sparser graphs with stronger connections; lower values create denser graphs with more potential relationships.
51
+ - `dimension`: Vector embedding dimension. Must match the embedding model's output dimension (e.g., 1536 for OpenAI's text-embedding-3-small).
52
+
53
+ ```ts
54
+ const graphQueryTool = createGraphRAGTool({
55
+ vectorStoreName: 'pgVector',
56
+ indexName: 'embeddings',
57
+ model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
58
+ graphOptions: {
59
+ dimension: 1536,
60
+ threshold: 0.7,
61
+ },
62
+ })
63
+ ```
64
+
65
+ ## Using GraphRAG with agents
66
+
67
+ Integrate the graph query tool with an agent to enable graph-based retrieval:
68
+
69
+ ```ts
70
+ import { Agent } from '@mastra/core/agent'
71
+
72
+ const ragAgent = new Agent({
73
+ id: 'rag-agent',
74
+ name: 'GraphRAG Agent',
75
+ instructions: `You are a helpful assistant that answers questions based on the provided context.
76
+ When answering questions, use the graph query tool to find relevant information and relationships.
77
+ Base your answers on the context provided by the tool, and clearly state if the context doesn't contain enough information.`,
78
+ model: 'openai/gpt-5.1',
79
+ tools: {
80
+ graphQueryTool,
81
+ },
82
+ })
83
+ ```
84
+
85
+ ## Document processing and storage
86
+
87
+ Before using graph-based retrieval, process documents into chunks and store their embeddings:
88
+
89
+ ```ts
90
+ import { MDocument } from '@mastra/rag'
91
+ import { embedMany } from 'ai'
92
+ import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
93
+
94
+ // Create and chunk document
95
+ const doc = MDocument.fromText('Your document content here...')
96
+
97
+ const chunks = await doc.chunk({
98
+ strategy: 'recursive',
99
+ size: 512,
100
+ overlap: 50,
101
+ separator: '\n',
102
+ })
103
+
104
+ // Generate embeddings
105
+ const { embeddings } = await embedMany({
106
+ model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
107
+ values: chunks.map(chunk => chunk.text),
108
+ })
109
+
110
+ // Store in vector database
111
+ const vectorStore = mastra.getVector('pgVector')
112
+ await vectorStore.createIndex({
113
+ indexName: 'embeddings',
114
+ dimension: 1536,
115
+ })
116
+ await vectorStore.upsert({
117
+ indexName: 'embeddings',
118
+ vectors: embeddings,
119
+ metadata: chunks?.map(chunk => ({ text: chunk.text })),
120
+ })
121
+ ```
+
+ ## Querying with GraphRAG
+
+ Once configured, the agent can perform graph-based queries:
+
+ ```ts
+ const query = 'What are the effects of infrastructure changes on local businesses?'
+ const response = await ragAgent.generate(query)
+ console.log(response.text)
+ ```
+
+ The agent uses the graph query tool to:
+
+ 1. Convert the query to an embedding
+ 2. Find semantically similar chunks in the vector store
+ 3. Build a knowledge graph from related chunks
+ 4. Traverse the graph to find connected information
+ 5. Return comprehensive context for generating the response
+
+ ## Choosing the right threshold
+
+ The threshold parameter significantly impacts retrieval quality:
+
+ - **High threshold (0.8-0.9)**: Strict connections, fewer relationships, more precise but potentially incomplete results
+ - **Medium threshold (0.6-0.8)**: Balanced approach, good for most use cases
+ - **Low threshold (0.4-0.6)**: More connections, broader context, risk of including less relevant information
+
+ Start with 0.7 and adjust based on your specific use case:
+
+ ```ts
+ // Strict connections for precise answers
+ const strictGraphTool = createGraphRAGTool({
+   vectorStoreName: 'pgVector',
+   indexName: 'embeddings',
+   model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
+   graphOptions: {
+     threshold: 0.85,
+   },
+ })
+
+ // Broader connections for exploratory queries
+ const broadGraphTool = createGraphRAGTool({
+   vectorStoreName: 'pgVector',
+   indexName: 'embeddings',
+   model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
+   graphOptions: {
+     threshold: 0.5,
+   },
+ })
+ ```
+
+ ## Combining with other retrieval methods
+
+ GraphRAG can be used alongside other retrieval approaches:
+
+ ```ts
+ import { Agent } from '@mastra/core/agent'
+ import { createGraphRAGTool, createVectorQueryTool } from '@mastra/rag'
+ import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
+
+ const vectorQueryTool = createVectorQueryTool({
+   vectorStoreName: 'pgVector',
+   indexName: 'embeddings',
+   model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
+ })
+
+ const graphQueryTool = createGraphRAGTool({
+   vectorStoreName: 'pgVector',
+   indexName: 'embeddings',
+   model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
+   graphOptions: {
+     threshold: 0.7,
+   },
+ })
+
+ const agent = new Agent({
+   id: 'rag-agent',
+   name: 'RAG Agent',
+   instructions: `Use vector search for simple fact-finding queries.
+     Use graph search when you need to understand relationships or find connected information.`,
+   model: 'openai/gpt-5.1',
+   tools: {
+     vectorQueryTool,
+     graphQueryTool,
+   },
+ })
+ ```
+
+ This gives the agent flexibility to choose the appropriate retrieval method based on the query.
+
+ ## Reference
+
+ For detailed API documentation, see:
+
+ - [GraphRAG Class](https://mastra.ai/reference/rag/graph-rag)
+ - [createGraphRAGTool()](https://mastra.ai/reference/tools/graph-rag-tool)
@@ -0,0 +1,72 @@
+ # RAG (Retrieval-Augmented Generation) in Mastra
+
+ RAG in Mastra helps you enhance LLM outputs by incorporating relevant context from your own data sources, improving accuracy and grounding responses in real information.
+
+ Mastra's RAG system provides:
+
+ - Standardized APIs to process and embed documents
+ - Support for multiple vector stores
+ - Chunking and embedding strategies for optimal retrieval
+ - Observability for tracking embedding and retrieval performance
+
+ ## Example
+
+ To implement RAG, you process your documents into chunks, create embeddings, store them in a vector database, and then retrieve relevant context at query time.
+
+ ```ts
+ import { embedMany } from 'ai'
+ import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
+ import { PgVector } from '@mastra/pg'
+ import { MDocument } from '@mastra/rag'
+
+ // 1. Initialize document
+ const doc = MDocument.fromText(`Your document text here...`)
+
+ // 2. Create chunks
+ const chunks = await doc.chunk({
+   strategy: 'recursive',
+   maxSize: 512,
+   overlap: 50,
+ })
+
+ // 3. Generate embeddings; we need to pass the text of each chunk
+ const { embeddings } = await embedMany({
+   values: chunks.map(chunk => chunk.text),
+   model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
+ })
+
+ // 4. Store in vector database, using an index name of 'embeddings'
+ const pgVector = new PgVector({
+   id: 'pg-vector',
+   connectionString: process.env.POSTGRES_CONNECTION_STRING,
+ })
+ await pgVector.upsert({
+   indexName: 'embeddings',
+   vectors: embeddings,
+ })
+
+ // 5. Query similar chunks, where queryVector is the embedding of the query
+ const results = await pgVector.query({
+   indexName: 'embeddings',
+   queryVector: queryVector,
+   topK: 3,
+ })
+
+ console.log('Similar chunks:', results)
+ ```
+
+ This example shows the essentials: initialize a document, create chunks, generate embeddings, store them, and query for similar content.
+
+ ## Document Processing
+
+ The basic building block of RAG is document processing. Documents can be chunked using various strategies (recursive, sliding window, etc.) and enriched with metadata. See the [chunking and embedding doc](https://mastra.ai/docs/rag/chunking-and-embedding).
+
+ ## Vector Storage
+
+ Mastra supports multiple vector stores for embedding persistence and similarity search, including pgvector, Pinecone, Qdrant, and MongoDB. See the [vector database doc](https://mastra.ai/docs/rag/vector-databases).
+
+ ## More resources
+
+ - [Chain of Thought RAG Example](https://github.com/mastra-ai/mastra/tree/main/examples/basics/rag/cot-rag)