@mastra/rag 2.1.2-alpha.0 → 2.1.3-alpha.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (32)
  1. package/CHANGELOG.md +20 -0
  2. package/LICENSE.md +15 -0
  3. package/dist/docs/SKILL.md +3 -3
  4. package/dist/docs/assets/SOURCE_MAP.json +1 -1
  5. package/dist/docs/references/docs-rag-chunking-and-embedding.md +5 -5
  6. package/dist/docs/references/docs-rag-graph-rag.md +2 -2
  7. package/dist/docs/references/docs-rag-overview.md +2 -2
  8. package/dist/docs/references/docs-rag-retrieval.md +16 -16
  9. package/dist/docs/references/reference-rag-chunk.md +40 -40
  10. package/dist/docs/references/reference-rag-database-config.md +19 -15
  11. package/dist/docs/references/reference-rag-document.md +13 -13
  12. package/dist/docs/references/reference-rag-extract-params.md +31 -31
  13. package/dist/docs/references/reference-rag-graph-rag.md +16 -16
  14. package/dist/docs/references/reference-rag-rerank.md +28 -20
  15. package/dist/docs/references/reference-rag-rerankWithScorer.md +27 -19
  16. package/dist/docs/references/reference-tools-document-chunker-tool.md +11 -11
  17. package/dist/docs/references/reference-tools-graph-rag-tool.md +23 -25
  18. package/dist/docs/references/reference-tools-vector-query-tool.md +47 -35
  19. package/dist/document/validation.d.ts.map +1 -1
  20. package/dist/index.cjs +6 -5
  21. package/dist/index.cjs.map +1 -1
  22. package/dist/index.js +6 -5
  23. package/dist/index.js.map +1 -1
  24. package/dist/tools/document-chunker.d.ts +1 -3
  25. package/dist/tools/document-chunker.d.ts.map +1 -1
  26. package/dist/tools/graph-rag.d.ts +5 -19
  27. package/dist/tools/graph-rag.d.ts.map +1 -1
  28. package/dist/tools/vector-query.d.ts +5 -19
  29. package/dist/tools/vector-query.d.ts.map +1 -1
  30. package/dist/utils/tool-schemas.d.ts +9 -47
  31. package/dist/utils/tool-schemas.d.ts.map +1 -1
  32. package/package.json +9 -9
package/CHANGELOG.md CHANGED
@@ -1,5 +1,25 @@
1
1
  # @mastra/rag
2
2
 
3
+ ## 2.1.3-alpha.0
4
+
5
+ ### Patch Changes
6
+
7
+ - Standardized all logger calls across the codebase to use static string messages with structured data objects. Dynamic values are now passed as key-value pairs in the second argument instead of being interpolated into template literal strings. This improves log filterability and searchability in observability storage. ([#14899](https://github.com/mastra-ai/mastra/pull/14899))
8
+
9
+ Removed ~150 redundant or noisy log calls including duplicate error logging after trackException and verbose in-memory storage CRUD traces.
10
+
11
+ - Updated dependencies [[`cbeec24`](https://github.com/mastra-ai/mastra/commit/cbeec24b3c97a1a296e7e461e66cc7f7d215dc50), [`cee146b`](https://github.com/mastra-ai/mastra/commit/cee146b5d858212e1df2b2730fc36d3ceda0e08d), [`aa0aeff`](https://github.com/mastra-ai/mastra/commit/aa0aeffa11efbef5e219fbd97bf43d263cfe3afe), [`2bcec65`](https://github.com/mastra-ai/mastra/commit/2bcec652d62b07eab15e9eb9822f70184526eede), [`ad9bded`](https://github.com/mastra-ai/mastra/commit/ad9bdedf86a824801f49928a8d40f6e31ff5450f), [`cbeec24`](https://github.com/mastra-ai/mastra/commit/cbeec24b3c97a1a296e7e461e66cc7f7d215dc50), [`208c0bb`](https://github.com/mastra-ai/mastra/commit/208c0bbacbf5a1da6318f2a0e0c544390e542ddc), [`f566ee7`](https://github.com/mastra-ai/mastra/commit/f566ee7d53a3da33a01103e2a5ac2070ddefe6b0)]:
12
+ - @mastra/core@1.20.0-alpha.0
13
+
14
+ ## 2.1.2
15
+
16
+ ### Patch Changes
17
+
18
+ - Improved token-based chunking performance in `token` and `semantic-markdown` strategies. Markdown knowledge bases now chunk significantly faster with lower tokenization overhead. ([#13495](https://github.com/mastra-ai/mastra/pull/13495))
19
+
20
+ - Updated dependencies [[`df170fd`](https://github.com/mastra-ai/mastra/commit/df170fd139b55f845bfd2de8488b16435bd3d0da), [`ae55343`](https://github.com/mastra-ai/mastra/commit/ae5534397fc006fd6eef3e4f80c235bcdc9289ef), [`c290cec`](https://github.com/mastra-ai/mastra/commit/c290cec5bf9107225de42942b56b487107aa9dce), [`f03e794`](https://github.com/mastra-ai/mastra/commit/f03e794630f812b56e95aad54f7b1993dc003add), [`aa4a5ae`](https://github.com/mastra-ai/mastra/commit/aa4a5aedb80d8d6837bab8cbb2e301215d1ba3e9), [`de3f584`](https://github.com/mastra-ai/mastra/commit/de3f58408752a8d80a295275c7f23fc306cf7f4f), [`d3fb010`](https://github.com/mastra-ai/mastra/commit/d3fb010c98f575f1c0614452667396e2653815f6), [`702ee1c`](https://github.com/mastra-ai/mastra/commit/702ee1c41be67cc532b4dbe89bcb62143508f6f0), [`f495051`](https://github.com/mastra-ai/mastra/commit/f495051eb6496a720f637fc85b6d69941c12554c), [`e622f1d`](https://github.com/mastra-ai/mastra/commit/e622f1d3ab346a8e6aca6d1fe2eac99bd961e50b), [`861f111`](https://github.com/mastra-ai/mastra/commit/861f11189211b20ddb70d8df81a6b901fc78d11e), [`00f43e8`](https://github.com/mastra-ai/mastra/commit/00f43e8e97a80c82b27d5bd30494f10a715a1df9), [`1b6f651`](https://github.com/mastra-ai/mastra/commit/1b6f65127d4a0d6c38d0a1055cb84527db529d6b), [`96a1702`](https://github.com/mastra-ai/mastra/commit/96a1702ce362c50dda20c8b4a228b4ad1a36a17a), [`cb9f921`](https://github.com/mastra-ai/mastra/commit/cb9f921320913975657abb1404855d8c510f7ac5), [`114e7c1`](https://github.com/mastra-ai/mastra/commit/114e7c146ac682925f0fb37376c1be70e5d6e6e5), [`1b6f651`](https://github.com/mastra-ai/mastra/commit/1b6f65127d4a0d6c38d0a1055cb84527db529d6b), [`72df4a8`](https://github.com/mastra-ai/mastra/commit/72df4a8f9bf1a20cfd3d9006a4fdb597ad56d10a)]:
21
+ - @mastra/core@1.8.0
22
+
3
23
  ## 2.1.2-alpha.0
4
24
 
5
25
  ### Patch Changes
package/LICENSE.md CHANGED
@@ -1,3 +1,18 @@
1
+ Portions of this software are licensed as follows:
2
+
3
+ - All content that resides under any directory named "ee/" within this
4
+ repository, including but not limited to:
5
+ - `packages/core/src/auth/ee/`
6
+ - `packages/server/src/server/auth/ee/`
7
+ is licensed under the license defined in `ee/LICENSE`.
8
+
9
+ - All third-party components incorporated into the Mastra Software are
10
+ licensed under the original license provided by the owner of the
11
+ applicable component.
12
+
13
+ - Content outside of the above-mentioned directories or restrictions is
14
+ available under the "Apache License 2.0" as defined below.
15
+
1
16
  # Apache License 2.0
2
17
 
3
18
  Copyright (c) 2025 Kepler Software, Inc.
@@ -3,7 +3,7 @@ name: mastra-rag
3
3
  description: Documentation for @mastra/rag. Use when working with @mastra/rag APIs, configuration, or implementation.
4
4
  metadata:
5
5
  package: "@mastra/rag"
6
- version: "2.1.2-alpha.0"
6
+ version: "2.1.3-alpha.0"
7
7
  ---
8
8
 
9
9
  ## When to use
@@ -16,10 +16,10 @@ Read the individual reference documents for detailed explanations and code examp
16
16
 
17
17
  ### Docs
18
18
 
19
- - [Chunking and Embedding Documents](references/docs-rag-chunking-and-embedding.md) - Guide on chunking and embedding documents in Mastra for efficient processing and retrieval.
19
+ - [Chunking and embedding documents](references/docs-rag-chunking-and-embedding.md) - Guide on chunking and embedding documents in Mastra for efficient processing and retrieval.
20
20
  - [GraphRAG](references/docs-rag-graph-rag.md) - Guide on graph-based retrieval in Mastra's RAG systems for documents with complex relationships.
21
21
  - [RAG (Retrieval-Augmented Generation) in Mastra](references/docs-rag-overview.md) - Overview of Retrieval-Augmented Generation (RAG) in Mastra, detailing its capabilities for enhancing LLM outputs with relevant context.
22
- - [Retrieval, Semantic Search, Reranking](references/docs-rag-retrieval.md) - Guide on retrieval processes in Mastra's RAG systems, including semantic search, filtering, and re-ranking.
22
+ - [Retrieval, semantic search, reranking](references/docs-rag-retrieval.md) - Guide on retrieval processes in Mastra's RAG systems, including semantic search, filtering, and re-ranking.
23
23
 
24
24
  ### Reference
25
25
 
@@ -1,5 +1,5 @@
1
1
  {
2
- "version": "2.1.2-alpha.0",
2
+ "version": "2.1.3-alpha.0",
3
3
  "package": "@mastra/rag",
4
4
  "exports": {},
5
5
  "modules": {}
@@ -1,4 +1,4 @@
1
- # Chunking and Embedding Documents
1
+ # Chunking and embedding documents
2
2
 
3
3
  Before processing, create a MDocument instance from your content. You can initialize it from various formats:
4
4
 
@@ -9,7 +9,7 @@ const docFromMarkdown = MDocument.fromMarkdown('# Your Markdown content...')
9
9
  const docFromJSON = MDocument.fromJSON(`{ "key": "value" }`)
10
10
  ```
11
11
 
12
- ## Step 1: Document Processing
12
+ ## Document processing
13
13
 
14
14
  Use `chunk` to split documents into manageable pieces. Mastra supports multiple chunking strategies optimized for different document types:
15
15
 
@@ -65,7 +65,7 @@ const chunks = await doc.chunk({
65
65
 
66
66
  We go deeper into chunking strategies in our [`chunk()` reference documentation](https://mastra.ai/reference/rag/chunk).
67
67
 
68
- ## Step 2: Embedding Generation
68
+ ## Embedding generation
69
69
 
70
70
  Transform chunks into embeddings using your preferred provider. Mastra supports embedding models through the model router.
71
71
 
@@ -123,9 +123,9 @@ const { embeddings } = await embedMany({
123
123
  })
124
124
  ```
125
125
 
126
- > **Vector Database Compatibility:** When storing embeddings, the vector database index must be configured to match the output size of your embedding model. If the dimensions do not match, you may get errors or data corruption.
126
+ > **Vector Database Compatibility:** When storing embeddings, the vector database index must be configured to match the output size of your embedding model. If the dimensions don't match, you may get errors or data corruption.
127
127
 
128
- ## Example: Complete Pipeline
128
+ ## Example: Complete pipeline
129
129
 
130
130
  Here's an example showing document processing and embedding generation with both providers:
131
131
 
@@ -75,7 +75,7 @@ const ragAgent = new Agent({
75
75
  instructions: `You are a helpful assistant that answers questions based on the provided context.
76
76
  When answering questions, use the graph query tool to find relevant information and relationships.
77
77
  Base your answers on the context provided by the tool, and clearly state if the context doesn't contain enough information.`,
78
- model: 'openai/gpt-5.1',
78
+ model: 'openai/gpt-5.4',
79
79
  tools: {
80
80
  graphQueryTool,
81
81
  },
@@ -197,7 +197,7 @@ const agent = new Agent({
197
197
  name: 'RAG Agent',
198
198
  instructions: `Use vector search for simple fact-finding queries.
199
199
  Use graph search when you need to understand relationships or find connected information.`,
200
- model: 'openai/gpt-5.1',
200
+ model: 'openai/gpt-5.4',
201
201
  tools: {
202
202
  vectorQueryTool,
203
203
  graphQueryTool,
@@ -59,11 +59,11 @@ console.log('Similar chunks:', results)
59
59
 
60
60
  This example shows the essentials: initialize a document, create chunks, generate embeddings, store them, and query for similar content.
61
61
 
62
- ## Document Processing
62
+ ## Document processing
63
63
 
64
64
  The basic building block of RAG is document processing. Documents can be chunked using various strategies (recursive, sliding window, etc.) and enriched with metadata. See the [chunking and embedding doc](https://mastra.ai/docs/rag/chunking-and-embedding).
65
65
 
66
- ## Vector Storage
66
+ ## Vector storage
67
67
 
68
68
  Mastra supports multiple vector stores for embedding persistence and similarity search, including pgvector, Pinecone, Qdrant, and MongoDB. See the [vector database doc](https://mastra.ai/docs/rag/vector-databases).
69
69
 
@@ -1,10 +1,10 @@
1
- # Retrieval in RAG Systems
1
+ # Retrieval in RAG systems
2
2
 
3
3
  After storing embeddings, you need to retrieve relevant chunks to answer user queries.
4
4
 
5
5
  Mastra provides flexible retrieval options with support for semantic search, filtering, and re-ranking.
6
6
 
7
- ## How Retrieval Works
7
+ ## How retrieval works
8
8
 
9
9
  1. The user's query is converted to an embedding using the same model used for document embeddings
10
10
  2. This embedding is compared to stored embeddings using vector similarity
@@ -14,7 +14,7 @@ Mastra provides flexible retrieval options with support for semantic search, fil
14
14
  - Re-ranked for better relevance
15
15
  - Processed through a knowledge graph
16
16
 
17
- ## Basic Retrieval
17
+ ## Basic retrieval
18
18
 
19
19
  The simplest approach is direct semantic search. This method uses vector similarity to find chunks that are semantically similar to the query:
20
20
 
@@ -63,7 +63,7 @@ Results include both the text content and a similarity score:
63
63
  ]
64
64
  ```
65
65
 
66
- ## Advanced Retrieval options
66
+ ## Advanced retrieval options
67
67
 
68
68
  ### Metadata Filtering
69
69
 
@@ -272,7 +272,7 @@ import { PGVECTOR_PROMPT } from '@mastra/pg'
272
272
  export const ragAgent = new Agent({
273
273
  id: 'rag-agent',
274
274
  name: 'RAG Agent',
275
- model: 'openai/gpt-5.1',
275
+ model: 'openai/gpt-5.4',
276
276
  instructions: `
277
277
  Process queries using the provided context. Structure responses to be concise and relevant.
278
278
  ${PGVECTOR_PROMPT}
@@ -289,7 +289,7 @@ import { PINECONE_PROMPT } from '@mastra/pinecone'
289
289
  export const ragAgent = new Agent({
290
290
  id: 'rag-agent',
291
291
  name: 'RAG Agent',
292
- model: 'openai/gpt-5.1',
292
+ model: 'openai/gpt-5.4',
293
293
  instructions: `
294
294
  Process queries using the provided context. Structure responses to be concise and relevant.
295
295
  ${PINECONE_PROMPT}
@@ -306,7 +306,7 @@ import { QDRANT_PROMPT } from '@mastra/qdrant'
306
306
  export const ragAgent = new Agent({
307
307
  id: 'rag-agent',
308
308
  name: 'RAG Agent',
309
- model: 'openai/gpt-5.1',
309
+ model: 'openai/gpt-5.4',
310
310
  instructions: `
311
311
  Process queries using the provided context. Structure responses to be concise and relevant.
312
312
  ${QDRANT_PROMPT}
@@ -323,7 +323,7 @@ import { CHROMA_PROMPT } from '@mastra/chroma'
323
323
  export const ragAgent = new Agent({
324
324
  id: 'rag-agent',
325
325
  name: 'RAG Agent',
326
- model: 'openai/gpt-5.1',
326
+ model: 'openai/gpt-5.4',
327
327
  instructions: `
328
328
  Process queries using the provided context. Structure responses to be concise and relevant.
329
329
  ${CHROMA_PROMPT}
@@ -340,7 +340,7 @@ import { ASTRA_PROMPT } from '@mastra/astra'
340
340
  export const ragAgent = new Agent({
341
341
  id: 'rag-agent',
342
342
  name: 'RAG Agent',
343
- model: 'openai/gpt-5.1',
343
+ model: 'openai/gpt-5.4',
344
344
  instructions: `
345
345
  Process queries using the provided context. Structure responses to be concise and relevant.
346
346
  ${ASTRA_PROMPT}
@@ -357,7 +357,7 @@ import { LIBSQL_PROMPT } from '@mastra/libsql'
357
357
  export const ragAgent = new Agent({
358
358
  id: 'rag-agent',
359
359
  name: 'RAG Agent',
360
- model: 'openai/gpt-5.1',
360
+ model: 'openai/gpt-5.4',
361
361
  instructions: `
362
362
  Process queries using the provided context. Structure responses to be concise and relevant.
363
363
  ${LIBSQL_PROMPT}
@@ -374,7 +374,7 @@ import { UPSTASH_PROMPT } from '@mastra/upstash'
374
374
  export const ragAgent = new Agent({
375
375
  id: 'rag-agent',
376
376
  name: 'RAG Agent',
377
- model: 'openai/gpt-5.1',
377
+ model: 'openai/gpt-5.4',
378
378
  instructions: `
379
379
  Process queries using the provided context. Structure responses to be concise and relevant.
380
380
  ${UPSTASH_PROMPT}
@@ -391,7 +391,7 @@ import { VECTORIZE_PROMPT } from '@mastra/vectorize'
391
391
  export const ragAgent = new Agent({
392
392
  id: 'rag-agent',
393
393
  name: 'RAG Agent',
394
- model: 'openai/gpt-5.1',
394
+ model: 'openai/gpt-5.4',
395
395
  instructions: `
396
396
  Process queries using the provided context. Structure responses to be concise and relevant.
397
397
  ${VECTORIZE_PROMPT}
@@ -408,7 +408,7 @@ import { MONGODB_PROMPT } from '@mastra/mongodb'
408
408
  export const ragAgent = new Agent({
409
409
  id: 'rag-agent',
410
410
  name: 'RAG Agent',
411
- model: 'openai/gpt-5.1',
411
+ model: 'openai/gpt-5.4',
412
412
  instructions: `
413
413
  Process queries using the provided context. Structure responses to be concise and relevant.
414
414
  ${MONGODB_PROMPT}
@@ -425,7 +425,7 @@ import { OPENSEARCH_PROMPT } from '@mastra/opensearch'
425
425
  export const ragAgent = new Agent({
426
426
  id: 'rag-agent',
427
427
  name: 'RAG Agent',
428
- model: 'openai/gpt-5.1',
428
+ model: 'openai/gpt-5.4',
429
429
  instructions: `
430
430
  Process queries using the provided context. Structure responses to be concise and relevant.
431
431
  ${OPENSEARCH_PROMPT}
@@ -442,7 +442,7 @@ import { S3VECTORS_PROMPT } from '@mastra/s3vectors'
442
442
  export const ragAgent = new Agent({
443
443
  id: 'rag-agent',
444
444
  name: 'RAG Agent',
445
- model: 'openai/gpt-5.1',
445
+ model: 'openai/gpt-5.4',
446
446
  instructions: `
447
447
  Process queries using the provided context. Structure responses to be concise and relevant.
448
448
  ${S3VECTORS_PROMPT}
@@ -472,7 +472,7 @@ const initialResults = await pgVector.query({
472
472
  })
473
473
 
474
474
  // Create a relevance scorer
475
- const relevanceProvider = new MastraAgentRelevanceScorer('relevance-scorer', 'openai/gpt-5.1')
475
+ const relevanceProvider = new MastraAgentRelevanceScorer('relevance-scorer', 'openai/gpt-5.4')
476
476
 
477
477
  // Re-rank the results
478
478
  const rerankedResults = await rerank({
@@ -39,25 +39,25 @@ const chunksWithMetadata = await doc.chunk({
39
39
 
40
40
  The following parameters are available for all chunking strategies. **Important:** Each strategy will only utilize a subset of these parameters relevant to its specific use case.
41
41
 
42
- **strategy?:** (`'recursive' | 'character' | 'token' | 'markdown' | 'semantic-markdown' | 'html' | 'json' | 'latex' | 'sentence'`): The chunking strategy to use. If not specified, defaults based on document type. Depending on the chunking strategy, there are additional optionals. Defaults: .md files → 'markdown', .html/.htm → 'html', .json → 'json', .tex → 'latex', others → 'recursive'
42
+ **strategy** (`'recursive' | 'character' | 'token' | 'markdown' | 'semantic-markdown' | 'html' | 'json' | 'latex' | 'sentence'`): The chunking strategy to use. If not specified, defaults based on document type. Depending on the chunking strategy, there are additional optionals. Defaults: .md files → 'markdown', .html/.htm → 'html', .json → 'json', .tex → 'latex', others → 'recursive'
43
43
 
44
- **maxSize?:** (`number`): Maximum size of each chunk. \*\*Note:\*\* Some strategy configurations (markdown with headers, HTML with headers) ignore this parameter. (Default: `4000`)
44
+ **maxSize** (`number`): Maximum size of each chunk. \*\*Note:\*\* Some strategy configurations (markdown with headers, HTML with headers) ignore this parameter. (Default: `4000`)
45
45
 
46
- **overlap?:** (`number`): Number of characters/tokens that overlap between chunks. (Default: `50`)
46
+ **overlap** (`number`): Number of characters/tokens that overlap between chunks. (Default: `50`)
47
47
 
48
- **lengthFunction?:** (`(text: string) => number`): Function to calculate text length. Defaults to character count.
48
+ **lengthFunction** (`(text: string) => number`): Function to calculate text length. Defaults to character count.
49
49
 
50
- **separatorPosition?:** (`'start' | 'end'`): Where to position the separator in chunks. 'start' attaches to beginning of next chunk, 'end' attaches to end of current chunk. If not specified, separators are discarded.
50
+ **separatorPosition** (`'start' | 'end'`): Where to position the separator in chunks. 'start' attaches to beginning of next chunk, 'end' attaches to end of current chunk. If not specified, separators are discarded.
51
51
 
52
- **addStartIndex?:** (`boolean`): Whether to add start index metadata to chunks. (Default: `false`)
52
+ **addStartIndex** (`boolean`): Whether to add start index metadata to chunks. (Default: `false`)
53
53
 
54
- **stripWhitespace?:** (`boolean`): Whether to strip whitespace from chunks. (Default: `true`)
54
+ **stripWhitespace** (`boolean`): Whether to strip whitespace from chunks. (Default: `true`)
55
55
 
56
- **extract?:** (`ExtractParams`): Metadata extraction configuration.
56
+ **extract** (`ExtractParams`): Metadata extraction configuration.
57
57
 
58
58
  See [ExtractParams reference](https://mastra.ai/reference/rag/extract-params) for details on the `extract` parameter.
59
59
 
60
- ## Strategy-Specific Options
60
+ ## Strategy-specific options
61
61
 
62
62
  Strategy-specific options are passed as top-level parameters alongside the strategy parameter. For example:
63
63
 
@@ -126,89 +126,89 @@ The options documented below are passed directly at the top level of the configu
126
126
 
127
127
  ### Character
128
128
 
129
- **separators?:** (`string[]`): Array of separators to try in order of preference. The strategy will attempt to split on the first separator, then fall back to subsequent ones.
129
+ **separators** (`string[]`): Array of separators to try in order of preference. The strategy will attempt to split on the first separator, then fall back to subsequent ones.
130
130
 
131
- **isSeparatorRegex?:** (`boolean`): Whether the separator is a regex pattern (Default: `false`)
131
+ **isSeparatorRegex** (`boolean`): Whether the separator is a regex pattern (Default: `false`)
132
132
 
133
133
  ### Recursive
134
134
 
135
- **separators?:** (`string[]`): Array of separators to try in order of preference. The strategy will attempt to split on the first separator, then fall back to subsequent ones.
135
+ **separators** (`string[]`): Array of separators to try in order of preference. The strategy will attempt to split on the first separator, then fall back to subsequent ones.
136
136
 
137
- **isSeparatorRegex?:** (`boolean`): Whether the separators are regex patterns (Default: `false`)
137
+ **isSeparatorRegex** (`boolean`): Whether the separators are regex patterns (Default: `false`)
138
138
 
139
- **language?:** (`Language`): Programming or markup language for language-specific splitting behavior. See Language enum for supported values.
139
+ **language** (`Language`): Programming or markup language for language-specific splitting behavior. See Language enum for supported values.
140
140
 
141
141
  ### Sentence
142
142
 
143
- **maxSize:** (`number`): Maximum size of each chunk (required for sentence strategy)
143
+ **maxSize** (`number`): Maximum size of each chunk (required for sentence strategy)
144
144
 
145
- **minSize?:** (`number`): Minimum size of each chunk. Chunks smaller than this will be merged with adjacent chunks when possible. (Default: `50`)
145
+ **minSize** (`number`): Minimum size of each chunk. Chunks smaller than this will be merged with adjacent chunks when possible. (Default: `50`)
146
146
 
147
- **targetSize?:** (`number`): Preferred target size for chunks. Defaults to 80% of maxSize. The strategy will try to create chunks close to this size.
147
+ **targetSize** (`number`): Preferred target size for chunks. Defaults to 80% of maxSize. The strategy will try to create chunks close to this size.
148
148
 
149
- **sentenceEnders?:** (`string[]`): Array of characters that mark sentence endings for splitting boundaries. (Default: `['.', '!', '?']`)
149
+ **sentenceEnders** (`string[]`): Array of characters that mark sentence endings for splitting boundaries. (Default: `['.', '!', '?']`)
150
150
 
151
- **fallbackToWords?:** (`boolean`): Whether to fall back to word-level splitting for sentences that exceed maxSize. (Default: `true`)
151
+ **fallbackToWords** (`boolean`): Whether to fall back to word-level splitting for sentences that exceed maxSize. (Default: `true`)
152
152
 
153
- **fallbackToCharacters?:** (`boolean`): Whether to fall back to character-level splitting for words that exceed maxSize. Only applies if fallbackToWords is enabled. (Default: `true`)
153
+ **fallbackToCharacters** (`boolean`): Whether to fall back to character-level splitting for words that exceed maxSize. Only applies if fallbackToWords is enabled. (Default: `true`)
154
154
 
155
155
  ### HTML
156
156
 
157
- **headers:** (`Array<[string, string]>`): Array of \[selector, metadata key] pairs for header-based splitting
157
+ **headers** (`Array<[string, string]>`): Array of \[selector, metadata key] pairs for header-based splitting
158
158
 
159
- **sections:** (`Array<[string, string]>`): Array of \[selector, metadata key] pairs for section-based splitting
159
+ **sections** (`Array<[string, string]>`): Array of \[selector, metadata key] pairs for section-based splitting
160
160
 
161
- **returnEachLine?:** (`boolean`): Whether to return each line as a separate chunk
161
+ **returnEachLine** (`boolean`): Whether to return each line as a separate chunk
162
162
 
163
163
  **Important:** When using the HTML strategy, all general options are ignored. Use `headers` for header-based splitting or `sections` for section-based splitting. If used together, `sections` will be ignored.
164
164
 
165
165
  ### Markdown
166
166
 
167
- **headers?:** (`Array<[string, string]>`): Array of \[header level, metadata key] pairs
167
+ **headers** (`Array<[string, string]>`): Array of \[header level, metadata key] pairs
168
168
 
169
- **stripHeaders?:** (`boolean`): Whether to remove headers from the output
169
+ **stripHeaders** (`boolean`): Whether to remove headers from the output
170
170
 
171
- **returnEachLine?:** (`boolean`): Whether to return each line as a separate chunk
171
+ **returnEachLine** (`boolean`): Whether to return each line as a separate chunk
172
172
 
173
173
  **Important:** When using the `headers` option, the markdown strategy ignores all general options and content is split based on the markdown header structure. To use size-based chunking with markdown, omit the `headers` parameter.
174
174
 
175
175
  ### Semantic Markdown
176
176
 
177
- **joinThreshold?:** (`number`): Maximum token count for merging related sections. Sections exceeding this limit individually are left intact, but smaller sections are merged with siblings or parents if the combined size stays under this threshold. (Default: `500`)
177
+ **joinThreshold** (`number`): Maximum token count for merging related sections. Sections exceeding this limit individually are left intact, but smaller sections are merged with siblings or parents if the combined size stays under this threshold. (Default: `500`)
178
178
 
179
- **modelName?:** (`string`): Name of the model for tokenization. If provided, the model's underlying tokenization \`encodingName\` will be used.
179
+ **modelName** (`string`): Name of the model for tokenization. If provided, the model's underlying tokenization \`encodingName\` will be used.
180
180
 
181
- **encodingName?:** (`string`): Name of the token encoding to use. Derived from \`modelName\` if available. (Default: `cl100k_base`)
181
+ **encodingName** (`string`): Name of the token encoding to use. Derived from \`modelName\` if available. (Default: `cl100k_base`)
182
182
 
183
- **allowedSpecial?:** (`Set<string> | 'all'`): Set of special tokens allowed during tokenization, or 'all' to allow all special tokens
183
+ **allowedSpecial** (`Set<string> | 'all'`): Set of special tokens allowed during tokenization, or 'all' to allow all special tokens
184
184
 
185
- **disallowedSpecial?:** (`Set<string> | 'all'`): Set of special tokens to disallow during tokenization, or 'all' to disallow all special tokens (Default: `all`)
185
+ **disallowedSpecial** (`Set<string> | 'all'`): Set of special tokens to disallow during tokenization, or 'all' to disallow all special tokens (Default: `all`)
186
186
 
187
187
  ### Token
188
188
 
189
- **encodingName?:** (`string`): Name of the token encoding to use
189
+ **encodingName** (`string`): Name of the token encoding to use
190
190
 
191
- **modelName?:** (`string`): Name of the model for tokenization
191
+ **modelName** (`string`): Name of the model for tokenization
192
192
 
193
- **allowedSpecial?:** (`Set<string> | 'all'`): Set of special tokens allowed during tokenization, or 'all' to allow all special tokens
193
+ **allowedSpecial** (`Set<string> | 'all'`): Set of special tokens allowed during tokenization, or 'all' to allow all special tokens
194
194
 
195
- **disallowedSpecial?:** (`Set<string> | 'all'`): Set of special tokens to disallow during tokenization, or 'all' to disallow all special tokens
195
+ **disallowedSpecial** (`Set<string> | 'all'`): Set of special tokens to disallow during tokenization, or 'all' to disallow all special tokens
196
196
 
197
197
  ### JSON
198
198
 
199
- **maxSize:** (`number`): Maximum size of each chunk
199
+ **maxSize** (`number`): Maximum size of each chunk
200
200
 
201
- **minSize?:** (`number`): Minimum size of each chunk
201
+ **minSize** (`number`): Minimum size of each chunk
202
202
 
203
- **ensureAscii?:** (`boolean`): Whether to ensure ASCII encoding
203
+ **ensureAscii** (`boolean`): Whether to ensure ASCII encoding
204
204
 
205
- **convertLists?:** (`boolean`): Whether to convert lists in the JSON
205
+ **convertLists** (`boolean`): Whether to convert lists in the JSON
206
206
 
207
207
  ### Latex
208
208
 
209
209
  The Latex strategy uses only the general chunking options listed above. It provides LaTeX-aware splitting optimized for mathematical and academic documents.
210
210
 
211
- ## Return Value
211
+ ## Return value
212
212
 
213
213
  Returns a `MDocument` instance containing the chunked documents. Each chunk includes:
214
214
 
@@ -2,7 +2,7 @@
2
2
 
3
3
  The `DatabaseConfig` type allows you to specify database-specific configurations when using vector query tools. These configurations enable you to leverage unique features and optimizations offered by different vector stores.
4
4
 
5
- ## Type Definition
5
+ ## Type definition
6
6
 
7
7
  ```typescript
8
8
  export type DatabaseConfig = {
@@ -13,15 +13,19 @@ export type DatabaseConfig = {
13
13
  }
14
14
  ```
15
15
 
16
- ## Database-Specific Types
16
+ ## Database-specific types
17
17
 
18
- ### PineconeConfig
18
+ ### `PineconeConfig`
19
19
 
20
20
  Configuration options specific to Pinecone vector store.
21
21
 
22
- **namespace?:** (`string`): Pinecone namespace for organizing and isolating vectors within the same index. Useful for multi-tenancy or environment separation.
22
+ **namespace** (`string`): Pinecone namespace for organizing and isolating vectors within the same index. Useful for multi-tenancy or environment separation.
23
23
 
24
- **sparseVector?:** (`{ indices: number[]; values: number[]; }`): objectindices:number\[]Array of indices for sparse vector componentsvalues:number\[]Array of values corresponding to the indices
24
+ **sparseVector** (`{ indices: number[]; values: number[]; }`): Sparse vector for hybrid search combining dense and sparse embeddings. Enables better search quality for keyword-based queries. The indices and values arrays must be the same length.
25
+
26
+ **sparseVector.indices** (`number[]`): Array of indices for sparse vector components
27
+
28
+ **sparseVector.values** (`number[]`): Array of values corresponding to the indices
25
29
 
26
30
  **Use Cases:**
27
31
 
@@ -29,15 +33,15 @@ Configuration options specific to Pinecone vector store.
29
33
  - Environment isolation (dev/staging/prod namespaces)
30
34
  - Hybrid search combining semantic and keyword matching
31
35
 
32
- ### PgVectorConfig
36
+ ### `PgVectorConfig`
33
37
 
34
38
  Configuration options specific to PostgreSQL with pgvector extension.
35
39
 
36
- **minScore?:** (`number`): Minimum similarity score threshold for results. Only vectors with similarity scores above this value will be returned.
40
+ **minScore** (`number`): Minimum similarity score threshold for results. Only vectors with similarity scores above this value will be returned.
37
41
 
38
- **ef?:** (`number`): HNSW search parameter that controls the size of the dynamic candidate list during search. Higher values improve accuracy at the cost of speed. Typically set between topK and 200.
42
+ **ef** (`number`): HNSW search parameter that controls the size of the dynamic candidate list during search. Higher values improve accuracy at the cost of speed. Typically set between topK and 200.
39
43
 
40
- **probes?:** (`number`): IVFFlat probe parameter that specifies the number of index cells to visit during search. Higher values improve recall at the cost of speed.
44
+ **probes** (`number`): IVFFlat probe parameter that specifies the number of index cells to visit during search. Higher values improve recall at the cost of speed.
41
45
 
42
46
  **Performance Guidelines:**
43
47
 
@@ -51,13 +55,13 @@ Configuration options specific to PostgreSQL with pgvector extension.
51
55
  - Quality filtering to remove irrelevant results
52
56
  - Fine-tuning search accuracy vs speed tradeoffs
53
57
 
54
- ### ChromaConfig
58
+ ### `ChromaConfig`
55
59
 
56
60
  Configuration options specific to Chroma vector store.
57
61
 
58
- **where?:** (`Record<string, any>`): Metadata filtering conditions using MongoDB-style query syntax. Filters results based on metadata fields.
62
+ **where** (`Record<string, any>`): Metadata filtering conditions using MongoDB-style query syntax. Filters results based on metadata fields.
59
63
 
60
- **whereDocument?:** (`Record<string, any>`): Document content filtering conditions. Allows filtering based on the actual document text content.
64
+ **whereDocument** (`Record<string, any>`): Document content filtering conditions. Allows filtering based on the actual document text content.
61
65
 
62
66
  **Filter Syntax Examples:**
63
67
 
@@ -84,7 +88,7 @@ whereDocument: { "$contains": "API documentation" }
84
88
  - Content-based document filtering
85
89
  - Complex query combinations
86
90
 
87
- ## Usage Examples
91
+ ## Usage examples
88
92
 
89
93
  **Basic Usage**:
90
94
 
@@ -229,7 +233,7 @@ const vectorTool = createVectorQueryTool({
229
233
  })
230
234
  ```
231
235
 
232
- ## Best Practices
236
+ ## Best practices
233
237
 
234
238
  1. **Environment Configuration**: Use different namespaces or configurations for different environments
235
239
  2. **Performance Tuning**: Start with default values and adjust based on your specific needs
@@ -237,7 +241,7 @@ const vectorTool = createVectorQueryTool({
237
241
  4. **Runtime Flexibility**: Override configurations at runtime for dynamic scenarios
238
242
  5. **Documentation**: Document your specific configuration choices for team members
239
243
 
240
- ## Migration Guide
244
+ ## Migration guide
241
245
 
242
246
  Existing vector query tools continue to work without changes. To add database configurations:
243
247
 
@@ -4,13 +4,13 @@ The MDocument class processes documents for RAG applications. The main methods a
4
4
 
5
5
  ## Constructor
6
6
 
7
- **docs:** (`Array<{ text: string, metadata?: Record<string, any> }>`): Array of document chunks with their text content and optional metadata
7
+ **docs** (`Array<{ text: string, metadata?: Record<string, any> }>`): Array of document chunks with their text content and optional metadata
8
8
 
9
- **type:** (`'text' | 'html' | 'markdown' | 'json' | 'latex'`): Type of document content
9
+ **type** (`'text' | 'html' | 'markdown' | 'json' | 'latex'`): Type of document content
10
10
 
11
- ## Static Methods
11
+ ## Static methods
12
12
 
13
- ### fromText()
13
+ ### `fromText()`
14
14
 
15
15
  Creates a document from plain text content.
16
16
 
@@ -18,7 +18,7 @@ Creates a document from plain text content.
18
18
  static fromText(text: string, metadata?: Record<string, any>): MDocument
19
19
  ```
20
20
 
21
- ### fromHTML()
21
+ ### `fromHTML()`
22
22
 
23
23
  Creates a document from HTML content.
24
24
 
@@ -26,7 +26,7 @@ Creates a document from HTML content.
26
26
  static fromHTML(html: string, metadata?: Record<string, any>): MDocument
27
27
  ```
28
28
 
29
- ### fromMarkdown()
29
+ ### `fromMarkdown()`
30
30
 
31
31
  Creates a document from Markdown content.
32
32
 
@@ -34,7 +34,7 @@ Creates a document from Markdown content.
34
34
  static fromMarkdown(markdown: string, metadata?: Record<string, any>): MDocument
35
35
  ```
36
36
 
37
- ### fromJSON()
37
+ ### `fromJSON()`
38
38
 
39
39
  Creates a document from JSON content.
40
40
 
@@ -42,9 +42,9 @@ Creates a document from JSON content.
42
42
  static fromJSON(json: string, metadata?: Record<string, any>): MDocument
43
43
  ```
44
44
 
45
- ## Instance Methods
45
+ ## Instance methods
46
46
 
47
- ### chunk()
47
+ ### `chunk()`
48
48
 
49
49
  Splits document into chunks and optionally extracts metadata.
50
50
 
@@ -54,7 +54,7 @@ async chunk(params?: ChunkParams): Promise<Chunk[]>
54
54
 
55
55
  See [chunk() reference](https://mastra.ai/reference/rag/chunk) for detailed options.
56
56
 
57
- ### getDocs()
57
+ ### `getDocs()`
58
58
 
59
59
  Returns array of processed document chunks.
60
60
 
@@ -62,7 +62,7 @@ Returns array of processed document chunks.
62
62
  getDocs(): Chunk[]
63
63
  ```
64
64
 
65
- ### getText()
65
+ ### `getText()`
66
66
 
67
67
  Returns array of text strings from chunks.
68
68
 
@@ -70,7 +70,7 @@ Returns array of text strings from chunks.
70
70
  getText(): string[]
71
71
  ```
72
72
 
73
- ### getMetadata()
73
+ ### `getMetadata()`
74
74
 
75
75
  Returns array of metadata objects from chunks.
76
76
 
@@ -78,7 +78,7 @@ Returns array of metadata objects from chunks.
78
78
  getMetadata(): Record<string, any>[]
79
79
  ```
80
80
 
81
- ### extractMetadata()
81
+ ### `extractMetadata()`
82
82
 
83
83
  Extracts metadata using specified extractors. See [ExtractParams reference](https://mastra.ai/reference/rag/extract-params) for details.
84
84