npm - @mastra/mcp-docs-server - Versions diffs - 1.1.17-alpha.7 → 1.1.17 - Mend

@mastra/mcp-docs-server 1.1.17-alpha.7 → 1.1.17

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

package/.docs/docs/evals/built-in-scorers.md +1 -0
package/.docs/docs/memory/observational-memory.md +49 -4
package/.docs/docs/server/mastra-client.md +17 -0
package/.docs/docs/server/server-adapters.md +15 -1
package/.docs/models/gateways/openrouter.md +1 -1
package/.docs/models/index.md +1 -1
package/.docs/models/providers/bailing.md +1 -1
package/.docs/models/providers/cloudflare-workers-ai.md +4 -3
package/.docs/models/providers/firmware.md +2 -2
package/.docs/models/providers/friendli.md +1 -1
package/.docs/models/providers/github-models.md +1 -1
package/.docs/models/providers/google.md +7 -2
package/.docs/models/providers/groq.md +24 -16
package/.docs/models/providers/huggingface.md +1 -1
package/.docs/models/providers/llmgateway.md +269 -0
package/.docs/models/providers/mistral.md +3 -2
package/.docs/models/providers/nano-gpt.md +3 -1
package/.docs/models/providers/openai.md +2 -1
package/.docs/models/providers/poe.md +3 -1
package/.docs/models/providers/zai-coding-plan.md +3 -2
package/.docs/models/providers/zhipuai-coding-plan.md +3 -2
package/.docs/models/providers.md +1 -0
package/.docs/reference/ai-sdk/handle-chat-stream.md +2 -0
package/.docs/reference/client-js/agents.md +11 -6
package/.docs/reference/client-js/mastra-client.md +1 -1
package/.docs/reference/client-js/memory.md +1 -1
package/.docs/reference/configuration.md +24 -0
package/.docs/reference/core/mastra-model-gateway.md +2 -0
package/.docs/reference/deployer/cloudflare.md +31 -1
package/.docs/reference/evals/run-evals.md +78 -3
package/.docs/reference/evals/scorer-utils.md +188 -0
package/.docs/reference/evals/trajectory-accuracy.md +627 -0
package/.docs/reference/index.md +1 -2
package/.docs/reference/logging/pino-logger.md +58 -0
package/.docs/reference/memory/observational-memory.md +32 -6
package/CHANGELOG.md +44 -0
package/package.json +6 -6
package/.docs/reference/core/getStoredAgentById.md +0 -87
package/.docs/reference/core/listStoredAgents.md +0 -91

package/.docs/docs/evals/built-in-scorers.md CHANGED

@@ -18,6 +18,7 @@ These scorers evaluate how correct, truthful, and complete your agent's answers
 - [`content-similarity`](https://mastra.ai/reference/evals/content-similarity): Measures textual similarity using character-level matching (`0-1`, higher is better)
 - [`textual-difference`](https://mastra.ai/reference/evals/textual-difference): Measures textual differences between strings (`0-1`, higher means more similar)
 - [`tool-call-accuracy`](https://mastra.ai/reference/evals/tool-call-accuracy): Evaluates whether the LLM selects the correct tool from available options (`0-1`, higher is better)
+- [`trajectory-accuracy`](https://mastra.ai/reference/evals/trajectory-accuracy): Evaluates whether an agent follows the expected sequence of actions (tool calls, model generations, workflow steps, and other span types) (`0-1`, higher is better)
 - [`prompt-alignment`](https://mastra.ai/reference/evals/prompt-alignment): Measures how well agent responses align with user prompt intent, requirements, completeness, and format (`0-1`, higher is better)
 ### Context quality

package/.docs/docs/memory/observational-memory.md CHANGED

@@ -95,27 +95,72 @@ The result is a three-tier system:
 Normal OM compresses messages into observations, which is great for staying on task — but the original wording is gone. Retrieval mode fixes this by keeping each observation group linked to the raw messages that produced it. When the agent needs exact wording, tool output, or chronology that the summary compressed away, it can call a `recall` tool to page through the source messages.
+#### Browsing only
+Set `retrieval: true` to enable the recall tool for browsing raw messages. No vector store needed. By default, the recall tool can browse across all threads for the current resource.
 ```typescript
 const memory = new Memory({
   options: {
     observationalMemory: {
       model: 'google/gemini-2.5-flash',
-      scope: 'thread',
       retrieval: true,
     },
   },
 })
 ```
+#### With semantic search
+Set `retrieval: { vector: true }` to also enable semantic search. This reuses the vector store and embedder already configured on your Memory instance:
+```typescript
+const memory = new Memory({
+  storage,
+  vector: myVectorStore,
+  embedder: myEmbedder,
+  options: {
+    observationalMemory: {
+      model: 'google/gemini-2.5-flash',
+      retrieval: { vector: true },
+    },
+  },
+})
+```
+When vector search is configured, new observation groups are automatically indexed at buffer time and during synchronous observation (fire-and-forget, non-blocking). Semantic search returns observation-group matches with their raw source message ID ranges, so the recall tool can show the summarized memory alongside where it came from.
+#### Restricting to the current thread
+By default, the recall tool scope is `'resource'` — the agent can list threads, browse other threads, and search across all conversations. Set `scope: 'thread'` to restrict the agent to only the current thread:
+```typescript
+const memory = new Memory({
+  options: {
+    observationalMemory: {
+      model: 'google/gemini-2.5-flash',
+      retrieval: { vector: true, scope: 'thread' },
+    },
+  },
+})
+```
+#### What retrieval enables
 With retrieval mode enabled, OM:
 - Stores a `range` (e.g. `startId:endId`) on each observation group pointing to the messages it was derived from
 - Keeps range metadata visible in the agent's context so the agent knows which observations map to which messages
-- Registers a `recall` tool the agent can call to page through the raw messages behind any range
-Retrieval mode is only active for thread-scoped OM. Setting `retrieval: true` with `scope: 'resource'` has no effect — OM keeps resource-scoped behavior but skips retrieval-mode context and does not register the `recall` tool.
+- Registers a `recall` tool the agent can call to:
+  - Page through the raw messages behind any observation group range
+  - Search by semantic similarity (`mode: "search"` with a `query` string) — requires `vector: true`
+  - List all threads (`mode: "threads"`), browse other threads (`threadId`), and search across all threads (default `scope: 'resource'`)
+  - When `scope: 'thread'`: restrict browsing and search to the current thread only
-See the [recall tool reference](https://mastra.ai/reference/memory/observational-memory) for the full API (detail levels, part indexing, pagination, and token limiting).
+See the [recall tool reference](https://mastra.ai/reference/memory/observational-memory) for the full API (detail levels, part indexing, pagination, cross-thread browsing, and token limiting).
 ## Models

package/.docs/docs/server/mastra-client.md CHANGED

@@ -133,6 +133,23 @@ export const mastraClient = new MastraClient({
 > **Info:** Visit [MastraClient](https://mastra.ai/reference/client-js/mastra-client) for more configuration options.
+## Credentials and session cookies
+**Authenticate Mastra API calls with session cookies** when your UI and Mastra API are not on the same origin—different host, subdomain, or port (for example Mastra Studio on one port and a custom server on another). Add **`credentials: 'include'`** to `MastraClient` so each request carries the cookies the user already has after sign-in. Skip this and you will often get **`401`** responses from Mastra even though login succeeded in the browser.
+```typescript
+import { MastraClient } from '@mastra/client-js'
+export const mastraClient = new MastraClient({
+  baseUrl: process.env.MASTRA_API_URL || 'http://localhost:4111',
+  credentials: 'include',
+})
+```
+**Allow credentialed cross-origin requests on your server**—see [CORS: requests with credentials](https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CORS#requests_with_credentials). You need a concrete `Access-Control-Allow-Origin` (not `*`) and `Access-Control-Allow-Credentials: true`, or the browser will block the call before it reaches Mastra.
+**Using `@mastra/react`?** Wrap your app with `MastraReactProvider`, set `baseUrl` and `apiPrefix` to match your server, and rely on the default `credentials: 'include'`. Change `credentials` only when you deliberately want `same-origin` or `omit` behavior.
 ## Adding request cancelling
 `MastraClient` supports request cancellation using the standard Node.js `AbortSignal` API. Useful for canceling in-flight requests, such as when users abort an operation or to clean up stale network calls.
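The server-side CORS requirement in the added "Credentials and session cookies" section (a concrete `Access-Control-Allow-Origin` plus `Access-Control-Allow-Credentials: true`) can be sketched as a small header-building function. This is a minimal illustration, not Mastra's server implementation: `credentialedCorsHeaders`, `allowedOrigins`, and the `http://localhost:3000` UI origin are hypothetical names and values chosen for the example.

```typescript
// Hypothetical helper: names and the allow-list are illustrative, not part of Mastra.
type HeaderMap = Record<string, string>;

// Assumed UI origin; replace with the origin(s) your frontend is served from.
const allowedOrigins = new Set<string>(["http://localhost:3000"]);

function credentialedCorsHeaders(requestOrigin: string | undefined): HeaderMap {
  // Unknown or missing origin: emit no CORS headers, so the browser blocks the call.
  if (!requestOrigin || !allowedOrigins.has(requestOrigin)) {
    return {};
  }
  return {
    // Echo the concrete origin; browsers reject "*" for credentialed requests.
    "Access-Control-Allow-Origin": requestOrigin,
    "Access-Control-Allow-Credentials": "true",
    // The response differs per origin, so caches must key on the Origin header.
    "Vary": "Origin",
  };
}
```

A server would call this with the incoming `Origin` request header and merge the result into the response headers, including on preflight (`OPTIONS`) responses.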

package/.docs/docs/server/server-adapters.md CHANGED

@@ -521,7 +521,21 @@ The adapter registers routes for both HTTP and SSE (Server-Sent Events) transpor
 ### Serverless mode
-For serverless environments like Cloudflare Workers or Vercel Edge, enable stateless mode via `mcpOptions`:
+For serverless environments like Cloudflare Workers or Vercel Edge, enable stateless mode via `mcpOptions`.
+When using the Mastra deployer (the standard `mastra dev` / `mastra build` path), set `mcpOptions` in your server config:
+```typescript
+const mastra = new Mastra({
+  server: {
+    mcpOptions: {
+      serverless: true,
+    },
+  },
+})
+```
+When manually creating a server adapter, pass `mcpOptions` directly:
 ```typescript
 const server = new MastraServer({

package/.docs/models/gateways/openrouter.md CHANGED

@@ -117,7 +117,7 @@ ANTHROPIC_API_KEY=ant-...
 | `nousresearch/hermes-4-70b`                                     |
 | `nvidia/nemotron-3-nano-30b-a3b:free`                           |
 | `nvidia/nemotron-3-super-120b-a12b`                             |
-| `nvidia/nemotron-3-super-120b-a12b-free`                        |
+| `nvidia/nemotron-3-super-120b-a12b:free`                        |
 | `nvidia/nemotron-nano-12b-v2-vl:free`                           |
 | `nvidia/nemotron-nano-9b-v2`                                    |
 | `nvidia/nemotron-nano-9b-v2:free`                               |

package/.docs/models/index.md CHANGED

@@ -1,6 +1,6 @@
 # Model Providers
-Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 3396 models from 94 providers through a single API.
+Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 3616 models from 95 providers through a single API.
 ## Features

package/.docs/models/providers/bailing.md CHANGED

@@ -5,7 +5,7 @@ Access 2 Bailing models through Mastra's model router. Authentication is handled
 Learn more in the [Bailing documentation](https://alipaytbox.yuque.com/sxs0ba/ling/intro).
 ```bash
-BAILING_API_TOKEN=your-api-key
+BAILING_API_TOKEN=your-api-token
 ```
 ```typescript

package/.docs/models/providers/cloudflare-workers-ai.md CHANGED

@@ -1,11 +1,12 @@
 # ![Cloudflare Workers AI logo](https://models.dev/logos/cloudflare-workers-ai.svg)Cloudflare Workers AI
-Access 42 Cloudflare Workers AI models through Mastra's model router. Authentication is handled automatically using the `CLOUDFLARE_ACCOUNT_ID` environment variable.
+Access 42 Cloudflare Workers AI models through Mastra's model router. Authentication is handled automatically using the `CLOUDFLARE_API_KEY` environment variable. Configure `CLOUDFLARE_ACCOUNT_ID` as well.
 Learn more in the [Cloudflare Workers AI documentation](https://developers.cloudflare.com/workers-ai/models/).
 ```bash
-CLOUDFLARE_ACCOUNT_ID=your-api-key
+CLOUDFLARE_ACCOUNT_ID=your-account-id
+CLOUDFLARE_API_KEY=your-api-key
 ```
 ```typescript
@@ -88,7 +89,7 @@ const agent = new Agent({
   model: {
     url: "https://api.cloudflare.com/client/v4/accounts/${CLOUDFLARE_ACCOUNT_ID}/ai/v1",
     id: "cloudflare-workers-ai/@cf/ai4bharat/indictrans2-en-indic-1B",
-    apiKey: process.env.CLOUDFLARE_ACCOUNT_ID,
+    apiKey: process.env.CLOUDFLARE_API_KEY,
     headers: {
       "X-Custom-Header": "value"
     }
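The `url` in the hunk above is a double-quoted string, so in TypeScript the `${CLOUDFLARE_ACCOUNT_ID}` placeholder stays literal unless something downstream substitutes it, and this diff does not say whether Mastra's router does. Assuming the account ID has to be filled in by the caller, a minimal sketch using a template literal could look like this. `workersAiBaseUrl` is a hypothetical helper name, not a Mastra API.

```typescript
// Hypothetical helper; the function name is illustrative and not part of Mastra.
function workersAiBaseUrl(accountId: string): string {
  // Fail early instead of sending requests to a URL with an empty account segment.
  if (!accountId) {
    throw new Error("CLOUDFLARE_ACCOUNT_ID is not set");
  }
  // Backticks make this a template literal, so ${accountId} is substituted.
  return `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/v1`;
}
```

In an agent config this would be used as `url: workersAiBaseUrl(process.env.CLOUDFLARE_ACCOUNT_ID ?? "")`, alongside `apiKey: process.env.CLOUDFLARE_API_KEY` as shown in the updated docs.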

package/.docs/models/providers/firmware.md CHANGED

@@ -45,7 +45,6 @@ for await (const chunk of stream) {
 | `firmware/gemini-3-1-pro-preview`      | 1.0M    |       |           |       |       |       | $2         | $12         |
 | `firmware/gemini-3-flash-preview`      | 1.0M    |       |           |       |       |       | $0.50      | $3          |
 | `firmware/gemini-3-pro-preview`        | 1.0M    |       |           |       |       |       | $2         | $12         |
-| `firmware/glm-5`                       | 198K    |       |           |       |       |       | $1         | $3          |
 | `firmware/gpt-4o`                      | 128K    |       |           |       |       |       | $3         | $10         |
 | `firmware/gpt-5-3-codex`               | 400K    |       |           |       |       |       | $2         | $14         |
 | `firmware/gpt-5-4`                     | 272K    |       |           |       |       |       | $3         | $15         |
@@ -58,6 +57,7 @@ for await (const chunk of stream) {
 | `firmware/grok-code-fast-1`            | 256K    |       |           |       |       |       | $0.20      | $2          |
 | `firmware/kimi-k2.5`                   | 256K    |       |           |       |       |       | $0.60      | $3          |
 | `firmware/minimax-m2-5`                | 192K    |       |           |       |       |       | $0.30      | $1          |
+| `firmware/zai-glm-5`                   | 198K    |       |           |       |       |       | $1         | $3          |
 ## Advanced configuration
@@ -87,7 +87,7 @@ const agent = new Agent({
   model: ({ requestContext }) => {
     const useAdvanced = requestContext.task === "complex";
     return useAdvanced
-      ? "firmware/minimax-m2-5"
+      ? "firmware/zai-glm-5"
       : "firmware/claude-haiku-4-5";
   }
 });

package/.docs/models/providers/friendli.md CHANGED

@@ -5,7 +5,7 @@ Access 7 Friendli models through Mastra's model router. Authentication is handle
 Learn more in the [Friendli documentation](https://friendli.ai/docs/guides/serverless_endpoints/introduction).
 ```bash
-FRIENDLI_TOKEN=your-api-key
+FRIENDLI_TOKEN=your-api-token
 ```
 ```typescript

package/.docs/models/providers/github-models.md CHANGED

@@ -5,7 +5,7 @@ Access 55 GitHub Models models through Mastra's model router. Authentication is
 Learn more in the [GitHub Models documentation](https://docs.github.com/en/github-models).
 ```bash
-GITHUB_TOKEN=your-api-key
+GITHUB_TOKEN=your-api-token
 ```
 ```typescript

package/.docs/models/providers/google.md CHANGED

@@ -1,6 +1,6 @@
 # ![Google logo](https://models.dev/logos/google.svg)Google
-Access 30 Google models through Mastra's model router. Authentication is handled automatically using the `GOOGLE_GENERATIVE_AI_API_KEY` environment variable.
+Access 35 Google models through Mastra's model router. Authentication is handled automatically using the `GOOGLE_GENERATIVE_AI_API_KEY` environment variable.
 Learn more in the [Google documentation](https://ai.google.dev/gemini-api/docs/pricing).
@@ -62,6 +62,11 @@ for await (const chunk of stream) {
 | `google/gemini-flash-lite-latest`                   | 1.0M    |       |           |       |       |       | $0.10      | $0.40       |
 | `google/gemini-live-2.5-flash`                      | 128K    |       |           |       |       |       | $0.50      | $2          |
 | `google/gemini-live-2.5-flash-preview-native-audio` | 131K    |       |           |       |       |       | $0.50      | $2          |
+| `google/gemma-3-12b-it`                             | 33K     |       |           |       |       |       | —          | —           |
+| `google/gemma-3-27b-it`                             | 131K    |       |           |       |       |       | —          | —           |
+| `google/gemma-3-4b-it`                              | 33K     |       |           |       |       |       | —          | —           |
+| `google/gemma-3n-e2b-it`                            | 8K      |       |           |       |       |       | —          | —           |
+| `google/gemma-3n-e4b-it`                            | 8K      |       |           |       |       |       | —          | —           |
 ## Advanced configuration
@@ -90,7 +95,7 @@ const agent = new Agent({
   model: ({ requestContext }) => {
     const useAdvanced = requestContext.task === "complex";
     return useAdvanced
-      ? "google/gemini-live-2.5-flash-preview-native-audio"
+      ? "google/gemma-3n-e4b-it"
       : "google/gemini-1.5-flash";
   }
 });

package/.docs/models/providers/groq.md CHANGED

@@ -1,6 +1,6 @@
 # ![Groq logo](https://models.dev/logos/groq.svg)Groq
-Access 9 Groq models through Mastra's model router. Authentication is handled automatically using the `GROQ_API_KEY` environment variable.
+Access 17 Groq models through Mastra's model router. Authentication is handled automatically using the `GROQ_API_KEY` environment variable.
 Learn more in the [Groq documentation](https://console.groq.com/docs/models).
@@ -15,7 +15,7 @@ const agent = new Agent({
   id: "my-agent",
   name: "My Agent",
   instructions: "You are a helpful assistant",
-  model: "groq/llama-3.1-8b-instant"
+  model: "groq/allam-2-7b"
 });
 // Generate a response
@@ -30,17 +30,25 @@ for await (const chunk of stream) {
 ## Models
-| Model                                                | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
-| ---------------------------------------------------- | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
-| `groq/llama-3.1-8b-instant`                          | 131K    |       |           |       |       |       | $0.05      | $0.08       |
-| `groq/llama-3.3-70b-versatile`                       | 131K    |       |           |       |       |       | $0.59      | $0.79       |
-| `groq/meta-llama/llama-4-maverick-17b-128e-instruct` | 131K    |       |           |       |       |       | $0.20      | $0.60       |
-| `groq/meta-llama/llama-4-scout-17b-16e-instruct`     | 131K    |       |           |       |       |       | $0.11      | $0.34       |
-| `groq/meta-llama/llama-guard-4-12b`                  | 131K    |       |           |       |       |       | $0.20      | $0.20       |
-| `groq/moonshotai/kimi-k2-instruct-0905`              | 262K    |       |           |       |       |       | $1         | $3          |
-| `groq/openai/gpt-oss-120b`                           | 131K    |       |           |       |       |       | $0.15      | $0.60       |
-| `groq/openai/gpt-oss-20b`                            | 131K    |       |           |       |       |       | $0.07      | $0.30       |
-| `groq/qwen/qwen3-32b`                                | 131K    |       |           |       |       |       | $0.29      | $0.59       |
+| Model                                            | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
+| ------------------------------------------------ | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
+| `groq/allam-2-7b`                                | 4K      |       |           |       |       |       | —          | —           |
+| `groq/canopylabs/orpheus-arabic-saudi`           | 4K      |       |           |       |       |       | $40        | —           |
+| `groq/canopylabs/orpheus-v1-english`             | 4K      |       |           |       |       |       | —          | —           |
+| `groq/groq/compound`                             | 131K    |       |           |       |       |       | —          | —           |
+| `groq/groq/compound-mini`                        | 131K    |       |           |       |       |       | —          | —           |
+| `groq/llama-3.1-8b-instant`                      | 131K    |       |           |       |       |       | $0.05      | $0.08       |
+| `groq/llama-3.3-70b-versatile`                   | 131K    |       |           |       |       |       | $0.59      | $0.79       |
+| `groq/meta-llama/llama-4-scout-17b-16e-instruct` | 131K    |       |           |       |       |       | $0.11      | $0.34       |
+| `groq/meta-llama/llama-prompt-guard-2-22m`       | 512     |       |           |       |       |       | $0.03      | $0.03       |
+| `groq/meta-llama/llama-prompt-guard-2-86m`       | 512     |       |           |       |       |       | $0.04      | $0.04       |
+| `groq/moonshotai/kimi-k2-instruct-0905`          | 262K    |       |           |       |       |       | $1         | $3          |
+| `groq/openai/gpt-oss-120b`                       | 131K    |       |           |       |       |       | $0.15      | $0.60       |
+| `groq/openai/gpt-oss-20b`                        | 131K    |       |           |       |       |       | $0.07      | $0.30       |
+| `groq/openai/gpt-oss-safeguard-20b`              | 131K    |       |           |       |       |       | $0.07      | $0.30       |
+| `groq/qwen/qwen3-32b`                            | 131K    |       |           |       |       |       | $0.29      | $0.59       |
+| `groq/whisper-large-v3`                          | 448     |       |           |       |       |       | —          | —           |
+| `groq/whisper-large-v3-turbo`                    | 448     |       |           |       |       |       | —          | —           |
 ## Advanced configuration
@@ -52,7 +60,7 @@ const agent = new Agent({
   name: "custom-agent",
   model: {
     url: "https://api.groq.com/openai/v1",
-    id: "groq/llama-3.1-8b-instant",
+    id: "groq/allam-2-7b",
     apiKey: process.env.GROQ_API_KEY,
     headers: {
       "X-Custom-Header": "value"
@@ -70,8 +78,8 @@ const agent = new Agent({
   model: ({ requestContext }) => {
     const useAdvanced = requestContext.task === "complex";
     return useAdvanced
-      ? "groq/qwen/qwen3-32b"
-      : "groq/llama-3.1-8b-instant";
+      ? "groq/whisper-large-v3-turbo"
+      : "groq/allam-2-7b";
   }
 });
 ```

package/.docs/models/providers/huggingface.md CHANGED

@@ -5,7 +5,7 @@ Access 20 Hugging Face models through Mastra's model router. Authentication is h
 Learn more in the [Hugging Face documentation](https://huggingface.co).
 ```bash
-HF_TOKEN=your-api-key
+HF_TOKEN=your-api-token
 ```
 ```typescript