npm - @mastra/mcp-docs-server - Versions diffs - 1.1.39-alpha.10 → 1.1.39-alpha.11 - Mend

@mastra/mcp-docs-server 1.1.39-alpha.10 → 1.1.39-alpha.11

Files changed (8)

package/.docs/docs/memory/observational-memory.md +52 -17
package/.docs/models/gateways/netlify.md +2 -1
package/.docs/models/index.md +1 -1
package/.docs/models/providers/routing-run.md +27 -40
package/.docs/reference/memory/observational-memory.md +5 -3
package/.docs/reference/tools/mcp-client.md +27 -2
package/CHANGELOG.md +8 -0
package/package.json +4 -4

package/.docs/docs/memory/observational-memory.md CHANGED

@@ -88,7 +88,7 @@ const memory = new Memory({
   options: {
     observationalMemory: {
       model: 'google/gemini-2.5-flash',
-      activateAfterIdle: '5m',
+      activateAfterIdle: 'auto',
       activateOnProviderChange: true,
     },
   },
@@ -144,6 +144,28 @@ OM uses fast local token estimation for this thresholding work. Text is estimate
 The Observer can also see attachments in the history it reviews. OM keeps readable placeholders like `[Image #1: reference-board.png]` or `[File #1: floorplan.pdf]` in the transcript for readability, and forwards the actual attachment parts alongside the text. Image-like `file` parts are upgraded to image inputs for the Observer when possible, while non-image attachments are forwarded as file parts with normalized token counting. This applies to both normal thread observation and batched resource-scope observation.
+If your Observer model is text-only or its API rejects multimodal input, set `observation.observeAttachments` to `false` to drop attachments before they reach the Observer. The readable placeholders (`[Image #1: ...]`, `[File #1: ...]`) are kept in the transcript so the Observer can still reason about what was shared without receiving the binary payload. The same filter applies to tool results that contain image or file parts:
+```typescript
+new Agent({
+  name: 'assistant',
+  instructions: 'You are a helpful assistant.',
+  model: 'openai/gpt-5-mini',
+  memory: new Memory({
+    options: {
+      observationalMemory: {
+        observation: {
+          model: 'deepseek/deepseek-reasoner',
+          observeAttachments: false,
+        },
+      },
+    },
+  }),
+})
+```
+You can also pass an allowlist of mimeType globs (for example `['image/*']`) to forward only the kinds the Observer can handle.
 ```md
 Date: 2026-01-15
@@ -444,35 +466,48 @@ Reflection works similarly — the Reflector runs in the background when observa
 ### Settings
-| Setting                               | Default | What it controls                                                                                                                                                                                                                                                                                                              |
-| ------------------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `observation.bufferTokens`            | `0.2`   | How often to buffer. `0.2` means every 20% of `messageTokens` — with the default 30k threshold, that's roughly every 6k tokens. Can also be an absolute token count (e.g. `5000`).                                                                                                                                            |
-| `observation.bufferActivation`        | `0.8`   | How aggressively to clear the message window on activation. `0.8` means remove enough messages to keep only 20% of `messageTokens` remaining. Lower values keep more message history.                                                                                                                                         |
-| `observation.blockAfter`              | `1.2`   | Safety threshold as a multiplier of `messageTokens`. At `1.2`, synchronous observation is forced at 36k tokens (1.2 × 30k). Only matters if buffering can't keep up.                                                                                                                                                          |
-| `activateAfterIdle`                   | none    | Forces buffered observations to activate after a period of inactivity, even before `observation.messageTokens` is reached. Accepts a numeric millisecond value such as `300_000`, or duration strings like `"5m"` or `"1hr"`. Set this to your prompt cache TTL if you want activation to happen before the next cold prompt. |
-| `activateOnProviderChange`            | `false` | Forces buffered observations to activate when the next step uses a different `provider/model` than the one that produced the latest assistant step. Use this when switching providers or models would invalidate prompt cache reuse.                                                                                          |
-| `reflection.bufferActivation`         | `0.5`   | When to start background reflection. `0.5` means reflection begins when observations reach 50% of the `observationTokens` threshold.                                                                                                                                                                                          |
-| `reflection.activateAfterIdle`        | none    | Opts buffered reflections into idle activation. Reflections don't inherit top-level `activateAfterIdle`.                                                                                                                                                                                                                      |
-| `reflection.activateOnProviderChange` | `false` | Opts buffered reflections into provider-change activation. Reflections don't inherit top-level `activateOnProviderChange`.                                                                                                                                                                                                    |
-| `reflection.blockAfter`               | `1.2`   | Safety threshold for reflection, same logic as observation.                                                                                                                                                                                                                                                                   |
-If you're relying on prompt caching, set `activateAfterIdle` to match your cache TTL. That way, once a thread has been idle long enough for the cache to expire, the next request can activate buffered observations first and send a smaller compressed context window.
+| Setting                               | Default | What it controls                                                                                                                                                                                                                                                              |
+| ------------------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `observation.bufferTokens`            | `0.2`   | How often to buffer. `0.2` means every 20% of `messageTokens` — with the default 30k threshold, that's roughly every 6k tokens. Can also be an absolute token count (e.g. `5000`).                                                                                            |
+| `observation.bufferActivation`        | `0.8`   | How aggressively to clear the message window on activation. `0.8` means remove enough messages to keep only 20% of `messageTokens` remaining. Lower values keep more message history.                                                                                         |
+| `observation.blockAfter`              | `1.2`   | Safety threshold as a multiplier of `messageTokens`. At `1.2`, synchronous observation is forced at 36k tokens (1.2 × 30k). Only matters if buffering can't keep up.                                                                                                          |
+| `activateAfterIdle`                   | none    | Forces buffered observations to activate after a period of inactivity, even before `observation.messageTokens` is reached. Accepts a numeric millisecond value such as `300_000`, duration strings like `"5m"` or `"1hr"`, or `"auto"` for a provider-aware prompt cache TTL. |
+| `activateOnProviderChange`            | `false` | Forces buffered observations to activate when the next step uses a different `provider/model` than the one that produced the latest assistant step. Use this when switching providers or models would invalidate prompt cache reuse.                                          |
+| `reflection.bufferActivation`         | `0.5`   | When to start background reflection. `0.5` means reflection begins when observations reach 50% of the `observationTokens` threshold.                                                                                                                                          |
+| `reflection.activateAfterIdle`        | none    | Opts buffered reflections into idle activation. Reflections don't inherit top-level `activateAfterIdle`.                                                                                                                                                                      |
+| `reflection.activateOnProviderChange` | `false` | Opts buffered reflections into provider-change activation. Reflections don't inherit top-level `activateOnProviderChange`.                                                                                                                                                    |
+| `reflection.blockAfter`               | `1.2`   | Safety threshold for reflection, same logic as observation.                                                                                                                                                                                                                   |
+If you're relying on prompt caching, set `activateAfterIdle` to `"auto"` or to a specific cache TTL. That way, once a thread has been idle long enough for the cache to expire, the next request can activate buffered observations first and send a smaller compressed context window.
+With `"auto"`, Mastra chooses an idle activation TTL from the active model provider:
+| Provider                                                                                | Auto TTL  |
+| --------------------------------------------------------------------------------------- | --------- |
+| Anthropic, OpenRouter, unknown providers, xAI                                           | 5 minutes |
+| DeepSeek                                                                                | 1 hour    |
+| Google Gemini                                                                           | 24 hours  |
+| Groq                                                                                    | 2 hours   |
+| OpenAI with `providerOptions.openai.promptCacheRetention: "24h"`                        | 1 hour    |
+| OpenAI with `providerOptions.openai.promptCacheRetention: "in_memory"`                  | 5 minutes |
+| OpenAI `gpt-4*`, `gpt-5`, `gpt-5-*`, `gpt-5.1*`, `gpt-5.2*`, `gpt-5.3*`, and `gpt-5.4*` | 5 minutes |
+| Other OpenAI models                                                                     | 1 hour    |
 ```typescript
 const memory = new Memory({
   options: {
     observationalMemory: {
       model: 'google/gemini-2.5-flash',
-      activateAfterIdle: '5m',
+      activateAfterIdle: 'auto',
       activateOnProviderChange: true,
     },
   },
 })
 ```
-With a 5-minute prompt cache TTL, this activates buffered observations after 5 minutes of inactivity so the next uncached prompt uses compressed observations instead of a larger raw message window. If you prefer, `300_000` works the same way.
+With `"auto"`, this activates buffered observations based on the active provider's prompt cache behavior so the next uncached prompt uses compressed observations instead of a larger raw message window. If you prefer a fixed 5-minute TTL, use `"5m"` or `300_000`.
-Changing model or providers mid-thread will invalidate the prompt cache. If your agent can switch between providers or models mid-thread, `activateOnProviderChange: true` forces buffered observations to activate before the new provider runs. That avoids sending a large raw window to a provider that can't reuse the previous prompt cache.
+Changing models or providers mid-thread will invalidate the prompt cache. If your agent can switch between providers or models mid-thread, `activateOnProviderChange: true` forces buffered observations to activate before the new provider runs. That avoids sending a large raw window to a provider that can't reuse the previous prompt cache.
 ### Disabling
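
Note on the `"auto"` TTL table added in this file's diff: the table can be read as a simple lookup keyed on the provider prefix of the model id. The sketch below mirrors the table rows only. The function name `autoIdleTtlMs`, the provider id strings other than `google`, `openai`, and `deepseek`, and the precedence between the OpenAI rows are assumptions made for illustration; this is not Mastra's implementation.

```typescript
// Illustrative only: a lookup that mirrors the "Auto TTL" table in the diff.
// Function name, provider id strings, and rule precedence are assumptions.
const MINUTE = 60_000
const HOUR = 60 * MINUTE

// OpenAI model ids the table lists with a 5-minute TTL:
// gpt-4*, gpt-5, gpt-5-*, and gpt-5.1* through gpt-5.4*
const SHORT_TTL_OPENAI = /^gpt-(4.*|5|5-.*|5\.[1-4].*)$/

function autoIdleTtlMs(
  modelId: string, // e.g. 'google/gemini-2.5-flash'
  promptCacheRetention?: '24h' | 'in_memory', // only consulted for OpenAI
): number {
  const slash = modelId.indexOf('/')
  const provider = slash === -1 ? modelId : modelId.slice(0, slash)
  const model = slash === -1 ? '' : modelId.slice(slash + 1)

  switch (provider.toLowerCase()) {
    case 'deepseek':
      return HOUR
    case 'google':
      return 24 * HOUR
    case 'groq':
      return 2 * HOUR
    case 'openai':
      // Assumed precedence: an explicit retention option wins over model-name rules.
      if (promptCacheRetention === '24h') return HOUR
      if (promptCacheRetention === 'in_memory') return 5 * MINUTE
      return SHORT_TTL_OPENAI.test(model) ? 5 * MINUTE : HOUR
    default:
      // Anthropic, OpenRouter, xAI, and unknown providers
      return 5 * MINUTE
  }
}
```

Under these assumptions, `autoIdleTtlMs('google/gemini-2.5-flash')` resolves to 24 hours in milliseconds, which is the value `activateAfterIdle: 'auto'` would stand for in the example config above according to the table.
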

package/.docs/models/gateways/netlify.md CHANGED

@@ -1,6 +1,6 @@
 # Netlify
-Netlify AI Gateway provides unified access to multiple providers with built-in caching and observability. Access 68 models through Mastra's model router.
+Netlify AI Gateway provides unified access to multiple providers with built-in caching and observability. Access 69 models through Mastra's model router.
 Learn more in the [Netlify documentation](https://docs.netlify.com/build/ai-gateway/overview/).
@@ -61,6 +61,7 @@ ANTHROPIC_API_KEY=ant-...
 | `gemini/gemini-3.1-flash-lite-preview`      |
 | `gemini/gemini-3.1-pro-preview`             |
 | `gemini/gemini-3.1-pro-preview-customtools` |
+| `gemini/gemini-3.5-flash`                   |
 | `gemini/gemini-flash-latest`                |
 | `gemini/gemini-flash-lite-latest`           |
 | `openai/chat-latest`                        |

package/.docs/models/index.md CHANGED

@@ -1,6 +1,6 @@
 # Model Providers
-Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 4219 models from 121 providers through a single API.
+Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 4207 models from 121 providers through a single API.
 ## Features

package/.docs/models/providers/routing-run.md CHANGED

@@ -1,6 +1,6 @@
 # ![routing.run logo](https://models.dev/logos/routing-run.svg)routing.run
-Access 37 routing.run models through Mastra's model router. Authentication is handled automatically using the `ROUTING_RUN_API_KEY` environment variable.
+Access 24 routing.run models through Mastra's model router. Authentication is handled automatically using the `ROUTING_RUN_API_KEY` environment variable.
 Learn more in the [routing.run documentation](https://docs.routing.run).
@@ -32,45 +32,32 @@ for await (const chunk of stream) {
 ## Models
-| Model                                         | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
-| --------------------------------------------- | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
-| `routing-run/route/deepseek-v3.2`             | 164K    |       |           |       |       |       | $0.49      | $0.74       |
-| `routing-run/route/deepseek-v4-flash`         | 1.0M    |       |           |       |       |       | $0.49      | $0.74       |
-| `routing-run/route/deepseek-v4-flash-full`    | 1.0M    |       |           |       |       |       | $0.49      | $0.74       |
-| `routing-run/route/deepseek-v4-pro`           | 1.0M    |       |           |       |       |       | $0.49      | $0.74       |
-| `routing-run/route/deepseek-v4-pro-precision` | 1.0M    |       |           |       |       |       | $0.74      | $1          |
-| `routing-run/route/gemma-4-31b-it`            | 131K    |       |           |       |       |       | $0.10      | $0.30       |
-| `routing-run/route/glm-4.7`                   | 128K    |       |           |       |       |       | $1         | $4          |
-| `routing-run/route/glm-4.7-flash`             | 128K    |       |           |       |       |       | $1         | $4          |
-| `routing-run/route/glm-5`                     | 203K    |       |           |       |       |       | $0.79      | $3          |
-| `routing-run/route/glm-5-highspeed`           | 203K    |       |           |       |       |       | $1         | $4          |
-| `routing-run/route/glm-5.1`                   | 203K    |       |           |       |       |       | $1         | $3          |
-| `routing-run/route/glm-5.1-fp16`              | 203K    |       |           |       |       |       | $1         | $4          |
-| `routing-run/route/glm-5.1-full`              | 203K    |       |           |       |       |       | $1         | $4          |
-| `routing-run/route/glm-5.1-precision`         | 203K    |       |           |       |       |       | $1         | $4          |
-| `routing-run/route/kimi-k2.5`                 | 262K    |       |           |       |       |       | $0.46      | $2          |
-| `routing-run/route/kimi-k2.5-highspeed`       | 131K    |       |           |       |       |       | $0.65      | $3          |
-| `routing-run/route/kimi-k2.6`                 | 262K    |       |           |       |       |       | $0.46      | $2          |
-| `routing-run/route/kimi-k2.6-full`            | 262K    |       |           |       |       |       | $0.46      | $2          |
-| `routing-run/route/kimi-k2.6-precision`       | 262K    |       |           |       |       |       | $0.65      | $3          |
-| `routing-run/route/mimo-v2.5`                 | 256K    |       |           |       |       |       | $0.40      | $2          |
-| `routing-run/route/mimo-v2.5-pro`             | 1.0M    |       |           |       |       |       | $0.45      | $1          |
-| `routing-run/route/mimo-v2.5-pro-precision`   | 1.0M    |       |           |       |       |       | $0.45      | $1          |
-| `routing-run/route/minimax-m2.5`              | 100K    |       |           |       |       |       | $0.19      | $1          |
-| `routing-run/route/minimax-m2.5-highspeed`    | 100K    |       |           |       |       |       | $0.19      | $1          |
-| `routing-run/route/minimax-m2.7`              | 100K    |       |           |       |       |       | $0.33      | $1          |
-| `routing-run/route/minimax-m2.7-highspeed`    | 100K    |       |           |       |       |       | $0.33      | $1          |
-| `routing-run/route/mistral-large-3`           | 128K    |       |           |       |       |       | $0.50      | $2          |
-| `routing-run/route/mistral-medium-2505`       | 128K    |       |           |       |       |       | $0.40      | $2          |
-| `routing-run/route/mistral-small-2503`        | 128K    |       |           |       |       |       | $0.15      | $0.60       |
-| `routing-run/route/qwen3.5-397b-a17b`         | 262K    |       |           |       |       |       | $1         | $3          |
-| `routing-run/route/qwen3.5-9b`                | 262K    |       |           |       |       |       | $0.20      | $0.60       |
-| `routing-run/route/qwen3.5-9b-chat`           | 262K    |       |           |       |       |       | $0.20      | $0.60       |
-| `routing-run/route/qwen3.6-27b`               | 262K    |       |           |       |       |       | $1         | $3          |
-| `routing-run/route/step-3.5-flash`            | 262K    |       |           |       |       |       | $0.10      | $0.29       |
-| `routing-run/route/step-3.5-flash-2603`       | 262K    |       |           |       |       |       | $0.10      | $0.30       |
-| `routing-run/route/step-3.5-flash-full`       | 262K    |       |           |       |       |       | $0.10      | $0.29       |
-| `routing-run/route/stepfun-3.5-flash`         | 262K    |       |           |       |       |       | $0.10      | $0.29       |
+| Model                                      | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
+| ------------------------------------------ | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
+| `routing-run/route/deepseek-v3.2`          | 164K    |       |           |       |       |       | $0.49      | $0.74       |
+| `routing-run/route/deepseek-v4-flash`      | 1.0M    |       |           |       |       |       | $0.49      | $0.74       |
+| `routing-run/route/deepseek-v4-flash-6bit` | 1.0M    |       |           |       |       |       | $0.49      | $0.74       |
+| `routing-run/route/deepseek-v4-pro`        | 1.0M    |       |           |       |       |       | $0.49      | $0.74       |
+| `routing-run/route/deepseek-v4-pro-6bit`   | 1.0M    |       |           |       |       |       | $0.49      | $0.74       |
+| `routing-run/route/gemma-4-31b-it`         | 131K    |       |           |       |       |       | $0.10      | $0.30       |
+| `routing-run/route/glm-5.1`                | 203K    |       |           |       |       |       | $1         | $3          |
+| `routing-run/route/glm-5.1-6bit`           | 203K    |       |           |       |       |       | $1         | $3          |
+| `routing-run/route/kimi-k2.5`              | 131K    |       |           |       |       |       | $0.46      | $2          |
+| `routing-run/route/kimi-k2.6`              | 262K    |       |           |       |       |       | $0.46      | $2          |
+| `routing-run/route/kimi-k2.6-6bit`         | 262K    |       |           |       |       |       | $0.46      | $2          |
+| `routing-run/route/mimo-v2.5-pro`          | 1.0M    |       |           |       |       |       | $0.45      | $1          |
+| `routing-run/route/mimo-v2.5-pro-6bit`     | 1.0M    |       |           |       |       |       | $0.45      | $1          |
+| `routing-run/route/minimax-m2.5`           | 100K    |       |           |       |       |       | $0.19      | $1          |
+| `routing-run/route/minimax-m2.5-highspeed` | 100K    |       |           |       |       |       | $0.19      | $1          |
+| `routing-run/route/minimax-m2.7`           | 100K    |       |           |       |       |       | $0.33      | $1          |
+| `routing-run/route/minimax-m2.7-highspeed` | 100K    |       |           |       |       |       | $0.33      | $1          |
+| `routing-run/route/mistral-large-3`        | 128K    |       |           |       |       |       | $0.50      | $2          |
+| `routing-run/route/mistral-medium-2505`    | 128K    |       |           |       |       |       | $0.40      | $2          |
+| `routing-run/route/mistral-small-2503`     | 128K    |       |           |       |       |       | $0.15      | $0.60       |
+| `routing-run/route/qwen3.6-27b`            | 262K    |       |           |       |       |       | $1         | $3          |
+| `routing-run/route/step-3.5-flash`         | 262K    |       |           |       |       |       | $0.10      | $0.29       |
+| `routing-run/route/step-3.5-flash-2603`    | 262K    |       |           |       |       |       | $0.10      | $0.30       |
+| `routing-run/route/stepfun-3.5-flash`      | 262K    |       |           |       |       |       | $0.10      | $0.29       |
 ## Advanced configuration

package/.docs/reference/memory/observational-memory.md CHANGED

@@ -36,7 +36,7 @@ OM performs thresholding with fast local token estimation. Text uses `tokenx`, a
 **scope** (`'resource' | 'thread'`): Memory scope for observations. \`'thread'\` keeps observations per-thread. \`'resource'\` (experimental) shares observations across all threads for a resource, enabling cross-conversation memory. (Default: `'thread'`)
-**activateAfterIdle** (`number | string | false`): Time before buffered observations are forced to activate after inactivity, even before \`observation.messageTokens\` is reached. Accepts a numeric millisecond value such as \`300\_000\`, duration strings like \`"5m"\` or \`"1hr"\`, or \`false\` to disable inherited observation idle activation. Reflections do not inherit this setting. Use \`reflection.activateAfterIdle\` to opt reflections into idle activation.
+**activateAfterIdle** (`number | string | false | "auto"`): Time before buffered observations are forced to activate after inactivity, even before \`observation.messageTokens\` is reached. Accepts a numeric millisecond value such as \`300\_000\`, duration strings like \`"5m"\` or \`"1hr"\`, \`"auto"\` for a provider-aware prompt cache TTL, or \`false\` to disable inherited observation idle activation. Reflections do not inherit this setting. Use \`reflection.activateAfterIdle\` to opt reflections into idle activation.
 **activateOnProviderChange** (`boolean`): Force buffered observations to activate when the actor provider or model changes. Reflections do not inherit this setting. Use \`reflection.activateOnProviderChange\` to opt reflections into provider-change activation. (Default: `false`)
@@ -54,6 +54,8 @@ OM performs thresholding with fast local token estimation. Text uses `tokenx`, a
 **observation.threadTitle** (`boolean`): When \`true\`, the Observer suggests short thread titles and updates the thread title when the conversation topic meaningfully changes. This is opt-in and defaults to disabled.
+**observation.observeAttachments** (`boolean | string[]`): Controls which image/file attachments are forwarded to the Observer model alongside their placeholder text lines. \`true\` (default) forwards all attachments. \`false\` drops all attachments while keeping placeholders visible. An array is a case-insensitive mimeType allowlist supporting exact matches (\`'application/pdf'\`), wildcard subtypes (\`'image/\*'\`), and bare \`'\*'\` for everything. Useful when the Observer model is text-only (e.g. some DeepSeek endpoints) while the main agent uses a multimodal model. Tool-result attachments are filtered using the same rule.
 **observation.messageTokens** (`number`): Token count of unobserved messages that triggers observation. When unobserved message tokens exceed this threshold, the Observer agent is called. Text is estimated locally with \`tokenx\`. Image parts are included with model-aware heuristics when possible, with deterministic fallbacks when image metadata is incomplete. Image-like \`file\` parts are counted the same way when uploads are normalized as files.
 **observation.maxTokensPerBatch** (`number`): Maximum tokens per batch when observing multiple threads in resource scope. Threads are chunked into batches of this size and processed in parallel. Lower values mean more parallelism but more API calls.
@@ -68,7 +70,7 @@ OM performs thresholding with fast local token estimation. Text uses `tokenx`, a
 **observation.bufferActivation** (`number`): Controls how much of the message window to retain after activation. Accepts a ratio (0-1) or an absolute token count (≥ 1000). For example, \`0.8\` means: activate enough buffers to remove 80% of \`messageTokens\` and leave 20% as active message history. An absolute token count like \`4000\` targets a goal of keeping \~4k message tokens remaining after activation. Higher values remove more message history per activation when using a ratio. Higher values keep more message history when using a token count.
-**observation.activateAfterIdle** (`number | string | false`): Time before buffered observations are forced to activate after inactivity. Accepts milliseconds, a duration string, or \`false\`. If unset, the top-level \`activateAfterIdle\` value is used for observations. Set \`false\` to disable the top-level idle setting for observations.
+**observation.activateAfterIdle** (`number | string | false | "auto"`): Time before buffered observations are forced to activate after inactivity. Accepts milliseconds, a duration string, \`"auto"\` for a provider-aware prompt cache TTL, or \`false\`. If unset, the top-level \`activateAfterIdle\` value is used for observations. Set \`false\` to disable the top-level idle setting for observations.
 **observation.activateOnProviderChange** (`boolean`): Force buffered observations to activate when the actor provider or model changes. If unset, the top-level \`activateOnProviderChange\` value is used for observations.
@@ -92,7 +94,7 @@ OM performs thresholding with fast local token estimation. Text uses `tokenx`, a
 **reflection.bufferActivation** (`number`): Ratio (0-1) controlling when async reflection buffering starts. When observation tokens reach \`observationTokens \* bufferActivation\`, reflection runs in the background. On activation at the full threshold, the buffered reflection replaces the observations it covers, preserving any new observations appended after that range.
-**reflection.activateAfterIdle** (`number | string | false`): Time before buffered reflections are forced to activate after inactivity. Accepts milliseconds, a duration string, or \`false\`. Reflections do not inherit top-level \`activateAfterIdle\`; set this explicitly to opt reflections into idle activation.
+**reflection.activateAfterIdle** (`number | string | false | "auto"`): Time before buffered reflections are forced to activate after inactivity. Accepts milliseconds, a duration string, \`"auto"\` for a provider-aware prompt cache TTL, or \`false\`. Reflections do not inherit top-level \`activateAfterIdle\`; set this explicitly to opt reflections into idle activation.
 **reflection.activateOnProviderChange** (`boolean`): Force buffered reflections to activate when the actor provider or model changes. Reflections do not inherit top-level \`activateOnProviderChange\`; set this explicitly to opt reflections into provider-change activation.
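
Note on `observation.observeAttachments` added in this file's diff: the reference entry describes a boolean or a case-insensitive mimeType allowlist with exact matches, wildcard subtypes, and a bare `'*'`. A minimal sketch of that matching rule follows. The type alias and the function name `shouldForwardAttachment` are hypothetical, and the code is a reading of the description rather than Mastra's actual filter.

```typescript
// Hypothetical sketch of the allowlist rule described for observeAttachments.
// Not Mastra's implementation; names are made up for illustration.
type ObserveAttachments = boolean | string[]

function shouldForwardAttachment(setting: ObserveAttachments, mimeType: string): boolean {
  // true forwards every attachment, false drops every attachment
  if (typeof setting === 'boolean') return setting

  const mime = mimeType.toLowerCase()
  return setting.some(pattern => {
    const p = pattern.toLowerCase()
    if (p === '*') return true // bare wildcard matches everything
    if (p.endsWith('/*')) {
      // 'image/*' becomes the prefix 'image/'
      return mime.startsWith(p.slice(0, -1))
    }
    return mime === p // exact match, e.g. 'application/pdf'
  })
}
```

With an allowlist of `['image/*']`, a PNG would be forwarded and a PDF would be dropped, while the placeholder lines for both stay in the transcript as the entry describes.
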

package/.docs/reference/tools/mcp-client.md CHANGED

@@ -53,7 +53,7 @@ Each server in the `servers` map is configured using the `MastraMCPServerDefinit
 **enableServerLogs** (`boolean`): Whether to enable logging for this server. (Default: `true`)
-**requireToolApproval** (`boolean | (params: RequireToolApprovalContext) => boolean | Promise<boolean>`): Require human approval before executing tools from this server. When set to \`true\`, all tools require approval. When set to a function, the function is called with the tool name, arguments, and request context to dynamically decide whether approval is needed.
+**requireToolApproval** (`boolean | (params: RequireToolApprovalContext) => boolean | Promise<boolean>`): Require human approval before executing tools from this server. When set to \`true\`, all tools require approval. When set to a function, the function is called with the tool name, arguments, request context, and any tool annotations advertised by the server to dynamically decide whether approval is needed.
 ## Tool approval
@@ -76,7 +76,7 @@ const mcp = new MCPClient({
 ### Dynamic approval with a function
-Pass a function to decide per-call whether approval is needed. The function receives the tool name, the arguments the model passed, and any request context from the incoming request:
+Pass a function to decide per-call whether approval is needed. The function receives the tool name, the arguments the model passed, any request context from the incoming request, and the tool's MCP `annotations` (when the server advertises them):
 ```typescript
 const mcp = new MCPClient({
@@ -98,6 +98,31 @@ const mcp = new MCPClient({
 The function can also be async. It receives `requestContext` from the incoming request, which you can use for auth checks or other per-request logic.
+### Use tool annotations from a trusted server
+If you trust the MCP server, you can use its [tool annotations](https://modelcontextprotocol.io/specification/2025-11-25/server/tools#tool-annotations) (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`, `title`) to drive approval decisions:
+```typescript
+const mcp = new MCPClient({
+  servers: {
+    github: {
+      url: new URL('http://localhost:3000/mcp'),
+      requireToolApproval: ({ annotations }) => {
+        // Skip approval for tools the server has marked read-only
+        if (annotations?.readOnlyHint) return false
+        // Always require approval for destructive tools
+        if (annotations?.destructiveHint) return true
+        return true
+      },
+    },
+  },
+})
+```
+Per the MCP specification, **clients MUST consider tool annotations to be untrusted unless they come from trusted servers**. Annotations are advisory hints, not a security boundary — a malicious or buggy server can claim a tool is read-only when it isn't. Only use annotations to relax approval requirements for servers you trust.
+The same annotations are also exposed on the tools returned by `listTools()` and `listToolsets()` under `tool.mcp.annotations`, so you can inspect them when wiring tools into an agent.
 ## Methods
 ### `listTools()`
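
Note on the tool-annotation guidance added in this file's diff: the approval decision can be kept in a small pure function so it is easy to unit test. The sketch below is one possible policy, not part of `@mastra/mcp`. The interface `ToolAnnotationHints`, the function `needsApproval`, and the `serverIsTrusted` flag are hypothetical; only the hint names `readOnlyHint` and `destructiveHint` come from the annotations listed above.

```typescript
// Hypothetical policy helper for illustration; not an @mastra/mcp API.
interface ToolAnnotationHints {
  readOnlyHint?: boolean
  destructiveHint?: boolean
}

function needsApproval(
  annotations: ToolAnnotationHints | undefined,
  serverIsTrusted: boolean,
): boolean {
  // Annotations from an untrusted server, or missing annotations, never relax approval.
  if (!serverIsTrusted || !annotations) return true
  // Check the destructive hint first so a tool that claims both hints stays gated.
  if (annotations.destructiveHint) return true
  // Skip approval only when the trusted server explicitly marks the tool read-only.
  return annotations.readOnlyHint !== true
}
```

A function like this could be called from a `requireToolApproval` callback, for example `({ annotations }) => needsApproval(annotations, true)` for a server you trust. Checking `destructiveHint` before `readOnlyHint` is a design choice that keeps contradictory annotations on the safe side.
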

package/CHANGELOG.md CHANGED

@@ -1,5 +1,13 @@
 # @mastra/mcp-docs-server
+## 1.1.39-alpha.11
+### Patch Changes
+- Updated dependencies [[`c272d50`](https://github.com/mastra-ai/mastra/commit/c272d50610a54496b6b6d92ccd4d37b333a2613a), [`d8692af`](https://github.com/mastra-ai/mastra/commit/d8692afa253028e39cdce2aafa0ac414071a762e), [`4bd4e8e`](https://github.com/mastra-ai/mastra/commit/4bd4e8e042f6687559f49a560a7914cee9b85447), [`841a222`](https://github.com/mastra-ai/mastra/commit/841a222560d8c19238f8213713f30535cdd82284)]:
+  - @mastra/core@1.36.0-alpha.4
+  - @mastra/mcp@1.8.0-alpha.1
 ## 1.1.39-alpha.9
 ### Patch Changes

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@mastra/mcp-docs-server",
-  "version": "1.1.39-alpha.10",
+  "version": "1.1.39-alpha.11",
   "description": "MCP server for accessing Mastra.ai documentation, changelogs, and news.",
   "type": "module",
   "main": "dist/index.js",
@@ -29,8 +29,8 @@
     "jsdom": "^26.1.0",
     "local-pkg": "^1.1.2",
     "zod": "^4.3.6",
-    "@mastra/core": "1.36.0-alpha.3",
-    "@mastra/mcp": "^1.7.1-alpha.0"
+    "@mastra/core": "1.36.0-alpha.4",
+    "@mastra/mcp": "^1.8.0-alpha.1"
   },
   "devDependencies": {
     "@hono/node-server": "^1.19.11",
@@ -47,7 +47,7 @@
     "typescript": "^6.0.3",
     "vitest": "4.1.5",
     "@internal/types-builder": "0.0.71",
-    "@mastra/core": "1.36.0-alpha.3",
+    "@mastra/core": "1.36.0-alpha.4",
     "@internal/lint": "0.0.96"
   },
   "homepage": "https://mastra.ai",