npm - @mastra/mcp-docs-server - Versions diffs - 1.1.39-alpha.1 → 1.1.39-alpha.11 - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (32)

package/.docs/docs/agents/a2a.md +115 -88
package/.docs/docs/agents/acp.md +238 -0
package/.docs/docs/agents/response-caching.md +2 -0
package/.docs/docs/agents/signals.md +1 -1
package/.docs/docs/build-with-ai/skills.md +28 -0
package/.docs/docs/evals/evals-with-memory.md +146 -0
package/.docs/docs/evals/running-in-ci.md +1 -0
package/.docs/docs/memory/observational-memory.md +52 -17
package/.docs/docs/server/auth/fga.md +55 -10
package/.docs/models/gateways/netlify.md +2 -1
package/.docs/models/gateways/openrouter.md +2 -10
package/.docs/models/gateways/vercel.md +1 -8
package/.docs/models/index.md +1 -1
package/.docs/models/providers/berget.md +2 -1
package/.docs/models/providers/kilo.md +9 -22
package/.docs/models/providers/lilac.md +74 -0
package/.docs/models/providers/llmgateway.md +1 -8
package/.docs/models/providers/neuralwatt.md +3 -3
package/.docs/models/providers/novita-ai.md +7 -7
package/.docs/models/providers/routing-run.md +94 -0
package/.docs/models/providers/xai.md +4 -15
package/.docs/models/providers/xpersona.md +3 -3
package/.docs/models/providers.md +2 -0
package/.docs/reference/agents/channels.md +6 -0
package/.docs/reference/configuration.md +10 -10
package/.docs/reference/memory/observational-memory.md +5 -3
package/.docs/reference/server/register-api-route.md +19 -0
package/.docs/reference/storage/convex.md +74 -12
package/.docs/reference/tools/mcp-client.md +27 -2
package/.docs/reference/vectors/convex.md +129 -7
package/CHANGELOG.md +36 -0
package/package.json +6 -6

package/.docs/docs/evals/evals-with-memory.md ADDED

@@ -0,0 +1,146 @@
+# Evals with memory
+Agents that use memory in `thread` scope — including observational memory — require a thread ID at run time. When an eval invokes the agent without one, you'll see:
+```text
+ObservationalMemory (scope: 'thread') requires a threadId, but none was found in RequestContext or MessageList.
+```
+This page covers the three working patterns for running Mastra evals against memory-enabled agents, what each path supports, and which one to pick. A complete runnable repro for all three approaches lives in [`examples/evals-with-memory`](https://github.com/mastra-ai/mastra/tree/main/examples/evals-with-memory).
+## When to use which approach
+| Goal                                            | Approach                                                                                  |
+| ----------------------------------------------- | ----------------------------------------------------------------------------------------- |
+| One shared conversation across every item       | [`runEvals` with global `targetOptions.memory`](#shared-thread-with-runevals)             |
+| One independent thread per item, simple CI loop | [`runEvals` per item](#per-item-threads-with-runevals)                                    |
+| Per-item threads driven by a stored `Dataset`   | [`dataset.startExperiment` with an inline task](#dataset-experiments-with-an-inline-task) |
+Pre-seeding `RequestContext` with `MastraMemory` is **not** a supported way to drive memory into an agent. Thread resolution reads `args.memory.thread` — `RequestContext.MastraMemory` is populated by `prepare-memory-step` after the agent has already resolved its thread.
+## Shared thread with `runEvals`
+`runEvals` accepts `targetOptions`, which is forwarded to `agent.generate()`. Passing `memory: { thread, resource }` runs every data item against the same thread — useful for testing recall across a multi-turn conversation.
+```typescript
+import { runEvals } from '@mastra/core/evals'
+import { supportAgent } from './support-agent'
+import { recallScorer } from '../scorers/recall-scorer'
+const memory = await supportAgent.getMemory()
+await memory!.createThread({ threadId: 'eval-thread', resourceId: 'ci-user' })
+const result = await runEvals({
+  target: supportAgent,
+  scorers: [recallScorer],
+  targetOptions: {
+    memory: { thread: 'eval-thread', resource: 'ci-user' },
+  },
+  data: [
+    { input: 'My order number is 12345' },
+    { input: 'What is my order number?', groundTruth: '12345' },
+  ],
+})
+```
+`targetOptions` is **global per call**. There is no per-item override on `RunEvalsDataItem` today.
+## Per-item threads with `runEvals`
+When each data item needs its own thread (the common CI shape), call `runEvals` once per item with a unique `targetOptions.memory` and aggregate the scores yourself.
+```typescript
+import { randomUUID } from 'node:crypto'
+import { runEvals } from '@mastra/core/evals'
+import { supportAgent } from './support-agent'
+import { recallScorer } from '../scorers/recall-scorer'
+const memory = await supportAgent.getMemory()
+const resourceId = 'ci-user'
+const items = [
+  { input: 'Cats are mammals', groundTruth: 'mammals' },
+  { input: 'Dogs are mammals too', groundTruth: 'mammals' },
+]
+// `runEvals` returns `{ scores: Record<string, number>; summary: { totalItems } }`.
+const scores: number[] = []
+for (const item of items) {
+  const threadId = `eval-${randomUUID()}`
+  await memory!.createThread({ threadId, resourceId, title: item.input })
+  const result = await runEvals({
+    target: supportAgent,
+    scorers: [recallScorer],
+    targetOptions: { memory: { thread: threadId, resource: resourceId } },
+    data: [item],
+  })
+  scores.push(result.scores[recallScorer.id])
+}
+const average = scores.reduce((a, b) => a + b, 0) / scores.length
+```
+> **Note:** Create the thread before running the eval. Observational memory in `thread` scope reads from a record that must already exist.
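The averaging step in the loop above assumes every run returns a numeric score. A minimal sketch of a more defensive aggregation is shown here. `averageScores` is a hypothetical helper, not part of `@mastra/core`, and skipping missing values (rather than counting them as zero) is an assumption about what a CI job would want.

```typescript
// Hypothetical helper (not a Mastra API): average only the finite scores.
// Returns null when there is nothing to average, so callers never see NaN.
function averageScores(scores: Array<number | undefined>): number | null {
  const valid = scores.filter(
    (s): s is number => typeof s === 'number' && Number.isFinite(s),
  )
  if (valid.length === 0) return null
  return valid.reduce((a, b) => a + b, 0) / valid.length
}

// Example: one run did not report a score for the scorer ID.
const perItemAverage = averageScores([1, 0.5, undefined])
console.log(perItemAverage)
```

Returning `null` for an empty input lets the CI script decide whether "no scores" should fail the build instead of silently comparing against `NaN`.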
+## Dataset experiments with an inline task
+`dataset.startExperiment({ target: agent })` does **not** forward a `memory` option to the agent — only `requestContext`. To run a stored dataset against a memory-enabled agent, use an inline `task` function and stash `{ threadId, resourceId }` in each item's `metadata`. The scorer pipeline still runs as normal.
+```typescript
+import { randomUUID } from 'node:crypto'
+import { mastra } from '../index'
+import { supportAgent } from '../agents/support-agent'
+import { recallScorer } from '../scorers/recall-scorer'
+const memory = await supportAgent.getMemory()
+const resourceId = 'ci-user'
+const items = [
+  { input: 'Cats are mammals', groundTruth: 'mammals', thread: `ds-${randomUUID()}` },
+  { input: 'Dogs are mammals too', groundTruth: 'mammals', thread: `ds-${randomUUID()}` },
+]
+for (const it of items) {
+  await memory!.createThread({ threadId: it.thread, resourceId, title: it.input })
+}
+const dataset = await mastra.datasets.create({
+  name: 'support-recall',
+  description: 'Per-item memory via inline task + item metadata',
+})
+await dataset.addItems({
+  items: items.map(it => ({
+    input: it.input,
+    groundTruth: it.groundTruth,
+    metadata: { threadId: it.thread, resourceId },
+  })),
+})
+const summary = await dataset.startExperiment({
+  scorers: [recallScorer],
+  task: async ({ input, metadata }) => {
+    const { threadId, resourceId: rid } = (metadata ?? {}) as {
+      threadId: string
+      resourceId: string
+    }
+    const result = await supportAgent.generate(input as string, {
+      memory: { thread: threadId, resource: rid },
+    })
+    return result.text
+  },
+})
+```
+The inline `task` receives the item's `metadata`, so each row can drive its own thread without changing the agent or any scorer.
+> **Note:** Visit [runEvals reference](https://mastra.ai/reference/evals/run-evals) and [Dataset reference](https://mastra.ai/reference/datasets/dataset) for full configuration.
+## Related
+- [Running scorers in CI](https://mastra.ai/docs/evals/running-in-ci)
+- [Running experiments](https://mastra.ai/docs/evals/datasets/running-experiments)
+- [Observational memory](https://mastra.ai/docs/memory/observational-memory)
+- [runEvals API reference](https://mastra.ai/reference/evals/run-evals)

package/.docs/docs/evals/running-in-ci.md CHANGED

@@ -121,4 +121,5 @@ describe('Weather Agent Tests', () => {
 - Learn about [creating custom scorers](https://mastra.ai/docs/evals/custom-scorers)
 - Explore [built-in scorers](https://mastra.ai/docs/evals/built-in-scorers)
+- Run scorers against [memory-enabled agents](https://mastra.ai/docs/evals/evals-with-memory)
 - Read the [runEvals API reference](https://mastra.ai/reference/evals/run-evals)

package/.docs/docs/memory/observational-memory.md CHANGED

@@ -88,7 +88,7 @@ const memory = new Memory({
   options: {
     observationalMemory: {
       model: 'google/gemini-2.5-flash',
-      activateAfterIdle: '5m',
+      activateAfterIdle: 'auto',
       activateOnProviderChange: true,
     },
   },
@@ -144,6 +144,28 @@ OM uses fast local token estimation for this thresholding work. Text is estimate
 The Observer can also see attachments in the history it reviews. OM keeps readable placeholders like `[Image #1: reference-board.png]` or `[File #1: floorplan.pdf]` in the transcript for readability, and forwards the actual attachment parts alongside the text. Image-like `file` parts are upgraded to image inputs for the Observer when possible, while non-image attachments are forwarded as file parts with normalized token counting. This applies to both normal thread observation and batched resource-scope observation.
+If your Observer model is text-only or its API rejects multimodal input, set `observation.observeAttachments` to `false` to drop attachments before they reach the Observer. The readable placeholders (`[Image #1: ...]`, `[File #1: ...]`) are kept in the transcript so the Observer can still reason about what was shared without receiving the binary payload. The same filter applies to tool results that contain image or file parts:
+```typescript
+new Agent({
+  name: 'assistant',
+  instructions: 'You are a helpful assistant.',
+  model: 'openai/gpt-5-mini',
+  memory: new Memory({
+    options: {
+      observationalMemory: {
+        observation: {
+          model: 'deepseek/deepseek-reasoner',
+          observeAttachments: false,
+        },
+      },
+    },
+  }),
+})
+```
+You can also pass an allowlist of mimeType globs (for example `['image/*']`) to forward only the kinds the Observer can handle.
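To illustrate what an allowlist of mimeType globs means in practice, here is a small sketch of glob matching. `matchesMimeGlob` is a hypothetical function written for this example; how Mastra evaluates the allowlist internally is not described in the diff, so treat the matching rules below as assumptions.

```typescript
// Hypothetical matcher (not Mastra's implementation). Assumed rules:
// exact types match themselves, and a trailing `/*` matches any subtype.
function matchesMimeGlob(mimeType: string, allowlist: string[]): boolean {
  const type = mimeType.toLowerCase()
  return allowlist.some(glob => {
    const pattern = glob.toLowerCase()
    if (pattern === '*' || pattern === '*/*') return true
    // 'image/*' becomes the prefix 'image/'
    if (pattern.endsWith('/*')) return type.startsWith(pattern.slice(0, -1))
    return type === pattern
  })
}

// With ['image/*'], images pass and PDFs are filtered out.
console.log(matchesMimeGlob('image/png', ['image/*']))
console.log(matchesMimeGlob('application/pdf', ['image/*']))
```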
 ```md
 Date: 2026-01-15
@@ -444,35 +466,48 @@ Reflection works similarly — the Reflector runs in the background when observa
 ### Settings
-| Setting                               | Default | What it controls                                                                                                                                                                                                                                                                                                              |
-| ------------------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `observation.bufferTokens`            | `0.2`   | How often to buffer. `0.2` means every 20% of `messageTokens` — with the default 30k threshold, that's roughly every 6k tokens. Can also be an absolute token count (e.g. `5000`).                                                                                                                                            |
-| `observation.bufferActivation`        | `0.8`   | How aggressively to clear the message window on activation. `0.8` means remove enough messages to keep only 20% of `messageTokens` remaining. Lower values keep more message history.                                                                                                                                         |
-| `observation.blockAfter`              | `1.2`   | Safety threshold as a multiplier of `messageTokens`. At `1.2`, synchronous observation is forced at 36k tokens (1.2 × 30k). Only matters if buffering can't keep up.                                                                                                                                                          |
-| `activateAfterIdle`                   | none    | Forces buffered observations to activate after a period of inactivity, even before `observation.messageTokens` is reached. Accepts a numeric millisecond value such as `300_000`, or duration strings like `"5m"` or `"1hr"`. Set this to your prompt cache TTL if you want activation to happen before the next cold prompt. |
-| `activateOnProviderChange`            | `false` | Forces buffered observations to activate when the next step uses a different `provider/model` than the one that produced the latest assistant step. Use this when switching providers or models would invalidate prompt cache reuse.                                                                                          |
-| `reflection.bufferActivation`         | `0.5`   | When to start background reflection. `0.5` means reflection begins when observations reach 50% of the `observationTokens` threshold.                                                                                                                                                                                          |
-| `reflection.activateAfterIdle`        | none    | Opts buffered reflections into idle activation. Reflections don't inherit top-level `activateAfterIdle`.                                                                                                                                                                                                                      |
-| `reflection.activateOnProviderChange` | `false` | Opts buffered reflections into provider-change activation. Reflections don't inherit top-level `activateOnProviderChange`.                                                                                                                                                                                                    |
-| `reflection.blockAfter`               | `1.2`   | Safety threshold for reflection, same logic as observation.                                                                                                                                                                                                                                                                   |
-If you're relying on prompt caching, set `activateAfterIdle` to match your cache TTL. That way, once a thread has been idle long enough for the cache to expire, the next request can activate buffered observations first and send a smaller compressed context window.
+| Setting                               | Default | What it controls                                                                                                                                                                                                                                                              |
+| ------------------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `observation.bufferTokens`            | `0.2`   | How often to buffer. `0.2` means every 20% of `messageTokens` — with the default 30k threshold, that's roughly every 6k tokens. Can also be an absolute token count (e.g. `5000`).                                                                                            |
+| `observation.bufferActivation`        | `0.8`   | How aggressively to clear the message window on activation. `0.8` means remove enough messages to keep only 20% of `messageTokens` remaining. Lower values keep more message history.                                                                                         |
+| `observation.blockAfter`              | `1.2`   | Safety threshold as a multiplier of `messageTokens`. At `1.2`, synchronous observation is forced at 36k tokens (1.2 × 30k). Only matters if buffering can't keep up.                                                                                                          |
+| `activateAfterIdle`                   | none    | Forces buffered observations to activate after a period of inactivity, even before `observation.messageTokens` is reached. Accepts a numeric millisecond value such as `300_000`, duration strings like `"5m"` or `"1hr"`, or `"auto"` for a provider-aware prompt cache TTL. |
+| `activateOnProviderChange`            | `false` | Forces buffered observations to activate when the next step uses a different `provider/model` than the one that produced the latest assistant step. Use this when switching providers or models would invalidate prompt cache reuse.                                          |
+| `reflection.bufferActivation`         | `0.5`   | When to start background reflection. `0.5` means reflection begins when observations reach 50% of the `observationTokens` threshold.                                                                                                                                          |
+| `reflection.activateAfterIdle`        | none    | Opts buffered reflections into idle activation. Reflections don't inherit top-level `activateAfterIdle`.                                                                                                                                                                      |
+| `reflection.activateOnProviderChange` | `false` | Opts buffered reflections into provider-change activation. Reflections don't inherit top-level `activateOnProviderChange`.                                                                                                                                                    |
+| `reflection.blockAfter`               | `1.2`   | Safety threshold for reflection, same logic as observation.                                                                                                                                                                                                                   |
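The arithmetic behind the observation settings in the table can be worked through with the default 30k `messageTokens` threshold. This is a sketch only: `resolveBufferInterval` is a hypothetical helper, and treating values of 1 or less as fractions and larger values as absolute token counts is an assumption based on the table's wording.

```typescript
// Assumption: values <= 1 are fractions of messageTokens; larger values are
// absolute token counts (the table gives 5000 as an example).
function resolveBufferInterval(bufferTokens: number, messageTokens: number): number {
  return bufferTokens <= 1 ? Math.round(bufferTokens * messageTokens) : bufferTokens
}

const messageTokens = 30_000

// bufferTokens 0.2: buffer roughly every 6k tokens
const bufferEvery = resolveBufferInterval(0.2, messageTokens) // 6000

// bufferActivation 0.8: keep about 20% of messageTokens after activation
const retainedAfterActivation = Math.round((1 - 0.8) * messageTokens) // 6000

// blockAfter 1.2: synchronous observation is forced at 36k tokens
const blockAt = Math.round(1.2 * messageTokens) // 36000

console.log({ bufferEvery, retainedAfterActivation, blockAt })
```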
+If you're relying on prompt caching, set `activateAfterIdle` to `"auto"` or to a specific cache TTL. That way, once a thread has been idle long enough for the cache to expire, the next request can activate buffered observations first and send a smaller compressed context window.
+With `"auto"`, Mastra chooses an idle activation TTL from the active model provider:
+| Provider                                                                                | Auto TTL  |
+| --------------------------------------------------------------------------------------- | --------- |
+| Anthropic, OpenRouter, unknown providers, xAI                                           | 5 minutes |
+| DeepSeek                                                                                | 1 hour    |
+| Google Gemini                                                                           | 24 hours  |
+| Groq                                                                                    | 2 hours   |
+| OpenAI with `providerOptions.openai.promptCacheRetention: "24h"`                        | 1 hour    |
+| OpenAI with `providerOptions.openai.promptCacheRetention: "in_memory"`                  | 5 minutes |
+| OpenAI `gpt-4*`, `gpt-5`, `gpt-5-*`, `gpt-5.1*`, `gpt-5.2*`, `gpt-5.3*`, and `gpt-5.4*` | 5 minutes |
+| Other OpenAI models                                                                     | 1 hour    |
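The table can be read as a lookup from provider, model, and cache-retention option to a TTL. The sketch below mirrors it in code. `autoIdleTtlMs` and the lowercase provider keys are hypothetical names chosen for this example, and checking `promptCacheRetention` before the model-name patterns is an assumption, since the diff does not state which row wins when both apply.

```typescript
const MINUTE = 60_000
const HOUR = 60 * MINUTE

type Retention = '24h' | 'in_memory'

// Hypothetical lookup that mirrors the table; not Mastra's implementation.
function autoIdleTtlMs(
  provider: string,
  modelId: string,
  promptCacheRetention?: Retention,
): number {
  switch (provider) {
    case 'deepseek':
      return HOUR
    case 'google':
      return 24 * HOUR
    case 'groq':
      return 2 * HOUR
    case 'openai': {
      // Assumption: an explicit retention option takes precedence.
      if (promptCacheRetention === '24h') return HOUR
      if (promptCacheRetention === 'in_memory') return 5 * MINUTE
      // gpt-4*, gpt-5, gpt-5-*, and gpt-5.1* through gpt-5.4*
      const shortTtlModels = /^(gpt-4.*|gpt-5|gpt-5-.*|gpt-5\.[1-4].*)$/
      return shortTtlModels.test(modelId) ? 5 * MINUTE : HOUR
    }
    default:
      // Anthropic, OpenRouter, xAI, and unknown providers
      return 5 * MINUTE
  }
}

console.log(autoIdleTtlMs('openai', 'gpt-5-mini'))
console.log(autoIdleTtlMs('google', 'gemini-2.5-flash'))
```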
 ```typescript
 const memory = new Memory({
   options: {
     observationalMemory: {
       model: 'google/gemini-2.5-flash',
-      activateAfterIdle: '5m',
+      activateAfterIdle: 'auto',
       activateOnProviderChange: true,
     },
   },
 })
 ```
-With a 5-minute prompt cache TTL, this activates buffered observations after 5 minutes of inactivity so the next uncached prompt uses compressed observations instead of a larger raw message window. If you prefer, `300_000` works the same way.
+With `"auto"`, this activates buffered observations based on the active provider's prompt cache behavior so the next uncached prompt uses compressed observations instead of a larger raw message window. If you prefer a fixed 5-minute TTL, use `"5m"` or `300_000`.
-Changing model or providers mid-thread will invalidate the prompt cache. If your agent can switch between providers or models mid-thread, `activateOnProviderChange: true` forces buffered observations to activate before the new provider runs. That avoids sending a large raw window to a provider that can't reuse the previous prompt cache.
+Changing models or providers mid-thread will invalidate the prompt cache. If your agent can switch between providers or models mid-thread, `activateOnProviderChange: true` forces buffered observations to activate before the new provider runs. That avoids sending a large raw window to a provider that can't reuse the previous prompt cache.
 ### Disabling

package/.docs/docs/server/auth/fga.md CHANGED

@@ -25,6 +25,7 @@ const mastra = new Mastra({
     auth: new MastraAuthWorkos({
       /* ... */
       fetchMemberships: true,
+      mapUserToResourceId: user => user.teamId,
     }),
     fga: new MastraFGAWorkos({
       resourceMapping: {
@@ -39,6 +40,9 @@ const mastra = new Mastra({
         [MastraFGAPermissions.MEMORY_WRITE]: 'update',
       },
     }),
+    storedResources: {
+      scope: true,
+    },
   },
 });
 ```
@@ -47,6 +51,8 @@ When using `MastraFGAWorkos`, set `fetchMemberships: true` on `MastraAuthWorkos`
 Use `thread` as the resource-mapping key for memory authorization. `MastraFGAWorkos` still accepts the legacy alias `memory`, but new configs should prefer `thread`.
+When `server.fga` is configured, Mastra enforces FGA on protected actions. If a protected action has no authenticated user, Mastra denies it. If `server.fga` is not configured, these FGA checks are skipped and Mastra keeps the previous behavior.
 ### Resource mapping
 The `resourceMapping` tells Mastra how to resolve FGA resource types and IDs from request context. Keys are Mastra resource types, values define the FGA resource type and how to derive the ID:
@@ -67,6 +73,7 @@ resourceMapping: {
 - `user` — the authenticated user
 - `resourceId` — the owning Mastra resource ID when available (for example, a thread's `resourceId`)
 - `requestContext` — the current request context for advanced tenant resolution
+- `metadata` — provider-specific metadata for the attempted action
 Return `undefined` from `deriveId()` to fall back to the original Mastra resource ID.
@@ -89,9 +96,43 @@ If no mapping exists for a permission, the original string is passed through.
 Use `validatePermissions()` to validate the full set of permissions Mastra may emit at startup. Use this when a provider requires every Mastra permission to have an explicit provider permission slug.
+### Stored resource scoping
+FGA authorizes access to a resource. It does not automatically filter stored records that live in shared storage. Enable stored resource scoping when the built-in stored resource APIs are used in a multi-tenant app.
+```typescript
+const mastra = new Mastra({
+  server: {
+    auth: new MastraAuthWorkos({
+      /* ... */
+      mapUserToResourceId: user => user.teamId,
+    }),
+    storedResources: {
+      scope: true,
+    },
+  },
+});
+```
+With `scope: true`, Mastra reads `MASTRA_RESOURCE_ID_KEY` from the request context. `mapUserToResourceId()` sets this value after authentication. Stored resource handlers persist the scope in record metadata and filter list, read, update, publish, and delete operations by that scope.
+Use an object when the scope needs custom request logic:
+```typescript
+storedResources: {
+  scope: {
+    metadataKey: 'teamId',
+    resolve: ({ user }) => user.teamId,
+    requireScope: true,
+  },
+},
+```
+If `requireScope` is `true` or omitted, scoped stored resource routes fail when no scope can be resolved.
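The scoping behavior described above (persist the scope in record metadata, filter by it, and fail when no scope resolves) can be sketched with an in-memory store. `ScopedStore` is a hypothetical class written for illustration and does not reflect Mastra's stored resource handlers. Returning every record when `requireScope` is `false` and no scope resolves is an assumption; the diff does not describe that case.

```typescript
type StoredRecord = { id: string; metadata: Record<string, unknown> }

// Hypothetical in-memory model of scoped stored resources.
class ScopedStore {
  private records: StoredRecord[] = []
  private metadataKey: string
  private requireScope: boolean

  constructor(metadataKey: string, requireScope = true) {
    this.metadataKey = metadataKey
    this.requireScope = requireScope
  }

  // Fail when a scope is required but none could be resolved.
  private ensureScope(scope: string | undefined): string | undefined {
    if (scope === undefined && this.requireScope) {
      throw new Error('No stored resource scope could be resolved')
    }
    return scope
  }

  // Persist the scope in the record's metadata under metadataKey.
  create(id: string, scope: string | undefined): StoredRecord {
    const resolved = this.ensureScope(scope)
    const metadata: Record<string, unknown> =
      resolved === undefined ? {} : { [this.metadataKey]: resolved }
    const record: StoredRecord = { id, metadata }
    this.records.push(record)
    return record
  }

  // Only return records whose stored scope matches the caller's scope.
  list(scope: string | undefined): StoredRecord[] {
    const resolved = this.ensureScope(scope)
    if (resolved === undefined) return [...this.records] // assumption, see lead-in
    return this.records.filter(r => r.metadata[this.metadataKey] === resolved)
  }
}

const store = new ScopedStore('teamId')
store.create('agent-a', 'team-1')
store.create('agent-b', 'team-2')
console.log(store.list('team-1').map(r => r.id))
```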
 ### Route policy coverage
-Mastra includes route-level FGA metadata for built-in resource routes, including agents, workflows, tools, MCP tools, memory threads, responses, and conversations. A route is checked when it has route-level `fga` metadata, when Mastra can derive built-in metadata for that route, or when the provider supplies metadata with `resolveRouteFGA()`.
+Mastra includes route-level FGA metadata for built-in resource routes, including agents, workflows, tools, MCP tools, memory threads, responses, conversations, and stored resources. Stored resource route coverage includes `/stored/agents`, `/stored/mcp-clients`, `/stored/prompt-blocks`, `/stored/scorers`, `/stored/skills`, and `/stored/workspaces`. A route is checked when it has route-level `fga` metadata, when Mastra can derive built-in metadata for that route, or when the provider supplies metadata with `resolveRouteFGA()`.
 To deny protected routes that do not resolve FGA metadata, configure route policy coverage on the FGA provider:
@@ -159,16 +200,20 @@ const fga = new MastraFGAWorkos({
 When an FGA provider is configured, Mastra automatically checks authorization at these lifecycle points:
-| Lifecycle point                        | Permission checked                             | Resource                               |
-| -------------------------------------- | ---------------------------------------------- | -------------------------------------- |
-| Agent execution (`generate`, `stream`) | `agents:execute`                               | `{ type: 'agent', id: agentId }`       |
-| Workflow execution                     | `workflows:execute`                            | `{ type: 'workflow', id: workflowId }` |
-| Tool execution                         | `tools:execute`                                | `{ type: 'tool', id: toolName }`       |
-| Thread/memory access                   | `memory:read`, `memory:write`, `memory:delete` | `{ type: 'thread', id: threadId }`     |
-| MCP tool execution                     | `tools:execute`                                | `{ type: 'tool', id: toolName }`       |
-| HTTP resource routes                   | Configured per route                           | Configured per route                   |
+| Lifecycle point                                                  | Permission checked                              | Resource type        | Resource ID                                                         |
+| ---------------------------------------------------------------- | ----------------------------------------------- | -------------------- | ------------------------------------------------------------------- |
+| Agent execution (`generate`, `stream`)                           | `agents:execute`                                | `agent`              | `agentId`                                                           |
+| Built-in workflow HTTP execution routes and `Workflow.execute()` | `workflows:execute`                             | `workflow`           | `workflowId`                                                        |
+| Standalone tool execution                                        | `tools:execute`                                 | `tool`               | `toolName`                                                          |
+| Agent tool execution                                             | `tools:execute`                                 | `tool`               | `${agentId}:${toolName}`                                            |
+| MCP tool execution                                               | `tools:execute`                                 | `tool`               | `JSON.stringify([serverName, toolName])`                            |
+| Thread and memory access                                         | `memory:read`, `memory:write`, `memory:delete`  | `thread`             | `threadId`                                                          |
+| Stored resource routes                                           | Stored resource permission for the route action | Stored resource type | Route record ID, or the stored-resource scope for collection routes |
+| HTTP resource routes                                             | Configured per route                            | Configured per route | Configured per route                                                |
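A custom FGA provider or test may need to build or recognize the tool resource IDs listed in the table. The sketch below reproduces the two composite formats. The helper names are hypothetical and written for this example; only the ID shapes (`${agentId}:${toolName}` and `JSON.stringify([serverName, toolName])`) come from the table.

```typescript
// Hypothetical helpers that reproduce the resource ID formats in the table.
const agentToolResourceId = (agentId: string, toolName: string): string =>
  `${agentId}:${toolName}`

const mcpToolResourceId = (serverName: string, toolName: string): string =>
  JSON.stringify([serverName, toolName])

// Recover the parts of an MCP tool resource ID, or null if it is malformed.
function parseMcpToolResourceId(id: string): { serverName: string; toolName: string } | null {
  try {
    const parsed: unknown = JSON.parse(id)
    if (
      Array.isArray(parsed) &&
      parsed.length === 2 &&
      typeof parsed[0] === 'string' &&
      typeof parsed[1] === 'string'
    ) {
      return { serverName: parsed[0], toolName: parsed[1] }
    }
    return null
  } catch {
    return null
  }
}

console.log(agentToolResourceId('support-agent', 'lookupOrder'))
console.log(mcpToolResourceId('github', 'create_issue'))
```

One plausible benefit of the JSON-encoded form is that it stays unambiguous when a server or tool name itself contains a colon, which a plain `:` separator could not guarantee.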
+Direct SDK calls to `createRun().start()`, `resume()`, or `restart()` are not independently checked by core FGA in this release. Make those calls from a protected route or guard them in application code. Pass a `requestContext` with an authenticated user when invoking protected entry points directly.
-All checks are **no-ops when FGA is not configured**, maintaining backward compatibility.
+Core agent, internal workflow, tool, and memory checks also pass `requestContext` and action metadata to the FGA provider. Route checks pass `requestContext`. Thread checks pass the owning `resourceId` when available.
 ## Custom FGA provider

package/.docs/models/gateways/netlify.md CHANGED

@@ -1,6 +1,6 @@
 # Netlify
-Netlify AI Gateway provides unified access to multiple providers with built-in caching and observability. Access 68 models through Mastra's model router.
+Netlify AI Gateway provides unified access to multiple providers with built-in caching and observability. Access 69 models through Mastra's model router.
 Learn more in the [Netlify documentation](https://docs.netlify.com/build/ai-gateway/overview/).
@@ -61,6 +61,7 @@ ANTHROPIC_API_KEY=ant-...
 | `gemini/gemini-3.1-flash-lite-preview`      |
 | `gemini/gemini-3.1-pro-preview`             |
 | `gemini/gemini-3.1-pro-preview-customtools` |
+| `gemini/gemini-3.5-flash`                   |
 | `gemini/gemini-flash-latest`                |
 | `gemini/gemini-flash-lite-latest`           |
 | `openai/chat-latest`                        |

package/.docs/models/gateways/openrouter.md CHANGED

@@ -1,6 +1,6 @@
 # ![OpenRouter logo](https://models.dev/logos/openrouter.svg)OpenRouter
-OpenRouter aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 364 models through Mastra's model router.
+OpenRouter aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 356 models through Mastra's model router.
 Learn more in the [OpenRouter documentation](https://openrouter.ai/models).
@@ -143,7 +143,7 @@ ANTHROPIC_API_KEY=ant-...
 | `inception/mercury-2`                                           |
 | `inclusionai/ling-2.6-1t`                                       |
 | `inclusionai/ling-2.6-flash`                                    |
-| `inclusionai/ring-2.6-1t:free`                                  |
+| `inclusionai/ring-2.6-1t`                                       |
 | `inflection/inflection-3-pi`                                    |
 | `inflection/inflection-3-productivity`                          |
 | `kwaipilot/kat-coder-pro-v2`                                    |
@@ -369,17 +369,9 @@ ANTHROPIC_API_KEY=ant-...
 | `undi95/remm-slerp-l2-13b`                                      |
 | `upstage/solar-pro-3`                                           |
 | `writer/palmyra-x5`                                             |
-| `x-ai/grok-3`                                                   |
-| `x-ai/grok-3-beta`                                              |
-| `x-ai/grok-3-mini`                                              |
-| `x-ai/grok-3-mini-beta`                                         |
-| `x-ai/grok-4`                                                   |
-| `x-ai/grok-4-fast`                                              |
-| `x-ai/grok-4.1-fast`                                            |
 | `x-ai/grok-4.20`                                                |
 | `x-ai/grok-4.20-multi-agent`                                    |
 | `x-ai/grok-4.3`                                                 |
-| `x-ai/grok-code-fast-1`                                         |
 | `xiaomi/mimo-v2-flash`                                          |
 | `xiaomi/mimo-v2-omni`                                           |
 | `xiaomi/mimo-v2-pro`                                            |

package/.docs/models/gateways/vercel.md CHANGED

@@ -1,6 +1,6 @@
 # ![Vercel logo](https://models.dev/logos/vercel.svg)Vercel
-Vercel aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 247 models through Mastra's model router.
+Vercel aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 240 models through Mastra's model router.
 Learn more in the [Vercel documentation](https://ai-sdk.dev/providers/ai-sdk-providers).
@@ -245,12 +245,6 @@ ANTHROPIC_API_KEY=ant-...
 | `voyage/voyage-finance-2`                      |
 | `voyage/voyage-law-2`                          |
 | `xai/grok-2-vision`                            |
-| `xai/grok-3`                                   |
-| `xai/grok-3-fast`                              |
-| `xai/grok-3-mini`                              |
-| `xai/grok-3-mini-fast`                         |
-| `xai/grok-4`                                   |
-| `xai/grok-4-fast-non-reasoning`                |
 | `xai/grok-4-fast-reasoning`                    |
 | `xai/grok-4.1-fast-non-reasoning`              |
 | `xai/grok-4.1-fast-reasoning`                  |
@@ -261,7 +255,6 @@ ANTHROPIC_API_KEY=ant-...
 | `xai/grok-4.20-reasoning`                      |
 | `xai/grok-4.20-reasoning-beta`                 |
 | `xai/grok-4.3`                                 |
-| `xai/grok-code-fast-1`                         |
 | `xai/grok-imagine-image`                       |
 | `xai/grok-imagine-image-pro`                   |
 | `xiaomi/mimo-v2-flash`                         |

package/.docs/models/index.md CHANGED

@@ -1,6 +1,6 @@
 # Model Providers
-Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 4223 models from 119 providers through a single API.
+Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 4207 models from 121 providers through a single API.
 ## Features

package/.docs/models/providers/berget.md CHANGED

@@ -1,6 +1,6 @@
 # ![Berget.AI logo](https://models.dev/logos/berget.svg)Berget.AI
-Access 6 Berget.AI models through Mastra's model router. Authentication is handled automatically using the `BERGET_API_KEY` environment variable.
+Access 7 Berget.AI models through Mastra's model router. Authentication is handled automatically using the `BERGET_API_KEY` environment variable.
 Learn more in the [Berget.AI documentation](https://api.berget.ai).
@@ -38,6 +38,7 @@ for await (const chunk of stream) {
 | `berget/meta-llama/Llama-3.3-70B-Instruct`             | 128K    |       |           |       |       |       | $0.99      | $0.99       |
 | `berget/mistralai/Mistral-Medium-3.5-128B`             | 262K    |       |           |       |       |       | $2         | $6          |
 | `berget/mistralai/Mistral-Small-3.2-24B-Instruct-2506` | 32K     |       |           |       |       |       | $0.33      | $0.33       |
+| `berget/moonshotai/Kimi-K2.6`                          | 262K    |       |           |       |       |       | $0.83      | $4          |
 | `berget/openai/gpt-oss-120b`                           | 128K    |       |           |       |       |       | $0.44      | $0.99       |
 | `berget/zai-org/GLM-4.7`                               | 128K    |       |           |       |       |       | $0.77      | $3          |

package/.docs/models/providers/kilo.md CHANGED

@@ -1,6 +1,6 @@
 # ![Kilo Gateway logo](https://models.dev/logos/kilo.svg)Kilo Gateway
-Access 357 Kilo Gateway models through Mastra's model router. Authentication is handled automatically using the `KILO_API_KEY` environment variable.
+Access 344 Kilo Gateway models through Mastra's model router. Authentication is handled automatically using the `KILO_API_KEY` environment variable.
 Learn more in the [Kilo Gateway documentation](https://kilo.ai).
@@ -50,7 +50,6 @@ for await (const chunk of stream) {
 | `kilo/alfredpros/codellama-7b-instruct-solidity`          | 4K      |       |           |       |       |       | $0.80      | $1          |
 | `kilo/alibaba/tongyi-deepresearch-30b-a3b`                | 131K    |       |           |       |       |       | $0.09      | $0.45       |
 | `kilo/allenai/olmo-3-32b-think`                           | 66K     |       |           |       |       |       | $0.15      | $0.50       |
-| `kilo/alpindale/goliath-120b`                             | 6K      |       |           |       |       |       | $4         | $8          |
 | `kilo/amazon/nova-2-lite-v1`                              | 1.0M    |       |           |       |       |       | $0.30      | $3          |
 | `kilo/amazon/nova-lite-v1`                                | 300K    |       |           |       |       |       | $0.06      | $0.24       |
 | `kilo/amazon/nova-micro-v1`                               | 128K    |       |           |       |       |       | $0.04      | $0.14       |
@@ -59,8 +58,6 @@ for await (const chunk of stream) {
 | `kilo/anthracite-org/magnum-v4-72b`                       | 16K     |       |           |       |       |       | $3         | $5          |
 | `kilo/anthropic/claude-3-haiku`                           | 200K    |       |           |       |       |       | $0.25      | $1          |
 | `kilo/anthropic/claude-3.5-haiku`                         | 200K    |       |           |       |       |       | $0.80      | $4          |
-| `kilo/anthropic/claude-3.7-sonnet`                        | 200K    |       |           |       |       |       | $3         | $15         |
-| `kilo/anthropic/claude-3.7-sonnet:thinking`               | 200K    |       |           |       |       |       | $3         | $15         |
 | `kilo/anthropic/claude-haiku-4.5`                         | 200K    |       |           |       |       |       | $1         | $5          |
 | `kilo/anthropic/claude-opus-4`                            | 200K    |       |           |       |       |       | $15        | $75         |
 | `kilo/anthropic/claude-opus-4.1`                          | 200K    |       |           |       |       |       | $15        | $75         |
@@ -68,6 +65,7 @@ for await (const chunk of stream) {
 | `kilo/anthropic/claude-opus-4.6`                          | 1.0M    |       |           |       |       |       | $5         | $25         |
 | `kilo/anthropic/claude-opus-4.6-fast`                     | 1.0M    |       |           |       |       |       | $30        | $150        |
 | `kilo/anthropic/claude-opus-4.7`                          | 1.0M    |       |           |       |       |       | $5         | $25         |
+| `kilo/anthropic/claude-opus-4.7-fast`                     | 1.0M    |       |           |       |       |       | $30        | $150        |
 | `kilo/anthropic/claude-sonnet-4`                          | 200K    |       |           |       |       |       | $3         | $15         |
 | `kilo/anthropic/claude-sonnet-4.5`                        | 1.0M    |       |           |       |       |       | $3         | $15         |
 | `kilo/anthropic/claude-sonnet-4.6`                        | 1.0M    |       |           |       |       |       | $3         | $15         |
@@ -84,7 +82,7 @@ for await (const chunk of stream) {
 | `kilo/baidu/ernie-4.5-300b-a47b`                          | 123K    |       |           |       |       |       | $0.28      | $1          |
 | `kilo/baidu/ernie-4.5-vl-28b-a3b`                         | 30K     |       |           |       |       |       | $0.14      | $0.56       |
 | `kilo/baidu/ernie-4.5-vl-424b-a47b`                       | 123K    |       |           |       |       |       | $0.42      | $1          |
-| `kilo/baidu/qianfan-ocr-fast:free`                        | 66K     |       |           |       |       |       | —          | —           |
+| `kilo/baidu/qianfan-ocr-fast`                             | 66K     |       |           |       |       |       | $0.68      | $3          |
 | `kilo/bytedance-seed/seed-1.6`                            | 262K    |       |           |       |       |       | $0.25      | $2          |
 | `kilo/bytedance-seed/seed-1.6-flash`                      | 262K    |       |           |       |       |       | $0.07      | $0.30       |
 | `kilo/bytedance-seed/seed-2.0-lite`                       | 262K    |       |           |       |       |       | $0.25      | $2          |
@@ -107,6 +105,7 @@ for await (const chunk of stream) {
 | `kilo/deepseek/deepseek-v3.2-exp`                         | 164K    |       |           |       |       |       | $0.27      | $0.41       |
 | `kilo/deepseek/deepseek-v3.2-speciale`                    | 164K    |       |           |       |       |       | $0.40      | $1          |
 | `kilo/deepseek/deepseek-v4-flash`                         | 1.0M    |       |           |       |       |       | $0.14      | $0.28       |
+| `kilo/deepseek/deepseek-v4-flash:free`                    | 1.0M    |       |           |       |       |       | —          | —           |
 | `kilo/deepseek/deepseek-v4-pro`                           | 1.0M    |       |           |       |       |       | $0.43      | $0.87       |
 | `kilo/essentialai/rnj-1-instruct`                         | 33K     |       |           |       |       |       | $0.15      | $0.15       |
 | `kilo/google/gemini-2.0-flash-001`                        | 1.0M    |       |           |       |       |       | $0.10      | $0.40       |
@@ -121,6 +120,7 @@ for await (const chunk of stream) {
 | `kilo/google/gemini-3-flash-preview`                      | 1.0M    |       |           |       |       |       | $0.50      | $3          |
 | `kilo/google/gemini-3-pro-image-preview`                  | 66K     |       |           |       |       |       | $2         | $12         |
 | `kilo/google/gemini-3.1-flash-image-preview`              | 66K     |       |           |       |       |       | $0.50      | $3          |
+| `kilo/google/gemini-3.1-flash-lite`                       | 1.0M    |       |           |       |       |       | $0.25      | $2          |
 | `kilo/google/gemini-3.1-flash-lite-preview`               | 1.0M    |       |           |       |       |       | $0.25      | $2          |
 | `kilo/google/gemini-3.1-pro-preview`                      | 1.0M    |       |           |       |       |       | $2         | $12         |
 | `kilo/google/gemini-3.1-pro-preview-customtools`          | 1.0M    |       |           |       |       |       | $2         | $12         |
@@ -137,8 +137,9 @@ for await (const chunk of stream) {
 | `kilo/ibm-granite/granite-4.0-h-micro`                    | 131K    |       |           |       |       |       | $0.02      | $0.11       |
 | `kilo/ibm-granite/granite-4.1-8b`                         | 131K    |       |           |       |       |       | $0.05      | $0.10       |
 | `kilo/inception/mercury-2`                                | 128K    |       |           |       |       |       | $0.25      | $0.75       |
-| `kilo/inclusionai/ling-2.6-1t:free`                       | 262K    |       |           |       |       |       | —          | —           |
+| `kilo/inclusionai/ling-2.6-1t`                            | 262K    |       |           |       |       |       | $0.30      | $3          |
 | `kilo/inclusionai/ling-2.6-flash`                         | 262K    |       |           |       |       |       | $0.08      | $0.24       |
+| `kilo/inclusionai/ring-2.6-1t`                            | 262K    |       |           |       |       |       | $0.07      | $0.63       |
 | `kilo/inflection/inflection-3-pi`                         | 8K      |       |           |       |       |       | $3         | $10         |
 | `kilo/inflection/inflection-3-productivity`               | 8K      |       |           |       |       |       | $3         | $10         |
 | `kilo/kilo-auto/balanced`                                 | 205K    |       |           |       |       |       | $0.60      | $3          |
@@ -192,7 +193,6 @@ for await (const chunk of stream) {
 | `kilo/mistralai/mistral-small-3.1-24b-instruct`           | 128K    |       |           |       |       |       | $0.35      | $0.56       |
 | `kilo/mistralai/mistral-small-3.2-24b-instruct`           | 131K    |       |           |       |       |       | $0.06      | $0.18       |
 | `kilo/mistralai/mixtral-8x22b-instruct`                   | 66K     |       |           |       |       |       | $2         | $6          |
-| `kilo/mistralai/mixtral-8x7b-instruct`                    | 33K     |       |           |       |       |       | $0.54      | $0.54       |
 | `kilo/mistralai/pixtral-large-2411`                       | 131K    |       |           |       |       |       | $2         | $6          |
 | `kilo/mistralai/voxtral-small-24b-2507`                   | 32K     |       |           |       |       |       | $0.10      | $0.30       |
 | `kilo/moonshotai/kimi-k2`                                 | 131K    |       |           |       |       |       | $0.55      | $2          |
@@ -208,7 +208,6 @@ for await (const chunk of stream) {
 | `kilo/nousresearch/hermes-3-llama-3.1-70b`                | 131K    |       |           |       |       |       | $0.30      | $0.30       |
 | `kilo/nousresearch/hermes-4-405b`                         | 131K    |       |           |       |       |       | $1         | $3          |
 | `kilo/nousresearch/hermes-4-70b`                          | 131K    |       |           |       |       |       | $0.13      | $0.40       |
-| `kilo/nvidia/llama-3.1-nemotron-70b-instruct`             | 131K    |       |           |       |       |       | $1         | $1          |
 | `kilo/nvidia/llama-3.3-nemotron-super-49b-v1.5`           | 131K    |       |           |       |       |       | $0.10      | $0.40       |
 | `kilo/nvidia/nemotron-3-nano-30b-a3b`                     | 262K    |       |           |       |       |       | $0.05      | $0.20       |
 | `kilo/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free` | 256K    |       |           |       |       |       | —          | —           |
@@ -283,6 +282,7 @@ for await (const chunk of stream) {
 | `kilo/openrouter/free`                                    | 200K    |       |           |       |       |       | —          | —           |
 | `kilo/openrouter/owl-alpha`                               | 1.0M    |       |           |       |       |       | —          | —           |
 | `kilo/openrouter/pareto-code`                             | 200K    |       |           |       |       |       | —          | —           |
+| `kilo/perceptron/perceptron-mk1`                          | 33K     |       |           |       |       |       | $0.15      | $2          |
 | `kilo/perplexity/sonar`                                   | 127K    |       |           |       |       |       | $1         | $1          |
 | `kilo/perplexity/sonar-deep-research`                     | 128K    |       |           |       |       |       | $2         | $8          |
 | `kilo/perplexity/sonar-pro`                               | 200K    |       |           |       |       |       | $3         | $15         |
@@ -294,13 +294,9 @@ for await (const chunk of stream) {
 | `kilo/qwen/qwen-2.5-72b-instruct`                         | 33K     |       |           |       |       |       | $0.12      | $0.39       |
 | `kilo/qwen/qwen-2.5-7b-instruct`                          | 33K     |       |           |       |       |       | $0.04      | $0.10       |
 | `kilo/qwen/qwen-2.5-coder-32b-instruct`                   | 33K     |       |           |       |       |       | $0.20      | $0.20       |
-| `kilo/qwen/qwen-max`                                      | 33K     |       |           |       |       |       | $1         | $4          |
 | `kilo/qwen/qwen-plus`                                     | 1.0M    |       |           |       |       |       | $0.40      | $1          |
 | `kilo/qwen/qwen-plus-2025-07-28`                          | 1.0M    |       |           |       |       |       | $0.26      | $0.78       |
 | `kilo/qwen/qwen-plus-2025-07-28:thinking`                 | 1.0M    |       |           |       |       |       | $0.26      | $0.78       |
-| `kilo/qwen/qwen-turbo`                                    | 131K    |       |           |       |       |       | $0.03      | $0.13       |
-| `kilo/qwen/qwen-vl-max`                                   | 131K    |       |           |       |       |       | $0.80      | $3          |
-| `kilo/qwen/qwen-vl-plus`                                  | 131K    |       |           |       |       |       | $0.14      | $0.41       |
 | `kilo/qwen/qwen2.5-vl-72b-instruct`                       | 33K     |       |           |       |       |       | $0.80      | $0.80       |
 | `kilo/qwen/qwen3-14b`                                     | 41K     |       |           |       |       |       | $0.06      | $0.24       |
 | `kilo/qwen/qwen3-235b-a22b`                               | 131K    |       |           |       |       |       | $0.46      | $2          |
@@ -353,26 +349,17 @@ for await (const chunk of stream) {
 | `kilo/stepfun/step-3.5-flash:free`                        | 262K    |       |           |       |       |       | —          | —           |
 | `kilo/switchpoint/router`                                 | 131K    |       |           |       |       |       | $0.85      | $3          |
 | `kilo/tencent/hunyuan-a13b-instruct`                      | 131K    |       |           |       |       |       | $0.14      | $0.57       |
-| `kilo/tencent/hy3-preview:free`                           | 262K    |       |           |       |       |       | —          | —           |
+| `kilo/tencent/hy3-preview`                                | 262K    |       |           |       |       |       | $0.07      | $0.26       |
 | `kilo/thedrummer/cydonia-24b-v4.1`                        | 131K    |       |           |       |       |       | $0.30      | $0.50       |
 | `kilo/thedrummer/rocinante-12b`                           | 33K     |       |           |       |       |       | $0.17      | $0.43       |
 | `kilo/thedrummer/skyfall-36b-v2`                          | 33K     |       |           |       |       |       | $0.55      | $0.80       |
 | `kilo/thedrummer/unslopnemo-12b`                          | 33K     |       |           |       |       |       | $0.40      | $0.40       |
-| `kilo/tngtech/deepseek-r1t2-chimera`                      | 164K    |       |           |       |       |       | $0.25      | $0.85       |
 | `kilo/undi95/remm-slerp-l2-13b`                           | 6K      |       |           |       |       |       | $0.45      | $0.65       |
 | `kilo/upstage/solar-pro-3`                                | 128K    |       |           |       |       |       | $0.15      | $0.60       |
 | `kilo/writer/palmyra-x5`                                  | 1.0M    |       |           |       |       |       | $0.60      | $6          |
-| `kilo/x-ai/grok-3`                                        | 131K    |       |           |       |       |       | $3         | $15         |
-| `kilo/x-ai/grok-3-beta`                                   | 131K    |       |           |       |       |       | $3         | $15         |
-| `kilo/x-ai/grok-3-mini`                                   | 131K    |       |           |       |       |       | $0.30      | $0.50       |
-| `kilo/x-ai/grok-3-mini-beta`                              | 131K    |       |           |       |       |       | $0.30      | $0.50       |
-| `kilo/x-ai/grok-4`                                        | 256K    |       |           |       |       |       | $3         | $15         |
-| `kilo/x-ai/grok-4-fast`                                   | 2.0M    |       |           |       |       |       | $0.20      | $0.50       |
-| `kilo/x-ai/grok-4.1-fast`                                 | 2.0M    |       |           |       |       |       | $0.20      | $0.50       |
 | `kilo/x-ai/grok-4.20`                                     | 2.0M    |       |           |       |       |       | $2         | $6          |
 | `kilo/x-ai/grok-4.20-multi-agent`                         | 2.0M    |       |           |       |       |       | $2         | $6          |
 | `kilo/x-ai/grok-4.3`                                      | 1.0M    |       |           |       |       |       | $1         | $3          |
-| `kilo/x-ai/grok-code-fast-1`                              | 256K    |       |           |       |       |       | $0.20      | $2          |
 | `kilo/x-ai/grok-code-fast-1:optimized:free`               | 256K    |       |           |       |       |       | —          | —           |
 | `kilo/xiaomi/mimo-v2-flash`                               | 262K    |       |           |       |       |       | $0.09      | $0.29       |
 | `kilo/xiaomi/mimo-v2-omni`                                | 262K    |       |           |       |       |       | $0.40      | $2          |