@mastra/mcp-docs-server 1.1.17-alpha.7 → 1.1.17

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39)
  1. package/.docs/docs/evals/built-in-scorers.md +1 -0
  2. package/.docs/docs/memory/observational-memory.md +49 -4
  3. package/.docs/docs/server/mastra-client.md +17 -0
  4. package/.docs/docs/server/server-adapters.md +15 -1
  5. package/.docs/models/gateways/openrouter.md +1 -1
  6. package/.docs/models/index.md +1 -1
  7. package/.docs/models/providers/bailing.md +1 -1
  8. package/.docs/models/providers/cloudflare-workers-ai.md +4 -3
  9. package/.docs/models/providers/firmware.md +2 -2
  10. package/.docs/models/providers/friendli.md +1 -1
  11. package/.docs/models/providers/github-models.md +1 -1
  12. package/.docs/models/providers/google.md +7 -2
  13. package/.docs/models/providers/groq.md +24 -16
  14. package/.docs/models/providers/huggingface.md +1 -1
  15. package/.docs/models/providers/llmgateway.md +269 -0
  16. package/.docs/models/providers/mistral.md +3 -2
  17. package/.docs/models/providers/nano-gpt.md +3 -1
  18. package/.docs/models/providers/openai.md +2 -1
  19. package/.docs/models/providers/poe.md +3 -1
  20. package/.docs/models/providers/zai-coding-plan.md +3 -2
  21. package/.docs/models/providers/zhipuai-coding-plan.md +3 -2
  22. package/.docs/models/providers.md +1 -0
  23. package/.docs/reference/ai-sdk/handle-chat-stream.md +2 -0
  24. package/.docs/reference/client-js/agents.md +11 -6
  25. package/.docs/reference/client-js/mastra-client.md +1 -1
  26. package/.docs/reference/client-js/memory.md +1 -1
  27. package/.docs/reference/configuration.md +24 -0
  28. package/.docs/reference/core/mastra-model-gateway.md +2 -0
  29. package/.docs/reference/deployer/cloudflare.md +31 -1
  30. package/.docs/reference/evals/run-evals.md +78 -3
  31. package/.docs/reference/evals/scorer-utils.md +188 -0
  32. package/.docs/reference/evals/trajectory-accuracy.md +627 -0
  33. package/.docs/reference/index.md +1 -2
  34. package/.docs/reference/logging/pino-logger.md +58 -0
  35. package/.docs/reference/memory/observational-memory.md +32 -6
  36. package/CHANGELOG.md +44 -0
  37. package/package.json +6 -6
  38. package/.docs/reference/core/getStoredAgentById.md +0 -87
  39. package/.docs/reference/core/listStoredAgents.md +0 -91
@@ -18,6 +18,7 @@ These scorers evaluate how correct, truthful, and complete your agent's answers
  - [`content-similarity`](https://mastra.ai/reference/evals/content-similarity): Measures textual similarity using character-level matching (`0-1`, higher is better)
  - [`textual-difference`](https://mastra.ai/reference/evals/textual-difference): Measures textual differences between strings (`0-1`, higher means more similar)
  - [`tool-call-accuracy`](https://mastra.ai/reference/evals/tool-call-accuracy): Evaluates whether the LLM selects the correct tool from available options (`0-1`, higher is better)
+ - [`trajectory-accuracy`](https://mastra.ai/reference/evals/trajectory-accuracy): Evaluates whether an agent follows the expected sequence of actions (tool calls, model generations, workflow steps, and other span types) (`0-1`, higher is better)
  - [`prompt-alignment`](https://mastra.ai/reference/evals/prompt-alignment): Measures how well agent responses align with user prompt intent, requirements, completeness, and format (`0-1`, higher is better)
 
  ### Context quality
@@ -95,27 +95,72 @@ The result is a three-tier system:
 
  Normal OM compresses messages into observations, which is great for staying on task — but the original wording is gone. Retrieval mode fixes this by keeping each observation group linked to the raw messages that produced it. When the agent needs exact wording, tool output, or chronology that the summary compressed away, it can call a `recall` tool to page through the source messages.
 
+ #### Browsing only
+
+ Set `retrieval: true` to enable the recall tool for browsing raw messages. No vector store needed. By default, the recall tool can browse across all threads for the current resource.
+
  ```typescript
  const memory = new Memory({
    options: {
      observationalMemory: {
        model: 'google/gemini-2.5-flash',
-       scope: 'thread',
        retrieval: true,
      },
    },
  })
  ```
 
+ #### With semantic search
+
+ Set `retrieval: { vector: true }` to also enable semantic search. This reuses the vector store and embedder already configured on your Memory instance:
+
+ ```typescript
+ const memory = new Memory({
+   storage,
+   vector: myVectorStore,
+   embedder: myEmbedder,
+   options: {
+     observationalMemory: {
+       model: 'google/gemini-2.5-flash',
+       retrieval: { vector: true },
+     },
+   },
+ })
+ ```
+
+ When vector search is configured, new observation groups are automatically indexed at buffer time and during synchronous observation (fire-and-forget, non-blocking). Semantic search returns observation-group matches with their raw source message ID ranges, so the recall tool can show the summarized memory alongside where it came from.
+
+ #### Restricting to the current thread
+
+ By default, the recall tool scope is `'resource'` — the agent can list threads, browse other threads, and search across all conversations. Set `scope: 'thread'` to restrict the agent to only the current thread:
+
+ ```typescript
+ const memory = new Memory({
+   options: {
+     observationalMemory: {
+       model: 'google/gemini-2.5-flash',
+       retrieval: { vector: true, scope: 'thread' },
+     },
+   },
+ })
+ ```
+
+ #### What retrieval enables
+
  With retrieval mode enabled, OM:
 
  - Stores a `range` (e.g. `startId:endId`) on each observation group pointing to the messages it was derived from
+
  - Keeps range metadata visible in the agent's context so the agent knows which observations map to which messages
- - Registers a `recall` tool the agent can call to page through the raw messages behind any range
 
- Retrieval mode is only active for thread-scoped OM. Setting `retrieval: true` with `scope: 'resource'` has no effect — OM keeps resource-scoped behavior but skips retrieval-mode context and does not register the `recall` tool.
+ - Registers a `recall` tool the agent can call to:
+
+ - Page through the raw messages behind any observation group range
+ - Search by semantic similarity (`mode: "search"` with a `query` string) — requires `vector: true`
+ - List all threads (`mode: "threads"`), browse other threads (`threadId`), and search across all threads (default `scope: 'resource'`)
+ - When `scope: 'thread'`: restrict browsing and search to the current thread only
 
- See the [recall tool reference](https://mastra.ai/reference/memory/observational-memory) for the full API (detail levels, part indexing, pagination, and token limiting).
+ See the [recall tool reference](https://mastra.ai/reference/memory/observational-memory) for the full API (detail levels, part indexing, pagination, cross-thread browsing, and token limiting).
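The three `retrieval` shapes in this hunk (`true`, `{ vector: true }`, and `{ vector: true, scope: 'thread' }`) can be read as shorthand for one explicit settings object. The following is a minimal sketch of that reading and is not taken from either package version: `resolveRetrieval`, `RetrievalOption`, and `ResolvedRetrieval` are hypothetical names, and the only behavior assumed is what the added documentation states (browsing needs no vector store, and the recall scope defaults to `'resource'`).

```typescript
// Hypothetical types mirroring the option shapes shown in the hunk above.
type RecallScope = 'thread' | 'resource'
type RetrievalOption = boolean | { vector?: boolean; scope?: RecallScope }

interface ResolvedRetrieval {
  enabled: boolean
  vector: boolean
  scope: RecallScope
}

// Illustrative helper, not a Mastra API: expands the shorthand into explicit settings.
function resolveRetrieval(option?: RetrievalOption): ResolvedRetrieval {
  if (option === undefined || option === false) {
    // Retrieval mode is off; the remaining fields are placeholders.
    return { enabled: false, vector: false, scope: 'resource' }
  }
  if (option === true) {
    // Browsing only: no semantic search, default cross-thread scope.
    return { enabled: true, vector: false, scope: 'resource' }
  }
  // Object form: fill in the documented defaults for omitted fields.
  return {
    enabled: true,
    vector: option.vector ?? false,
    scope: option.scope ?? 'resource',
  }
}

console.log(resolveRetrieval(true))
console.log(resolveRetrieval({ vector: true, scope: 'thread' }))
```

Writing the defaults out this way makes the behavioral change in the hunk easier to see: the removed text tied retrieval to thread-scoped OM, while the added text makes `'resource'` the default recall scope and `'thread'` an opt-in restriction.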
 
  ## Models
 
@@ -133,6 +133,23 @@ export const mastraClient = new MastraClient({
 
  > **Info:** Visit [MastraClient](https://mastra.ai/reference/client-js/mastra-client) for more configuration options.
 
+ ## Credentials and session cookies
+
+ **Authenticate Mastra API calls with session cookies** when your UI and Mastra API are not on the same origin—different host, subdomain, or port (for example Mastra Studio on one port and a custom server on another). Add **`credentials: 'include'`** to `MastraClient` so each request carries the cookies the user already has after sign-in. Skip this and you will often get **`401`** responses from Mastra even though login succeeded in the browser.
+
+ ```typescript
+ import { MastraClient } from '@mastra/client-js'
+
+ export const mastraClient = new MastraClient({
+   baseUrl: process.env.MASTRA_API_URL || 'http://localhost:4111',
+   credentials: 'include',
+ })
+ ```
+
+ **Allow credentialed cross-origin requests on your server**—see [CORS: requests with credentials](https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CORS#requests_with_credentials). You need a concrete `Access-Control-Allow-Origin` (not `*`) and `Access-Control-Allow-Credentials: true`, or the browser will block the call before it reaches Mastra.
+
+ **Using `@mastra/react`?** Wrap your app with `MastraReactProvider`, set `baseUrl` and `apiPrefix` to match your server, and rely on the default `credentials: 'include'`. Change `credentials` only when you deliberately want `same-origin` or `omit` behavior.
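The server-side requirement in the added section (a concrete `Access-Control-Allow-Origin` plus `Access-Control-Allow-Credentials: true`) can be sketched as a small allow-list check. This is an illustration under stated assumptions and is not part of either package version: `credentialedCorsHeaders` is a hypothetical helper, the origins are placeholder values, and how the headers get attached to a response depends on the server framework in use.

```typescript
// Hypothetical helper, not part of Mastra or any library.
// Returns the CORS headers to attach for a credentialed cross-origin request,
// or an empty object when the requesting origin is not on the allow-list.
function credentialedCorsHeaders(
  requestOrigin: string | undefined,
  allowedOrigins: readonly string[],
): Record<string, string> {
  if (!requestOrigin || !allowedOrigins.includes(requestOrigin)) {
    return {}
  }
  return {
    // Echo the specific origin; browsers reject `*` when credentials are sent.
    'Access-Control-Allow-Origin': requestOrigin,
    'Access-Control-Allow-Credentials': 'true',
    // The response differs per origin, so caches must key on the Origin header.
    Vary: 'Origin',
  }
}

// Placeholder origins for a UI served on a different port than the API.
const allowed = ['http://localhost:3000']
console.log(credentialedCorsHeaders('http://localhost:3000', allowed))
console.log(credentialedCorsHeaders('http://evil.example', allowed))
```

Echoing the request origin only after an allow-list match keeps the header concrete, as the section requires, without opening the API to every site.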
+
  ## Adding request cancelling
 
  `MastraClient` supports request cancellation using the standard Node.js `AbortSignal` API. Useful for canceling in-flight requests, such as when users abort an operation or to clean up stale network calls.
@@ -521,7 +521,21 @@ The adapter registers routes for both HTTP and SSE (Server-Sent Events) transpor
 
  ### Serverless mode
 
- For serverless environments like Cloudflare Workers or Vercel Edge, enable stateless mode via `mcpOptions`:
+ For serverless environments like Cloudflare Workers or Vercel Edge, enable stateless mode via `mcpOptions`.
+
+ When using the Mastra deployer (the standard `mastra dev` / `mastra build` path), set `mcpOptions` in your server config:
+
+ ```typescript
+ const mastra = new Mastra({
+   server: {
+     mcpOptions: {
+       serverless: true,
+     },
+   },
+ })
+ ```
+
+ When manually creating a server adapter, pass `mcpOptions` directly:
 
  ```typescript
  const server = new MastraServer({
@@ -117,7 +117,7 @@ ANTHROPIC_API_KEY=ant-...
  | `nousresearch/hermes-4-70b` |
  | `nvidia/nemotron-3-nano-30b-a3b:free` |
  | `nvidia/nemotron-3-super-120b-a12b` |
- | `nvidia/nemotron-3-super-120b-a12b-free` |
+ | `nvidia/nemotron-3-super-120b-a12b:free` |
  | `nvidia/nemotron-nano-12b-v2-vl:free` |
  | `nvidia/nemotron-nano-9b-v2` |
  | `nvidia/nemotron-nano-9b-v2:free` |
@@ -1,6 +1,6 @@
  # Model Providers
 
- Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 3396 models from 94 providers through a single API.
+ Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 3616 models from 95 providers through a single API.
 
  ## Features
 
@@ -5,7 +5,7 @@ Access 2 Bailing models through Mastra's model router. Authentication is handled
  Learn more in the [Bailing documentation](https://alipaytbox.yuque.com/sxs0ba/ling/intro).
 
  ```bash
- BAILING_API_TOKEN=your-api-key
+ BAILING_API_TOKEN=your-api-token
  ```
 
  ```typescript
@@ -1,11 +1,12 @@
  # ![Cloudflare Workers AI logo](https://models.dev/logos/cloudflare-workers-ai.svg)Cloudflare Workers AI
 
- Access 42 Cloudflare Workers AI models through Mastra's model router. Authentication is handled automatically using the `CLOUDFLARE_ACCOUNT_ID` environment variable.
+ Access 42 Cloudflare Workers AI models through Mastra's model router. Authentication is handled automatically using the `CLOUDFLARE_API_KEY` environment variable. Configure `CLOUDFLARE_ACCOUNT_ID` as well.
 
  Learn more in the [Cloudflare Workers AI documentation](https://developers.cloudflare.com/workers-ai/models/).
 
  ```bash
- CLOUDFLARE_ACCOUNT_ID=your-api-key
+ CLOUDFLARE_ACCOUNT_ID=your-account-id
+ CLOUDFLARE_API_KEY=your-api-key
  ```
 
  ```typescript
@@ -88,7 +89,7 @@ const agent = new Agent({
    model: {
      url: "https://api.cloudflare.com/client/v4/accounts/${CLOUDFLARE_ACCOUNT_ID}/ai/v1",
      id: "cloudflare-workers-ai/@cf/ai4bharat/indictrans2-en-indic-1B",
-     apiKey: process.env.CLOUDFLARE_ACCOUNT_ID,
+     apiKey: process.env.CLOUDFLARE_API_KEY,
      headers: {
        "X-Custom-Header": "value"
      }
@@ -45,7 +45,6 @@ for await (const chunk of stream) {
  | `firmware/gemini-3-1-pro-preview` | 1.0M | | | | | | $2 | $12 |
  | `firmware/gemini-3-flash-preview` | 1.0M | | | | | | $0.50 | $3 |
  | `firmware/gemini-3-pro-preview` | 1.0M | | | | | | $2 | $12 |
- | `firmware/glm-5` | 198K | | | | | | $1 | $3 |
  | `firmware/gpt-4o` | 128K | | | | | | $3 | $10 |
  | `firmware/gpt-5-3-codex` | 400K | | | | | | $2 | $14 |
  | `firmware/gpt-5-4` | 272K | | | | | | $3 | $15 |
@@ -58,6 +57,7 @@ for await (const chunk of stream) {
  | `firmware/grok-code-fast-1` | 256K | | | | | | $0.20 | $2 |
  | `firmware/kimi-k2.5` | 256K | | | | | | $0.60 | $3 |
  | `firmware/minimax-m2-5` | 192K | | | | | | $0.30 | $1 |
+ | `firmware/zai-glm-5` | 198K | | | | | | $1 | $3 |
 
  ## Advanced configuration
 
@@ -87,7 +87,7 @@ const agent = new Agent({
    model: ({ requestContext }) => {
      const useAdvanced = requestContext.task === "complex";
      return useAdvanced
-       ? "firmware/minimax-m2-5"
+       ? "firmware/zai-glm-5"
        : "firmware/claude-haiku-4-5";
    }
  });
@@ -5,7 +5,7 @@ Access 7 Friendli models through Mastra's model router. Authentication is handle
  Learn more in the [Friendli documentation](https://friendli.ai/docs/guides/serverless_endpoints/introduction).
 
  ```bash
- FRIENDLI_TOKEN=your-api-key
+ FRIENDLI_TOKEN=your-api-token
  ```
 
  ```typescript
@@ -5,7 +5,7 @@ Access 55 GitHub Models models through Mastra's model router. Authentication is
  Learn more in the [GitHub Models documentation](https://docs.github.com/en/github-models).
 
  ```bash
- GITHUB_TOKEN=your-api-key
+ GITHUB_TOKEN=your-api-token
  ```
 
  ```typescript
@@ -1,6 +1,6 @@
  # ![Google logo](https://models.dev/logos/google.svg)Google
 
- Access 30 Google models through Mastra's model router. Authentication is handled automatically using the `GOOGLE_GENERATIVE_AI_API_KEY` environment variable.
+ Access 35 Google models through Mastra's model router. Authentication is handled automatically using the `GOOGLE_GENERATIVE_AI_API_KEY` environment variable.
 
  Learn more in the [Google documentation](https://ai.google.dev/gemini-api/docs/pricing).
 
@@ -62,6 +62,11 @@ for await (const chunk of stream) {
  | `google/gemini-flash-lite-latest` | 1.0M | | | | | | $0.10 | $0.40 |
  | `google/gemini-live-2.5-flash` | 128K | | | | | | $0.50 | $2 |
  | `google/gemini-live-2.5-flash-preview-native-audio` | 131K | | | | | | $0.50 | $2 |
+ | `google/gemma-3-12b-it` | 33K | | | | | | — | — |
+ | `google/gemma-3-27b-it` | 131K | | | | | | — | — |
+ | `google/gemma-3-4b-it` | 33K | | | | | | — | — |
+ | `google/gemma-3n-e2b-it` | 8K | | | | | | — | — |
+ | `google/gemma-3n-e4b-it` | 8K | | | | | | — | — |
 
  ## Advanced configuration
 
@@ -90,7 +95,7 @@ const agent = new Agent({
    model: ({ requestContext }) => {
      const useAdvanced = requestContext.task === "complex";
      return useAdvanced
-       ? "google/gemini-live-2.5-flash-preview-native-audio"
+       ? "google/gemma-3n-e4b-it"
        : "google/gemini-1.5-flash";
    }
  });
@@ -1,6 +1,6 @@
  # ![Groq logo](https://models.dev/logos/groq.svg)Groq
 
- Access 9 Groq models through Mastra's model router. Authentication is handled automatically using the `GROQ_API_KEY` environment variable.
+ Access 17 Groq models through Mastra's model router. Authentication is handled automatically using the `GROQ_API_KEY` environment variable.
 
  Learn more in the [Groq documentation](https://console.groq.com/docs/models).
 
@@ -15,7 +15,7 @@ const agent = new Agent({
    id: "my-agent",
    name: "My Agent",
    instructions: "You are a helpful assistant",
-   model: "groq/llama-3.1-8b-instant"
+   model: "groq/allam-2-7b"
  });
 
  // Generate a response
@@ -30,17 +30,25 @@ for await (const chunk of stream) {
 
  ## Models
 
- | Model | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
- | ---------------------------------------------------- | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
- | `groq/llama-3.1-8b-instant` | 131K | | | | | | $0.05 | $0.08 |
- | `groq/llama-3.3-70b-versatile` | 131K | | | | | | $0.59 | $0.79 |
- | `groq/meta-llama/llama-4-maverick-17b-128e-instruct` | 131K | | | | | | $0.20 | $0.60 |
- | `groq/meta-llama/llama-4-scout-17b-16e-instruct` | 131K | | | | | | $0.11 | $0.34 |
- | `groq/meta-llama/llama-guard-4-12b` | 131K | | | | | | $0.20 | $0.20 |
- | `groq/moonshotai/kimi-k2-instruct-0905` | 262K | | | | | | $1 | $3 |
- | `groq/openai/gpt-oss-120b` | 131K | | | | | | $0.15 | $0.60 |
- | `groq/openai/gpt-oss-20b` | 131K | | | | | | $0.07 | $0.30 |
- | `groq/qwen/qwen3-32b` | 131K | | | | | | $0.29 | $0.59 |
+ | Model | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
+ | ------------------------------------------------ | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
+ | `groq/allam-2-7b` | 4K | | | | | | | |
+ | `groq/canopylabs/orpheus-arabic-saudi` | 4K | | | | | | $40 | |
+ | `groq/canopylabs/orpheus-v1-english` | 4K | | | | | | | |
+ | `groq/groq/compound` | 131K | | | | | | | |
+ | `groq/groq/compound-mini` | 131K | | | | | | | |
+ | `groq/llama-3.1-8b-instant` | 131K | | | | | | $0.05 | $0.08 |
+ | `groq/llama-3.3-70b-versatile` | 131K | | | | | | $0.59 | $0.79 |
+ | `groq/meta-llama/llama-4-scout-17b-16e-instruct` | 131K | | | | | | $0.11 | $0.34 |
+ | `groq/meta-llama/llama-prompt-guard-2-22m` | 512 | | | | | | $0.03 | $0.03 |
+ | `groq/meta-llama/llama-prompt-guard-2-86m` | 512 | | | | | | $0.04 | $0.04 |
+ | `groq/moonshotai/kimi-k2-instruct-0905` | 262K | | | | | | $1 | $3 |
+ | `groq/openai/gpt-oss-120b` | 131K | | | | | | $0.15 | $0.60 |
+ | `groq/openai/gpt-oss-20b` | 131K | | | | | | $0.07 | $0.30 |
+ | `groq/openai/gpt-oss-safeguard-20b` | 131K | | | | | | $0.07 | $0.30 |
+ | `groq/qwen/qwen3-32b` | 131K | | | | | | $0.29 | $0.59 |
+ | `groq/whisper-large-v3` | 448 | | | | | | — | — |
+ | `groq/whisper-large-v3-turbo` | 448 | | | | | | — | — |
 
  ## Advanced configuration
 
@@ -52,7 +60,7 @@ const agent = new Agent({
    name: "custom-agent",
    model: {
      url: "https://api.groq.com/openai/v1",
-     id: "groq/llama-3.1-8b-instant",
+     id: "groq/allam-2-7b",
      apiKey: process.env.GROQ_API_KEY,
      headers: {
        "X-Custom-Header": "value"
@@ -70,8 +78,8 @@ const agent = new Agent({
    model: ({ requestContext }) => {
      const useAdvanced = requestContext.task === "complex";
      return useAdvanced
-       ? "groq/qwen/qwen3-32b"
-       : "groq/llama-3.1-8b-instant";
+       ? "groq/whisper-large-v3-turbo"
+       : "groq/allam-2-7b";
    }
  });
  ```
@@ -5,7 +5,7 @@ Access 20 Hugging Face models through Mastra's model router. Authentication is h
  Learn more in the [Hugging Face documentation](https://huggingface.co).
 
  ```bash
- HF_TOKEN=your-api-key
+ HF_TOKEN=your-api-token
  ```
 
  ```typescript