@mastra/mcp-docs-server 1.1.17-alpha.9 → 1.1.18-alpha.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.docs/docs/memory/observational-memory.md +49 -4
- package/.docs/docs/server/mastra-client.md +17 -0
- package/.docs/docs/server/server-adapters.md +15 -1
- package/.docs/models/gateways/netlify.md +65 -66
- package/.docs/models/gateways/openrouter.md +3 -2
- package/.docs/models/gateways/vercel.md +3 -1
- package/.docs/models/index.md +1 -1
- package/.docs/models/providers/bailing.md +1 -1
- package/.docs/models/providers/cloudflare-workers-ai.md +4 -3
- package/.docs/models/providers/firmware.md +2 -2
- package/.docs/models/providers/friendli.md +1 -1
- package/.docs/models/providers/github-models.md +1 -1
- package/.docs/models/providers/google.md +7 -2
- package/.docs/models/providers/groq.md +24 -16
- package/.docs/models/providers/huggingface.md +1 -1
- package/.docs/models/providers/mistral.md +3 -2
- package/.docs/models/providers/nano-gpt.md +3 -1
- package/.docs/models/providers/openai.md +2 -1
- package/.docs/models/providers/opencode.md +3 -2
- package/.docs/models/providers/poe.md +3 -1
- package/.docs/models/providers/vultr.md +11 -16
- package/.docs/models/providers/zai-coding-plan.md +3 -2
- package/.docs/models/providers/zenmux.md +2 -31
- package/.docs/models/providers/zhipuai-coding-plan.md +3 -2
- package/.docs/reference/ai-sdk/handle-chat-stream.md +2 -0
- package/.docs/reference/client-js/agents.md +11 -6
- package/.docs/reference/client-js/mastra-client.md +1 -1
- package/.docs/reference/client-js/memory.md +1 -1
- package/.docs/reference/configuration.md +24 -0
- package/.docs/reference/core/mastra-model-gateway.md +2 -0
- package/.docs/reference/deployer/cloudflare.md +31 -1
- package/.docs/reference/evals/scorer-utils.md +9 -5
- package/.docs/reference/evals/trajectory-accuracy.md +29 -15
- package/.docs/reference/index.md +0 -2
- package/.docs/reference/logging/pino-logger.md +58 -0
- package/.docs/reference/memory/observational-memory.md +32 -6
- package/CHANGELOG.md +44 -0
- package/package.json +6 -6
- package/.docs/reference/core/getStoredAgentById.md +0 -87
- package/.docs/reference/core/listStoredAgents.md +0 -91
@@ -95,27 +95,72 @@ The result is a three-tier system:
 
 Normal OM compresses messages into observations, which is great for staying on task — but the original wording is gone. Retrieval mode fixes this by keeping each observation group linked to the raw messages that produced it. When the agent needs exact wording, tool output, or chronology that the summary compressed away, it can call a `recall` tool to page through the source messages.
 
+#### Browsing only
+
+Set `retrieval: true` to enable the recall tool for browsing raw messages. No vector store needed. By default, the recall tool can browse across all threads for the current resource.
+
 ```typescript
 const memory = new Memory({
   options: {
     observationalMemory: {
       model: 'google/gemini-2.5-flash',
-      scope: 'thread',
       retrieval: true,
     },
   },
 })
 ```
 
+#### With semantic search
+
+Set `retrieval: { vector: true }` to also enable semantic search. This reuses the vector store and embedder already configured on your Memory instance:
+
+```typescript
+const memory = new Memory({
+  storage,
+  vector: myVectorStore,
+  embedder: myEmbedder,
+  options: {
+    observationalMemory: {
+      model: 'google/gemini-2.5-flash',
+      retrieval: { vector: true },
+    },
+  },
+})
+```
+
+When vector search is configured, new observation groups are automatically indexed at buffer time and during synchronous observation (fire-and-forget, non-blocking). Semantic search returns observation-group matches with their raw source message ID ranges, so the recall tool can show the summarized memory alongside where it came from.
+
+#### Restricting to the current thread
+
+By default, the recall tool scope is `'resource'` — the agent can list threads, browse other threads, and search across all conversations. Set `scope: 'thread'` to restrict the agent to only the current thread:
+
+```typescript
+const memory = new Memory({
+  options: {
+    observationalMemory: {
+      model: 'google/gemini-2.5-flash',
+      retrieval: { vector: true, scope: 'thread' },
+    },
+  },
+})
+```
+
+#### What retrieval enables
+
 With retrieval mode enabled, OM:
 
 - Stores a `range` (e.g. `startId:endId`) on each observation group pointing to the messages it was derived from
+
 - Keeps range metadata visible in the agent's context so the agent knows which observations map to which messages
-- Registers a `recall` tool the agent can call to page through the raw messages behind any range
 
-
+- Registers a `recall` tool the agent can call to:
+
+  - Page through the raw messages behind any observation group range
+  - Search by semantic similarity (`mode: "search"` with a `query` string) — requires `vector: true`
+  - List all threads (`mode: "threads"`), browse other threads (`threadId`), and search across all threads (default `scope: 'resource'`)
+  - When `scope: 'thread'`: restrict browsing and search to the current thread only
 
-See the [recall tool reference](https://mastra.ai/reference/memory/observational-memory) for the full API (detail levels, part indexing, pagination, and token limiting).
+See the [recall tool reference](https://mastra.ai/reference/memory/observational-memory) for the full API (detail levels, part indexing, pagination, cross-thread browsing, and token limiting).
 
 ## Models
 
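The `startId:endId` shape of an observation group's `range` comes straight from the docs above; the helper below is purely illustrative (not part of Mastra's API) and sketches how those two message-ID bounds could be pulled back out of the string.

```typescript
// Hypothetical helper, not Mastra API: split an observation group's
// `range` string ("startId:endId", per the docs above) into its bounds.
function parseRange(range: string): { startId: string; endId: string } {
  const [startId, endId] = range.split(":");
  return { startId, endId };
}

// Example: a range covering messages msg_001 through msg_042.
const bounds = parseRange("msg_001:msg_042");
console.log(bounds.startId, bounds.endId); // → msg_001 msg_042
```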
@@ -133,6 +133,23 @@ export const mastraClient = new MastraClient({
 
 > **Info:** Visit [MastraClient](https://mastra.ai/reference/client-js/mastra-client) for more configuration options.
 
+## Credentials and session cookies
+
+**Authenticate Mastra API calls with session cookies** when your UI and Mastra API are not on the same origin—different host, subdomain, or port (for example Mastra Studio on one port and a custom server on another). Add **`credentials: 'include'`** to `MastraClient` so each request carries the cookies the user already has after sign-in. Skip this and you will often get **`401`** responses from Mastra even though login succeeded in the browser.
+
+```typescript
+import { MastraClient } from '@mastra/client-js'
+
+export const mastraClient = new MastraClient({
+  baseUrl: process.env.MASTRA_API_URL || 'http://localhost:4111',
+  credentials: 'include',
+})
+```
+
+**Allow credentialed cross-origin requests on your server**—see [CORS: requests with credentials](https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CORS#requests_with_credentials). You need a concrete `Access-Control-Allow-Origin` (not `*`) and `Access-Control-Allow-Credentials: true`, or the browser will block the call before it reaches Mastra.
+
+**Using `@mastra/react`?** Wrap your app with `MastraReactProvider`, set `baseUrl` and `apiPrefix` to match your server, and rely on the default `credentials: 'include'`. Change `credentials` only when you deliberately want `same-origin` or `omit` behavior.
+
 ## Adding request cancelling
 
 `MastraClient` supports request cancellation using the standard Node.js `AbortSignal` API. Useful for canceling in-flight requests, such as when users abort an operation or to clean up stale network calls.
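The CORS requirement the docs describe (a concrete origin plus `Access-Control-Allow-Credentials: true`) can be sketched as a framework-agnostic helper. This is illustrative only, assuming a hypothetical per-server origin allowlist; it is not part of Mastra's server API.

```typescript
// Hypothetical sketch: the response headers a server must send so a browser
// allows a credentialed cross-origin call. `allowedOrigins` is an assumed
// allowlist, not a Mastra option.
function corsHeadersFor(
  requestOrigin: string,
  allowedOrigins: string[],
): Record<string, string> {
  // Unknown origins get no CORS headers, so the browser blocks the response.
  if (!allowedOrigins.includes(requestOrigin)) return {};
  return {
    // Must echo a concrete origin; "*" is rejected for credentialed requests.
    "Access-Control-Allow-Origin": requestOrigin,
    "Access-Control-Allow-Credentials": "true",
    // Lets caches key the response on the requesting origin.
    "Vary": "Origin",
  };
}

const headers = corsHeadersFor("http://localhost:3000", ["http://localhost:3000"]);
console.log(headers["Access-Control-Allow-Origin"]); // → http://localhost:3000
```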
@@ -521,7 +521,21 @@ The adapter registers routes for both HTTP and SSE (Server-Sent Events) transpor
 
 ### Serverless mode
 
-For serverless environments like Cloudflare Workers or Vercel Edge, enable stateless mode via `mcpOptions
+For serverless environments like Cloudflare Workers or Vercel Edge, enable stateless mode via `mcpOptions`.
+
+When using the Mastra deployer (the standard `mastra dev` / `mastra build` path), set `mcpOptions` in your server config:
+
+```typescript
+const mastra = new Mastra({
+  server: {
+    mcpOptions: {
+      serverless: true,
+    },
+  },
+})
+```
+
+When manually creating a server adapter, pass `mcpOptions` directly:
 
 ```typescript
 const server = new MastraServer({
@@ -1,6 +1,6 @@
 # Netlify
 
-Netlify AI Gateway provides unified access to multiple providers with built-in caching and observability. Access
+Netlify AI Gateway provides unified access to multiple providers with built-in caching and observability. Access 62 models through Mastra's model router.
 
 Learn more in the [Netlify documentation](https://docs.netlify.com/build/ai-gateway/overview/).
 
@@ -33,68 +33,67 @@ ANTHROPIC_API_KEY=ant-...
 
 ## Available models
 
-| Model
-|
-| `anthropic/claude-3-haiku-20240307`
-| `anthropic/claude-haiku-4-5`
-| `anthropic/claude-haiku-4-5-20251001`
-| `anthropic/claude-opus-4-1-20250805`
-| `anthropic/claude-opus-4-20250514`
-| `anthropic/claude-opus-4-5`
-| `anthropic/claude-opus-4-5-20251101`
-| `anthropic/claude-opus-4-6`
-| `anthropic/claude-sonnet-4-0`
-| `anthropic/claude-sonnet-4-20250514`
-| `anthropic/claude-sonnet-4-5`
-| `anthropic/claude-sonnet-4-5-20250929`
-| `anthropic/claude-sonnet-4-6`
-| `gemini/gemini-2.0-flash`
-| `gemini/gemini-2.0-flash-lite`
-| `gemini/gemini-2.5-flash`
-| `gemini/gemini-2.5-flash-image`
-| `gemini/gemini-2.5-flash-lite`
-| `gemini/gemini-2.5-
-| `gemini/gemini-
-| `gemini/gemini-3-
-| `gemini/gemini-3-
-| `gemini/gemini-3.1-flash-
-| `gemini/gemini-3.1-
-| `gemini/gemini-3.1-pro-preview`
-| `gemini/gemini-
-| `gemini/gemini-flash-latest`
-| `
-| `openai/gpt-4.1`
-| `openai/gpt-4.1-
-| `openai/gpt-
-| `openai/gpt-4o`
-| `openai/gpt-
-| `openai/gpt-5`
-| `openai/gpt-5-
-| `openai/gpt-5-
-| `openai/gpt-5-mini`
-| `openai/gpt-5-
-| `openai/gpt-5-
-| `openai/gpt-5
-| `openai/gpt-5.1`
-| `openai/gpt-5.1-
-| `openai/gpt-5.1-codex`
-| `openai/gpt-5.1-codex-
-| `openai/gpt-5.
-| `openai/gpt-5.2`
-| `openai/gpt-5.2-
-| `openai/gpt-5.2-
-| `openai/gpt-5.2-pro`
-| `openai/gpt-5.
-| `openai/gpt-5.3-
-| `openai/gpt-5.
-| `openai/gpt-5.4`
-| `openai/gpt-5.4-
-| `openai/gpt-5.4-mini`
-| `openai/gpt-5.4-
-| `openai/gpt-5.4-nano`
-| `openai/gpt-5.4-
-| `openai/gpt-5.4-pro`
-| `openai/
-| `openai/o3`
-| `openai/
-| `openai/o4-mini` |
+| Model |
+| ------------------------------------------- |
+| `anthropic/claude-3-haiku-20240307` |
+| `anthropic/claude-haiku-4-5` |
+| `anthropic/claude-haiku-4-5-20251001` |
+| `anthropic/claude-opus-4-1-20250805` |
+| `anthropic/claude-opus-4-20250514` |
+| `anthropic/claude-opus-4-5` |
+| `anthropic/claude-opus-4-5-20251101` |
+| `anthropic/claude-opus-4-6` |
+| `anthropic/claude-sonnet-4-0` |
+| `anthropic/claude-sonnet-4-20250514` |
+| `anthropic/claude-sonnet-4-5` |
+| `anthropic/claude-sonnet-4-5-20250929` |
+| `anthropic/claude-sonnet-4-6` |
+| `gemini/gemini-2.0-flash` |
+| `gemini/gemini-2.0-flash-lite` |
+| `gemini/gemini-2.5-flash` |
+| `gemini/gemini-2.5-flash-image` |
+| `gemini/gemini-2.5-flash-lite` |
+| `gemini/gemini-2.5-pro` |
+| `gemini/gemini-3-flash-preview` |
+| `gemini/gemini-3-pro-image-preview` |
+| `gemini/gemini-3.1-flash-image-preview` |
+| `gemini/gemini-3.1-flash-lite-preview` |
+| `gemini/gemini-3.1-pro-preview` |
+| `gemini/gemini-3.1-pro-preview-customtools` |
+| `gemini/gemini-flash-latest` |
+| `gemini/gemini-flash-lite-latest` |
+| `openai/gpt-4.1` |
+| `openai/gpt-4.1-mini` |
+| `openai/gpt-4.1-nano` |
+| `openai/gpt-4o` |
+| `openai/gpt-4o-mini` |
+| `openai/gpt-5` |
+| `openai/gpt-5-2025-08-07` |
+| `openai/gpt-5-codex` |
+| `openai/gpt-5-mini` |
+| `openai/gpt-5-mini-2025-08-07` |
+| `openai/gpt-5-nano` |
+| `openai/gpt-5-pro` |
+| `openai/gpt-5.1` |
+| `openai/gpt-5.1-2025-11-13` |
+| `openai/gpt-5.1-codex` |
+| `openai/gpt-5.1-codex-max` |
+| `openai/gpt-5.1-codex-mini` |
+| `openai/gpt-5.2` |
+| `openai/gpt-5.2-2025-12-11` |
+| `openai/gpt-5.2-codex` |
+| `openai/gpt-5.2-pro` |
+| `openai/gpt-5.2-pro-2025-12-11` |
+| `openai/gpt-5.3-chat-latest` |
+| `openai/gpt-5.3-codex` |
+| `openai/gpt-5.4` |
+| `openai/gpt-5.4-2026-03-05` |
+| `openai/gpt-5.4-mini` |
+| `openai/gpt-5.4-mini-2026-03-17` |
+| `openai/gpt-5.4-nano` |
+| `openai/gpt-5.4-nano-2026-03-17` |
+| `openai/gpt-5.4-pro` |
+| `openai/gpt-5.4-pro-2026-03-05` |
+| `openai/o3` |
+| `openai/o3-mini` |
+| `openai/o4-mini` |
@@ -1,6 +1,6 @@
 # OpenRouter
 
-OpenRouter aggregates models from multiple providers with enhanced features like rate limiting and failover. Access
+OpenRouter aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 166 models through Mastra's model router.
 
 Learn more in the [OpenRouter documentation](https://openrouter.ai/models).
 
@@ -117,7 +117,7 @@ ANTHROPIC_API_KEY=ant-...
 | `nousresearch/hermes-4-70b` |
 | `nvidia/nemotron-3-nano-30b-a3b:free` |
 | `nvidia/nemotron-3-super-120b-a12b` |
-| `nvidia/nemotron-3-super-120b-a12b
+| `nvidia/nemotron-3-super-120b-a12b:free` |
 | `nvidia/nemotron-nano-12b-v2-vl:free` |
 | `nvidia/nemotron-nano-9b-v2` |
 | `nvidia/nemotron-nano-9b-v2:free` |
@@ -172,6 +172,7 @@ ANTHROPIC_API_KEY=ant-...
 | `qwen/qwen3-next-80b-a3b-thinking` |
 | `qwen/qwen3.5-397b-a17b` |
 | `qwen/qwen3.5-plus-02-15` |
+| `qwen/qwen3.6-plus-preview:free` |
 | `sourceful/riverflow-v2-fast-preview` |
 | `sourceful/riverflow-v2-max-preview` |
 | `sourceful/riverflow-v2-standard-preview` |
@@ -1,6 +1,6 @@
 # Vercel
 
-Vercel aggregates models from multiple providers with enhanced features like rate limiting and failover. Access
+Vercel aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 226 models through Mastra's model router.
 
 Learn more in the [Vercel documentation](https://ai-sdk.dev/providers/ai-sdk-providers).
 
@@ -117,6 +117,7 @@ ANTHROPIC_API_KEY=ant-...
 | `inception/mercury-2` |
 | `inception/mercury-coder-small` |
 | `kwaipilot/kat-coder-pro-v1` |
+| `kwaipilot/kat-coder-pro-v2` |
 | `meituan/longcat-flash-chat` |
 | `meituan/longcat-flash-thinking` |
 | `meituan/longcat-flash-thinking-2601` |
@@ -162,6 +163,7 @@ ANTHROPIC_API_KEY=ant-...
 | `morph/morph-v3-fast` |
 | `morph/morph-v3-large` |
 | `nvidia/nemotron-3-nano-30b-a3b` |
+| `nvidia/nemotron-3-super-120b-a12b` |
 | `nvidia/nemotron-nano-12b-v2-vl` |
 | `nvidia/nemotron-nano-9b-v2` |
 | `openai/codex-mini` |
package/.docs/models/index.md
CHANGED

@@ -1,6 +1,6 @@
 # Model Providers
 
-Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to
+Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 3613 models from 95 providers through a single API.
 
 ## Features
 
@@ -5,7 +5,7 @@ Access 2 Bailing models through Mastra's model router. Authentication is handled
 Learn more in the [Bailing documentation](https://alipaytbox.yuque.com/sxs0ba/ling/intro).
 
 ```bash
-BAILING_API_TOKEN=your-api-
+BAILING_API_TOKEN=your-api-token
 ```
 
 ```typescript
@@ -1,11 +1,12 @@
 # Cloudflare Workers AI
 
-Access 42 Cloudflare Workers AI models through Mastra's model router. Authentication is handled automatically using the `
+Access 42 Cloudflare Workers AI models through Mastra's model router. Authentication is handled automatically using the `CLOUDFLARE_API_KEY` environment variable. Configure `CLOUDFLARE_ACCOUNT_ID` as well.
 
 Learn more in the [Cloudflare Workers AI documentation](https://developers.cloudflare.com/workers-ai/models/).
 
 ```bash
-CLOUDFLARE_ACCOUNT_ID=your-
+CLOUDFLARE_ACCOUNT_ID=your-account-id
+CLOUDFLARE_API_KEY=your-api-key
 ```
 
 ```typescript
@@ -88,7 +89,7 @@ const agent = new Agent({
   model: {
     url: "https://api.cloudflare.com/client/v4/accounts/${CLOUDFLARE_ACCOUNT_ID}/ai/v1",
     id: "cloudflare-workers-ai/@cf/ai4bharat/indictrans2-en-indic-1B",
-    apiKey: process.env.
+    apiKey: process.env.CLOUDFLARE_API_KEY,
     headers: {
       "X-Custom-Header": "value"
     }
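The `${CLOUDFLARE_ACCOUNT_ID}` placeholder in the `url` above is resolved from the environment. A minimal sketch of that substitution follows; the helper is illustrative only, not part of Mastra's API.

```typescript
// Illustrative only: build the Cloudflare Workers AI base URL shown in the
// hunk above from an account ID (Mastra resolves the placeholder itself).
function workersAiBaseUrl(accountId: string): string {
  return `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/v1`;
}

console.log(workersAiBaseUrl("abc123"));
// → https://api.cloudflare.com/client/v4/accounts/abc123/ai/v1
```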
@@ -45,7 +45,6 @@ for await (const chunk of stream) {
 | `firmware/gemini-3-1-pro-preview` | 1.0M | | | | | | $2 | $12 |
 | `firmware/gemini-3-flash-preview` | 1.0M | | | | | | $0.50 | $3 |
 | `firmware/gemini-3-pro-preview` | 1.0M | | | | | | $2 | $12 |
-| `firmware/glm-5` | 198K | | | | | | $1 | $3 |
 | `firmware/gpt-4o` | 128K | | | | | | $3 | $10 |
 | `firmware/gpt-5-3-codex` | 400K | | | | | | $2 | $14 |
 | `firmware/gpt-5-4` | 272K | | | | | | $3 | $15 |
|
|
|
58
57
|
| `firmware/grok-code-fast-1` | 256K | | | | | | $0.20 | $2 |
|
|
59
58
|
| `firmware/kimi-k2.5` | 256K | | | | | | $0.60 | $3 |
|
|
60
59
|
| `firmware/minimax-m2-5` | 192K | | | | | | $0.30 | $1 |
|
|
60
|
+
| `firmware/zai-glm-5` | 198K | | | | | | $1 | $3 |
|
|
61
61
|
|
|
62
62
|
## Advanced configuration
|
|
63
63
|
|
|
@@ -87,7 +87,7 @@ const agent = new Agent({
|
|
|
87
87
|
model: ({ requestContext }) => {
|
|
88
88
|
const useAdvanced = requestContext.task === "complex";
|
|
89
89
|
return useAdvanced
|
|
90
|
-
? "firmware/
|
|
90
|
+
? "firmware/zai-glm-5"
|
|
91
91
|
: "firmware/claude-haiku-4-5";
|
|
92
92
|
}
|
|
93
93
|
});
|
|
@@ -5,7 +5,7 @@ Access 7 Friendli models through Mastra's model router. Authentication is handle
 Learn more in the [Friendli documentation](https://friendli.ai/docs/guides/serverless_endpoints/introduction).
 
 ```bash
-FRIENDLI_TOKEN=your-api-
+FRIENDLI_TOKEN=your-api-token
 ```
 
 ```typescript
@@ -5,7 +5,7 @@ Access 55 GitHub Models models through Mastra's model router. Authentication is
 Learn more in the [GitHub Models documentation](https://docs.github.com/en/github-models).
 
 ```bash
-GITHUB_TOKEN=your-api-
+GITHUB_TOKEN=your-api-token
 ```
 
 ```typescript
@@ -1,6 +1,6 @@
 # Google
 
-Access
+Access 35 Google models through Mastra's model router. Authentication is handled automatically using the `GOOGLE_GENERATIVE_AI_API_KEY` environment variable.
 
 Learn more in the [Google documentation](https://ai.google.dev/gemini-api/docs/pricing).
 
@@ -62,6 +62,11 @@ for await (const chunk of stream) {
 | `google/gemini-flash-lite-latest` | 1.0M | | | | | | $0.10 | $0.40 |
 | `google/gemini-live-2.5-flash` | 128K | | | | | | $0.50 | $2 |
 | `google/gemini-live-2.5-flash-preview-native-audio` | 131K | | | | | | $0.50 | $2 |
+| `google/gemma-3-12b-it` | 33K | | | | | | — | — |
+| `google/gemma-3-27b-it` | 131K | | | | | | — | — |
+| `google/gemma-3-4b-it` | 33K | | | | | | — | — |
+| `google/gemma-3n-e2b-it` | 8K | | | | | | — | — |
+| `google/gemma-3n-e4b-it` | 8K | | | | | | — | — |
 
 ## Advanced configuration
 
@@ -90,7 +95,7 @@ const agent = new Agent({
   model: ({ requestContext }) => {
     const useAdvanced = requestContext.task === "complex";
     return useAdvanced
-      ? "google/
+      ? "google/gemma-3n-e4b-it"
       : "google/gemini-1.5-flash";
   }
 });
@@ -1,6 +1,6 @@
 # Groq
 
-Access
+Access 17 Groq models through Mastra's model router. Authentication is handled automatically using the `GROQ_API_KEY` environment variable.
 
 Learn more in the [Groq documentation](https://console.groq.com/docs/models).
 
@@ -15,7 +15,7 @@ const agent = new Agent({
   id: "my-agent",
   name: "My Agent",
   instructions: "You are a helpful assistant",
-  model: "groq/
+  model: "groq/allam-2-7b"
 });
 
 // Generate a response
@@ -30,17 +30,25 @@ for await (const chunk of stream) {
 
 ## Models
 
-| Model
-|
-| `groq/
-| `groq/
-| `groq/
-| `groq/
-| `groq/
-| `groq/
-| `groq/
-| `groq/
-| `groq/
+| Model | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
+| ------------------------------------------------ | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
+| `groq/allam-2-7b` | 4K | | | | | | — | — |
+| `groq/canopylabs/orpheus-arabic-saudi` | 4K | | | | | | $40 | — |
+| `groq/canopylabs/orpheus-v1-english` | 4K | | | | | | — | — |
+| `groq/groq/compound` | 131K | | | | | | — | — |
+| `groq/groq/compound-mini` | 131K | | | | | | — | — |
+| `groq/llama-3.1-8b-instant` | 131K | | | | | | $0.05 | $0.08 |
+| `groq/llama-3.3-70b-versatile` | 131K | | | | | | $0.59 | $0.79 |
+| `groq/meta-llama/llama-4-scout-17b-16e-instruct` | 131K | | | | | | $0.11 | $0.34 |
+| `groq/meta-llama/llama-prompt-guard-2-22m` | 512 | | | | | | $0.03 | $0.03 |
+| `groq/meta-llama/llama-prompt-guard-2-86m` | 512 | | | | | | $0.04 | $0.04 |
+| `groq/moonshotai/kimi-k2-instruct-0905` | 262K | | | | | | $1 | $3 |
+| `groq/openai/gpt-oss-120b` | 131K | | | | | | $0.15 | $0.60 |
+| `groq/openai/gpt-oss-20b` | 131K | | | | | | $0.07 | $0.30 |
+| `groq/openai/gpt-oss-safeguard-20b` | 131K | | | | | | $0.07 | $0.30 |
+| `groq/qwen/qwen3-32b` | 131K | | | | | | $0.29 | $0.59 |
+| `groq/whisper-large-v3` | 448 | | | | | | — | — |
+| `groq/whisper-large-v3-turbo` | 448 | | | | | | — | — |
 
 ## Advanced configuration
 
@@ -52,7 +60,7 @@ const agent = new Agent({
   name: "custom-agent",
   model: {
     url: "https://api.groq.com/openai/v1",
-    id: "groq/
+    id: "groq/allam-2-7b",
     apiKey: process.env.GROQ_API_KEY,
     headers: {
       "X-Custom-Header": "value"
@@ -70,8 +78,8 @@ const agent = new Agent({
   model: ({ requestContext }) => {
     const useAdvanced = requestContext.task === "complex";
     return useAdvanced
-      ? "groq/
-      : "groq/
+      ? "groq/whisper-large-v3-turbo"
+      : "groq/allam-2-7b";
   }
 });
 ```
@@ -1,6 +1,6 @@
 # Mistral
 
-Access
+Access 27 Mistral models through Mastra's model router. Authentication is handled automatically using the `MISTRAL_API_KEY` environment variable.
 
 Learn more in the [Mistral documentation](https://docs.mistral.ai/getting-started/models/).
 
@@ -52,7 +52,8 @@ for await (const chunk of stream) {
 | `mistral/mistral-medium-latest` | 128K | | | | | | $0.40 | $2 |
 | `mistral/mistral-nemo` | 128K | | | | | | $0.15 | $0.15 |
 | `mistral/mistral-small-2506` | 128K | | | | | | $0.10 | $0.30 |
-| `mistral/mistral-small-
+| `mistral/mistral-small-2603` | 256K | | | | | | $0.15 | $0.60 |
+| `mistral/mistral-small-latest` | 256K | | | | | | $0.15 | $0.60 |
 | `mistral/open-mistral-7b` | 8K | | | | | | $0.25 | $0.25 |
 | `mistral/open-mixtral-8x22b` | 64K | | | | | | $2 | $6 |
 | `mistral/open-mixtral-8x7b` | 32K | | | | | | $0.70 | $0.70 |
@@ -1,6 +1,6 @@
 # NanoGPT
 
-Access
+Access 519 NanoGPT models through Mastra's model router. Authentication is handled automatically using the `NANO_GPT_API_KEY` environment variable.
 
 Learn more in the [NanoGPT documentation](https://docs.nano-gpt.com).
 
@@ -551,6 +551,8 @@ for await (const chunk of stream) {
 | `nano-gpt/zai-org/glm-4.7-flash` | 200K | | | | | | $0.07 | $0.40 |
 | `nano-gpt/zai-org/glm-5` | 200K | | | | | | $0.30 | $3 |
 | `nano-gpt/zai-org/glm-5:thinking` | 200K | | | | | | $0.30 | $3 |
+| `nano-gpt/zai-org/glm-5.1` | 200K | | | | | | $0.30 | $3 |
+| `nano-gpt/zai-org/glm-5.1:thinking` | 200K | | | | | | $0.30 | $3 |
 
 ## Advanced configuration
 
@@ -1,6 +1,6 @@
 # OpenAI
 
-Access
+Access 47 OpenAI models through Mastra's model router. Authentication is handled automatically using the `OPENAI_API_KEY` environment variable.
 
 Learn more in the [OpenAI documentation](https://platform.openai.com/docs/models).
 
@@ -59,6 +59,7 @@ for await (const chunk of stream) {
 | `openai/gpt-5.2-chat-latest` | 128K | | | | | | $2 | $14 |
 | `openai/gpt-5.2-codex` | 400K | | | | | | $2 | $14 |
 | `openai/gpt-5.2-pro` | 400K | | | | | | $21 | $168 |
+| `openai/gpt-5.3-chat-latest` | 128K | | | | | | $2 | $14 |
 | `openai/gpt-5.3-codex` | 400K | | | | | | $2 | $14 |
 | `openai/gpt-5.3-codex-spark` | 128K | | | | | | $2 | $14 |
 | `openai/gpt-5.4` | 1.1M | | | | | | $3 | $15 |
@@ -1,6 +1,6 @@
 # OpenCode Zen
 
-Access
+Access 34 OpenCode Zen models through Mastra's model router. Authentication is handled automatically using the `OPENCODE_API_KEY` environment variable.
 
 Learn more in the [OpenCode Zen documentation](https://opencode.ai/docs/zen).
 
@@ -67,6 +67,7 @@ for await (const chunk of stream) {
 | `opencode/minimax-m2.5` | 205K | | | | | | $0.30 | $1 |
 | `opencode/minimax-m2.5-free` | 205K | | | | | | — | — |
 | `opencode/nemotron-3-super-free` | 1.0M | | | | | | — | — |
+| `opencode/qwen3.6-plus-free` | 1.0M | | | | | | — | — |
 
 ## Advanced configuration
 
@@ -96,7 +97,7 @@ const agent = new Agent({
   model: ({ requestContext }) => {
     const useAdvanced = requestContext.task === "complex";
     return useAdvanced
-      ? "opencode/
+      ? "opencode/qwen3.6-plus-free"
       : "opencode/big-pickle";
   }
 });
@@ -1,6 +1,6 @@
 # Poe
 
-Access
+Access 124 Poe models through Mastra's model router. Authentication is handled automatically using the `POE_API_KEY` environment variable.
 
 Learn more in the [Poe documentation](https://creator.poe.com/docs/external-applications/openai-compatible-api).
 
@@ -83,6 +83,7 @@ for await (const chunk of stream) {
 | `poe/ideogramai/ideogram-v2a` | 150 | | | | | | — | — |
 | `poe/ideogramai/ideogram-v2a-turbo` | 150 | | | | | | — | — |
 | `poe/lumalabs/ray2` | 5K | | | | | | — | — |
+| `poe/novita/deepseek-v3.2` | 128K | | | | | | $0.27 | $0.40 |
 | `poe/novita/glm-4.6` | — | | | | | | — | — |
 | `poe/novita/glm-4.6v` | 131K | | | | | | — | — |
 | `poe/novita/glm-4.7` | 205K | | | | | | — | — |
@@ -155,6 +156,7 @@ for await (const chunk of stream) {
 | `poe/xai/grok-4-fast-reasoning` | 2.0M | | | | | | $0.20 | $0.50 |
 | `poe/xai/grok-4.1-fast-non-reasoning` | 2.0M | | | | | | — | — |
 | `poe/xai/grok-4.1-fast-reasoning` | 2.0M | | | | | | — | — |
+| `poe/xai/grok-4.20-multi-agent` | 128K | | | | | | $2 | $6 |
 | `poe/xai/grok-code-fast-1` | 256K | | | | | | $0.20 | $2 |
 
 ## Advanced configuration