@mastra/mcp-docs-server 1.1.38 → 1.1.39-alpha.10
- package/.docs/docs/agents/a2a.md +115 -88
- package/.docs/docs/agents/acp.md +238 -0
- package/.docs/docs/agents/response-caching.md +2 -0
- package/.docs/docs/agents/signals.md +1 -1
- package/.docs/docs/build-with-ai/skills.md +28 -0
- package/.docs/docs/evals/evals-with-memory.md +146 -0
- package/.docs/docs/evals/running-in-ci.md +1 -0
- package/.docs/docs/server/auth/fga.md +55 -10
- package/.docs/models/gateways/openrouter.md +2 -10
- package/.docs/models/gateways/vercel.md +1 -8
- package/.docs/models/index.md +1 -1
- package/.docs/models/providers/berget.md +2 -1
- package/.docs/models/providers/kilo.md +9 -22
- package/.docs/models/providers/lilac.md +74 -0
- package/.docs/models/providers/llmgateway.md +1 -8
- package/.docs/models/providers/neuralwatt.md +3 -3
- package/.docs/models/providers/novita-ai.md +7 -7
- package/.docs/models/providers/opencode.md +1 -1
- package/.docs/models/providers/routing-run.md +107 -0
- package/.docs/models/providers/xai.md +4 -15
- package/.docs/models/providers/xpersona.md +3 -3
- package/.docs/models/providers.md +2 -0
- package/.docs/reference/agents/agent.md +1 -1
- package/.docs/reference/agents/channels.md +6 -0
- package/.docs/reference/client-js/agents.md +1 -1
- package/.docs/reference/configuration.md +10 -10
- package/.docs/reference/server/register-api-route.md +19 -0
- package/.docs/reference/storage/convex.md +74 -12
- package/.docs/reference/vectors/convex.md +129 -7
- package/CHANGELOG.md +35 -0
- package/package.json +6 -6
|
@@ -0,0 +1,146 @@
|
|
|
1
|
+
# Evals with memory
|
|
2
|
+
|
|
3
|
+
Agents that use memory in `thread` scope — including observational memory — require a thread ID at run time. When an eval invokes the agent without one, you'll see:
|
|
4
|
+
|
|
5
|
+
```text
|
|
6
|
+
ObservationalMemory (scope: 'thread') requires a threadId, but none was found in RequestContext or MessageList.
|
|
7
|
+
```
|
|
8
|
+
|
|
9
|
+
This page covers the three working patterns for running Mastra evals against memory-enabled agents, what each path supports, and which one to pick. A complete runnable repro for all three approaches lives in [`examples/evals-with-memory`](https://github.com/mastra-ai/mastra/tree/main/examples/evals-with-memory).
|
|
10
|
+
|
|
11
|
+
## When to use which approach
|
|
12
|
+
|
|
13
|
+
| Goal | Approach |
| ----------------------------------------------- | ----------------------------------------------------------------------------------------- |
| One shared conversation across every item | [`runEvals` with global `targetOptions.memory`](#shared-thread-with-runevals) |
| One independent thread per item, simple CI loop | [`runEvals` per item](#per-item-threads-with-runevals) |
| Per-item threads driven by a stored `Dataset` | [`dataset.startExperiment` with an inline task](#dataset-experiments-with-an-inline-task) |
|
|
18
|
+
|
|
19
|
+
Pre-seeding `RequestContext` with `MastraMemory` is **not** a supported way to drive memory into an agent. Thread resolution reads `args.memory.thread` — `RequestContext.MastraMemory` is populated by `prepare-memory-step` after the agent has already resolved its thread.
|
|
20
|
+
|
|
21
|
+
## Shared thread with `runEvals`
|
|
22
|
+
|
|
23
|
+
`runEvals` accepts `targetOptions`, which is forwarded to `agent.generate()`. Passing `memory: { thread, resource }` runs every data item against the same thread — useful for testing recall across a multi-turn conversation.
|
|
24
|
+
|
|
25
|
+
```typescript
import { runEvals } from '@mastra/core/evals'
import { supportAgent } from './support-agent'
import { recallScorer } from '../scorers/recall-scorer'

// Create the shared thread up front so thread-scoped memory has a record to read.
const memory = await supportAgent.getMemory()
await memory!.createThread({ threadId: 'eval-thread', resourceId: 'ci-user' })

const result = await runEvals({
  target: supportAgent,
  scorers: [recallScorer],
  // Forwarded to agent.generate(); every item runs against the same thread.
  targetOptions: {
    memory: { thread: 'eval-thread', resource: 'ci-user' },
  },
  data: [
    { input: 'My order number is 12345' },
    { input: 'What is my order number?', groundTruth: '12345' },
  ],
})
```
|
|
45
|
+
|
|
46
|
+
`targetOptions` is **global per call**. There is no per-item override on `RunEvalsDataItem` today.
|
|
47
|
+
|
|
48
|
+
## Per-item threads with `runEvals`
|
|
49
|
+
|
|
50
|
+
When each data item needs its own thread (the common CI shape), call `runEvals` once per item with a unique `targetOptions.memory` and aggregate the scores yourself.
|
|
51
|
+
|
|
52
|
+
```typescript
import { randomUUID } from 'node:crypto'
import { runEvals } from '@mastra/core/evals'
import { supportAgent } from './support-agent'
import { recallScorer } from '../scorers/recall-scorer'

const memory = await supportAgent.getMemory()
const resourceId = 'ci-user'

const items = [
  { input: 'Cats are mammals', groundTruth: 'mammals' },
  { input: 'Dogs are mammals too', groundTruth: 'mammals' },
]

// `runEvals` returns `{ scores: Record<string, number>; summary: { totalItems } }`.
const scores: number[] = []
for (const item of items) {
  // Give each item its own thread, created before the eval runs.
  const threadId = `eval-${randomUUID()}`
  await memory!.createThread({ threadId, resourceId, title: item.input })

  const result = await runEvals({
    target: supportAgent,
    scorers: [recallScorer],
    targetOptions: { memory: { thread: threadId, resource: resourceId } },
    data: [item],
  })

  scores.push(result.scores[recallScorer.id])
}

// Average the per-item scores.
const average = scores.reduce((a, b) => a + b, 0) / scores.length
```
|
|
84
|
+
|
|
85
|
+
> **Note:** Create the thread before running the eval. Observational memory in `thread` scope reads from a record that must already exist.
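When a run uses more than one scorer, the aggregation step in the loop above can be generalized. The following is a minimal sketch, assuming each `runEvals` result exposes `scores` as a `Record<string, number>`, as the comment in the example states. `EvalRunResult` and `averageScores` are hypothetical names for illustration, not part of Mastra.

```typescript
// Hypothetical shape: only the `scores` field of a `runEvals` result is modeled.
type EvalRunResult = { scores: Record<string, number> }

// Average each scorer's value across several single-item runs.
// A scorer that is missing from a run is skipped rather than counted as zero.
function averageScores(results: EvalRunResult[]): Record<string, number> {
  const sums: Record<string, number> = {}
  const counts: Record<string, number> = {}
  for (const result of results) {
    for (const scorerId of Object.keys(result.scores)) {
      sums[scorerId] = (sums[scorerId] ?? 0) + result.scores[scorerId]
      counts[scorerId] = (counts[scorerId] ?? 0) + 1
    }
  }
  const averages: Record<string, number> = {}
  for (const scorerId of Object.keys(sums)) {
    averages[scorerId] = sums[scorerId] / counts[scorerId]
  }
  return averages
}
```

Unlike dividing by `scores.length` directly, this returns an empty object for an empty result list instead of `NaN`.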
|
|
86
|
+
|
|
87
|
+
## Dataset experiments with an inline task
|
|
88
|
+
|
|
89
|
+
`dataset.startExperiment({ target: agent })` does **not** forward a `memory` option to the agent — only `requestContext`. To run a stored dataset against a memory-enabled agent, use an inline `task` function and stash `{ threadId, resourceId }` in each item's `metadata`. The scorer pipeline still runs as normal.
|
|
90
|
+
|
|
91
|
+
```typescript
import { randomUUID } from 'node:crypto'
import { mastra } from '../index'
import { supportAgent } from '../agents/support-agent'
import { recallScorer } from '../scorers/recall-scorer'

const memory = await supportAgent.getMemory()
const resourceId = 'ci-user'

const items = [
  { input: 'Cats are mammals', groundTruth: 'mammals', thread: `ds-${randomUUID()}` },
  { input: 'Dogs are mammals too', groundTruth: 'mammals', thread: `ds-${randomUUID()}` },
]

// Create every thread before the experiment runs.
for (const it of items) {
  await memory!.createThread({ threadId: it.thread, resourceId, title: it.input })
}

const dataset = await mastra.datasets.create({
  name: 'support-recall',
  description: 'Per-item memory via inline task + item metadata',
})

// Store each item's thread and resource IDs in its metadata.
await dataset.addItems({
  items: items.map(it => ({
    input: it.input,
    groundTruth: it.groundTruth,
    metadata: { threadId: it.thread, resourceId },
  })),
})

const summary = await dataset.startExperiment({
  scorers: [recallScorer],
  // The inline task reads the IDs back and passes them to the agent as `memory`.
  task: async ({ input, metadata }) => {
    const { threadId, resourceId: rid } = (metadata ?? {}) as {
      threadId: string
      resourceId: string
    }
    const result = await supportAgent.generate(input as string, {
      memory: { thread: threadId, resource: rid },
    })
    return result.text
  },
})
```
|
|
136
|
+
|
|
137
|
+
The inline `task` receives the item's `metadata`, so each row can drive its own thread without changing the agent or any scorer.
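The type cast inside the `task` trusts that every item carries both IDs. A minimal sketch of a runtime check that could stand in for the cast is shown here. `readThreadMetadata` and `ThreadMetadata` are hypothetical names, not Mastra APIs, and the sketch assumes `metadata` arrives as an untyped value.

```typescript
type ThreadMetadata = { threadId: string; resourceId: string }

// Validate item metadata before using it to address a memory thread.
function readThreadMetadata(metadata: unknown): ThreadMetadata {
  if (typeof metadata !== 'object' || metadata === null) {
    throw new Error('Dataset item is missing metadata')
  }
  const { threadId, resourceId } = metadata as Record<string, unknown>
  if (typeof threadId !== 'string' || threadId.length === 0) {
    throw new Error('Dataset item metadata needs a non-empty threadId')
  }
  if (typeof resourceId !== 'string' || resourceId.length === 0) {
    throw new Error('Dataset item metadata needs a non-empty resourceId')
  }
  return { threadId, resourceId }
}
```

Failing early with a named error is likely easier to diagnose in CI than a missing-`threadId` error raised later by the memory layer.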
|
|
138
|
+
|
|
139
|
+
> **Note:** Visit [runEvals reference](https://mastra.ai/reference/evals/run-evals) and [Dataset reference](https://mastra.ai/reference/datasets/dataset) for full configuration.
|
|
140
|
+
|
|
141
|
+
## Related
|
|
142
|
+
|
|
143
|
+
- [Running scorers in CI](https://mastra.ai/docs/evals/running-in-ci)
|
|
144
|
+
- [Running experiments](https://mastra.ai/docs/evals/datasets/running-experiments)
|
|
145
|
+
- [Observational memory](https://mastra.ai/docs/memory/observational-memory)
|
|
146
|
+
- [runEvals API reference](https://mastra.ai/reference/evals/run-evals)
|
|
@@ -121,4 +121,5 @@ describe('Weather Agent Tests', () => {
|
|
|
121
121
|
|
|
122
122
|
- Learn about [creating custom scorers](https://mastra.ai/docs/evals/custom-scorers)
|
|
123
123
|
- Explore [built-in scorers](https://mastra.ai/docs/evals/built-in-scorers)
|
|
124
|
+
- Run scorers against [memory-enabled agents](https://mastra.ai/docs/evals/evals-with-memory)
|
|
124
125
|
- Read the [runEvals API reference](https://mastra.ai/reference/evals/run-evals)
|
|
@@ -25,6 +25,7 @@ const mastra = new Mastra({
|
|
|
25
25
|
auth: new MastraAuthWorkos({
|
|
26
26
|
/* ... */
|
|
27
27
|
fetchMemberships: true,
|
|
28
|
+
mapUserToResourceId: user => user.teamId,
|
|
28
29
|
}),
|
|
29
30
|
fga: new MastraFGAWorkos({
|
|
30
31
|
resourceMapping: {
|
|
@@ -39,6 +40,9 @@ const mastra = new Mastra({
|
|
|
39
40
|
[MastraFGAPermissions.MEMORY_WRITE]: 'update',
|
|
40
41
|
},
|
|
41
42
|
}),
|
|
43
|
+
storedResources: {
|
|
44
|
+
scope: true,
|
|
45
|
+
},
|
|
42
46
|
},
|
|
43
47
|
});
|
|
44
48
|
```
|
|
@@ -47,6 +51,8 @@ When using `MastraFGAWorkos`, set `fetchMemberships: true` on `MastraAuthWorkos`
|
|
|
47
51
|
|
|
48
52
|
Use `thread` as the resource-mapping key for memory authorization. `MastraFGAWorkos` still accepts the legacy alias `memory`, but new configs should prefer `thread`.
|
|
49
53
|
|
|
54
|
+
When `server.fga` is configured, Mastra enforces FGA on protected actions. If a protected action has no authenticated user, Mastra denies it. If `server.fga` is not configured, these FGA checks are skipped and Mastra keeps the previous behavior.
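The rule in the paragraph above reduces to a three-way decision. The sketch below models it with hypothetical names (`decideFgaCheck`, `FgaDecision`). It is not Mastra's implementation, only an illustration of the stated behavior.

```typescript
type FgaDecision = 'skip' | 'deny' | 'check';

// fgaConfigured: whether `server.fga` is set.
// user: the authenticated user for the protected action, if any.
function decideFgaCheck(fgaConfigured: boolean, user: object | undefined): FgaDecision {
  if (!fgaConfigured) return 'skip'; // no provider: previous behavior, no FGA check
  if (user === undefined) return 'deny'; // protected action without an authenticated user
  return 'check'; // hand the decision to the FGA provider
}
```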
|
|
55
|
+
|
|
50
56
|
### Resource mapping
|
|
51
57
|
|
|
52
58
|
The `resourceMapping` tells Mastra how to resolve FGA resource types and IDs from request context. Keys are Mastra resource types, values define the FGA resource type and how to derive the ID:
|
|
@@ -67,6 +73,7 @@ resourceMapping: {
|
|
|
67
73
|
- `user` — the authenticated user
|
|
68
74
|
- `resourceId` — the owning Mastra resource ID when available (for example, a thread's `resourceId`)
|
|
69
75
|
- `requestContext` — the current request context for advanced tenant resolution
|
|
76
|
+
- `metadata` — provider-specific metadata for the attempted action
|
|
70
77
|
|
|
71
78
|
Return `undefined` from `deriveId()` to fall back to the original Mastra resource ID.
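The fallback can be sketched as a single nullish-coalescing step. The types below are hypothetical and simplified: only `deriveId`, `user`, and `resourceId` come from the text above, while `fgaResourceType`, `MappingEntry`, and `resolveFgaResourceId` are assumed names for illustration.

```typescript
// Hypothetical, simplified context: the real one also carries requestContext and metadata.
type DeriveIdContext = {
  user: { id: string; teamId?: string };
  resourceId?: string;
};

// `fgaResourceType` is an assumed field name for the mapped FGA resource type.
type MappingEntry = {
  fgaResourceType: string;
  deriveId: (ctx: DeriveIdContext) => string | undefined;
};

// Resolve the ID sent to the FGA provider, falling back to the original
// Mastra resource ID when `deriveId()` returns `undefined`.
function resolveFgaResourceId(entry: MappingEntry, ctx: DeriveIdContext, originalId: string): string {
  return entry.deriveId(ctx) ?? originalId;
}

// Example entry: authorize threads against the user's team when one is known.
const threadMapping: MappingEntry = {
  fgaResourceType: 'team',
  deriveId: ({ user }) => user.teamId,
};
```

With this entry, a user that has a `teamId` is checked against the team, and a user without one is checked against the thread ID itself.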
|
|
72
79
|
|
|
@@ -89,9 +96,43 @@ If no mapping exists for a permission, the original string is passed through.
|
|
|
89
96
|
|
|
90
97
|
Use `validatePermissions()` to validate the full set of permissions Mastra may emit at startup. Use this when a provider requires every Mastra permission to have an explicit provider permission slug.
|
|
91
98
|
|
|
99
|
+
### Stored resource scoping
|
|
100
|
+
|
|
101
|
+
FGA authorizes access to a resource. It does not automatically filter stored records that live in shared storage. Enable stored resource scoping when the built-in stored resource APIs are used in a multi-tenant app.
|
|
102
|
+
|
|
103
|
+
```typescript
const mastra = new Mastra({
  server: {
    auth: new MastraAuthWorkos({
      /* ... */
      mapUserToResourceId: user => user.teamId,
    }),
    storedResources: {
      scope: true,
    },
  },
});
```
|
|
116
|
+
|
|
117
|
+
With `scope: true`, Mastra reads `MASTRA_RESOURCE_ID_KEY` from the request context. `mapUserToResourceId()` sets this value after authentication. Stored resource handlers persist the scope in record metadata and filter list, read, update, publish, and delete operations by that scope.
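The persist-then-filter behavior can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions, not Mastra's storage layer: `ScopedStore` and `StoredRecord` are hypothetical, and only create, list, and read are modeled.

```typescript
type StoredRecord = { id: string; metadata: Record<string, string> };

// In-memory stand-in for shared storage that several tenants write to.
class ScopedStore {
  private records: StoredRecord[] = [];

  // metadataKey names the metadata field that holds the scope value.
  constructor(private metadataKey: string) {}

  // Persist the caller's scope in the record metadata.
  create(id: string, scope: string): StoredRecord {
    const record: StoredRecord = { id, metadata: { [this.metadataKey]: scope } };
    this.records.push(record);
    return record;
  }

  // List only the records that belong to the caller's scope.
  list(scope: string): StoredRecord[] {
    return this.records.filter(r => r.metadata[this.metadataKey] === scope);
  }

  // In this sketch, a read outside the caller's scope returns undefined.
  get(id: string, scope: string): StoredRecord | undefined {
    return this.list(scope).filter(r => r.id === id)[0];
  }
}
```

Two tenants writing to the same `ScopedStore` each see only their own records, which is the gap that an authorization check alone does not close.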
|
|
118
|
+
|
|
119
|
+
Use an object when the scope needs custom request logic:
|
|
120
|
+
|
|
121
|
+
```typescript
storedResources: {
  scope: {
    metadataKey: 'teamId',
    resolve: ({ user }) => user.teamId,
    requireScope: true,
  },
},
```
|
|
130
|
+
|
|
131
|
+
If `requireScope` is `true` or omitted, scoped stored resource routes fail when no scope can be resolved.
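The default can be sketched as follows. `ScopeConfig` mirrors the option names shown above, while `resolveScopeOrFail`, `ScopeContext`, and the error message are hypothetical. Returning `undefined` when `requireScope` is `false` is an assumption of this sketch, since the text only describes the failing case.

```typescript
type ScopeContext = { user?: { teamId?: string } };

type ScopeConfig = {
  metadataKey: string;
  resolve: (ctx: ScopeContext) => string | undefined;
  requireScope?: boolean;
};

// Resolve the scope for a request, failing when a scope is required but absent.
function resolveScopeOrFail(config: ScopeConfig, ctx: ScopeContext): string | undefined {
  const scope = config.resolve(ctx);
  const required = config.requireScope ?? true; // omitted behaves like `true`
  if (scope === undefined && required) {
    throw new Error('No stored-resource scope could be resolved');
  }
  return scope;
}

// Example config with `requireScope` omitted.
const teamScope: ScopeConfig = {
  metadataKey: 'teamId',
  resolve: ({ user }) => user?.teamId,
};
```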
|
|
132
|
+
|
|
92
133
|
### Route policy coverage
|
|
93
134
|
|
|
94
|
-
Mastra includes route-level FGA metadata for built-in resource routes, including agents, workflows, tools, MCP tools, memory threads, responses, and
|
|
135
|
+
Mastra includes route-level FGA metadata for built-in resource routes, including agents, workflows, tools, MCP tools, memory threads, responses, conversations, and stored resources. Stored resource route coverage includes `/stored/agents`, `/stored/mcp-clients`, `/stored/prompt-blocks`, `/stored/scorers`, `/stored/skills`, and `/stored/workspaces`. A route is checked when it has route-level `fga` metadata, when Mastra can derive built-in metadata for that route, or when the provider supplies metadata with `resolveRouteFGA()`.
|
|
95
136
|
|
|
96
137
|
To deny protected routes that do not resolve FGA metadata, configure route policy coverage on the FGA provider:
|
|
97
138
|
|
|
@@ -159,16 +200,20 @@ const fga = new MastraFGAWorkos({
|
|
|
159
200
|
|
|
160
201
|
When an FGA provider is configured, Mastra automatically checks authorization at these lifecycle points:
|
|
161
202
|
|
|
162
|
-
| Lifecycle point
|
|
163
|
-
|
|
|
164
|
-
| Agent execution (`generate`, `stream`)
|
|
165
|
-
| Workflow
|
|
166
|
-
|
|
|
167
|
-
|
|
|
168
|
-
| MCP tool execution
|
|
169
|
-
|
|
|
203
|
+
| Lifecycle point | Permission checked | Resource type | Resource ID |
| ---------------------------------------------------------------- | ----------------------------------------------- | -------------------- | ------------------------------------------------------------------- |
| Agent execution (`generate`, `stream`) | `agents:execute` | `agent` | `agentId` |
| Built-in workflow HTTP execution routes and `Workflow.execute()` | `workflows:execute` | `workflow` | `workflowId` |
| Standalone tool execution | `tools:execute` | `tool` | `toolName` |
| Agent tool execution | `tools:execute` | `tool` | `${agentId}:${toolName}` |
| MCP tool execution | `tools:execute` | `tool` | `JSON.stringify([serverName, toolName])` |
| Thread and memory access | `memory:read`, `memory:write`, `memory:delete` | `thread` | `threadId` |
| Stored resource routes | Stored resource permission for the route action | Stored resource type | Route record ID, or the stored-resource scope for collection routes |
| HTTP resource routes | Configured per route | Configured per route | Configured per route |
|
|
213
|
+
|
|
214
|
+
Direct SDK calls to `createRun().start()`, `resume()`, or `restart()` are not independently checked by core FGA in this release. Make those calls from a protected route or guard them in application code. Pass a `requestContext` with an authenticated user when invoking protected entry points directly.
|
|
170
215
|
|
|
171
|
-
|
|
216
|
+
Core agent, internal workflow, tool, and memory checks also pass `requestContext` and action metadata to the FGA provider. Route checks pass `requestContext`. Thread checks pass the owning `resourceId` when available.
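When writing provider-side rules, it can help to reproduce the two composite resource ID formats from the lifecycle table. The helper names below are hypothetical; only the ID formats come from the table.

```typescript
// Agent tool execution: `${agentId}:${toolName}`
function agentToolResourceId(agentId: string, toolName: string): string {
  return `${agentId}:${toolName}`;
}

// MCP tool execution: JSON.stringify([serverName, toolName])
function mcpToolResourceId(serverName: string, toolName: string): string {
  return JSON.stringify([serverName, toolName]);
}

console.log(agentToolResourceId('support', 'lookupOrder')); // support:lookupOrder
console.log(mcpToolResourceId('github', 'create_issue')); // ["github","create_issue"]
```

The JSON array form can be parsed back into its two parts even when a server or tool name contains a colon, which may be why it is used for MCP tools.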
|
|
172
217
|
|
|
173
218
|
## Custom FGA provider
|
|
174
219
|
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# OpenRouter
|
|
2
2
|
|
|
3
|
-
OpenRouter aggregates models from multiple providers with enhanced features like rate limiting and failover. Access
|
|
3
|
+
OpenRouter aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 356 models through Mastra's model router.
|
|
4
4
|
|
|
5
5
|
Learn more in the [OpenRouter documentation](https://openrouter.ai/models).
|
|
6
6
|
|
|
@@ -143,7 +143,7 @@ ANTHROPIC_API_KEY=ant-...
|
|
|
143
143
|
| `inception/mercury-2` |
|
|
144
144
|
| `inclusionai/ling-2.6-1t` |
|
|
145
145
|
| `inclusionai/ling-2.6-flash` |
|
|
146
|
-
| `inclusionai/ring-2.6-1t
|
|
146
|
+
| `inclusionai/ring-2.6-1t` |
|
|
147
147
|
| `inflection/inflection-3-pi` |
|
|
148
148
|
| `inflection/inflection-3-productivity` |
|
|
149
149
|
| `kwaipilot/kat-coder-pro-v2` |
|
|
@@ -369,17 +369,9 @@ ANTHROPIC_API_KEY=ant-...
|
|
|
369
369
|
| `undi95/remm-slerp-l2-13b` |
|
|
370
370
|
| `upstage/solar-pro-3` |
|
|
371
371
|
| `writer/palmyra-x5` |
|
|
372
|
-
| `x-ai/grok-3` |
|
|
373
|
-
| `x-ai/grok-3-beta` |
|
|
374
|
-
| `x-ai/grok-3-mini` |
|
|
375
|
-
| `x-ai/grok-3-mini-beta` |
|
|
376
|
-
| `x-ai/grok-4` |
|
|
377
|
-
| `x-ai/grok-4-fast` |
|
|
378
|
-
| `x-ai/grok-4.1-fast` |
|
|
379
372
|
| `x-ai/grok-4.20` |
|
|
380
373
|
| `x-ai/grok-4.20-multi-agent` |
|
|
381
374
|
| `x-ai/grok-4.3` |
|
|
382
|
-
| `x-ai/grok-code-fast-1` |
|
|
383
375
|
| `xiaomi/mimo-v2-flash` |
|
|
384
376
|
| `xiaomi/mimo-v2-omni` |
|
|
385
377
|
| `xiaomi/mimo-v2-pro` |
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Vercel
|
|
2
2
|
|
|
3
|
-
Vercel aggregates models from multiple providers with enhanced features like rate limiting and failover. Access
|
|
3
|
+
Vercel aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 240 models through Mastra's model router.
|
|
4
4
|
|
|
5
5
|
Learn more in the [Vercel documentation](https://ai-sdk.dev/providers/ai-sdk-providers).
|
|
6
6
|
|
|
@@ -245,12 +245,6 @@ ANTHROPIC_API_KEY=ant-...
|
|
|
245
245
|
| `voyage/voyage-finance-2` |
|
|
246
246
|
| `voyage/voyage-law-2` |
|
|
247
247
|
| `xai/grok-2-vision` |
|
|
248
|
-
| `xai/grok-3` |
|
|
249
|
-
| `xai/grok-3-fast` |
|
|
250
|
-
| `xai/grok-3-mini` |
|
|
251
|
-
| `xai/grok-3-mini-fast` |
|
|
252
|
-
| `xai/grok-4` |
|
|
253
|
-
| `xai/grok-4-fast-non-reasoning` |
|
|
254
248
|
| `xai/grok-4-fast-reasoning` |
|
|
255
249
|
| `xai/grok-4.1-fast-non-reasoning` |
|
|
256
250
|
| `xai/grok-4.1-fast-reasoning` |
|
|
@@ -261,7 +255,6 @@ ANTHROPIC_API_KEY=ant-...
|
|
|
261
255
|
| `xai/grok-4.20-reasoning` |
|
|
262
256
|
| `xai/grok-4.20-reasoning-beta` |
|
|
263
257
|
| `xai/grok-4.3` |
|
|
264
|
-
| `xai/grok-code-fast-1` |
|
|
265
258
|
| `xai/grok-imagine-image` |
|
|
266
259
|
| `xai/grok-imagine-image-pro` |
|
|
267
260
|
| `xiaomi/mimo-v2-flash` |
|
package/.docs/models/index.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Model Providers
|
|
2
2
|
|
|
3
|
-
Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to
|
|
3
|
+
Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 4219 models from 121 providers through a single API.
|
|
4
4
|
|
|
5
5
|
## Features
|
|
6
6
|
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Berget.AI
|
|
2
2
|
|
|
3
|
-
Access
|
|
3
|
+
Access 7 Berget.AI models through Mastra's model router. Authentication is handled automatically using the `BERGET_API_KEY` environment variable.
|
|
4
4
|
|
|
5
5
|
Learn more in the [Berget.AI documentation](https://api.berget.ai).
|
|
6
6
|
|
|
@@ -38,6 +38,7 @@ for await (const chunk of stream) {
|
|
|
38
38
|
| `berget/meta-llama/Llama-3.3-70B-Instruct` | 128K | | | | | | $0.99 | $0.99 |
|
|
39
39
|
| `berget/mistralai/Mistral-Medium-3.5-128B` | 262K | | | | | | $2 | $6 |
|
|
40
40
|
| `berget/mistralai/Mistral-Small-3.2-24B-Instruct-2506` | 32K | | | | | | $0.33 | $0.33 |
|
|
41
|
+
| `berget/moonshotai/Kimi-K2.6` | 262K | | | | | | $0.83 | $4 |
|
|
41
42
|
| `berget/openai/gpt-oss-120b` | 128K | | | | | | $0.44 | $0.99 |
|
|
42
43
|
| `berget/zai-org/GLM-4.7` | 128K | | | | | | $0.77 | $3 |
|
|
43
44
|
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Kilo Gateway
|
|
2
2
|
|
|
3
|
-
Access
|
|
3
|
+
Access 344 Kilo Gateway models through Mastra's model router. Authentication is handled automatically using the `KILO_API_KEY` environment variable.
|
|
4
4
|
|
|
5
5
|
Learn more in the [Kilo Gateway documentation](https://kilo.ai).
|
|
6
6
|
|
|
@@ -50,7 +50,6 @@ for await (const chunk of stream) {
|
|
|
50
50
|
| `kilo/alfredpros/codellama-7b-instruct-solidity` | 4K | | | | | | $0.80 | $1 |
|
|
51
51
|
| `kilo/alibaba/tongyi-deepresearch-30b-a3b` | 131K | | | | | | $0.09 | $0.45 |
|
|
52
52
|
| `kilo/allenai/olmo-3-32b-think` | 66K | | | | | | $0.15 | $0.50 |
|
|
53
|
-
| `kilo/alpindale/goliath-120b` | 6K | | | | | | $4 | $8 |
|
|
54
53
|
| `kilo/amazon/nova-2-lite-v1` | 1.0M | | | | | | $0.30 | $3 |
|
|
55
54
|
| `kilo/amazon/nova-lite-v1` | 300K | | | | | | $0.06 | $0.24 |
|
|
56
55
|
| `kilo/amazon/nova-micro-v1` | 128K | | | | | | $0.04 | $0.14 |
|
|
@@ -59,8 +58,6 @@ for await (const chunk of stream) {
|
|
|
59
58
|
| `kilo/anthracite-org/magnum-v4-72b` | 16K | | | | | | $3 | $5 |
|
|
60
59
|
| `kilo/anthropic/claude-3-haiku` | 200K | | | | | | $0.25 | $1 |
|
|
61
60
|
| `kilo/anthropic/claude-3.5-haiku` | 200K | | | | | | $0.80 | $4 |
|
|
62
|
-
| `kilo/anthropic/claude-3.7-sonnet` | 200K | | | | | | $3 | $15 |
|
|
63
|
-
| `kilo/anthropic/claude-3.7-sonnet:thinking` | 200K | | | | | | $3 | $15 |
|
|
64
61
|
| `kilo/anthropic/claude-haiku-4.5` | 200K | | | | | | $1 | $5 |
|
|
65
62
|
| `kilo/anthropic/claude-opus-4` | 200K | | | | | | $15 | $75 |
|
|
66
63
|
| `kilo/anthropic/claude-opus-4.1` | 200K | | | | | | $15 | $75 |
|
|
@@ -68,6 +65,7 @@ for await (const chunk of stream) {
|
|
|
68
65
|
| `kilo/anthropic/claude-opus-4.6` | 1.0M | | | | | | $5 | $25 |
|
|
69
66
|
| `kilo/anthropic/claude-opus-4.6-fast` | 1.0M | | | | | | $30 | $150 |
|
|
70
67
|
| `kilo/anthropic/claude-opus-4.7` | 1.0M | | | | | | $5 | $25 |
|
|
68
|
+
| `kilo/anthropic/claude-opus-4.7-fast` | 1.0M | | | | | | $30 | $150 |
|
|
71
69
|
| `kilo/anthropic/claude-sonnet-4` | 200K | | | | | | $3 | $15 |
|
|
72
70
|
| `kilo/anthropic/claude-sonnet-4.5` | 1.0M | | | | | | $3 | $15 |
|
|
73
71
|
| `kilo/anthropic/claude-sonnet-4.6` | 1.0M | | | | | | $3 | $15 |
|
|
@@ -84,7 +82,7 @@ for await (const chunk of stream) {
|
|
|
84
82
|
| `kilo/baidu/ernie-4.5-300b-a47b` | 123K | | | | | | $0.28 | $1 |
|
|
85
83
|
| `kilo/baidu/ernie-4.5-vl-28b-a3b` | 30K | | | | | | $0.14 | $0.56 |
|
|
86
84
|
| `kilo/baidu/ernie-4.5-vl-424b-a47b` | 123K | | | | | | $0.42 | $1 |
|
|
87
|
-
| `kilo/baidu/qianfan-ocr-fast
|
|
85
|
+
| `kilo/baidu/qianfan-ocr-fast` | 66K | | | | | | $0.68 | $3 |
|
|
88
86
|
| `kilo/bytedance-seed/seed-1.6` | 262K | | | | | | $0.25 | $2 |
|
|
89
87
|
| `kilo/bytedance-seed/seed-1.6-flash` | 262K | | | | | | $0.07 | $0.30 |
|
|
90
88
|
| `kilo/bytedance-seed/seed-2.0-lite` | 262K | | | | | | $0.25 | $2 |
|
|
@@ -107,6 +105,7 @@ for await (const chunk of stream) {
|
|
|
107
105
|
| `kilo/deepseek/deepseek-v3.2-exp` | 164K | | | | | | $0.27 | $0.41 |
|
|
108
106
|
| `kilo/deepseek/deepseek-v3.2-speciale` | 164K | | | | | | $0.40 | $1 |
|
|
109
107
|
| `kilo/deepseek/deepseek-v4-flash` | 1.0M | | | | | | $0.14 | $0.28 |
|
|
108
|
+
| `kilo/deepseek/deepseek-v4-flash:free` | 1.0M | | | | | | — | — |
|
|
110
109
|
| `kilo/deepseek/deepseek-v4-pro` | 1.0M | | | | | | $0.43 | $0.87 |
|
|
111
110
|
| `kilo/essentialai/rnj-1-instruct` | 33K | | | | | | $0.15 | $0.15 |
|
|
112
111
|
| `kilo/google/gemini-2.0-flash-001` | 1.0M | | | | | | $0.10 | $0.40 |
|
|
@@ -121,6 +120,7 @@ for await (const chunk of stream) {
|
|
|
121
120
|
| `kilo/google/gemini-3-flash-preview` | 1.0M | | | | | | $0.50 | $3 |
|
|
122
121
|
| `kilo/google/gemini-3-pro-image-preview` | 66K | | | | | | $2 | $12 |
|
|
123
122
|
| `kilo/google/gemini-3.1-flash-image-preview` | 66K | | | | | | $0.50 | $3 |
|
|
123
|
+
| `kilo/google/gemini-3.1-flash-lite` | 1.0M | | | | | | $0.25 | $2 |
|
|
124
124
|
| `kilo/google/gemini-3.1-flash-lite-preview` | 1.0M | | | | | | $0.25 | $2 |
|
|
125
125
|
| `kilo/google/gemini-3.1-pro-preview` | 1.0M | | | | | | $2 | $12 |
|
|
126
126
|
| `kilo/google/gemini-3.1-pro-preview-customtools` | 1.0M | | | | | | $2 | $12 |
|
|
@@ -137,8 +137,9 @@ for await (const chunk of stream) {
|
|
|
137
137
|
| `kilo/ibm-granite/granite-4.0-h-micro` | 131K | | | | | | $0.02 | $0.11 |
|
|
138
138
|
| `kilo/ibm-granite/granite-4.1-8b` | 131K | | | | | | $0.05 | $0.10 |
|
|
139
139
|
| `kilo/inception/mercury-2` | 128K | | | | | | $0.25 | $0.75 |
|
|
140
|
-
| `kilo/inclusionai/ling-2.6-1t
|
|
140
|
+
| `kilo/inclusionai/ling-2.6-1t` | 262K | | | | | | $0.30 | $3 |
|
|
141
141
|
| `kilo/inclusionai/ling-2.6-flash` | 262K | | | | | | $0.08 | $0.24 |
|
|
142
|
+
| `kilo/inclusionai/ring-2.6-1t` | 262K | | | | | | $0.07 | $0.63 |
|
|
142
143
|
| `kilo/inflection/inflection-3-pi` | 8K | | | | | | $3 | $10 |
|
|
143
144
|
| `kilo/inflection/inflection-3-productivity` | 8K | | | | | | $3 | $10 |
|
|
144
145
|
| `kilo/kilo-auto/balanced` | 205K | | | | | | $0.60 | $3 |
|
|
@@ -192,7 +193,6 @@ for await (const chunk of stream) {
|
|
|
192
193
|
| `kilo/mistralai/mistral-small-3.1-24b-instruct` | 128K | | | | | | $0.35 | $0.56 |
|
|
193
194
|
| `kilo/mistralai/mistral-small-3.2-24b-instruct` | 131K | | | | | | $0.06 | $0.18 |
|
|
194
195
|
| `kilo/mistralai/mixtral-8x22b-instruct` | 66K | | | | | | $2 | $6 |
|
|
195
|
-
| `kilo/mistralai/mixtral-8x7b-instruct` | 33K | | | | | | $0.54 | $0.54 |
|
|
196
196
|
| `kilo/mistralai/pixtral-large-2411` | 131K | | | | | | $2 | $6 |
|
|
197
197
|
| `kilo/mistralai/voxtral-small-24b-2507` | 32K | | | | | | $0.10 | $0.30 |
|
|
198
198
|
| `kilo/moonshotai/kimi-k2` | 131K | | | | | | $0.55 | $2 |
|
|
@@ -208,7 +208,6 @@ for await (const chunk of stream) {
|
|
|
208
208
|
| `kilo/nousresearch/hermes-3-llama-3.1-70b` | 131K | | | | | | $0.30 | $0.30 |
|
|
209
209
|
| `kilo/nousresearch/hermes-4-405b` | 131K | | | | | | $1 | $3 |
|
|
210
210
|
| `kilo/nousresearch/hermes-4-70b` | 131K | | | | | | $0.13 | $0.40 |
|
|
211
|
-
| `kilo/nvidia/llama-3.1-nemotron-70b-instruct` | 131K | | | | | | $1 | $1 |
|
|
212
211
|
| `kilo/nvidia/llama-3.3-nemotron-super-49b-v1.5` | 131K | | | | | | $0.10 | $0.40 |
|
|
213
212
|
| `kilo/nvidia/nemotron-3-nano-30b-a3b` | 262K | | | | | | $0.05 | $0.20 |
|
|
214
213
|
| `kilo/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free` | 256K | | | | | | — | — |
|
|
@@ -283,6 +282,7 @@ for await (const chunk of stream) {
|
|
|
283
282
|
| `kilo/openrouter/free` | 200K | | | | | | — | — |
|
|
284
283
|
| `kilo/openrouter/owl-alpha` | 1.0M | | | | | | — | — |
|
|
285
284
|
| `kilo/openrouter/pareto-code` | 200K | | | | | | — | — |
|
|
285
|
+
| `kilo/perceptron/perceptron-mk1` | 33K | | | | | | $0.15 | $2 |
|
|
286
286
|
| `kilo/perplexity/sonar` | 127K | | | | | | $1 | $1 |
|
|
287
287
|
| `kilo/perplexity/sonar-deep-research` | 128K | | | | | | $2 | $8 |
|
|
288
288
|
| `kilo/perplexity/sonar-pro` | 200K | | | | | | $3 | $15 |
|
|
@@ -294,13 +294,9 @@ for await (const chunk of stream) {
|
|
|
294
294
|
| `kilo/qwen/qwen-2.5-72b-instruct` | 33K | | | | | | $0.12 | $0.39 |
|
|
295
295
|
| `kilo/qwen/qwen-2.5-7b-instruct` | 33K | | | | | | $0.04 | $0.10 |
|
|
296
296
|
| `kilo/qwen/qwen-2.5-coder-32b-instruct` | 33K | | | | | | $0.20 | $0.20 |
|
|
297
|
-
| `kilo/qwen/qwen-max` | 33K | | | | | | $1 | $4 |
|
|
298
297
|
| `kilo/qwen/qwen-plus` | 1.0M | | | | | | $0.40 | $1 |
|
|
299
298
|
| `kilo/qwen/qwen-plus-2025-07-28` | 1.0M | | | | | | $0.26 | $0.78 |
|
|
300
299
|
| `kilo/qwen/qwen-plus-2025-07-28:thinking` | 1.0M | | | | | | $0.26 | $0.78 |
|
|
301
|
-
| `kilo/qwen/qwen-turbo` | 131K | | | | | | $0.03 | $0.13 |
|
|
302
|
-
| `kilo/qwen/qwen-vl-max` | 131K | | | | | | $0.80 | $3 |
|
|
303
|
-
| `kilo/qwen/qwen-vl-plus` | 131K | | | | | | $0.14 | $0.41 |
|
|
304
300
|
| `kilo/qwen/qwen2.5-vl-72b-instruct` | 33K | | | | | | $0.80 | $0.80 |
|
|
305
301
|
| `kilo/qwen/qwen3-14b` | 41K | | | | | | $0.06 | $0.24 |
|
|
306
302
|
| `kilo/qwen/qwen3-235b-a22b` | 131K | | | | | | $0.46 | $2 |
|
|
@@ -353,26 +349,17 @@ for await (const chunk of stream) {
|
|
|
353
349
|
| `kilo/stepfun/step-3.5-flash:free` | 262K | | | | | | — | — |
|
|
354
350
|
| `kilo/switchpoint/router` | 131K | | | | | | $0.85 | $3 |
|
|
355
351
|
| `kilo/tencent/hunyuan-a13b-instruct` | 131K | | | | | | $0.14 | $0.57 |
|
|
356
|
-
| `kilo/tencent/hy3-preview
|
|
352
|
+
| `kilo/tencent/hy3-preview` | 262K | | | | | | $0.07 | $0.26 |
|
|
357
353
|
| `kilo/thedrummer/cydonia-24b-v4.1` | 131K | | | | | | $0.30 | $0.50 |
|
|
358
354
|
| `kilo/thedrummer/rocinante-12b` | 33K | | | | | | $0.17 | $0.43 |
|
|
359
355
|
| `kilo/thedrummer/skyfall-36b-v2` | 33K | | | | | | $0.55 | $0.80 |
|
|
360
356
|
| `kilo/thedrummer/unslopnemo-12b` | 33K | | | | | | $0.40 | $0.40 |
|
|
361
|
-
| `kilo/tngtech/deepseek-r1t2-chimera` | 164K | | | | | | $0.25 | $0.85 |
|
|
362
357
|
| `kilo/undi95/remm-slerp-l2-13b` | 6K | | | | | | $0.45 | $0.65 |
|
|
363
358
|
| `kilo/upstage/solar-pro-3` | 128K | | | | | | $0.15 | $0.60 |
|
|
364
359
|
| `kilo/writer/palmyra-x5` | 1.0M | | | | | | $0.60 | $6 |
|
|
365
|
-
| `kilo/x-ai/grok-3` | 131K | | | | | | $3 | $15 |
|
|
366
|
-
| `kilo/x-ai/grok-3-beta` | 131K | | | | | | $3 | $15 |
|
|
367
|
-
| `kilo/x-ai/grok-3-mini` | 131K | | | | | | $0.30 | $0.50 |
|
|
368
|
-
| `kilo/x-ai/grok-3-mini-beta` | 131K | | | | | | $0.30 | $0.50 |
|
|
369
|
-
| `kilo/x-ai/grok-4` | 256K | | | | | | $3 | $15 |
|
|
370
|
-
| `kilo/x-ai/grok-4-fast` | 2.0M | | | | | | $0.20 | $0.50 |
|
|
371
|
-
| `kilo/x-ai/grok-4.1-fast` | 2.0M | | | | | | $0.20 | $0.50 |
|
|
372
360
|
| `kilo/x-ai/grok-4.20` | 2.0M | | | | | | $2 | $6 |
|
|
373
361
|
| `kilo/x-ai/grok-4.20-multi-agent` | 2.0M | | | | | | $2 | $6 |
|
|
374
362
|
| `kilo/x-ai/grok-4.3` | 1.0M | | | | | | $1 | $3 |
|
|
375
|
-
| `kilo/x-ai/grok-code-fast-1` | 256K | | | | | | $0.20 | $2 |
|
|
376
363
|
| `kilo/x-ai/grok-code-fast-1:optimized:free` | 256K | | | | | | — | — |
|
|
377
364
|
| `kilo/xiaomi/mimo-v2-flash` | 262K | | | | | | $0.09 | $0.29 |
|
|
378
365
|
| `kilo/xiaomi/mimo-v2-omni` | 262K | | | | | | $0.40 | $2 |
|
|
@@ -0,0 +1,74 @@
|
|
|
1
|
+
# Lilac
|
|
2
|
+
|
|
3
|
+
Access 4 Lilac models through Mastra's model router. Authentication is handled automatically using the `LILAC_API_KEY` environment variable.
|
|
4
|
+
|
|
5
|
+
Learn more in the [Lilac documentation](https://docs.getlilac.com).
|
|
6
|
+
|
|
7
|
+
```bash
LILAC_API_KEY=your-api-key
```
|
|
10
|
+
|
|
11
|
+
```typescript
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  id: "my-agent",
  name: "My Agent",
  instructions: "You are a helpful assistant",
  model: "lilac/google/gemma-4-31b-it"
});

// Generate a response
const response = await agent.generate("Hello!");

// Stream a response
const stream = await agent.stream("Tell me a story");
for await (const chunk of stream) {
  console.log(chunk);
}
```
|
|
30
|
+
|
|
31
|
+
> **Info:** Mastra uses the OpenAI-compatible `/chat/completions` endpoint. Some provider-specific features may not be available. Check the [Lilac documentation](https://docs.getlilac.com) for details.
|
|
32
|
+
|
|
33
|
+
## Models
|
|
34
|
+
|
|
35
|
+
| Model | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
|
|
36
|
+
| ------------------------------ | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
|
|
37
|
+
| `lilac/google/gemma-4-31b-it` | 262K | | | | | | $0.11 | $0.35 |
|
|
38
|
+
| `lilac/minimaxai/minimax-m2.7` | 205K | | | | | | $0.30 | $1 |
|
|
39
|
+
| `lilac/moonshotai/kimi-k2.6` | 262K | | | | | | $0.70 | $4 |
|
|
40
|
+
| `lilac/zai-org/glm-5.1` | 203K | | | | | | $0.90 | $3 |
|
|
41
|
+
|
|
42
|
+
## Advanced configuration
|
|
43
|
+
|
|
44
|
+
### Custom headers
|
|
45
|
+
|
|
46
|
+
```typescript
|
|
47
|
+
const agent = new Agent({
|
|
48
|
+
id: "custom-agent",
|
|
49
|
+
name: "custom-agent",
|
|
50
|
+
model: {
|
|
51
|
+
url: "https://api.getlilac.com/v1",
|
|
52
|
+
id: "lilac/google/gemma-4-31b-it",
|
|
53
|
+
apiKey: process.env.LILAC_API_KEY,
|
|
54
|
+
headers: {
|
|
55
|
+
"X-Custom-Header": "value"
|
|
56
|
+
}
|
|
57
|
+
}
|
|
58
|
+
});
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
### Dynamic model selection
|
|
62
|
+
|
|
63
|
+
```typescript
|
|
64
|
+
const agent = new Agent({
|
|
65
|
+
id: "dynamic-agent",
|
|
66
|
+
name: "Dynamic Agent",
|
|
67
|
+
model: ({ requestContext }) => {
|
|
68
|
+
const useAdvanced = requestContext.task === "complex";
|
|
69
|
+
return useAdvanced
|
|
70
|
+
? "lilac/zai-org/glm-5.1"
|
|
71
|
+
: "lilac/google/gemma-4-31b-it";
|
|
72
|
+
}
|
|
73
|
+
});
|
|
74
|
+
```
|
|
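The `Input $/1M` and `Output $/1M` columns in the Lilac models table are per-million-token prices in USD, so a rough per-request cost follows from simple arithmetic. The sketch below is illustrative only: `ModelPricing`, `lilacPricing`, and `estimateCostUsd` are hypothetical names that do not exist in Mastra or Lilac, and the prices are copied from the table as it appears in this diff.

```typescript
// Hypothetical helper type for illustration; not part of Mastra or Lilac.
interface ModelPricing {
  inputPerMillion: number; // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

// Prices copied from the Lilac models table in this diff.
const lilacPricing: Record<string, ModelPricing> = {
  "lilac/google/gemma-4-31b-it": { inputPerMillion: 0.11, outputPerMillion: 0.35 },
  "lilac/minimaxai/minimax-m2.7": { inputPerMillion: 0.3, outputPerMillion: 1 },
  "lilac/moonshotai/kimi-k2.6": { inputPerMillion: 0.7, outputPerMillion: 4 },
  "lilac/zai-org/glm-5.1": { inputPerMillion: 0.9, outputPerMillion: 3 },
};

// Estimate the USD cost of one request from its token counts.
function estimateCostUsd(
  modelId: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const pricing = lilacPricing[modelId];
  if (!pricing) {
    throw new Error(`No pricing entry for ${modelId}`);
  }
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMillion +
    (outputTokens / 1_000_000) * pricing.outputPerMillion
  );
}

// 200K input tokens and 50K output tokens on glm-5.1:
// 0.2 * $0.90 + 0.05 * $3 = $0.18 + $0.15
const cost = estimateCostUsd("lilac/zai-org/glm-5.1", 200_000, 50_000);
console.log(cost.toFixed(4));
```

Under these assumptions the example request comes to roughly $0.33. Real billing may differ, for instance through caching or provider-side rounding, which the table does not describe.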
@@ -1,6 +1,6 @@
 # LLM Gateway
 
-Access
+Access 188 LLM Gateway models through Mastra's model router. Authentication is handled automatically using the `LLMGATEWAY_API_KEY` environment variable.
 
 Learn more in the [LLM Gateway documentation](https://llmgateway.io/docs).
 
@@ -125,19 +125,12 @@ for await (const chunk of stream) {
 | `llmgateway/gpt-5.5-pro` | 1.1M | | | | | | $30 | $180 |
 | `llmgateway/gpt-oss-120b` | 131K | | | | | | $0.15 | $0.75 |
 | `llmgateway/gpt-oss-20b` | 131K | | | | | | $0.10 | $0.50 |
-| `llmgateway/grok-3` | 131K | | | | | | $3 | $15 |
-| `llmgateway/grok-4` | 256K | | | | | | $3 | $15 |
 | `llmgateway/grok-4-0709` | 256K | | | | | | $3 | $15 |
-| `llmgateway/grok-4-1-fast` | 2.0M | | | | | | $0.20 | $0.50 |
-| `llmgateway/grok-4-1-fast-non-reasoning` | 2.0M | | | | | | $0.20 | $0.50 |
 | `llmgateway/grok-4-1-fast-reasoning` | 2.0M | | | | | | $0.20 | $0.50 |
 | `llmgateway/grok-4-20-beta-0309-non-reasoning` | 2.0M | | | | | | $2 | $6 |
 | `llmgateway/grok-4-20-beta-0309-reasoning` | 2.0M | | | | | | $2 | $6 |
 | `llmgateway/grok-4-3` | 1.0M | | | | | | $1 | $3 |
-| `llmgateway/grok-4-fast` | 2.0M | | | | | | $0.20 | $0.50 |
-| `llmgateway/grok-4-fast-non-reasoning` | 2.0M | | | | | | $0.20 | $0.50 |
 | `llmgateway/grok-4-fast-reasoning` | 2.0M | | | | | | $0.20 | $0.50 |
-| `llmgateway/grok-code-fast-1` | 256K | | | | | | $0.20 | $2 |
 | `llmgateway/hermes-2-pro-llama-3-8b` | 8K | | | | | | $0.14 | $0.14 |
 | `llmgateway/kimi-k2` | 131K | | | | | | $1 | $3 |
 | `llmgateway/kimi-k2-thinking` | 262K | | | | | | $0.60 | $3 |
@@ -34,8 +34,8 @@ for await (const chunk of stream) {
 
 | Model | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
 | --------------------------------------------------------- | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
-| `neuralwatt/glm-5-fast` |
-| `neuralwatt/glm-5.1-fast` |
+| `neuralwatt/glm-5-fast` | 203K | | | | | | $1 | $4 |
+| `neuralwatt/glm-5.1-fast` | 203K | | | | | | $1 | $4 |
 | `neuralwatt/kimi-k2.5-fast` | 262K | | | | | | $0.52 | $3 |
 | `neuralwatt/kimi-k2.6-fast` | 262K | | | | | | $0.69 | $3 |
 | `neuralwatt/MiniMaxAI/MiniMax-M2.5` | 197K | | | | | | $0.35 | $1 |
@@ -47,7 +47,7 @@ for await (const chunk of stream) {
 | `neuralwatt/Qwen/Qwen3.6-35B-A3B` | 131K | | | | | | $0.05 | $0.10 |
 | `neuralwatt/qwen3.5-397b-fast` | 262K | | | | | | $0.69 | $4 |
 | `neuralwatt/qwen3.6-35b-fast` | 131K | | | | | | $0.05 | $0.10 |
-| `neuralwatt/zai-org/GLM-5.1-FP8` |
+| `neuralwatt/zai-org/GLM-5.1-FP8` | 203K | | | | | | $1 | $4 |
 
 ## Advanced configuration
 
|