@mastra/mcp-docs-server 1.1.15-alpha.7 → 1.1.15

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -89,6 +89,34 @@ The result is a three-tier system:
  2. **Observations**: A log of what the Observer has seen
  3. **Reflections**: Condensed observations when memory becomes too long
 
+ ### Retrieval mode (experimental)
+
+ > **Note:** Retrieval mode is experimental. The API may change in future releases.
+
+ Normal OM compresses messages into observations, which is great for staying on task — but the original wording is gone. Retrieval mode fixes this by keeping each observation group linked to the raw messages that produced it. When the agent needs exact wording, tool output, or chronology that the summary compressed away, it can call a `recall` tool to page through the source messages.
+
+ ```typescript
+ const memory = new Memory({
+   options: {
+     observationalMemory: {
+       model: 'google/gemini-2.5-flash',
+       scope: 'thread',
+       retrieval: true,
+     },
+   },
+ })
+ ```
+
+ With retrieval mode enabled, OM:
+
+ - Stores a `range` (e.g. `startId:endId`) on each observation group pointing to the messages it was derived from
+ - Keeps range metadata visible in the agent's context so the agent knows which observations map to which messages
+ - Registers a `recall` tool the agent can call to page through the raw messages behind any range
+
+ Retrieval mode is only active for thread-scoped OM. Setting `retrieval: true` with `scope: 'resource'` has no effect — OM keeps resource-scoped behavior but skips retrieval-mode context and does not register the `recall` tool.
+
+ See the [recall tool reference](https://mastra.ai/reference/memory/observational-memory) for the full API (detail levels, part indexing, pagination, and token limiting).
+
  ## Models
 
  The Observer and Reflector run in the background. Any model that works with Mastra's [model routing](https://mastra.ai/models) (`provider/model`) can be used. When using `observationalMemory: true`, the default model is `google/gemini-2.5-flash`. When passing a config object, a `model` must be explicitly set.
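The `startId:endId` range format that retrieval-mode observation groups store can be sketched as follows. This is an illustration only: `parseRange` is a hypothetical helper, not part of the Mastra API.

```typescript
// Hypothetical helper (not part of the Mastra API) showing how an agent could
// split an observation group's `startId:endId` range into its two message IDs.
// Either ID can then serve as the `cursor` argument of the recall tool.
function parseRange(range: string): { startId: string; endId: string } {
  const sep = range.indexOf(":");
  if (sep === -1) throw new Error(`not a startId:endId range: ${range}`);
  return { startId: range.slice(0, sep), endId: range.slice(sep + 1) };
}
```

Using `startId` as the cursor anchors paging at the beginning of the group's source messages; `endId` anchors it at the end.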
@@ -1,6 +1,6 @@
  # ![OpenRouter logo](https://models.dev/logos/openrouter.svg)OpenRouter
 
- OpenRouter aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 199 models through Mastra's model router.
+ OpenRouter aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 203 models through Mastra's model router.
 
  Learn more in the [OpenRouter documentation](https://openrouter.ai/models).
 
@@ -160,6 +160,8 @@ ANTHROPIC_API_KEY=ant-...
  | `openai/gpt-5.2-pro` |
  | `openai/gpt-5.3-codex` |
  | `openai/gpt-5.4` |
+ | `openai/gpt-5.4-mini` |
+ | `openai/gpt-5.4-nano` |
  | `openai/gpt-5.4-pro` |
  | `openai/gpt-oss-120b` |
  | `openai/gpt-oss-120b:exacto` |
@@ -224,6 +226,8 @@ ANTHROPIC_API_KEY=ant-...
  | `x-ai/grok-4.20-multi-agent-beta` |
  | `x-ai/grok-code-fast-1` |
  | `xiaomi/mimo-v2-flash` |
+ | `xiaomi/mimo-v2-omni` |
+ | `xiaomi/mimo-v2-pro` |
  | `z-ai/glm-4.5` |
  | `z-ai/glm-4.5-air` |
  | `z-ai/glm-4.5-air:free` |
@@ -1,6 +1,6 @@
  # Model Providers
 
- Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 3386 models from 94 providers through a single API.
+ Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 3396 models from 94 providers through a single API.
 
  ## Features
 
@@ -1,6 +1,6 @@
  # ![Cloudflare Workers AI logo](https://models.dev/logos/cloudflare-workers-ai.svg)Cloudflare Workers AI
 
- Access 40 Cloudflare Workers AI models through Mastra's model router. Authentication is handled automatically using the `CLOUDFLARE_ACCOUNT_ID` environment variable.
+ Access 42 Cloudflare Workers AI models through Mastra's model router. Authentication is handled automatically using the `CLOUDFLARE_ACCOUNT_ID` environment variable.
 
  Learn more in the [Cloudflare Workers AI documentation](https://developers.cloudflare.com/workers-ai/models/).
 
@@ -64,7 +64,9 @@ for await (const chunk of stream) {
  | `cloudflare-workers-ai/@cf/meta/m2m100-1.2b` | 128K | | | | | | $0.34 | $0.34 |
  | `cloudflare-workers-ai/@cf/mistral/mistral-7b-instruct-v0.1` | 128K | | | | | | $0.11 | $0.19 |
  | `cloudflare-workers-ai/@cf/mistralai/mistral-small-3.1-24b-instruct` | 128K | | | | | | $0.35 | $0.56 |
+ | `cloudflare-workers-ai/@cf/moonshotai/kimi-k2.5` | 256K | | | | | | $0.60 | $3 |
  | `cloudflare-workers-ai/@cf/myshell-ai/melotts` | 128K | | | | | | — | — |
+ | `cloudflare-workers-ai/@cf/nvidia/nemotron-3-120b-a12b` | 256K | | | | | | $0.50 | $2 |
  | `cloudflare-workers-ai/@cf/openai/gpt-oss-120b` | 128K | | | | | | $0.35 | $0.75 |
  | `cloudflare-workers-ai/@cf/openai/gpt-oss-20b` | 128K | | | | | | $0.20 | $0.30 |
  | `cloudflare-workers-ai/@cf/pfnet/plamo-embedding-1b` | 128K | | | | | | $0.02 | — |
@@ -1,6 +1,6 @@
  # ![Ollama Cloud logo](https://models.dev/logos/ollama-cloud.svg)Ollama Cloud
 
- Access 33 Ollama Cloud models through Mastra's model router. Authentication is handled automatically using the `OLLAMA_API_KEY` environment variable.
+ Access 34 Ollama Cloud models through Mastra's model router. Authentication is handled automatically using the `OLLAMA_API_KEY` environment variable.
 
  Learn more in the [Ollama Cloud documentation](https://docs.ollama.com/cloud).
 
@@ -54,6 +54,7 @@ for await (const chunk of stream) {
  | `ollama-cloud/minimax-m2` | 205K | | | | | | — | — |
  | `ollama-cloud/minimax-m2.1` | 205K | | | | | | — | — |
  | `ollama-cloud/minimax-m2.5` | 205K | | | | | | — | — |
+ | `ollama-cloud/minimax-m2.7` | 205K | | | | | | — | — |
  | `ollama-cloud/ministral-3:14b` | 262K | | | | | | — | — |
  | `ollama-cloud/ministral-3:3b` | 262K | | | | | | — | — |
  | `ollama-cloud/ministral-3:8b` | 262K | | | | | | — | — |
@@ -1,6 +1,6 @@
  # ![Xiaomi logo](https://models.dev/logos/xiaomi.svg)Xiaomi
 
- Access 1 Xiaomi model through Mastra's model router. Authentication is handled automatically using the `XIAOMI_API_KEY` environment variable.
+ Access 3 Xiaomi models through Mastra's model router. Authentication is handled automatically using the `XIAOMI_API_KEY` environment variable.
 
  Learn more in the [Xiaomi documentation](https://platform.xiaomimimo.com/#/docs).
 
@@ -35,6 +35,8 @@ for await (const chunk of stream) {
  | Model | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
  | ---------------------- | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
  | `xiaomi/mimo-v2-flash` | 256K | | | | | | $0.10 | $0.30 |
+ | `xiaomi/mimo-v2-omni` | 256K | | | | | | $0.40 | $2 |
+ | `xiaomi/mimo-v2-pro` | 1.0M | | | | | | $1 | $3 |
 
  ## Advanced configuration
 
@@ -64,7 +66,7 @@ const agent = new Agent({
    model: ({ requestContext }) => {
      const useAdvanced = requestContext.task === "complex";
      return useAdvanced
-       ? "xiaomi/mimo-v2-flash"
+       ? "xiaomi/mimo-v2-pro"
        : "xiaomi/mimo-v2-flash";
    }
  });
@@ -1,6 +1,6 @@
  # ![Zhipu AI Coding Plan logo](https://models.dev/logos/zhipuai-coding-plan.svg)Zhipu AI Coding Plan
 
- Access 9 Zhipu AI Coding Plan models through Mastra's model router. Authentication is handled automatically using the `ZHIPU_API_KEY` environment variable.
+ Access 10 Zhipu AI Coding Plan models through Mastra's model router. Authentication is handled automatically using the `ZHIPU_API_KEY` environment variable.
 
  Learn more in the [Zhipu AI Coding Plan documentation](https://docs.bigmodel.cn/cn/coding-plan/overview).
 
@@ -43,6 +43,7 @@ for await (const chunk of stream) {
  | `zhipuai-coding-plan/glm-4.6v-flash` | 128K | | | | | | — | — |
  | `zhipuai-coding-plan/glm-4.7` | 205K | | | | | | — | — |
  | `zhipuai-coding-plan/glm-5` | 205K | | | | | | — | — |
+ | `zhipuai-coding-plan/glm-5-turbo` | 200K | | | | | | — | — |
 
  ## Advanced configuration
 
@@ -72,7 +73,7 @@ const agent = new Agent({
    model: ({ requestContext }) => {
      const useAdvanced = requestContext.task === "complex";
      return useAdvanced
-       ? "zhipuai-coding-plan/glm-5"
+       ? "zhipuai-coding-plan/glm-5-turbo"
        : "zhipuai-coding-plan/glm-4.5";
    }
  });
@@ -38,6 +38,8 @@ OM performs thresholding with fast local token estimation. Text uses `tokenx`, a
 
  **shareTokenBudget** (`boolean`): Share the token budget between messages and observations. When enabled, the total budget is `observation.messageTokens + reflection.observationTokens`. Messages can use more space when observations are small, and vice versa. This maximizes context usage through flexible allocation. `shareTokenBudget` is not yet compatible with async buffering. You must set `observation: { bufferTokens: false }` when using this option (this is a temporary limitation). (Default: `false`)
 
+ **retrieval** (`boolean`): **Experimental.** Enable retrieval-mode observation groups as durable pointers to raw message history. Retrieval mode is only active when `scope` is `'thread'`. If you set `retrieval: true` with `scope: 'resource'`, OM keeps resource-scoped memory behavior but skips retrieval-mode context and does not register the `recall` tool. (Default: `false`)
+
  **observation** (`ObservationalMemoryObservationConfig`): Configuration for the observation step. Controls when the Observer agent runs and how it behaves.
 
  **observation.model** (`string | LanguageModel | DynamicModel | ModelWithRetries[]`): Model for the Observer agent. Cannot be set if a top-level `model` is also provided. If neither this nor the top-level `model` is set, falls back to `reflection.model`.
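The pooled allocation that `shareTokenBudget` describes can be sketched as below. This is a hedged illustration of the arithmetic only, not Mastra's implementation; `remainingMessageBudget` is a hypothetical name.

```typescript
// Illustration only (not Mastra's code): with `shareTokenBudget`, the message
// and observation budgets act as one pool, so messages may grow into whatever
// budget the observations are not currently using, and vice versa.
function remainingMessageBudget(
  messageTokens: number, // configured message budget
  observationTokens: number, // configured observation budget
  observationTokensUsed: number, // tokens observations currently occupy
): number {
  const pool = messageTokens + observationTokens;
  return Math.max(0, pool - observationTokensUsed);
}
```

For example, if observations occupy only 1,000 tokens of a 4,000-token budget alongside an 8,000-token message budget, messages can flexibly use up to 11,000 tokens of the shared pool.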
@@ -574,6 +576,42 @@ The standalone `ObservationalMemory` class accepts all the same options as the `
 
  **obscureThreadIds** (`boolean`): When enabled, thread IDs are hashed before being included in observation context. This prevents the LLM from recognizing patterns in thread identifiers. Automatically enabled when using resource scope through the Memory class. (Default: `false`)
 
+ ## Recall tool
+
+ When `retrieval: true` is set with `scope: 'thread'`, OM registers a `recall` tool that the agent can call to page through the raw messages behind an observation group's `_range`. The tool is automatically added to the agent's tool list — no manual registration is needed.
+
+ ### Parameters
+
+ **cursor** (`string`): A message ID to anchor the recall query. Extract the start or end ID from an observation group range (e.g. from `_range: startId:endId`, use either `startId` or `endId`). If a range string is passed directly, the tool returns a hint explaining how to extract the correct ID.
+
+ **page** (`number`): Pagination offset from the cursor. Positive values page forward (messages after the cursor), negative values page backward (messages before the cursor). `0` is treated as `1`. (Default: `1`)
+
+ **limit** (`number`): Maximum number of messages per page. (Default: `20`)
+
+ **detail** (`'low' | 'high'`): Controls how much content is shown per message part. `'low'` shows truncated text and tool names with positional indices (`[p0]`, `[p1]`). `'high'` shows full content including tool arguments and results, clamped to one part per call with continuation hints. (Default: `'low'`)
+
+ **partIndex** (`number`): Fetch a single message part at full detail by its positional index. Use this when a low-detail recall shows an interesting part at `[p1]` — call again with `partIndex: 1` to see the full content without loading every part.
+
+ ### Returns
+
+ **messages** (`string`): Formatted message content. Format depends on the `detail` level.
+
+ **count** (`number`): Number of messages in this page.
+
+ **cursor** (`string`): The cursor message ID used for this query.
+
+ **page** (`number`): The page number returned.
+
+ **limit** (`number`): The limit used for this query.
+
+ **hasNextPage** (`boolean`): Whether more messages exist after this page.
+
+ **hasPrevPage** (`boolean`): Whether more messages exist before this page.
+
+ **truncated** (`boolean`): Present and `true` when the output was capped by the token budget. The agent can paginate or use `partIndex` to access remaining content.
+
+ **tokenOffset** (`number`): Approximate number of tokens that were trimmed when `truncated` is true.
+
  ### Related
 
  - [Observational Memory](https://mastra.ai/docs/memory/observational-memory)
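The cursor and page semantics described above can be modeled over an in-memory message list. This is a simplified sketch under stated assumptions, not the real tool: the actual implementation reads from Mastra's message store and additionally applies detail levels and token budgets.

```typescript
// Simplified model of recall paging (illustration only). Positive pages move
// forward from the cursor, negative pages move backward, and page 0 is treated
// as page 1, matching the documented parameter semantics.
type StoredMessage = { id: string; text: string };

function recallPage(
  messages: StoredMessage[],
  cursor: string,
  page = 1,
  limit = 20,
): StoredMessage[] {
  const i = messages.findIndex((m) => m.id === cursor);
  if (i === -1) return []; // unknown cursor: nothing to recall
  const p = page === 0 ? 1 : page; // `0` is treated as `1`
  // Page 1 starts just after the cursor; page -1 ends just before it.
  const start = p > 0 ? i + 1 + (p - 1) * limit : i + p * limit;
  return messages.slice(Math.max(0, start), Math.max(0, start + limit));
}
```

With a cursor in the middle of the list, `page: 1` returns the `limit` messages after it, while `page: -1` returns the `limit` messages immediately before it, which is how an agent can walk a range in either direction.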
package/CHANGELOG.md CHANGED
@@ -1,5 +1,21 @@
  # @mastra/mcp-docs-server
 
+ ## 1.1.15
+
+ ### Patch Changes
+
+ - Updated dependencies [[`cb611a1`](https://github.com/mastra-ai/mastra/commit/cb611a1e89a4f4cf74c97b57e0c27bb56f2eceb5), [`da93115`](https://github.com/mastra-ai/mastra/commit/da931155c1a9bc63d455d3d86b4ec984db5991fe), [`44df54a`](https://github.com/mastra-ai/mastra/commit/44df54a28e6315d9699cf437e4f3e8c7c7d10217), [`62d1d3c`](https://github.com/mastra-ai/mastra/commit/62d1d3cc08fe8182e7080237fd975de862ec8c91), [`9e1a3ed`](https://github.com/mastra-ai/mastra/commit/9e1a3ed07cfafb5e8e19a796ce0bee817002d7c0), [`56c9ad9`](https://github.com/mastra-ai/mastra/commit/56c9ad9c871d258af9da4d6e50065b01d339bf34), [`0773d08`](https://github.com/mastra-ai/mastra/commit/0773d089859210217702d3175ad4b2f3d63d267e), [`8681ecb`](https://github.com/mastra-ai/mastra/commit/8681ecb86184d5907267000e4576cc442a9a83fc), [`28d0249`](https://github.com/mastra-ai/mastra/commit/28d0249295782277040ad1e0d243e695b7ab1ce4), [`681ee1c`](https://github.com/mastra-ai/mastra/commit/681ee1c811359efd1b8bebc4bce35b9bb7b14bec), [`bb0f09d`](https://github.com/mastra-ai/mastra/commit/bb0f09dbac58401b36069f483acf5673202db5b5), [`6a8f1e6`](https://github.com/mastra-ai/mastra/commit/6a8f1e66272d2928351db334da091ee27e304c23), [`a579f7a`](https://github.com/mastra-ai/mastra/commit/a579f7a31e582674862b5679bc79af7ccf7429b8), [`5f7e9d0`](https://github.com/mastra-ai/mastra/commit/5f7e9d0db664020e1f3d97d7d18c6b0b9d4843d0), [`d7f14c3`](https://github.com/mastra-ai/mastra/commit/d7f14c3285cd253ecdd5f58139b7b6cbdf3678b5), [`0efe12a`](https://github.com/mastra-ai/mastra/commit/0efe12a5f008a939a1aac71699486ba40138054e)]:
+   - @mastra/core@1.15.0
+   - @mastra/mcp@1.3.1
+
+ ## 1.1.15-alpha.8
+
+ ### Patch Changes
+
+ - Updated dependencies [[`da93115`](https://github.com/mastra-ai/mastra/commit/da931155c1a9bc63d455d3d86b4ec984db5991fe), [`44df54a`](https://github.com/mastra-ai/mastra/commit/44df54a28e6315d9699cf437e4f3e8c7c7d10217), [`0efe12a`](https://github.com/mastra-ai/mastra/commit/0efe12a5f008a939a1aac71699486ba40138054e)]:
+   - @mastra/core@1.15.0-alpha.4
+   - @mastra/mcp@1.3.1-alpha.1
+
  ## 1.1.15-alpha.7
 
  ### Patch Changes
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@mastra/mcp-docs-server",
-   "version": "1.1.15-alpha.7",
+   "version": "1.1.15",
    "description": "MCP server for accessing Mastra.ai documentation, changelogs, and news.",
    "type": "module",
    "main": "dist/index.js",
@@ -29,8 +29,8 @@
      "jsdom": "^26.1.0",
      "local-pkg": "^1.1.2",
      "zod": "^4.3.6",
-     "@mastra/core": "1.15.0-alpha.3",
-     "@mastra/mcp": "^1.3.1-alpha.0"
+     "@mastra/mcp": "^1.3.1",
+     "@mastra/core": "1.15.0"
    },
    "devDependencies": {
      "@hono/node-server": "^1.19.11",
@@ -46,9 +46,9 @@
      "tsx": "^4.21.0",
      "typescript": "^5.9.3",
      "vitest": "4.0.18",
-     "@internal/lint": "0.0.72",
-     "@internal/types-builder": "0.0.47",
-     "@mastra/core": "1.15.0-alpha.3"
+     "@internal/lint": "0.0.73",
+     "@mastra/core": "1.15.0",
+     "@internal/types-builder": "0.0.48"
    },
    "homepage": "https://mastra.ai",
    "repository": {