@mastra/mcp-docs-server 1.1.10-alpha.0 → 1.1.10

@@ -61,7 +61,7 @@ OM solves both problems by compressing old context into dense observations.
 
  When message history tokens exceed a threshold (default: 30,000), the Observer creates observations — concise notes about what happened:
 
- Image parts contribute to this threshold with model-aware estimates, so multimodal conversations trigger observation at the right time. The same applies to image-like `file` parts when a transport normalizes an uploaded image as a file instead of an image part. For example, OpenAI image detail settings can materially change when OM decides to observe.
+ OM uses fast local token estimation for this thresholding work. Text is estimated with `tokenx`, while image parts use provider-aware heuristics so multimodal conversations still trigger observation at the right time. The same applies to image-like `file` parts when a transport normalizes an uploaded image as a file instead of an image part. For example, OpenAI image detail settings can materially change when OM decides to observe.
 
  The Observer can also see attachments in the history it reviews. OM keeps readable placeholders like `[Image #1: reference-board.png]` or `[File #1: floorplan.pdf]` in the transcript for readability, and forwards the actual attachment parts alongside the text. Image-like `file` parts are upgraded to image inputs for the Observer when possible, while non-image attachments are forwarded as file parts with normalized token counting. This applies to both normal thread observation and batched resource-scope observation.
 
@@ -176,7 +176,7 @@ const memory = new Memory({
 
  ### Token counting cache
 
- OM caches tiktoken part estimates in message metadata to reduce repeat counting work during threshold checks and buffering decisions.
+ OM caches token estimates in message metadata to reduce repeat counting work during threshold checks and buffering decisions.
 
  - Per-part estimates are stored on `part.providerMetadata.mastra` and reused on subsequent passes when the cache version/tokenizer source matches.
  - For string-only message content (without parts), OM uses a message-level metadata fallback cache.
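Reviewer note: a rough sketch of how a per-part estimate cache keyed by version and tokenizer source could behave. The `tokenEstimate` field name, the `CachedEstimate` shape, the version constant, and the estimator are all hypothetical; only the storage location `part.providerMetadata.mastra` comes from the text above.

```typescript
// Hypothetical cache entry; field names are illustrative only.
interface CachedEstimate {
  v: number;       // cache version
  source: string;  // which estimator produced the number
  tokens: number;
}

interface TextPart {
  type: 'text';
  text: string;
  providerMetadata?: { mastra?: { tokenEstimate?: CachedEstimate } };
}

const CACHE_VERSION = 1;
const ESTIMATOR_SOURCE = 'chars-div-4'; // stand-in label, not a real tokenizer name

let estimatorCalls = 0; // exposed so reuse can be observed

function estimateText(text: string): number {
  estimatorCalls++;
  return Math.ceil(text.length / 4); // assumption: ~4 characters per token
}

function countPart(part: TextPart): number {
  const cached = part.providerMetadata?.mastra?.tokenEstimate;
  // Reuse the stored value only when both version and source match.
  if (cached && cached.v === CACHE_VERSION && cached.source === ESTIMATOR_SOURCE) {
    return cached.tokens;
  }
  const tokens = estimateText(part.text);
  part.providerMetadata = {
    ...part.providerMetadata,
    mastra: {
      ...part.providerMetadata?.mastra,
      tokenEstimate: { v: CACHE_VERSION, source: ESTIMATOR_SOURCE, tokens },
    },
  };
  return tokens;
}
```

Keying on the estimator source is what lets a cache written under one tokenizer be ignored after a switch to another, rather than silently reusing stale counts.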
@@ -53,15 +53,19 @@ const response = await agent.generate('List all files in the workspace')
 
  By default, `LocalFilesystem` runs in **contained mode** — all file operations are restricted to stay within `basePath`. This prevents path traversal attacks and symlink escapes.
 
- In contained mode, absolute paths that fall within `basePath` are used as-is, while other absolute paths are treated as virtual paths relative to `basePath` (e.g. `/file.txt` resolves to `basePath/file.txt`). Any resolved path that escapes `basePath` throws a `PermissionError`.
+ In contained mode:
 
- If your agent needs to access specific paths outside `basePath`, use `allowedPaths` to grant access without disabling containment entirely:
+ - **Relative paths** (e.g. `src/index.ts`) resolve against `basePath`
+ - **Absolute paths** (e.g. `/home/user/.config/file.txt`) are treated as real filesystem paths — if they fall outside `basePath` and any `allowedPaths`, a `PermissionError` is thrown
+ - **Tilde paths** (e.g. `~/Documents`) expand to the home directory and follow the same containment rules
+
+ If your agent needs to access specific paths outside `basePath`, use `allowedPaths` to grant access without disabling containment entirely. Relative paths are resolved against `basePath`, and absolute paths are used as-is:
 
  ```typescript
  const workspace = new Workspace({
    filesystem: new LocalFilesystem({
      basePath: './workspace',
-     allowedPaths: ['/home/user/.config', '/home/user/documents'],
+     allowedPaths: ['~/.claude/skills', '../shared-data'],
    }),
  })
  ```
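Reviewer note: the new containment rules can be approximated with the sketch below. It is not the `LocalFilesystem` implementation. It assumes POSIX-style paths, an already-absolute `basePath`, and an explicitly supplied home directory, and it does not follow symlinks, so it covers only the lexical part of containment. The helper names and the local `PermissionError` class are stand-ins.

```typescript
// Stand-in error class; the real PermissionError lives in Mastra.
class PermissionError extends Error {}

// Collapse "." and ".." segments of a POSIX-style absolute path.
function normalize(absPath: string): string {
  const out: string[] = [];
  for (const seg of absPath.split('/')) {
    if (seg === '' || seg === '.') continue;
    if (seg === '..') out.pop();
    else out.push(seg);
  }
  return '/' + out.join('/');
}

// Relative paths resolve against basePath, tilde paths against home,
// and absolute paths are taken as real filesystem paths.
function toAbsolute(input: string, basePath: string, home: string): string {
  if (input === '~' || input.slice(0, 2) === '~/') return normalize(home + input.slice(1));
  if (input.charAt(0) === '/') return normalize(input);
  return normalize(basePath + '/' + input);
}

function isInside(target: string, root: string): boolean {
  const prefix = root === '/' ? '/' : root + '/';
  return target === root || target.indexOf(prefix) === 0;
}

function resolveContained(
  input: string,
  opts: { basePath: string; allowedPaths: string[]; home: string },
): string {
  const base = normalize(opts.basePath);
  const roots = [base].concat(
    opts.allowedPaths.map(p => toAbsolute(p, base, opts.home)),
  );
  const target = toAbsolute(input, base, opts.home);
  if (!roots.some(root => isInside(target, root))) {
    throw new PermissionError(`${input} resolves outside basePath and allowedPaths`);
  }
  return target;
}
```

Under these assumptions, with `basePath` set to `/srv/workspace` and the `allowedPaths` from the hunk, `src/index.ts` and `/srv/shared-data/x.csv` resolve, while `/home/user/.config/file.txt` and `../../etc/passwd` are rejected.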
@@ -1,6 +1,6 @@
  # ![OpenRouter logo](https://models.dev/logos/openrouter.svg)OpenRouter
 
- OpenRouter aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 193 models through Mastra's model router.
+ OpenRouter aggregates models from multiple providers with enhanced features like rate limiting and failover. Access 195 models through Mastra's model router.
 
  Learn more in the [OpenRouter documentation](https://openrouter.ai/models).
 
@@ -168,6 +168,8 @@ ANTHROPIC_API_KEY=ant-...
  | `openai/o4-mini` |
  | `openrouter/aurora-alpha` |
  | `openrouter/free` |
+ | `openrouter/healer-alpha` |
+ | `openrouter/hunter-alpha` |
  | `openrouter/sherlock-dash-alpha` |
  | `openrouter/sherlock-think-alpha` |
  | `prime-intellect/intellect-3` |
@@ -1,6 +1,6 @@
  # Model Providers
 
- Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 3239 models from 92 providers through a single API.
+ Mastra provides a unified interface for working with LLMs across multiple providers, giving you access to 3245 models from 92 providers through a single API.
 
  ## Features
 
@@ -37,7 +37,7 @@ for await (const chunk of stream) {
  | `alibaba-coding-plan/glm-4.7` | 203K | | | | | | — | — |
  | `alibaba-coding-plan/glm-5` | 203K | | | | | | — | — |
  | `alibaba-coding-plan/kimi-k2.5` | 262K | | | | | | — | — |
- | `alibaba-coding-plan/MiniMax-M2.5` | 17K | | | | | | — | — |
+ | `alibaba-coding-plan/MiniMax-M2.5` | 197K | | | | | | — | — |
  | `alibaba-coding-plan/qwen3-coder-next` | 262K | | | | | | — | — |
  | `alibaba-coding-plan/qwen3-coder-plus` | 1.0M | | | | | | — | — |
  | `alibaba-coding-plan/qwen3-max-2026-01-23` | 262K | | | | | | — | — |
@@ -508,11 +508,11 @@ for await (const chunk of stream) {
  | `nano-gpt/TheDrummer 2/Rocinante-12B-v1.1` | 16K | | | | | | $0.41 | $0.59 |
  | `nano-gpt/TheDrummer 2/skyfall-36b-v2` | 64K | | | | | | $0.49 | $0.49 |
  | `nano-gpt/TheDrummer 2/UnslopNemo-12B-v4.1` | 33K | | | | | | $0.49 | $0.49 |
- | `nano-gpt/THUDM 2/GLM-4-32B-0414` | 128K | | | | | | $0.20 | $0.20 |
- | `nano-gpt/THUDM 2/GLM-4-9B-0414` | 32K | | | | | | $0.20 | $0.20 |
- | `nano-gpt/THUDM 2/GLM-Z1-32B-0414` | 128K | | | | | | $0.20 | $0.20 |
- | `nano-gpt/THUDM 2/GLM-Z1-9B-0414` | 32K | | | | | | $0.20 | $0.20 |
- | `nano-gpt/THUDM 2/GLM-Z1-Rumination-32B-0414` | 32K | | | | | | $0.20 | $0.20 |
+ | `nano-gpt/THUDM/GLM-4-32B-0414` | 128K | | | | | | $0.20 | $0.20 |
+ | `nano-gpt/THUDM/GLM-4-9B-0414` | 32K | | | | | | $0.20 | $0.20 |
+ | `nano-gpt/THUDM/GLM-Z1-32B-0414` | 128K | | | | | | $0.20 | $0.20 |
+ | `nano-gpt/THUDM/GLM-Z1-9B-0414` | 32K | | | | | | $0.20 | $0.20 |
+ | `nano-gpt/THUDM/GLM-Z1-Rumination-32B-0414` | 32K | | | | | | $0.20 | $0.20 |
  | `nano-gpt/tngtech/DeepSeek-TNG-R1T2-Chimera` | 128K | | | | | | $0.31 | $0.31 |
  | `nano-gpt/tngtech/tng-r1t-chimera` | 128K | | | | | | $0.30 | $1 |
  | `nano-gpt/Tongyi-Zhiwen 2/QwenLong-L1-32B` | 128K | | | | | | $0.14 | $0.60 |
@@ -1,6 +1,6 @@
  # ![Nebius Token Factory logo](https://models.dev/logos/nebius.svg)Nebius Token Factory
 
- Access 46 Nebius Token Factory models through Mastra's model router. Authentication is handled automatically using the `NEBIUS_API_KEY` environment variable.
+ Access 47 Nebius Token Factory models through Mastra's model router. Authentication is handled automatically using the `NEBIUS_API_KEY` environment variable.
 
  Learn more in the [Nebius Token Factory documentation](https://docs.tokenfactory.nebius.com/).
 
@@ -42,11 +42,11 @@ for await (const chunk of stream) {
  | `nebius/deepseek-ai/DeepSeek-R1-0528-fast` | 131K | | | | | | $2 | $6 |
  | `nebius/deepseek-ai/DeepSeek-V3-0324` | 128K | | | | | | $0.50 | $2 |
  | `nebius/deepseek-ai/DeepSeek-V3-0324-fast` | 128K | | | | | | $0.75 | $2 |
- | `nebius/deepseek-ai/DeepSeek-V3.2` | 128K | | | | | | $0.30 | $0.45 |
+ | `nebius/deepseek-ai/DeepSeek-V3.2` | 163K | | | | | | $0.30 | $0.45 |
  | `nebius/google/gemma-2-2b-it` | 8K | | | | | | $0.02 | $0.06 |
  | `nebius/google/gemma-2-9b-it-fast` | 8K | | | | | | $0.03 | $0.09 |
- | `nebius/google/gemma-3-27b-it` | 128K | | | | | | $0.10 | $0.30 |
- | `nebius/google/gemma-3-27b-it-fast` | 128K | | | | | | $0.20 | $0.60 |
+ | `nebius/google/gemma-3-27b-it` | 110K | | | | | | $0.10 | $0.30 |
+ | `nebius/google/gemma-3-27b-it-fast` | 110K | | | | | | $0.20 | $0.60 |
  | `nebius/intfloat/e5-mistral-7b-instruct` | 33K | | | | | | $0.01 | — |
  | `nebius/meta-llama/Llama-3.3-70B-Instruct` | 128K | | | | | | $0.13 | $0.40 |
  | `nebius/meta-llama/Llama-3.3-70B-Instruct-fast` | 128K | | | | | | $0.25 | $0.75 |
@@ -80,6 +80,7 @@ for await (const chunk of stream) {
  | `nebius/zai-org/GLM-4.5` | 128K | | | | | | $0.60 | $2 |
  | `nebius/zai-org/GLM-4.5-Air` | 128K | | | | | | $0.20 | $1 |
  | `nebius/zai-org/GLM-4.7-FP8` | 128K | | | | | | $0.40 | $2 |
+ | `nebius/zai-org/GLM-5` | 203K | | | | | | $1 | $3 |
 
  ## Advanced configuration
 
@@ -109,7 +110,7 @@ const agent = new Agent({
    model: ({ requestContext }) => {
      const useAdvanced = requestContext.task === "complex";
      return useAdvanced
-       ? "nebius/zai-org/GLM-4.7-FP8"
+       ? "nebius/zai-org/GLM-5"
        : "nebius/BAAI/bge-en-icl";
    }
  });
@@ -104,7 +104,7 @@ for await (const chunk of stream) {
  | `nvidia/qwen/qwen3-next-80b-a3b-thinking` | 262K | | | | | | — | — |
  | `nvidia/qwen/qwen3.5-397b-a17b` | 262K | | | | | | — | — |
  | `nvidia/qwen/qwq-32b` | 128K | | | | | | — | — |
- | `nvidia/stepfun-ai/step-3-5-flash` | 256K | | | | | | — | — |
+ | `nvidia/stepfun-ai/step-3.5-flash` | 256K | | | | | | — | — |
  | `nvidia/z-ai/glm4.7` | 205K | | | | | | — | — |
  | `nvidia/z-ai/glm5` | 203K | | | | | | — | — |
 
@@ -1,6 +1,6 @@
  # ![OpenCode Zen logo](https://models.dev/logos/opencode.svg)OpenCode Zen
 
- Access 33 OpenCode Zen models through Mastra's model router. Authentication is handled automatically using the `OPENCODE_API_KEY` environment variable.
+ Access 34 OpenCode Zen models through Mastra's model router. Authentication is handled automatically using the `OPENCODE_API_KEY` environment variable.
 
  Learn more in the [OpenCode Zen documentation](https://opencode.ai/docs/zen).
 
@@ -32,41 +32,42 @@ for await (const chunk of stream) {
 
  ## Models
 
- | Model | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
- | ------------------------------ | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
- | `opencode/big-pickle` | 200K | | | | | | — | — |
- | `opencode/claude-3-5-haiku` | 200K | | | | | | $0.80 | $4 |
- | `opencode/claude-haiku-4-5` | 200K | | | | | | $1 | $5 |
- | `opencode/claude-opus-4-1` | 200K | | | | | | $15 | $75 |
- | `opencode/claude-opus-4-5` | 200K | | | | | | $5 | $25 |
- | `opencode/claude-opus-4-6` | 1.0M | | | | | | $5 | $25 |
- | `opencode/claude-sonnet-4` | 1.0M | | | | | | $3 | $15 |
- | `opencode/claude-sonnet-4-5` | 1.0M | | | | | | $3 | $15 |
- | `opencode/claude-sonnet-4-6` | 1.0M | | | | | | $3 | $15 |
- | `opencode/gemini-3-flash` | 1.0M | | | | | | $0.50 | $3 |
- | `opencode/gemini-3-pro` | 1.0M | | | | | | $2 | $12 |
- | `opencode/gemini-3.1-pro` | 1.0M | | | | | | $2 | $12 |
- | `opencode/glm-4.6` | 205K | | | | | | $0.60 | $2 |
- | `opencode/glm-4.7` | 205K | | | | | | $0.60 | $2 |
- | `opencode/glm-5` | 205K | | | | | | $1 | $3 |
- | `opencode/gpt-5` | 400K | | | | | | $1 | $9 |
- | `opencode/gpt-5-codex` | 400K | | | | | | $1 | $9 |
- | `opencode/gpt-5-nano` | 400K | | | | | | — | — |
- | `opencode/gpt-5.1` | 400K | | | | | | $1 | $9 |
- | `opencode/gpt-5.1-codex` | 400K | | | | | | $1 | $9 |
- | `opencode/gpt-5.1-codex-max` | 400K | | | | | | $1 | $10 |
- | `opencode/gpt-5.1-codex-mini` | 400K | | | | | | $0.25 | $2 |
- | `opencode/gpt-5.2` | 400K | | | | | | $2 | $14 |
- | `opencode/gpt-5.2-codex` | 400K | | | | | | $2 | $14 |
- | `opencode/gpt-5.3-codex` | 400K | | | | | | $2 | $14 |
- | `opencode/gpt-5.3-codex-spark` | 128K | | | | | | $2 | $14 |
- | `opencode/gpt-5.4` | 1.1M | | | | | | $3 | $15 |
- | `opencode/gpt-5.4-pro` | 1.1M | | | | | | $30 | $180 |
- | `opencode/kimi-k2.5` | 262K | | | | | | $0.60 | $3 |
- | `opencode/mimo-v2-flash-free` | 262K | | | | | | — | — |
- | `opencode/minimax-m2.1` | 205K | | | | | | $0.30 | $1 |
- | `opencode/minimax-m2.5` | 205K | | | | | | $0.30 | $1 |
- | `opencode/minimax-m2.5-free` | 205K | | | | | | — | — |
+ | Model | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
+ | -------------------------------- | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
+ | `opencode/big-pickle` | 200K | | | | | | — | — |
+ | `opencode/claude-3-5-haiku` | 200K | | | | | | $0.80 | $4 |
+ | `opencode/claude-haiku-4-5` | 200K | | | | | | $1 | $5 |
+ | `opencode/claude-opus-4-1` | 200K | | | | | | $15 | $75 |
+ | `opencode/claude-opus-4-5` | 200K | | | | | | $5 | $25 |
+ | `opencode/claude-opus-4-6` | 1.0M | | | | | | $5 | $25 |
+ | `opencode/claude-sonnet-4` | 1.0M | | | | | | $3 | $15 |
+ | `opencode/claude-sonnet-4-5` | 1.0M | | | | | | $3 | $15 |
+ | `opencode/claude-sonnet-4-6` | 1.0M | | | | | | $3 | $15 |
+ | `opencode/gemini-3-flash` | 1.0M | | | | | | $0.50 | $3 |
+ | `opencode/gemini-3-pro` | 1.0M | | | | | | $2 | $12 |
+ | `opencode/gemini-3.1-pro` | 1.0M | | | | | | $2 | $12 |
+ | `opencode/glm-4.6` | 205K | | | | | | $0.60 | $2 |
+ | `opencode/glm-4.7` | 205K | | | | | | $0.60 | $2 |
+ | `opencode/glm-5` | 205K | | | | | | $1 | $3 |
+ | `opencode/gpt-5` | 400K | | | | | | $1 | $9 |
+ | `opencode/gpt-5-codex` | 400K | | | | | | $1 | $9 |
+ | `opencode/gpt-5-nano` | 400K | | | | | | — | — |
+ | `opencode/gpt-5.1` | 400K | | | | | | $1 | $9 |
+ | `opencode/gpt-5.1-codex` | 400K | | | | | | $1 | $9 |
+ | `opencode/gpt-5.1-codex-max` | 400K | | | | | | $1 | $10 |
+ | `opencode/gpt-5.1-codex-mini` | 400K | | | | | | $0.25 | $2 |
+ | `opencode/gpt-5.2` | 400K | | | | | | $2 | $14 |
+ | `opencode/gpt-5.2-codex` | 400K | | | | | | $2 | $14 |
+ | `opencode/gpt-5.3-codex` | 400K | | | | | | $2 | $14 |
+ | `opencode/gpt-5.3-codex-spark` | 128K | | | | | | $2 | $14 |
+ | `opencode/gpt-5.4` | 1.1M | | | | | | $3 | $15 |
+ | `opencode/gpt-5.4-pro` | 1.1M | | | | | | $30 | $180 |
+ | `opencode/kimi-k2.5` | 262K | | | | | | $0.60 | $3 |
+ | `opencode/mimo-v2-flash-free` | 262K | | | | | | — | — |
+ | `opencode/minimax-m2.1` | 205K | | | | | | $0.30 | $1 |
+ | `opencode/minimax-m2.5` | 205K | | | | | | $0.30 | $1 |
+ | `opencode/minimax-m2.5-free` | 205K | | | | | | — | — |
+ | `opencode/nemotron-3-super-free` | 1.0M | | | | | | — | — |
 
  ## Advanced configuration
 
@@ -96,7 +97,7 @@ const agent = new Agent({
    model: ({ requestContext }) => {
      const useAdvanced = requestContext.task === "complex";
      return useAdvanced
-       ? "opencode/minimax-m2.5-free"
+       ? "opencode/nemotron-3-super-free"
        : "opencode/big-pickle";
    }
  });
@@ -1,6 +1,6 @@
  # ![Synthetic logo](https://models.dev/logos/synthetic.svg)Synthetic
 
- Access 26 Synthetic models through Mastra's model router. Authentication is handled automatically using the `SYNTHETIC_API_KEY` environment variable.
+ Access 28 Synthetic models through Mastra's model router. Authentication is handled automatically using the `SYNTHETIC_API_KEY` environment variable.
 
  Learn more in the [Synthetic documentation](https://synthetic.new/pricing).
 
@@ -49,6 +49,7 @@ for await (const chunk of stream) {
  | `synthetic/hf:meta-llama/Llama-4-Scout-17B-16E-Instruct` | 328K | | | | | | $0.15 | $0.60 |
  | `synthetic/hf:MiniMaxAI/MiniMax-M2` | 197K | | | | | | $0.55 | $2 |
  | `synthetic/hf:MiniMaxAI/MiniMax-M2.1` | 205K | | | | | | $0.55 | $2 |
+ | `synthetic/hf:MiniMaxAI/MiniMax-M2.5` | 191K | | | | | | $0.60 | $3 |
  | `synthetic/hf:moonshotai/Kimi-K2-Instruct-0905` | 262K | | | | | | $1 | $1 |
  | `synthetic/hf:moonshotai/Kimi-K2-Thinking` | 262K | | | | | | $0.55 | $2 |
  | `synthetic/hf:moonshotai/Kimi-K2.5` | 262K | | | | | | $0.55 | $2 |
@@ -60,6 +61,7 @@ for await (const chunk of stream) {
  | `synthetic/hf:Qwen/Qwen3-Coder-480B-A35B-Instruct` | 256K | | | | | | $2 | $2 |
  | `synthetic/hf:zai-org/GLM-4.6` | 200K | | | | | | $0.55 | $2 |
  | `synthetic/hf:zai-org/GLM-4.7` | 200K | | | | | | $0.55 | $2 |
+ | `synthetic/hf:zai-org/GLM-4.7-Flash` | 197K | | | | | | $0.06 | $0.40 |
 
  ## Advanced configuration
 
@@ -89,7 +91,7 @@ const agent = new Agent({
    model: ({ requestContext }) => {
      const useAdvanced = requestContext.task === "complex";
      return useAdvanced
-       ? "synthetic/hf:zai-org/GLM-4.7"
+       ? "synthetic/hf:zai-org/GLM-4.7-Flash"
        : "synthetic/hf:MiniMaxAI/MiniMax-M2";
    }
  });
@@ -28,6 +28,8 @@ The `observationalMemory` option accepts `true`, a configuration object, or `fal
 
  Observer input is multimodal-aware. OM keeps text placeholders like `[Image #1: screenshot.png]` in the transcript it builds for the Observer, and also sends the underlying image parts when possible. This applies to both single-thread observation and batched multi-thread observation. Non-image files appear as placeholders only.
 
+ OM performs thresholding with fast local token estimation. Text uses `tokenx`, and image-like inputs use provider-aware heuristics plus deterministic fallbacks when metadata is incomplete.
+
  **enabled** (`boolean`): Enable or disable Observational Memory. When omitted from a config object, defaults to \`true\`. Only \`enabled: false\` explicitly disables it. (Default: `true`)
 
  **model** (`string | LanguageModel | DynamicModel | ModelWithRetries[]`): Model for both the Observer and Reflector agents. Sets the model for both at once. Cannot be used together with \`observation.model\` or \`reflection.model\` — an error will be thrown if both are set. When using \`observationalMemory: true\`, defaults to \`google/gemini-2.5-flash\`. When passing a config object, this or \`observation.model\`/\`reflection.model\` must be set. Use \`"default"\` to explicitly use the default model (\`google/gemini-2.5-flash\`). (Default: `'google/gemini-2.5-flash' (when using observationalMemory: true)`)
@@ -42,7 +44,7 @@ Observer input is multimodal-aware. OM keeps text placeholders like `[Image #1:
 
  **observation.instruction** (`string`): Custom instruction appended to the Observer's system prompt. Use this to customize what the Observer focuses on, such as domain-specific preferences or priorities.
 
- **observation.messageTokens** (`number`): Token count of unobserved messages that triggers observation. When unobserved message tokens exceed this threshold, the Observer agent is called. Image parts are included with model-aware estimates when possible, with deterministic fallbacks when image metadata is incomplete. Image-like \`file\` parts are counted the same way when uploads are normalized as files.
+ **observation.messageTokens** (`number`): Token count of unobserved messages that triggers observation. When unobserved message tokens exceed this threshold, the Observer agent is called. Text is estimated locally with \`tokenx\`. Image parts are included with model-aware heuristics when possible, with deterministic fallbacks when image metadata is incomplete. Image-like \`file\` parts are counted the same way when uploads are normalized as files.
 
  **observation.maxTokensPerBatch** (`number`): Maximum tokens per batch when observing multiple threads in resource scope. Threads are chunked into batches of this size and processed in parallel. Lower values mean more parallelism but more API calls.
 
@@ -78,7 +80,7 @@ Observer input is multimodal-aware. OM keeps text placeholders like `[Image #1:
 
  ### Token estimate metadata cache
 
- OM persists token payload estimates so repeated counting can reuse prior tiktoken work.
+ OM persists token payload estimates so repeated counting can reuse prior token estimation work.
 
  - Part-level cache: `part.providerMetadata.mastra`.
  - String-content fallback cache: message-level metadata when no parts exist.
@@ -1,8 +1,9 @@
  # TokenLimiterProcessor
 
- The `TokenLimiterProcessor` limits the number of tokens in messages. It can be used as both an input and output processor:
+ The `TokenLimiterProcessor` limits the number of tokens in messages. It can be used as an input, per-step input, and output processor:
 
- - **Input processor**: Filters historical messages to fit within the context window, prioritizing recent messages
+ - **Input processor** (`processInput`): Filters historical messages to fit within the context window before the agentic loop starts, prioritizing recent messages
+ - **Per-step input processor** (`processInputStep`): Prunes messages at each step of a multi-step agent workflow, preventing unbounded token growth when tools trigger additional LLM calls
  - **Output processor**: Limits generated response tokens via streaming or non-streaming with configurable strategies for handling exceeded limits
 
  ## Usage example
@@ -35,7 +36,9 @@ const processor = new TokenLimiterProcessor({
 
  **name** (`string`): Optional processor display name
 
- **processInput** (`(args: { messages: MastraDBMessage[]; abort: (reason?: string) => never }) => Promise<MastraDBMessage[]>`): Filters input messages to fit within token limit, prioritizing recent messages while preserving system messages
+ **processInput** (`(args: { messages: MastraDBMessage[]; abort: (reason?: string) => never }) => Promise<MastraDBMessage[]>`): Filters input messages to fit within token limit before the agentic loop starts, prioritizing recent messages while preserving system messages
+
+ **processInputStep** (`(args: ProcessInputStepArgs) => Promise<void>`): Prunes messages at each step of the agentic loop (including tool call continuations) to keep the conversation within the token limit. Mutates the messageList directly by removing oldest messages first while preserving system messages.
 
  **processOutputStream** (`(args: { part: ChunkType; streamParts: ChunkType[]; state: Record<string, any>; abort: (reason?: string) => never }) => Promise<ChunkType | null>`): Processes streaming output parts to limit token count during streaming
 
@@ -45,7 +48,7 @@ const processor = new TokenLimiterProcessor({
 
  ## Error behavior
 
- When used as an input processor, `TokenLimiterProcessor` throws a `TripWire` error in the following cases:
+ When used as an input processor (both `processInput` and `processInputStep`), `TokenLimiterProcessor` throws a `TripWire` error in the following cases:
 
  - **Empty messages**: If there are no messages to process, a TripWire is thrown because you can't send an LLM request with no messages.
  - **System messages exceed limit**: If system messages alone exceed the token limit, a TripWire is thrown because you can't send an LLM request with only system messages and no user/assistant messages.
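Reviewer note: the pruning and error behavior described across these `TokenLimiterProcessor` hunks can be sketched as follows. This is an illustration under assumptions, not the processor's code: the `Msg` shape, the local `TripWire` class, and the characters-divided-by-four estimator are stand-ins, and the sketch returns a new array whereas the docs say `processInputStep` mutates the message list in place.

```typescript
// Hypothetical message shape; MastraDBMessage carries far more fields.
interface Msg {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Stand-in for Mastra's TripWire error.
class TripWire extends Error {}

// Assumption: ~4 characters per token.
const estimateMsg = (m: Msg): number => Math.ceil(m.content.length / 4);

function pruneToLimit(messages: Msg[], limit: number): Msg[] {
  if (messages.length === 0) {
    throw new TripWire('no messages to send');
  }
  const systemTokens = messages
    .filter(m => m.role === 'system')
    .reduce((sum, m) => sum + estimateMsg(m), 0);
  if (systemTokens > limit) {
    throw new TripWire('system messages alone exceed the token limit');
  }

  // Walk from newest to oldest, keeping messages while budget remains.
  let budget = limit - systemTokens;
  let cutoff = messages.length;
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m.role === 'system') continue; // system messages are always kept
    const cost = estimateMsg(m);
    if (cost > budget) break; // everything older than this is dropped
    budget -= cost;
    cutoff = i;
  }
  return messages.filter((m, i) => m.role === 'system' || i >= cutoff);
}
```

Running the same function before the loop and again at every step is one way to picture why a single `inputProcessors` entry bounds token growth across tool-call continuations.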
@@ -86,6 +89,29 @@ export const agent = new Agent({
  })
  ```
 
+ ### As a per-step input processor (limit multi-step token growth)
+
+ When an agent uses tools across multiple steps (e.g. `maxSteps > 1`), each step accumulates conversation history from all previous steps. Use `inputProcessors` to also limit tokens at each step of the agentic loop — the `TokenLimiterProcessor` automatically applies to both the initial input and every subsequent step:
+
+ ```typescript
+ import { Agent } from '@mastra/core/agent'
+ import { TokenLimiterProcessor } from '@mastra/core/processors'
+
+ export const agent = new Agent({
+   name: 'multi-step-agent',
+   instructions: 'You are a helpful research assistant with access to tools',
+   model: 'openai/gpt-4o',
+   inputProcessors: [
+     new TokenLimiterProcessor({ limit: 8000 }), // Applied at every step
+   ],
+ })
+
+ // Each tool call step will be limited to ~8000 input tokens
+ const result = await agent.generate('Research this topic using your tools', {
+   maxSteps: 10,
+ })
+ ```
+
  ### As an output processor (limit response length)
 
  Use `outputProcessors` to limit the length of generated responses:
@@ -38,7 +38,7 @@ const response = await agent.generate('List all files in the workspace')
 
  **contained** (`boolean`): When true, all file operations are restricted to stay within basePath. Prevents path traversal attacks and symlink escapes. See \[containment]\(/docs/workspace/filesystem#containment). (Default: `true`)
 
- **allowedPaths** (`string[]`): Additional absolute paths that are allowed beyond basePath. Useful with \`contained: true\` to grant access to specific directories without disabling containment entirely. Paths are resolved to absolute paths. (Default: `[]`)
+ **allowedPaths** (`string[]`): Additional directories the agent can access outside of \`basePath\`. (Default: `[]`)
 
  **instructions** (`string | ((opts: { defaultInstructions: string; requestContext?: RequestContext }) => string)`): Custom instructions that override the default instructions returned by getInstructions(). Pass a string to fully replace them, or a function to extend them with access to the current requestContext for per-request customization.
 
@@ -56,7 +56,7 @@ const response = await agent.generate('List all files in the workspace')
 
  **readOnly** (`boolean | undefined`): Whether the filesystem is in read-only mode
 
- **allowedPaths** (`readonly string[]`): Current set of additional allowed paths (absolute, resolved). These paths are permitted beyond basePath when containment is enabled.
+ **allowedPaths** (`readonly string[]`): Current set of resolved allowed paths. These paths are permitted beyond basePath when containment is enabled.
 
  ## Methods
 
package/CHANGELOG.md CHANGED
@@ -1,5 +1,21 @@
  # @mastra/mcp-docs-server
 
+ ## 1.1.10
+
+ ### Patch Changes
+
+ - Updated dependencies [[`cddf895`](https://github.com/mastra-ai/mastra/commit/cddf895532b8ee7f9fa814136ec672f53d37a9ba), [`9cede11`](https://github.com/mastra-ai/mastra/commit/9cede110abac9d93072e0521bb3c8bcafb9fdadf), [`a59f126`](https://github.com/mastra-ai/mastra/commit/a59f1269104f54726699c5cdb98c72c93606d2df), [`ed8fd75`](https://github.com/mastra-ai/mastra/commit/ed8fd75cbff03bb5e19971ddb30ab7040fc60447), [`c510833`](https://github.com/mastra-ai/mastra/commit/c5108333e8cbc19dafee5f8bfefbcb5ee935335c), [`c4c7dad`](https://github.com/mastra-ai/mastra/commit/c4c7dadfe2e4584f079f6c24bfabdb8c4981827f), [`b9a77b9`](https://github.com/mastra-ai/mastra/commit/b9a77b951fa6422077080b492cce74460d2f8fdd), [`45c3112`](https://github.com/mastra-ai/mastra/commit/45c31122666a0cc56b94727099fcb1871ed1b3f6), [`45c3112`](https://github.com/mastra-ai/mastra/commit/45c31122666a0cc56b94727099fcb1871ed1b3f6), [`7296fcc`](https://github.com/mastra-ai/mastra/commit/7296fcc599c876a68699a71c7054a16d5aaf2337), [`00c27f9`](https://github.com/mastra-ai/mastra/commit/00c27f9080731433230a61be69c44e39a7a7b4c7), [`5e7c287`](https://github.com/mastra-ai/mastra/commit/5e7c28701f2bce795dd5c811e4c3060bf2ea2242), [`7e17d3f`](https://github.com/mastra-ai/mastra/commit/7e17d3f656fdda2aad47c4beb8c491636d70820c), [`ee19c9b`](https://github.com/mastra-ai/mastra/commit/ee19c9ba3ec3ed91feb214ad539bdc766c53bb01)]:
+ - @mastra/core@1.12.0
+ - @mastra/mcp@1.2.0
+
+ ## 1.1.10-alpha.1
+
+ ### Patch Changes
+
+ - Updated dependencies [[`9cede11`](https://github.com/mastra-ai/mastra/commit/9cede110abac9d93072e0521bb3c8bcafb9fdadf), [`a59f126`](https://github.com/mastra-ai/mastra/commit/a59f1269104f54726699c5cdb98c72c93606d2df), [`c510833`](https://github.com/mastra-ai/mastra/commit/c5108333e8cbc19dafee5f8bfefbcb5ee935335c), [`7296fcc`](https://github.com/mastra-ai/mastra/commit/7296fcc599c876a68699a71c7054a16d5aaf2337), [`00c27f9`](https://github.com/mastra-ai/mastra/commit/00c27f9080731433230a61be69c44e39a7a7b4c7), [`ee19c9b`](https://github.com/mastra-ai/mastra/commit/ee19c9ba3ec3ed91feb214ad539bdc766c53bb01)]:
+ - @mastra/core@1.12.0-alpha.1
+ - @mastra/mcp@1.2.0-alpha.0
+
  ## 1.1.10-alpha.0
 
  ### Patch Changes
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@mastra/mcp-docs-server",
-   "version": "1.1.10-alpha.0",
+   "version": "1.1.10",
    "description": "MCP server for accessing Mastra.ai documentation, changelogs, and news.",
    "type": "module",
    "main": "dist/index.js",
@@ -29,8 +29,8 @@
      "jsdom": "^26.1.0",
      "local-pkg": "^1.1.2",
      "zod": "^4.3.6",
-     "@mastra/mcp": "^1.2.0-alpha.0",
-     "@mastra/core": "1.12.0-alpha.0"
+     "@mastra/core": "1.12.0",
+     "@mastra/mcp": "^1.2.0"
    },
    "devDependencies": {
      "@hono/node-server": "^1.19.9",
@@ -46,9 +46,9 @@
      "tsx": "^4.21.0",
      "typescript": "^5.9.3",
      "vitest": "4.0.18",
-     "@internal/types-builder": "0.0.42",
-     "@internal/lint": "0.0.67",
-     "@mastra/core": "1.12.0-alpha.0"
+     "@internal/lint": "0.0.68",
+     "@internal/types-builder": "0.0.43",
+     "@mastra/core": "1.12.0"
    },
    "homepage": "https://mastra.ai",
    "repository": {