@mastra/mcp-docs-server 0.13.39 → 1.0.0-beta.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (494)
  1. package/.docs/organized/changelogs/%40internal%2Fai-sdk-v4.md +1 -0
  2. package/.docs/organized/changelogs/%40internal%2Fchangeset-cli.md +0 -10
  3. package/.docs/organized/changelogs/%40internal%2Fexternal-types.md +0 -10
  4. package/.docs/organized/changelogs/%40internal%2Fstorage-test-utils.md +36 -36
  5. package/.docs/organized/changelogs/%40internal%2Ftypes-builder.md +0 -10
  6. package/.docs/organized/changelogs/%40mastra%2Fagent-builder.md +70 -70
  7. package/.docs/organized/changelogs/%40mastra%2Fai-sdk.md +40 -40
  8. package/.docs/organized/changelogs/%40mastra%2Fastra.md +19 -19
  9. package/.docs/organized/changelogs/%40mastra%2Fauth.md +4 -14
  10. package/.docs/organized/changelogs/%40mastra%2Fchroma.md +18 -18
  11. package/.docs/organized/changelogs/%40mastra%2Fclickhouse.md +199 -199
  12. package/.docs/organized/changelogs/%40mastra%2Fclient-js.md +220 -220
  13. package/.docs/organized/changelogs/%40mastra%2Fcloudflare-d1.md +190 -190
  14. package/.docs/organized/changelogs/%40mastra%2Fcloudflare.md +199 -199
  15. package/.docs/organized/changelogs/%40mastra%2Fcodemod.md +7 -0
  16. package/.docs/organized/changelogs/%40mastra%2Fcore.md +210 -210
  17. package/.docs/organized/changelogs/%40mastra%2Fcouchbase.md +16 -16
  18. package/.docs/organized/changelogs/%40mastra%2Fdeployer-cloud.md +69 -69
  19. package/.docs/organized/changelogs/%40mastra%2Fdeployer-cloudflare.md +67 -67
  20. package/.docs/organized/changelogs/%40mastra%2Fdeployer-netlify.md +70 -70
  21. package/.docs/organized/changelogs/%40mastra%2Fdeployer-vercel.md +67 -67
  22. package/.docs/organized/changelogs/%40mastra%2Fdeployer.md +209 -209
  23. package/.docs/organized/changelogs/%40mastra%2Fdynamodb.md +191 -191
  24. package/.docs/organized/changelogs/%40mastra%2Fevals.md +34 -34
  25. package/.docs/organized/changelogs/%40mastra%2Ffastembed.md +5 -13
  26. package/.docs/organized/changelogs/%40mastra%2Flance.md +182 -182
  27. package/.docs/organized/changelogs/%40mastra%2Flibsql.md +199 -199
  28. package/.docs/organized/changelogs/%40mastra%2Floggers.md +20 -20
  29. package/.docs/organized/changelogs/%40mastra%2Fmcp-docs-server.md +56 -56
  30. package/.docs/organized/changelogs/%40mastra%2Fmcp-registry-registry.md +20 -20
  31. package/.docs/organized/changelogs/%40mastra%2Fmcp.md +65 -65
  32. package/.docs/organized/changelogs/%40mastra%2Fmemory.md +228 -228
  33. package/.docs/organized/changelogs/%40mastra%2Fmongodb.md +199 -199
  34. package/.docs/organized/changelogs/%40mastra%2Fmssql.md +206 -206
  35. package/.docs/organized/changelogs/%40mastra%2Fopensearch.md +19 -19
  36. package/.docs/organized/changelogs/%40mastra%2Fpg.md +197 -197
  37. package/.docs/organized/changelogs/%40mastra%2Fpinecone.md +16 -16
  38. package/.docs/organized/changelogs/%40mastra%2Fplayground-ui.md +216 -216
  39. package/.docs/organized/changelogs/%40mastra%2Fqdrant.md +16 -16
  40. package/.docs/organized/changelogs/%40mastra%2Frag.md +61 -61
  41. package/.docs/organized/changelogs/%40mastra%2Freact.md +66 -66
  42. package/.docs/organized/changelogs/%40mastra%2Fs3vectors.md +9 -17
  43. package/.docs/organized/changelogs/%40mastra%2Fschema-compat.md +6 -30
  44. package/.docs/organized/changelogs/%40mastra%2Fserver.md +203 -203
  45. package/.docs/organized/changelogs/%40mastra%2Fturbopuffer.md +16 -16
  46. package/.docs/organized/changelogs/%40mastra%2Fupstash.md +190 -190
  47. package/.docs/organized/changelogs/%40mastra%2Fvectorize.md +18 -18
  48. package/.docs/organized/changelogs/%40mastra%2Fvoice-azure.md +21 -21
  49. package/.docs/organized/changelogs/%40mastra%2Fvoice-cloudflare.md +20 -20
  50. package/.docs/organized/changelogs/%40mastra%2Fvoice-deepgram.md +20 -20
  51. package/.docs/organized/changelogs/%40mastra%2Fvoice-elevenlabs.md +20 -20
  52. package/.docs/organized/changelogs/%40mastra%2Fvoice-gladia.md +20 -20
  53. package/.docs/organized/changelogs/%40mastra%2Fvoice-google-gemini-live.md +56 -56
  54. package/.docs/organized/changelogs/%40mastra%2Fvoice-google.md +20 -20
  55. package/.docs/organized/changelogs/%40mastra%2Fvoice-murf.md +20 -20
  56. package/.docs/organized/changelogs/%40mastra%2Fvoice-openai-realtime.md +56 -56
  57. package/.docs/organized/changelogs/%40mastra%2Fvoice-openai.md +20 -20
  58. package/.docs/organized/changelogs/%40mastra%2Fvoice-playai.md +20 -20
  59. package/.docs/organized/changelogs/%40mastra%2Fvoice-sarvam.md +20 -20
  60. package/.docs/organized/changelogs/%40mastra%2Fvoice-speechify.md +20 -20
  61. package/.docs/organized/changelogs/create-mastra.md +29 -29
  62. package/.docs/organized/changelogs/mastra.md +93 -93
  63. package/.docs/organized/code-examples/a2a.md +4 -2
  64. package/.docs/organized/code-examples/agui.md +12 -9
  65. package/.docs/organized/code-examples/ai-sdk-useChat.md +12 -18
  66. package/.docs/organized/code-examples/ai-sdk-v5.md +4 -2
  67. package/.docs/organized/code-examples/bird-checker-with-express.md +5 -4
  68. package/.docs/organized/code-examples/bird-checker-with-nextjs-and-eval.md +4 -3
  69. package/.docs/organized/code-examples/bird-checker-with-nextjs.md +4 -3
  70. package/.docs/organized/code-examples/client-side-tools.md +1 -0
  71. package/.docs/organized/code-examples/crypto-chatbot.md +1 -1
  72. package/.docs/organized/code-examples/experimental-auth-weather-agent.md +8 -177
  73. package/.docs/organized/code-examples/fireworks-r1.md +2 -2
  74. package/.docs/organized/code-examples/heads-up-game.md +10 -7
  75. package/.docs/organized/code-examples/mcp-configuration.md +5 -3
  76. package/.docs/organized/code-examples/mcp-registry-registry.md +3 -2
  77. package/.docs/organized/code-examples/memory-per-resource-example.md +4 -2
  78. package/.docs/organized/code-examples/memory-todo-agent.md +1 -0
  79. package/.docs/organized/code-examples/memory-with-context.md +2 -1
  80. package/.docs/organized/code-examples/memory-with-libsql.md +4 -2
  81. package/.docs/organized/code-examples/memory-with-mongodb.md +4 -2
  82. package/.docs/organized/code-examples/memory-with-pg.md +4 -2
  83. package/.docs/organized/code-examples/memory-with-processors.md +13 -8
  84. package/.docs/organized/code-examples/memory-with-upstash.md +5 -3
  85. package/.docs/organized/code-examples/openapi-spec-writer.md +32 -41
  86. package/.docs/organized/code-examples/quick-start.md +5 -32
  87. package/.docs/organized/code-examples/stock-price-tool.md +6 -5
  88. package/.docs/organized/code-examples/weather-agent.md +21 -16
  89. package/.docs/organized/code-examples/workflow-ai-recruiter.md +3 -2
  90. package/.docs/organized/code-examples/workflow-with-inline-steps.md +9 -12
  91. package/.docs/organized/code-examples/workflow-with-memory.md +16 -15
  92. package/.docs/organized/code-examples/workflow-with-separate-steps.md +2 -2
  93. package/.docs/organized/code-examples/workflow-with-suspend-resume.md +3 -2
  94. package/.docs/raw/agents/adding-voice.mdx +27 -22
  95. package/.docs/raw/agents/agent-memory.mdx +24 -16
  96. package/.docs/raw/agents/guardrails.mdx +33 -12
  97. package/.docs/raw/agents/networks.mdx +8 -4
  98. package/.docs/raw/agents/overview.mdx +23 -17
  99. package/.docs/raw/agents/using-tools.mdx +11 -8
  100. package/.docs/raw/auth/auth0.mdx +9 -9
  101. package/.docs/raw/auth/clerk.mdx +7 -7
  102. package/.docs/raw/auth/firebase.mdx +9 -9
  103. package/.docs/raw/auth/index.mdx +6 -6
  104. package/.docs/raw/auth/jwt.mdx +7 -7
  105. package/.docs/raw/auth/supabase.mdx +8 -8
  106. package/.docs/raw/auth/workos.mdx +9 -9
  107. package/.docs/raw/community/contributing-templates.mdx +3 -3
  108. package/.docs/raw/community/discord.mdx +1 -1
  109. package/.docs/raw/course/01-first-agent/03-verifying-installation.md +1 -1
  110. package/.docs/raw/course/01-first-agent/08-exporting-your-agent.md +2 -1
  111. package/.docs/raw/course/01-first-agent/16-adding-memory-to-agent.md +2 -1
  112. package/.docs/raw/course/02-agent-tools-mcp/02-installing-mcp.md +1 -1
  113. package/.docs/raw/course/02-agent-tools-mcp/31-enhancing-memory-configuration.md +2 -0
  114. package/.docs/raw/course/03-agent-memory/03-installing-memory.md +1 -1
  115. package/.docs/raw/course/03-agent-memory/04-creating-basic-memory-agent.md +1 -0
  116. package/.docs/raw/course/03-agent-memory/10-storage-configuration.md +2 -3
  117. package/.docs/raw/course/03-agent-memory/13-vector-store-configuration.md +2 -0
  118. package/.docs/raw/course/03-agent-memory/16-configuring-semantic-recall.md +2 -0
  119. package/.docs/raw/course/03-agent-memory/18-advanced-configuration-semantic-recall.md +1 -0
  120. package/.docs/raw/course/03-agent-memory/21-configuring-working-memory.md +2 -0
  121. package/.docs/raw/course/03-agent-memory/22-custom-working-memory-templates.md +1 -0
  122. package/.docs/raw/course/03-agent-memory/25-combining-memory-features.md +1 -0
  123. package/.docs/raw/course/03-agent-memory/27-creating-learning-assistant.md +1 -0
  124. package/.docs/raw/course/04-workflows/08-running-workflows-programmatically.md +2 -2
  125. package/.docs/raw/deployment/cloud-providers/amazon-ec2.mdx +6 -6
  126. package/.docs/raw/deployment/cloud-providers/aws-lambda.mdx +8 -6
  127. package/.docs/raw/deployment/cloud-providers/azure-app-services.mdx +5 -5
  128. package/.docs/raw/deployment/cloud-providers/digital-ocean.mdx +5 -5
  129. package/.docs/raw/deployment/cloud-providers/index.mdx +11 -8
  130. package/.docs/raw/deployment/monorepo.mdx +2 -2
  131. package/.docs/raw/deployment/overview.mdx +2 -2
  132. package/.docs/raw/deployment/server-deployment.mdx +2 -10
  133. package/.docs/raw/deployment/serverless-platforms/cloudflare-deployer.mdx +5 -5
  134. package/.docs/raw/deployment/serverless-platforms/index.mdx +10 -7
  135. package/.docs/raw/deployment/serverless-platforms/netlify-deployer.mdx +5 -5
  136. package/.docs/raw/deployment/serverless-platforms/vercel-deployer.mdx +5 -5
  137. package/.docs/raw/deployment/web-framework.mdx +8 -8
  138. package/.docs/raw/{scorers → evals}/custom-scorers.mdx +6 -6
  139. package/.docs/raw/evals/off-the-shelf-scorers.mdx +50 -0
  140. package/.docs/raw/{scorers → evals}/overview.mdx +9 -9
  141. package/.docs/raw/evals/running-in-ci.mdx +113 -0
  142. package/.docs/raw/frameworks/agentic-uis/ai-sdk.mdx +26 -25
  143. package/.docs/raw/frameworks/agentic-uis/assistant-ui.mdx +1 -1
  144. package/.docs/raw/frameworks/agentic-uis/copilotkit.mdx +17 -17
  145. package/.docs/raw/frameworks/agentic-uis/openrouter.mdx +4 -1
  146. package/.docs/raw/frameworks/servers/express.mdx +11 -10
  147. package/.docs/raw/frameworks/web-frameworks/astro.mdx +18 -18
  148. package/.docs/raw/frameworks/web-frameworks/next-js.mdx +7 -7
  149. package/.docs/raw/frameworks/web-frameworks/sveltekit.mdx +16 -16
  150. package/.docs/raw/frameworks/web-frameworks/vite-react.mdx +7 -7
  151. package/.docs/raw/getting-started/installation.mdx +26 -25
  152. package/.docs/raw/getting-started/mcp-docs-server.mdx +1 -1
  153. package/.docs/raw/getting-started/project-structure.mdx +4 -4
  154. package/.docs/raw/getting-started/studio.mdx +8 -8
  155. package/.docs/raw/getting-started/templates.mdx +6 -6
  156. package/.docs/raw/guides/guide/ai-recruiter.mdx +264 -0
  157. package/.docs/raw/guides/guide/chef-michel.mdx +271 -0
  158. package/.docs/raw/guides/guide/notes-mcp-server.mdx +450 -0
  159. package/.docs/raw/guides/guide/research-assistant.mdx +380 -0
  160. package/.docs/raw/guides/guide/stock-agent.mdx +185 -0
  161. package/.docs/raw/guides/guide/web-search.mdx +291 -0
  162. package/.docs/raw/guides/index.mdx +43 -0
  163. package/.docs/raw/guides/migrations/agentnetwork.mdx +114 -0
  164. package/.docs/raw/guides/migrations/upgrade-to-v1/_template.mdx +50 -0
  165. package/.docs/raw/guides/migrations/upgrade-to-v1/agent.mdx +265 -0
  166. package/.docs/raw/guides/migrations/upgrade-to-v1/cli.mdx +48 -0
  167. package/.docs/raw/guides/migrations/upgrade-to-v1/client.mdx +153 -0
  168. package/.docs/raw/guides/migrations/upgrade-to-v1/evals.mdx +230 -0
  169. package/.docs/raw/guides/migrations/upgrade-to-v1/mastra.mdx +171 -0
  170. package/.docs/raw/guides/migrations/upgrade-to-v1/mcp.mdx +114 -0
  171. package/.docs/raw/guides/migrations/upgrade-to-v1/memory.mdx +241 -0
  172. package/.docs/raw/guides/migrations/upgrade-to-v1/overview.mdx +83 -0
  173. package/.docs/raw/guides/migrations/upgrade-to-v1/processors.mdx +62 -0
  174. package/.docs/raw/guides/migrations/upgrade-to-v1/storage.mdx +270 -0
  175. package/.docs/raw/guides/migrations/upgrade-to-v1/tools.mdx +115 -0
  176. package/.docs/raw/guides/migrations/upgrade-to-v1/tracing.mdx +280 -0
  177. package/.docs/raw/guides/migrations/upgrade-to-v1/vectors.mdx +23 -0
  178. package/.docs/raw/guides/migrations/upgrade-to-v1/voice.mdx +39 -0
  179. package/.docs/raw/guides/migrations/upgrade-to-v1/workflows.mdx +178 -0
  180. package/.docs/raw/guides/migrations/vnext-to-standard-apis.mdx +367 -0
  181. package/.docs/raw/guides/quickstarts/nextjs.mdx +275 -0
  182. package/.docs/raw/index.mdx +9 -9
  183. package/.docs/raw/{observability/logging.mdx → logging.mdx} +4 -4
  184. package/.docs/raw/mastra-cloud/dashboard.mdx +2 -2
  185. package/.docs/raw/mastra-cloud/observability.mdx +6 -6
  186. package/.docs/raw/mastra-cloud/overview.mdx +2 -2
  187. package/.docs/raw/mastra-cloud/setting-up.mdx +4 -4
  188. package/.docs/raw/memory/conversation-history.mdx +1 -0
  189. package/.docs/raw/memory/memory-processors.mdx +4 -3
  190. package/.docs/raw/memory/overview.mdx +10 -6
  191. package/.docs/raw/memory/semantic-recall.mdx +13 -8
  192. package/.docs/raw/memory/storage/memory-with-libsql.mdx +12 -7
  193. package/.docs/raw/memory/storage/memory-with-pg.mdx +11 -6
  194. package/.docs/raw/memory/storage/memory-with-upstash.mdx +11 -6
  195. package/.docs/raw/memory/threads-and-resources.mdx +11 -13
  196. package/.docs/raw/memory/working-memory.mdx +30 -14
  197. package/.docs/raw/observability/overview.mdx +13 -30
  198. package/.docs/raw/observability/{ai-tracing → tracing}/exporters/arize.mdx +11 -19
  199. package/.docs/raw/observability/{ai-tracing → tracing}/exporters/braintrust.mdx +8 -17
  200. package/.docs/raw/observability/{ai-tracing → tracing}/exporters/cloud.mdx +11 -17
  201. package/.docs/raw/observability/{ai-tracing → tracing}/exporters/default.mdx +16 -20
  202. package/.docs/raw/observability/{ai-tracing → tracing}/exporters/langfuse.mdx +8 -17
  203. package/.docs/raw/observability/{ai-tracing → tracing}/exporters/langsmith.mdx +8 -17
  204. package/.docs/raw/observability/{ai-tracing → tracing}/exporters/otel.mdx +12 -21
  205. package/.docs/raw/observability/{ai-tracing → tracing}/overview.mdx +107 -142
  206. package/.docs/raw/observability/{ai-tracing → tracing}/processors/sensitive-data-filter.mdx +14 -13
  207. package/.docs/raw/rag/chunking-and-embedding.mdx +5 -5
  208. package/.docs/raw/rag/overview.mdx +3 -13
  209. package/.docs/raw/rag/retrieval.mdx +24 -12
  210. package/.docs/raw/rag/vector-databases.mdx +7 -1
  211. package/.docs/raw/reference/agents/agent.mdx +35 -30
  212. package/.docs/raw/reference/agents/generate.mdx +10 -10
  213. package/.docs/raw/reference/agents/generateLegacy.mdx +8 -8
  214. package/.docs/raw/reference/agents/getDefaultGenerateOptions.mdx +21 -15
  215. package/.docs/raw/reference/agents/getDefaultOptions.mdx +69 -0
  216. package/.docs/raw/reference/agents/getDefaultStreamOptions.mdx +22 -16
  217. package/.docs/raw/reference/agents/getDescription.mdx +1 -1
  218. package/.docs/raw/reference/agents/getInstructions.mdx +8 -8
  219. package/.docs/raw/reference/agents/getLLM.mdx +9 -9
  220. package/.docs/raw/reference/agents/getMemory.mdx +9 -9
  221. package/.docs/raw/reference/agents/getModel.mdx +10 -10
  222. package/.docs/raw/reference/agents/getVoice.mdx +8 -8
  223. package/.docs/raw/reference/agents/listAgents.mdx +9 -9
  224. package/.docs/raw/reference/agents/listScorers.mdx +7 -7
  225. package/.docs/raw/reference/agents/listTools.mdx +7 -7
  226. package/.docs/raw/reference/agents/listWorkflows.mdx +7 -7
  227. package/.docs/raw/reference/agents/network.mdx +11 -10
  228. package/.docs/raw/reference/auth/auth0.mdx +4 -4
  229. package/.docs/raw/reference/auth/clerk.mdx +4 -4
  230. package/.docs/raw/reference/auth/firebase.mdx +6 -6
  231. package/.docs/raw/reference/auth/jwt.mdx +4 -4
  232. package/.docs/raw/reference/auth/supabase.mdx +4 -4
  233. package/.docs/raw/reference/auth/workos.mdx +4 -4
  234. package/.docs/raw/reference/cli/create-mastra.mdx +10 -10
  235. package/.docs/raw/reference/cli/mastra.mdx +7 -7
  236. package/.docs/raw/reference/client-js/agents.mdx +6 -2
  237. package/.docs/raw/reference/client-js/mastra-client.mdx +7 -7
  238. package/.docs/raw/reference/client-js/memory.mdx +24 -16
  239. package/.docs/raw/reference/client-js/observability.mdx +11 -11
  240. package/.docs/raw/reference/client-js/workflows.mdx +6 -34
  241. package/.docs/raw/reference/core/getAgent.mdx +1 -1
  242. package/.docs/raw/reference/core/getAgentById.mdx +1 -1
  243. package/.docs/raw/reference/core/getDeployer.mdx +2 -2
  244. package/.docs/raw/reference/core/getLogger.mdx +2 -2
  245. package/.docs/raw/reference/core/getMCPServer.mdx +31 -15
  246. package/.docs/raw/reference/core/getMCPServerById.mdx +81 -0
  247. package/.docs/raw/reference/core/getScorer.mdx +3 -3
  248. package/.docs/raw/reference/core/getScorerById.mdx +79 -0
  249. package/.docs/raw/reference/core/getServer.mdx +2 -2
  250. package/.docs/raw/reference/core/getStorage.mdx +2 -2
  251. package/.docs/raw/reference/core/getTelemetry.mdx +2 -2
  252. package/.docs/raw/reference/core/getVector.mdx +2 -2
  253. package/.docs/raw/reference/core/getWorkflow.mdx +1 -1
  254. package/.docs/raw/reference/core/listAgents.mdx +1 -1
  255. package/.docs/raw/reference/core/listLogs.mdx +2 -2
  256. package/.docs/raw/reference/core/listLogsByRunId.mdx +2 -2
  257. package/.docs/raw/reference/core/listMCPServers.mdx +65 -0
  258. package/.docs/raw/reference/core/listScorers.mdx +3 -3
  259. package/.docs/raw/reference/core/listVectors.mdx +36 -0
  260. package/.docs/raw/reference/core/listWorkflows.mdx +6 -6
  261. package/.docs/raw/reference/core/mastra-class.mdx +3 -2
  262. package/.docs/raw/reference/core/setLogger.mdx +2 -2
  263. package/.docs/raw/reference/core/setStorage.mdx +3 -2
  264. package/.docs/raw/reference/core/setTelemetry.mdx +2 -2
  265. package/.docs/raw/reference/deployer/cloudflare.mdx +2 -2
  266. package/.docs/raw/reference/deployer/deployer.mdx +0 -6
  267. package/.docs/raw/reference/deployer/netlify.mdx +2 -2
  268. package/.docs/raw/reference/deployer/vercel.mdx +3 -3
  269. package/.docs/raw/reference/evals/answer-relevancy.mdx +164 -126
  270. package/.docs/raw/reference/{scorers → evals}/answer-similarity.mdx +27 -27
  271. package/.docs/raw/reference/evals/bias.mdx +149 -115
  272. package/.docs/raw/reference/evals/completeness.mdx +148 -117
  273. package/.docs/raw/reference/evals/content-similarity.mdx +126 -113
  274. package/.docs/raw/reference/evals/context-precision.mdx +290 -133
  275. package/.docs/raw/reference/{scorers → evals}/context-relevance.mdx +6 -6
  276. package/.docs/raw/reference/{scorers → evals}/create-scorer.mdx +69 -60
  277. package/.docs/raw/reference/evals/faithfulness.mdx +163 -121
  278. package/.docs/raw/reference/evals/hallucination.mdx +159 -132
  279. package/.docs/raw/reference/evals/keyword-coverage.mdx +169 -125
  280. package/.docs/raw/reference/{scorers → evals}/mastra-scorer.mdx +7 -5
  281. package/.docs/raw/reference/{scorers → evals}/noise-sensitivity.mdx +9 -9
  282. package/.docs/raw/reference/evals/prompt-alignment.mdx +604 -182
  283. package/.docs/raw/reference/{scorers/run-experiment.mdx → evals/run-evals.mdx} +17 -18
  284. package/.docs/raw/reference/evals/textual-difference.mdx +149 -117
  285. package/.docs/raw/reference/evals/tone-consistency.mdx +149 -125
  286. package/.docs/raw/reference/{scorers → evals}/tool-call-accuracy.mdx +8 -6
  287. package/.docs/raw/reference/evals/toxicity.mdx +152 -96
  288. package/.docs/raw/reference/{observability/logging → logging}/pino-logger.mdx +2 -2
  289. package/.docs/raw/reference/memory/createThread.mdx +5 -5
  290. package/.docs/raw/reference/memory/deleteMessages.mdx +7 -7
  291. package/.docs/raw/reference/memory/getThreadById.mdx +4 -4
  292. package/.docs/raw/reference/memory/listThreadsByResourceId.mdx +110 -0
  293. package/.docs/raw/reference/memory/memory-class.mdx +13 -9
  294. package/.docs/raw/reference/memory/query.mdx +58 -57
  295. package/.docs/raw/reference/memory/recall.mdx +185 -0
  296. package/.docs/raw/reference/observability/tracing/configuration.mdx +245 -0
  297. package/.docs/raw/reference/observability/{ai-tracing → tracing}/exporters/arize.mdx +13 -13
  298. package/.docs/raw/reference/observability/{ai-tracing → tracing}/exporters/braintrust.mdx +11 -8
  299. package/.docs/raw/reference/observability/{ai-tracing → tracing}/exporters/cloud-exporter.mdx +21 -19
  300. package/.docs/raw/reference/observability/{ai-tracing → tracing}/exporters/console-exporter.mdx +49 -17
  301. package/.docs/raw/reference/observability/{ai-tracing → tracing}/exporters/default-exporter.mdx +42 -41
  302. package/.docs/raw/reference/observability/{ai-tracing → tracing}/exporters/langfuse.mdx +10 -7
  303. package/.docs/raw/reference/observability/{ai-tracing → tracing}/exporters/langsmith.mdx +10 -7
  304. package/.docs/raw/reference/observability/{ai-tracing → tracing}/exporters/otel.mdx +5 -5
  305. package/.docs/raw/reference/observability/tracing/instances.mdx +168 -0
  306. package/.docs/raw/reference/observability/{ai-tracing → tracing}/interfaces.mdx +115 -89
  307. package/.docs/raw/reference/observability/{ai-tracing → tracing}/processors/sensitive-data-filter.mdx +3 -3
  308. package/.docs/raw/reference/observability/{ai-tracing/span.mdx → tracing/spans.mdx} +59 -41
  309. package/.docs/raw/reference/processors/batch-parts-processor.mdx +9 -3
  310. package/.docs/raw/reference/processors/language-detector.mdx +9 -3
  311. package/.docs/raw/reference/processors/moderation-processor.mdx +9 -3
  312. package/.docs/raw/reference/processors/pii-detector.mdx +9 -3
  313. package/.docs/raw/reference/processors/prompt-injection-detector.mdx +9 -3
  314. package/.docs/raw/reference/processors/system-prompt-scrubber.mdx +9 -3
  315. package/.docs/raw/reference/processors/token-limiter-processor.mdx +9 -3
  316. package/.docs/raw/reference/processors/unicode-normalizer.mdx +9 -3
  317. package/.docs/raw/reference/rag/chunk.mdx +1 -8
  318. package/.docs/raw/reference/rag/database-config.mdx +7 -7
  319. package/.docs/raw/reference/rag/metadata-filters.mdx +14 -11
  320. package/.docs/raw/reference/storage/cloudflare-d1.mdx +1 -1
  321. package/.docs/raw/reference/storage/cloudflare.mdx +1 -1
  322. package/.docs/raw/reference/storage/dynamodb.mdx +3 -3
  323. package/.docs/raw/reference/storage/lance.mdx +1 -1
  324. package/.docs/raw/reference/storage/libsql.mdx +3 -1
  325. package/.docs/raw/reference/storage/mongodb.mdx +1 -1
  326. package/.docs/raw/reference/storage/mssql.mdx +6 -1
  327. package/.docs/raw/reference/storage/postgresql.mdx +7 -1
  328. package/.docs/raw/reference/storage/upstash.mdx +2 -1
  329. package/.docs/raw/reference/streaming/agents/stream.mdx +12 -12
  330. package/.docs/raw/reference/streaming/agents/streamLegacy.mdx +8 -8
  331. package/.docs/raw/reference/streaming/workflows/observeStream.mdx +3 -3
  332. package/.docs/raw/reference/streaming/workflows/observeStreamVNext.mdx +3 -3
  333. package/.docs/raw/reference/streaming/workflows/resumeStreamVNext.mdx +6 -6
  334. package/.docs/raw/reference/streaming/workflows/stream.mdx +10 -10
  335. package/.docs/raw/reference/streaming/workflows/streamVNext.mdx +11 -11
  336. package/.docs/raw/reference/templates/overview.mdx +3 -3
  337. package/.docs/raw/reference/tools/create-tool.mdx +52 -35
  338. package/.docs/raw/reference/tools/graph-rag-tool.mdx +15 -15
  339. package/.docs/raw/reference/tools/mcp-client.mdx +1 -1
  340. package/.docs/raw/reference/tools/mcp-server.mdx +119 -35
  341. package/.docs/raw/reference/tools/vector-query-tool.mdx +27 -26
  342. package/.docs/raw/reference/vectors/couchbase.mdx +8 -2
  343. package/.docs/raw/reference/vectors/libsql.mdx +2 -1
  344. package/.docs/raw/reference/vectors/mongodb.mdx +7 -1
  345. package/.docs/raw/reference/vectors/pg.mdx +3 -0
  346. package/.docs/raw/reference/vectors/s3vectors.mdx +1 -1
  347. package/.docs/raw/reference/vectors/upstash.mdx +1 -0
  348. package/.docs/raw/reference/voice/google-gemini-live.mdx +1 -1
  349. package/.docs/raw/reference/voice/voice.addTools.mdx +3 -3
  350. package/.docs/raw/reference/workflows/run-methods/cancel.mdx +4 -4
  351. package/.docs/raw/reference/workflows/run-methods/resume.mdx +14 -14
  352. package/.docs/raw/reference/workflows/run-methods/start.mdx +17 -17
  353. package/.docs/raw/reference/workflows/run.mdx +1 -8
  354. package/.docs/raw/reference/workflows/step.mdx +5 -5
  355. package/.docs/raw/reference/workflows/workflow-methods/branch.mdx +2 -2
  356. package/.docs/raw/reference/workflows/workflow-methods/commit.mdx +1 -1
  357. package/.docs/raw/reference/workflows/workflow-methods/create-run.mdx +7 -13
  358. package/.docs/raw/reference/workflows/workflow-methods/dountil.mdx +1 -1
  359. package/.docs/raw/reference/workflows/workflow-methods/dowhile.mdx +1 -1
  360. package/.docs/raw/reference/workflows/workflow-methods/foreach.mdx +1 -1
  361. package/.docs/raw/reference/workflows/workflow-methods/map.mdx +5 -0
  362. package/.docs/raw/reference/workflows/workflow-methods/parallel.mdx +1 -1
  363. package/.docs/raw/reference/workflows/workflow-methods/sendEvent.mdx +2 -2
  364. package/.docs/raw/reference/workflows/workflow-methods/sleep.mdx +1 -1
  365. package/.docs/raw/reference/workflows/workflow-methods/sleepUntil.mdx +1 -1
  366. package/.docs/raw/reference/workflows/workflow-methods/then.mdx +1 -1
  367. package/.docs/raw/reference/workflows/workflow-methods/waitForEvent.mdx +1 -1
  368. package/.docs/raw/reference/workflows/workflow.mdx +1 -1
  369. package/.docs/raw/server-db/custom-api-routes.mdx +2 -2
  370. package/.docs/raw/server-db/mastra-client.mdx +23 -22
  371. package/.docs/raw/server-db/middleware.mdx +7 -7
  372. package/.docs/raw/server-db/production-server.mdx +4 -4
  373. package/.docs/raw/server-db/{runtime-context.mdx → request-context.mdx} +46 -45
  374. package/.docs/raw/server-db/storage.mdx +29 -21
  375. package/.docs/raw/streaming/events.mdx +3 -3
  376. package/.docs/raw/streaming/overview.mdx +5 -5
  377. package/.docs/raw/streaming/tool-streaming.mdx +18 -17
  378. package/.docs/raw/streaming/workflow-streaming.mdx +1 -1
  379. package/.docs/raw/tools-mcp/advanced-usage.mdx +5 -4
  380. package/.docs/raw/tools-mcp/mcp-overview.mdx +33 -20
  381. package/.docs/raw/tools-mcp/overview.mdx +11 -11
  382. package/.docs/raw/voice/overview.mdx +63 -43
  383. package/.docs/raw/voice/speech-to-speech.mdx +5 -3
  384. package/.docs/raw/voice/speech-to-text.mdx +10 -9
  385. package/.docs/raw/voice/text-to-speech.mdx +13 -12
  386. package/.docs/raw/workflows/agents-and-tools.mdx +9 -5
  387. package/.docs/raw/workflows/control-flow.mdx +3 -3
  388. package/.docs/raw/workflows/error-handling.mdx +2 -21
  389. package/.docs/raw/workflows/human-in-the-loop.mdx +7 -4
  390. package/.docs/raw/workflows/inngest-workflow.mdx +3 -3
  391. package/.docs/raw/workflows/input-data-mapping.mdx +107 -0
  392. package/.docs/raw/workflows/overview.mdx +17 -16
  393. package/.docs/raw/workflows/snapshots.mdx +13 -11
  394. package/.docs/raw/workflows/suspend-and-resume.mdx +23 -15
  395. package/CHANGELOG.md +55 -53
  396. package/README.md +11 -2
  397. package/dist/{chunk-TUAHUTTB.js → chunk-5NJC7NRO.js} +3 -0
  398. package/dist/index.d.ts.map +1 -1
  399. package/dist/prepare-docs/copy-raw.d.ts.map +1 -1
  400. package/dist/prepare-docs/prepare.js +1 -1
  401. package/dist/prompts/migration.d.ts +6 -0
  402. package/dist/prompts/migration.d.ts.map +1 -0
  403. package/dist/stdio.js +402 -30
  404. package/dist/tools/migration.d.ts +40 -0
  405. package/dist/tools/migration.d.ts.map +1 -0
  406. package/package.json +8 -12
  407. package/.docs/organized/changelogs/%40mastra%2Fcloud.md +0 -302
  408. package/.docs/raw/observability/nextjs-tracing.mdx +0 -109
  409. package/.docs/raw/observability/otel-tracing.mdx +0 -189
  410. package/.docs/raw/reference/agents/getScorers.mdx +0 -69
  411. package/.docs/raw/reference/agents/getTools.mdx +0 -69
  412. package/.docs/raw/reference/agents/getWorkflows.mdx +0 -69
  413. package/.docs/raw/reference/client-js/workflows-legacy.mdx +0 -143
  414. package/.docs/raw/reference/core/getAgents.mdx +0 -35
  415. package/.docs/raw/reference/core/getLogs.mdx +0 -96
  416. package/.docs/raw/reference/core/getLogsByRunId.mdx +0 -87
  417. package/.docs/raw/reference/core/getMCPServers.mdx +0 -36
  418. package/.docs/raw/reference/core/getMemory.mdx +0 -36
  419. package/.docs/raw/reference/core/getScorerByName.mdx +0 -78
  420. package/.docs/raw/reference/core/getScorers.mdx +0 -43
  421. package/.docs/raw/reference/core/getVectors.mdx +0 -36
  422. package/.docs/raw/reference/core/getWorkflows.mdx +0 -45
  423. package/.docs/raw/reference/evals/context-position.mdx +0 -197
  424. package/.docs/raw/reference/evals/context-relevancy.mdx +0 -196
  425. package/.docs/raw/reference/evals/contextual-recall.mdx +0 -196
  426. package/.docs/raw/reference/evals/summarization.mdx +0 -212
  427. package/.docs/raw/reference/legacyWorkflows/after.mdx +0 -89
  428. package/.docs/raw/reference/legacyWorkflows/afterEvent.mdx +0 -79
  429. package/.docs/raw/reference/legacyWorkflows/commit.mdx +0 -33
  430. package/.docs/raw/reference/legacyWorkflows/createRun.mdx +0 -76
  431. package/.docs/raw/reference/legacyWorkflows/else.mdx +0 -68
  432. package/.docs/raw/reference/legacyWorkflows/events.mdx +0 -305
  433. package/.docs/raw/reference/legacyWorkflows/execute.mdx +0 -110
  434. package/.docs/raw/reference/legacyWorkflows/if.mdx +0 -108
  435. package/.docs/raw/reference/legacyWorkflows/resume.mdx +0 -158
  436. package/.docs/raw/reference/legacyWorkflows/resumeWithEvent.mdx +0 -133
  437. package/.docs/raw/reference/legacyWorkflows/snapshots.mdx +0 -207
  438. package/.docs/raw/reference/legacyWorkflows/start.mdx +0 -87
  439. package/.docs/raw/reference/legacyWorkflows/step-class.mdx +0 -100
  440. package/.docs/raw/reference/legacyWorkflows/step-condition.mdx +0 -137
  441. package/.docs/raw/reference/legacyWorkflows/step-function.mdx +0 -93
  442. package/.docs/raw/reference/legacyWorkflows/step-options.mdx +0 -69
  443. package/.docs/raw/reference/legacyWorkflows/step-retries.mdx +0 -196
  444. package/.docs/raw/reference/legacyWorkflows/suspend.mdx +0 -70
  445. package/.docs/raw/reference/legacyWorkflows/then.mdx +0 -72
  446. package/.docs/raw/reference/legacyWorkflows/until.mdx +0 -168
  447. package/.docs/raw/reference/legacyWorkflows/watch.mdx +0 -124
  448. package/.docs/raw/reference/legacyWorkflows/while.mdx +0 -168
  449. package/.docs/raw/reference/legacyWorkflows/workflow.mdx +0 -234
  450. package/.docs/raw/reference/memory/getThreadsByResourceId.mdx +0 -79
  451. package/.docs/raw/reference/memory/getThreadsByResourceIdPaginated.mdx +0 -110
  452. package/.docs/raw/reference/observability/ai-tracing/ai-tracing.mdx +0 -185
  453. package/.docs/raw/reference/observability/ai-tracing/configuration.mdx +0 -238
  454. package/.docs/raw/reference/observability/otel-tracing/otel-config.mdx +0 -117
  455. package/.docs/raw/reference/observability/otel-tracing/providers/arize-ax.mdx +0 -81
  456. package/.docs/raw/reference/observability/otel-tracing/providers/arize-phoenix.mdx +0 -121
  457. package/.docs/raw/reference/observability/otel-tracing/providers/braintrust.mdx +0 -40
  458. package/.docs/raw/reference/observability/otel-tracing/providers/dash0.mdx +0 -40
  459. package/.docs/raw/reference/observability/otel-tracing/providers/index.mdx +0 -20
  460. package/.docs/raw/reference/observability/otel-tracing/providers/keywordsai.mdx +0 -73
  461. package/.docs/raw/reference/observability/otel-tracing/providers/laminar.mdx +0 -41
  462. package/.docs/raw/reference/observability/otel-tracing/providers/langfuse.mdx +0 -84
  463. package/.docs/raw/reference/observability/otel-tracing/providers/langsmith.mdx +0 -48
  464. package/.docs/raw/reference/observability/otel-tracing/providers/langwatch.mdx +0 -43
  465. package/.docs/raw/reference/observability/otel-tracing/providers/new-relic.mdx +0 -40
  466. package/.docs/raw/reference/observability/otel-tracing/providers/signoz.mdx +0 -40
  467. package/.docs/raw/reference/observability/otel-tracing/providers/traceloop.mdx +0 -40
  468. package/.docs/raw/reference/scorers/answer-relevancy.mdx +0 -227
  469. package/.docs/raw/reference/scorers/bias.mdx +0 -228
  470. package/.docs/raw/reference/scorers/completeness.mdx +0 -214
  471. package/.docs/raw/reference/scorers/content-similarity.mdx +0 -197
  472. package/.docs/raw/reference/scorers/context-precision.mdx +0 -352
  473. package/.docs/raw/reference/scorers/faithfulness.mdx +0 -241
  474. package/.docs/raw/reference/scorers/hallucination.mdx +0 -252
  475. package/.docs/raw/reference/scorers/keyword-coverage.mdx +0 -229
  476. package/.docs/raw/reference/scorers/prompt-alignment.mdx +0 -668
  477. package/.docs/raw/reference/scorers/textual-difference.mdx +0 -203
  478. package/.docs/raw/reference/scorers/tone-consistency.mdx +0 -211
  479. package/.docs/raw/reference/scorers/toxicity.mdx +0 -228
  480. package/.docs/raw/reference/workflows/run-methods/watch.mdx +0 -73
  481. package/.docs/raw/scorers/evals-old-api/custom-eval.mdx +0 -24
  482. package/.docs/raw/scorers/evals-old-api/overview.mdx +0 -106
  483. package/.docs/raw/scorers/evals-old-api/running-in-ci.mdx +0 -85
  484. package/.docs/raw/scorers/evals-old-api/textual-evals.mdx +0 -58
  485. package/.docs/raw/scorers/off-the-shelf-scorers.mdx +0 -50
  486. package/.docs/raw/workflows-legacy/control-flow.mdx +0 -774
  487. package/.docs/raw/workflows-legacy/dynamic-workflows.mdx +0 -239
  488. package/.docs/raw/workflows-legacy/error-handling.mdx +0 -187
  489. package/.docs/raw/workflows-legacy/nested-workflows.mdx +0 -360
  490. package/.docs/raw/workflows-legacy/overview.mdx +0 -182
  491. package/.docs/raw/workflows-legacy/runtime-variables.mdx +0 -156
  492. package/.docs/raw/workflows-legacy/steps.mdx +0 -115
  493. package/.docs/raw/workflows-legacy/suspend-and-resume.mdx +0 -406
  494. package/.docs/raw/workflows-legacy/variables.mdx +0 -318
@@ -1,73 +0,0 @@
1
- ---
2
- title: "Reference: Run.watch() | Workflows | Mastra Docs"
3
- description: Documentation for the `Run.watch()` method in workflows, which allows you to monitor the execution of a workflow run.
4
- ---
5
-
6
- # Run.watch()
7
-
8
- The `.watch()` method allows you to monitor the execution of a workflow run, providing real-time updates on the status of steps.
9
-
10
- ## Usage example
11
-
12
- ```typescript showLineNumbers copy
13
- const run = await workflow.createRunAsync();
14
-
15
- run.watch((event) => {
16
-   console.log(event?.payload?.currentStep?.id);
17
- });
18
-
19
- const result = await run.start({ inputData: { value: "initial data" } });
20
- ```
21
-
22
- ## Parameters
23
-
24
- <PropertiesTable
25
- content={[
26
- {
27
- name: "callback",
28
- type: "(event: WatchEvent) => void",
29
- description:
30
- "A callback function that is called whenever a step is completed or the workflow state changes. The event parameter contains: type ('watch'), payload (currentStep and workflowState), and eventTimestamp",
31
- isOptional: false,
32
- },
33
- {
34
- name: "type",
35
- type: "'watch' | 'watch-v2'",
36
- description:
37
- "The type of watch events to listen for. 'watch' for step completion events, 'watch-v2' for data stream events",
38
- isOptional: true,
39
- defaultValue: "'watch'",
40
- },
41
- ]}
42
- />
43
-
44
- ## Returns
45
-
46
- <PropertiesTable
47
- content={[
48
- {
49
- name: "unwatch",
50
- type: "() => void",
51
- description:
52
- "A function that can be called to stop watching the workflow run",
53
- },
54
- ]}
55
- />
56
-
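The callback-in, `unwatch`-out contract described in the removed reference can be illustrated with a small self-contained sketch. Everything below (`SketchRun`, `completeStep`, the `WatchEvent` shape, and the numeric timestamp) is a hypothetical stand-in written for illustration; it is not Mastra's implementation and imports nothing from Mastra.

```typescript
// Hypothetical stand-in types; these are NOT imported from Mastra.
type WatchEvent = {
  type: "watch";
  payload: { currentStep?: { id: string } };
  eventTimestamp: number; // assumption: the real timestamp type may differ
};

type WatchCallback = (event: WatchEvent) => void;

class SketchRun {
  private listeners = new Set<WatchCallback>();

  // Registers a callback and returns a function that removes it again.
  watch(callback: WatchCallback): () => void {
    this.listeners.add(callback);
    return () => {
      this.listeners.delete(callback);
    };
  }

  // Hypothetical helper that simulates a step completing.
  completeStep(id: string): void {
    const event: WatchEvent = {
      type: "watch",
      payload: { currentStep: { id } },
      eventTimestamp: Date.now(),
    };
    this.listeners.forEach((listener) => listener(event));
  }
}

const seenSteps: string[] = [];
const sketchRun = new SketchRun();
const unwatch = sketchRun.watch((event) => {
  const id = event.payload.currentStep?.id;
  if (id !== undefined) seenSteps.push(id);
});

sketchRun.completeStep("step-1");
unwatch();
sketchRun.completeStep("step-2"); // not recorded: the callback was removed
```

Returning the removal function from `watch` keeps the subscription and its teardown together, so a caller does not need to hold a reference to the original callback to stop listening.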
57
- ## Extended usage example
58
-
59
- ```typescript showLineNumbers copy
60
- const run = await workflow.createRunAsync();
61
-
62
- run.watch((event) => {
63
-   console.log(event?.payload?.currentStep?.id);
64
- }, "watch");
65
-
66
- const result = await run.start({ inputData: { value: "initial data" } });
67
- ```
68
-
69
- ## Related
70
-
71
- - [Workflows overview](/docs/workflows/overview#running-workflows)
72
- - [Workflow.createRunAsync()](../workflow-methods/create-run)
73
- - [Watch Workflow](/docs/workflows/overview)
@@ -1,24 +0,0 @@
1
- ---
2
- title: "Create a Custom Eval | Scorers | Mastra Docs"
3
- description: "Mastra allows you to create your own evals, here is how."
4
- ---
5
-
6
- # Create a Custom Eval
7
-
8
- :::info Scorers
9
- This documentation refers to the legacy evals API. For the latest scorer features, see [Scorers](/docs/scorers/overview).
10
- :::
11
-
12
- Create a custom eval by extending the `Metric` class and implementing the `measure` method. This gives you full control over how scores are calculated and what information is returned. For LLM-based evaluations, extend the `MastraAgentJudge` class to define how the model reasons and scores output.
13
-
14
- ## Native JavaScript evaluation
15
-
16
- You can write lightweight custom metrics using plain JavaScript/TypeScript. These are ideal for simple string comparisons, pattern checks, or other rule-based logic.
17
-
18
- See our [Word Inclusion example](/examples/evals/custom-native-javascript-eval), which scores responses based on the number of reference words found in the output.
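As a rough illustration of the rule-based approach described above, the sketch below scores an output by the fraction of reference words it contains. It is self-contained and does not extend the legacy `Metric` class: `WordInclusionSketch`, `SketchMetricResult`, and the synchronous `measure(input, output)` signature are assumptions made for this example, not the library's API.

```typescript
// Hypothetical result shape; the legacy `Metric` base class is not imported here.
interface SketchMetricResult {
  score: number;
  info: { totalWords: number; matchedWords: number };
}

class WordInclusionSketch {
  private readonly referenceWords: string[];

  constructor(referenceWords: string[]) {
    this.referenceWords = referenceWords;
  }

  // Assumed signature: the input is unused, only the output is inspected.
  measure(_input: string, output: string): SketchMetricResult {
    const haystack = output.toLowerCase();
    // Case-insensitive substring match for each reference word.
    const matched = this.referenceWords.filter((word) =>
      haystack.includes(word.toLowerCase()),
    );
    const totalWords = this.referenceWords.length;
    // An empty reference list scores 0 rather than dividing by zero.
    const score = totalWords === 0 ? 0 : matched.length / totalWords;
    return { score, info: { totalWords, matchedWords: matched.length } };
  }
}

const fruitMetric = new WordInclusionSketch(["apple", "banana", "cherry"]);
const fruitResult = fruitMetric.measure("", "I like Apple and banana.");
```

Here two of the three reference words appear in the output, so `fruitResult.score` is two thirds. Substring matching is the simplest choice; a stricter variant could tokenize the output and compare whole words.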
19
-
20
- ## LLM as a judge evaluation
21
-
22
- For more complex evaluations, you can build a judge powered by an LLM. This lets you capture more nuanced criteria, like factual accuracy, tone, or reasoning.
23
-
24
- See the [Real World Countries example](/examples/evals/custom-llm-judge-eval) for a complete walkthrough of building a custom judge and metric that evaluates real-world factual accuracy.
@@ -1,106 +0,0 @@
1
- ---
2
- title: "Testing your agents with evals | Scorers | Mastra Docs"
3
- description: "Understanding how to evaluate and measure AI agent quality using Mastra evals."
4
- ---
5
-
6
- # Testing your agents with evals
7
-
8
- :::info Scorers
9
- This documentation refers to the legacy evals API. For the latest scorer features, see [Scorers](/docs/scorers/overview).
10
- :::
11
-
12
- While traditional software tests have clear pass/fail conditions, AI outputs are non-deterministic — they can vary with the same input. Evals help bridge this gap by providing quantifiable metrics for measuring agent quality.
13
-
14
- Evals are automated tests that evaluate agent outputs using model-graded, rule-based, and statistical methods. Each eval returns a normalized score between 0 and 1 that can be logged and compared. Evals can be customized with your own prompts and scoring functions.
15
-
16
- Evals can run in the cloud, capturing real-time results, or as part of your CI/CD pipeline, allowing you to test and monitor your agents over time.
17
-
18
- ## Types of Evals
19
-
20
- There are different kinds of evals, each serving a specific purpose. Here are some common types:
21
-
22
- 1. **Textual Evals**: Evaluate accuracy, reliability, and context understanding of agent responses
23
- 2. **Classification Evals**: Measure accuracy in categorizing data based on predefined categories
24
- 3. **Prompt Engineering Evals**: Explore impact of different instructions and input formats
25
-
26
- ## Installation
27
-
28
- To access Mastra's evals feature, install the `@mastra/evals` package.
29
-
30
- ```bash copy
31
- npm install @mastra/evals@latest
32
- ```
33
-
34
- ## Getting Started
35
-
36
- Evals need to be added to an agent. Here's an example using the summarization, content similarity, and tone consistency metrics:
37
-
38
- ```typescript copy showLineNumbers title="src/mastra/agents/index.ts"
39
- import { Agent } from "@mastra/core/agent";
40
- import { openai } from "@ai-sdk/openai";
41
- import { SummarizationMetric } from "@mastra/evals/llm";
42
- import {
43
-   ContentSimilarityMetric,
44
-   ToneConsistencyMetric,
45
- } from "@mastra/evals/nlp";
46
-
47
- const model = openai("gpt-4o");
48
-
49
- export const myAgent = new Agent({
50
-   name: "ContentWriter",
51
-   instructions: "You are a content writer that creates accurate summaries",
52
-   model,
53
-   evals: {
54
-     summarization: new SummarizationMetric(model),
55
-     contentSimilarity: new ContentSimilarityMetric(),
56
-     tone: new ToneConsistencyMetric(),
57
-   },
58
- });
59
- ```
60
-
61
- You can view eval results in the Mastra dashboard when using `mastra dev`.
62
-
63
- ## Beyond Automated Testing
64
-
65
- While automated evals are valuable, high-performing AI teams often combine them with:
66
-
67
- 1. **A/B Testing**: Compare different versions with real users
68
- 2. **Human Review**: Regular review of production data and traces
69
- 3. **Continuous Monitoring**: Track eval metrics over time to detect regressions
70
-
71
- ## Understanding Eval Results
72
-
73
- Each eval metric measures a specific aspect of your agent's output. Here's how to interpret and improve your results:
74
-
75
- ### Understanding Scores
76
-
77
- For any metric:
78
-
79
- 1. Check the metric documentation to understand the scoring process
80
- 2. Look for patterns in when scores change
81
- 3. Compare scores across different inputs and contexts
82
- 4. Track changes over time to spot trends
83
-
84
- ### Improving Results
85
-
86
- When scores aren't meeting your targets:
87
-
88
- 1. Check your instructions - Are they clear? Try making them more specific
89
- 2. Look at your context - Is it giving the agent what it needs?
90
- 3. Simplify your prompts - Break complex tasks into smaller steps
91
- 4. Add guardrails - Include specific rules for tricky cases
92
-
93
- ### Maintaining Quality
94
-
95
- Once you're hitting your targets:
96
-
97
- 1. Monitor stability - Do scores remain consistent?
98
- 2. Document what works - Keep notes on successful approaches
99
- 3. Test edge cases - Add examples that cover unusual scenarios
100
- 4. Fine-tune - Look for ways to improve efficiency
101
-
102
- See [Textual Evals](/docs/scorers/evals-old-api/textual-evals) for more info on what evals can do.
103
-
104
- For more info on how to create your own evals, see the [Custom Evals](/docs/scorers/evals-old-api/custom-eval) guide.
105
-
106
- For running evals in your CI pipeline, see the [Running in CI](/docs/scorers/evals-old-api/running-in-ci) guide.
@@ -1,85 +0,0 @@
1
- ---
2
- title: "Running Evals in CI | Scorers | Mastra Docs"
3
- description: "Learn how to run Mastra evals in your CI/CD pipeline to monitor agent quality over time."
4
- ---
5
-
6
- # Running Evals in CI
7
-
8
- :::info Scorers
9
- This documentation refers to the legacy evals API. For the latest scorer features, see [Scorers](/docs/scorers/overview).
10
- :::
11
-
12
- Running evals in your CI pipeline provides quantifiable metrics for measuring agent quality over time.
13
-
14
- ## Setting Up CI Integration
15
-
16
- We support any testing framework that supports ES modules. For example, you can use [Vitest](https://vitest.dev/), [Jest](https://jestjs.io/), or [Mocha](https://mochajs.org/) to run evals in your CI/CD pipeline.
17
-
18
- ```typescript copy showLineNumbers title="src/mastra/agents/index.test.ts"
19
- import { describe, it, expect } from "vitest";
20
- import { evaluate } from "@mastra/evals";
21
- import { ToneConsistencyMetric } from "@mastra/evals/nlp";
22
- import { myAgent } from "./index";
23
-
24
- describe("My Agent", () => {
25
-   it("should validate tone consistency", async () => {
26
-     const metric = new ToneConsistencyMetric();
27
-     const result = await evaluate(myAgent, "Hello, world!", metric);
28
-
29
-     expect(result.score).toBe(1);
30
-   });
31
- });
32
- ```
33
-
34
- You will need to configure a `testSetup` and a `globalSetup` script for your testing framework to capture the eval results. This lets Mastra show these results in your Mastra dashboard.
35
-
36
- ## Framework Configuration
37
-
38
- ### Vitest Setup
39
-
40
- Add these files to your project to run evals in your CI/CD pipeline:
41
-
42
- ```typescript copy showLineNumbers title="globalSetup.ts"
43
- import { globalSetup } from "@mastra/evals";
44
-
45
- export default function setup() {
46
-   globalSetup();
47
- }
48
- ```
49
-
50
- ```typescript copy showLineNumbers title="testSetup.ts"
51
- import { beforeAll } from "vitest";
52
- import { attachListeners } from "@mastra/evals";
53
-
54
- beforeAll(async () => {
55
-   await attachListeners();
56
- });
57
- ```
58
-
59
- ```typescript copy showLineNumbers title="vitest.config.ts"
60
- import { defineConfig } from "vitest/config";
61
-
62
- export default defineConfig({
63
-   test: {
64
-     globalSetup: "./globalSetup.ts",
65
-     setupFiles: ["./testSetup.ts"],
66
-   },
67
- });
68
- ```
69
-
70
- ## Storage Configuration
71
-
72
- To store eval results in Mastra Storage and capture results in the Mastra dashboard:
73
-
74
- ```typescript copy showLineNumbers title="testSetup.ts"
75
- import { beforeAll } from "vitest";
76
- import { attachListeners } from "@mastra/evals";
77
- import { mastra } from "./your-mastra-setup";
78
-
79
- beforeAll(async () => {
80
-   // Store evals in Mastra Storage (requires storage to be enabled)
81
-   await attachListeners(mastra);
82
- });
83
- ```
84
-
85
- With file storage, evals persist and can be queried later. With memory storage, evals are isolated to the test process.
@@ -1,58 +0,0 @@
1
- ---
2
- title: "Textual Evals | Scorers | Mastra Docs"
3
- description: "Understand how Mastra uses LLM-as-judge methodology to evaluate text quality."
4
- ---
5
-
6
- # Textual Evals
7
-
8
- :::info Scorers
9
- This documentation refers to the legacy evals API. For the latest scorer features, see [Scorers](/docs/scorers/overview).
10
- :::
11
-
12
- Textual evals use an LLM-as-judge methodology to evaluate agent outputs. This approach leverages language models to assess various aspects of text quality, similar to how a teaching assistant might grade assignments using a rubric.
13
-
14
- Each eval focuses on specific quality aspects and returns a score between 0 and 1, providing quantifiable metrics for non-deterministic AI outputs.
15
-
16
- Mastra provides several eval metrics for assessing Agent outputs. Mastra is not limited to these metrics, and you can also [define your own evals](/docs/scorers/evals-old-api/custom-eval).
17
-
18
- ## Why Use Textual Evals?
19
-
20
- Textual evals help ensure your agent:
21
-
22
- - Produces accurate and reliable responses
23
- - Uses context effectively
24
- - Follows output requirements
25
- - Maintains consistent quality over time
26
-
27
- ## Available Metrics
28
-
29
- ### Accuracy and Reliability
30
-
31
- These metrics evaluate how correct, truthful, and complete your agent's answers are:
32
-
33
- - [`hallucination`](/reference/evals/hallucination): Detects facts or claims not present in provided context
34
- - [`faithfulness`](/reference/evals/faithfulness): Measures how accurately responses represent provided context
35
- - [`content-similarity`](/reference/evals/content-similarity): Evaluates consistency of information across different phrasings
36
- - [`completeness`](/reference/evals/completeness): Checks if responses include all necessary information
37
- - [`answer-relevancy`](/reference/evals/answer-relevancy): Assesses how well responses address the original query
38
- - [`textual-difference`](/reference/evals/textual-difference): Measures textual differences between strings
39
-
40
- ### Understanding Context
41
-
42
- These metrics evaluate how well your agent uses provided context:
43
-
44
- - [`context-position`](/reference/evals/context-position): Analyzes where context appears in responses
45
- - [`context-precision`](/reference/evals/context-precision): Evaluates whether context chunks are grouped logically
46
- - [`context-relevancy`](/reference/evals/context-relevancy): Measures use of appropriate context pieces
47
- - [`contextual-recall`](/reference/evals/contextual-recall): Assesses completeness of context usage
48
-
49
- ### Output Quality
50
-
51
- These metrics evaluate adherence to format and style requirements:
52
-
53
- - [`tone`](/reference/evals/tone-consistency): Measures consistency in formality, complexity, and style
54
- - [`toxicity`](/reference/evals/toxicity): Detects harmful or inappropriate content
55
- - [`bias`](/reference/evals/bias): Detects potential biases in the output
56
- - [`prompt-alignment`](/reference/evals/prompt-alignment): Checks adherence to explicit instructions like length restrictions, formatting requirements, or other constraints
57
- - [`summarization`](/reference/evals/summarization): Evaluates information retention and conciseness
58
- - [`keyword-coverage`](/reference/evals/keyword-coverage): Assesses technical terminology usage
@@ -1,50 +0,0 @@
1
- ---
2
- title: "Built-in Scorers | Scorers | Mastra Docs"
3
- description: "Overview of Mastra's ready-to-use scorers for evaluating AI outputs across quality, safety, and performance dimensions."
4
- ---
5
-
6
- # Built-in Scorers
7
-
8
- Mastra provides a comprehensive set of built-in scorers for evaluating AI outputs. These scorers are optimized for common evaluation scenarios and are ready to use in your agents and workflows.
9
-
10
- ## Available Scorers
11
-
12
- ### Accuracy and Reliability
13
-
14
- These scorers evaluate how correct, truthful, and complete your agent's answers are:
15
-
16
- - [`answer-relevancy`](/reference/scorers/answer-relevancy): Evaluates how well responses address the input query (`0-1`, higher is better)
17
- - [`answer-similarity`](/reference/scorers/answer-similarity): Compares agent outputs against ground truth answers for CI/CD testing using semantic analysis (`0-1`, higher is better)
18
- - [`faithfulness`](/reference/scorers/faithfulness): Measures how accurately responses represent provided context (`0-1`, higher is better)
19
- - [`hallucination`](/reference/scorers/hallucination): Detects factual contradictions and unsupported claims (`0-1`, lower is better)
20
- - [`completeness`](/reference/scorers/completeness): Checks if responses include all necessary information (`0-1`, higher is better)
21
- - [`content-similarity`](/reference/scorers/content-similarity): Measures textual similarity using character-level matching (`0-1`, higher is better)
22
- - [`textual-difference`](/reference/scorers/textual-difference): Measures textual differences between strings (`0-1`, higher means more similar)
23
- - [`tool-call-accuracy`](/reference/scorers/tool-call-accuracy): Evaluates whether the LLM selects the correct tool from available options (`0-1`, higher is better)
24
- - [`prompt-alignment`](/reference/scorers/prompt-alignment): Measures how well agent responses align with user prompt intent, requirements, completeness, and format (`0-1`, higher is better)
25
-
26
- ### Context Quality
27
-
28
- These scorers evaluate the quality and relevance of context used in generating responses:
29
-
30
- - [`context-precision`](/reference/scorers/context-precision): Evaluates context relevance and ranking using Mean Average Precision, rewarding early placement of relevant context (`0-1`, higher is better)
31
- - [`context-relevance`](/reference/scorers/context-relevance): Measures context utility with nuanced relevance levels, usage tracking, and missing context detection (`0-1`, higher is better)
32
-
33
- > tip Context Scorer Selection
34
-
35
- - Use **Context Precision** when context ordering matters and you need standard IR metrics (ideal for RAG ranking evaluation)
36
- - Use **Context Relevance** when you need detailed relevance assessment and want to track context usage and identify gaps
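To make "rewarding early placement of relevant context" concrete, here is a minimal sketch of Mean Average Precision over binary relevance judgments. The function names are hypothetical, and the sketch assumes relevance has already been decided for each ranked context chunk; how the removed `context-precision` scorer obtains those judgments or normalizes its score is not shown in this diff.

```typescript
// Average precision for one ranked list of binary relevance judgments.
// Precision is sampled at each rank that holds a relevant item.
function averagePrecision(relevance: boolean[]): number {
  let relevantSoFar = 0;
  let precisionSum = 0;
  relevance.forEach((isRelevant, index) => {
    if (isRelevant) {
      relevantSoFar += 1;
      precisionSum += relevantSoFar / (index + 1);
    }
  });
  // A list with no relevant items scores 0 rather than dividing by zero.
  return relevantSoFar === 0 ? 0 : precisionSum / relevantSoFar;
}

// Mean of the per-query average precision values.
function meanAveragePrecision(queries: boolean[][]): number {
  if (queries.length === 0) return 0;
  const total = queries.reduce((sum, q) => sum + averagePrecision(q), 0);
  return total / queries.length;
}

// Same two relevant chunks, ranked first versus ranked last.
const relevantFirst = averagePrecision([true, true, false]);
const relevantLast = averagePrecision([false, true, true]);
const combined = meanAveragePrecision([
  [true, true, false],
  [false, true, true],
]);
```

With the relevant chunks ranked first the average precision is 1; with the same chunks ranked last it drops to (1/2 + 2/3) / 2 = 7/12, which is why ordering matters for this kind of metric.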
37
-
38
- Both context scorers support:
39
-
40
- - **Static context**: Pre-defined context arrays
41
- - **Dynamic context extraction**: Extract context from runs using custom functions (ideal for RAG systems, vector databases, etc.)
42
-
43
- ### Output Quality
44
-
45
- These scorers evaluate adherence to format, style, and safety requirements:
46
-
47
- - [`tone-consistency`](/reference/scorers/tone-consistency): Measures consistency in formality, complexity, and style (`0-1`, higher is better)
48
- - [`toxicity`](/reference/scorers/toxicity): Detects harmful or inappropriate content (`0-1`, lower is better)
49
- - [`bias`](/reference/scorers/bias): Detects potential biases in the output (`0-1`, lower is better)
50
- - [`keyword-coverage`](/reference/scorers/keyword-coverage): Assesses technical terminology usage (`0-1`, higher is better)