@mastra/core 1.6.0 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (453)
  1. package/CHANGELOG.md +236 -0
  2. package/dist/agent/agent.d.ts +6 -0
  3. package/dist/agent/agent.d.ts.map +1 -1
  4. package/dist/agent/index.cjs +8 -8
  5. package/dist/agent/index.js +1 -1
  6. package/dist/{chunk-VJWRJWSC.cjs → chunk-2X66GWF5.cjs} +94 -16
  7. package/dist/chunk-2X66GWF5.cjs.map +1 -0
  8. package/dist/{chunk-YM6245EM.js → chunk-6OXW5E2O.js} +3 -3
  9. package/dist/{chunk-YM6245EM.js.map → chunk-6OXW5E2O.js.map} +1 -1
  10. package/dist/{chunk-XWZAKKFT.cjs → chunk-6QBN6MZY.cjs} +14 -14
  11. package/dist/{chunk-XWZAKKFT.cjs.map → chunk-6QBN6MZY.cjs.map} +1 -1
  12. package/dist/{chunk-AYHSPIT6.cjs → chunk-7UAJ6LMR.cjs} +820 -259
  13. package/dist/chunk-7UAJ6LMR.cjs.map +1 -0
  14. package/dist/{chunk-RZNHRIM7.cjs → chunk-A72NTLFT.cjs} +5 -5
  15. package/dist/{chunk-RZNHRIM7.cjs.map → chunk-A72NTLFT.cjs.map} +1 -1
  16. package/dist/{chunk-EEU5NHHU.js → chunk-DFCRXDVK.js} +3 -3
  17. package/dist/{chunk-EEU5NHHU.js.map → chunk-DFCRXDVK.js.map} +1 -1
  18. package/dist/{chunk-5K45E5VE.js → chunk-GPJGPARM.js} +3 -2
  19. package/dist/chunk-GPJGPARM.js.map +1 -0
  20. package/dist/{chunk-LNKS4TJ6.cjs → chunk-HB6T4554.cjs} +8 -7
  21. package/dist/chunk-HB6T4554.cjs.map +1 -0
  22. package/dist/{chunk-DGS2KGDI.js → chunk-KUXNBWN7.js} +6 -5
  23. package/dist/chunk-KUXNBWN7.js.map +1 -0
  24. package/dist/{chunk-IHDE4CJV.js → chunk-QSHV7GPT.js} +89 -12
  25. package/dist/chunk-QSHV7GPT.js.map +1 -0
  26. package/dist/{chunk-3U3XFMGJ.cjs → chunk-QTAS3HND.cjs} +13 -8
  27. package/dist/chunk-QTAS3HND.cjs.map +1 -0
  28. package/dist/{chunk-RHKNKJNM.js → chunk-QWTB53GS.js} +4 -4
  29. package/dist/{chunk-RHKNKJNM.js.map → chunk-QWTB53GS.js.map} +1 -1
  30. package/dist/{chunk-4WG5K4CK.js → chunk-R4N65TLG.js} +7 -7
  31. package/dist/{chunk-4WG5K4CK.js.map → chunk-R4N65TLG.js.map} +1 -1
  32. package/dist/{chunk-5VQPSWPG.cjs → chunk-RABITNTG.cjs} +48 -48
  33. package/dist/{chunk-5VQPSWPG.cjs.map → chunk-RABITNTG.cjs.map} +1 -1
  34. package/dist/{chunk-TVPANHLE.cjs → chunk-SBOHDNIZ.cjs} +3 -2
  35. package/dist/chunk-SBOHDNIZ.cjs.map +1 -0
  36. package/dist/{chunk-MWGGSA5Q.js → chunk-T6GAM3SQ.js} +10 -5
  37. package/dist/chunk-T6GAM3SQ.js.map +1 -0
  38. package/dist/{chunk-TL2TTA4X.cjs → chunk-YQG7NBPR.cjs} +9 -9
  39. package/dist/{chunk-TL2TTA4X.cjs.map → chunk-YQG7NBPR.cjs.map} +1 -1
  40. package/dist/{chunk-XB3DA67Q.js → chunk-ZSBM2SVU.js} +818 -259
  41. package/dist/chunk-ZSBM2SVU.js.map +1 -0
  42. package/dist/datasets/experiment/scorer.d.ts.map +1 -1
  43. package/dist/datasets/index.cjs +17 -17
  44. package/dist/datasets/index.js +2 -2
  45. package/dist/evals/index.cjs +20 -20
  46. package/dist/evals/index.js +3 -3
  47. package/dist/evals/scoreTraces/index.cjs +7 -6
  48. package/dist/evals/scoreTraces/index.cjs.map +1 -1
  49. package/dist/evals/scoreTraces/index.js +4 -3
  50. package/dist/evals/scoreTraces/index.js.map +1 -1
  51. package/dist/evals/scoreTraces/scoreTracesWorkflow.d.ts.map +1 -1
  52. package/dist/harness/harness.d.ts +21 -1
  53. package/dist/harness/harness.d.ts.map +1 -1
  54. package/dist/harness/index.cjs +422 -5
  55. package/dist/harness/index.cjs.map +1 -1
  56. package/dist/harness/index.d.ts +2 -1
  57. package/dist/harness/index.d.ts.map +1 -1
  58. package/dist/harness/index.js +418 -3
  59. package/dist/harness/index.js.map +1 -1
  60. package/dist/harness/types.d.ts +151 -0
  61. package/dist/harness/types.d.ts.map +1 -1
  62. package/dist/index.cjs +2 -2
  63. package/dist/index.js +1 -1
  64. package/dist/loop/index.cjs +12 -12
  65. package/dist/loop/index.js +1 -1
  66. package/dist/loop/test-utils/tools.d.ts.map +1 -1
  67. package/dist/loop/workflows/agentic-loop/index.d.ts.map +1 -1
  68. package/dist/mastra/hooks.d.ts.map +1 -1
  69. package/dist/mastra/index.cjs +2 -2
  70. package/dist/mastra/index.js +1 -1
  71. package/dist/memory/index.cjs +14 -14
  72. package/dist/memory/index.js +1 -1
  73. package/dist/processor-provider/index.cjs +10 -10
  74. package/dist/processor-provider/index.js +1 -1
  75. package/dist/processors/index.cjs +45 -41
  76. package/dist/processors/index.js +1 -1
  77. package/dist/processors/processors/index.d.ts +1 -0
  78. package/dist/processors/processors/index.d.ts.map +1 -1
  79. package/dist/processors/processors/workspace-instructions.d.ts +50 -0
  80. package/dist/processors/processors/workspace-instructions.d.ts.map +1 -0
  81. package/dist/relevance/index.cjs +3 -3
  82. package/dist/relevance/index.js +1 -1
  83. package/dist/storage/constants.cjs +56 -56
  84. package/dist/storage/constants.d.ts.map +1 -1
  85. package/dist/storage/constants.js +1 -1
  86. package/dist/storage/index.cjs +160 -160
  87. package/dist/storage/index.js +2 -2
  88. package/dist/storage/types.d.ts +2 -0
  89. package/dist/storage/types.d.ts.map +1 -1
  90. package/dist/stream/aisdk/v5/compat/prepare-tools.d.ts.map +1 -1
  91. package/dist/stream/base/output.d.ts.map +1 -1
  92. package/dist/stream/index.cjs +8 -8
  93. package/dist/stream/index.js +1 -1
  94. package/dist/tool-loop-agent/index.cjs +4 -4
  95. package/dist/tool-loop-agent/index.js +1 -1
  96. package/dist/vector/index.cjs +7 -7
  97. package/dist/vector/index.js +1 -1
  98. package/dist/workflows/evented/index.cjs +10 -10
  99. package/dist/workflows/evented/index.js +1 -1
  100. package/dist/workflows/index.cjs +25 -25
  101. package/dist/workflows/index.js +1 -1
  102. package/dist/workspace/constants/index.d.ts +2 -0
  103. package/dist/workspace/constants/index.d.ts.map +1 -1
  104. package/dist/workspace/errors.d.ts +1 -1
  105. package/dist/workspace/errors.d.ts.map +1 -1
  106. package/dist/workspace/filesystem/composite-filesystem.d.ts +4 -1
  107. package/dist/workspace/filesystem/composite-filesystem.d.ts.map +1 -1
  108. package/dist/workspace/filesystem/file-write-lock.d.ts +35 -0
  109. package/dist/workspace/filesystem/file-write-lock.d.ts.map +1 -0
  110. package/dist/workspace/filesystem/filesystem.d.ts +5 -1
  111. package/dist/workspace/filesystem/filesystem.d.ts.map +1 -1
  112. package/dist/workspace/filesystem/index.d.ts +1 -0
  113. package/dist/workspace/filesystem/index.d.ts.map +1 -1
  114. package/dist/workspace/filesystem/local-filesystem.d.ts +17 -1
  115. package/dist/workspace/filesystem/local-filesystem.d.ts.map +1 -1
  116. package/dist/workspace/index.cjs +72 -64
  117. package/dist/workspace/index.d.ts +3 -2
  118. package/dist/workspace/index.d.ts.map +1 -1
  119. package/dist/workspace/index.js +1 -1
  120. package/dist/workspace/lifecycle.d.ts +1 -9
  121. package/dist/workspace/lifecycle.d.ts.map +1 -1
  122. package/dist/workspace/sandbox/index.d.ts +2 -0
  123. package/dist/workspace/sandbox/index.d.ts.map +1 -1
  124. package/dist/workspace/sandbox/local-process-manager.d.ts +18 -0
  125. package/dist/workspace/sandbox/local-process-manager.d.ts.map +1 -0
  126. package/dist/workspace/sandbox/local-sandbox.d.ts +49 -35
  127. package/dist/workspace/sandbox/local-sandbox.d.ts.map +1 -1
  128. package/dist/workspace/sandbox/mastra-sandbox.d.ts +45 -11
  129. package/dist/workspace/sandbox/mastra-sandbox.d.ts.map +1 -1
  130. package/dist/workspace/sandbox/native-sandbox/bubblewrap.d.ts +2 -3
  131. package/dist/workspace/sandbox/native-sandbox/bubblewrap.d.ts.map +1 -1
  132. package/dist/workspace/sandbox/native-sandbox/seatbelt.d.ts +2 -3
  133. package/dist/workspace/sandbox/native-sandbox/seatbelt.d.ts.map +1 -1
  134. package/dist/workspace/sandbox/native-sandbox/wrapper.d.ts +4 -5
  135. package/dist/workspace/sandbox/native-sandbox/wrapper.d.ts.map +1 -1
  136. package/dist/workspace/sandbox/process-manager/index.d.ts +4 -0
  137. package/dist/workspace/sandbox/process-manager/index.d.ts.map +1 -0
  138. package/dist/workspace/sandbox/process-manager/process-handle.d.ts +107 -0
  139. package/dist/workspace/sandbox/process-manager/process-handle.d.ts.map +1 -0
  140. package/dist/workspace/sandbox/process-manager/process-manager.d.ts +59 -0
  141. package/dist/workspace/sandbox/process-manager/process-manager.d.ts.map +1 -0
  142. package/dist/workspace/sandbox/process-manager/types.d.ts +24 -0
  143. package/dist/workspace/sandbox/process-manager/types.d.ts.map +1 -0
  144. package/dist/workspace/sandbox/sandbox.d.ts +38 -2
  145. package/dist/workspace/sandbox/sandbox.d.ts.map +1 -1
  146. package/dist/workspace/sandbox/types.d.ts +9 -2
  147. package/dist/workspace/sandbox/types.d.ts.map +1 -1
  148. package/dist/workspace/sandbox/utils.d.ts +7 -0
  149. package/dist/workspace/sandbox/utils.d.ts.map +1 -0
  150. package/dist/workspace/tools/execute-command.d.ts +53 -2
  151. package/dist/workspace/tools/execute-command.d.ts.map +1 -1
  152. package/dist/workspace/tools/get-process-output.d.ts +6 -0
  153. package/dist/workspace/tools/get-process-output.d.ts.map +1 -0
  154. package/dist/workspace/tools/index.d.ts +4 -1
  155. package/dist/workspace/tools/index.d.ts.map +1 -1
  156. package/dist/workspace/tools/kill-process.d.ts +4 -0
  157. package/dist/workspace/tools/kill-process.d.ts.map +1 -0
  158. package/dist/workspace/tools/output-helpers.d.ts +21 -0
  159. package/dist/workspace/tools/output-helpers.d.ts.map +1 -0
  160. package/dist/workspace/tools/tools.d.ts.map +1 -1
  161. package/dist/workspace/types.d.ts +31 -0
  162. package/dist/workspace/types.d.ts.map +1 -1
  163. package/dist/workspace/utils.d.ts +11 -0
  164. package/dist/workspace/utils.d.ts.map +1 -0
  165. package/dist/workspace/workspace.d.ts +36 -0
  166. package/dist/workspace/workspace.d.ts.map +1 -1
  167. package/package.json +7 -7
  168. package/dist/chunk-3U3XFMGJ.cjs.map +0 -1
  169. package/dist/chunk-5K45E5VE.js.map +0 -1
  170. package/dist/chunk-AYHSPIT6.cjs.map +0 -1
  171. package/dist/chunk-DGS2KGDI.js.map +0 -1
  172. package/dist/chunk-IHDE4CJV.js.map +0 -1
  173. package/dist/chunk-LNKS4TJ6.cjs.map +0 -1
  174. package/dist/chunk-MWGGSA5Q.js.map +0 -1
  175. package/dist/chunk-TVPANHLE.cjs.map +0 -1
  176. package/dist/chunk-VJWRJWSC.cjs.map +0 -1
  177. package/dist/chunk-XB3DA67Q.js.map +0 -1
  178. package/dist/docs/SKILL.md +0 -301
  179. package/dist/docs/assets/SOURCE_MAP.json +0 -1413
  180. package/dist/docs/references/docs-agents-adding-voice.md +0 -353
  181. package/dist/docs/references/docs-agents-agent-approval.md +0 -377
  182. package/dist/docs/references/docs-agents-agent-memory.md +0 -212
  183. package/dist/docs/references/docs-agents-guardrails.md +0 -382
  184. package/dist/docs/references/docs-agents-network-approval.md +0 -275
  185. package/dist/docs/references/docs-agents-networks.md +0 -290
  186. package/dist/docs/references/docs-agents-overview.md +0 -309
  187. package/dist/docs/references/docs-agents-processors.md +0 -632
  188. package/dist/docs/references/docs-agents-structured-output.md +0 -271
  189. package/dist/docs/references/docs-agents-using-tools.md +0 -214
  190. package/dist/docs/references/docs-evals-custom-scorers.md +0 -519
  191. package/dist/docs/references/docs-evals-overview.md +0 -146
  192. package/dist/docs/references/docs-evals-running-in-ci.md +0 -106
  193. package/dist/docs/references/docs-mcp-overview.md +0 -370
  194. package/dist/docs/references/docs-mcp-publishing-mcp-server.md +0 -95
  195. package/dist/docs/references/docs-memory-memory-processors.md +0 -316
  196. package/dist/docs/references/docs-memory-observational-memory.md +0 -246
  197. package/dist/docs/references/docs-memory-overview.md +0 -45
  198. package/dist/docs/references/docs-memory-semantic-recall.md +0 -272
  199. package/dist/docs/references/docs-memory-storage.md +0 -261
  200. package/dist/docs/references/docs-memory-working-memory.md +0 -400
  201. package/dist/docs/references/docs-observability-datasets-overview.md +0 -188
  202. package/dist/docs/references/docs-observability-datasets-running-experiments.md +0 -266
  203. package/dist/docs/references/docs-observability-logging.md +0 -99
  204. package/dist/docs/references/docs-observability-overview.md +0 -70
  205. package/dist/docs/references/docs-observability-tracing-bridges-otel.md +0 -209
  206. package/dist/docs/references/docs-observability-tracing-exporters-arize.md +0 -274
  207. package/dist/docs/references/docs-observability-tracing-exporters-braintrust.md +0 -111
  208. package/dist/docs/references/docs-observability-tracing-exporters-cloud.md +0 -129
  209. package/dist/docs/references/docs-observability-tracing-exporters-datadog.md +0 -187
  210. package/dist/docs/references/docs-observability-tracing-exporters-default.md +0 -211
  211. package/dist/docs/references/docs-observability-tracing-exporters-laminar.md +0 -100
  212. package/dist/docs/references/docs-observability-tracing-exporters-langfuse.md +0 -217
  213. package/dist/docs/references/docs-observability-tracing-exporters-langsmith.md +0 -202
  214. package/dist/docs/references/docs-observability-tracing-exporters-otel.md +0 -479
  215. package/dist/docs/references/docs-observability-tracing-exporters-posthog.md +0 -148
  216. package/dist/docs/references/docs-observability-tracing-overview.md +0 -1114
  217. package/dist/docs/references/docs-rag-chunking-and-embedding.md +0 -183
  218. package/dist/docs/references/docs-rag-graph-rag.md +0 -215
  219. package/dist/docs/references/docs-rag-overview.md +0 -72
  220. package/dist/docs/references/docs-rag-retrieval.md +0 -521
  221. package/dist/docs/references/docs-rag-vector-databases.md +0 -648
  222. package/dist/docs/references/docs-server-auth-auth0.md +0 -222
  223. package/dist/docs/references/docs-server-auth-clerk.md +0 -132
  224. package/dist/docs/references/docs-server-auth-composite-auth.md +0 -234
  225. package/dist/docs/references/docs-server-auth-custom-auth-provider.md +0 -513
  226. package/dist/docs/references/docs-server-auth-firebase.md +0 -272
  227. package/dist/docs/references/docs-server-auth-jwt.md +0 -110
  228. package/dist/docs/references/docs-server-auth-simple-auth.md +0 -178
  229. package/dist/docs/references/docs-server-auth-supabase.md +0 -117
  230. package/dist/docs/references/docs-server-auth-workos.md +0 -190
  231. package/dist/docs/references/docs-server-custom-adapters.md +0 -374
  232. package/dist/docs/references/docs-server-custom-api-routes.md +0 -267
  233. package/dist/docs/references/docs-server-mastra-client.md +0 -243
  234. package/dist/docs/references/docs-server-mastra-server.md +0 -71
  235. package/dist/docs/references/docs-server-middleware.md +0 -228
  236. package/dist/docs/references/docs-server-request-context.md +0 -478
  237. package/dist/docs/references/docs-streaming-events.md +0 -247
  238. package/dist/docs/references/docs-streaming-tool-streaming.md +0 -178
  239. package/dist/docs/references/docs-streaming-workflow-streaming.md +0 -109
  240. package/dist/docs/references/docs-voice-overview.md +0 -979
  241. package/dist/docs/references/docs-voice-speech-to-speech.md +0 -103
  242. package/dist/docs/references/docs-voice-speech-to-text.md +0 -80
  243. package/dist/docs/references/docs-voice-text-to-speech.md +0 -84
  244. package/dist/docs/references/docs-workflows-agents-and-tools.md +0 -170
  245. package/dist/docs/references/docs-workflows-control-flow.md +0 -823
  246. package/dist/docs/references/docs-workflows-error-handling.md +0 -360
  247. package/dist/docs/references/docs-workflows-human-in-the-loop.md +0 -213
  248. package/dist/docs/references/docs-workflows-overview.md +0 -372
  249. package/dist/docs/references/docs-workflows-snapshots.md +0 -238
  250. package/dist/docs/references/docs-workflows-suspend-and-resume.md +0 -205
  251. package/dist/docs/references/docs-workflows-time-travel.md +0 -309
  252. package/dist/docs/references/docs-workflows-workflow-state.md +0 -181
  253. package/dist/docs/references/docs-workspace-filesystem.md +0 -162
  254. package/dist/docs/references/docs-workspace-overview.md +0 -239
  255. package/dist/docs/references/docs-workspace-sandbox.md +0 -63
  256. package/dist/docs/references/docs-workspace-search.md +0 -219
  257. package/dist/docs/references/docs-workspace-skills.md +0 -126
  258. package/dist/docs/references/guides-agent-frameworks-ai-sdk.md +0 -140
  259. package/dist/docs/references/reference-agents-agent.md +0 -142
  260. package/dist/docs/references/reference-agents-generate.md +0 -174
  261. package/dist/docs/references/reference-agents-generateLegacy.md +0 -176
  262. package/dist/docs/references/reference-agents-getDefaultGenerateOptions.md +0 -36
  263. package/dist/docs/references/reference-agents-getDefaultOptions.md +0 -34
  264. package/dist/docs/references/reference-agents-getDefaultStreamOptions.md +0 -36
  265. package/dist/docs/references/reference-agents-getDescription.md +0 -21
  266. package/dist/docs/references/reference-agents-getInstructions.md +0 -34
  267. package/dist/docs/references/reference-agents-getLLM.md +0 -37
  268. package/dist/docs/references/reference-agents-getMemory.md +0 -34
  269. package/dist/docs/references/reference-agents-getModel.md +0 -34
  270. package/dist/docs/references/reference-agents-getTools.md +0 -29
  271. package/dist/docs/references/reference-agents-getVoice.md +0 -34
  272. package/dist/docs/references/reference-agents-listAgents.md +0 -35
  273. package/dist/docs/references/reference-agents-listScorers.md +0 -34
  274. package/dist/docs/references/reference-agents-listTools.md +0 -34
  275. package/dist/docs/references/reference-agents-listWorkflows.md +0 -34
  276. package/dist/docs/references/reference-agents-network.md +0 -134
  277. package/dist/docs/references/reference-ai-sdk-chat-route.md +0 -82
  278. package/dist/docs/references/reference-ai-sdk-network-route.md +0 -74
  279. package/dist/docs/references/reference-ai-sdk-to-ai-sdk-stream.md +0 -232
  280. package/dist/docs/references/reference-ai-sdk-with-mastra.md +0 -59
  281. package/dist/docs/references/reference-ai-sdk-workflow-route.md +0 -79
  282. package/dist/docs/references/reference-auth-auth0.md +0 -73
  283. package/dist/docs/references/reference-auth-clerk.md +0 -36
  284. package/dist/docs/references/reference-auth-firebase.md +0 -80
  285. package/dist/docs/references/reference-auth-jwt.md +0 -26
  286. package/dist/docs/references/reference-auth-supabase.md +0 -33
  287. package/dist/docs/references/reference-auth-workos.md +0 -84
  288. package/dist/docs/references/reference-client-js-agents.md +0 -438
  289. package/dist/docs/references/reference-configuration.md +0 -749
  290. package/dist/docs/references/reference-core-addGateway.md +0 -42
  291. package/dist/docs/references/reference-core-getAgent.md +0 -21
  292. package/dist/docs/references/reference-core-getAgentById.md +0 -21
  293. package/dist/docs/references/reference-core-getDeployer.md +0 -22
  294. package/dist/docs/references/reference-core-getGateway.md +0 -38
  295. package/dist/docs/references/reference-core-getGatewayById.md +0 -41
  296. package/dist/docs/references/reference-core-getLogger.md +0 -22
  297. package/dist/docs/references/reference-core-getMCPServer.md +0 -45
  298. package/dist/docs/references/reference-core-getMCPServerById.md +0 -53
  299. package/dist/docs/references/reference-core-getMemory.md +0 -50
  300. package/dist/docs/references/reference-core-getScorer.md +0 -54
  301. package/dist/docs/references/reference-core-getScorerById.md +0 -54
  302. package/dist/docs/references/reference-core-getServer.md +0 -22
  303. package/dist/docs/references/reference-core-getStorage.md +0 -22
  304. package/dist/docs/references/reference-core-getStoredAgentById.md +0 -89
  305. package/dist/docs/references/reference-core-getTelemetry.md +0 -22
  306. package/dist/docs/references/reference-core-getVector.md +0 -22
  307. package/dist/docs/references/reference-core-getWorkflow.md +0 -40
  308. package/dist/docs/references/reference-core-listAgents.md +0 -21
  309. package/dist/docs/references/reference-core-listGateways.md +0 -40
  310. package/dist/docs/references/reference-core-listLogs.md +0 -38
  311. package/dist/docs/references/reference-core-listLogsByRunId.md +0 -36
  312. package/dist/docs/references/reference-core-listMCPServers.md +0 -51
  313. package/dist/docs/references/reference-core-listMemory.md +0 -56
  314. package/dist/docs/references/reference-core-listScorers.md +0 -29
  315. package/dist/docs/references/reference-core-listStoredAgents.md +0 -93
  316. package/dist/docs/references/reference-core-listVectors.md +0 -22
  317. package/dist/docs/references/reference-core-listWorkflows.md +0 -21
  318. package/dist/docs/references/reference-core-mastra-class.md +0 -66
  319. package/dist/docs/references/reference-core-mastra-model-gateway.md +0 -153
  320. package/dist/docs/references/reference-core-setLogger.md +0 -26
  321. package/dist/docs/references/reference-core-setStorage.md +0 -27
  322. package/dist/docs/references/reference-datasets-addItem.md +0 -35
  323. package/dist/docs/references/reference-datasets-addItems.md +0 -33
  324. package/dist/docs/references/reference-datasets-compareExperiments.md +0 -48
  325. package/dist/docs/references/reference-datasets-create.md +0 -49
  326. package/dist/docs/references/reference-datasets-dataset.md +0 -78
  327. package/dist/docs/references/reference-datasets-datasets-manager.md +0 -84
  328. package/dist/docs/references/reference-datasets-delete.md +0 -23
  329. package/dist/docs/references/reference-datasets-deleteExperiment.md +0 -25
  330. package/dist/docs/references/reference-datasets-deleteItem.md +0 -25
  331. package/dist/docs/references/reference-datasets-deleteItems.md +0 -27
  332. package/dist/docs/references/reference-datasets-get.md +0 -29
  333. package/dist/docs/references/reference-datasets-getDetails.md +0 -45
  334. package/dist/docs/references/reference-datasets-getExperiment.md +0 -28
  335. package/dist/docs/references/reference-datasets-getItem.md +0 -31
  336. package/dist/docs/references/reference-datasets-getItemHistory.md +0 -29
  337. package/dist/docs/references/reference-datasets-list.md +0 -29
  338. package/dist/docs/references/reference-datasets-listExperimentResults.md +0 -37
  339. package/dist/docs/references/reference-datasets-listExperiments.md +0 -31
  340. package/dist/docs/references/reference-datasets-listItems.md +0 -44
  341. package/dist/docs/references/reference-datasets-listVersions.md +0 -31
  342. package/dist/docs/references/reference-datasets-startExperiment.md +0 -60
  343. package/dist/docs/references/reference-datasets-startExperimentAsync.md +0 -41
  344. package/dist/docs/references/reference-datasets-update.md +0 -46
  345. package/dist/docs/references/reference-datasets-updateItem.md +0 -36
  346. package/dist/docs/references/reference-evals-answer-relevancy.md +0 -105
  347. package/dist/docs/references/reference-evals-answer-similarity.md +0 -99
  348. package/dist/docs/references/reference-evals-bias.md +0 -120
  349. package/dist/docs/references/reference-evals-completeness.md +0 -137
  350. package/dist/docs/references/reference-evals-content-similarity.md +0 -101
  351. package/dist/docs/references/reference-evals-context-precision.md +0 -196
  352. package/dist/docs/references/reference-evals-create-scorer.md +0 -270
  353. package/dist/docs/references/reference-evals-faithfulness.md +0 -114
  354. package/dist/docs/references/reference-evals-hallucination.md +0 -220
  355. package/dist/docs/references/reference-evals-keyword-coverage.md +0 -128
  356. package/dist/docs/references/reference-evals-mastra-scorer.md +0 -123
  357. package/dist/docs/references/reference-evals-run-evals.md +0 -138
  358. package/dist/docs/references/reference-evals-scorer-utils.md +0 -330
  359. package/dist/docs/references/reference-evals-textual-difference.md +0 -113
  360. package/dist/docs/references/reference-evals-tone-consistency.md +0 -119
  361. package/dist/docs/references/reference-evals-toxicity.md +0 -123
  362. package/dist/docs/references/reference-harness-harness-class.md +0 -645
  363. package/dist/docs/references/reference-logging-pino-logger.md +0 -117
  364. package/dist/docs/references/reference-memory-deleteMessages.md +0 -40
  365. package/dist/docs/references/reference-memory-memory-class.md +0 -147
  366. package/dist/docs/references/reference-memory-observational-memory.md +0 -565
  367. package/dist/docs/references/reference-observability-tracing-bridges-otel.md +0 -131
  368. package/dist/docs/references/reference-observability-tracing-configuration.md +0 -178
  369. package/dist/docs/references/reference-observability-tracing-exporters-console-exporter.md +0 -138
  370. package/dist/docs/references/reference-observability-tracing-exporters-datadog.md +0 -116
  371. package/dist/docs/references/reference-observability-tracing-instances.md +0 -109
  372. package/dist/docs/references/reference-observability-tracing-interfaces.md +0 -749
  373. package/dist/docs/references/reference-observability-tracing-processors-sensitive-data-filter.md +0 -144
  374. package/dist/docs/references/reference-observability-tracing-spans.md +0 -224
  375. package/dist/docs/references/reference-processors-batch-parts-processor.md +0 -61
  376. package/dist/docs/references/reference-processors-language-detector.md +0 -81
  377. package/dist/docs/references/reference-processors-message-history-processor.md +0 -85
  378. package/dist/docs/references/reference-processors-moderation-processor.md +0 -104
  379. package/dist/docs/references/reference-processors-pii-detector.md +0 -107
  380. package/dist/docs/references/reference-processors-processor-interface.md +0 -525
  381. package/dist/docs/references/reference-processors-prompt-injection-detector.md +0 -71
  382. package/dist/docs/references/reference-processors-semantic-recall-processor.md +0 -123
  383. package/dist/docs/references/reference-processors-system-prompt-scrubber.md +0 -80
  384. package/dist/docs/references/reference-processors-token-limiter-processor.md +0 -113
  385. package/dist/docs/references/reference-processors-tool-call-filter.md +0 -85
  386. package/dist/docs/references/reference-processors-tool-search-processor.md +0 -113
  387. package/dist/docs/references/reference-processors-unicode-normalizer.md +0 -62
  388. package/dist/docs/references/reference-processors-working-memory-processor.md +0 -154
  389. package/dist/docs/references/reference-rag-database-config.md +0 -264
  390. package/dist/docs/references/reference-rag-embeddings.md +0 -92
  391. package/dist/docs/references/reference-server-mastra-server.md +0 -298
  392. package/dist/docs/references/reference-server-register-api-route.md +0 -249
  393. package/dist/docs/references/reference-storage-cloudflare-d1.md +0 -218
  394. package/dist/docs/references/reference-storage-composite.md +0 -235
  395. package/dist/docs/references/reference-storage-lance.md +0 -131
  396. package/dist/docs/references/reference-storage-libsql.md +0 -135
  397. package/dist/docs/references/reference-storage-mongodb.md +0 -262
  398. package/dist/docs/references/reference-storage-mssql.md +0 -155
  399. package/dist/docs/references/reference-storage-overview.md +0 -121
  400. package/dist/docs/references/reference-storage-postgresql.md +0 -529
  401. package/dist/docs/references/reference-storage-upstash.md +0 -160
  402. package/dist/docs/references/reference-streaming-ChunkType.md +0 -292
  403. package/dist/docs/references/reference-streaming-agents-MastraModelOutput.md +0 -182
  404. package/dist/docs/references/reference-streaming-agents-streamLegacy.md +0 -142
  405. package/dist/docs/references/reference-streaming-workflows-observeStream.md +0 -42
  406. package/dist/docs/references/reference-streaming-workflows-resumeStream.md +0 -61
  407. package/dist/docs/references/reference-streaming-workflows-stream.md +0 -88
  408. package/dist/docs/references/reference-streaming-workflows-timeTravelStream.md +0 -142
  409. package/dist/docs/references/reference-templates-overview.md +0 -194
  410. package/dist/docs/references/reference-tools-create-tool.md +0 -237
  411. package/dist/docs/references/reference-tools-graph-rag-tool.md +0 -185
  412. package/dist/docs/references/reference-tools-mcp-client.md +0 -962
  413. package/dist/docs/references/reference-tools-mcp-server.md +0 -1275
  414. package/dist/docs/references/reference-tools-vector-query-tool.md +0 -459
  415. package/dist/docs/references/reference-vectors-libsql.md +0 -305
  416. package/dist/docs/references/reference-vectors-mongodb.md +0 -295
  417. package/dist/docs/references/reference-vectors-pg.md +0 -408
  418. package/dist/docs/references/reference-vectors-upstash.md +0 -294
  419. package/dist/docs/references/reference-voice-composite-voice.md +0 -121
  420. package/dist/docs/references/reference-voice-mastra-voice.md +0 -313
  421. package/dist/docs/references/reference-voice-voice.addInstructions.md +0 -56
  422. package/dist/docs/references/reference-voice-voice.addTools.md +0 -67
  423. package/dist/docs/references/reference-voice-voice.connect.md +0 -94
  424. package/dist/docs/references/reference-voice-voice.events.md +0 -37
  425. package/dist/docs/references/reference-voice-voice.listen.md +0 -164
  426. package/dist/docs/references/reference-voice-voice.on.md +0 -111
  427. package/dist/docs/references/reference-voice-voice.speak.md +0 -157
  428. package/dist/docs/references/reference-workflows-run-methods-cancel.md +0 -86
  429. package/dist/docs/references/reference-workflows-run-methods-restart.md +0 -33
  430. package/dist/docs/references/reference-workflows-run-methods-resume.md +0 -59
  431. package/dist/docs/references/reference-workflows-run-methods-start.md +0 -58
  432. package/dist/docs/references/reference-workflows-run-methods-startAsync.md +0 -67
  433. package/dist/docs/references/reference-workflows-run-methods-timeTravel.md +0 -142
  434. package/dist/docs/references/reference-workflows-run.md +0 -59
  435. package/dist/docs/references/reference-workflows-step.md +0 -119
  436. package/dist/docs/references/reference-workflows-workflow-methods-branch.md +0 -25
  437. package/dist/docs/references/reference-workflows-workflow-methods-commit.md +0 -17
  438. package/dist/docs/references/reference-workflows-workflow-methods-create-run.md +0 -63
  439. package/dist/docs/references/reference-workflows-workflow-methods-dountil.md +0 -25
  440. package/dist/docs/references/reference-workflows-workflow-methods-dowhile.md +0 -25
  441. package/dist/docs/references/reference-workflows-workflow-methods-foreach.md +0 -118
  442. package/dist/docs/references/reference-workflows-workflow-methods-map.md +0 -93
  443. package/dist/docs/references/reference-workflows-workflow-methods-parallel.md +0 -21
  444. package/dist/docs/references/reference-workflows-workflow-methods-sleep.md +0 -35
  445. package/dist/docs/references/reference-workflows-workflow-methods-sleepUntil.md +0 -35
  446. package/dist/docs/references/reference-workflows-workflow-methods-then.md +0 -21
  447. package/dist/docs/references/reference-workflows-workflow.md +0 -157
  448. package/dist/docs/references/reference-workspace-filesystem.md +0 -202
  449. package/dist/docs/references/reference-workspace-local-filesystem.md +0 -327
  450. package/dist/docs/references/reference-workspace-local-sandbox.md +0 -285
  451. package/dist/docs/references/reference-workspace-sandbox.md +0 -81
  452. package/dist/docs/references/reference-workspace-workspace-class.md +0 -226
  453. package/dist/docs/references/reference.md +0 -276
@@ -1,519 +0,0 @@
- # Custom scorers
-
- Mastra provides a unified `createScorer` factory that allows you to build custom evaluation logic using either JavaScript functions or LLM-based prompt objects for each step. This flexibility lets you choose the best approach for each part of your evaluation pipeline.
-
- ### The Four-Step Pipeline
-
- All scorers in Mastra follow a consistent four-step evaluation pipeline:
-
- 1. **preprocess** (optional): Prepare or transform input/output data
- 2. **analyze** (optional): Perform evaluation analysis and gather insights
- 3. **generateScore** (required): Convert analysis into a numerical score
- 4. **generateReason** (optional): Generate human-readable explanations
-
- Each step can use either **functions** or **prompt objects** (LLM-based evaluation), giving you the flexibility to combine deterministic algorithms with AI judgment as needed.
15
-
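The data flow between the four steps can be sketched without any LLM or Mastra import. This is a minimal stand-in, not Mastra's implementation: `scoreRecipe`, the `Run` type, and the keyword list are hypothetical names chosen for illustration, and only the step names and the `preprocessStepResult` / `analyzeStepResult` keys are taken from the text above.

```typescript
// Illustrative stand-in for the four-step flow; this is NOT Mastra's code.
type Run = { input: string; output: string };

type Results = {
  preprocessStepResult: { words: string[] };
  analyzeStepResult: { glutenWords: string[] };
};

function scoreRecipe(run: Run): { score: number; reason: string; results: Results } {
  // 1. preprocess: normalise the output into lowercase words
  const preprocessStepResult = {
    words: run.output.toLowerCase().split(/\W+/).filter(Boolean),
  };

  // 2. analyze: consume the preprocess result and gather findings
  const keywords = ["wheat", "flour", "barley", "rye"]; // assumed keyword list
  const analyzeStepResult = {
    glutenWords: preprocessStepResult.words.filter((w) => keywords.indexOf(w) !== -1),
  };

  // 3. generateScore: turn the analysis into a number
  const score = analyzeStepResult.glutenWords.length === 0 ? 1 : 0;

  // 4. generateReason: explain the score in plain language
  const reason =
    score === 1
      ? "No gluten keywords found."
      : `Found: ${analyzeStepResult.glutenWords.join(", ")}`;

  return { score, reason, results: { preprocessStepResult, analyzeStepResult } };
}
```

Each step only reads what earlier steps produced, which is why `generateScore` is the one step that cannot be skipped: without it there is nothing for `generateReason` to explain.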
16
- ### Functions vs Prompt Objects
17
-
18
- **Functions** use JavaScript for deterministic logic. They're ideal for:
19
-
20
- - Algorithmic evaluations with clear criteria
21
- - Performance-critical scenarios
22
- - Integration with existing libraries
23
- - Consistent, reproducible results
24
-
25
- **Prompt Objects** use LLMs as judges for evaluation. They're perfect for:
26
-
27
- - Subjective evaluations requiring human-like judgment
28
- - Complex criteria difficult to code algorithmically
29
- - Natural language understanding tasks
30
- - Nuanced context evaluation
31
-
32
- **What “prompt object” means:** Instead of a function, the step is an object with `description` + `createPrompt` (and `outputSchema` for `preprocess`/`analyze`). That object tells Mastra to run the judge LLM for the step and store the structured output in `results.<step>StepResult`.
33
-
34
- You can mix and match approaches within a single scorer - for example, use a function for preprocessing data and an LLM for analyzing quality.
35
-
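One way to picture the difference between the two step kinds is a dispatcher that either calls the function directly or builds a prompt and hands it to a judge. The sketch below is an assumption-laden illustration, not Mastra's internals: `runStep`, `StepContext`, and the synchronous `Judge` stub are hypothetical, and a real judge would be an asynchronous LLM call that returns structured output.

```typescript
// Hypothetical types for illustration only; Mastra's real types differ.
type StepContext = { run: { output: string } };
type FunctionStep<T> = (ctx: StepContext) => T;
type PromptObjectStep = {
  description: string;
  createPrompt: (ctx: StepContext) => string;
};
// Stand-in for the judge LLM; synchronous here to keep the sketch short.
type Judge = (prompt: string) => string;

function runStep<T>(
  step: FunctionStep<T> | PromptObjectStep,
  ctx: StepContext,
  judge: Judge,
): T | string {
  if (typeof step === "function") {
    // Function step: deterministic logic, the judge is never called.
    return step(ctx);
  }
  // Prompt object step: build the prompt and let the judge answer it.
  return judge(step.createPrompt(ctx));
}
```

With this shape, a scorer whose steps are all functions never touches the judge, while a single prompt object step is enough to trigger a judge call for that step only.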
36
- ### Initializing a Scorer
37
-
38
- Every scorer starts with the `createScorer` factory function, which requires an id and description, and optionally accepts a type specification and judge configuration.
39
-
40
- ```typescript
41
- import { createScorer } from '@mastra/core/evals';
42
-
43
- const glutenCheckerScorer = createScorer({
44
- id: 'gluten-checker',
45
- description: 'Check if recipes contain gluten ingredients',
46
- judge: { // Optional: for prompt object steps
47
- model: 'openai/gpt-5.1',
48
- instructions: 'You are a Chef that identifies if recipes contain gluten.'
49
- }
50
- })
51
- // Chain step methods here
52
- .preprocess(...)
53
- .analyze(...)
54
- .generateScore(...)
55
- .generateReason(...)
56
- ```
57
-
58
- The judge configuration is only needed if you plan to use prompt objects in any step. Individual steps can override this default configuration with their own judge settings.
59
-
60
- If all steps are function-based, the judge is never called and there is no judge output. To see LLM output, define at least one step as a prompt object and read the corresponding step result (for example, `results.analyzeStepResult`).
61
-
62
- #### Minimal judge example (prompt object)
63
-
64
- This example uses a prompt object in `analyze`, so the judge runs and its structured output is available as `results.analyzeStepResult`.
65
-
66
- ```typescript
67
- import { createScorer } from "@mastra/core/evals";
68
- import { z } from "zod";
69
-
70
- const quoteSourcesScorer = createScorer({
71
- id: "quote-sources",
72
- description: "Check if the response includes sources",
73
- judge: {
74
- model: "openai/gpt-4.1-nano",
75
- instructions: "You are a strict evaluator.",
76
- },
77
- })
78
- .analyze({
79
- description: "Detect whether sources are present",
80
- outputSchema: z.object({
81
- hasSources: z.boolean(),
82
- sources: z.array(z.string()),
83
- }),
84
- createPrompt: ({ run }) => `
85
- Does the response contain sources? Extract them as a list.
86
-
87
- Response:
88
- ${run.output}
89
- `,
90
- })
91
- .generateScore(({ results }) => (results.analyzeStepResult.hasSources ? 1 : 0));
92
-
93
- // Run the scorer and inspect judge output
94
- const result = await quoteSourcesScorer.run({
95
- input: "What is the capital of France?",
96
- output: "Paris is the capital of France [1]. Source: [1] Wikipedia",
97
- });
98
-
99
- console.log(result.score); // 1
100
- console.log(result.analyzeStepResult); // { hasSources: true, sources: ["Wikipedia"] }
101
- ```
102
-
103
- #### Agent Type for Agent Evaluation
104
-
105
- For type safety and compatibility with both live agent scoring and trace scoring, use `type: 'agent'` when creating scorers for agent evaluation. The same scorer can then be attached to an agent and used to score traces:
106
-
107
- ```typescript
108
- const myScorer = createScorer({
109
- type: "agent", // Automatically handles agent input/output types
110
- }).generateScore(({ run, results }) => {
111
- // run.output is automatically typed as ScorerRunOutputForAgent
112
- // run.input is automatically typed as ScorerRunInputForAgent
113
- });
114
- ```
115
-
116
- ### Step-by-Step Breakdown
117
-
118
- #### preprocess Step (Optional)
119
-
120
- Prepares input/output data when you need to extract specific elements, filter content, or transform complex data structures.
121
-
122
- **Functions:** `({ run, results }) => any`
123
-
124
- ```typescript
125
- const glutenCheckerScorer = createScorer(...)
126
- .preprocess(({ run }) => {
127
- // Extract and clean recipe text
128
- const recipeText = run.output.text.toLowerCase();
129
- const wordCount = recipeText.split(' ').length;
130
-
131
- return {
132
- recipeText,
133
- wordCount,
134
- hasCommonGlutenWords: /flour|wheat|bread|pasta/.test(recipeText)
135
- };
136
- })
137
- ```
138
-
139
- **Prompt Objects:** Use `description`, `outputSchema`, and `createPrompt` to structure LLM-based preprocessing.
140
-
141
- ```typescript
142
- const glutenCheckerScorer = createScorer(...)
143
- .preprocess({
144
- description: 'Extract ingredients from the recipe',
145
- outputSchema: z.object({
146
- ingredients: z.array(z.string()),
147
- cookingMethods: z.array(z.string())
148
- }),
149
- createPrompt: ({ run }) => `
150
- Extract all ingredients and cooking methods from this recipe:
151
- ${run.output.text}
152
-
153
- Return JSON with ingredients and cookingMethods arrays.
154
- `
155
- })
156
- ```
157
-
158
- **Data Flow:** Results are available to subsequent steps as `results.preprocessStepResult`
159
-
160
- #### analyze Step (Optional)
161
-
162
- Performs core evaluation analysis, gathering insights that will inform the scoring decision.
163
-
164
- **Functions:** `({ run, results }) => any`
165
-
166
- ```typescript
167
- const glutenCheckerScorer = createScorer({...})
168
- .preprocess(...)
169
- .analyze(({ run, results }) => {
170
- const { recipeText, hasCommonGlutenWords } = results.preprocessStepResult;
171
-
172
- // Simple gluten detection algorithm
173
- const glutenKeywords = ['wheat', 'flour', 'barley', 'rye', 'bread'];
174
- const foundGlutenWords = glutenKeywords.filter(word =>
175
- recipeText.includes(word)
176
- );
177
-
178
- return {
179
- isGlutenFree: foundGlutenWords.length === 0,
180
- detectedGlutenSources: foundGlutenWords,
181
- confidence: hasCommonGlutenWords ? 0.9 : 0.7
182
- };
183
- })
184
- ```
185
-
186
- **Prompt Objects:** Use `description`, `outputSchema`, and `createPrompt` for LLM-based analysis.
187
-
188
- ```typescript
189
- const glutenCheckerScorer = createScorer({...})
190
- .preprocess(...)
191
- .analyze({
192
- description: 'Analyze recipe for gluten content',
193
- outputSchema: z.object({
194
- isGlutenFree: z.boolean(),
195
- glutenSources: z.array(z.string()),
196
- confidence: z.number().min(0).max(1)
197
- }),
198
- createPrompt: ({ run, results }) => `
199
- Analyze this recipe for gluten content:
200
- "${results.preprocessStepResult.recipeText}"
201
-
202
- Look for wheat, barley, rye, and hidden sources like soy sauce.
203
- Return JSON with isGlutenFree, glutenSources array, and confidence (0-1).
204
- `
205
- })
206
- ```
207
-
208
- **Data Flow:** Results are available to subsequent steps as `results.analyzeStepResult`
209
-
210
- #### generateScore Step (Required)
211
-
212
- Converts analysis results into a numerical score. This is the only required step in the pipeline.
213
-
214
- **Functions:** `({ run, results }) => number`
215
-
216
- ```typescript
217
- const glutenCheckerScorer = createScorer({...})
218
- .preprocess(...)
219
- .analyze(...)
220
- .generateScore(({ results }) => {
221
- const { isGlutenFree, confidence } = results.analyzeStepResult;
222
-
223
- // Return 1 for gluten-free, 0 for contains gluten
224
- // Weight by confidence level
225
- return isGlutenFree ? confidence : 0;
226
- })
227
- ```
228
-
229
- **Prompt Objects:** See the [`createScorer`](https://mastra.ai/reference/evals/create-scorer) API reference for details on using prompt objects with `generateScore`, including the required `calculateScore` function.
230
-
231
- **Data Flow:** The score is available to generateReason as the `score` parameter
232
-
233
- #### generateReason Step (Optional)
234
-
235
- Generates human-readable explanations for the score, useful for debugging, transparency, or user feedback.
236
-
237
- **Functions:** `({ run, results, score }) => string`
238
-
239
- ```typescript
240
- const glutenCheckerScorer = createScorer({...})
241
- .preprocess(...)
242
- .analyze(...)
243
- .generateScore(...)
244
- .generateReason(({ results, score }) => {
245
- const { isGlutenFree, glutenSources } = results.analyzeStepResult;
246
-
247
- if (isGlutenFree) {
248
- return `Score: ${score}. This recipe is gluten-free with no harmful ingredients detected.`;
249
- } else {
250
- return `Score: ${score}. Contains gluten from: ${glutenSources.join(', ')}`;
251
- }
252
- })
253
- ```
254
-
255
- **Prompt Objects:** Use `description` and `createPrompt` for LLM-generated explanations.
256
-
257
- ```typescript
258
- const glutenCheckerScorer = createScorer({...})
259
- .preprocess(...)
260
- .analyze(...)
261
- .generateScore(...)
262
- .generateReason({
263
- description: 'Explain the gluten assessment',
264
- createPrompt: ({ results, score }) => `
265
- Explain why this recipe received a score of ${score}.
266
- Analysis: ${JSON.stringify(results.analyzeStepResult)}
267
-
268
- Provide a clear explanation for someone with dietary restrictions.
269
- `
270
- })
271
- ```
272
-
273
- ## Example: Create a custom scorer
274
-
275
- A custom scorer in Mastra uses `createScorer` with four core components:
276
-
277
- 1. [**Judge Configuration**](#judge-configuration)
278
- 2. [**Analysis Step**](#analysis-step)
279
- 3. [**Score Generation**](#score-generation)
280
- 4. [**Reason Generation**](#reason-generation)
281
-
282
- Together, these components allow you to define custom evaluation logic using LLMs as judges.
283
-
284
- > **Info:** Visit [createScorer](https://mastra.ai/reference/evals/create-scorer) for the full API and configuration options.
285
-
286
- ```typescript
287
- import { createScorer } from "@mastra/core/evals";
288
- import { z } from "zod";
289
-
290
- export const GLUTEN_INSTRUCTIONS = `You are a Chef that identifies if recipes contain gluten.`;
291
-
292
- export const generateGlutenPrompt = ({
293
- output,
294
- }: {
295
- output: string;
296
- }) => `Check if this recipe is gluten-free.
297
-
298
- Check for:
299
- - Wheat
300
- - Barley
301
- - Rye
302
- - Common sources like flour, pasta, bread
303
-
304
- Example with gluten:
305
- "Mix flour and water to make dough"
306
- Response: {
307
- "isGlutenFree": false,
308
- "glutenSources": ["flour"]
309
- }
310
-
311
- Example gluten-free:
312
- "Mix rice, beans, and vegetables"
313
- Response: {
314
- "isGlutenFree": true,
315
- "glutenSources": []
316
- }
317
-
318
- Recipe to analyze:
319
- ${output}
320
-
321
- Return your response in this format:
322
- {
323
- "isGlutenFree": boolean,
324
- "glutenSources": ["list ingredients containing gluten"]
325
- }`;
326
-
327
- export const generateReasonPrompt = ({
328
- isGlutenFree,
329
- glutenSources,
330
- }: {
331
- isGlutenFree: boolean;
332
- glutenSources: string[];
333
- }) => `Explain why this recipe is${isGlutenFree ? "" : " not"} gluten-free.
334
-
335
- ${glutenSources.length > 0 ? `Sources of gluten: ${glutenSources.join(", ")}` : "No gluten-containing ingredients found"}
336
-
337
- Return your response in this format:
338
- "This recipe is [gluten-free/contains gluten] because [explanation]"`;
339
-
340
- export const glutenCheckerScorer = createScorer({
341
- id: "gluten-checker",
342
- description: "Check if the output contains any gluten",
343
- judge: {
344
- model: "openai/gpt-4.1-nano",
345
- instructions: GLUTEN_INSTRUCTIONS,
346
- },
347
- })
348
- .analyze({
349
- description: "Analyze the output for gluten",
350
- outputSchema: z.object({
351
- isGlutenFree: z.boolean(),
352
- glutenSources: z.array(z.string()),
353
- }),
354
- createPrompt: ({ run }) => {
355
- const { output } = run;
356
- return generateGlutenPrompt({ output: output.text });
357
- },
358
- })
359
- .generateScore(({ results }) => {
360
- return results.analyzeStepResult.isGlutenFree ? 1 : 0;
361
- })
362
- .generateReason({
363
- description: "Generate a reason for the score",
364
- createPrompt: ({ results }) => {
365
- return generateReasonPrompt({
366
- glutenSources: results.analyzeStepResult.glutenSources,
367
- isGlutenFree: results.analyzeStepResult.isGlutenFree,
368
- });
369
- },
370
- });
371
- ```
372
-
373
- ### Judge Configuration
374
-
375
- Sets up the LLM model and defines its role as a domain expert.
376
-
377
- ```typescript
378
- judge: {
379
- model: 'openai/gpt-4.1-nano',
380
- instructions: GLUTEN_INSTRUCTIONS,
381
- }
382
- ```
383
-
384
- ### Analysis Step
385
-
386
- Defines how the LLM should analyze the input and what structured output to return.
387
-
388
- ```typescript
389
- .analyze({
390
- description: 'Analyze the output for gluten',
391
- outputSchema: z.object({
392
- isGlutenFree: z.boolean(),
393
- glutenSources: z.array(z.string()),
394
- }),
395
- createPrompt: ({ run }) => {
396
- const { output } = run;
397
- return generateGlutenPrompt({ output: output.text });
398
- },
399
- })
400
- ```
401
-
402
- The analysis step uses a prompt object to:
403
-
404
- - Provide a clear description of the analysis task
405
- - Define the expected output structure with a Zod schema (a boolean result and a list of gluten sources)
406
- - Generate dynamic prompts based on the input content
407
-
408
- ### Score Generation
409
-
410
- Converts the LLM's structured analysis into a numerical score.
411
-
412
- ```typescript
413
- .generateScore(({ results }) => {
414
- return results.analyzeStepResult.isGlutenFree ? 1 : 0;
415
- })
416
- ```
417
-
418
- The score generation function takes the analysis results and applies business logic to produce a score. In this case, the LLM directly determines if the recipe is gluten-free, so we use that boolean result: 1 for gluten-free, 0 for contains gluten.
419
-
420
- ### Reason Generation
421
-
422
- Provides human-readable explanations for the score using another LLM call.
423
-
424
- ```typescript
425
- .generateReason({
426
- description: 'Generate a reason for the score',
427
- createPrompt: ({ results }) => {
428
- return generateReasonPrompt({
429
- glutenSources: results.analyzeStepResult.glutenSources,
430
- isGlutenFree: results.analyzeStepResult.isGlutenFree,
431
- });
432
- },
433
- })
434
- ```
435
-
436
- The reason generation step creates explanations that help users understand why a particular score was assigned, using both the boolean result and the specific gluten sources identified by the analysis step.
437
-
438
- ## High gluten-free example
439
-
440
- ```typescript
441
- const result = await glutenCheckerScorer.run({
442
- input: [{ role: 'user', content: 'Mix rice, beans, and vegetables' }],
443
- output: { text: 'Mix rice, beans, and vegetables' },
444
- });
445
-
446
- console.log('Score:', result.score);
447
- console.log('Gluten sources:', result.analyzeStepResult.glutenSources);
448
- console.log('Reason:', result.reason);
449
- ```
450
-
451
- ### High gluten-free output
452
-
453
- ```typescript
454
- {
455
- score: 1,
456
- analyzeStepResult: {
457
- isGlutenFree: true,
458
- glutenSources: []
459
- },
460
- reason: 'This recipe is gluten-free because rice, beans, and vegetables are naturally gluten-free ingredients that are safe for people with celiac disease.'
461
- }
462
- ```
463
-
464
- ## Partial gluten example
465
-
466
- ```typescript
467
- const result = await glutenCheckerScorer.run({
468
- input: [{ role: "user", content: "Mix flour and water to make dough" }],
469
- output: { text: "Mix flour and water to make dough" },
470
- });
471
-
472
- console.log("Score:", result.score);
473
- console.log("Gluten sources:", result.analyzeStepResult.glutenSources);
474
- console.log("Reason:", result.reason);
475
- ```
476
-
477
- ### Partial gluten output
478
-
479
- ```typescript
480
- {
481
- score: 0,
482
- analyzeStepResult: {
483
- isGlutenFree: false,
484
- glutenSources: ['flour']
485
- },
486
- reason: 'This recipe is not gluten-free because it contains flour. Regular flour is made from wheat and contains gluten, making it unsafe for people with celiac disease or gluten sensitivity.'
487
- }
488
- ```
489
-
490
- ## Low gluten-free example
491
-
492
- ```typescript
493
- const result = await glutenCheckerScorer.run({
494
- input: [{ role: "user", content: "Add soy sauce and noodles" }],
495
- output: { text: "Add soy sauce and noodles" },
496
- });
497
-
498
- console.log("Score:", result.score);
499
- console.log("Gluten sources:", result.analyzeStepResult.glutenSources);
500
- console.log("Reason:", result.reason);
501
- ```
502
-
503
- ### Low gluten-free output
504
-
505
- ```typescript
506
- {
507
- score: 0,
508
- analyzeStepResult: {
509
- isGlutenFree: false,
510
- glutenSources: ['soy sauce', 'noodles']
511
- },
512
- reason: 'This recipe is not gluten-free because it contains soy sauce, noodles. Regular soy sauce contains wheat and most noodles are made from wheat flour, both of which contain gluten and are unsafe for people with gluten sensitivity.'
513
- }
514
- ```
515
-
516
- **Examples and Resources:**
517
-
518
- - [createScorer API Reference](https://mastra.ai/reference/evals/create-scorer) - Complete technical documentation
519
- - [Built-in Scorers Source Code](https://github.com/mastra-ai/mastra/tree/main/packages/evals/src/scorers) - Real implementations for reference
@@ -1,146 +0,0 @@
1
- # Scorers overview
2
-
3
- While traditional software tests have clear pass/fail conditions, AI outputs are non-deterministic — they can vary with the same input. **Scorers** help bridge this gap by providing quantifiable metrics for measuring agent quality.
4
-
5
- Scorers are automated tests that evaluate agent outputs using model-graded, rule-based, and statistical methods. Scorers return **scores**: numerical values (typically between 0 and 1) that quantify how well an output meets your evaluation criteria. These scores enable you to objectively track performance, compare different approaches, and identify areas for improvement in your AI systems. Scorers can be customized with your own prompts and scoring functions.
6
-
7
- Scorers can run in the cloud, capturing real-time results. They can also be part of your CI/CD pipeline, allowing you to test and monitor your agents over time.
8
-
9
- ## Types of Scorers
10
-
11
- There are different kinds of scorers, each serving a specific purpose. Here are some common types:
12
-
13
- 1. **Textual Scorers**: Evaluate accuracy, reliability, and context understanding of agent responses
14
- 2. **Classification Scorers**: Measure accuracy in categorizing data based on predefined categories
15
- 3. **Prompt Engineering Scorers**: Explore impact of different instructions and input formats
16
-
17
- ## Installation
18
-
19
- To use Mastra's scorers, install the `@mastra/evals` package.
20
-
21
- **npm**:
22
-
23
- ```bash
24
- npm install @mastra/evals@latest
25
- ```
26
-
27
- **pnpm**:
28
-
29
- ```bash
30
- pnpm add @mastra/evals@latest
31
- ```
32
-
33
- **Yarn**:
34
-
35
- ```bash
36
- yarn add @mastra/evals@latest
37
- ```
38
-
39
- **Bun**:
40
-
41
- ```bash
42
- bun add @mastra/evals@latest
43
- ```
44
-
45
- ## Live evaluations
46
-
47
- **Live evaluations** allow you to automatically score AI outputs in real-time as your agents and workflows operate. Instead of running evaluations manually or in batches, scorers run asynchronously alongside your AI systems, providing continuous quality monitoring.
48
-
49
- ### Adding scorers to agents
50
-
51
- You can add built-in scorers to your agents to automatically evaluate their outputs. See the [full list of built-in scorers](https://mastra.ai/docs/evals/built-in-scorers) for all available options.
52
-
53
- ```typescript
54
- import { Agent } from "@mastra/core/agent";
55
- import {
56
- createAnswerRelevancyScorer,
57
- createToxicityScorer,
58
- } from "@mastra/evals/scorers/prebuilt";
59
-
60
- export const evaluatedAgent = new Agent({
61
- scorers: {
62
- relevancy: {
63
- scorer: createAnswerRelevancyScorer({ model: "openai/gpt-4.1-nano" }),
64
- sampling: { type: "ratio", rate: 0.5 },
65
- },
66
- safety: {
67
- scorer: createToxicityScorer({ model: "openai/gpt-4.1-nano" }),
68
- sampling: { type: "ratio", rate: 1 },
69
- },
70
- },
71
- });
72
- ```
73
-
74
- ### Adding scorers to workflow steps
75
-
76
- You can also add scorers to individual workflow steps to evaluate outputs at specific points in your process:
77
-
78
- ```typescript
79
- import { createWorkflow, createStep } from "@mastra/core/workflows";
80
- import { z } from "zod";
81
- import { customStepScorer } from "../scorers/custom-step-scorer";
82
-
83
- const contentStep = createStep({
84
- scorers: {
85
- customStepScorer: {
86
- scorer: customStepScorer(),
87
- sampling: {
88
- type: "ratio",
89
- rate: 1, // Score every step execution
90
- }
91
- }
92
- },
93
- });
94
-
95
- export const contentWorkflow = createWorkflow({ ... })
96
- .then(contentStep)
97
- .commit();
98
- ```
99
-
100
- ### How live evaluations work
101
-
102
- **Asynchronous execution**: Live evaluations run in the background without blocking your agent responses or workflow execution. This ensures your AI systems maintain their performance while still being monitored.
103
-
104
- **Sampling control**: The `sampling.rate` parameter (0-1) controls what fraction of outputs is scored:
105
-
106
- - `1.0`: Score every single response (100%)
107
- - `0.5`: Score half of all responses (50%)
108
- - `0.1`: Score 10% of responses
109
- - `0.0`: Disable scoring
110
-
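Ratio sampling of this kind can be approximated with a single random draw per output. The helper below is a sketch under that assumption, not Mastra's actual sampling code: `shouldScore` and its injectable `random` parameter are hypothetical and exist only to show how the rate values listed above map to behaviour.

```typescript
// Hypothetical helper: decide whether one output is scored under ratio sampling.
// `random` is injectable so the decision can be tested deterministically.
function shouldScore(rate: number, random: () => number = Math.random): boolean {
  if (rate <= 0) return false; // 0.0 disables scoring
  if (rate >= 1) return true; // 1.0 scores every response
  return random() < rate; // otherwise score roughly `rate` of all responses
}
```

Because each decision is independent, a rate of `0.5` scores about half of the outputs over many runs rather than exactly every second one.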
111
- **Automatic storage**: All scoring results are automatically stored in the `mastra_scorers` table in your configured database, allowing you to analyze performance trends over time.
112
-
113
- ## Trace evaluations
114
-
115
- In addition to live evaluations, you can use scorers to evaluate historical traces from your agent interactions and workflows. This is particularly useful for analyzing past performance, debugging issues, or running batch evaluations.
116
-
117
- > **Info:** **Observability Required**
118
- >
119
- > To score traces, you must first configure observability in your Mastra instance to collect trace data. See [Tracing documentation](https://mastra.ai/docs/observability/tracing/overview) for setup instructions.
120
-
121
- ### Scoring traces with Studio
122
-
123
- To score traces, you first need to register your scorers with your Mastra instance:
124
-
125
- ```typescript
126
- const mastra = new Mastra({
127
- scorers: {
128
- answerRelevancy: myAnswerRelevancyScorer,
129
- responseQuality: myResponseQualityScorer,
130
- },
131
- });
132
- ```
133
-
134
- Once registered, you can score traces interactively within Studio under the Observability section. This provides a user-friendly interface for running scorers against historical traces.
135
-
136
- ## Testing scorers locally
137
-
138
- Mastra provides the `mastra dev` CLI command for testing your scorers locally. Studio includes a scorers section where you can run individual scorers against test inputs and view detailed results.
139
-
140
- For more details, see the [Studio](https://mastra.ai/docs/getting-started/studio) docs.
141
-
142
- ## Next steps
143
-
144
- - Learn how to create your own scorers in the [Creating Custom Scorers](https://mastra.ai/docs/evals/custom-scorers) guide
145
- - Explore built-in scorers in the [Built-in Scorers](https://mastra.ai/docs/evals/built-in-scorers) section
146
- - Test scorers with [Studio](https://mastra.ai/docs/getting-started/studio)