@mastra/core 1.9.0 → 1.10.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +202 -0
- package/dist/agent/agent.d.ts +22 -3
- package/dist/agent/agent.d.ts.map +1 -1
- package/dist/agent/index.cjs +13 -13
- package/dist/agent/index.js +2 -2
- package/dist/agent/message-list/index.cjs +18 -18
- package/dist/agent/message-list/index.js +1 -1
- package/dist/agent/message-list/message-list.d.ts.map +1 -1
- package/dist/{chunk-IOY7Y5GV.js → chunk-3VVNJPTO.js} +602 -187
- package/dist/chunk-3VVNJPTO.js.map +1 -0
- package/dist/{chunk-VDKWYUGC.cjs → chunk-5WBEMKE2.cjs} +7 -3
- package/dist/chunk-5WBEMKE2.cjs.map +1 -0
- package/dist/{chunk-H5S4PS44.cjs → chunk-76Q75VI4.cjs} +602 -187
- package/dist/chunk-76Q75VI4.cjs.map +1 -0
- package/dist/{chunk-ZBESCKPX.cjs → chunk-7AHCLTZZ.cjs} +1572 -36
- package/dist/chunk-7AHCLTZZ.cjs.map +1 -0
- package/dist/{chunk-SEKQJ447.js → chunk-ACAILOJE.js} +166 -68
- package/dist/chunk-ACAILOJE.js.map +1 -0
- package/dist/{chunk-YIJZBU54.cjs → chunk-BCSVBOAN.cjs} +240 -463
- package/dist/chunk-BCSVBOAN.cjs.map +1 -0
- package/dist/{chunk-ZVWVQ6MG.js → chunk-CY4ZWL2X.js} +8 -3
- package/dist/chunk-CY4ZWL2X.js.map +1 -0
- package/dist/{chunk-ET7GXCHS.js → chunk-DIKRJVK6.js} +5 -5
- package/dist/{chunk-ET7GXCHS.js.map → chunk-DIKRJVK6.js.map} +1 -1
- package/dist/{chunk-P6ZX7OKT.cjs → chunk-E6I5LBDM.cjs} +7 -7
- package/dist/{chunk-P6ZX7OKT.cjs.map → chunk-E6I5LBDM.cjs.map} +1 -1
- package/dist/{chunk-BFV3GSGS.js → chunk-FT5Q6XTK.js} +1546 -22
- package/dist/chunk-FT5Q6XTK.js.map +1 -0
- package/dist/{chunk-K54LFB4P.js → chunk-FXOWXS4O.js} +3 -3
- package/dist/{chunk-K54LFB4P.js.map → chunk-FXOWXS4O.js.map} +1 -1
- package/dist/{chunk-QNXY3J6B.cjs → chunk-GCRPNAAR.cjs} +22 -19
- package/dist/chunk-GCRPNAAR.cjs.map +1 -0
- package/dist/{chunk-5M3RMO7U.js → chunk-GTA5BKXZ.js} +8 -8
- package/dist/{chunk-5M3RMO7U.js.map → chunk-GTA5BKXZ.js.map} +1 -1
- package/dist/{chunk-G5R2755Q.cjs → chunk-HAIQ57YB.cjs} +53 -20
- package/dist/chunk-HAIQ57YB.cjs.map +1 -0
- package/dist/{chunk-3ZBLD2Y4.cjs → chunk-HH76UOJL.cjs} +2 -2
- package/dist/{chunk-3ZBLD2Y4.cjs.map → chunk-HH76UOJL.cjs.map} +1 -1
- package/dist/{chunk-HAGCXIBX.cjs → chunk-HNYQITSV.cjs} +9 -9
- package/dist/{chunk-HAGCXIBX.cjs.map → chunk-HNYQITSV.cjs.map} +1 -1
- package/dist/{chunk-GJTLWOKJ.js → chunk-HQA3IBLZ.js} +51 -18
- package/dist/chunk-HQA3IBLZ.js.map +1 -0
- package/dist/{chunk-4BXXAZ75.js → chunk-HQTHWVAK.js} +15 -7
- package/dist/chunk-HQTHWVAK.js.map +1 -0
- package/dist/{chunk-CCLV5CAA.js → chunk-JGOH7RWL.js} +7 -3
- package/dist/chunk-JGOH7RWL.js.map +1 -0
- package/dist/{chunk-Y6TGIUGL.js → chunk-LSF5WJ6G.js} +2 -2
- package/dist/{chunk-Y6TGIUGL.js.map → chunk-LSF5WJ6G.js.map} +1 -1
- package/dist/{chunk-D6HO5QAM.cjs → chunk-M26GEN4C.cjs} +14 -9
- package/dist/chunk-M26GEN4C.cjs.map +1 -0
- package/dist/{chunk-4P35AVPE.cjs → chunk-MEMYFFOL.cjs} +256 -157
- package/dist/chunk-MEMYFFOL.cjs.map +1 -0
- package/dist/{chunk-BQ355Z3O.js → chunk-NUZWQA4J.js} +4 -4
- package/dist/{chunk-BQ355Z3O.js.map → chunk-NUZWQA4J.js.map} +1 -1
- package/dist/{chunk-JIRB5LX4.js → chunk-OHX36YXF.js} +5 -3
- package/dist/chunk-OHX36YXF.js.map +1 -0
- package/dist/{chunk-PKORY4ZZ.cjs → chunk-OOTLMVNN.cjs} +107 -7
- package/dist/chunk-OOTLMVNN.cjs.map +1 -0
- package/dist/{chunk-D4M6E4OQ.cjs → chunk-OWZ6QT24.cjs} +16 -8
- package/dist/chunk-OWZ6QT24.cjs.map +1 -0
- package/dist/{chunk-6OHS6ZQ3.js → chunk-PDJCIONR.js} +8 -8
- package/dist/{chunk-6OHS6ZQ3.js.map → chunk-PDJCIONR.js.map} +1 -1
- package/dist/{chunk-WU2P7XOU.cjs → chunk-PP4G2TZZ.cjs} +20 -20
- package/dist/{chunk-WU2P7XOU.cjs.map → chunk-PP4G2TZZ.cjs.map} +1 -1
- package/dist/{chunk-VWAK25TU.js → chunk-SBBWRNZA.js} +10 -7
- package/dist/chunk-SBBWRNZA.js.map +1 -0
- package/dist/{chunk-TU4YMXOQ.js → chunk-TJIFNVPX.js} +6 -6
- package/dist/{chunk-TU4YMXOQ.js.map → chunk-TJIFNVPX.js.map} +1 -1
- package/dist/{chunk-V5YNS2QO.cjs → chunk-XJNQHPBJ.cjs} +15 -15
- package/dist/{chunk-V5YNS2QO.cjs.map → chunk-XJNQHPBJ.cjs.map} +1 -1
- package/dist/{chunk-MVYP55NA.js → chunk-YC6Z75K2.js} +113 -336
- package/dist/chunk-YC6Z75K2.js.map +1 -0
- package/dist/{chunk-3ZF7IC6Q.cjs → chunk-YZMSJKAK.cjs} +5 -3
- package/dist/chunk-YZMSJKAK.cjs.map +1 -0
- package/dist/{chunk-L3WDI7HP.cjs → chunk-Z2X5VTYJ.cjs} +65 -65
- package/dist/{chunk-L3WDI7HP.cjs.map → chunk-Z2X5VTYJ.cjs.map} +1 -1
- package/dist/{chunk-BN5MV2QK.cjs → chunk-Z52OAJ73.cjs} +4 -4
- package/dist/{chunk-BN5MV2QK.cjs.map → chunk-Z52OAJ73.cjs.map} +1 -1
- package/dist/{chunk-WYPSC6CO.js → chunk-ZBU6P4JV.js} +104 -4
- package/dist/chunk-ZBU6P4JV.js.map +1 -0
- package/dist/datasets/experiment/index.d.ts.map +1 -1
- package/dist/datasets/experiment/scorer.d.ts.map +1 -1
- package/dist/datasets/index.cjs +17 -17
- package/dist/datasets/index.js +2 -2
- package/dist/di/index.cjs +4 -4
- package/dist/di/index.js +1 -1
- package/dist/docs/SKILL.md +1 -1
- package/dist/docs/assets/SOURCE_MAP.json +443 -384
- package/dist/docs/references/docs-agents-agent-approval.md +61 -31
- package/dist/docs/references/docs-agents-supervisor-agents.md +1 -1
- package/dist/docs/references/docs-memory-observational-memory.md +9 -0
- package/dist/docs/references/docs-memory-semantic-recall.md +17 -1
- package/dist/docs/references/docs-workspace-skills.md +7 -5
- package/dist/docs/references/reference-agents-agent.md +20 -20
- package/dist/docs/references/reference-agents-generate.md +200 -66
- package/dist/docs/references/reference-agents-generateLegacy.md +77 -35
- package/dist/docs/references/reference-agents-getDefaultGenerateOptions.md +4 -6
- package/dist/docs/references/reference-agents-getDefaultOptions.md +4 -6
- package/dist/docs/references/reference-agents-getDefaultStreamOptions.md +4 -6
- package/dist/docs/references/reference-agents-getDescription.md +1 -1
- package/dist/docs/references/reference-agents-getInstructions.md +4 -6
- package/dist/docs/references/reference-agents-getLLM.md +6 -8
- package/dist/docs/references/reference-agents-getMemory.md +4 -6
- package/dist/docs/references/reference-agents-getModel.md +4 -6
- package/dist/docs/references/reference-agents-getTools.md +5 -7
- package/dist/docs/references/reference-agents-getVoice.md +4 -6
- package/dist/docs/references/reference-agents-listAgents.md +4 -6
- package/dist/docs/references/reference-agents-listScorers.md +4 -6
- package/dist/docs/references/reference-agents-listTools.md +4 -6
- package/dist/docs/references/reference-agents-listWorkflows.md +4 -6
- package/dist/docs/references/reference-agents-network.md +69 -23
- package/dist/docs/references/reference-ai-sdk-chat-route.md +7 -7
- package/dist/docs/references/reference-ai-sdk-network-route.md +3 -3
- package/dist/docs/references/reference-ai-sdk-to-ai-sdk-stream.md +9 -9
- package/dist/docs/references/reference-ai-sdk-with-mastra.md +12 -12
- package/dist/docs/references/reference-ai-sdk-workflow-route.md +3 -3
- package/dist/docs/references/reference-auth-auth0.md +6 -6
- package/dist/docs/references/reference-auth-clerk.md +5 -5
- package/dist/docs/references/reference-auth-firebase.md +7 -7
- package/dist/docs/references/reference-auth-jwt.md +1 -1
- package/dist/docs/references/reference-auth-supabase.md +4 -4
- package/dist/docs/references/reference-auth-workos.md +6 -6
- package/dist/docs/references/reference-core-addGateway.md +2 -2
- package/dist/docs/references/reference-core-getAgent.md +2 -2
- package/dist/docs/references/reference-core-getAgentById.md +2 -2
- package/dist/docs/references/reference-core-getDeployer.md +1 -1
- package/dist/docs/references/reference-core-getGateway.md +2 -2
- package/dist/docs/references/reference-core-getGatewayById.md +2 -2
- package/dist/docs/references/reference-core-getLogger.md +1 -1
- package/dist/docs/references/reference-core-getMCPServer.md +2 -2
- package/dist/docs/references/reference-core-getMCPServerById.md +3 -3
- package/dist/docs/references/reference-core-getMemory.md +2 -2
- package/dist/docs/references/reference-core-getScorer.md +2 -2
- package/dist/docs/references/reference-core-getScorerById.md +2 -2
- package/dist/docs/references/reference-core-getServer.md +1 -1
- package/dist/docs/references/reference-core-getStorage.md +1 -1
- package/dist/docs/references/reference-core-getStoredAgentById.md +18 -20
- package/dist/docs/references/reference-core-getTelemetry.md +1 -1
- package/dist/docs/references/reference-core-getVector.md +2 -2
- package/dist/docs/references/reference-core-getWorkflow.md +3 -3
- package/dist/docs/references/reference-core-listAgents.md +1 -1
- package/dist/docs/references/reference-core-listGateways.md +1 -1
- package/dist/docs/references/reference-core-listLogs.md +9 -11
- package/dist/docs/references/reference-core-listLogsByRunId.md +9 -9
- package/dist/docs/references/reference-core-listMCPServers.md +1 -1
- package/dist/docs/references/reference-core-listMemory.md +1 -1
- package/dist/docs/references/reference-core-listScorers.md +1 -1
- package/dist/docs/references/reference-core-listStoredAgents.md +9 -11
- package/dist/docs/references/reference-core-listVectors.md +1 -1
- package/dist/docs/references/reference-core-listWorkflows.md +2 -2
- package/dist/docs/references/reference-core-mastra-class.md +17 -17
- package/dist/docs/references/reference-core-mastra-model-gateway.md +15 -15
- package/dist/docs/references/reference-core-setLogger.md +2 -4
- package/dist/docs/references/reference-core-setStorage.md +1 -1
- package/dist/docs/references/reference-datasets-addItem.md +20 -4
- package/dist/docs/references/reference-datasets-addItems.md +8 -2
- package/dist/docs/references/reference-datasets-compareExperiments.md +15 -3
- package/dist/docs/references/reference-datasets-create.md +6 -6
- package/dist/docs/references/reference-datasets-dataset.md +1 -1
- package/dist/docs/references/reference-datasets-delete.md +2 -2
- package/dist/docs/references/reference-datasets-deleteExperiment.md +2 -2
- package/dist/docs/references/reference-datasets-deleteItem.md +2 -2
- package/dist/docs/references/reference-datasets-deleteItems.md +2 -2
- package/dist/docs/references/reference-datasets-get.md +2 -2
- package/dist/docs/references/reference-datasets-getDetails.md +9 -9
- package/dist/docs/references/reference-datasets-getExperiment.md +2 -2
- package/dist/docs/references/reference-datasets-getItem.md +3 -3
- package/dist/docs/references/reference-datasets-getItemHistory.md +22 -2
- package/dist/docs/references/reference-datasets-list.md +7 -3
- package/dist/docs/references/reference-datasets-listExperimentResults.md +34 -4
- package/dist/docs/references/reference-datasets-listExperiments.md +41 -3
- package/dist/docs/references/reference-datasets-listItems.md +18 -6
- package/dist/docs/references/reference-datasets-listVersions.md +23 -3
- package/dist/docs/references/reference-datasets-startExperiment.md +62 -12
- package/dist/docs/references/reference-datasets-startExperimentAsync.md +5 -1
- package/dist/docs/references/reference-datasets-update.md +6 -6
- package/dist/docs/references/reference-datasets-updateItem.md +5 -5
- package/dist/docs/references/reference-evals-answer-relevancy.md +11 -11
- package/dist/docs/references/reference-evals-answer-similarity.md +17 -19
- package/dist/docs/references/reference-evals-bias.md +10 -10
- package/dist/docs/references/reference-evals-completeness.md +3 -3
- package/dist/docs/references/reference-evals-content-similarity.md +6 -6
- package/dist/docs/references/reference-evals-context-precision.md +4 -4
- package/dist/docs/references/reference-evals-create-scorer.md +47 -49
- package/dist/docs/references/reference-evals-faithfulness.md +11 -11
- package/dist/docs/references/reference-evals-hallucination.md +17 -21
- package/dist/docs/references/reference-evals-keyword-coverage.md +4 -4
- package/dist/docs/references/reference-evals-mastra-scorer.md +14 -14
- package/dist/docs/references/reference-evals-run-evals.md +16 -16
- package/dist/docs/references/reference-evals-scorer-utils.md +3 -3
- package/dist/docs/references/reference-evals-textual-difference.md +3 -3
- package/dist/docs/references/reference-evals-tone-consistency.md +3 -3
- package/dist/docs/references/reference-evals-toxicity.md +8 -8
- package/dist/docs/references/reference-harness-harness-class.md +34 -42
- package/dist/docs/references/reference-logging-pino-logger.md +5 -5
- package/dist/docs/references/reference-memory-deleteMessages.md +2 -2
- package/dist/docs/references/reference-memory-memory-class.md +12 -14
- package/dist/docs/references/reference-memory-observational-memory.md +102 -94
- package/dist/docs/references/reference-observability-tracing-configuration.md +27 -10
- package/dist/docs/references/reference-observability-tracing-exporters-console-exporter.md +4 -7
- package/dist/docs/references/reference-processors-batch-parts-processor.md +8 -10
- package/dist/docs/references/reference-processors-language-detector.md +14 -16
- package/dist/docs/references/reference-processors-message-history-processor.md +7 -9
- package/dist/docs/references/reference-processors-moderation-processor.md +13 -15
- package/dist/docs/references/reference-processors-pii-detector.md +14 -16
- package/dist/docs/references/reference-processors-processor-interface.md +62 -62
- package/dist/docs/references/reference-processors-prompt-injection-detector.md +11 -13
- package/dist/docs/references/reference-processors-semantic-recall-processor.md +14 -16
- package/dist/docs/references/reference-processors-system-prompt-scrubber.md +12 -14
- package/dist/docs/references/reference-processors-token-limiter-processor.md +11 -13
- package/dist/docs/references/reference-processors-tool-call-filter.md +5 -7
- package/dist/docs/references/reference-processors-tool-search-processor.md +9 -11
- package/dist/docs/references/reference-processors-unicode-normalizer.md +8 -10
- package/dist/docs/references/reference-processors-working-memory-processor.md +14 -18
- package/dist/docs/references/reference-rag-database-config.md +11 -7
- package/dist/docs/references/reference-rag-embeddings.md +12 -12
- package/dist/docs/references/reference-server-mastra-server.md +10 -10
- package/dist/docs/references/reference-server-register-api-route.md +13 -13
- package/dist/docs/references/reference-storage-cloudflare-d1.md +5 -5
- package/dist/docs/references/reference-storage-composite.md +9 -9
- package/dist/docs/references/reference-storage-lance.md +3 -3
- package/dist/docs/references/reference-storage-libsql.md +2 -2
- package/dist/docs/references/reference-storage-mongodb.md +5 -5
- package/dist/docs/references/reference-storage-mssql.md +2 -2
- package/dist/docs/references/reference-storage-postgresql.md +25 -25
- package/dist/docs/references/reference-storage-upstash.md +3 -3
- package/dist/docs/references/reference-streaming-ChunkType.md +251 -59
- package/dist/docs/references/reference-streaming-agents-MastraModelOutput.md +86 -16
- package/dist/docs/references/reference-streaming-agents-streamLegacy.md +79 -39
- package/dist/docs/references/reference-streaming-workflows-resumeStream.md +18 -8
- package/dist/docs/references/reference-streaming-workflows-stream.md +21 -9
- package/dist/docs/references/reference-streaming-workflows-timeTravelStream.md +4 -4
- package/dist/docs/references/reference-tools-create-tool.md +25 -21
- package/dist/docs/references/reference-tools-graph-rag-tool.md +16 -18
- package/dist/docs/references/reference-tools-mcp-client.md +38 -27
- package/dist/docs/references/reference-tools-mcp-server.md +45 -45
- package/dist/docs/references/reference-tools-vector-query-tool.md +34 -22
- package/dist/docs/references/reference-vectors-libsql.md +31 -31
- package/dist/docs/references/reference-vectors-mongodb.md +32 -32
- package/dist/docs/references/reference-vectors-pg.md +60 -44
- package/dist/docs/references/reference-vectors-upstash.md +25 -25
- package/dist/docs/references/reference-voice-composite-voice.md +10 -10
- package/dist/docs/references/reference-voice-mastra-voice.md +20 -20
- package/dist/docs/references/reference-voice-voice.addInstructions.md +1 -1
- package/dist/docs/references/reference-voice-voice.addTools.md +1 -1
- package/dist/docs/references/reference-voice-voice.connect.md +3 -3
- package/dist/docs/references/reference-voice-voice.events.md +11 -11
- package/dist/docs/references/reference-voice-voice.listen.md +9 -9
- package/dist/docs/references/reference-voice-voice.on.md +2 -2
- package/dist/docs/references/reference-voice-voice.speak.md +11 -11
- package/dist/docs/references/reference-workflows-run-methods-cancel.md +2 -2
- package/dist/docs/references/reference-workflows-run-methods-restart.md +17 -5
- package/dist/docs/references/reference-workflows-run-methods-resume.md +23 -9
- package/dist/docs/references/reference-workflows-run-methods-start.md +22 -8
- package/dist/docs/references/reference-workflows-run-methods-startAsync.md +12 -6
- package/dist/docs/references/reference-workflows-run-methods-timeTravel.md +29 -13
- package/dist/docs/references/reference-workflows-run.md +12 -12
- package/dist/docs/references/reference-workflows-step.md +24 -26
- package/dist/docs/references/reference-workflows-workflow-methods-branch.md +2 -2
- package/dist/docs/references/reference-workflows-workflow-methods-commit.md +1 -1
- package/dist/docs/references/reference-workflows-workflow-methods-create-run.md +4 -4
- package/dist/docs/references/reference-workflows-workflow-methods-dountil.md +3 -3
- package/dist/docs/references/reference-workflows-workflow-methods-dowhile.md +3 -3
- package/dist/docs/references/reference-workflows-workflow-methods-foreach.md +9 -9
- package/dist/docs/references/reference-workflows-workflow-methods-map.md +2 -2
- package/dist/docs/references/reference-workflows-workflow-methods-parallel.md +2 -2
- package/dist/docs/references/reference-workflows-workflow-methods-sleep.md +2 -2
- package/dist/docs/references/reference-workflows-workflow-methods-sleepUntil.md +2 -2
- package/dist/docs/references/reference-workflows-workflow-methods-then.md +2 -2
- package/dist/docs/references/reference-workflows-workflow.md +40 -50
- package/dist/docs/references/reference-workspace-filesystem.md +22 -22
- package/dist/docs/references/reference-workspace-local-filesystem.md +35 -35
- package/dist/docs/references/reference-workspace-local-sandbox.md +26 -26
- package/dist/docs/references/reference-workspace-sandbox.md +8 -8
- package/dist/docs/references/reference-workspace-workspace-class.md +30 -34
- package/dist/editor/types.d.ts +1 -0
- package/dist/editor/types.d.ts.map +1 -1
- package/dist/evals/index.cjs +20 -20
- package/dist/evals/index.js +3 -3
- package/dist/evals/scoreTraces/index.cjs +5 -5
- package/dist/evals/scoreTraces/index.js +2 -2
- package/dist/harness/harness.d.ts.map +1 -1
- package/dist/harness/index.cjs +25 -21
- package/dist/harness/index.cjs.map +1 -1
- package/dist/harness/index.js +15 -11
- package/dist/harness/index.js.map +1 -1
- package/dist/index.cjs +2 -2
- package/dist/index.js +1 -1
- package/dist/integration/index.cjs +2 -2
- package/dist/integration/index.js +1 -1
- package/dist/llm/index.cjs +12 -12
- package/dist/llm/index.js +3 -3
- package/dist/llm/model/model.d.ts.map +1 -1
- package/dist/llm/model/model.loop.d.ts.map +1 -1
- package/dist/llm/model/provider-types.generated.d.ts +5 -1
- package/dist/loop/index.cjs +14 -14
- package/dist/loop/index.js +1 -1
- package/dist/loop/workflows/agentic-loop/index.d.ts.map +1 -1
- package/dist/mastra/index.cjs +2 -2
- package/dist/mastra/index.js +1 -1
- package/dist/memory/index.cjs +14 -14
- package/dist/memory/index.js +1 -1
- package/dist/models-dev-2FJK72J2.cjs +12 -0
- package/dist/{models-dev-UVWCKPA2.cjs.map → models-dev-2FJK72J2.cjs.map} +1 -1
- package/dist/models-dev-CMQG6EMO.js +3 -0
- package/dist/{models-dev-W3LXZTEB.js.map → models-dev-CMQG6EMO.js.map} +1 -1
- package/dist/processor-provider/index.cjs +10 -10
- package/dist/processor-provider/index.js +1 -1
- package/dist/processors/index.cjs +42 -42
- package/dist/processors/index.js +1 -1
- package/dist/processors/processors/skills.d.ts +9 -42
- package/dist/processors/processors/skills.d.ts.map +1 -1
- package/dist/provider-registry-EHOAWHFE.cjs +40 -0
- package/dist/{provider-registry-4HLP2JRR.cjs.map → provider-registry-EHOAWHFE.cjs.map} +1 -1
- package/dist/provider-registry-LVP6T63V.js +3 -0
- package/dist/{provider-registry-K3DWQSMH.js.map → provider-registry-LVP6T63V.js.map} +1 -1
- package/dist/provider-registry.json +12 -4
- package/dist/relevance/index.cjs +3 -3
- package/dist/relevance/index.js +1 -1
- package/dist/request-context/index.cjs +4 -4
- package/dist/request-context/index.d.ts.map +1 -1
- package/dist/request-context/index.js +1 -1
- package/dist/storage/base.d.ts +34 -1
- package/dist/storage/base.d.ts.map +1 -1
- package/dist/storage/constants.cjs +56 -56
- package/dist/storage/constants.js +1 -1
- package/dist/storage/domains/agents/filesystem.d.ts +28 -0
- package/dist/storage/domains/agents/filesystem.d.ts.map +1 -0
- package/dist/storage/domains/agents/index.d.ts +1 -0
- package/dist/storage/domains/agents/index.d.ts.map +1 -1
- package/dist/storage/domains/experiments/inmemory.d.ts.map +1 -1
- package/dist/storage/domains/mcp-clients/filesystem.d.ts +28 -0
- package/dist/storage/domains/mcp-clients/filesystem.d.ts.map +1 -0
- package/dist/storage/domains/mcp-clients/index.d.ts +1 -0
- package/dist/storage/domains/mcp-clients/index.d.ts.map +1 -1
- package/dist/storage/domains/mcp-servers/filesystem.d.ts +28 -0
- package/dist/storage/domains/mcp-servers/filesystem.d.ts.map +1 -0
- package/dist/storage/domains/mcp-servers/index.d.ts +1 -0
- package/dist/storage/domains/mcp-servers/index.d.ts.map +1 -1
- package/dist/storage/domains/prompt-blocks/filesystem.d.ts +28 -0
- package/dist/storage/domains/prompt-blocks/filesystem.d.ts.map +1 -0
- package/dist/storage/domains/prompt-blocks/index.d.ts +1 -0
- package/dist/storage/domains/prompt-blocks/index.d.ts.map +1 -1
- package/dist/storage/domains/scorer-definitions/filesystem.d.ts +28 -0
- package/dist/storage/domains/scorer-definitions/filesystem.d.ts.map +1 -0
- package/dist/storage/domains/scorer-definitions/index.d.ts +1 -0
- package/dist/storage/domains/scorer-definitions/index.d.ts.map +1 -1
- package/dist/storage/domains/skills/filesystem.d.ts +28 -0
- package/dist/storage/domains/skills/filesystem.d.ts.map +1 -0
- package/dist/storage/domains/skills/index.d.ts +1 -0
- package/dist/storage/domains/skills/index.d.ts.map +1 -1
- package/dist/storage/domains/workspaces/filesystem.d.ts +28 -0
- package/dist/storage/domains/workspaces/filesystem.d.ts.map +1 -0
- package/dist/storage/domains/workspaces/index.d.ts +1 -0
- package/dist/storage/domains/workspaces/index.d.ts.map +1 -1
- package/dist/storage/filesystem-db.d.ts +82 -0
- package/dist/storage/filesystem-db.d.ts.map +1 -0
- package/dist/storage/filesystem-versioned.d.ts +148 -0
- package/dist/storage/filesystem-versioned.d.ts.map +1 -0
- package/dist/storage/filesystem.d.ts +39 -0
- package/dist/storage/filesystem.d.ts.map +1 -0
- package/dist/storage/git-history.d.ts +68 -0
- package/dist/storage/git-history.d.ts.map +1 -0
- package/dist/storage/index.cjs +208 -160
- package/dist/storage/index.d.ts +4 -0
- package/dist/storage/index.d.ts.map +1 -1
- package/dist/storage/index.js +2 -2
- package/dist/storage/types.d.ts +1 -0
- package/dist/storage/types.d.ts.map +1 -1
- package/dist/stream/RunOutput.d.ts +1 -1
- package/dist/stream/aisdk/v5/output-helpers.d.ts +6 -6
- package/dist/stream/base/output.d.ts +3 -3
- package/dist/stream/base/schema.d.ts +1 -1
- package/dist/stream/index.cjs +11 -11
- package/dist/stream/index.js +2 -2
- package/dist/test-utils/llm-mock.cjs +4 -4
- package/dist/test-utils/llm-mock.js +1 -1
- package/dist/tool-loop-agent/index.cjs +4 -4
- package/dist/tool-loop-agent/index.js +1 -1
- package/dist/tools/index.cjs +6 -6
- package/dist/tools/index.js +1 -1
- package/dist/tools/is-vercel-tool.cjs +2 -2
- package/dist/tools/is-vercel-tool.js +1 -1
- package/dist/tools/tool-builder/builder.d.ts.map +1 -1
- package/dist/tools/tool.d.ts +6 -0
- package/dist/tools/tool.d.ts.map +1 -1
- package/dist/tools/types.d.ts +27 -0
- package/dist/tools/types.d.ts.map +1 -1
- package/dist/utils.cjs +23 -23
- package/dist/utils.js +1 -1
- package/dist/vector/index.cjs +9 -9
- package/dist/vector/index.js +2 -2
- package/dist/workflows/evented/index.cjs +10 -10
- package/dist/workflows/evented/index.js +1 -1
- package/dist/workflows/index.cjs +25 -25
- package/dist/workflows/index.js +1 -1
- package/dist/workspace/index.cjs +70 -66
- package/dist/workspace/index.d.ts +1 -0
- package/dist/workspace/index.d.ts.map +1 -1
- package/dist/workspace/index.js +1 -1
- package/dist/workspace/sandbox/execa.d.ts +9 -0
- package/dist/workspace/sandbox/execa.d.ts.map +1 -0
- package/dist/workspace/sandbox/local-process-manager.d.ts.map +1 -1
- package/dist/workspace/skills/index.d.ts +1 -0
- package/dist/workspace/skills/index.d.ts.map +1 -1
- package/dist/workspace/skills/tools.d.ts +36 -0
- package/dist/workspace/skills/tools.d.ts.map +1 -0
- package/dist/workspace/tools/execute-command.d.ts.map +1 -1
- package/dist/workspace/tools/output-helpers.d.ts +4 -3
- package/dist/workspace/tools/output-helpers.d.ts.map +1 -1
- package/package.json +9 -9
- package/src/llm/model/provider-types.generated.d.ts +5 -1
- package/dist/chunk-3ZF7IC6Q.cjs.map +0 -1
- package/dist/chunk-4BXXAZ75.js.map +0 -1
- package/dist/chunk-4P35AVPE.cjs.map +0 -1
- package/dist/chunk-BFV3GSGS.js.map +0 -1
- package/dist/chunk-CCLV5CAA.js.map +0 -1
- package/dist/chunk-D4M6E4OQ.cjs.map +0 -1
- package/dist/chunk-D6HO5QAM.cjs.map +0 -1
- package/dist/chunk-G5R2755Q.cjs.map +0 -1
- package/dist/chunk-GJTLWOKJ.js.map +0 -1
- package/dist/chunk-H5S4PS44.cjs.map +0 -1
- package/dist/chunk-IOY7Y5GV.js.map +0 -1
- package/dist/chunk-JIRB5LX4.js.map +0 -1
- package/dist/chunk-MVYP55NA.js.map +0 -1
- package/dist/chunk-PKORY4ZZ.cjs.map +0 -1
- package/dist/chunk-QNXY3J6B.cjs.map +0 -1
- package/dist/chunk-SEKQJ447.js.map +0 -1
- package/dist/chunk-VDKWYUGC.cjs.map +0 -1
- package/dist/chunk-VWAK25TU.js.map +0 -1
- package/dist/chunk-WYPSC6CO.js.map +0 -1
- package/dist/chunk-YIJZBU54.cjs.map +0 -1
- package/dist/chunk-ZBESCKPX.cjs.map +0 -1
- package/dist/chunk-ZVWVQ6MG.js.map +0 -1
- package/dist/models-dev-UVWCKPA2.cjs +0 -12
- package/dist/models-dev-W3LXZTEB.js +0 -3
- package/dist/provider-registry-4HLP2JRR.cjs +0 -40
- package/dist/provider-registry-K3DWQSMH.js +0 -3
@@ -29,31 +29,81 @@ console.log(`Status: ${summary.status}`)
 
 ## Parameters
 
-**targetType
+**targetType** (`'agent' | 'workflow' | 'scorer'`): Type of registered target to run items against. Use with \`targetId\`.
 
-**targetId
+**targetId** (`string`): ID of the registered target. Use with \`targetType\`.
 
-**scorers
+**scorers** (`(MastraScorer | string)[]`): Scorers to evaluate each result. Pass \`MastraScorer\` instances or registered scorer IDs.
 
-**name
+**name** (`string`): Display name for the experiment.
 
-**description
+**description** (`string`): Description of the experiment.
 
-**metadata
+**metadata** (`Record<string, unknown>`): Arbitrary metadata for the experiment.
 
-**version
+**version** (`number`): Pin to a specific dataset version. Defaults to the latest version.
 
-**maxConcurrency
+**maxConcurrency** (`number`): Maximum concurrent item executions. Defaults to \`5\`.
 
-**signal
+**signal** (`AbortSignal`): AbortSignal for cancelling the experiment.
 
-**itemTimeout
+**itemTimeout** (`number`): Per-item execution timeout in milliseconds.
 
-**maxRetries
+**maxRetries** (`number`): Maximum retries per item on failure. Defaults to \`0\` (no retries). Abort errors are never retried.
 
 ## Returns
 
-**result
+**result** (`Promise<ExperimentSummary>`): Summary of the completed experiment.
+
+**result.experimentId** (`string`): Unique ID of the experiment.
+
+**result.status** (`'pending' | 'running' | 'completed' | 'failed'`): Final status of the experiment.
+
+**result.totalItems** (`number`): Total number of items in the dataset.
+
+**result.succeededCount** (`number`): Number of items that succeeded.
+
+**result.failedCount** (`number`): Number of items that failed.
+
+**result.skippedCount** (`number`): Number of items skipped (e.g., due to abort).
+
+**result.completedWithErrors** (`boolean`): \`true\` if the run completed but some items failed.
+
+**result.startedAt** (`Date`): When the experiment started.
+
+**result.completedAt** (`Date`): When the experiment completed.
+
+**result.results** (`ItemWithScores[]`): All item results with their scores.
+
+**result.results.itemId** (`string`): ID of the dataset item.
+
+**result.results.itemVersion** (`number`): Dataset version of the item when executed.
+
+**result.results.input** (`unknown`): Input data passed to the target.
+
+**result.results.output** (`unknown | null`): Output from the target, or \`null\` if failed.
+
+**result.results.groundTruth** (`unknown | null`): Expected output from the dataset item.
+
+**result.results.error** (`{ message: string; stack?: string; code?: string } | null`): Structured error if execution failed.
+
+**result.results.startedAt** (`Date`): When item execution started.
+
+**result.results.completedAt** (`Date`): When item execution completed.
+
+**result.results.retryCount** (`number`): Number of retry attempts.
+
+**result.results.scores** (`ScorerResult[]`): Results from all scorers for this item.
+
+**result.results.scores.scorerId** (`string`): ID of the scorer.
+
+**result.results.scores.scorerName** (`string`): Display name of the scorer.
+
+**result.results.scores.score** (`number | null`): Computed score, or \`null\` if the scorer failed.
+
+**result.results.scores.reason** (`string | null`): Reason/explanation for the score.
+
+**result.results.scores.error** (`string | null`): Error message if the scorer failed.
 
 ## Related
 
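To make the `ExperimentSummary` shape documented in the preceding hunk more concrete, here is a minimal TypeScript sketch of how a caller might inspect such a summary. The interfaces are local mirrors of the documented fields, written out so the block is self-contained; they are not imported from `@mastra/core` and may differ from the package's exported types. The `FailureReport` type and the `summarizeFailures` helper are hypothetical names introduced only for illustration.

```typescript
// Local mirrors of the documented fields (assumption: illustrative only,
// not the types exported by the package).
interface ScorerResult {
  scorerId: string;
  scorerName: string;
  score: number | null;
  reason: string | null;
  error: string | null;
}

interface ItemWithScores {
  itemId: string;
  itemVersion: number;
  input: unknown;
  output: unknown | null;
  groundTruth: unknown | null;
  error: { message: string; stack?: string; code?: string } | null;
  startedAt: Date;
  completedAt: Date;
  retryCount: number;
  scores: ScorerResult[];
}

interface ExperimentSummary {
  experimentId: string;
  status: 'pending' | 'running' | 'completed' | 'failed';
  totalItems: number;
  succeededCount: number;
  failedCount: number;
  skippedCount: number;
  completedWithErrors: boolean;
  startedAt: Date;
  completedAt: Date;
  results: ItemWithScores[];
}

// Hypothetical report shape, not part of the documented API.
interface FailureReport {
  failedItems: { itemId: string; message: string }[];
  scorerErrors: { itemId: string; scorerId: string; error: string }[];
  meanScore: number | null; // mean over non-null scores, null if there are none
}

// Hypothetical helper: collects item errors, scorer errors, and a mean score.
function summarizeFailures(summary: ExperimentSummary): FailureReport {
  const report: FailureReport = { failedItems: [], scorerErrors: [], meanScore: null };
  let total = 0;
  let count = 0;
  for (const item of summary.results) {
    if (item.error !== null) {
      report.failedItems.push({ itemId: item.itemId, message: item.error.message });
    }
    for (const s of item.scores) {
      if (s.error !== null) {
        report.scorerErrors.push({ itemId: item.itemId, scorerId: s.scorerId, error: s.error });
      }
      if (s.score !== null) {
        total += s.score;
        count += 1;
      }
    }
  }
  if (count > 0) report.meanScore = total / count;
  return report;
}
```

Because `score` and `error` are both nullable per the documented fields, the helper checks each independently rather than assuming that one implies the other.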
@@ -35,7 +35,11 @@ Takes the same `StartExperimentConfig` as [`dataset.startExperiment()`](https://
 
 ## Returns
 
-**result
+**result** (`Promise<object>`): Immediate response with experiment ID.
+
+**result.experimentId** (`string`): Unique ID of the created experiment.
+
+**result.status** (`'pending'`): Always \`'pending'\` since the experiment hasn't started executing yet.
 
 ## Related
 
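Since the asynchronous variant is documented as returning immediately with a `'pending'` status, a caller that needs the final outcome has to check back later. The following TypeScript sketch shows one possible polling loop. It rests on assumptions: `waitForExperiment` and its `fetchStatus` callback are hypothetical names, and the diff does not show how the status is actually retrieved, so the lookup is injected rather than tied to any specific API. The status union is the one listed for `result.status` in the `startExperiment()` hunk.

```typescript
// Status values as listed in the startExperiment() reference.
type ExperimentStatus = 'pending' | 'running' | 'completed' | 'failed';

// Hypothetical helper: polls an injected status lookup until the experiment
// reaches a terminal state ('completed' or 'failed') or attempts run out.
async function waitForExperiment(
  experimentId: string,
  fetchStatus: (id: string) => Promise<ExperimentStatus>,
  opts: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<ExperimentStatus> {
  const intervalMs = opts.intervalMs ?? 1000;
  const maxAttempts = opts.maxAttempts ?? 60;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus(experimentId);
    if (status === 'completed' || status === 'failed') {
      return status;
    }
    // Wait before the next check.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }

  throw new Error(`Experiment ${experimentId} did not finish after ${maxAttempts} checks`);
}
```

Injecting the lookup keeps the loop independent of how status is fetched and makes it straightforward to exercise with a stub.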
@@ -33,16 +33,16 @@ const updated2 = await dataset.update({
 
 ## Parameters
 
-**name
+**name** (`string`): New display name.
 
-**description
+**description** (`string`): New description.
 
-**metadata
+**metadata** (`Record<string, unknown>`): Updated metadata.
 
-**inputSchema
+**inputSchema** (`unknown`): JSON Schema or Zod schema for item inputs.
 
-**groundTruthSchema
+**groundTruthSchema** (`unknown`): JSON Schema or Zod schema for item ground truths.
 
 ## Returns
 
-**result
+**result** (`Promise<DatasetRecord>`): The updated dataset record. See dataset.getDetails() for the full shape.
@@ -25,14 +25,14 @@ const updated = await dataset.updateItem({
|
|
|
25
25
|
|
|
26
26
|
## Parameters
|
|
27
27
|
|
|
28
|
-
**itemId
|
|
28
|
+
**itemId** (`string`): ID of the item to update.
|
|
29
29
|
|
|
30
|
-
**input
|
|
30
|
+
**input** (`unknown`): Updated input data.
|
|
31
31
|
|
|
32
|
-
**groundTruth
|
|
32
|
+
**groundTruth** (`unknown`): Updated ground truth.
|
|
33
33
|
|
|
34
|
-
**metadata
|
|
34
|
+
**metadata** (`Record<string, unknown>`): Updated metadata.
|
|
35
35
|
|
|
36
36
|
## Returns
|
|
37
37
|
|
|
38
|
-
**result
|
|
38
|
+
**result** (`Promise<DatasetItem>`): The updated dataset item. See dataset.addItem() for the item shape.
|
|
@@ -4,31 +4,31 @@ The `createAnswerRelevancyScorer()` function accepts a single options object wit
|
|
|
4
4
|
|
|
5
5
|
## Parameters
|
|
6
6
|
|
|
7
|
-
**model
|
|
7
|
+
**model** (`LanguageModel`): Configuration for the model used to evaluate relevancy.
|
|
8
8
|
|
|
9
|
-
**uncertaintyWeight
|
|
9
|
+
**uncertaintyWeight** (`number`): Weight given to 'unsure' verdicts in scoring (0-1). (Default: `0.3`)
|
|
10
10
|
|
|
11
|
-
**scale
|
|
11
|
+
**scale** (`number`): Maximum score value. (Default: `1`)
|
|
12
12
|
|
|
13
13
|
This function returns an instance of the MastraScorer class. The `.run()` method accepts the same input as other scorers (see the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer)), but the return value includes LLM-specific fields as documented below.
|
|
14
14
|
|
|
15
15
|
## .run() Returns
|
|
16
16
|
|
|
17
|
-
**runId
|
|
17
|
+
**runId** (`string`): The id of the run (optional).
|
|
18
18
|
|
|
19
|
-
**score
|
|
19
|
+
**score** (`number`): Relevancy score (0 to scale, default 0-1)
|
|
20
20
|
|
|
21
|
-
**preprocessPrompt
|
|
21
|
+
**preprocessPrompt** (`string`): The prompt sent to the LLM for the preprocess step (optional).
|
|
22
22
|
|
|
23
|
-
**preprocessStepResult
|
|
23
|
+
**preprocessStepResult** (`object`): Object with extracted statements: { statements: string[] }
|
|
24
24
|
|
|
25
|
-
**analyzePrompt
|
|
25
|
+
**analyzePrompt** (`string`): The prompt sent to the LLM for the analyze step (optional).
|
|
26
26
|
|
|
27
|
-
**analyzeStepResult
|
|
27
|
+
**analyzeStepResult** (`object`): Object with results: { results: Array<{ result: 'yes' | 'unsure' | 'no', reason: string }> }
|
|
28
28
|
|
|
29
|
-
**generateReasonPrompt
|
|
29
|
+
**generateReasonPrompt** (`string`): The prompt sent to the LLM for the reason step (optional).
|
|
30
30
|
|
|
31
|
-
**reason
|
|
31
|
+
**reason** (`string`): Explanation of the score.
|
|
32
32
|
|
|
33
33
|
## Scoring Details
|
|
34
34
|
|
|
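To illustrate how `uncertaintyWeight` and `scale` could interact with the `'yes' | 'unsure' | 'no'` verdicts in `analyzeStepResult`, here is a minimal sketch. The formula is an assumption for illustration (full credit for `'yes'`, partial credit for `'unsure'`, none for `'no'`); this diff does not show the scorer's actual computation, and `relevancyScore` is a hypothetical helper.

```typescript
type Verdict = 'yes' | 'unsure' | 'no';

// Assumed formula, not confirmed by the diff:
// (yesCount + uncertaintyWeight * unsureCount) / total * scale
function relevancyScore(
  results: { result: Verdict; reason: string }[],
  uncertaintyWeight = 0.3, // documented default
  scale = 1, // documented default
): number {
  if (results.length === 0) return 0; // assumption: no statements scores 0
  let credit = 0;
  for (const r of results) {
    if (r.result === 'yes') credit += 1;
    else if (r.result === 'unsure') credit += uncertaintyWeight;
  }
  return (credit / results.length) * scale;
}
```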
@@ -4,45 +4,43 @@ The `createAnswerSimilarityScorer()` function creates a scorer that evaluates ho
|
|
|
4
4
|
|
|
5
5
|
## Parameters
|
|
6
6
|
|
|
7
|
-
**model
|
|
7
|
+
**model** (`LanguageModel`): The language model used to evaluate semantic similarity between outputs and ground truth.
|
|
8
8
|
|
|
9
|
-
**options
|
|
9
|
+
**options** (`AnswerSimilarityOptions`): Configuration options for the scorer.
|
|
10
10
|
|
|
11
|
-
|
|
11
|
+
**options.requireGroundTruth** (`boolean`): Whether to require ground truth for evaluation. If false, missing ground truth returns score 0.
|
|
12
12
|
|
|
13
|
-
**
|
|
13
|
+
**options.semanticThreshold** (`number`): Weight for semantic matches vs exact matches (0-1).
|
|
14
14
|
|
|
15
|
-
**
|
|
15
|
+
**options.exactMatchBonus** (`number`): Additional score bonus for exact matches (0-1).
|
|
16
16
|
|
|
17
|
-
**
|
|
17
|
+
**options.missingPenalty** (`number`): Penalty per missing key concept from ground truth.
|
|
18
18
|
|
|
19
|
-
**
|
|
19
|
+
**options.contradictionPenalty** (`number`): Penalty for contradictory information. High value ensures wrong answers score near 0.
|
|
20
20
|
|
|
21
|
-
**
|
|
21
|
+
**options.extraInfoPenalty** (`number`): Mild penalty for extra information not present in ground truth (capped at 0.2).
|
|
22
22
|
|
|
23
|
-
**
|
|
24
|
-
|
|
25
|
-
**scale:** (`number`): Score scaling factor. (Default: `1`)
|
|
23
|
+
**options.scale** (`number`): Score scaling factor.
|
|
26
24
|
|
|
27
25
|
This function returns an instance of the MastraScorer class. The `.run()` method accepts the same input as other scorers (see the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer)), but **requires ground truth** to be provided in the run object.
|
|
28
26
|
|
|
29
27
|
## .run() Returns
|
|
30
28
|
|
|
31
|
-
**runId
|
|
29
|
+
**runId** (`string`): The id of the run (optional).
|
|
32
30
|
|
|
33
|
-
**score
|
|
31
|
+
**score** (`number`): Similarity score between 0-1 (or 0-scale if custom scale used). Higher scores indicate better similarity to ground truth.
|
|
34
32
|
|
|
35
|
-
**reason
|
|
33
|
+
**reason** (`string`): Human-readable explanation of the score with actionable feedback.
|
|
36
34
|
|
|
37
|
-
**preprocessStepResult
|
|
35
|
+
**preprocessStepResult** (`object`): Extracted semantic units from output and ground truth.
|
|
38
36
|
|
|
39
|
-
**analyzeStepResult
|
|
37
|
+
**analyzeStepResult** (`object`): Detailed analysis of matches, contradictions, and extra information.
|
|
40
38
|
|
|
41
|
-
**preprocessPrompt
|
|
39
|
+
**preprocessPrompt** (`string`): The prompt used for semantic unit extraction.
|
|
42
40
|
|
|
43
|
-
**analyzePrompt
|
|
41
|
+
**analyzePrompt** (`string`): The prompt used for similarity analysis.
|
|
44
42
|
|
|
45
|
-
**generateReasonPrompt
|
|
43
|
+
**generateReasonPrompt** (`string`): The prompt used for generating the explanation.
|
|
46
44
|
|
|
47
45
|
## Scoring Details
|
|
48
46
|
|
|
@@ -4,29 +4,29 @@ The `createBiasScorer()` function accepts a single options object with the follo
|
|
|
4
4
|
|
|
5
5
|
## Parameters
|
|
6
6
|
|
|
7
|
-
**model
|
|
7
|
+
**model** (`LanguageModel`): Configuration for the model used to evaluate bias.
|
|
8
8
|
|
|
9
|
-
**scale
|
|
9
|
+
**scale** (`number`): Maximum score value. (Default: `1`)
|
|
10
10
|
|
|
11
11
|
This function returns an instance of the MastraScorer class. The `.run()` method accepts the same input as other scorers (see the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer)), but the return value includes LLM-specific fields as documented below.
|
|
12
12
|
|
|
13
13
|
## .run() Returns
|
|
14
14
|
|
|
15
|
-
**runId
|
|
15
|
+
**runId** (`string`): The id of the run (optional).
|
|
16
16
|
|
|
17
|
-
**preprocessStepResult
|
|
17
|
+
**preprocessStepResult** (`object`): Object with extracted opinions: { opinions: string[] }
|
|
18
18
|
|
|
19
|
-
**preprocessPrompt
|
|
19
|
+
**preprocessPrompt** (`string`): The prompt sent to the LLM for the preprocess step (optional).
|
|
20
20
|
|
|
21
|
-
**analyzeStepResult
|
|
21
|
+
**analyzeStepResult** (`object`): Object with results: { results: Array<{ result: 'yes' | 'no', reason: string }> }
|
|
22
22
|
|
|
23
|
-
**analyzePrompt
|
|
23
|
+
**analyzePrompt** (`string`): The prompt sent to the LLM for the analyze step (optional).
|
|
24
24
|
|
|
25
|
-
**score
|
|
25
|
+
**score** (`number`): Bias score (0 to scale, default 0-1). Higher scores indicate more bias.
|
|
26
26
|
|
|
27
|
-
**reason
|
|
27
|
+
**reason** (`string`): Explanation of the score.
|
|
28
28
|
|
|
29
|
-
**generateReasonPrompt
|
|
29
|
+
**generateReasonPrompt** (`string`): The prompt sent to the LLM for the generateReason step (optional).
|
|
30
30
|
|
|
31
31
|
## Bias Categories
|
|
32
32
|
|
|
@@ -10,11 +10,11 @@ This function returns an instance of the MastraScorer class. See the [MastraScor
|
|
|
10
10
|
|
|
11
11
|
## .run() Returns
|
|
12
12
|
|
|
13
|
-
**runId
|
|
13
|
+
**runId** (`string`): The id of the run (optional).
|
|
14
14
|
|
|
15
|
-
**preprocessStepResult
|
|
15
|
+
**preprocessStepResult** (`object`): Object with extracted elements and coverage details: { inputElements: string[], outputElements: string[], missingElements: string[], elementCounts: { input: number, output: number } }
|
|
16
16
|
|
|
17
|
-
**score
|
|
17
|
+
**score** (`number`): Completeness score (0-1) representing the proportion of input elements covered in the output.
|
|
18
18
|
|
|
19
19
|
The `.run()` method returns a result in the following shape:
|
|
20
20
|
|
|
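The completeness score is described as the proportion of input elements covered in the output. A minimal sketch of that ratio follows, assuming elements have already been extracted and that matching is a case-insensitive exact comparison; the real scorer's extraction and matching rules are not shown in this diff, and `completeness` is a hypothetical helper.

```typescript
// Mirrors the documented preprocessStepResult shape.
interface CompletenessPreprocess {
  inputElements: string[];
  outputElements: string[];
  missingElements: string[];
  elementCounts: { input: number; output: number };
}

function completeness(
  inputElements: string[],
  outputElements: string[],
): { preprocessStepResult: CompletenessPreprocess; score: number } {
  // Assumption: two elements match when equal after trimming and lowercasing.
  const produced = new Set(outputElements.map((e) => e.trim().toLowerCase()));
  const missingElements = inputElements.filter(
    (e) => !produced.has(e.trim().toLowerCase()),
  );
  const covered = inputElements.length - missingElements.length;
  // Assumption: an input with no elements counts as fully covered.
  const score = inputElements.length === 0 ? 1 : covered / inputElements.length;
  return {
    preprocessStepResult: {
      inputElements,
      outputElements,
      missingElements,
      elementCounts: { input: inputElements.length, output: outputElements.length },
    },
    score,
  };
}
```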
@@ -6,21 +6,21 @@ The `createContentSimilarityScorer()` function measures the textual similarity b
|
|
|
6
6
|
|
|
7
7
|
The `createContentSimilarityScorer()` function accepts a single options object with the following properties:
|
|
8
8
|
|
|
9
|
-
**ignoreCase
|
|
9
|
+
**ignoreCase** (`boolean`): Whether to ignore case differences when comparing strings. (Default: `true`)
|
|
10
10
|
|
|
11
|
-
**ignoreWhitespace
|
|
11
|
+
**ignoreWhitespace** (`boolean`): Whether to normalize whitespace when comparing strings. (Default: `true`)
|
|
12
12
|
|
|
13
13
|
This function returns an instance of the MastraScorer class. See the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer) for details on the `.run()` method and its input/output.
|
|
14
14
|
|
|
15
15
|
## .run() Returns
|
|
16
16
|
|
|
17
|
-
**runId
|
|
17
|
+
**runId** (`string`): The id of the run (optional).
|
|
18
18
|
|
|
19
|
-
**preprocessStepResult
|
|
19
|
+
**preprocessStepResult** (`object`): Object with processed input and output: { processedInput: string, processedOutput: string }
|
|
20
20
|
|
|
21
|
-
**analyzeStepResult
|
|
21
|
+
**analyzeStepResult** (`object`): Object with similarity: { similarity: number }
|
|
22
22
|
|
|
23
|
-
**score
|
|
23
|
+
**score** (`number`): Similarity score (0-1) where 1 indicates perfect similarity.
|
|
24
24
|
|
|
25
25
|
## Scoring Details
|
|
26
26
|
|
|
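The `ignoreCase` and `ignoreWhitespace` options suggest a normalization pass before comparison, which would produce the `processedInput` and `processedOutput` strings. The sketch below shows one way this could work. The Dice coefficient over character bigrams is a stand-in metric chosen for illustration; the diff does not say which similarity measure the scorer uses, and both function names are hypothetical.

```typescript
interface SimilarityOptions {
  ignoreCase?: boolean;
  ignoreWhitespace?: boolean;
}

// Apply the two documented options (both default to true).
function normalize(
  text: string,
  { ignoreCase = true, ignoreWhitespace = true }: SimilarityOptions = {},
): string {
  let out = text;
  if (ignoreCase) out = out.toLowerCase();
  if (ignoreWhitespace) out = out.replace(/\s+/g, ' ').trim();
  return out;
}

// Stand-in metric (assumption): Dice coefficient over character bigrams.
function diceSimilarity(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length < 2 || b.length < 2) return 0;
  const counts = new Map<string, number>();
  for (let i = 0; i < a.length - 1; i++) {
    const gram = a.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  let overlap = 0;
  for (let i = 0; i < b.length - 1; i++) {
    const gram = b.slice(i, i + 2);
    const remaining = counts.get(gram) ?? 0;
    if (remaining > 0) {
      counts.set(gram, remaining - 1);
      overlap += 1;
    }
  }
  return (2 * overlap) / (a.length - 1 + (b.length - 1));
}
```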
@@ -22,17 +22,17 @@ Use when optimizing context selection for:
|
|
|
22
22
|
|
|
23
23
|
## Parameters
|
|
24
24
|
|
|
25
|
-
**model
|
|
25
|
+
**model** (`MastraModelConfig`): The language model to use for evaluating context relevance
|
|
26
26
|
|
|
27
|
-
**options
|
|
27
|
+
**options** (`ContextPrecisionMetricOptions`): Configuration options for the scorer
|
|
28
28
|
|
|
29
29
|
**Note**: Either `context` or `contextExtractor` must be provided. If both are provided, `contextExtractor` takes precedence.
|
|
30
30
|
|
|
31
31
|
## .run() Returns
|
|
32
32
|
|
|
33
|
-
**score
|
|
33
|
+
**score** (`number`): Mean Average Precision score between 0 and scale (default 0-1)
|
|
34
34
|
|
|
35
|
-
**reason
|
|
35
|
+
**reason** (`string`): Human-readable explanation of the context precision evaluation
|
|
36
36
|
|
|
37
37
|
## Scoring Details
|
|
38
38
|
|
|
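The score is described as Mean Average Precision. For a single ranked list of context chunks, that reduces to average precision: the mean of precision@k taken at each rank k where a relevant chunk appears. The sketch below assumes relevance verdicts are already available as booleans; `averagePrecision` is a hypothetical helper, not part of the package.

```typescript
// relevance[i] is true when the context chunk at rank i (0-based) was judged relevant.
function averagePrecision(relevance: boolean[], scale = 1): number {
  let relevantSoFar = 0;
  let precisionSum = 0;
  relevance.forEach((isRelevant, i) => {
    if (!isRelevant) return;
    relevantSoFar += 1;
    // precision@k with k = i + 1
    precisionSum += relevantSoFar / (i + 1);
  });
  // Assumption: a list with no relevant chunks scores 0.
  return relevantSoFar === 0 ? 0 : (precisionSum / relevantSoFar) * scale;
}
```

Because precision is sampled only at relevant positions, placing relevant chunks earlier in the list raises the score even when the set of chunks is unchanged.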
@@ -37,23 +37,21 @@ const scorer = createScorer({
|
|
|
37
37
|
|
|
38
38
|
## createScorer Options
|
|
39
39
|
|
|
40
|
-
**id
|
|
40
|
+
**id** (`string`): Unique identifier for the scorer. Used as the name if `name` is not provided.
|
|
41
41
|
|
|
42
|
-
**name
|
|
42
|
+
**name** (`string`): Name of the scorer. Defaults to `id` if not provided.
|
|
43
43
|
|
|
44
|
-
**description
|
|
44
|
+
**description** (`string`): Description of what the scorer does.
|
|
45
45
|
|
|
46
|
-
**judge
|
|
46
|
+
**judge** (`object`): Optional judge configuration for LLM-based steps.
|
|
47
47
|
|
|
48
|
-
**
|
|
48
|
+
**judge.model** (`LanguageModel`): The LLM model instance to use for evaluation.
|
|
49
49
|
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
## Judge Object
|
|
50
|
+
**judge.instructions** (`string`): System prompt/instructions for the LLM.
|
|
53
51
|
|
|
54
|
-
**
|
|
52
|
+
**type** (`string`): Type specification for input/output. Use 'agent' for automatic agent types. For custom types, use the generic approach instead.
|
|
55
53
|
|
|
56
|
-
|
|
54
|
+
This function returns a scorer builder that you can chain step methods onto. See the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer) for details on the `.run()` method and its input/output.
|
|
57
55
|
|
|
58
56
|
The judge only runs for steps defined as **prompt objects** (`preprocess`, `analyze`, `generateScore`, `generateReason` in prompt mode). If you use function steps only, the judge is never called and there is no LLM output to inspect. In that case, any score/reason must be produced by your functions.
|
|
59
57
|
|
|
@@ -149,28 +147,28 @@ Optional preprocessing step that can extract or transform data before analysis.
|
|
|
149
147
|
|
|
150
148
|
**Function Mode:** Function: `({ run, results }) => any`
|
|
151
149
|
|
|
152
|
-
**run.input
|
|
150
|
+
**run.input** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. `[{ role: 'user', content: 'hello world' }]`. If the scorer is used in a workflow, this will be the input of the workflow.
|
|
153
151
|
|
|
154
|
-
**run.output
|
|
152
|
+
**run.output** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
|
|
155
153
|
|
|
156
|
-
**run.runId
|
|
154
|
+
**run.runId** (`string`): Unique identifier for this scoring run.
|
|
157
155
|
|
|
158
|
-
**run.requestContext
|
|
156
|
+
**run.requestContext** (`object`): Request Context from the agent or workflow step being evaluated (optional).
|
|
159
157
|
|
|
160
|
-
**results
|
|
158
|
+
**results** (`object`): Empty object (no previous steps).
|
|
161
159
|
|
|
162
160
|
Returns: `any`\
|
|
163
161
|
The method can return any value. The returned value will be available to subsequent steps as `preprocessStepResult`.
|
|
164
162
|
|
|
165
163
|
**Prompt Object Mode:**
|
|
166
164
|
|
|
167
|
-
**description
|
|
165
|
+
**description** (`string`): Description of what this preprocessing step does.
|
|
168
166
|
|
|
169
|
-
**outputSchema
|
|
167
|
+
**outputSchema** (`ZodSchema`): Zod schema for the expected output of the preprocess step.
|
|
170
168
|
|
|
171
|
-
**createPrompt
|
|
169
|
+
**createPrompt** (`function`): Function: ({ run, results }) => string. Returns the prompt for the LLM.
|
|
172
170
|
|
|
173
|
-
**judge
|
|
171
|
+
**judge** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
|
|
174
172
|
|
|
175
173
|
### analyze
|
|
176
174
|
|
|
@@ -178,28 +176,28 @@ Optional analysis step that processes the input/output and any preprocessed data
|
|
|
178
176
|
|
|
179
177
|
**Function Mode:** Function: `({ run, results }) => any`
|
|
180
178
|
|
|
181
|
-
**run.input
|
|
179
|
+
**run.input** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. `[{ role: 'user', content: 'hello world' }]`. If the scorer is used in a workflow, this will be the input of the workflow.
|
|
182
180
|
|
|
183
|
-
**run.output
|
|
181
|
+
**run.output** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
|
|
184
182
|
|
|
185
|
-
**run.runId
|
|
183
|
+
**run.runId** (`string`): Unique identifier for this scoring run.
|
|
186
184
|
|
|
187
|
-
**run.requestContext
|
|
185
|
+
**run.requestContext** (`object`): Request Context from the agent or workflow step being evaluated (optional).
|
|
188
186
|
|
|
189
|
-
**results.preprocessStepResult
|
|
187
|
+
**results.preprocessStepResult** (`any`): Result from preprocess step, if defined (optional).
|
|
190
188
|
|
|
191
189
|
Returns: `any`\
|
|
192
190
|
The method can return any value. The returned value will be available to subsequent steps as `analyzeStepResult`.
|
|
193
191
|
|
|
194
192
|
**Prompt Object Mode:**
|
|
195
193
|
|
|
196
|
-
**description
|
|
194
|
+
**description** (`string`): Description of what this analysis step does.
|
|
197
195
|
|
|
198
|
-
**outputSchema
|
|
196
|
+
**outputSchema** (`ZodSchema`): Zod schema for the expected output of the analyze step.
|
|
199
197
|
|
|
200
|
-
**createPrompt
|
|
198
|
+
**createPrompt** (`function`): Function: ({ run, results }) => string. Returns the prompt for the LLM.
|
|
201
199
|
|
|
202
|
-
**judge
|
|
200
|
+
**judge** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
|
|
203
201
|
|
|
204
202
|
### generateScore
|
|
205
203
|
|
|
@@ -207,34 +205,34 @@ The method can return any value. The returned value will be available to subsequ
|
|
|
207
205
|
|
|
208
206
|
**Function Mode:** Function: `({ run, results }) => number`
|
|
209
207
|
|
|
210
|
-
**run.input
|
|
208
|
+
**run.input** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. `[{ role: 'user', content: 'hello world' }]`. If the scorer is used in a workflow, this will be the input of the workflow.
|
|
211
209
|
|
|
212
|
-
**run.output
|
|
210
|
+
**run.output** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
|
|
213
211
|
|
|
214
|
-
**run.runId
|
|
212
|
+
**run.runId** (`string`): Unique identifier for this scoring run.
|
|
215
213
|
|
|
216
|
-
**run.requestContext
|
|
214
|
+
**run.requestContext** (`object`): Request Context from the agent or workflow step being evaluated (optional).
|
|
217
215
|
|
|
218
|
-
**results.preprocessStepResult
|
|
216
|
+
**results.preprocessStepResult** (`any`): Result from preprocess step, if defined (optional).
|
|
219
217
|
|
|
220
|
-
**results.analyzeStepResult
|
|
218
|
+
**results.analyzeStepResult** (`any`): Result from analyze step, if defined (optional).
|
|
221
219
|
|
|
222
220
|
Returns: `number`\
|
|
223
221
|
The method must return a numerical score.
|
|
224
222
|
|
|
225
223
|
**Prompt Object Mode:**
|
|
226
224
|
|
|
227
|
-
**description
|
|
225
|
+
**description** (`string`): Description of what this scoring step does.
|
|
228
226
|
|
|
229
|
-
**outputSchema
|
|
227
|
+
**outputSchema** (`ZodSchema`): Zod schema for the expected output of the generateScore step.
|
|
230
228
|
|
|
231
|
-
**createPrompt
|
|
229
|
+
**createPrompt** (`function`): Function: ({ run, results }) => string. Returns the prompt for the LLM.
|
|
232
230
|
|
|
233
|
-
**judge
|
|
231
|
+
**judge** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
|
|
234
232
|
|
|
235
233
|
When using prompt object mode, you must also provide a `calculateScore` function to convert the LLM output to a numerical score:
|
|
236
234
|
|
|
237
|
-
**calculateScore
|
|
235
|
+
**calculateScore** (`function`): Function: ({ run, results, analyzeStepResult }) => number. Converts the LLM's structured output into a numerical score.
|
|
238
236
|
|
|
239
237
|
### generateReason
|
|
240
238
|
|
|
@@ -242,29 +240,29 @@ Optional step that provides an explanation for the score.
|
|
|
242
240
|
|
|
243
241
|
**Function Mode:** Function: `({ run, results, score }) => string`
|
|
244
242
|
|
|
245
|
-
**run.input
|
|
243
|
+
**run.input** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. `[{ role: 'user', content: 'hello world' }]`. If the scorer is used in a workflow, this will be the input of the workflow.
|
|
246
244
|
|
|
247
|
-
**run.output
|
|
245
|
+
**run.output** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
|
|
248
246
|
|
|
249
|
-
**run.runId
|
|
247
|
+
**run.runId** (`string`): Unique identifier for this scoring run.
|
|
250
248
|
|
|
251
|
-
**run.requestContext
|
|
249
|
+
**run.requestContext** (`object`): Request Context from the agent or workflow step being evaluated (optional).
|
|
252
250
|
|
|
253
|
-
**results.preprocessStepResult
|
|
251
|
+
**results.preprocessStepResult** (`any`): Result from preprocess step, if defined (optional).
|
|
254
252
|
|
|
255
|
-
**results.analyzeStepResult
|
|
253
|
+
**results.analyzeStepResult** (`any`): Result from analyze step, if defined (optional).
|
|
256
254
|
|
|
257
|
-
**score
|
|
255
|
+
**score** (`number`): Score computed by the generateScore step.
|
|
258
256
|
|
|
259
257
|
Returns: `string`\
|
|
260
258
|
The method must return a string explaining the score.
|
|
261
259
|
|
|
262
260
|
**Prompt Object Mode:**
|
|
263
261
|
|
|
264
|
-
**description
|
|
262
|
+
**description** (`string`): Description of what this reasoning step does.
|
|
265
263
|
|
|
266
|
-
**createPrompt
|
|
264
|
+
**createPrompt** (`function`): Function: ({ run, results, score }) => string. Returns the prompt for the LLM.
|
|
267
265
|
|
|
268
|
-
**judge
|
|
266
|
+
**judge** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
|
|
269
267
|
|
|
270
268
|
All step functions can be async.
|
|
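The function-mode signatures above imply a data flow: `preprocess` sees empty `results`, `analyze` sees `preprocessStepResult`, `generateScore` sees both, and `generateReason` additionally receives `score`. The sketch below reproduces that flow in plain TypeScript to make the ordering concrete. It is not Mastra's implementation; `Steps`, `StepResults`, and `runSteps` are hypothetical names, and prompt-object mode is omitted.

```typescript
interface Run {
  input: unknown;
  output: unknown;
  runId: string;
}

interface StepResults {
  preprocessStepResult?: unknown;
  analyzeStepResult?: unknown;
}

type MaybeAsync<T> = T | Promise<T>;

// Function-mode steps only; each may be sync or async.
interface Steps {
  preprocess?: (ctx: { run: Run; results: StepResults }) => MaybeAsync<unknown>;
  analyze?: (ctx: { run: Run; results: StepResults }) => MaybeAsync<unknown>;
  generateScore: (ctx: { run: Run; results: StepResults }) => MaybeAsync<number>;
  generateReason?: (ctx: {
    run: Run;
    results: StepResults;
    score: number;
  }) => MaybeAsync<string>;
}

async function runSteps(steps: Steps, run: Run) {
  const results: StepResults = {};
  if (steps.preprocess) {
    // results is still empty here: there are no previous steps.
    results.preprocessStepResult = await steps.preprocess({ run, results });
  }
  if (steps.analyze) {
    results.analyzeStepResult = await steps.analyze({ run, results });
  }
  const score = await steps.generateScore({ run, results });
  const reason = steps.generateReason
    ? await steps.generateReason({ run, results, score })
    : undefined;
  return { runId: run.runId, score, reason, ...results };
}
```

Awaiting every step uniformly is what lets sync and async step functions be mixed freely.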
@@ -6,31 +6,31 @@ The `createFaithfulnessScorer()` function evaluates how factually accurate an LL
|
|
|
6
6
|
|
|
7
7
|
The `createFaithfulnessScorer()` function accepts a single options object with the following properties:
|
|
8
8
|
|
|
9
|
-
**model
|
|
9
|
+
**model** (`LanguageModel`): Configuration for the model used to evaluate faithfulness.
|
|
10
10
|
|
|
11
|
-
**context
|
|
11
|
+
**context** (`string[]`): Array of context chunks against which the output's claims will be verified.
|
|
12
12
|
|
|
13
|
-
**scale
|
|
13
|
+
**scale** (`number`): The maximum score value. The final score will be normalized to this scale. (Default: `1`)
|
|
14
14
|
|
|
15
15
|
This function returns an instance of the MastraScorer class. The `.run()` method accepts the same input as other scorers (see the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer)), but the return value includes LLM-specific fields as documented below.
|
|
16
16
|
|
|
17
17
|
## .run() Returns
|
|
18
18
|
|
|
19
|
-
**runId
|
|
19
|
+
**runId** (`string`): The id of the run (optional).
|
|
20
20
|
|
|
21
|
-
**preprocessStepResult
|
|
21
|
+
**preprocessStepResult** (`string[]`): Array of extracted claims from the output.
|
|
22
22
|
|
|
23
|
-
**preprocessPrompt
|
|
23
|
+
**preprocessPrompt** (`string`): The prompt sent to the LLM for the preprocess step (optional).
|
|
24
24
|
|
|
25
|
-
**analyzeStepResult
|
|
25
|
+
**analyzeStepResult** (`object`): Object with verdicts: { verdicts: Array<{ verdict: 'yes' | 'no' | 'unsure', reason: string }> }
|
|
26
26
|
|
|
27
|
-
**analyzePrompt
|
|
27
|
+
**analyzePrompt** (`string`): The prompt sent to the LLM for the analyze step (optional).
|
|
28
28
|
|
|
29
|
-
**score
|
|
29
|
+
**score** (`number`): A score between 0 and the configured scale, representing the proportion of claims that are supported by the context.
|
|
30
30
|
|
|
31
|
-
**reason
|
|
31
|
+
**reason** (`string`): A detailed explanation of the score, including which claims were supported, contradicted, or marked as unsure.
|
|
32
32
|
|
|
33
|
-
**generateReasonPrompt
|
|
33
|
+
**generateReasonPrompt** (`string`): The prompt sent to the LLM for the generateReason step (optional).
|
|
34
34
|
|
|
35
35
|
## Scoring Details
|
|
36
36
|
|