npm - @mastra/core - Versions diffs - 1.9.0 → 1.10.0 - Mend

@mastra/core 1.9.0 → 1.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (438)

package/dist/docs/references/reference-datasets-startExperiment.md CHANGED

@@ -29,31 +29,81 @@ console.log(`Status: ${summary.status}`)
 ## Parameters
-**targetType?:** (`'agent' | 'workflow' | 'scorer'`): Type of registered target to run items against. Use with \`targetId\`.
+**targetType** (`'agent' | 'workflow' | 'scorer'`): Type of registered target to run items against. Use with \`targetId\`.
-**targetId?:** (`string`): ID of the registered target. Use with \`targetType\`.
+**targetId** (`string`): ID of the registered target. Use with \`targetType\`.
-**scorers?:** (`(MastraScorer | string)[]`): Scorers to evaluate each result. Pass \`MastraScorer\` instances or registered scorer IDs.
+**scorers** (`(MastraScorer | string)[]`): Scorers to evaluate each result. Pass \`MastraScorer\` instances or registered scorer IDs.
-**name?:** (`string`): Display name for the experiment.
+**name** (`string`): Display name for the experiment.
-**description?:** (`string`): Description of the experiment.
+**description** (`string`): Description of the experiment.
-**metadata?:** (`Record<string, unknown>`): Arbitrary metadata for the experiment.
+**metadata** (`Record<string, unknown>`): Arbitrary metadata for the experiment.
-**version?:** (`number`): Pin to a specific dataset version. Defaults to the latest version.
+**version** (`number`): Pin to a specific dataset version. Defaults to the latest version.
-**maxConcurrency?:** (`number`): Maximum concurrent item executions. Defaults to \`5\`.
+**maxConcurrency** (`number`): Maximum concurrent item executions. Defaults to \`5\`.
-**signal?:** (`AbortSignal`): AbortSignal for cancelling the experiment.
+**signal** (`AbortSignal`): AbortSignal for cancelling the experiment.
-**itemTimeout?:** (`number`): Per-item execution timeout in milliseconds.
+**itemTimeout** (`number`): Per-item execution timeout in milliseconds.
-**maxRetries?:** (`number`): Maximum retries per item on failure. Defaults to \`0\` (no retries). Abort errors are never retried.
+**maxRetries** (`number`): Maximum retries per item on failure. Defaults to \`0\` (no retries). Abort errors are never retried.
 ## Returns
-**result:** (`Promise<ExperimentSummary>`): ExperimentSummaryexperimentId:stringUnique ID of the experiment.status:'pending' | 'running' | 'completed' | 'failed'Final status of the experiment.totalItems:numberTotal number of items in the dataset.succeededCount:numberNumber of items that succeeded.failedCount:numberNumber of items that failed.skippedCount:numberNumber of items skipped (e.g., due to abort).completedWithErrors:boolean\`true\` if the run completed but some items failed.startedAt:DateWhen the experiment started.completedAt:DateWhen the experiment completed.results:ItemWithScores\[]All item results with their scores.ItemWithScoresitemId:stringID of the dataset item.itemVersion:numberDataset version of the item when executed.input:unknownInput data passed to the target.output:unknown | nullOutput from the target, or \`null\` if failed.groundTruth:unknown | nullExpected output from the dataset item.error:{ message: string; stack?: string; code?: string } | nullStructured error if execution failed.startedAt:DateWhen item execution started.completedAt:DateWhen item execution completed.retryCount:numberNumber of retry attempts.scores:ScorerResult\[]Results from all scorers for this item.ScorerResultscorerId:stringID of the scorer.scorerName:stringDisplay name of the scorer.score:number | nullComputed score, or \`null\` if the scorer failed.reason:string | nullReason/explanation for the score.error:string | nullError message if the scorer failed.
+**result** (`Promise<ExperimentSummary>`): Summary of the completed experiment.
+**result.experimentId** (`string`): Unique ID of the experiment.
+**result.status** (`'pending' | 'running' | 'completed' | 'failed'`): Final status of the experiment.
+**result.totalItems** (`number`): Total number of items in the dataset.
+**result.succeededCount** (`number`): Number of items that succeeded.
+**result.failedCount** (`number`): Number of items that failed.
+**result.skippedCount** (`number`): Number of items skipped (e.g., due to abort).
+**result.completedWithErrors** (`boolean`): \`true\` if the run completed but some items failed.
+**result.startedAt** (`Date`): When the experiment started.
+**result.completedAt** (`Date`): When the experiment completed.
+**result.results** (`ItemWithScores[]`): All item results with their scores.
+**result.results.itemId** (`string`): ID of the dataset item.
+**result.results.itemVersion** (`number`): Dataset version of the item when executed.
+**result.results.input** (`unknown`): Input data passed to the target.
+**result.results.output** (`unknown | null`): Output from the target, or \`null\` if failed.
+**result.results.groundTruth** (`unknown | null`): Expected output from the dataset item.
+**result.results.error** (`{ message: string; stack?: string; code?: string } | null`): Structured error if execution failed.
+**result.results.startedAt** (`Date`): When item execution started.
+**result.results.completedAt** (`Date`): When item execution completed.
+**result.results.retryCount** (`number`): Number of retry attempts.
+**result.results.scores** (`ScorerResult[]`): Results from all scorers for this item.
+**result.results.scores.scorerId** (`string`): ID of the scorer.
+**result.results.scores.scorerName** (`string`): Display name of the scorer.
+**result.results.scores.score** (`number | null`): Computed score, or \`null\` if the scorer failed.
+**result.results.scores.reason** (`string | null`): Reason/explanation for the score.
+**result.results.scores.error** (`string | null`): Error message if the scorer failed.
 ## Related
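
The new dotted-path field list (`result.results.scores.score` and so on) flattens a nested return shape. As a reading aid, the sketch below transcribes those fields into local TypeScript interfaces and adds a small helper that averages scores per scorer. The interfaces are copied from the field list in this diff, not imported from `@mastra/core`, so the package's real exports may differ; `meanScoreByScorer` is a hypothetical helper, not part of the library.

```typescript
// Local types transcribed from the "Returns" field list in the diff.
// They are NOT imported from @mastra/core and may differ from its exports.
interface ScorerResult {
  scorerId: string;
  scorerName: string;
  score: number | null; // null if the scorer failed
  reason: string | null;
  error: string | null;
}

interface ItemWithScores {
  itemId: string;
  itemVersion: number;
  input: unknown;
  output: unknown | null; // null if execution failed
  groundTruth: unknown | null;
  error: { message: string; stack?: string; code?: string } | null;
  startedAt: Date;
  completedAt: Date;
  retryCount: number;
  scores: ScorerResult[];
}

interface ExperimentSummary {
  experimentId: string;
  status: 'pending' | 'running' | 'completed' | 'failed';
  totalItems: number;
  succeededCount: number;
  failedCount: number;
  skippedCount: number;
  completedWithErrors: boolean;
  startedAt: Date;
  completedAt: Date;
  results: ItemWithScores[];
}

// Hypothetical helper: mean of each scorer's non-null scores across all items.
function meanScoreByScorer(summary: ExperimentSummary): Record<string, number> {
  const totals: Record<string, { sum: number; count: number }> = {};
  for (const item of summary.results) {
    for (const s of item.scores) {
      if (s.score === null) continue; // skip scorers that failed on this item
      const t = totals[s.scorerId] ?? { sum: 0, count: 0 };
      t.sum += s.score;
      t.count += 1;
      totals[s.scorerId] = t;
    }
  }
  const means: Record<string, number> = {};
  for (const id of Object.keys(totals)) {
    means[id] = totals[id].sum / totals[id].count;
  }
  return means;
}
```

Scorers that returned `null` for every item do not appear in the returned record at all, which keeps a failed scorer from being reported as a score of `0`.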

package/dist/docs/references/reference-datasets-startExperimentAsync.md CHANGED

@@ -35,7 +35,11 @@ Takes the same `StartExperimentConfig` as [`dataset.startExperiment()`](https://
 ## Returns
-**result:** (`Promise<object>`): objectexperimentId:stringUnique ID of the created experiment.status:'pending'Always \`'pending'\` since the experiment hasn't started executing yet.
+**result** (`Promise<object>`): Immediate response with experiment ID.
+**result.experimentId** (`string`): Unique ID of the created experiment.
+**result.status** (`'pending'`): Always \`'pending'\` since the experiment hasn't started executing yet.
 ## Related

package/dist/docs/references/reference-datasets-update.md CHANGED

@@ -33,16 +33,16 @@ const updated2 = await dataset.update({
 ## Parameters
-**name?:** (`string`): New display name.
+**name** (`string`): New display name.
-**description?:** (`string`): New description.
+**description** (`string`): New description.
-**metadata?:** (`Record<string, unknown>`): Updated metadata.
+**metadata** (`Record<string, unknown>`): Updated metadata.
-**inputSchema?:** (`unknown`): JSON Schema or Zod schema for item inputs.
+**inputSchema** (`unknown`): JSON Schema or Zod schema for item inputs.
-**groundTruthSchema?:** (`unknown`): JSON Schema or Zod schema for item ground truths.
+**groundTruthSchema** (`unknown`): JSON Schema or Zod schema for item ground truths.
 ## Returns
-**result:** (`Promise<DatasetRecord>`): The updated dataset record. See dataset.getDetails() for the full shape.
+**result** (`Promise<DatasetRecord>`): The updated dataset record. See dataset.getDetails() for the full shape.

package/dist/docs/references/reference-datasets-updateItem.md CHANGED

@@ -25,14 +25,14 @@ const updated = await dataset.updateItem({
 ## Parameters
-**itemId:** (`string`): ID of the item to update.
+**itemId** (`string`): ID of the item to update.
-**input?:** (`unknown`): Updated input data.
+**input** (`unknown`): Updated input data.
-**groundTruth?:** (`unknown`): Updated ground truth.
+**groundTruth** (`unknown`): Updated ground truth.
-**metadata?:** (`Record<string, unknown>`): Updated metadata.
+**metadata** (`Record<string, unknown>`): Updated metadata.
 ## Returns
-**result:** (`Promise<DatasetItem>`): The updated dataset item. See dataset.addItem() for the item shape.
+**result** (`Promise<DatasetItem>`): The updated dataset item. See dataset.addItem() for the item shape.

package/dist/docs/references/reference-evals-answer-relevancy.md CHANGED

@@ -4,31 +4,31 @@ The `createAnswerRelevancyScorer()` function accepts a single options object wit
 ## Parameters
-**model:** (`LanguageModel`): Configuration for the model used to evaluate relevancy.
+**model** (`LanguageModel`): Configuration for the model used to evaluate relevancy.
-**uncertaintyWeight:** (`number`): Weight given to 'unsure' verdicts in scoring (0-1). (Default: `0.3`)
+**uncertaintyWeight** (`number`): Weight given to 'unsure' verdicts in scoring (0-1). (Default: `0.3`)
-**scale:** (`number`): Maximum score value. (Default: `1`)
+**scale** (`number`): Maximum score value. (Default: `1`)
 This function returns an instance of the MastraScorer class. The `.run()` method accepts the same input as other scorers (see the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer)), but the return value includes LLM-specific fields as documented below.
 ## .run() Returns
-**runId:** (`string`): The id of the run (optional).
+**runId** (`string`): The id of the run (optional).
-**score:** (`number`): Relevancy score (0 to scale, default 0-1)
+**score** (`number`): Relevancy score (0 to scale, default 0-1)
-**preprocessPrompt:** (`string`): The prompt sent to the LLM for the preprocess step (optional).
+**preprocessPrompt** (`string`): The prompt sent to the LLM for the preprocess step (optional).
-**preprocessStepResult:** (`object`): Object with extracted statements: { statements: string\[] }
+**preprocessStepResult** (`object`): Object with extracted statements: { statements: string\[] }
-**analyzePrompt:** (`string`): The prompt sent to the LLM for the analyze step (optional).
+**analyzePrompt** (`string`): The prompt sent to the LLM for the analyze step (optional).
-**analyzeStepResult:** (`object`): Object with results: { results: Array<{ result: 'yes' | 'unsure' | 'no', reason: string }> }
+**analyzeStepResult** (`object`): Object with results: { results: Array<{ result: 'yes' | 'unsure' | 'no', reason: string }> }
-**generateReasonPrompt:** (`string`): The prompt sent to the LLM for the reason step (optional).
+**generateReasonPrompt** (`string`): The prompt sent to the LLM for the reason step (optional).
-**reason:** (`string`): Explanation of the score.
+**reason** (`string`): Explanation of the score.
 ## Scoring Details
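
The parameters above describe an `uncertaintyWeight` applied to `'unsure'` verdicts and a `scale` that caps the score, and `analyzeStepResult` carries one verdict per extracted statement. One plausible way these combine is sketched below. The exact formula is not shown in this diff, so treat the weighting (`'yes'` counts as 1, `'unsure'` as `uncertaintyWeight`, `'no'` as 0) and the empty-input behaviour as assumptions; `relevancyScore` is a hypothetical name.

```typescript
type Verdict = 'yes' | 'unsure' | 'no';

// Assumed scoring: weighted share of relevant statements, multiplied by scale.
// Defaults (0.3 and 1) are the documented defaults from the parameter list.
function relevancyScore(
  results: { result: Verdict; reason: string }[],
  uncertaintyWeight = 0.3,
  scale = 1,
): number {
  if (results.length === 0) return 0; // assumption: no statements means score 0
  let weighted = 0;
  for (const r of results) {
    if (r.result === 'yes') weighted += 1;
    else if (r.result === 'unsure') weighted += uncertaintyWeight;
    // 'no' contributes nothing
  }
  return (weighted / results.length) * scale;
}
```

Under this assumption, raising `uncertaintyWeight` toward 1 makes the scorer treat ambiguous statements almost like relevant ones, while 0 treats them like irrelevant ones.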

package/dist/docs/references/reference-evals-answer-similarity.md CHANGED

@@ -4,45 +4,43 @@ The `createAnswerSimilarityScorer()` function creates a scorer that evaluates ho
 ## Parameters
-**model:** (`LanguageModel`): The language model used to evaluate semantic similarity between outputs and ground truth.
+**model** (`LanguageModel`): The language model used to evaluate semantic similarity between outputs and ground truth.
-**options:** (`AnswerSimilarityOptions`): Configuration options for the scorer.
+**options** (`AnswerSimilarityOptions`): Configuration options for the scorer.
-### AnswerSimilarityOptions
+**options.requireGroundTruth** (`boolean`): Whether to require ground truth for evaluation. If false, missing ground truth returns score 0.
-**requireGroundTruth:** (`boolean`): Whether to require ground truth for evaluation. If false, missing ground truth returns score 0. (Default: `true`)
+**options.semanticThreshold** (`number`): Weight for semantic matches vs exact matches (0-1).
-**semanticThreshold:** (`number`): Weight for semantic matches vs exact matches (0-1). (Default: `0.8`)
+**options.exactMatchBonus** (`number`): Additional score bonus for exact matches (0-1).
-**exactMatchBonus:** (`number`): Additional score bonus for exact matches (0-1). (Default: `0.2`)
+**options.missingPenalty** (`number`): Penalty per missing key concept from ground truth.
-**missingPenalty:** (`number`): Penalty per missing key concept from ground truth. (Default: `0.15`)
+**options.contradictionPenalty** (`number`): Penalty for contradictory information. High value ensures wrong answers score near 0.
-**contradictionPenalty:** (`number`): Penalty for contradictory information. High value ensures wrong answers score near 0. (Default: `1.0`)
+**options.extraInfoPenalty** (`number`): Mild penalty for extra information not present in ground truth (capped at 0.2).
-**extraInfoPenalty:** (`number`): Mild penalty for extra information not present in ground truth (capped at 0.2). (Default: `0.05`)
-**scale:** (`number`): Score scaling factor. (Default: `1`)
+**options.scale** (`number`): Score scaling factor.
 This function returns an instance of the MastraScorer class. The `.run()` method accepts the same input as other scorers (see the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer)), but **requires ground truth** to be provided in the run object.
 ## .run() Returns
-**runId:** (`string`): The id of the run (optional).
+**runId** (`string`): The id of the run (optional).
-**score:** (`number`): Similarity score between 0-1 (or 0-scale if custom scale used). Higher scores indicate better similarity to ground truth.
+**score** (`number`): Similarity score between 0-1 (or 0-scale if custom scale used). Higher scores indicate better similarity to ground truth.
-**reason:** (`string`): Human-readable explanation of the score with actionable feedback.
+**reason** (`string`): Human-readable explanation of the score with actionable feedback.
-**preprocessStepResult:** (`object`): Extracted semantic units from output and ground truth.
+**preprocessStepResult** (`object`): Extracted semantic units from output and ground truth.
-**analyzeStepResult:** (`object`): Detailed analysis of matches, contradictions, and extra information.
+**analyzeStepResult** (`object`): Detailed analysis of matches, contradictions, and extra information.
-**preprocessPrompt:** (`string`): The prompt used for semantic unit extraction.
+**preprocessPrompt** (`string`): The prompt used for semantic unit extraction.
-**analyzePrompt:** (`string`): The prompt used for similarity analysis.
+**analyzePrompt** (`string`): The prompt used for similarity analysis.
-**generateReasonPrompt:** (`string`): The prompt used for generating the explanation.
+**generateReasonPrompt** (`string`): The prompt used for generating the explanation.
 ## Scoring Details
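
Note that the 1.10.0 lines for `options.*` no longer carry the `(Default: …)` suffixes that the 1.9.0 lines had. The sketch below records the defaults as they appeared in the removed 1.9.0 lines and merges caller options over them. Whether 1.10.0 still uses these values is not stated in this diff, so the constant is labelled accordingly; the interface and `withDefaults` are local illustrations, not library exports.

```typescript
// Option names taken from the diff; all fields treated as optional here.
interface AnswerSimilarityOptions {
  requireGroundTruth?: boolean;
  semanticThreshold?: number;
  exactMatchBonus?: number;
  missingPenalty?: number;
  contradictionPenalty?: number;
  extraInfoPenalty?: number;
  scale?: number;
}

// Defaults as documented in the removed 1.9.0 lines of this diff.
// Assumption: unchanged in 1.10.0 (the new text simply omits them).
const DEFAULTS_FROM_1_9_0: Required<AnswerSimilarityOptions> = {
  requireGroundTruth: true,
  semanticThreshold: 0.8,
  exactMatchBonus: 0.2,
  missingPenalty: 0.15,
  contradictionPenalty: 1.0,
  extraInfoPenalty: 0.05,
  scale: 1,
};

// Hypothetical helper: caller-supplied values win over the recorded defaults.
function withDefaults(
  options: AnswerSimilarityOptions = {},
): Required<AnswerSimilarityOptions> {
  return { ...DEFAULTS_FROM_1_9_0, ...options };
}
```

For example, `withDefaults({ scale: 5 })` keeps every penalty and threshold at its recorded default and changes only `scale`.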

package/dist/docs/references/reference-evals-bias.md CHANGED

@@ -4,29 +4,29 @@ The `createBiasScorer()` function accepts a single options object with the follo
 ## Parameters
-**model:** (`LanguageModel`): Configuration for the model used to evaluate bias.
+**model** (`LanguageModel`): Configuration for the model used to evaluate bias.
-**scale:** (`number`): Maximum score value. (Default: `1`)
+**scale** (`number`): Maximum score value. (Default: `1`)
 This function returns an instance of the MastraScorer class. The `.run()` method accepts the same input as other scorers (see the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer)), but the return value includes LLM-specific fields as documented below.
 ## .run() Returns
-**runId:** (`string`): The id of the run (optional).
+**runId** (`string`): The id of the run (optional).
-**preprocessStepResult:** (`object`): Object with extracted opinions: { opinions: string\[] }
+**preprocessStepResult** (`object`): Object with extracted opinions: { opinions: string\[] }
-**preprocessPrompt:** (`string`): The prompt sent to the LLM for the preprocess step (optional).
+**preprocessPrompt** (`string`): The prompt sent to the LLM for the preprocess step (optional).
-**analyzeStepResult:** (`object`): Object with results: { results: Array<{ result: 'yes' | 'no', reason: string }> }
+**analyzeStepResult** (`object`): Object with results: { results: Array<{ result: 'yes' | 'no', reason: string }> }
-**analyzePrompt:** (`string`): The prompt sent to the LLM for the analyze step (optional).
+**analyzePrompt** (`string`): The prompt sent to the LLM for the analyze step (optional).
-**score:** (`number`): Bias score (0 to scale, default 0-1). Higher scores indicate more bias.
+**score** (`number`): Bias score (0 to scale, default 0-1). Higher scores indicate more bias.
-**reason:** (`string`): Explanation of the score.
+**reason** (`string`): Explanation of the score.
-**generateReasonPrompt:** (`string`): The prompt sent to the LLM for the generateReason step (optional).
+**generateReasonPrompt** (`string`): The prompt sent to the LLM for the generateReason step (optional).
 ## Bias Categories

package/dist/docs/references/reference-evals-completeness.md CHANGED

@@ -10,11 +10,11 @@ This function returns an instance of the MastraScorer class. See the [MastraScor
 ## .run() Returns
-**runId:** (`string`): The id of the run (optional).
+**runId** (`string`): The id of the run (optional).
-**preprocessStepResult:** (`object`): Object with extracted elements and coverage details: { inputElements: string\[], outputElements: string\[], missingElements: string\[], elementCounts: { input: number, output: number } }
+**preprocessStepResult** (`object`): Object with extracted elements and coverage details: { inputElements: string\[], outputElements: string\[], missingElements: string\[], elementCounts: { input: number, output: number } }
-**score:** (`number`): Completeness score (0-1) representing the proportion of input elements covered in the output.
+**score** (`number`): Completeness score (0-1) representing the proportion of input elements covered in the output.
 The `.run()` method returns a result in the following shape:

package/dist/docs/references/reference-evals-content-similarity.md CHANGED

@@ -6,21 +6,21 @@ The `createContentSimilarityScorer()` function measures the textual similarity b
 The `createContentSimilarityScorer()` function accepts a single options object with the following properties:
-**ignoreCase:** (`boolean`): Whether to ignore case differences when comparing strings. (Default: `true`)
+**ignoreCase** (`boolean`): Whether to ignore case differences when comparing strings. (Default: `true`)
-**ignoreWhitespace:** (`boolean`): Whether to normalize whitespace when comparing strings. (Default: `true`)
+**ignoreWhitespace** (`boolean`): Whether to normalize whitespace when comparing strings. (Default: `true`)
 This function returns an instance of the MastraScorer class. See the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer) for details on the `.run()` method and its input/output.
 ## .run() Returns
-**runId:** (`string`): The id of the run (optional).
+**runId** (`string`): The id of the run (optional).
-**preprocessStepResult:** (`object`): Object with processed input and output: { processedInput: string, processedOutput: string }
+**preprocessStepResult** (`object`): Object with processed input and output: { processedInput: string, processedOutput: string }
-**analyzeStepResult:** (`object`): Object with similarity: { similarity: number }
+**analyzeStepResult** (`object`): Object with similarity: { similarity: number }
-**score:** (`number`): Similarity score (0-1) where 1 indicates perfect similarity.
+**score** (`number`): Similarity score (0-1) where 1 indicates perfect similarity.
 ## Scoring Details

package/dist/docs/references/reference-evals-context-precision.md CHANGED

@@ -22,17 +22,17 @@ Use when optimizing context selection for:
 ## Parameters
-**model:** (`MastraModelConfig`): The language model to use for evaluating context relevance
+**model** (`MastraModelConfig`): The language model to use for evaluating context relevance
-**options:** (`ContextPrecisionMetricOptions`): Configuration options for the scorer
+**options** (`ContextPrecisionMetricOptions`): Configuration options for the scorer
 **Note**: Either `context` or `contextExtractor` must be provided. If both are provided, `contextExtractor` takes precedence.
 ## .run() Returns
-**score:** (`number`): Mean Average Precision score between 0 and scale (default 0-1)
+**score** (`number`): Mean Average Precision score between 0 and scale (default 0-1)
-**reason:** (`string`): Human-readable explanation of the context precision evaluation
+**reason** (`string`): Human-readable explanation of the context precision evaluation
 ## Scoring Details
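
The `score` field is described as a Mean Average Precision value between 0 and `scale`. For readers unfamiliar with the metric, here is a minimal sketch of the textbook average-precision calculation over one ranked list of context chunks. It assumes each chunk has already been judged relevant or not; whether the scorer computes it in exactly this way is not shown in this diff, and `averagePrecision` is a hypothetical name.

```typescript
// Textbook average precision for one ranked list of relevance judgments.
// relevance[i] is true when the context chunk at rank i + 1 was judged relevant.
function averagePrecision(relevance: boolean[], scale = 1): number {
  let relevantSeen = 0;
  let precisionSum = 0;
  relevance.forEach((isRelevant, index) => {
    if (!isRelevant) return;
    relevantSeen += 1;
    // precision at this rank = relevant chunks so far / chunks examined so far
    precisionSum += relevantSeen / (index + 1);
  });
  // No relevant chunk at all yields 0 rather than a division by zero.
  return relevantSeen === 0 ? 0 : (precisionSum / relevantSeen) * scale;
}
```

The metric rewards rankings that place relevant chunks early: `[true, false, false, true]` scores higher than `[false, false, true, true]` even though both contain two relevant chunks.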

package/dist/docs/references/reference-evals-create-scorer.md CHANGED

@@ -37,23 +37,21 @@ const scorer = createScorer({
 ## createScorer Options
-**id:** (`string`): Unique identifier for the scorer. Used as the name if \`name\` is not provided.
+**id** (`string`): Unique identifier for the scorer. Used as the name if \`name\` is not provided.
-**name?:** (`string`): Name of the scorer. Defaults to \`id\` if not provided.
+**name** (`string`): Name of the scorer. Defaults to \`id\` if not provided.
-**description:** (`string`): Description of what the scorer does.
+**description** (`string`): Description of what the scorer does.
-**judge?:** (`object`): Optional judge configuration for LLM-based steps. See Judge Object section below.
+**judge** (`object`): Optional judge configuration for LLM-based steps.
-**type?:** (`string`): Type specification for input/output. Use 'agent' for automatic agent types. For custom types, use the generic approach instead.
+**judge.model** (`LanguageModel`): The LLM model instance to use for evaluation.
-This function returns a scorer builder that you can chain step methods onto. See the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer) for details on the `.run()` method and its input/output.
-## Judge Object
+**judge.instructions** (`string`): System prompt/instructions for the LLM.
-**model:** (`LanguageModel`): The LLM model instance to use for evaluation.
+**type** (`string`): Type specification for input/output. Use 'agent' for automatic agent types. For custom types, use the generic approach instead.
-**instructions:** (`string`): System prompt/instructions for the LLM.
+This function returns a scorer builder that you can chain step methods onto. See the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer) for details on the `.run()` method and its input/output.
 The judge only runs for steps defined as **prompt objects** (`preprocess`, `analyze`, `generateScore`, `generateReason` in prompt mode). If you use function steps only, the judge is never called and there is no LLM output to inspect. In that case, any score/reason must be produced by your functions.
@@ -149,28 +147,28 @@ Optional preprocessing step that can extract or transform data before analysis.
 **Function Mode:** Function: `({ run, results }) => any`
-**run.input:** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. \`\[{ role: 'user', content: 'hello world' }]\`. If the scorer is used in a workflow, this will be the input of the workflow.
+**run.input** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. \`\[{ role: 'user', content: 'hello world' }]\`. If the scorer is used in a workflow, this will be the input of the workflow.
-**run.output:** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
+**run.output** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
-**run.runId:** (`string`): Unique identifier for this scoring run.
+**run.runId** (`string`): Unique identifier for this scoring run.
-**run.requestContext?:** (`object`): Request Context from the agent or workflow step being evaluated (optional).
+**run.requestContext** (`object`): Request Context from the agent or workflow step being evaluated (optional).
-**results:** (`object`): Empty object (no previous steps).
+**results** (`object`): Empty object (no previous steps).
 Returns: `any`\
 The method can return any value. The returned value will be available to subsequent steps as `preprocessStepResult`.
 **Prompt Object Mode:**
-**description:** (`string`): Description of what this preprocessing step does.
+**description** (`string`): Description of what this preprocessing step does.
-**outputSchema:** (`ZodSchema`): Zod schema for the expected output of the preprocess step.
+**outputSchema** (`ZodSchema`): Zod schema for the expected output of the preprocess step.
-**createPrompt:** (`function`): Function: ({ run, results }) => string. Returns the prompt for the LLM.
+**createPrompt** (`function`): Function: ({ run, results }) => string. Returns the prompt for the LLM.
-**judge?:** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
+**judge** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
 ### analyze
@@ -178,28 +176,28 @@ Optional analysis step that processes the input/output and any preprocessed data
 **Function Mode:** Function: `({ run, results }) => any`
-**run.input:** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. \`\[{ role: 'user', content: 'hello world' }]\`. If the scorer is used in a workflow, this will be the input of the workflow.
+**run.input** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. \`\[{ role: 'user', content: 'hello world' }]\`. If the scorer is used in a workflow, this will be the input of the workflow.
-**run.output:** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
+**run.output** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
-**run.runId:** (`string`): Unique identifier for this scoring run.
+**run.runId** (`string`): Unique identifier for this scoring run.
-**run.requestContext?:** (`object`): Request Context from the agent or workflow step being evaluated (optional).
+**run.requestContext** (`object`): Request Context from the agent or workflow step being evaluated (optional).
-**results.preprocessStepResult?:** (`any`): Result from preprocess step, if defined (optional).
+**results.preprocessStepResult** (`any`): Result from preprocess step, if defined (optional).
 Returns: `any`\
 The method can return any value. The returned value will be available to subsequent steps as `analyzeStepResult`.
 **Prompt Object Mode:**
-**description:** (`string`): Description of what this analysis step does.
+**description** (`string`): Description of what this analysis step does.
-**outputSchema:** (`ZodSchema`): Zod schema for the expected output of the analyze step.
+**outputSchema** (`ZodSchema`): Zod schema for the expected output of the analyze step.
-**createPrompt:** (`function`): Function: ({ run, results }) => string. Returns the prompt for the LLM.
+**createPrompt** (`function`): Function: ({ run, results }) => string. Returns the prompt for the LLM.
-**judge?:** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
+**judge** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
 ### generateScore
@@ -207,34 +205,34 @@ The method can return any value. The returned value will be available to subsequ
 **Function Mode:** Function: `({ run, results }) => number`
-**run.input:** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. \`\[{ role: 'user', content: 'hello world' }]\`. If the scorer is used in a workflow, this will be the input of the workflow.
+**run.input** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. \`\[{ role: 'user', content: 'hello world' }]\`. If the scorer is used in a workflow, this will be the input of the workflow.
-**run.output:** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
+**run.output** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
-**run.runId:** (`string`): Unique identifier for this scoring run.
+**run.runId** (`string`): Unique identifier for this scoring run.
-**run.requestContext?:** (`object`): Request Context from the agent or workflow step being evaluated (optional).
+**run.requestContext** (`object`): Request Context from the agent or workflow step being evaluated (optional).
-**results.preprocessStepResult?:** (`any`): Result from preprocess step, if defined (optional).
+**results.preprocessStepResult** (`any`): Result from preprocess step, if defined (optional).
-**results.analyzeStepResult?:** (`any`): Result from analyze step, if defined (optional).
+**results.analyzeStepResult** (`any`): Result from analyze step, if defined (optional).
 Returns: `number`\
 The method must return a numerical score.
 **Prompt Object Mode:**
-**description:** (`string`): Description of what this scoring step does.
+**description** (`string`): Description of what this scoring step does.
-**outputSchema:** (`ZodSchema`): Zod schema for the expected output of the generateScore step.
+**outputSchema** (`ZodSchema`): Zod schema for the expected output of the generateScore step.
-**createPrompt:** (`function`): Function: ({ run, results }) => string. Returns the prompt for the LLM.
+**createPrompt** (`function`): Function: ({ run, results }) => string. Returns the prompt for the LLM.
-**judge?:** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
+**judge** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
 When using prompt object mode, you must also provide a `calculateScore` function to convert the LLM output to a numerical score:
-**calculateScore:** (`function`): Function: ({ run, results, analyzeStepResult }) => number. Converts the LLM's structured output into a numerical score.
+**calculateScore** (`function`): Function: ({ run, results, analyzeStepResult }) => number. Converts the LLM's structured output into a numerical score.
 ### generateReason
@@ -242,29 +240,29 @@ Optional step that provides an explanation for the score.
 **Function Mode:** Function: `({ run, results, score }) => string`
-**run.input:** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. \`\[{ role: 'user', content: 'hello world' }]\`. If the scorer is used in a workflow, this will be the input of the workflow.
+**run.input** (`any`): Input records provided to the scorer. If the scorer is added to an agent, this will be an array of user messages, e.g. \`\[{ role: 'user', content: 'hello world' }]\`. If the scorer is used in a workflow, this will be the input of the workflow.
-**run.output:** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
+**run.output** (`any`): Output record provided to the scorer. For agents, this is usually the agent's response. For workflows, this is the workflow's output.
-**run.runId:** (`string`): Unique identifier for this scoring run.
+**run.runId** (`string`): Unique identifier for this scoring run.
-**run.requestContext?:** (`object`): Request Context from the agent or workflow step being evaluated (optional).
+**run.requestContext** (`object`): Request Context from the agent or workflow step being evaluated (optional).
-**results.preprocessStepResult?:** (`any`): Result from preprocess step, if defined (optional).
+**results.preprocessStepResult** (`any`): Result from preprocess step, if defined (optional).
-**results.analyzeStepResult?:** (`any`): Result from analyze step, if defined (optional).
+**results.analyzeStepResult** (`any`): Result from analyze step, if defined (optional).
-**score:** (`number`): Score computed by the generateScore step.
+**score** (`number`): Score computed by the generateScore step.
 Returns: `string`\
 The method must return a string explaining the score.
 **Prompt Object Mode:**
-**description:** (`string`): Description of what this reasoning step does.
+**description** (`string`): Description of what this reasoning step does.
-**createPrompt:** (`function`): Function: ({ run, results, score }) => string. Returns the prompt for the LLM.
+**createPrompt** (`function`): Function: ({ run, results, score }) => string. Returns the prompt for the LLM.
-**judge?:** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
+**judge** (`object`): (Optional) LLM judge for this step (can override main judge). See Judge Object section.
 All step functions can be async.
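
For readers of this diff, the function-mode signature of `generateReason` documented above, `({ run, results, score }) => string`, can be sketched as follows. This is a minimal illustration, not code from `@mastra/core`: the `Run` and `Results` types are hypothetical local stand-ins for the documented parameters, and the `verdicts` shape for `analyzeStepResult` is an assumption (the reference types it as `any`).

```typescript
// Hypothetical local types mirroring the documented parameters.
// They are NOT imported from @mastra/core.
interface Run {
  input: unknown;
  output: unknown;
  runId: string;
  requestContext?: Record<string, unknown>;
}

interface Results {
  preprocessStepResult?: unknown;
  // Assumed shape for illustration only; the reference documents this as `any`.
  analyzeStepResult?: { verdicts: { verdict: string; reason: string }[] };
}

interface ReasonArgs {
  run: Run;
  results: Results;
  score: number;
}

// Synchronous core: builds the explanation string from the step inputs.
function explainScore({ run, results, score }: ReasonArgs): string {
  const total = results.analyzeStepResult?.verdicts.length ?? 0;
  return `Run ${run.runId} scored ${score.toFixed(2)} across ${total} verdict(s).`;
}

// Async wrapper, since the reference states that step functions can be async.
async function generateReason(args: ReasonArgs): Promise<string> {
  return explainScore(args);
}
```

Splitting the synchronous core from the async wrapper keeps the string-building logic easy to test; either form satisfies the documented requirement that the step return a string.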

package/dist/docs/references/reference-evals-faithfulness.md CHANGED

@@ -6,31 +6,31 @@ The `createFaithfulnessScorer()` function evaluates how factually accurate an LL
 The `createFaithfulnessScorer()` function accepts a single options object with the following properties:
-**model:** (`LanguageModel`): Configuration for the model used to evaluate faithfulness.
+**model** (`LanguageModel`): Configuration for the model used to evaluate faithfulness.
-**context:** (`string[]`): Array of context chunks against which the output's claims will be verified.
+**context** (`string[]`): Array of context chunks against which the output's claims will be verified.
-**scale:** (`number`): The maximum score value. The final score will be normalized to this scale. (Default: `1`)
+**scale** (`number`): The maximum score value. The final score will be normalized to this scale. (Default: `1`)
 This function returns an instance of the MastraScorer class. The `.run()` method accepts the same input as other scorers (see the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer)), but the return value includes LLM-specific fields as documented below.
 ## .run() Returns
-**runId:** (`string`): The id of the run (optional).
+**runId** (`string`): The id of the run (optional).
-**preprocessStepResult:** (`string[]`): Array of extracted claims from the output.
+**preprocessStepResult** (`string[]`): Array of extracted claims from the output.
-**preprocessPrompt:** (`string`): The prompt sent to the LLM for the preprocess step (optional).
+**preprocessPrompt** (`string`): The prompt sent to the LLM for the preprocess step (optional).
-**analyzeStepResult:** (`object`): Object with verdicts: { verdicts: Array<{ verdict: 'yes' | 'no' | 'unsure', reason: string }> }
+**analyzeStepResult** (`object`): Object with verdicts: { verdicts: Array<{ verdict: 'yes' | 'no' | 'unsure', reason: string }> }
-**analyzePrompt:** (`string`): The prompt sent to the LLM for the analyze step (optional).
+**analyzePrompt** (`string`): The prompt sent to the LLM for the analyze step (optional).
-**score:** (`number`): A score between 0 and the configured scale, representing the proportion of claims that are supported by the context.
+**score** (`number`): A score between 0 and the configured scale, representing the proportion of claims that are supported by the context.
-**reason:** (`string`): A detailed explanation of the score, including which claims were supported, contradicted, or marked as unsure.
+**reason** (`string`): A detailed explanation of the score, including which claims were supported, contradicted, or marked as unsure.
-**generateReasonPrompt:** (`string`): The prompt sent to the LLM for the generateReason step (optional).
+**generateReasonPrompt** (`string`): The prompt sent to the LLM for the generateReason step (optional).
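
The `score` field above is described as the proportion of supported claims, normalized to `scale`. A rough sketch of that arithmetic follows. It is not the library's implementation: the helper name `faithfulnessScore` is hypothetical, and it assumes that only `'yes'` verdicts count as supported and that an empty verdict list scores `0`.

```typescript
// Verdict shape as documented for analyzeStepResult in the list above.
type Verdict = { verdict: "yes" | "no" | "unsure"; reason: string };

// Hypothetical helper: proportion of supported claims, scaled to `scale`.
// Assumption: "no" and "unsure" verdicts both count as unsupported.
function faithfulnessScore(verdicts: Verdict[], scale: number = 1): number {
  if (verdicts.length === 0) return 0; // assumed behavior when there are no claims
  const supported = verdicts.filter((v) => v.verdict === "yes").length;
  return (supported / verdicts.length) * scale;
}
```

With two supported claims out of four, this sketch yields `0.5` at the default scale and `5` at a scale of `10`.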
 ## Scoring Details