npm - @mastra/mcp-docs-server - Versions diffs - 0.13.37 → 0.13.38 - Mend

@mastra/mcp-docs-server 0.13.37 → 0.13.38

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (397)

package/.docs/raw/reference/workflows/workflow.mdx CHANGED

@@ -9,7 +9,7 @@ The `Workflow` class enables you to create state machines for complex sequences
 ## Usage example
-```typescript filename="src/mastra/workflows/test-workflow.ts" showLineNumbers copy
+```typescript title="src/mastra/workflows/test-workflow.ts" showLineNumbers copy
 import { createWorkflow } from "@mastra/core/workflows";
 import { z } from "zod";
@@ -20,8 +20,8 @@ export const workflow = createWorkflow({
   }),
   outputSchema: z.object({
     value: z.string(),
-  })
-})
+  }),
+});
 ```
 ## Constructor parameters
@@ -46,7 +46,8 @@ export const workflow = createWorkflow({
     {
       name: "stateSchema",
       type: "z.ZodObject<any>",
-      description: "Optional Zod schema for the workflow state. Automatically injected when using Mastra's state system. If not specified, type is 'any'.",
+      description:
+        "Optional Zod schema for the workflow state. Automatically injected when using Mastra's state system. If not specified, type is 'any'.",
       isOptional: true,
     },
     {
@@ -54,7 +55,7 @@ export const workflow = createWorkflow({
       type: "WorkflowOptions",
       description: "Optional options for the workflow",
       isOptional: true,
-    }
+    },
   ]}
 />
@@ -71,14 +72,16 @@ export const workflow = createWorkflow({
     {
       name: "validateInputs",
       type: "boolean",
-      description: "Optional flag to determine whether to validate the workflow inputs. This also applies default values from zodSchemas on the workflow/step input/resume data. If input/resume data validation fails on start/resume, the workflow will not start/resume, it throws an error instead. If input data validation fails on a step execution, the step fails, causing the workflow to fail and the error is returned.",
+      description:
+        "Optional flag to determine whether to validate the workflow inputs. This also applies default values from zodSchemas on the workflow/step input/resume data. If input/resume data validation fails on start/resume, the workflow will not start/resume, it throws an error instead. If input data validation fails on a step execution, the step fails, causing the workflow to fail and the error is returned.",
       isOptional: true,
       defaultValue: "false",
     },
     {
       name: "shouldPersistSnapshot",
       type: "(params: { stepResults: Record<string, StepResult<any, any, any, any>>; workflowStatus: WorkflowRunStatus }) => boolean",
-      description: "Optional flag to determine whether to persist the workflow snapshot",
+      description:
+        "Optional flag to determine whether to persist the workflow snapshot",
       isOptional: true,
       defaultValue: "() => true",
     },
@@ -114,7 +117,7 @@ A workflow's `status` indicates its current execution state. The possible values
 ## Extended usage example
-```typescript filename="src/test-run.ts" showLineNumbers copy
+```typescript title="src/test-run.ts" showLineNumbers copy
 import { mastra } from "./mastra";
 const run = await mastra.getWorkflow("workflow").createRunAsync();
@@ -128,5 +131,5 @@ if (result.status === "suspended") {
 ## Related
-- [Step Class](./step.mdx)
-- [Control flow](../../docs/workflows/control-flow.mdx)
+- [Step Class](./step)
+- [Control flow](/docs/workflows/control-flow)
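
Two of the changes in this file recur throughout the diff: the code-fence meta key `filename=` becomes `title=`, and `.mdx` is dropped from internal link targets. The sketch below shows how such a line-by-line rewrite could be scripted. It is an illustration only, not the tooling the maintainers used; `renameFenceMeta`, `stripMdxFromLinks`, and `migrateLine` are hypothetical names.

```typescript
// Assumption: these helpers are illustrative and not part of any Mastra package.

// Three backticks, built at runtime so this example can sit inside a fenced block.
const FENCE = "`".repeat(3);

// Rename the `filename=` meta key to `title=` on lines that open a code fence.
function renameFenceMeta(line: string): string {
  if (!line.startsWith(FENCE)) return line; // leave ordinary lines untouched
  return line.replace(/\bfilename=/, "title=");
}

// Drop a trailing `.mdx` from Markdown link targets, keeping any #anchor.
function stripMdxFromLinks(line: string): string {
  return line.replace(
    /\]\(([^)\s]+?)\.mdx(#[^)]*)?\)/g,
    (match: string, path: string, hash: string = "") => {
      // External URLs may legitimately point at .mdx source files; skip them.
      if (/^https?:\/\//.test(path)) return match;
      return `](${path}${hash})`;
    },
  );
}

// Apply both rewrites to a single line of an MDX file.
function migrateLine(line: string): string {
  return stripMdxFromLinks(renameFenceMeta(line));
}
```

This sketch only strips the extension. It does not reproduce the path change from `../../docs/workflows/control-flow.mdx` to `/docs/workflows/control-flow` seen above, which also converts a relative path to a root-relative one.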

package/.docs/raw/scorers/custom-scorers.mdx CHANGED

@@ -1,4 +1,8 @@
-## Custom scorers
+---
+title: "Custom Scorers | Scorers | Mastra Docs"
+---
+# Custom scorers
 Mastra provides a unified `createScorer` factory that allows you to build custom evaluation logic using either JavaScript functions or LLM-based prompt objects for each step. This flexibility lets you choose the best approach for each part of your evaluation pipeline.
@@ -16,12 +20,14 @@ Each step can use either **functions** or **prompt objects** (LLM-based evaluati
 ### Functions vs Prompt Objects
 **Functions** use JavaScript for deterministic logic. They're ideal for:
 - Algorithmic evaluations with clear criteria
 - Performance-critical scenarios
 - Integration with existing libraries
 - Consistent, reproducible results
 **Prompt Objects** use LLMs as judges for evaluation. They're perfect for:
 - Subjective evaluations requiring human-like judgment
 - Complex criteria difficult to code algorithmically
 - Natural language understanding tasks
@@ -61,9 +67,8 @@ For type safety and compatibility with both live agent scoring and trace scoring
 ```typescript
 const myScorer = createScorer({
   // ...
-  type: 'agent', // Automatically handles agent input/output types
-})
-.generateScore(({ run, results }) => {
+  type: "agent", // Automatically handles agent input/output types
+}).generateScore(({ run, results }) => {
   // run.output is automatically typed as ScorerRunOutputForAgent
   // run.input is automatically typed as ScorerRunInputForAgent
 });
@@ -83,7 +88,7 @@ const glutenCheckerScorer = createScorer(...)
   // Extract and clean recipe text
   const recipeText = run.output.text.toLowerCase();
   const wordCount = recipeText.split(' ').length;
   return {
     recipeText,
     wordCount,
@@ -105,7 +110,7 @@ const glutenCheckerScorer = createScorer(...)
   createPrompt: ({ run }) => `
     Extract all ingredients and cooking methods from this recipe:
     ${run.output.text}
     Return JSON with ingredients and cookingMethods arrays.
   `
 })
@@ -124,13 +129,13 @@ const glutenCheckerScorer = createScorer({...})
 .preprocess(...)
 .analyze(({ run, results }) => {
   const { recipeText, hasCommonGlutenWords } = results.preprocessStepResult;
   // Simple gluten detection algorithm
   const glutenKeywords = ['wheat', 'flour', 'barley', 'rye', 'bread'];
-  const foundGlutenWords = glutenKeywords.filter(word =>
+  const foundGlutenWords = glutenKeywords.filter(word =>
     recipeText.includes(word)
   );
   return {
     isGlutenFree: foundGlutenWords.length === 0,
     detectedGlutenSources: foundGlutenWords,
@@ -154,7 +159,7 @@ const glutenCheckerScorer = createScorer({...})
   createPrompt: ({ run, results }) => `
     Analyze this recipe for gluten content:
     "${results.preprocessStepResult.recipeText}"
     Look for wheat, barley, rye, and hidden sources like soy sauce.
     Return JSON with isGlutenFree, glutenSources array, and confidence (0-1).
   `
@@ -175,7 +180,7 @@ const glutenCheckerScorer = createScorer({...})
 .analyze(...)
 .generateScore(({ results }) => {
   const { isGlutenFree, confidence } = results.analyzeStepResult;
   // Return 1 for gluten-free, 0 for contains gluten
   // Weight by confidence level
   return isGlutenFree ? confidence : 0;
@@ -199,7 +204,7 @@ const glutenCheckerScorer = createScorer({...})
 .generateScore(...)
 .generateReason(({ results, score }) => {
   const { isGlutenFree, glutenSources } = results.analyzeStepResult;
   if (isGlutenFree) {
     return `Score: ${score}. This recipe is gluten-free with no harmful ingredients detected.`;
   } else {
@@ -220,14 +225,12 @@ const glutenCheckerScorer = createScorer({...})
   createPrompt: ({ results, score }) => `
     Explain why this recipe received a score of ${score}.
     Analysis: ${JSON.stringify(results.analyzeStepResult)}
     Provide a clear explanation for someone with dietary restrictions.
   `
 })
 ```
 ## Example: Create a custom scorer
 A custom scorer in Mastra uses `createScorer` with four core components:
@@ -241,14 +244,18 @@ Together, these components allow you to define custom evaluation logic using LLM
 > See [createScorer](/reference/scorers/create-scorer) for the full API and configuration options.
-```typescript filename="src/mastra/scorers/gluten-checker.ts" showLineNumbers copy
-import { openai } from '@ai-sdk/openai';
-import { createScorer } from '@mastra/core/scores';
-import { z } from 'zod';
+```typescript title="src/mastra/scorers/gluten-checker.ts" showLineNumbers copy
+import { openai } from "@ai-sdk/openai";
+import { createScorer } from "@mastra/core/scores";
+import { z } from "zod";
 export const GLUTEN_INSTRUCTIONS = `You are a Chef that identifies if recipes contain gluten.`;
-export const generateGlutenPrompt = ({ output }: { output: string }) => `Check if this recipe is gluten-free.
+export const generateGlutenPrompt = ({
+  output,
+}: {
+  output: string;
+}) => `Check if this recipe is gluten-free.
 Check for:
 - Wheat
@@ -285,23 +292,23 @@ export const generateReasonPrompt = ({
 }: {
   isGlutenFree: boolean;
   glutenSources: string[];
-}) => `Explain why this recipe is${isGlutenFree ? '' : ' not'} gluten-free.
+}) => `Explain why this recipe is${isGlutenFree ? "" : " not"} gluten-free.
-${glutenSources.length > 0 ? `Sources of gluten: ${glutenSources.join(', ')}` : 'No gluten-containing ingredients found'}
+${glutenSources.length > 0 ? `Sources of gluten: ${glutenSources.join(", ")}` : "No gluten-containing ingredients found"}
 Return your response in this format:
 "This recipe is [gluten-free/contains gluten] because [explanation]"`;
 export const glutenCheckerScorer = createScorer({
-  name: 'Gluten Checker',
-  description: 'Check if the output contains any gluten',
+  name: "Gluten Checker",
+  description: "Check if the output contains any gluten",
   judge: {
-    model: openai('gpt-4o'),
+    model: openai("gpt-4o"),
     instructions: GLUTEN_INSTRUCTIONS,
   },
 })
   .analyze({
-    description: 'Analyze the output for gluten',
+    description: "Analyze the output for gluten",
     outputSchema: z.object({
       isGlutenFree: z.boolean(),
       glutenSources: z.array(z.string()),
@@ -315,7 +322,7 @@ export const glutenCheckerScorer = createScorer({
     return results.analyzeStepResult.isGlutenFree ? 1 : 0;
   })
   .generateReason({
-    description: 'Generate a reason for the score',
+    description: "Generate a reason for the score",
     createPrompt: ({ results }) => {
       return generateReasonPrompt({
         glutenSources: results.analyzeStepResult.glutenSources,
@@ -355,6 +362,7 @@ Defines how the LLM should analyze the input and what structured output to retur
 ```
 The analysis step uses a prompt object to:
 - Provide a clear description of the analysis task
 - Define expected output structure with Zod schema (both boolean result and list of gluten sources)
 - Generate dynamic prompts based on the input content
@@ -388,11 +396,12 @@ Provides human-readable explanations for the score using another LLM call.
 ```
 The reason generation step creates explanations that help users understand why a particular score was assigned, using both the boolean result and the specific gluten sources identified by the analysis step.
-```
+````
 ## High gluten-free example
-```typescript filename="src/example-high-gluten-free.ts" showLineNumbers copy
+```typescript title="src/example-high-gluten-free.ts" showLineNumbers copy
 const result = await glutenCheckerScorer.run({
   input: [{ role: 'user', content: 'Mix rice, beans, and vegetables' }],
   output: { text: 'Mix rice, beans, and vegetables' },
@@ -401,16 +410,16 @@ const result = await glutenCheckerScorer.run({
 console.log('Score:', result.score);
 console.log('Gluten sources:', result.analyzeStepResult.glutenSources);
 console.log('Reason:', result.reason);
-```
+````
 ### High gluten-free output
 ```typescript
 {
   score: 1,
-  analyzeStepResult: {
+  analyzeStepResult: {
     isGlutenFree: true,
-    glutenSources: []
+    glutenSources: []
   },
   reason: 'This recipe is gluten-free because rice, beans, and vegetables are naturally gluten-free ingredients that are safe for people with celiac disease.'
 }
@@ -418,15 +427,15 @@ console.log('Reason:', result.reason);
 ## Partial gluten example
-```typescript filename="src/example-partial-gluten.ts" showLineNumbers copy
+```typescript title="src/example-partial-gluten.ts" showLineNumbers copy
 const result = await glutenCheckerScorer.run({
-  input: [{ role: 'user', content: 'Mix flour and water to make dough' }],
-  output: { text: 'Mix flour and water to make dough' },
+  input: [{ role: "user", content: "Mix flour and water to make dough" }],
+  output: { text: "Mix flour and water to make dough" },
 });
-console.log('Score:', result.score);
-console.log('Gluten sources:', result.analyzeStepResult.glutenSources);
-console.log('Reason:', result.reason);
+console.log("Score:", result.score);
+console.log("Gluten sources:", result.analyzeStepResult.glutenSources);
+console.log("Reason:", result.reason);
 ```
 ### Partial gluten output
@@ -434,9 +443,9 @@ console.log('Reason:', result.reason);
 ```typescript
 {
   score: 0,
-  analyzeStepResult: {
+  analyzeStepResult: {
     isGlutenFree: false,
-    glutenSources: ['flour']
+    glutenSources: ['flour']
   },
   reason: 'This recipe is not gluten-free because it contains flour. Regular flour is made from wheat and contains gluten, making it unsafe for people with celiac disease or gluten sensitivity.'
 }
@@ -444,15 +453,15 @@ console.log('Reason:', result.reason);
 ## Low gluten-free example
-```typescript filename="src/example-low-gluten-free.ts" showLineNumbers copy
+```typescript title="src/example-low-gluten-free.ts" showLineNumbers copy
 const result = await glutenCheckerScorer.run({
-  input: [{ role: 'user', content: 'Add soy sauce and noodles' }],
-  output: { text: 'Add soy sauce and noodles' },
+  input: [{ role: "user", content: "Add soy sauce and noodles" }],
+  output: { text: "Add soy sauce and noodles" },
 });
-console.log('Score:', result.score);
-console.log('Gluten sources:', result.analyzeStepResult.glutenSources);
-console.log('Reason:', result.reason);
+console.log("Score:", result.score);
+console.log("Gluten sources:", result.analyzeStepResult.glutenSources);
+console.log("Reason:", result.reason);
 ```
 ### Low gluten-free output
@@ -460,14 +469,15 @@ console.log('Reason:', result.reason);
 ```typescript
 {
   score: 0,
-  analyzeStepResult: {
+  analyzeStepResult: {
     isGlutenFree: false,
-    glutenSources: ['soy sauce', 'noodles']
+    glutenSources: ['soy sauce', 'noodles']
   },
   reason: 'This recipe is not gluten-free because it contains soy sauce, noodles. Regular soy sauce contains wheat and most noodles are made from wheat flour, both of which contain gluten and are unsafe for people with gluten sensitivity.'
 }
 ```
 **Examples and Resources:**
 - [createScorer API Reference](/reference/scorers/create-scorer) - Complete technical documentation
 - [Built-in Scorers Source Code](https://github.com/mastra-ai/mastra/tree/main/packages/evals/src/scorers) - Real implementations for reference
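
The hunks in this file show the scorer steps (preprocess, analyze, generateScore, generateReason) only in fragments. The stand-alone sketch below chains the function-based variants of those steps as plain TypeScript, so the data flow between them is visible in one place. It deliberately does not call `createScorer`; the function names and the `scoreRecipe` wrapper are hypothetical, and only the keyword list and the 1/0 scoring rule are taken from the hunks above.

```typescript
// Assumption: a plain-function illustration of the four steps, not Mastra's API.
const GLUTEN_KEYWORDS = ["wheat", "flour", "barley", "rye", "bread"];

interface Analysis {
  isGlutenFree: boolean;
  detectedGlutenSources: string[];
}

// Step 1: normalise the recipe text.
function preprocess(text: string): { recipeText: string; wordCount: number } {
  const recipeText = text.toLowerCase();
  return { recipeText, wordCount: recipeText.split(" ").length };
}

// Step 2: simple substring-based gluten detection.
function analyze(recipeText: string): Analysis {
  const found = GLUTEN_KEYWORDS.filter((word) => recipeText.includes(word));
  return { isGlutenFree: found.length === 0, detectedGlutenSources: found };
}

// Step 3: 1 for gluten-free, 0 otherwise.
function generateScore(analysis: Analysis): number {
  return analysis.isGlutenFree ? 1 : 0;
}

// Step 4: a human-readable explanation of the score.
function generateReason(analysis: Analysis, score: number): string {
  return analysis.isGlutenFree
    ? `Score: ${score}. No gluten keywords were detected.`
    : `Score: ${score}. Contains gluten from: ${analysis.detectedGlutenSources.join(", ")}`;
}

// Hypothetical wrapper that runs the steps in order.
function scoreRecipe(text: string) {
  const { recipeText } = preprocess(text);
  const analysis = analyze(recipeText);
  const score = generateScore(analysis);
  return { score, analysis, reason: generateReason(analysis, score) };
}
```

Substring matching like this only catches the listed keywords. It would miss hidden sources such as soy sauce, which the prompt-object variant in the diff asks the judge model to look for.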

package/.docs/raw/scorers/evals-old-api/custom-eval.mdx CHANGED

@@ -1,13 +1,13 @@
 ---
-title: "Create a custom eval"
+title: "Create a Custom Eval | Scorers | Mastra Docs"
 description: "Mastra allows you to create your own evals, here is how."
 ---
-import { ScorerCallout } from '@/components/scorer-callout'
 # Create a Custom Eval
-<ScorerCallout />
+:::info Scorers
+This documentation refers to the legacy evals API. For the latest scorer features, see [Scorers](/docs/scorers/overview).
+:::
 Create a custom eval by extending the `Metric` class and implementing the `measure` method. This gives you full control over how scores are calculated and what information is returned. For LLM-based evaluations, extend the `MastraAgentJudge` class to define how the model reasons and scores output.
@@ -15,12 +15,10 @@ Create a custom eval by extending the `Metric` class and implementing the `measu
 You can write lightweight custom metrics using plain JavaScript/TypeScript. These are ideal for simple string comparisons, pattern checks, or other rule-based logic.
-See our [Word Inclusion example](/examples/evals/custom-native-javascript-eval.mdx), which scores responses based on the number of reference words found in the output.
+See our [Word Inclusion example](/examples/evals/custom-native-javascript-eval), which scores responses based on the number of reference words found in the output.
 ## LLM as a judge evaluation
 For more complex evaluations, you can build a judge powered by an LLM. This lets you capture more nuanced criteria, like factual accuracy, tone, or reasoning.
-See the [Real World Countries example](/examples/evals/custom-llm-judge-eval.mdx) for a complete walkthrough of building a custom judge and metric that evaluates real-world factual accuracy.
+See the [Real World Countries example](/examples/evals/custom-llm-judge-eval) for a complete walkthrough of building a custom judge and metric that evaluates real-world factual accuracy.

package/.docs/raw/scorers/evals-old-api/overview.mdx CHANGED

@@ -1,13 +1,13 @@
 ---
-title: "Overview"
+title: "Testing your agents with evals | Scorers | Mastra Docs"
 description: "Understanding how to evaluate and measure AI agent quality using Mastra evals."
 ---
-import { ScorerCallout } from '@/components/scorer-callout'
 # Testing your agents with evals
-<ScorerCallout />
+:::info Scorers
+This documentation refers to the legacy evals API. For the latest scorer features, see [Scorers](/docs/scorers/overview).
+:::
 While traditional software tests have clear pass/fail conditions, AI outputs are non-deterministic — they can vary with the same input. Evals help bridge this gap by providing quantifiable metrics for measuring agent quality.
@@ -35,7 +35,7 @@ npm install @mastra/evals@latest
 Evals need to be added to an agent. Here's an example using the summarization, content similarity, and tone consistency metrics:
-```typescript copy showLineNumbers filename="src/mastra/agents/index.ts"
+```typescript copy showLineNumbers title="src/mastra/agents/index.ts"
 import { Agent } from "@mastra/core/agent";
 import { openai } from "@ai-sdk/openai";
 import { SummarizationMetric } from "@mastra/evals/llm";
@@ -99,8 +99,8 @@ Once you're hitting your targets:
 3. Test edge cases - Add examples that cover unusual scenarios
 4. Fine-tune - Look for ways to improve efficiency
-See [Textual Evals](/docs/evals/textual-evals) for more info on what evals can do.
+See [Textual Evals](/docs/scorers/evals-old-api/textual-evals) for more info on what evals can do.
-For more info on how to create your own evals, see the [Custom Evals](/docs/evals/custom-eval) guide.
+For more info on how to create your own evals, see the [Custom Evals](/docs/scorers/evals-old-api/custom-eval) guide.
-For running evals in your CI pipeline, see the [Running in CI](/docs/evals/running-in-ci) guide.
+For running evals in your CI pipeline, see the [Running in CI](/docs/scorers/evals-old-api/running-in-ci) guide.

package/.docs/raw/scorers/evals-old-api/running-in-ci.mdx CHANGED

@@ -1,13 +1,13 @@
 ---
-title: "Running in CI"
+title: "Running Evals in CI | Scorers | Mastra Docs"
 description: "Learn how to run Mastra evals in your CI/CD pipeline to monitor agent quality over time."
 ---
-import { ScorerCallout } from '@/components/scorer-callout'
 # Running Evals in CI
-<ScorerCallout />
+:::info Scorers
+This documentation refers to the legacy evals API. For the latest scorer features, see [Scorers](/docs/scorers/overview).
+:::
 Running evals in your CI pipeline helps bridge this gap by providing quantifiable metrics for measuring agent quality over time.
@@ -15,7 +15,7 @@ Running evals in your CI pipeline helps bridge this gap by providing quantifiabl
 We support any testing framework that supports ESM modules. For example, you can use [Vitest](https://vitest.dev/), [Jest](https://jestjs.io/) or [Mocha](https://mochajs.org/) to run evals in your CI/CD pipeline.
-```typescript copy showLineNumbers filename="src/mastra/agents/index.test.ts"
+```typescript copy showLineNumbers title="src/mastra/agents/index.test.ts"
 import { describe, it, expect } from "vitest";
 import { evaluate } from "@mastra/evals";
 import { ToneConsistencyMetric } from "@mastra/evals/nlp";
@@ -39,7 +39,7 @@ You will need to configure a testSetup and globalSetup script for your testing f
 Add these files to your project to run evals in your CI/CD pipeline:
-```typescript copy showLineNumbers filename="globalSetup.ts"
+```typescript copy showLineNumbers title="globalSetup.ts"
 import { globalSetup } from "@mastra/evals";
 export default function setup() {
@@ -47,7 +47,7 @@ export default function setup() {
 }
 ```
-```typescript copy showLineNumbers filename="testSetup.ts"
+```typescript copy showLineNumbers title="testSetup.ts"
 import { beforeAll } from "vitest";
 import { attachListeners } from "@mastra/evals";
@@ -56,7 +56,7 @@ beforeAll(async () => {
 });
 ```
-```typescript copy showLineNumbers filename="vitest.config.ts"
+```typescript copy showLineNumbers title="vitest.config.ts"
 import { defineConfig } from "vitest/config";
 export default defineConfig({
@@ -71,7 +71,7 @@ export default defineConfig({
 To store eval results in Mastra Storage and capture results in the Mastra dashboard:
-```typescript copy showLineNumbers filename="testSetup.ts"
+```typescript copy showLineNumbers title="testSetup.ts"
 import { beforeAll } from "vitest";
 import { attachListeners } from "@mastra/evals";
 import { mastra } from "./your-mastra-setup";

package/.docs/raw/scorers/evals-old-api/textual-evals.mdx CHANGED

@@ -1,19 +1,19 @@
 ---
-title: "Textual Evals"
+title: "Textual Evals | Scorers | Mastra Docs"
 description: "Understand how Mastra uses LLM-as-judge methodology to evaluate text quality."
 ---
-import { ScorerCallout } from '@/components/scorer-callout'
 # Textual Evals
-<ScorerCallout />
+:::info Scorers
+This documentation refers to the legacy evals API. For the latest scorer features, see [Scorers](/docs/scorers/overview).
+:::
 Textual evals use an LLM-as-judge methodology to evaluate agent outputs. This approach leverages language models to assess various aspects of text quality, similar to how a teaching assistant might grade assignments using a rubric.
 Each eval focuses on specific quality aspects and returns a score between 0 and 1, providing quantifiable metrics for non-deterministic AI outputs.
-Mastra provides several eval metrics for assessing Agent outputs. Mastra is not limited to these metrics, and you can also [define your own evals](/docs/evals/custom-eval).
+Mastra provides several eval metrics for assessing Agent outputs. Mastra is not limited to these metrics, and you can also [define your own evals](/docs/scorers/evals-old-api/custom-eval).
 ## Why Use Textual Evals?

package/.docs/raw/scorers/off-the-shelf-scorers.mdx CHANGED

@@ -1,5 +1,5 @@
 ---
-title: "Built-in Scorers"
+title: "Built-in Scorers | Scorers | Mastra Docs"
 description: "Overview of Mastra's ready-to-use scorers for evaluating AI outputs across quality, safety, and performance dimensions."
 ---
@@ -31,10 +31,12 @@ These scorers evaluate the quality and relevance of context used in generating r
 - [`context-relevance`](/reference/scorers/context-relevance): Measures context utility with nuanced relevance levels, usage tracking, and missing context detection (`0-1`, higher is better)
 > tip Context Scorer Selection
 - Use **Context Precision** when context ordering matters and you need standard IR metrics (ideal for RAG ranking evaluation)
 - Use **Context Relevance** when you need detailed relevance assessment and want to track context usage and identify gaps
 Both context scorers support:
 - **Static context**: Pre-defined context arrays
 - **Dynamic context extraction**: Extract context from runs using custom functions (ideal for RAG systems, vector databases, etc.)

package/.docs/raw/scorers/overview.mdx CHANGED

@@ -1,10 +1,8 @@
 ---
-title: "Overview"
+title: "Scorers overview | Scorers | Mastra Docs"
 description: Overview of scorers in Mastra, detailing their capabilities for evaluating AI outputs and measuring performance.
 ---
-import { Callout } from "nextra/components";
 # Scorers overview
 While traditional software tests have clear pass/fail conditions, AI outputs are non-deterministic — they can vary with the same input. **Scorers** help bridge this gap by providing quantifiable metrics for measuring agent quality.
@@ -37,12 +35,12 @@ npm install @mastra/evals@latest
 You can add built-in scorers to your agents to automatically evaluate their outputs. See the [full list of built-in scorers](/docs/scorers/off-the-shelf-scorers) for all available options.
-```typescript filename="src/mastra/agents/evaluated-agent.ts" showLineNumbers copy
+```typescript title="src/mastra/agents/evaluated-agent.ts" showLineNumbers copy
 import { Agent } from "@mastra/core/agent";
 import { openai } from "@ai-sdk/openai";
-import {
+import {
   createAnswerRelevancyScorer,
-  createToxicityScorer
+  createToxicityScorer,
 } from "@mastra/evals/scorers/llm";
 export const evaluatedAgent = new Agent({
@@ -50,13 +48,13 @@ export const evaluatedAgent = new Agent({
   scorers: {
     relevancy: {
       scorer: createAnswerRelevancyScorer({ model: openai("gpt-4o-mini") }),
-      sampling: { type: "ratio", rate: 0.5 }
+      sampling: { type: "ratio", rate: 0.5 },
     },
     safety: {
       scorer: createToxicityScorer({ model: openai("gpt-4o-mini") }),
-      sampling: { type: "ratio", rate: 1 }
-    }
-  }
+      sampling: { type: "ratio", rate: 1 },
+    },
+  },
 });
 ```
@@ -64,7 +62,7 @@ export const evaluatedAgent = new Agent({
 You can also add scorers to individual workflow steps to evaluate outputs at specific points in your process:
-```typescript filename="src/mastra/workflows/content-generation.ts" showLineNumbers copy
+```typescript title="src/mastra/workflows/content-generation.ts" showLineNumbers copy
 import { createWorkflow, createStep } from "@mastra/core/workflows";
 import { z } from "zod";
 import { customStepScorer } from "../scorers/custom-step-scorer";
@@ -92,8 +90,9 @@ export const contentWorkflow = createWorkflow({ ... })
 **Asynchronous execution**: Live evaluations run in the background without blocking your agent responses or workflow execution. This ensures your AI systems maintain their performance while still being monitored.
 **Sampling control**: The `sampling.rate` parameter (0-1) controls what percentage of outputs get scored:
 - `1.0`: Score every single response (100%)
-- `0.5`: Score half of all responses (50%)
+- `0.5`: Score half of all responses (50%)
 - `0.1`: Score 10% of responses
 - `0.0`: Disable scoring
@@ -103,11 +102,13 @@ export const contentWorkflow = createWorkflow({ ... })
 In addition to live evaluations, you can use scorers to evaluate historical traces from your agent interactions and workflows. This is particularly useful for analyzing past performance, debugging issues, or running batch evaluations.
-<Callout type="info">
+:::info
 **Observability Required**
-To score traces, you must first configure observability in your Mastra instance to collect trace data. See [AI Tracing documentation](../observability/ai-tracing) for setup instructions.
-</Callout>
+To score traces, you must first configure observability in your Mastra instance to collect trace data. See [AI Tracing documentation](../observability/ai-tracing/overview) for setup instructions.
+:::
 ### Scoring traces with the playground
@@ -118,8 +119,8 @@ const mastra = new Mastra({
   // ...
   scorers: {
     answerRelevancy: myAnswerRelevancyScorer,
-    responseQuality: myResponseQualityScorer
-  }
+    responseQuality: myResponseQualityScorer,
+  },
 });
 ```
@@ -129,10 +130,10 @@ Once registered, you can score traces interactively within the Mastra playground
 Mastra provides a CLI command `mastra dev` to test your scorers. The playground includes a scorers section where you can run individual scorers against test inputs and view detailed results.
-For more details, see the [Local Dev Playground](/docs/server-db/local-dev-playground) docs.
+For more details, see the [Local Dev Playground](/docs/getting-started/studio) docs.
 ## Next steps
 - Learn how to create your own scorers in the [Creating Custom Scorers](/docs/scorers/custom-scorers) guide
 - Explore built-in scorers in the [Off-the-shelf Scorers](/docs/scorers/off-the-shelf-scorers) section
-- Test scorers with the [Local Dev Playground](/docs/server-db/local-dev-playground)
+- Test scorers with the [Local Dev Playground](/docs/getting-started/studio)
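
The sampling-control hunk in this file describes `sampling.rate` as the fraction of outputs that get scored. The sketch below shows one plausible way a ratio sampler can make that decision per response. It is an assumption for illustration, not Mastra's internal implementation; `RatioSampling` and `shouldScore` are hypothetical names that mirror the `{ type: "ratio", rate }` shape used in the agent example.

```typescript
// Assumption: mirrors the `sampling` config shape shown in the diff.
type RatioSampling = { type: "ratio"; rate: number };

// Decide whether a single response should be scored.
// `random` is injectable so the behaviour can be tested deterministically.
function shouldScore(
  sampling: RatioSampling,
  random: () => number = Math.random,
): boolean {
  if (sampling.rate <= 0) return false; // 0.0 disables scoring
  if (sampling.rate >= 1) return true; // 1.0 scores every response
  return random() < sampling.rate; // otherwise score roughly rate * 100 percent
}
```

With `rate: 0.5`, about half of the responses would be scored over many calls; any individual decision is random.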