@mastra/evals 1.0.0 → 1.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +18 -0
- package/dist/docs/README.md +1 -1
- package/dist/docs/SKILL.md +1 -1
- package/dist/docs/SOURCE_MAP.json +1 -1
- package/dist/docs/evals/01-overview.md +7 -7
- package/dist/docs/evals/02-built-in-scorers.md +16 -16
- package/dist/docs/evals/03-reference.md +39 -39
- package/package.json +6 -6
package/CHANGELOG.md
CHANGED

@@ -1,5 +1,23 @@
 # @mastra/evals
 
+## 1.0.1
+
+### Patch Changes
+
+- Force alpha version bump for @mastra/evals, @mastra/loggers, @mastra/observability, and @mastra/memory ([#12505](https://github.com/mastra-ai/mastra/pull/12505))
+
+- Updated dependencies [[`90fc0e5`](https://github.com/mastra-ai/mastra/commit/90fc0e5717cb280c2d4acf4f0410b510bb4c0a72), [`1cf5d2e`](https://github.com/mastra-ai/mastra/commit/1cf5d2ea1b085be23e34fb506c80c80a4e6d9c2b), [`b99ceac`](https://github.com/mastra-ai/mastra/commit/b99ceace2c830dbdef47c8692d56a91954aefea2), [`deea43e`](https://github.com/mastra-ai/mastra/commit/deea43eb1366d03a864c5e597d16a48592b9893f), [`833ae96`](https://github.com/mastra-ai/mastra/commit/833ae96c3e34370e58a1e979571c41f39a720592), [`943772b`](https://github.com/mastra-ai/mastra/commit/943772b4378f625f0f4e19ea2b7c392bd8e71786), [`b5c711b`](https://github.com/mastra-ai/mastra/commit/b5c711b281dd1fb81a399a766bc9f86c55efc13e), [`3efbe5a`](https://github.com/mastra-ai/mastra/commit/3efbe5ae20864c4f3143457f4f3ee7dc2fa5ca76), [`1e49e7a`](https://github.com/mastra-ai/mastra/commit/1e49e7ab5f173582154cb26b29d424de67d09aef), [`751eaab`](https://github.com/mastra-ai/mastra/commit/751eaab4e0d3820a94e4c3d39a2ff2663ded3d91), [`69d8156`](https://github.com/mastra-ai/mastra/commit/69d81568bcf062557c24471ce26812446bec465d), [`60d9d89`](https://github.com/mastra-ai/mastra/commit/60d9d899e44b35bc43f1bcd967a74e0ce010b1af), [`5c544c8`](https://github.com/mastra-ai/mastra/commit/5c544c8d12b08ab40d64d8f37b3c4215bee95b87), [`771ad96`](https://github.com/mastra-ai/mastra/commit/771ad962441996b5c43549391a3e6a02c6ddedc2), [`2b0936b`](https://github.com/mastra-ai/mastra/commit/2b0936b0c9a43eeed9bef63e614d7e02ee803f7e), [`3b04f30`](https://github.com/mastra-ai/mastra/commit/3b04f3010604f3cdfc8a0674731700ad66471cee), [`97e26de`](https://github.com/mastra-ai/mastra/commit/97e26deaebd9836647a67b96423281d66421ca07), [`ac9ec66`](https://github.com/mastra-ai/mastra/commit/ac9ec6672779b2e6d4344e415481d1a6a7d4911a), [`10523f4`](https://github.com/mastra-ai/mastra/commit/10523f4882d9b874b40ce6e3715f66dbcd4947d2), [`cb72d20`](https://github.com/mastra-ai/mastra/commit/cb72d2069d7339bda8a0e76d4f35615debb07b84), [`42856b1`](https://github.com/mastra-ai/mastra/commit/42856b1c8aeea6371c9ee77ae2f5f5fe34400933), [`66f33ff`](https://github.com/mastra-ai/mastra/commit/66f33ff68620018513e499c394411d1d39b3aa5c), [`ab3c190`](https://github.com/mastra-ai/mastra/commit/ab3c1901980a99910ca9b96a7090c22e24060113), [`d4f06c8`](https://github.com/mastra-ai/mastra/commit/d4f06c85ffa5bb0da38fb82ebf3b040cc6b4ec4e), [`0350626`](https://github.com/mastra-ai/mastra/commit/03506267ec41b67add80d994c0c0fcce93bbc75f), [`bc9fa00`](https://github.com/mastra-ai/mastra/commit/bc9fa00859c5c4a796d53a0a5cae46ab4a3072e4), [`f46a478`](https://github.com/mastra-ai/mastra/commit/f46a4782f595949c696569e891f81c8d26338508), [`90fc0e5`](https://github.com/mastra-ai/mastra/commit/90fc0e5717cb280c2d4acf4f0410b510bb4c0a72), [`f05a3a5`](https://github.com/mastra-ai/mastra/commit/f05a3a5cf2b9a9c2d40c09cb8c762a4b6cd5d565), [`a291da9`](https://github.com/mastra-ai/mastra/commit/a291da9363efd92dafd8775dccb4f2d0511ece7a), [`c5d71da`](https://github.com/mastra-ai/mastra/commit/c5d71da1c680ce5640b1a7f8ca0e024a4ab1cfed), [`07042f9`](https://github.com/mastra-ai/mastra/commit/07042f9f89080f38b8f72713ba1c972d5b1905b8), [`0423442`](https://github.com/mastra-ai/mastra/commit/0423442b7be2dfacba95890bea8f4a810db4d603)]:
+  - @mastra/core@1.1.0
+
+## 1.0.1-alpha.0
+
+### Patch Changes
+
+- Force alpha version bump for @mastra/evals, @mastra/loggers, @mastra/observability, and @mastra/memory ([#12505](https://github.com/mastra-ai/mastra/pull/12505))
+
+- Updated dependencies:
+  - @mastra/core@1.1.0-alpha.2
+
 ## 1.0.0
 
 ### Major Changes
package/dist/docs/README.md
CHANGED
package/dist/docs/SKILL.md
CHANGED
package/dist/docs/SOURCE_MAP.json
CHANGED
package/dist/docs/evals/01-overview.md
CHANGED

@@ -20,8 +20,8 @@ There are different kinds of scorers, each serving a specific purpose. Here are
 
 To access Mastra's scorers feature install the `@mastra/evals` package.
 
-```bash
-npm install @mastra/evals@
+```bash npm2yarn
+npm install @mastra/evals@latest
 ```
 
 ## Live evaluations

@@ -30,7 +30,7 @@ npm install @mastra/evals@beta
 
 ### Adding scorers to agents
 
-You can add built-in scorers to your agents to automatically evaluate their outputs. See the [full list of built-in scorers](https://mastra.ai/docs/
+You can add built-in scorers to your agents to automatically evaluate their outputs. See the [full list of built-in scorers](https://mastra.ai/docs/evals/built-in-scorers) for all available options.
 
 ```typescript title="src/mastra/agents/evaluated-agent.ts"
 import { Agent } from "@mastra/core/agent";

@@ -121,10 +121,10 @@ Once registered, you can score traces interactively within Studio under the Obse
 
 Mastra provides a CLI command `mastra dev` to test your scorers. Studio includes a scorers section where you can run individual scorers against test inputs and view detailed results.
 
-For more details, see [Studio](https://mastra.ai/docs/
+For more details, see [Studio](https://mastra.ai/docs/getting-started/studio) docs.
 
 ## Next steps
 
-- Learn how to create your own scorers in the [Creating Custom Scorers](https://mastra.ai/docs/
-- Explore built-in scorers in the [Built-in Scorers](https://mastra.ai/docs/
-- Test scorers with [Studio](https://mastra.ai/docs/
+- Learn how to create your own scorers in the [Creating Custom Scorers](https://mastra.ai/docs/evals/custom-scorers) guide
+- Explore built-in scorers in the [Built-in Scorers](https://mastra.ai/docs/evals/built-in-scorers) section
+- Test scorers with [Studio](https://mastra.ai/docs/getting-started/studio)
package/dist/docs/evals/02-built-in-scorers.md
CHANGED

@@ -4,7 +4,7 @@
 
 Mastra provides a comprehensive set of built-in scorers for evaluating AI outputs. These scorers are optimized for common evaluation scenarios and are ready to use in your agents and workflows.
 
-To create your own scorers, see the [Custom Scorers](https://mastra.ai/docs/
+To create your own scorers, see the [Custom Scorers](https://mastra.ai/docs/evals/custom-scorers) guide.
 
 ## Available scorers
 

@@ -12,22 +12,22 @@ To create your own scorers, see the [Custom Scorers](https://mastra.ai/docs/v1/e
 
 These scorers evaluate how correct, truthful, and complete your agent's answers are:
 
-- [`answer-relevancy`](https://mastra.ai/reference/
-- [`answer-similarity`](https://mastra.ai/reference/
-- [`faithfulness`](https://mastra.ai/reference/
-- [`hallucination`](https://mastra.ai/reference/
-- [`completeness`](https://mastra.ai/reference/
-- [`content-similarity`](https://mastra.ai/reference/
-- [`textual-difference`](https://mastra.ai/reference/
-- [`tool-call-accuracy`](https://mastra.ai/reference/
-- [`prompt-alignment`](https://mastra.ai/reference/
+- [`answer-relevancy`](https://mastra.ai/reference/evals/answer-relevancy): Evaluates how well responses address the input query (`0-1`, higher is better)
+- [`answer-similarity`](https://mastra.ai/reference/evals/answer-similarity): Compares agent outputs against ground-truth answers for CI/CD testing using semantic analysis (`0-1`, higher is better)
+- [`faithfulness`](https://mastra.ai/reference/evals/faithfulness): Measures how accurately responses represent provided context (`0-1`, higher is better)
+- [`hallucination`](https://mastra.ai/reference/evals/hallucination): Detects factual contradictions and unsupported claims (`0-1`, lower is better)
+- [`completeness`](https://mastra.ai/reference/evals/completeness): Checks if responses include all necessary information (`0-1`, higher is better)
+- [`content-similarity`](https://mastra.ai/reference/evals/content-similarity): Measures textual similarity using character-level matching (`0-1`, higher is better)
+- [`textual-difference`](https://mastra.ai/reference/evals/textual-difference): Measures textual differences between strings (`0-1`, higher means more similar)
+- [`tool-call-accuracy`](https://mastra.ai/reference/evals/tool-call-accuracy): Evaluates whether the LLM selects the correct tool from available options (`0-1`, higher is better)
+- [`prompt-alignment`](https://mastra.ai/reference/evals/prompt-alignment): Measures how well agent responses align with user prompt intent, requirements, completeness, and format (`0-1`, higher is better)
 
 ### Context quality
 
 These scorers evaluate the quality and relevance of context used in generating responses:
 
-- [`context-precision`](https://mastra.ai/reference/
-- [`context-relevance`](https://mastra.ai/reference/
+- [`context-precision`](https://mastra.ai/reference/evals/context-precision): Evaluates context relevance and ranking using Mean Average Precision, rewarding early placement of relevant context (`0-1`, higher is better)
+- [`context-relevance`](https://mastra.ai/reference/evals/context-relevance): Measures context utility with nuanced relevance levels, usage tracking, and missing context detection (`0-1`, higher is better)
 
 > tip Context Scorer Selection
 >

@@ -43,7 +43,7 @@ These scorers evaluate the quality and relevance of context used in generating r
 
 These scorers evaluate adherence to format, style, and safety requirements:
 
-- [`tone-consistency`](https://mastra.ai/reference/
-- [`toxicity`](https://mastra.ai/reference/
-- [`bias`](https://mastra.ai/reference/
-- [`keyword-coverage`](https://mastra.ai/reference/
+- [`tone-consistency`](https://mastra.ai/reference/evals/tone-consistency): Measures consistency in formality, complexity, and style (`0-1`, higher is better)
+- [`toxicity`](https://mastra.ai/reference/evals/toxicity): Detects harmful or inappropriate content (`0-1`, lower is better)
+- [`bias`](https://mastra.ai/reference/evals/bias): Detects potential biases in the output (`0-1`, lower is better)
+- [`keyword-coverage`](https://mastra.ai/reference/evals/keyword-coverage): Assesses technical terminology usage (`0-1`, higher is better)
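Each scorer above reports a normalized `0-1` score together with a direction (higher or lower is better). As a self-contained, hypothetical sketch (not part of the `@mastra/evals` API), a quality gate over a scores object like the `result.scores` shown in these docs could account for that direction:

```typescript
// Hypothetical direction-aware gate over scorer results; names and the
// `passes` helper are illustrative, not an @mastra/evals export.
// For "lower is better" scorers (hallucination, toxicity, bias), a score
// at or below (1 - threshold) passes; all others pass at or above threshold.
const LOWER_IS_BETTER = new Set(["hallucination", "toxicity", "bias"]);

function passes(scores: Record<string, number>, threshold = 0.8): boolean {
  return Object.entries(scores).every(([name, score]) =>
    LOWER_IS_BETTER.has(name) ? score <= 1 - threshold : score >= threshold
  );
}

console.log(passes({ faithfulness: 0.9, toxicity: 0.05 })); // true
console.log(passes({ faithfulness: 0.6, toxicity: 0.05 })); // false
```

A CI step could fail the build when such a gate returns `false`, mirroring the answer-similarity scorer's CI/CD use case.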
package/dist/docs/evals/03-reference.md
CHANGED

@@ -79,9 +79,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview) guide.
 
 ## Related
 

@@ -153,9 +153,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ---
 

@@ -249,9 +249,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ## Related
 

@@ -381,9 +381,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ## Related
 

@@ -462,9 +462,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ### Score interpretation
 

@@ -654,9 +654,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ## Comparison with Context Relevance
 

@@ -672,9 +672,9 @@ Choose the right scorer for your needs:
 
 ## Related
 
-- [Answer Relevancy Scorer](https://mastra.ai/reference/
-- [Faithfulness Scorer](https://mastra.ai/reference/
-- [Custom Scorers](https://mastra.ai/docs/
+- [Answer Relevancy Scorer](https://mastra.ai/reference/evals/answer-relevancy) - Evaluates if answers address the question
+- [Faithfulness Scorer](https://mastra.ai/reference/evals/faithfulness) - Measures answer groundedness in context
+- [Custom Scorers](https://mastra.ai/docs/evals/custom-scorers) - Creating your own evaluation metrics
 
 ---
 

@@ -1203,9 +1203,9 @@ Choose the right scorer for your needs:
 
 ## Related
 
-- [Context Precision Scorer](https://mastra.ai/reference/
-- [Faithfulness Scorer](https://mastra.ai/reference/
-- [Custom Scorers](https://mastra.ai/docs/
+- [Context Precision Scorer](https://mastra.ai/reference/evals/context-precision) - Evaluates context ranking using MAP
+- [Faithfulness Scorer](https://mastra.ai/reference/evals/faithfulness) - Measures answer groundedness in context
+- [Custom Scorers](https://mastra.ai/docs/evals/custom-scorers) - Creating your own evaluation metrics
 
 ---
 

@@ -1289,9 +1289,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ## Related
 

@@ -1397,9 +1397,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ## Related
 

@@ -1517,9 +1517,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ## Related
 

@@ -2205,10 +2205,10 @@ jobs:
 
 ## Related
 
-- [Scorers Overview](https://mastra.ai/docs/
-- [Hallucination Scorer](https://mastra.ai/reference/
-- [Answer Relevancy Scorer](https://mastra.ai/reference/
-- [Custom Scorers](https://mastra.ai/docs/
+- [Scorers Overview](https://mastra.ai/docs/evals/overview) - Setting up scorer pipelines
+- [Hallucination Scorer](https://mastra.ai/reference/evals/hallucination) - Evaluates fabricated content
+- [Answer Relevancy Scorer](https://mastra.ai/reference/evals/answer-relevancy) - Measures response focus
+- [Custom Scorers](https://mastra.ai/docs/evals/custom-scorers) - Creating your own evaluation metrics
 
 ---
 

@@ -2821,10 +2821,10 @@ const result = await scorer.run({
 
 ## Related
 
-- [Answer Relevancy Scorer](https://mastra.ai/reference/
-- [Faithfulness Scorer](https://mastra.ai/reference/
-- [Tool Call Accuracy Scorer](https://mastra.ai/reference/
-- [Custom Scorers](https://mastra.ai/docs/
+- [Answer Relevancy Scorer](https://mastra.ai/reference/evals/answer-relevancy) - Evaluates query-response relevance
+- [Faithfulness Scorer](https://mastra.ai/reference/evals/faithfulness) - Measures context groundedness
+- [Tool Call Accuracy Scorer](https://mastra.ai/reference/evals/tool-call-accuracy) - Evaluates tool selection
+- [Custom Scorers](https://mastra.ai/docs/evals/custom-scorers) - Creating your own evaluation metrics
 
 ---
 

@@ -3253,9 +3253,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ## Related
 

@@ -3371,9 +3371,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ## Related
 

@@ -3906,7 +3906,7 @@ console.log("LLM Reason:", llmResult.reason); // Explains why search-tool is les
 - [Answer Relevancy Scorer](./answer-relevancy)
 - [Completeness Scorer](./completeness)
 - [Faithfulness Scorer](./faithfulness)
-- [Custom Scorers](https://mastra.ai/docs/
+- [Custom Scorers](https://mastra.ai/docs/evals/custom-scorers)
 
 ---
 

@@ -4008,9 +4008,9 @@ const result = await runEvals({
 console.log(result.scores);
 ```
 
-For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/
+For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).
 
-To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/
+To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview#adding-scorers-to-agents) guide.
 
 ## Related
 
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@mastra/evals",
|
|
3
|
-
"version": "1.0.
|
|
3
|
+
"version": "1.0.1",
|
|
4
4
|
"description": "",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"files": [
|
|
@@ -70,15 +70,15 @@
|
|
|
70
70
|
"@types/string-similarity": "^4.0.2",
|
|
71
71
|
"@vitest/coverage-v8": "4.0.12",
|
|
72
72
|
"@vitest/ui": "4.0.12",
|
|
73
|
-
"dotenv": "^17.
|
|
73
|
+
"dotenv": "^17.2.3",
|
|
74
74
|
"eslint": "^9.37.0",
|
|
75
|
-
"tsup": "^8.5.
|
|
75
|
+
"tsup": "^8.5.1",
|
|
76
76
|
"typescript": "^5.9.3",
|
|
77
77
|
"vitest": "4.0.16",
|
|
78
78
|
"zod": "^3.25.76",
|
|
79
|
-
"@internal/lint": "0.0.
|
|
80
|
-
"@internal/types-builder": "0.0.
|
|
81
|
-
"@mastra/core": "1.
|
|
79
|
+
"@internal/lint": "0.0.56",
|
|
80
|
+
"@internal/types-builder": "0.0.31",
|
|
81
|
+
"@mastra/core": "1.1.0"
|
|
82
82
|
},
|
|
83
83
|
"engines": {
|
|
84
84
|
"node": ">=22.13.0"
|