@agentv/sdk 4.42.1-next.1 → 4.42.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/README.md +43 -1
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -1,6 +1,6 @@
  # @agentv/sdk

- Evaluation SDK for AgentV - build YAML-aligned eval suites, custom graders, and prompt templates around the canonical AgentV eval model.
+ Public lightweight SDK for AgentV - build YAML-aligned eval suites, custom graders, and prompt templates around the canonical AgentV eval model.

  ## Installation

@@ -8,6 +8,21 @@ Evaluation SDK for AgentV - build YAML-aligned eval suites, custom graders, and
  npm install @agentv/sdk
  ```

+ ## Migrating from `@agentv/eval`
+
+ Use `@agentv/sdk` for new code:
+
+ ```bash
+ npm uninstall @agentv/eval
+ npm install @agentv/sdk
+ ```
+
+ ```typescript
+ import { defineCodeGrader } from '@agentv/sdk';
+ ```
+
+ `@agentv/eval` remains only as a temporary deprecated compatibility package that re-exports this SDK for existing consumers. New docs, examples, scaffolds, and skills should not import from it.
+
  ## Quick Start

  ### defineAssertion (simplest way)
@@ -97,6 +112,33 @@ export default defineEval({

  The helpers return ordinary `assertions` entries such as `type: contains`, `type: llm-grader`, and `type: code-grader`. CamelCase SDK options such as `minScore` and `maxSteps` lower to canonical YAML keys such as `min_score` and `max_steps`.

+ If you are coming from Braintrust `scores` or DeepEval metrics, model reusable checks as small AgentV-native helper factories that return these grader configs. They still lower to the same YAML/runtime contract:
+
+ ```typescript
+ import { defineEval, graders } from '@agentv/sdk';
+
+ function ragFaithfulness() {
+   return graders.llmGrader({
+     name: 'rag-faithfulness',
+     target: 'grader-target',
+     prompt: 'Grade whether the answer is supported by the provided context.',
+   });
+ }
+
+ export default defineEval({
+   name: 'rag-suite',
+   tests: [
+     {
+       id: 'grounded-answer',
+       input: 'Answer using the retrieved context.',
+       assertions: [ragFaithfulness()],
+     },
+   ],
+ });
+ ```
+
+ Python workflows should emit canonical YAML/JSONL or implement code graders over the stdin/stdout contract. The repo-local helper under `examples/features/sdk-python/` is an example, not a promised published Python package.
+
  ## Exports

  - `defineAssertion(handler)` - Define a custom assertion (pass/fail + optional score)
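
Note on the README hunk above: it states that camelCase SDK options such as `minScore` and `maxSteps` lower to canonical YAML keys such as `min_score` and `max_steps`. The diff does not show how the SDK performs that lowering, so the following is only a minimal sketch of the naming rule. `camelToSnake` and `lowerKeys` are hypothetical helpers written for illustration, not exports of `@agentv/sdk`, and the sketch assumes a shallow object with plain ASCII camelCase keys.

```typescript
// Hypothetical illustration only: these helpers are NOT part of @agentv/sdk.

// Convert one camelCase key to snake_case by prefixing each ASCII
// uppercase letter with an underscore and lowercasing it.
function camelToSnake(key: string): string {
  return key.replace(/[A-Z]/g, (ch) => `_${ch.toLowerCase()}`);
}

// Rename the top-level keys of an options object (shallow, values untouched).
function lowerKeys(options: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(options)) {
    out[camelToSnake(key)] = value;
  }
  return out;
}

// The two option names come from the README text; the values are made up.
const lowered = lowerKeys({ minScore: 0.8, maxSteps: 5 });
console.log(lowered); // { min_score: 0.8, max_steps: 5 }
```

Keys that are already lowercase, such as `type`, pass through unchanged. The real lowering may handle nested options or special cases that this sketch ignores.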
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@agentv/sdk",
-   "version": "4.42.1-next.1",
+   "version": "4.42.2",
    "description": "Evaluation SDK for AgentV - build custom code judges",
    "type": "module",
    "repository": {