agentv 0.26.0 → 1.0.0

Files changed (12)

package/dist/{chunk-6ZM7WVSC.js → chunk-RIJO5WBF.js} +13 -13
package/dist/chunk-RIJO5WBF.js.map +1 -0
package/dist/cli.js +1 -1
package/dist/cli.js.map +1 -1
package/dist/index.js +1 -1
package/dist/templates/.claude/skills/agentv-eval-builder/SKILL.md +36 -19
package/dist/templates/.claude/skills/agentv-eval-builder/references/eval-schema.json +217 -217
package/dist/templates/.claude/skills/agentv-eval-builder/references/example-evals.md +94 -2
package/dist/templates/.claude/skills/agentv-eval-builder/references/tool-trajectory-evaluator.md +8 -8
package/package.json +1 -1
package/dist/chunk-6ZM7WVSC.js.map +0 -1
package/dist/templates/agentv/.env.template +0 -23

package/dist/templates/.claude/skills/agentv-eval-builder/references/tool-trajectory-evaluator.md CHANGED

@@ -76,7 +76,7 @@ execution:
 - Strict protocol validation
 - Regression testing specific behavior
-## Expected Messages Evaluator
+## Expected Tool Calls Evaluator
 For simpler cases, specify tool_calls inline in `expected_messages`:
@@ -84,11 +84,11 @@ For simpler cases, specify tool_calls inline in `expected_messages`:
 evalcases:
   - id: research-task
     expected_outcome: Agent searches and retrieves documents
     input_messages:
       - role: user
         content: Research REST vs GraphQL differences
     expected_messages:
       - role: assistant
         content: I'll research this topic.
@@ -96,11 +96,11 @@ evalcases:
           - tool: knowledgeSearch
           - tool: knowledgeSearch
           - tool: documentRetrieve
     execution:
       evaluators:
         - name: tool-validator
-          type: expected_messages
+          type: expected_tool_calls
 ```
 ### With Input Matching
@@ -130,7 +130,7 @@ expected_messages:
 | `in_order` | (matched tools in sequence) / (expected tools count) |
 | `exact` | (correctly positioned tools) / (expected tools count) |
-### expected_messages Scoring
+### expected_tool_calls Scoring
 Sequential matching: `(matched tool_calls) / (expected tool_calls)`
@@ -215,7 +215,7 @@ evalcases:
     execution:
       evaluators:
         - name: pipeline-check
-          type: expected_messages
+          type: expected_tool_calls
 ```
 ## CLI Options for Traces
@@ -234,4 +234,4 @@ agentv eval evals/test.yaml --include-trace
 2. **Start with any_order** - Then tighten to `in_order` or `exact` as needed
 3. **Combine with other evaluators** - Use tool trajectory for execution, LLM judge for output quality
 4. **Test with --dump-traces** - Inspect actual traces to understand agent behavior
-5. **Use expected_messages for simple cases** - It's more readable for basic tool validation
+5. **Use expected_tool_calls for simple cases** - It's more readable for basic tool validation
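
The hunks above rename the evaluator `type` from `expected_messages` to `expected_tool_calls`, while the `expected_messages:` field on each eval case stays as it is. Eval files written against 0.26.0 may therefore need their evaluator entries updated. The following is a minimal sketch of such a rewrite, not part of agentv: the helper name `migrate_evaluator_type` is hypothetical, it assumes unquoted `type:` values on their own line as in the examples above, and whether 1.0.0 still accepts the old type name is not shown in this diff.

```python
import re

# Matches only lines of the form "type: expected_messages" (optionally
# preceded by a list dash and followed by a comment). The field name
# "expected_messages:" does not contain "type:" and is left untouched.
_OLD_TYPE = re.compile(
    r"^(\s*(?:-\s+)?type:\s*)expected_messages(\s*(?:#.*)?)$"
)


def migrate_evaluator_type(yaml_text: str) -> str:
    """Hypothetical helper: rewrite the old evaluator type name line by line."""
    out = []
    for line in yaml_text.splitlines(keepends=True):
        body = line.rstrip("\r\n")      # line content without its line ending
        ending = line[len(body):]       # preserve the original line ending
        out.append(_OLD_TYPE.sub(r"\1expected_tool_calls\2", body) + ending)
    return "".join(out)


if __name__ == "__main__":
    sample = (
        "expected_messages:\n"
        "  - role: assistant\n"
        "execution:\n"
        "  evaluators:\n"
        "    - name: tool-validator\n"
        "      type: expected_messages\n"
    )
    print(migrate_evaluator_type(sample))
```

A line-based rewrite is used here rather than a YAML round trip so that comments and formatting in the eval file survive; quoted values such as `type: "expected_messages"` are not handled by this sketch.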

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "agentv",
-  "version": "0.26.0",
+  "version": "1.0.0",
   "description": "CLI entry point for AgentV",
   "type": "module",
   "repository": {