agentv 0.5.3 → 0.6.1
- package/README.md +7 -3
- package/dist/{chunk-5WBKOCCW.js → chunk-GURDWEMI.js} +4123 -1108
- package/dist/chunk-GURDWEMI.js.map +1 -0
- package/dist/cli.js +1 -1
- package/dist/index.js +1 -1
- package/dist/templates/agentv/targets.yaml +29 -29
- package/package.json +3 -2
- package/dist/chunk-5WBKOCCW.js.map +0 -1
package/README.md
CHANGED
@@ -120,6 +120,9 @@ agentv eval "path/to/eval.yaml"
 
 # Override the eval file's target with CLI flag
 agentv eval --target vscode_projectx "path/to/eval.yaml"
+
+# Run multiple evals via glob
+agentv eval "path/to/evals/**/*.yaml"
 ```
 
 Run a specific eval case with custom targets path:
@@ -130,17 +133,18 @@ agentv eval --target vscode_projectx --targets "path/to/targets.yaml" --eval-id
 
 ### Command Line Options
 
-- `
+- `eval_paths...`: Path(s) or glob(s) to eval YAML files (required; e.g., `evals/**/*.yaml`)
 - `--target TARGET`: Execution target name from targets.yaml (overrides target specified in eval file)
 - `--targets TARGETS`: Path to targets.yaml file (default: ./.agentv/targets.yaml)
 - `--eval-id EVAL_ID`: Run only the eval case with this specific ID
-- `--out OUTPUT_FILE`: Output file path (default: results/
+- `--out OUTPUT_FILE`: Output file path (default: .agentv/results/eval_<timestamp>.jsonl)
 - `--output-format FORMAT`: Output format: 'jsonl' or 'yaml' (default: jsonl)
 - `--dry-run`: Run with mock model for testing
 - `--agent-timeout SECONDS`: Timeout in seconds for agent response polling (default: 120)
 - `--max-retries COUNT`: Maximum number of retries for timeout cases (default: 2)
 - `--cache`: Enable caching of LLM responses (default: disabled)
 - `--dump-prompts`: Save all prompts to `.agentv/prompts/` directory
+- `--workers COUNT`: Parallel workers for eval cases (default: 3; target `workers` setting used when provided)
 - `--verbose`: Verbose output
 
 ### Target Selection Priority
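The new `--workers` entry states a default of 3 and says the target's `workers` setting is used when provided, but it does not say which value wins when both a CLI flag and a target setting are present. The sketch below shows one plausible reading, assuming an explicit CLI value takes priority, then the target setting, then the default. The function name `resolveWorkers` and its parameters are hypothetical and are not taken from the agentv source.

```typescript
// Hypothetical helper illustrating one possible precedence; not agentv's actual code.
const DEFAULT_WORKERS = 3; // default stated in the README option list

function resolveWorkers(cliWorkers?: number, targetWorkers?: number): number {
  // Assumption: an explicit --workers flag overrides the target's `workers` setting,
  // which in turn overrides the built-in default.
  const candidate = cliWorkers ?? targetWorkers ?? DEFAULT_WORKERS;
  if (!Number.isInteger(candidate) || candidate < 1) {
    throw new Error(`workers must be a positive integer, got ${candidate}`);
  }
  return candidate;
}
```

If the real implementation prefers the target setting over the CLI flag, only the order of the two `??` operands would change.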
@@ -153,7 +157,7 @@ The CLI determines which execution target to use with the following precedence:
 
 This allows eval files to specify their preferred target while still allowing command-line overrides for flexibility, and maintains backward compatibility with existing workflows.
 
-Output goes to `.agentv/results/
+Output goes to `.agentv/results/eval_<timestamp>.jsonl` (or `.yaml`) unless `--out` is provided.
 
 ### Tips for VS Code Copilot Evals
 
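The updated README line describes the output rule: results go to `.agentv/results/eval_<timestamp>.jsonl` (or `.yaml`) unless `--out` is given. A minimal sketch of that rule follows. The README does not specify the timestamp format, so the filename-safe ISO-8601 form used here is an assumption, and the names `defaultOutputPath` and `resolveOutputPath` are hypothetical rather than agentv's API.

```typescript
type OutputFormat = "jsonl" | "yaml";

// Hypothetical helper: builds the default results path described in the README.
function defaultOutputPath(format: OutputFormat, now: Date): string {
  // Assumption: ISO-8601 timestamp with ':' and '.' replaced so it is safe in filenames.
  const timestamp = now.toISOString().replace(/[:.]/g, "-");
  return `.agentv/results/eval_${timestamp}.${format}`;
}

// Hypothetical helper: an explicit --out value always wins over the default path.
function resolveOutputPath(
  out?: string,
  format: OutputFormat = "jsonl",
  now: Date = new Date(),
): string {
  return out ?? defaultOutputPath(format, now);
}
```

Passing the clock in as a parameter keeps the function deterministic, which makes the default path easy to check in tests.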