agentv 0.5.3 → 0.7.0

package/README.md CHANGED
@@ -120,6 +120,9 @@ agentv eval "path/to/eval.yaml"
 
  # Override the eval file's target with CLI flag
  agentv eval --target vscode_projectx "path/to/eval.yaml"
+
+ # Run multiple evals via glob
+ agentv eval "path/to/evals/**/*.yaml"
  ```
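The added example passes the glob in quotes, so the shell hands the pattern to `agentv` as a single argument and the CLI is left to resolve it. A minimal sketch of the difference between the quoted and unquoted forms follows; `show_args` is a hypothetical helper that only prints what a command would receive, and the temporary directory layout is made up for illustration.

```shell
# Hypothetical helper: print each argument a command would receive, one per line.
show_args() { printf '%s\n' "$@"; }

# Made-up layout for illustration only.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/evals/a"
touch "$demo_dir/evals/a/one.yaml" "$demo_dir/evals/a/two.yaml"
cd "$demo_dir" || exit 1

# Quoted: the pattern arrives as one literal argument.
quoted=$(show_args "evals/**/*.yaml")

# Unquoted: the shell expands the pattern before the command runs.
unquoted=$(show_args evals/**/*.yaml)

printf 'quoted:\n%s\nunquoted:\n%s\n' "$quoted" "$unquoted"
```

In bash without `globstar`, an unquoted `**` behaves like a single `*` and matches only one directory level, which is one reason to keep the quotes and let the tool expand the pattern, as the README example does.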
 
  Run a specific eval case with custom targets path:
@@ -130,17 +133,18 @@ agentv eval --target vscode_projectx --targets "path/to/targets.yaml" --eval-id
 
  ### Command Line Options
 
- - `eval_file`: Path to eval YAML file (required, positional argument)
+ - `eval_paths...`: Path(s) or glob(s) to eval YAML files (required; e.g., `evals/**/*.yaml`)
  - `--target TARGET`: Execution target name from targets.yaml (overrides target specified in eval file)
  - `--targets TARGETS`: Path to targets.yaml file (default: ./.agentv/targets.yaml)
  - `--eval-id EVAL_ID`: Run only the eval case with this specific ID
- - `--out OUTPUT_FILE`: Output file path (default: results/{evalname}_{timestamp}.jsonl)
+ - `--out OUTPUT_FILE`: Output file path (default: .agentv/results/eval_<timestamp>.jsonl)
  - `--output-format FORMAT`: Output format: 'jsonl' or 'yaml' (default: jsonl)
  - `--dry-run`: Run with mock model for testing
  - `--agent-timeout SECONDS`: Timeout in seconds for agent response polling (default: 120)
  - `--max-retries COUNT`: Maximum number of retries for timeout cases (default: 2)
  - `--cache`: Enable caching of LLM responses (default: disabled)
  - `--dump-prompts`: Save all prompts to `.agentv/prompts/` directory
+ - `--workers COUNT`: Parallel workers for eval cases (default: 3; target `workers` setting used when provided)
  - `--verbose`: Verbose output
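The options in this list can be combined in one invocation. The sketch below uses only flags that appear above; the glob, worker count, and output path are placeholder values, and the command is executed only when `agentv` is found on `PATH`.

```shell
# Placeholder values; adjust to the project layout.
# Options come before the path, matching the README's own examples.
set -- eval \
  --dry-run \
  --workers 4 \
  --output-format yaml \
  --out ".agentv/results/dry_run.yaml" \
  "evals/**/*.yaml"

# Show the exact command line before running it.
cmd_line="agentv $*"
printf '%s\n' "$cmd_line"

# Run only when the CLI is installed.
if command -v agentv >/dev/null 2>&1; then
  agentv "$@"
fi
```

Using `--dry-run` here keeps the example on the mock model, so a glob that matches more files than expected does not trigger real agent calls.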
 
  ### Target Selection Priority
@@ -153,7 +157,7 @@ The CLI determines which execution target to use with the following precedence:
 
  This allows eval files to specify their preferred target while still allowing command-line overrides for flexibility, and maintains backward compatibility with existing workflows.
 
- Output goes to `.agentv/results/{evalname}_{timestamp}.jsonl` (or `.yaml`) unless `--out` is provided.
+ Output goes to `.agentv/results/eval_<timestamp>.jsonl` (or `.yaml`) unless `--out` is provided.
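Since the default file name no longer includes the eval name, one way to find the most recent run is by modification time. This is a sketch that assumes results keep the default `.agentv/results/eval_<timestamp>.jsonl` naming; `latest_results` is a hypothetical helper and not part of agentv.

```shell
# Hypothetical helper: print the newest eval_*.jsonl in a results directory.
# Defaults to the directory named in the README.
latest_results() {
  dir=${1:-.agentv/results}
  # ls -t sorts by modification time, newest first; errors are silenced
  # so an empty or missing directory prints nothing.
  ls -t "$dir"/eval_*.jsonl 2>/dev/null | head -n 1
}

latest_results
```

Passing a directory as the first argument (for example `latest_results path/to/results`) covers runs where `--out` pointed somewhere else but kept the same file name pattern.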
 
  ### Tips for VS Code Copilot Evals