agentv 0.7.5 → 0.9.0

package/README.md CHANGED
@@ -102,7 +102,7 @@ evalcases:
  # ...

  # Targets files
- $schema: agentv-targets-v2.1
+ $schema: agentv-targets-v2.2
  targets:
  - name: default
  # ...
@@ -175,7 +175,8 @@ Each target specifies:

  - `name`: Unique identifier for the target
  - `provider`: The model provider (`azure`, `anthropic`, `gemini`, `codex`, `vscode`, `vscode-insiders`, `cli`, or `mock`)
- - `settings`: Environment variable names to use for this target
+ - Provider-specific configuration fields at the top level (no `settings` wrapper needed)
+ - Optional fields: `judge_target`, `workers`, `provider_batching`

  ### Examples

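For anyone upgrading a v2.1 targets file, the field list in this hunk implies that keys formerly nested under `settings` now sit directly on the target. A minimal sketch of that flattening, assuming the target has already been parsed into a dict; `flatten_settings` is a hypothetical helper, not part of AgentV, and it leaves values untouched (renamed keys and the move to `${{ ... }}` references still need manual attention):

```python
def flatten_settings(target: dict) -> dict:
    """Return a copy of `target` with the keys under 'settings' lifted to the top level.

    Hypothetical migration helper for illustration only; not an AgentV API.
    """
    # Copy every top-level key except the old wrapper.
    flat = {key: value for key, value in target.items() if key != "settings"}
    for key, value in (target.get("settings") or {}).items():
        if key in flat:
            # Refuse to silently overwrite an existing top-level field.
            raise ValueError(f"key {key!r} exists both at top level and under settings")
        flat[key] = value
    return flat


if __name__ == "__main__":
    old_target = {
        "name": "local_cli",
        "provider": "cli",
        "settings": {"files_format": "--file {path}", "timeout_seconds": 30},
    }
    print(flatten_settings(old_target))
```

The helper returns a new dict instead of mutating its input, so the original v2.1 structure stays available for comparison.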
@@ -184,24 +185,27 @@ Each target specifies:
  ```yaml
  - name: azure_base
    provider: azure
-   settings:
-     endpoint: "AZURE_OPENAI_ENDPOINT"
-     api_key: "AZURE_OPENAI_API_KEY"
-     model: "AZURE_DEPLOYMENT_NAME"
+   endpoint: ${{ AZURE_OPENAI_ENDPOINT }}
+   api_key: ${{ AZURE_OPENAI_API_KEY }}
+   model: ${{ AZURE_DEPLOYMENT_NAME }}
  ```

+ Note: Environment variables are referenced using `${{ VARIABLE_NAME }}` syntax. The actual values are resolved from your `.env` file at runtime.
+
  **VS Code targets:**

  ```yaml
  - name: vscode_projectx
    provider: vscode
-   settings:
-     workspace_env: "EVAL_PROJECTX_WORKSPACE_PATH"
+   workspace_template: ${{ PROJECTX_WORKSPACE_PATH }}
+   provider_batching: false
+   judge_target: azure_base

  - name: vscode_insiders_projectx
    provider: vscode-insiders
-   settings:
-     workspace_env: "EVAL_PROJECTX_WORKSPACE_PATH"
+   workspace_template: ${{ PROJECTX_WORKSPACE_PATH }}
+   provider_batching: false
+   judge_target: azure_base
  ```

  **CLI targets (template-based):**
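The `${{ VARIABLE_NAME }}` references added in this hunk can be pictured as a simple substitution pass. The sketch below is an assumption about the behavior, not AgentV's actual resolver: `resolve_refs` is a hypothetical name, the `env` mapping stands in for the parsed `.env` file, and raising on a missing variable is a choice made here for illustration.

```python
import re
from typing import Mapping

# Matches ${{ NAME }} with optional whitespace inside the braces.
_REF_PATTERN = re.compile(r"\$\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def resolve_refs(value: str, env: Mapping[str, str]) -> str:
    """Replace every ${{ NAME }} in `value` with env[NAME].

    Hypothetical helper; how AgentV treats missing variables is not
    stated in the README, so this sketch raises KeyError.
    """

    def _substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _REF_PATTERN.sub(_substitute, value)


if __name__ == "__main__":
    env = {"AZURE_OPENAI_ENDPOINT": "https://example.invalid"}
    print(resolve_refs("${{ AZURE_OPENAI_ENDPOINT }}", env))
```

Strings without a reference pass through unchanged, which matches how literal values such as `provider_batching: false` coexist with references in the examples above.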
@@ -209,30 +213,39 @@ Each target specifies:
  ```yaml
  - name: local_cli
    provider: cli
-   settings:
-     command_template: 'somecommand {PROMPT} {FILES}'
-     files_format: '--file {path}'
-     cwd: PROJECT_ROOT # optional working directory
-     env: # merged into process.env
-       API_TOKEN: LOCAL_AGENT_TOKEN
-     timeout_seconds: 30 # optional per-command timeout
-     healthcheck:
-       type: command # or http
-       command_template: code --version
+   judge_target: azure_base
+   command_template: 'uv run ./my_agent.py --prompt {PROMPT} {FILES}'
+   files_format: '--file {path}'
+   cwd: ${{ CLI_EVALS_DIR }} # optional working directory
+   timeout_seconds: 30 # optional per-command timeout
+   healthcheck:
+     type: command # or http
+     command_template: uv run ./my_agent.py --healthcheck
  ```

+ **Supported placeholders in CLI commands:**
+ - `{PROMPT}` - The rendered prompt text (shell-escaped)
+ - `{FILES}` - Expands to multiple file arguments using `files_format` template
+ - `{GUIDELINES}` - Guidelines content
+ - `{EVAL_ID}` - Current eval case ID
+ - `{ATTEMPT}` - Retry attempt number
+ - `{OUTPUT_FILE}` - Path to output file (for agents that write responses to disk)
+
  **Codex CLI targets:**

  ```yaml
  - name: codex_cli
    provider: codex
-   settings:
-     executable: "CODEX_CLI_PATH" # defaults to `codex` if omitted
-     profile: "CODEX_PROFILE" # matches the profile in ~/.codex/config
-     model: "CODEX_MODEL" # optional, falls back to profile default
-     approval_preset: "CODEX_APPROVAL_PRESET"
-     timeout_seconds: 180
-     cwd: CODEX_WORKSPACE_DIR
+   judge_target: azure_base
+   executable: ${{ CODEX_CLI_PATH }} # defaults to `codex` if omitted
+   args: # optional CLI arguments
+     - --profile
+     - ${{ CODEX_PROFILE }}
+     - --model
+     - ${{ CODEX_MODEL }}
+   timeout_seconds: 180
+   cwd: ${{ CODEX_WORKSPACE_DIR }}
+   log_format: json # 'summary' or 'json'
  ```

  Codex targets require the standalone `codex` CLI and a configured profile (via `codex configure`) so credentials are stored in `~/.codex/config` (or whatever path the CLI already uses). AgentV mirrors all guideline and attachment files into a fresh scratch workspace, so the `file://` preread links remain valid even when the CLI runs outside your repo tree.
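The `{PROMPT}` and `{FILES}` placeholders listed in this hunk suggest a template-rendering step before the command runs. The following is a minimal sketch under stated assumptions, not AgentV's implementation: `render_command` is a hypothetical function, only two of the six placeholders are handled, and quoting each file path with `shlex.quote` is an assumption (the README only says the prompt is shell-escaped).

```python
import re
import shlex
from typing import Sequence


def render_command(
    command_template: str,
    files_format: str,
    prompt: str,
    files: Sequence[str],
) -> str:
    """Expand {PROMPT} and {FILES} in a CLI command template.

    Hypothetical helper for illustration; other placeholders such as
    {EVAL_ID} or {OUTPUT_FILE} are left untouched.
    """
    # Each file becomes one copy of files_format with {path} filled in.
    files_args = " ".join(
        files_format.replace("{path}", shlex.quote(path)) for path in files
    )
    values = {"PROMPT": shlex.quote(prompt), "FILES": files_args}
    # A single regex pass avoids re-expanding placeholder-like text
    # that happens to appear inside the prompt or a file path.
    return re.sub(r"\{(PROMPT|FILES)\}", lambda m: values[m.group(1)], command_template)


if __name__ == "__main__":
    print(
        render_command(
            "uv run ./my_agent.py --prompt {PROMPT} {FILES}",
            "--file {path}",
            "hello world",
            ["a.md", "b c.md"],
        )
    )
```

Substituting in one pass is a deliberate choice here: chained `str.replace` calls would expand a literal `{FILES}` that appears inside the prompt text.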
@@ -335,24 +348,56 @@ Evaluation criteria and guidelines...
  }
  ```

- ## Next Steps
+ ## Advanced Configuration
+
+ ### Retry Configuration
+
+ AgentV supports automatic retry with exponential backoff for handling rate limiting (HTTP 429) and transient errors. All retry configuration fields are optional and work with Azure, Anthropic, and Gemini providers.
+
+ **Available retry fields:**
+
+ | Field | Type | Default | Description |
+ |-------|------|---------|-------------|
+ | `max_retries` | number | 3 | Maximum number of retry attempts |
+ | `retry_initial_delay_ms` | number | 1000 | Initial delay in milliseconds before first retry |
+ | `retry_max_delay_ms` | number | 60000 | Maximum delay cap in milliseconds |
+ | `retry_backoff_factor` | number | 2 | Exponential backoff multiplier |
+ | `retry_status_codes` | number[] | [500, 408, 429, 502, 503, 504] | HTTP status codes to retry |
+
+ **Example configuration:**
+
+ ```yaml
+ $schema: agentv-targets-v2.2
+
+ targets:
+ - name: azure_base
+   provider: azure
+   endpoint: ${{ AZURE_OPENAI_ENDPOINT }}
+   api_key: ${{ AZURE_OPENAI_API_KEY }}
+   model: gpt-4
+   max_retries: 5 # Maximum retry attempts
+   retry_initial_delay_ms: 2000 # Initial delay before first retry
+   retry_max_delay_ms: 120000 # Maximum delay cap
+   retry_backoff_factor: 2 # Exponential backoff multiplier
+   retry_status_codes: [500, 408, 429, 502, 503, 504] # HTTP status codes to retry
+ ```

- - Review [docs/examples/simple/evals/example-eval.yaml](docs/examples/simple/evals/example-eval.yaml) to understand the schema
- - Create your own eval dataset following the schema
- - Write custom evaluator scripts for deterministic evaluation
- - Create LLM judge prompts for semantic evaluation
- - Set up optimizer configs when ready to improve prompts
+ **Retry behavior:**
+ - Exponential backoff with jitter (0.75-1.25x) to avoid thundering herd
+ - Automatically retries on HTTP 429 (rate limiting), 5xx errors, and network failures
+ - Respects abort signals for cancellation
+ - If no retry config is specified, uses sensible defaults

  ## Resources

  - [Simple Example README](docs/examples/simple/README.md)
  - [Ax ACE Documentation](https://github.com/ax-llm/ax/blob/main/docs/ACE.md)

- ## License
-
- MIT License - see [LICENSE](LICENSE) for details.
-
  ## Related Projects

  - [subagent](https://github.com/EntityProcess/subagent) - VS Code Copilot programmatic interface
  - [Ax](https://github.com/axflow/axflow) - TypeScript LLM framework
+
+ ## License
+
+ MIT License - see [LICENSE](LICENSE) for details.
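The retry fields introduced in the Advanced Configuration hunk above describe exponential backoff with a 0.75-1.25x jitter. A rough sketch of how such a delay might be computed, using the documented defaults; `retry_delay_ms` is a hypothetical function, the zero-based `attempt` index is an assumption, and so is applying the cap before the jitter (the README does not say which comes first).

```python
import random
from typing import Callable


def retry_delay_ms(
    attempt: int,
    initial_delay_ms: float = 1000,
    max_delay_ms: float = 60000,
    backoff_factor: float = 2,
    rng: Callable[[], float] = random.random,
) -> float:
    """Delay before retry number `attempt` (0 = first retry), in milliseconds.

    Illustrative only; not taken from AgentV's source.
    """
    # Exponential growth, capped at the configured maximum (assumed order).
    base = min(initial_delay_ms * backoff_factor ** attempt, max_delay_ms)
    # rng() in [0, 1) maps to a jitter multiplier in [0.75, 1.25).
    jitter = 0.75 + 0.5 * rng()
    return base * jitter


if __name__ == "__main__":
    for attempt in range(5):
        print(attempt, round(retry_delay_ms(attempt)))
```

Passing `rng` as a parameter keeps the function deterministic under test; with the defaults, the un-jittered delays for the first four retries are 1000, 2000, 4000, and 8000 ms.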