agentv 1.5.0 → 1.6.1

package/README.md CHANGED
@@ -1,6 +1,6 @@
  # AgentV
 
- A TypeScript-based AI agent evaluation and optimization framework using YAML specifications to score task completion. Built for modern development workflows with first-class support for VS Code Copilot, OpenAI Codex CLI and Azure OpenAI.
+ A TypeScript-based AI agent evaluation and optimization framework using YAML specifications to score task completion. Built for modern development workflows with first-class support for VS Code Copilot, OpenAI Codex CLI, Pi Coding Agent, and Azure OpenAI.
 
  ## Installation and Setup
 
@@ -162,7 +162,7 @@ Execution targets in `.agentv/targets.yaml` decouple evals from providers/settin
  Each target specifies:
 
  - `name`: Unique identifier for the target
- - `provider`: The model provider (`azure`, `anthropic`, `gemini`, `codex`, `vscode`, `vscode-insiders`, `cli`, or `mock`)
+ - `provider`: The model provider (`azure`, `anthropic`, `gemini`, `codex`, `pi-coding-agent`, `vscode`, `vscode-insiders`, `cli`, or `mock`)
  - Provider-specific configuration fields at the top level (no `settings` wrapper needed)
  - Optional fields: `judge_target`, `workers`, `provider_batching`
 
@@ -240,6 +240,27 @@ Note: Environment variables are referenced using `${{ VARIABLE_NAME }}` syntax.
  Codex targets require the standalone `codex` CLI and a configured profile (via `codex configure`) so credentials are stored in `~/.codex/config` (or whatever path the CLI already uses). AgentV mirrors all guideline and attachment files into a fresh scratch workspace, so the `file://` preread links remain valid even when the CLI runs outside your repo tree.
  Confirm the CLI works by running `codex exec --json --profile <name> "ping"` (or any supported dry run) before starting an eval. This prints JSONL events; seeing `item.completed` messages indicates the CLI is healthy.
 
+ **Pi Coding Agent targets:**
+
+ ```yaml
+ - name: pi
+   provider: pi-coding-agent
+   judge_target: gemini_base
+   executable: ${{ PI_CLI_PATH }}  # Optional: defaults to `pi` if omitted
+   pi_provider: google  # google, anthropic, openai, groq, xai, openrouter
+   model: ${{ GEMINI_MODEL_NAME }}
+   api_key: ${{ GOOGLE_GENERATIVE_AI_API_KEY }}
+   tools: read,bash,edit,write  # Available tools for the agent
+   timeout_seconds: 180
+   cwd: ${{ PI_WORKSPACE_DIR }}  # Optional: run in specific directory
+   log_format: json  # 'summary' (default) or 'json' for full logs
+   # system_prompt: optional override for the default system prompt
+ ```
+
+ Pi Coding Agent is an autonomous coding CLI from [pi-mono](https://github.com/badlogic/pi-mono). Install it globally with `npm install -g @mariozechner/pi-coding-agent` (or use a local path via `executable`). It supports multiple LLM providers and outputs JSONL events. AgentV extracts tool trajectories from the output for trace-based evaluation. File attachments are passed using Pi's native `@path` syntax.
+
+ By default, a system prompt instructs the agent to include code in its response (required for evaluation scoring). Use `system_prompt` to override this behavior.
+
  ## Writing Custom Evaluators
 
  ### Code Evaluator I/O Contract