npm - agentv - Versions diffs - 0.5.1 → 0.5.3 - Mend

agentv 0.5.1 → 0.5.3

Files changed (9)

package/README.md +12 -142
package/dist/{chunk-HPH4YWGU.js → chunk-5WBKOCCW.js} +279 -17
package/dist/chunk-5WBKOCCW.js.map +1 -0
package/dist/cli.js +1 -1
package/dist/index.js +1 -1
package/dist/templates/agentv/config.yaml +2 -3
package/dist/templates/agentv/targets.yaml +13 -13
package/package.json +4 -3
package/dist/chunk-HPH4YWGU.js.map +0 -1

package/README.md CHANGED

@@ -64,39 +64,17 @@ You are now ready to start development. The monorepo contains:
 ### Environment Setup
-1. Configure environment variables:
-   - Copy [.env.template](docs/examples/simple/.env.template) to `.env` in your project root
-   - Fill in your API keys, endpoints, and other configuration values
+1. Initialize your workspace:
+   - Run `agentv init` at the root of your repository
+   - This command automatically sets up the `.agentv/` directory structure and configuration files
-2. Set up targets:
-   - Copy [targets.yaml](docs/examples/simple/.agentv/targets.yaml) to `.agentv/targets.yaml`
-   - Update the environment variable names in targets.yaml to match those defined in your `.env` file
+2. Configure environment variables:
+   - The init command creates a `.env.template` file in your project root
+   - Copy `.env.template` to `.env` and fill in your API keys, endpoints, and other configuration values
+   - Update the environment variable names in `.agentv/targets.yaml` to match those defined in your `.env` file
 ## Quick Start
-### Configuring Guideline Patterns
-AgentV automatically detects guideline files and treats them differently from regular file content. You can customize which files are considered guidelines using an optional `.agentv/config.yaml` configuration file.
-**Config file discovery:**
-- AgentV searches for `.agentv/config.yaml` starting from the eval file's directory
-- Walks up the directory tree to the repository root
-- Uses the first config file found (similar to how `targets.yaml` is discovered)
-- This allows you to place one config file at the project root for all evals
-**Custom patterns** (create `.agentv/config.yaml` in same directory as your eval file):
-```yaml
-# .agentv/config.yaml
-guideline_patterns:
-  - "**/*.guide.md"           # Match all .guide.md files
-  - "**/guidelines/**"        # Match all files in /guidelines/ dirs
-  - "docs/AGENTS.md"          # Match specific files
-  - "**/*.rules.md"           # Match by naming convention
-```
-See [config.yaml example](docs/examples/simple/.agentv/config.yaml) for more pattern examples.
 ### Validating Eval Files
 Validate your eval and targets files before running them:
@@ -157,7 +135,7 @@ agentv eval --target vscode_projectx --targets "path/to/targets.yaml" --eval-id
 - `--targets TARGETS`: Path to targets.yaml file (default: ./.agentv/targets.yaml)
 - `--eval-id EVAL_ID`: Run only the eval case with this specific ID
 - `--out OUTPUT_FILE`: Output file path (default: results/{evalname}_{timestamp}.jsonl)
-- `--format FORMAT`: Output format: 'jsonl' or 'yaml' (default: jsonl)
+- `--output-format FORMAT`: Output format: 'jsonl' or 'yaml' (default: jsonl)
 - `--dry-run`: Run with mock model for testing
 - `--agent-timeout SECONDS`: Timeout in seconds for agent response polling (default: 120)
 - `--max-retries COUNT`: Maximum number of retries for timeout cases (default: 2)
@@ -256,21 +234,6 @@ Each target specifies:
 Codex targets require the standalone `codex` CLI and a configured profile (via `codex configure`) so credentials are stored in `~/.codex/config` (or whatever path the CLI already uses). AgentV mirrors all guideline and attachment files into a fresh scratch workspace, so the `file://` preread links remain valid even when the CLI runs outside your repo tree.
 Confirm the CLI works by running `codex exec --json --profile <name> "ping"` (or any supported dry run) before starting an eval. This prints JSONL events; seeing `item.completed` messages indicates the CLI is healthy.
-## Timeout Handling and Retries
-When using VS Code or other AI agents that may experience timeouts, the evaluator includes automatic retry functionality:
-- **Timeout detection:** Automatically detects when agents timeout
-- **Automatic retries:** When a timeout occurs, the same eval case is retried up to `--max-retries` times (default: 2)
-- **Retry behavior:** Only timeouts trigger retries; other errors proceed to the next eval case
-- **Timeout configuration:** Use `--agent-timeout` to adjust how long to wait for agent responses
-Example with custom timeout settings:
-```bash
-agentv eval evals/projectx/example.yaml --target vscode_projectx --agent-timeout 180 --max-retries 3
-```
 ## Writing Custom Evaluators
 ### Code Evaluator I/O Contract
@@ -370,110 +333,17 @@ Evaluation criteria and guidelines...
 ## Next Steps
-- Review `docs/examples/simple/evals/example-eval.yaml` to understand the schema
-- Create your own eval cases following the schema
-- Write custom evaluator scripts for domain-specific validation
-- Create LLM judge templates for semantic evaluation
+- Review [docs/examples/simple/evals/example-eval.yaml](docs/examples/simple/evals/example-eval.yaml) to understand the schema
+- Create your own eval dataset following the schema
+- Write custom evaluator scripts for deterministic evaluation
+- Create LLM judge prompts for semantic evaluation
 - Set up optimizer configs when ready to improve prompts
 ## Resources
 - [Simple Example README](docs/examples/simple/README.md)
-- [Schema Specification](docs/openspec/changes/update-eval-schema-v2/)
 - [Ax ACE Documentation](https://github.com/ax-llm/ax/blob/main/docs/ACE.md)
-## Scoring and Outputs
-Run with `--verbose` to print detailed information and stack traces on errors.
-### Scoring Methodology
-AgentV uses an AI-powered quality grader that:
-- Extracts key aspects from the expected answer
-- Compares model output against those aspects
-- Provides detailed hit/miss analysis with reasoning
-- Returns a normalized score (0.0 to 1.0)
-### Output Formats
-**JSONL format (default):**
-- One JSON object per line (newline-delimited)
-- Fields: `eval_id`, `score`, `hits`, `misses`, `model_answer`, `expected_aspect_count`, `target`, `timestamp`, `reasoning`, `raw_request`, `grader_raw_request`
-**YAML format (with `--format yaml`):**
-- Human-readable YAML documents
-- Same fields as JSONL, properly formatted for readability
-- Multi-line strings use literal block style
-### Summary Statistics
-After running all eval cases, AgentV displays:
-- Mean, median, min, max scores
-- Standard deviation
-- Distribution histogram
-- Total eval count and execution time
-## Architecture
-AgentV is built as a TypeScript monorepo using:
-- **pnpm workspaces:** Efficient dependency management
-- **Turbo:** Build system and task orchestration
-- **@ax-llm/ax:** Unified LLM provider abstraction
-- **Vercel AI SDK:** Streaming and tool use capabilities
-- **Zod:** Runtime type validation
-- **Commander.js:** CLI argument parsing
-- **Vitest:** Testing framework
-### Package Structure
-- `@agentv/core` - Core evaluation engine, providers, grading logic
-- `agentv` - Main package that bundles CLI functionality
-## Troubleshooting
-### Installation Issues
-**Problem:** Package installation fails or command not found.
-**Solution:**
-```bash
-# Clear npm cache and reinstall
-npm cache clean --force
-npm uninstall -g agentv
-npm install -g agentv
-# Or use npx without installing
-npx agentv@latest --help
-```
-### VS Code Integration Issues
-**Problem:** VS Code workspace doesn't open or prompts aren't injected.
-**Solution:**
-- Ensure the `subagent` package is installed (should be automatic)
-- Verify your workspace path in `.env` is correct and points to a `.code-workspace` file
-- Close any other VS Code instances before running evals
-- Use `--verbose` flag to see detailed workspace switching logs
-### Provider Configuration Issues
-**Problem:** API authentication errors or missing credentials.
-**Solution:**
-- Double-check environment variables in your `.env` file
-- Verify the variable names in `targets.yaml` match your `.env` file
-- Use `--dry-run` first to test without making API calls
-- Check provider-specific documentation for required environment variables
 ## License
 MIT License - see [LICENSE](LICENSE) for details.

package/dist/{chunk-HPH4YWGU.js → chunk-5WBKOCCW.js} RENAMED

@@ -5040,7 +5040,8 @@ import { exec as execWithCallback } from "node:child_process";
 import path22 from "node:path";
 import { promisify as promisify2 } from "node:util";
 import { exec as execCallback, spawn as spawn2 } from "node:child_process";
-import { constants as constants22 } from "node:fs";
+import { randomUUID } from "node:crypto";
+import { constants as constants22, createWriteStream } from "node:fs";
 import { access as access22, copyFile as copyFile2, mkdtemp, mkdir as mkdir3, rm as rm2, writeFile as writeFile3 } from "node:fs/promises";
 import { tmpdir } from "node:os";
 import path42 from "node:path";
@@ -11032,8 +11033,8 @@ import { constants as constants32 } from "node:fs";
 import { access as access32, readFile as readFile3 } from "node:fs/promises";
 import path62 from "node:path";
 import { parse as parse22 } from "yaml";
-import { randomUUID } from "node:crypto";
-import { createHash, randomUUID as randomUUID2 } from "node:crypto";
+import { randomUUID as randomUUID2 } from "node:crypto";
+import { createHash, randomUUID as randomUUID3 } from "node:crypto";
 import { mkdir as mkdir22, readFile as readFile4, writeFile as writeFile22 } from "node:fs/promises";
 import path72 from "node:path";
 var TEST_MESSAGE_ROLE_VALUES = ["system", "user", "assistant", "tool"];
@@ -12088,6 +12089,7 @@ var CodexProvider = class {
       collectGuidelineFiles(inputFiles, request.guideline_patterns).map((file) => path42.resolve(file))
     );
     const workspaceRoot = await this.createWorkspace();
+    const logger = await this.createStreamLogger(request).catch(() => void 0);
     try {
       const { mirroredInputFiles, guidelineMirrors } = await this.mirrorInputFiles(
         inputFiles,
@@ -12102,7 +12104,7 @@ var CodexProvider = class {
       await writeFile3(promptFile, promptContent, "utf8");
       const args = this.buildCodexArgs();
       const cwd = this.resolveCwd(workspaceRoot);
-      const result = await this.executeCodex(args, cwd, promptContent, request.signal);
+      const result = await this.executeCodex(args, cwd, promptContent, request.signal, logger);
       if (result.timedOut) {
         throw new Error(
           `Codex CLI timed out${formatTimeoutSuffix2(this.config.timeoutMs ?? void 0)}`
@@ -12126,10 +12128,12 @@ var CodexProvider = class {
           executable: this.resolvedExecutable ?? this.config.executable,
           promptFile,
           workspace: workspaceRoot,
-          inputFiles: mirroredInputFiles
+          inputFiles: mirroredInputFiles,
+          logFile: logger?.filePath
         }
       };
     } finally {
+      await logger?.close();
       await this.cleanupWorkspace(workspaceRoot);
     }
   }
@@ -12156,7 +12160,7 @@ var CodexProvider = class {
     args.push("-");
     return args;
   }
-  async executeCodex(args, cwd, promptContent, signal) {
+  async executeCodex(args, cwd, promptContent, signal, logger) {
     try {
       return await this.runCodex({
         executable: this.resolvedExecutable ?? this.config.executable,
@@ -12165,7 +12169,9 @@ var CodexProvider = class {
         prompt: promptContent,
         timeoutMs: this.config.timeoutMs,
         env: process.env,
-        signal
+        signal,
+        onStdoutChunk: logger ? (chunk) => logger.handleStdoutChunk(chunk) : void 0,
+        onStderrChunk: logger ? (chunk) => logger.handleStderrChunk(chunk) : void 0
       });
     } catch (error) {
       const err = error;
@@ -12217,7 +12223,235 @@ var CodexProvider = class {
     } catch {
     }
   }
+  resolveLogDirectory() {
+    const disabled = isCodexLogStreamingDisabled();
+    if (disabled) {
+      return void 0;
+    }
+    if (this.config.logDir) {
+      return path42.resolve(this.config.logDir);
+    }
+    return path42.join(process.cwd(), ".agentv", "logs", "codex");
+  }
+  async createStreamLogger(request) {
+    const logDir = this.resolveLogDirectory();
+    if (!logDir) {
+      return void 0;
+    }
+    try {
+      await mkdir3(logDir, { recursive: true });
+    } catch (error) {
+      const message = error instanceof Error ? error.message : String(error);
+      console.warn(`Skipping Codex stream logging (could not create ${logDir}): ${message}`);
+      return void 0;
+    }
+    const filePath = path42.join(logDir, buildLogFilename(request, this.targetName));
+    try {
+      const logger = await CodexStreamLogger.create({
+        filePath,
+        targetName: this.targetName,
+        evalCaseId: request.evalCaseId,
+        attempt: request.attempt,
+        format: this.config.logFormat ?? "summary"
+      });
+      console.log(`Streaming Codex CLI output to ${filePath}`);
+      return logger;
+    } catch (error) {
+      const message = error instanceof Error ? error.message : String(error);
+      console.warn(`Skipping Codex stream logging for ${filePath}: ${message}`);
+      return void 0;
+    }
+  }
 };
+var CodexStreamLogger = class _CodexStreamLogger {
+  filePath;
+  stream;
+  startedAt = Date.now();
+  stdoutBuffer = "";
+  stderrBuffer = "";
+  format;
+  constructor(filePath, format) {
+    this.filePath = filePath;
+    this.format = format;
+    this.stream = createWriteStream(filePath, { flags: "a" });
+  }
+  static async create(options) {
+    const logger = new _CodexStreamLogger(options.filePath, options.format);
+    const header = [
+      "# Codex CLI stream log",
+      `# target: ${options.targetName}`,
+      options.evalCaseId ? `# eval: ${options.evalCaseId}` : void 0,
+      options.attempt !== void 0 ? `# attempt: ${options.attempt + 1}` : void 0,
+      `# started: ${(/* @__PURE__ */ new Date()).toISOString()}`,
+      ""
+    ].filter((line2) => Boolean(line2));
+    logger.writeLines(header);
+    return logger;
+  }
+  handleStdoutChunk(chunk) {
+    this.stdoutBuffer += chunk;
+    this.flushBuffer("stdout");
+  }
+  handleStderrChunk(chunk) {
+    this.stderrBuffer += chunk;
+    this.flushBuffer("stderr");
+  }
+  async close() {
+    this.flushBuffer("stdout");
+    this.flushBuffer("stderr");
+    this.flushRemainder();
+    await new Promise((resolve, reject) => {
+      this.stream.once("error", reject);
+      this.stream.end(() => resolve());
+    });
+  }
+  writeLines(lines) {
+    for (const line2 of lines) {
+      this.stream.write(`${line2}
+`);
+    }
+  }
+  flushBuffer(source2) {
+    const buffer2 = source2 === "stdout" ? this.stdoutBuffer : this.stderrBuffer;
+    const lines = buffer2.split(/\r?\n/);
+    const remainder = lines.pop() ?? "";
+    if (source2 === "stdout") {
+      this.stdoutBuffer = remainder;
+    } else {
+      this.stderrBuffer = remainder;
+    }
+    for (const line2 of lines) {
+      const formatted = this.formatLine(line2, source2);
+      if (formatted) {
+        this.stream.write(formatted);
+        this.stream.write("\n");
+      }
+    }
+  }
+  formatLine(rawLine, source2) {
+    const trimmed = rawLine.trim();
+    if (trimmed.length === 0) {
+      return void 0;
+    }
+    const message = this.format === "json" ? formatCodexJsonLog(trimmed) : formatCodexLogMessage(trimmed, source2);
+    return `[+${formatElapsed(this.startedAt)}] [${source2}] ${message}`;
+  }
+  flushRemainder() {
+    const stdoutRemainder = this.stdoutBuffer.trim();
+    if (stdoutRemainder.length > 0) {
+      const formatted = this.formatLine(stdoutRemainder, "stdout");
+      if (formatted) {
+        this.stream.write(formatted);
+        this.stream.write("\n");
+      }
+    }
+    const stderrRemainder = this.stderrBuffer.trim();
+    if (stderrRemainder.length > 0) {
+      const formatted = this.formatLine(stderrRemainder, "stderr");
+      if (formatted) {
+        this.stream.write(formatted);
+        this.stream.write("\n");
+      }
+    }
+    this.stdoutBuffer = "";
+    this.stderrBuffer = "";
+  }
+};
+function isCodexLogStreamingDisabled() {
+  const envValue = process.env.AGENTV_CODEX_STREAM_LOGS;
+  if (!envValue) {
+    return false;
+  }
+  const normalized = envValue.trim().toLowerCase();
+  return normalized === "false" || normalized === "0" || normalized === "off";
+}
+function buildLogFilename(request, targetName) {
+  const timestamp = (/* @__PURE__ */ new Date()).toISOString().replace(/[:.]/g, "-");
+  const evalId = sanitizeForFilename(request.evalCaseId ?? "codex");
+  const attemptSuffix = request.attempt !== void 0 ? `_attempt-${request.attempt + 1}` : "";
+  const target = sanitizeForFilename(targetName);
+  return `${timestamp}_${target}_${evalId}${attemptSuffix}_${randomUUID().slice(0, 8)}.log`;
+}
+function sanitizeForFilename(value) {
+  const sanitized = value.replace(/[^A-Za-z0-9._-]+/g, "_");
+  return sanitized.length > 0 ? sanitized : "codex";
+}
+function formatElapsed(startedAt) {
+  const elapsedSeconds = Math.floor((Date.now() - startedAt) / 1e3);
+  const hours = Math.floor(elapsedSeconds / 3600);
+  const minutes = Math.floor(elapsedSeconds % 3600 / 60);
+  const seconds = elapsedSeconds % 60;
+  if (hours > 0) {
+    return `${hours.toString().padStart(2, "0")}:${minutes.toString().padStart(2, "0")}:${seconds.toString().padStart(2, "0")}`;
+  }
+  return `${minutes.toString().padStart(2, "0")}:${seconds.toString().padStart(2, "0")}`;
+}
+function formatCodexLogMessage(rawLine, source2) {
+  const parsed = tryParseJsonValue(rawLine);
+  if (parsed) {
+    const summary = summarizeCodexEvent(parsed);
+    if (summary) {
+      return summary;
+    }
+  }
+  if (source2 === "stderr") {
+    return `stderr: ${rawLine}`;
+  }
+  return rawLine;
+}
+function formatCodexJsonLog(rawLine) {
+  const parsed = tryParseJsonValue(rawLine);
+  if (!parsed) {
+    return rawLine;
+  }
+  try {
+    return JSON.stringify(parsed, null, 2);
+  } catch {
+    return rawLine;
+  }
+}
+function summarizeCodexEvent(event) {
+  if (!event || typeof event !== "object") {
+    return void 0;
+  }
+  const record = event;
+  const type = typeof record.type === "string" ? record.type : void 0;
+  let message = extractFromEvent(event) ?? extractFromItem(record.item) ?? flattenContent(record.output ?? record.content);
+  if (!message && type === JSONL_TYPE_ITEM_COMPLETED) {
+    const item = record.item;
+    if (item && typeof item === "object") {
+      const candidate = flattenContent(
+        item.text ?? item.content ?? item.output
+      );
+      if (candidate) {
+        message = candidate;
+      }
+    }
+  }
+  if (!message) {
+    const itemType = typeof record.item?.type === "string" ? record.item.type : void 0;
+    if (type && itemType) {
+      return `${type}:${itemType}`;
+    }
+    if (type) {
+      return type;
+    }
+  }
+  if (type && message) {
+    return `${type}: ${message}`;
+  }
+  if (message) {
+    return message;
+  }
+  return type;
+}
+function tryParseJsonValue(rawLine) {
+  try {
+    return JSON.parse(rawLine);
+  } catch {
+    return void 0;
+  }
+}
 async function locateExecutable(candidate) {
   const includesPathSeparator = candidate.includes("/") || candidate.includes("\\");
   if (includesPathSeparator) {
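
Note on the hunk above: the new `CodexStreamLogger` receives stdout and stderr as arbitrary chunks, so a JSONL event can be split across two chunks. It keeps the trailing partial line in a buffer and only formats complete lines. The following is a minimal standalone sketch of that buffering idea. `createLineSplitter` and `onLine` are hypothetical names used for illustration and are not part of agentv; the sketch omits the timestamp prefix and event summarizing that the real logger applies.

```javascript
// Hypothetical helper (not part of agentv): accumulates chunks and emits
// only complete, non-empty lines, mirroring the buffering in flushBuffer().
function createLineSplitter(onLine) {
  let buffer = "";
  return {
    // Append a chunk, emit every complete line, keep the partial tail.
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split(/\r?\n/);
      buffer = lines.pop() ?? "";
      for (const line of lines) {
        const trimmed = line.trim();
        if (trimmed.length > 0) {
          onLine(trimmed);
        }
      }
    },
    // Emit whatever is left once the stream has ended.
    flush() {
      const trimmed = buffer.trim();
      buffer = "";
      if (trimmed.length > 0) {
        onLine(trimmed);
      }
    }
  };
}

// Usage: a JSON event split across two chunks is emitted as one line.
const seen = [];
const splitter = createLineSplitter((line) => seen.push(line));
splitter.push('{"type":"item.comp');
splitter.push('leted"}\nplain ');
splitter.push("text\r\n\r\ntail");
splitter.flush();
console.log(seen);
```

Calling `flush()` at the end corresponds to the role of `flushRemainder()` in the diff: without it, a final line that lacks a trailing newline would be dropped.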
@@ -12487,10 +12721,12 @@ async function defaultCodexRunner(options) {
     child.stdout.setEncoding("utf8");
     child.stdout.on("data", (chunk) => {
       stdout += chunk;
+      options.onStdoutChunk?.(chunk);
     });
     child.stderr.setEncoding("utf8");
     child.stderr.on("data", (chunk) => {
       stderr += chunk;
+      options.onStderrChunk?.(chunk);
     });
     child.stdin.end(options.prompt);
     const cleanup = () => {
@@ -12730,6 +12966,8 @@ function resolveCodexConfig(target, env) {
   const argsSource = settings.args ?? settings.arguments;
   const cwdSource = settings.cwd;
   const timeoutSource = settings.timeout_seconds ?? settings.timeoutSeconds;
+  const logDirSource = settings.log_dir ?? settings.logDir ?? settings.log_directory ?? settings.logDirectory;
+  const logFormatSource = settings.log_format ?? settings.logFormat ?? settings.log_output_format ?? settings.logOutputFormat ?? env.AGENTV_CODEX_LOG_FORMAT;
   const executable = resolveOptionalString(executableSource, env, `${target.name} codex executable`, {
     allowLiteral: true,
     optionalEnv: true
@@ -12740,13 +12978,33 @@ function resolveCodexConfig(target, env) {
     optionalEnv: true
   });
   const timeoutMs = resolveTimeoutMs(timeoutSource, `${target.name} codex timeout`);
+  const logDir = resolveOptionalString(logDirSource, env, `${target.name} codex log directory`, {
+    allowLiteral: true,
+    optionalEnv: true
+  });
+  const logFormat = normalizeCodexLogFormat(logFormatSource);
   return {
     executable,
     args,
     cwd,
-    timeoutMs
+    timeoutMs,
+    logDir,
+    logFormat
   };
 }
+function normalizeCodexLogFormat(value) {
+  if (value === void 0 || value === null) {
+    return void 0;
+  }
+  if (typeof value !== "string") {
+    throw new Error("codex log format must be 'summary' or 'json'");
+  }
+  const normalized = value.trim().toLowerCase();
+  if (normalized === "json" || normalized === "summary") {
+    return normalized;
+  }
+  throw new Error("codex log format must be 'summary' or 'json'");
+}
 function resolveMockConfig(target) {
   const settings = target.settings ?? {};
   const response = typeof settings.response === "string" ? settings.response : void 0;
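
Note on the hunk above: taken together with the earlier logger hunk, the diff suggests three configuration rules. Stream logging is on unless `AGENTV_CODEX_STREAM_LOGS` is `false`, `0`, or `off`. The log format is read from the target settings first and from `AGENTV_CODEX_LOG_FORMAT` second. An unset format falls back to `summary`. The sketch below restates those rules in isolation so they can be tried out. The function names `isStreamLoggingDisabled`, `normalizeLogFormat`, and `pickLogFormat` are illustrative assumptions, not agentv exports.

```javascript
// Hypothetical restatement of the checks visible in the diff; not agentv's API.

// Logging is disabled only for the explicit values "false", "0", or "off".
function isStreamLoggingDisabled(envValue) {
  if (!envValue) {
    return false;
  }
  const normalized = envValue.trim().toLowerCase();
  return normalized === "false" || normalized === "0" || normalized === "off";
}

// Accepts "summary" or "json" (case-insensitive); rejects anything else.
function normalizeLogFormat(value) {
  if (value === undefined || value === null) {
    return undefined;
  }
  if (typeof value !== "string") {
    throw new Error("codex log format must be 'summary' or 'json'");
  }
  const normalized = value.trim().toLowerCase();
  if (normalized === "json" || normalized === "summary") {
    return normalized;
  }
  throw new Error("codex log format must be 'summary' or 'json'");
}

// Settings keys win over the environment variable; "summary" is the default.
function pickLogFormat(settings, env) {
  const source =
    settings.log_format ??
    settings.logFormat ??
    settings.log_output_format ??
    settings.logOutputFormat ??
    env.AGENTV_CODEX_LOG_FORMAT;
  return normalizeLogFormat(source) ?? "summary";
}

const fromSettings = pickLogFormat({ log_format: " JSON " }, { AGENTV_CODEX_LOG_FORMAT: "summary" });
const fromEnv = pickLogFormat({}, { AGENTV_CODEX_LOG_FORMAT: "json" });
const fallback = pickLogFormat({}, {});
console.log(fromSettings, fromEnv, fallback);
console.log(isStreamLoggingDisabled("OFF"), isStreamLoggingDisabled("true"));
```

Because only three values disable logging, a value such as `no` leaves logging enabled.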
@@ -13394,7 +13652,7 @@ var LlmJudgeEvaluator = class {
     const misses = Array.isArray(parsed.misses) ? parsed.misses.filter(isNonEmptyString).slice(0, 4) : [];
     const reasoning = parsed.reasoning ?? response.reasoning;
     const evaluatorRawRequest = {
-      id: randomUUID(),
+      id: randomUUID2(),
       provider: judgeProvider.id,
       prompt,
       target: context2.target.name,
@@ -14395,7 +14653,7 @@ function sanitizeFilename(value) {
     return "prompt";
   }
   const sanitized = value.replace(/[^A-Za-z0-9._-]+/g, "_");
-  return sanitized.length > 0 ? sanitized : randomUUID2();
+  return sanitized.length > 0 ? sanitized : randomUUID3();
 }
 async function invokeProvider(provider, options) {
   const { evalCase, promptInputs, attempt, agentTimeoutMs, signal } = options;
@@ -14756,7 +15014,7 @@ var Mutex = class {
 };
 // src/commands/eval/jsonl-writer.ts
-import { createWriteStream } from "node:fs";
+import { createWriteStream as createWriteStream2 } from "node:fs";
 import { mkdir as mkdir4 } from "node:fs/promises";
 import path10 from "node:path";
 import { finished } from "node:stream/promises";
@@ -14769,7 +15027,7 @@ var JsonlWriter = class _JsonlWriter {
   }
   static async open(filePath) {
     await mkdir4(path10.dirname(filePath), { recursive: true });
-    const stream = createWriteStream(filePath, { flags: "w", encoding: "utf8" });
+    const stream = createWriteStream2(filePath, { flags: "w", encoding: "utf8" });
     return new _JsonlWriter(stream);
   }
   async append(record) {
@@ -14798,7 +15056,7 @@ var JsonlWriter = class _JsonlWriter {
 };
 // src/commands/eval/yaml-writer.ts
-import { createWriteStream as createWriteStream2 } from "node:fs";
+import { createWriteStream as createWriteStream3 } from "node:fs";
 import { mkdir as mkdir5 } from "node:fs/promises";
 import path11 from "node:path";
 import { finished as finished2 } from "node:stream/promises";
@@ -14813,7 +15071,7 @@ var YamlWriter = class _YamlWriter {
   }
   static async open(filePath) {
     await mkdir5(path11.dirname(filePath), { recursive: true });
-    const stream = createWriteStream2(filePath, { flags: "w", encoding: "utf8" });
+    const stream = createWriteStream3(filePath, { flags: "w", encoding: "utf8" });
     return new _YamlWriter(stream);
   }
   async append(record) {
@@ -15269,7 +15527,7 @@ function normalizeNumber(value, fallback) {
   return fallback;
 }
 function normalizeOptions(rawOptions) {
-  const formatStr = normalizeString(rawOptions.format) ?? "jsonl";
+  const formatStr = normalizeString(rawOptions.outputFormat) ?? "jsonl";
   const format = formatStr === "yaml" ? "yaml" : "jsonl";
   const workers = normalizeNumber(rawOptions.workers, 0);
   return {
@@ -15489,7 +15747,11 @@ function registerEvalCommand(program) {
     "--workers <count>",
     "Number of parallel workers (default: 1, max: 50). Can also be set per-target in targets.yaml",
     (value) => parseInteger(value, 1)
-  ).option("--out <path>", "Write results to the specified path").option("--format <format>", "Output format: 'jsonl' or 'yaml' (default: jsonl)", "jsonl").option("--dry-run", "Use mock provider responses instead of real LLM calls", false).option(
+  ).option("--out <path>", "Write results to the specified path").option(
+    "--output-format <format>",
+    "Output format: 'jsonl' or 'yaml' (default: jsonl)",
+    "jsonl"
+  ).option("--dry-run", "Use mock provider responses instead of real LLM calls", false).option(
     "--dry-run-delay <ms>",
     "Fixed delay in milliseconds for dry-run mode (overridden by delay range if specified)",
     (value) => parseInteger(value, 0),
@@ -16618,4 +16880,4 @@ export {
   createProgram,
   runCli
 };
-//# sourceMappingURL=chunk-HPH4YWGU.js.map
+//# sourceMappingURL=chunk-5WBKOCCW.js.map