@alis-build/harness-eval 0.1.0 → 0.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (33)
  1. package/README.md +17 -4
  2. package/dist/adapters/claude-code/index.d.ts +1 -1
  3. package/dist/adapters/claude-code/index.js +1 -1
  4. package/dist/{claude-code-ycT0JQZF.js → claude-code-DZ4Vkgp6.js} +35 -6
  5. package/dist/{claude-code-ycT0JQZF.js.map → claude-code-DZ4Vkgp6.js.map} +1 -1
  6. package/dist/cli/bin.js +109 -12
  7. package/dist/cli/bin.js.map +1 -1
  8. package/dist/config/loader.d.ts +1 -1
  9. package/dist/config/loader.js +1 -1
  10. package/dist/{index-6Z17eKZx.d.ts → index-V22PrR0p.d.ts} +2 -1
  11. package/dist/index.d.ts +270 -152
  12. package/dist/index.js +124 -5
  13. package/dist/index.js.map +1 -0
  14. package/dist/{loader-DTvoVfN0.d.ts → loader-C9yQHUPC.d.ts} +19 -2
  15. package/dist/{loader-BCnFJ8rm.js → loader-DcI0KfRX.js} +291 -4
  16. package/dist/loader-DcI0KfRX.js.map +1 -0
  17. package/dist/{build-DsVJ_UeU.js → projections-BcX7w-f6.js} +486 -243
  18. package/dist/projections-BcX7w-f6.js.map +1 -0
  19. package/dist/runner/suite.d.ts +1 -1
  20. package/dist/runner/suite.js +1 -1
  21. package/dist/{suite-BoOvK_lq.d.ts → suite-DPJMIEbu.d.ts} +7 -2
  22. package/dist/{suite-chj0j22j.js → suite-Dlzl-HI0.js} +58 -4
  23. package/dist/suite-Dlzl-HI0.js.map +1 -0
  24. package/dist/{types-BQol062t.d.ts → types-CD3TwOtZ.d.ts} +151 -10
  25. package/package.json +4 -2
  26. package/schemas/eval-interchange-instances.schema.json +196 -0
  27. package/schemas/eval-interchange.schema.json +65 -52
  28. package/schemas/eval-run-envelope.schema.json +182 -425
  29. package/dist/build-DsVJ_UeU.js.map +0 -1
  30. package/dist/loader-BCnFJ8rm.js.map +0 -1
  31. package/dist/suite-chj0j22j.js.map +0 -1
  32. package/schemas/eval-interchange-agent-trace.schema.json +0 -322
  33. package/schemas/eval-interchange-proto-instance.schema.json +0 -106
package/README.md CHANGED
@@ -33,6 +33,8 @@ npx @alis-build/harness-eval run examples/basic.yaml --output report.json
 
  The npm package name is `@alis-build/harness-eval`; the CLI binary is `harness-eval`. With a single bin entry, `npx @alis-build/harness-eval <command>` invokes it directly.
 
+ In a git checkout of this repo, npm resolves `npx @alis-build/harness-eval` to the local package (not the registry). Run `pnpm run build` first so `dist/cli/bin.js` exists; the build links `harness-eval` into `node_modules/.bin` for local use.
+
  ### Development (clone & build)
 
  Contributors working from a git checkout:
@@ -40,7 +42,8 @@ Contributors working from a git checkout:
  ```bash
  pnpm install
  pnpm run build
- node dist/cli/bin.js --help
+ pnpm exec harness-eval --help
+ # or: node dist/cli/bin.js --help
  ```
 
  ---
@@ -93,6 +96,8 @@ cases:
 
  Generic fields (`model`, `cwd`, `timeoutMs`, `env`) sit at the top level. Claude-specific options go under `claudeCode`.
 
+ **Full suite & grading YAML reference:** [docs/suite-config.md](docs/suite-config.md) — all case/matrix fields, `reference_trajectory`, `human_ratings`, multi-file layout, and `grading.yaml` options.
+
  ### 2. Run behavioral eval
 
  ```bash
@@ -419,6 +424,8 @@ npx @alis-build/harness-eval --help
 
  Uses a standalone **`grading.yaml`** for judge model, timeout, env, and `claudeCode` flags (Option B — separate from the suite file).
 
+ **Field reference:** [docs/suite-config.md — Grading config](docs/suite-config.md#grading-config-gradingyaml)
+
  ```yaml
  # examples/grading.yaml
  judge:
@@ -464,7 +471,7 @@ npx @alis-build/harness-eval envelope report.json --suite examples/basic.yaml --
  # Interchange projections
  npx @alis-build/harness-eval envelope report.json --projection trajectory --output trajectory.jsonl
  npx @alis-build/harness-eval envelope report.json --projection instances --output instances.json
- npx @alis-build/harness-eval envelope report.json --projection agent-trace --output agent-traces.json
+ npx @alis-build/harness-eval envelope report.json --projection instances --output instances.jsonl
  ```
 
  | Option | Description |
@@ -472,7 +479,7 @@ npx @alis-build/harness-eval envelope report.json --projection agent-trace --out
  | `--output <path>` | Write output (stdout if omitted) |
  | `--grading <path>` | Merge `grading.json` outcome scores into the envelope |
  | `--suite <path>` | Suite YAML for provenance (`uri`, `contentHash`) |
- | `--projection envelope\|trajectory\|instances\|agent-trace` | Output shape (default: `envelope`) |
+ | `--projection envelope\|trajectory\|instances` | Output shape (default: `envelope`) |
  | `--include-raw-stream-events` | Include adapter raw stream events in repetition artifacts |
  | `--no-transcript` | Omit judge transcript artifacts |
 
@@ -506,6 +513,8 @@ See [Data contracts & schemas](#data-contracts--schemas) for type details.
 
  ## Suite concepts
 
+ **Authoring reference:** [docs/suite-config.md](docs/suite-config.md) — complete field list for suite YAML, matrix cells, test cases, reference trajectories, and grading config.
+
  ### Test case
 
  One prompt + assertions + optional expectations, run N times per matrix cell.
@@ -530,6 +539,10 @@ assertions:
 
  Default threshold is `1.0` (every evaluated rep must pass). Reps where the harness crashes are excluded from the denominator and counted as `adapterErrors`.
 
+ ### Reference trajectory (optional)
+
+ Define expected tool calls for Vertex trajectory metrics on the eval envelope. Use `tool_name_mode: bare` when reference steps use short tool names but the harness records MCP-prefixed names. See [docs/suite-config.md — Reference trajectory](docs/suite-config.md#reference-trajectory).
+
  **Full reference:** [docs/assertions.md](docs/assertions.md) — all assertion kinds, predicates, statistical model, and how to add new assertion types or harness adapters.
 
  ---
@@ -687,7 +700,7 @@ pnpm run typecheck
  pnpm run generate-schemas # Zod → schemas/*.schema.json only
  ```
 
- **Docs:** [Assertion DSL & adapter extension](docs/assertions.md) · [Eval record contract (DB / CI)](docs/eval-record.md)
+ **Docs:** [Suite & grading YAML](docs/suite-config.md) · [Assertion DSL & adapter extension](docs/assertions.md) · [Eval record contract (DB / CI)](docs/eval-record.md)
 
  ---
 
@@ -1,3 +1,3 @@
  import { n as AdapterError, o as ParseErrorRecord, r as AdapterResult, t as AdapterDiagnostics } from "../../types-B9H4IZtA.js";
- import { a as ClaudeCodeAdapterResult, i as ClaudeCodeAdapterConfig, o as ClaudeCodeOptions, r as runClaudeCode, s as PermissionMode, t as claudeCodeAdapter } from "../../index-6Z17eKZx.js";
+ import { a as ClaudeCodeAdapterResult, i as ClaudeCodeAdapterConfig, o as ClaudeCodeOptions, r as runClaudeCode, s as PermissionMode, t as claudeCodeAdapter } from "../../index-V22PrR0p.js";
  export { type AdapterDiagnostics, AdapterError, type AdapterResult, type ClaudeCodeAdapterConfig, type ClaudeCodeAdapterResult, type ClaudeCodeOptions, type ParseErrorRecord, type PermissionMode, claudeCodeAdapter, runClaudeCode };
@@ -1,2 +1,2 @@
- import { a as AdapterError, r as runClaudeCode, t as claudeCodeAdapter } from "../../claude-code-ycT0JQZF.js";
+ import { a as AdapterError, r as runClaudeCode, t as claudeCodeAdapter } from "../../claude-code-DZ4Vkgp6.js";
  export { AdapterError, claudeCodeAdapter, runClaudeCode };
@@ -298,10 +298,15 @@ var AdapterError = class extends Error {
  };
  //#endregion
  //#region src/adapters/claude-code/flags.ts
+ /** Append repeated `--flag value` pairs for array config fields. */
  function pushRepeatableFlag(args, flag, values) {
  if (!values) return;
  for (const value of values) args.push(flag, value);
  }
+ /**
+ * Append an optional CLI flag. Boolean `true` emits the flag alone; other
+ * scalars emit `--flag value`.
+ */
  function pushOptionalFlag(args, flag, value) {
  if (value === void 0) return;
  if (typeof value === "boolean") {
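The two flag helpers in this hunk can be sketched in TypeScript as follows. This is a hedged reconstruction from the doc comments above, not the package source: the type annotations, the `FlagValue` alias, and the treatment of `false` (emit nothing) are assumptions, because the hunk is cut off inside the boolean branch. The flag names in the usage lines are hypothetical and are not flags of any real CLI.

```typescript
// Assumed value type: the diff only says "boolean" and "other scalars".
type FlagValue = string | number | boolean | undefined;

/** Append one `--flag value` pair per array element. */
function pushRepeatableFlag(
  args: string[],
  flag: string,
  values?: readonly string[],
): void {
  if (!values) return;
  for (const value of values) args.push(flag, value);
}

/** Append an optional flag: `true` emits the bare flag, scalars emit a pair. */
function pushOptionalFlag(args: string[], flag: string, value: FlagValue): void {
  if (value === undefined) return;
  if (typeof value === "boolean") {
    // Assumption: `false` emits nothing; the diff documents only the `true` case.
    if (value) args.push(flag);
    return;
  }
  args.push(flag, String(value));
}

// Usage with hypothetical flag names.
const args: string[] = [];
pushRepeatableFlag(args, "--tag", ["a", "b"]);
pushOptionalFlag(args, "--verbose", true);
pushOptionalFlag(args, "--quiet", false);
pushOptionalFlag(args, "--limit", 3);
pushOptionalFlag(args, "--name", undefined);
console.log(args);
// → ["--tag", "a", "--tag", "b", "--verbose", "--limit", "3"]
```

Building the argument list as an array, rather than a single command string, avoids shell quoting issues when the list is later passed to a process spawner.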
@@ -360,7 +365,12 @@ function buildArgs(config) {
  appendClaudeCodeFlags(args, config);
  return args;
  }
- /** Build args for an LLM judge subprocess (`--output-format json`). */
+ /**
+ * Build args for an LLM judge subprocess (`--output-format json`).
+ *
+ * Defaults permission mode to `bypassPermissions` so the judge does not
+ * block on tool permission prompts during single-shot JSON grading.
+ */
  function buildJudgeArgs(prompt, config = {}) {
  const args = [
  "-p",
@@ -402,6 +412,14 @@ const KILL_GRACE_MS = 5e3;
  /**
  * Spawn `claude` in headless mode with isolated config and a process-group
  * lifecycle. See {@link SpawnedClaude} for how to consume the result.
+ *
+ * **Kill sequence:** timeout and abort both follow the same two-step path:
+ * `SIGTERM` to the process group, then `SIGKILL` after {@link KILL_GRACE_MS}
+ * if the group is still alive. This avoids leaving MCP/tool subprocesses
+ * running while still giving claude a chance to flush stream-json output.
+ *
+ * @param config - Adapter options; `timeoutMs`, `signal`, and `isolateConfig`
+ * control lifecycle and config isolation.
  */
  async function spawnClaude(config) {
  const binary = config.binary ?? "claude";
@@ -425,6 +443,10 @@ async function spawnClaude(config) {
  let timedOut = false;
  let killEscalation = null;
  const timeoutMs = config.timeoutMs ?? DEFAULT_TIMEOUT_MS;
+ /**
+ * Arm (or re-arm) the SIGKILL fallback. Each SIGTERM attempt gets its own
+ * grace window so a slow shutdown doesn't leave orphaned MCP servers.
+ */
  const scheduleKillEscalation = () => {
  if (killEscalation) clearTimeout(killEscalation);
  killEscalation = setTimeout(() => killTree(child, "SIGKILL"), KILL_GRACE_MS);
@@ -487,10 +509,16 @@ async function spawnClaude(config) {
  * group is already gone. This catches MCP server subprocesses and tool
  * processes spawned by claude.
  *
- * Why both? On some platforms the process group dies before we get here
- * (the child itself already cleaned up), in which case `kill(-pid)` throws
- * ESRCH. The fallback handles that edge case without leaking zombies in
- * the common case.
+ * **Signal escalation:** callers typically invoke this first with `SIGTERM`,
+ * then again with `SIGKILL` after {@link KILL_GRACE_MS}. The group kill is
+ * essential: a bare `child.kill()` would leave MCP servers running.
+ *
+ * **Platform edge case:** when the group leader exits first, `kill(-pid)`
+ * throws `ESRCH`. The single-PID fallback covers that without failing the
+ * adapter run.
+ *
+ * @param child - Spawned process handle from {@link spawn}.
+ * @param signal - POSIX signal to deliver (`SIGTERM` or `SIGKILL` in practice).
  */
  function killTree(child, signal) {
  if (child.pid === void 0) return;
@@ -553,6 +581,7 @@ async function runClaudeCode(config) {
  await spawned.cleanup();
  }
  }
+ /** Registered {@link HarnessAdapter} for Claude Code headless runs. */
  const claudeCodeAdapter = {
  id: "claude-code",
  run: runClaudeCode
@@ -560,4 +589,4 @@ const claudeCodeAdapter = {
  //#endregion
  export { isUserMessage as _, AdapterError as a, buildTrajectory as c, isResult as d, isSystemInit as f, isToolUseBlock as g, isToolResultBlock as h, buildJudgeArgs as i, namespaceOf as l, isTextBlock as m, claude_code_exports as n, parseStreamJson as o, isSystemRetry as p, runClaudeCode as r, TrajectoryBuilder as s, claudeCodeAdapter as t, isAssistantMessage as u };
 
- //# sourceMappingURL=claude-code-ycT0JQZF.js.map
+ //# sourceMappingURL=claude-code-DZ4Vkgp6.js.map
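The kill sequence described in the `spawnClaude` and `killTree` doc comments above can be sketched as follows. It is a minimal illustration under stated assumptions, not the adapter's code: `Killable`, `GroupKill`, and `terminateWithEscalation` are hypothetical names, the group signal is injected rather than hard-coded (in Node it would typically be `process.kill(-pid, signal)` on a child spawned with `detached: true`), and the single-PID fallback is assumed to run only when the group signal throws.

```typescript
type Signal = "SIGTERM" | "SIGKILL";

/** Hypothetical minimal shape of a spawned child (a stand-in for a real handle). */
interface Killable {
  pid?: number;
  kill(signal: Signal): boolean;
}

/** Hypothetical hook that signals a whole process group; throws if the group is gone. */
type GroupKill = (pid: number, signal: Signal) => void;

/** Signal the process group first; fall back to the single PID if that throws. */
function killTree(child: Killable, signal: Signal, killGroup: GroupKill): void {
  if (child.pid === undefined) return; // spawn never produced a PID
  try {
    killGroup(child.pid, signal);
  } catch {
    // Group already gone (for example ESRCH): try the child itself.
    try {
      child.kill(signal);
    } catch {
      // The process has already exited; nothing is left to signal.
    }
  }
}

/** Send SIGTERM now and SIGKILL after the grace window; returns a cancel function. */
function terminateWithEscalation(
  child: Killable,
  killGroup: GroupKill,
  graceMs: number = 5000, // mirrors KILL_GRACE_MS = 5e3 from the diff
): () => void {
  killTree(child, "SIGTERM", killGroup);
  const timer = setTimeout(() => killTree(child, "SIGKILL", killGroup), graceMs);
  return () => clearTimeout(timer);
}
```

Injecting the group signal keeps the sketch exercisable without spawning a real process. A caller would invoke the returned cancel function once the child exits, so the `SIGKILL` timer does not fire against a recycled PID.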
@@ -1 +1 @@
- {"version":3,"file":"claude-code-ycT0JQZF.js","names":[],"sources":["../src/types/stream.ts","../src/types/trajectory.ts","../src/trajectory/builder.ts","../src/parsers/stream-json.ts","../src/adapters/types.ts","../src/adapters/claude-code/flags.ts","../src/adapters/claude-code/process.ts","../src/adapters/claude-code/index.ts"],"sourcesContent":["/**\n * Discriminated union of events emitted by Claude Code's\n * `--output-format stream-json` mode.\n *\n * The format is NDJSON (one JSON object per line on stdout). Each line has\n * a required `type` field and often a `subtype` for further disambiguation.\n *\n * Source notes: the stream-json schema is not formally documented as of mid-2026.\n * These types are derived from:\n * - https://code.claude.com/docs/en/headless\n * - https://github.com/anthropics/claude-code/issues/24612 (event-types tracking issue)\n * - https://takopi.dev/reference/runners/claude/stream-json-cheatsheet/\n * - The `@anthropic-ai/claude-agent-sdk` TypeScript declaration files,\n * which are the de-facto source of truth.\n *\n * When adding new event types, prefer extending the union here rather than\n * branching on `any` in callers. Unknown events should be tolerated silently\n * by the builder (the schema evolves and we don't want CI to break on a new\n * event type we haven't modelled yet).\n */\n\n/** Top-level discriminated union of stream-json events. */\nexport type StreamEvent =\n | SystemInitEvent\n | SystemRetryEvent\n | SystemPluginInstallEvent\n | SystemCompactBoundaryEvent\n | SystemUnknownEvent\n | AssistantMessageEvent\n | UserMessageEvent\n | ResultEvent;\n\n// system events\n\n/** Emitted once at session start. Carries the session-level metadata. */\nexport interface SystemInitEvent {\n type: \"system\";\n subtype: \"init\";\n session_id: string;\n cwd: string;\n model: string;\n permissionMode?: string;\n apiKeySource?: string;\n /** Names of tools available in the session (built-in + MCP). 
*/\n tools: string[];\n /** MCP servers configured for this session, with connection status. */\n mcp_servers: McpServerStatus[];\n}\n\nexport interface McpServerStatus {\n name: string;\n status: \"connected\" | \"disconnected\" | \"error\" | string;\n}\n\n/** Emitted when the API rate-limits us or otherwise asks for a retry. */\nexport interface SystemRetryEvent {\n type: \"system\";\n subtype: \"api_retry\";\n session_id: string;\n /** Implementation-defined retry payload (delay, reason, etc). */\n [key: string]: unknown;\n}\n\n/** Emitted while marketplace plugins are installing pre-session. */\nexport interface SystemPluginInstallEvent {\n type: \"system\";\n subtype: \"plugin_install\";\n session_id: string;\n [key: string]: unknown;\n}\n\n/** Emitted when Claude Code compacts the context window mid-session. */\nexport interface SystemCompactBoundaryEvent {\n type: \"system\";\n subtype: \"compact_boundary\";\n session_id: string;\n [key: string]: unknown;\n}\n\n/**\n * Catch-all for `type: \"system\"` events we haven't modelled.\n *\n * Keeps the union exhaustive while tolerating schema evolution. Callers should\n * either explicitly handle a known subtype or fall through to ignore.\n */\nexport interface SystemUnknownEvent {\n type: \"system\";\n subtype: string;\n session_id?: string;\n [key: string]: unknown;\n}\n\n// conversational events\n\n/** One assistant turn. The `message` field mirrors the Anthropic Messages API shape. */\nexport interface AssistantMessageEvent {\n type: \"assistant\";\n session_id: string;\n message: AssistantMessage;\n}\n\nexport interface AssistantMessage {\n id: string;\n type: \"message\";\n role: \"assistant\";\n content: ContentBlock[];\n model?: string;\n stop_reason?: StopReason | null;\n usage?: Usage;\n}\n\n/**\n * A user-role message in the stream.\n *\n * In stream-json these are usually *synthetic* — the harness injects them to\n * feed tool results back into the conversation after dispatching a tool. 
The\n * very first user message (the prompt) is also emitted here for completeness.\n */\nexport interface UserMessageEvent {\n type: \"user\";\n session_id: string;\n message: UserMessage;\n}\n\nexport interface UserMessage {\n role: \"user\";\n /** String for the initial prompt, array of blocks when carrying tool results. */\n content: ContentBlock[] | string;\n}\n\n// content blocks\n\nexport type ContentBlock = TextBlock | ToolUseBlock | ToolResultBlock;\n\nexport interface TextBlock {\n type: \"text\";\n text: string;\n}\n\nexport interface ToolUseBlock {\n type: \"tool_use\";\n /** Unique id assigned by the model; used to match tool_result back to this call. */\n id: string;\n /** Tool name. MCP tools follow the convention `mcp__<server>__<tool>`. */\n name: string;\n /** Arguments the model passed. Schema is per-tool. */\n input: unknown;\n}\n\nexport interface ToolResultBlock {\n type: \"tool_result\";\n /** The id of the corresponding tool_use block. */\n tool_use_id: string;\n /** Tool output. May be plain text or further content blocks for richer tools. */\n content: string | ContentBlock[];\n is_error?: boolean;\n}\n\n// result envelope\n\n/** Emitted once at session end. Carries aggregate usage and cost. */\nexport interface ResultEvent {\n type: \"result\";\n subtype: \"success\" | \"error\";\n session_id: string;\n total_cost_usd: number;\n is_error: boolean;\n duration_ms: number;\n duration_api_ms?: number;\n num_turns: number;\n /** The final text the harness returned, if any. */\n result?: string;\n usage?: Usage;\n}\n\n// shared scalars\n\n/**\n * Reasons the model can stop a turn. 
Open-ended string union because new\n * stop reasons appear over time.\n */\nexport type StopReason =\n | \"end_turn\"\n | \"tool_use\"\n | \"max_tokens\"\n | \"stop_sequence\"\n | (string & {});\n\nexport interface Usage {\n input_tokens: number;\n output_tokens: number;\n cache_creation_input_tokens?: number;\n cache_read_input_tokens?: number;\n}\n\n// type guards\n\n/** Type guards. Prefer these over manual `e.type === \"...\"` checks at call sites. */\n\nexport function isSystemInit(e: StreamEvent): e is SystemInitEvent {\n return e.type === \"system\" && (e as SystemInitEvent).subtype === \"init\";\n}\n\nexport function isSystemRetry(e: StreamEvent): e is SystemRetryEvent {\n return e.type === \"system\" && (e as SystemRetryEvent).subtype === \"api_retry\";\n}\n\nexport function isAssistantMessage(e: StreamEvent): e is AssistantMessageEvent {\n return e.type === \"assistant\";\n}\n\nexport function isUserMessage(e: StreamEvent): e is UserMessageEvent {\n return e.type === \"user\";\n}\n\nexport function isResult(e: StreamEvent): e is ResultEvent {\n return e.type === \"result\";\n}\n\nexport function isTextBlock(b: ContentBlock): b is TextBlock {\n return b.type === \"text\";\n}\n\nexport function isToolUseBlock(b: ContentBlock): b is ToolUseBlock {\n return b.type === \"tool_use\";\n}\n\nexport function isToolResultBlock(b: ContentBlock): b is ToolResultBlock {\n return b.type === \"tool_result\";\n}\n","/**\n * TrajectoryView — the assertion-friendly projection of a Claude Code session.\n *\n * The view is derived from the stream of {@link StreamEvent} values produced by\n * the harness, but is optimized for the queries that the assertion DSL needs to\n * express:\n *\n * - did tool X get called? (look at `toolCalls`)\n * - did tool A come before tool B? (compare `turnIndex` / `callIndex`)\n * - was a tool called with arguments matching predicate P? (`toolCalls[i].args`)\n * - did the agent answer without using any tool? 
(`toolCalls.length === 0`)\n *\n * The view is reconstructable from the raw events (lossless w.r.t. assertions),\n * but operating on it directly is dramatically simpler than walking event\n * streams or OTel span trees.\n *\n * Design notes:\n * - `turnIndex` and `callIndex` are the right primitives for ordering.\n * Wall-clock timestamps from the stream are unreliable for sub-second\n * ordering and parallel tool dispatch.\n * - Parallel tool calls (multiple `tool_use` blocks in one assistant message)\n * share a `turnIndex` but have distinct `callIndex` values in emission order.\n * - `namespace` is precomputed so assertions like `called(pattern: \"mcp__api__*\")`\n * can do a cheap string check.\n */\n\nimport type { StopReason } from \"./stream\";\n\nexport interface TrajectoryView {\n /** Session metadata, captured from the `system/init` event. */\n meta: SessionMeta;\n\n /** Every tool call, in global emission order. */\n toolCalls: ToolCall[];\n\n /** Each assistant turn: text content + any tool calls emitted in that turn. */\n turns: AssistantTurn[];\n\n /** All assistant text concatenated across turns. Useful for `response_contains`. */\n finalResponse: string;\n\n /** Stop reason of the *last* assistant turn. */\n finalStopReason: StopReason | null;\n\n /** Aggregate usage and cost from the result event. */\n usage: UsageSummary;\n\n /** Retry events observed during the run (rate limits, transient errors). */\n retries: RetryRecord[];\n\n /** Whether the result envelope indicated success. */\n success: boolean;\n}\n\nexport interface SessionMeta {\n sessionId: string;\n model: string;\n cwd: string;\n permissionMode?: string;\n /** Tool names the harness reported as available at session start. */\n availableTools: string[];\n /** MCP servers configured for the session, with connection status. */\n mcpServers: { name: string; status: string }[];\n}\n\nexport interface ToolCall {\n /** Fully-qualified tool name, e.g. 
`\"mcp__api__search_skills\"` or `\"Bash\"`. */\n name: string;\n\n /**\n * Namespace prefix for MCP-style names (`\"mcp__api\"`), or null for built-ins.\n * Precomputed via {@link namespaceOf} for cheap pattern matching.\n */\n namespace: string | null;\n\n /** The `tool_use` block's `id`; matches a later `tool_result.tool_use_id`. */\n callId: string;\n\n /** Args the model emitted on this call. Tool-specific schema. */\n args: unknown;\n\n /** Tool result, or null if no result was observed (e.g. process killed). */\n result: unknown | null;\n\n /** Whether the tool reported an error in its result. */\n isError: boolean;\n\n /**\n * Which assistant turn produced this call. Parallel calls within a single\n * assistant message share a `turnIndex`.\n */\n turnIndex: number;\n\n /** Index in the global ordered tool-call sequence. */\n callIndex: number;\n}\n\nexport interface AssistantTurn {\n turnIndex: number;\n /** Text emitted in this turn (may be empty if turn was tool-only). */\n text: string;\n /** Tool calls emitted in this turn, in their block order. */\n toolCalls: ToolCall[];\n /** Stop reason reported by the model for this turn. */\n stopReason: StopReason | null;\n}\n\nexport interface UsageSummary {\n inputTokens: number;\n outputTokens: number;\n totalCostUsd: number;\n durationMs: number;\n numTurns: number;\n}\n\nexport interface RetryRecord {\n /** ms since session start (approximate; the stream doesn't include precise ts). */\n offsetMs: number;\n /** Raw payload from the `system/api_retry` event for diagnostics. */\n raw: unknown;\n}\n\n// helpers\n\n/**\n * Extract the MCP namespace prefix from a tool name.\n *\n * Claude Code formats MCP tool names as `mcp__<server>__<tool>`. The namespace\n * is the first two segments joined: `mcp__<server>`. 
Returns null for non-MCP\n * tool names (built-ins like `Bash`, `Read`, `Edit`).\n *\n * @example\n * namespaceOf(\"mcp__api__search_skills\") // \"mcp__api\"\n * namespaceOf(\"Bash\") // null\n */\nexport function namespaceOf(toolName: string): string | null {\n if (!toolName.startsWith(\"mcp__\")) return null;\n const parts = toolName.split(\"__\");\n if (parts.length < 3) return null;\n return `${parts[0]}__${parts[1]}`;\n}\n","/**\n * TrajectoryBuilder — consumes a stream of {@link StreamEvent} values and\n * produces a {@link TrajectoryView}.\n *\n * State machine: the builder is a small, tolerant state machine. Invariants:\n *\n * - Exactly one `system/init` event opens the session. The builder requires\n * it to be present before `build()`.\n * - Each `assistant` event begins a new turn. Text blocks accumulate into\n * the turn's text; `tool_use` blocks become `ToolCall` records.\n * - `user` events with `tool_result` blocks deliver tool results back. We\n * match them to pending calls by `tool_use_id`.\n * - One `result` event closes the session and carries aggregate usage.\n *\n * The builder is *tolerant of partial streams*: a process killed mid-run\n * produces a coherent (but flagged) view. Tool calls without matching results\n * keep `result: null`. The `success` flag reflects whether a successful result\n * event was actually observed.\n *\n * Why a class (not a reducer)?\n * The internal `pendingCalls` map is mutable by design — we modify ToolCall\n * objects in place when results arrive, so other parts of the view (which\n * hold references to the same objects) see the update for free. 
A reducer\n * would force a deep copy per result event, which is wasteful and would\n * complicate identity-based queries.\n */\n\nimport {\n isAssistantMessage,\n isResult,\n isSystemInit,\n isTextBlock,\n isToolResultBlock,\n isToolUseBlock,\n isUserMessage,\n type StreamEvent,\n type Usage,\n} from \"../types/stream\";\nimport {\n namespaceOf,\n type AssistantTurn,\n type RetryRecord,\n type SessionMeta,\n type ToolCall,\n type TrajectoryView,\n} from \"../types/trajectory\";\n\nexport class TrajectoryBuilder {\n private meta: SessionMeta | null = null;\n private sessionStartTs: number | null = null;\n\n private turns: AssistantTurn[] = [];\n private allToolCalls: ToolCall[] = [];\n\n /**\n * tool_use_id → ToolCall, for matching results back to calls.\n * Entries are removed once a result is observed.\n */\n private pendingCalls: Map<string, ToolCall> = new Map();\n\n private retries: RetryRecord[] = [];\n\n private finalUsage: Usage | null = null;\n private finalCostUsd = 0;\n private finalDurationMs = 0;\n private finalNumTurns = 0;\n private finalResultText = \"\";\n private sawResultEvent = false;\n private resultIsError = false;\n\n /**\n * Consume one event. Safe to call with events in stream order.\n *\n * Unknown event types are silently ignored — the schema evolves and we\n * don't want CI to break on a new event type we haven't modelled.\n */\n consume(event: StreamEvent): void {\n if (isSystemInit(event)) {\n this.meta = {\n sessionId: event.session_id,\n model: event.model,\n cwd: event.cwd,\n permissionMode: event.permissionMode,\n availableTools: event.tools ?? [],\n mcpServers: (event.mcp_servers ?? []).map((s) => ({\n name: s.name,\n status: s.status,\n })),\n };\n this.sessionStartTs = Date.now();\n return;\n }\n\n if (event.type === \"system\" && event.subtype === \"api_retry\") {\n this.retries.push({\n offsetMs: this.sessionStartTs ? 
Date.now() - this.sessionStartTs : 0,\n raw: event,\n });\n return;\n }\n\n if (isAssistantMessage(event)) {\n this.handleAssistantMessage(event);\n return;\n }\n\n if (isUserMessage(event)) {\n this.handleUserMessage(event);\n return;\n }\n\n if (isResult(event)) {\n this.sawResultEvent = true;\n this.resultIsError = event.is_error;\n this.finalUsage = event.usage ?? null;\n this.finalCostUsd = event.total_cost_usd ?? 0;\n this.finalDurationMs = event.duration_ms ?? 0;\n this.finalNumTurns = event.num_turns ?? 0;\n this.finalResultText = event.result ?? \"\";\n return;\n }\n\n // Unknown event: ignored. See class doc.\n }\n\n /**\n * Finalize the view. Call after consuming the last event from the stream.\n *\n * Throws if no `system/init` was observed — at that point we have no model,\n * no session id, and no available-tools list, which means assertions like\n * \"called any mcp__api__* tool\" can't even be evaluated meaningfully.\n */\n build(): TrajectoryView {\n if (this.meta === null) {\n throw new Error(\n \"TrajectoryBuilder.build() called before any system/init event was observed. \" +\n \"The harness may have failed to start, or the stream was truncated before init.\",\n );\n }\n\n const lastTurn = this.turns[this.turns.length - 1];\n\n // Prefer the assistant text we accumulated turn-by-turn over the\n // `result.result` field, because the latter is sometimes a summary\n // and the former is exactly what the model said.\n const accumulatedText = this.turns\n .map((t) => t.text)\n .filter((t) => t.length > 0)\n .join(\"\\n\\n\")\n .trim();\n\n return {\n meta: this.meta,\n toolCalls: this.allToolCalls,\n turns: this.turns,\n finalResponse: accumulatedText || this.finalResultText,\n finalStopReason: lastTurn?.stopReason ?? null,\n usage: {\n inputTokens: this.finalUsage?.input_tokens ?? 0,\n outputTokens: this.finalUsage?.output_tokens ?? 
0,\n totalCostUsd: this.finalCostUsd,\n durationMs: this.finalDurationMs,\n // Fall back to observed turn count if the result event was missing.\n numTurns: this.finalNumTurns || this.turns.length,\n },\n retries: this.retries,\n // Successful = saw a non-error result envelope. Streams that ended without\n // a result event are reported as unsuccessful regardless of tool outcomes.\n success: this.sawResultEvent && !this.resultIsError,\n };\n }\n\n // private handlers\n\n private handleAssistantMessage(\n event: Extract<StreamEvent, { type: \"assistant\" }>,\n ): void {\n const turnIndex = this.turns.length;\n const textChunks: string[] = [];\n const toolCallsThisTurn: ToolCall[] = [];\n\n for (const block of event.message.content) {\n if (isTextBlock(block)) {\n textChunks.push(block.text);\n continue;\n }\n if (isToolUseBlock(block)) {\n const call: ToolCall = {\n name: block.name,\n namespace: namespaceOf(block.name),\n callId: block.id,\n args: block.input,\n result: null,\n isError: false,\n turnIndex,\n callIndex: this.allToolCalls.length,\n };\n this.allToolCalls.push(call);\n this.pendingCalls.set(block.id, call);\n toolCallsThisTurn.push(call);\n continue;\n }\n // tool_result blocks don't appear in assistant messages — those arrive\n // via user messages. If one does appear, ignore it; we'd rather drop\n // an unexpected block than crash the eval.\n }\n\n this.turns.push({\n turnIndex,\n text: textChunks.join(\"\").trim(),\n toolCalls: toolCallsThisTurn,\n stopReason: event.message.stop_reason ?? null,\n });\n }\n\n private handleUserMessage(\n event: Extract<StreamEvent, { type: \"user\" }>,\n ): void {\n const content = event.message.content;\n\n // The very first user message carries the prompt as a plain string. 
We\n // already know the prompt (the caller passed it to the adapter), so we\n // ignore this case — there's nothing assertion-relevant in it.\n if (typeof content === \"string\") return;\n\n for (const block of content) {\n if (!isToolResultBlock(block)) continue;\n\n const call = this.pendingCalls.get(block.tool_use_id);\n if (!call) {\n // Unmatched result: ignore. Can happen if events arrive out of order\n // or the corresponding tool_use was emitted in an earlier run that\n // we're resuming. Either way, dropping is safer than throwing.\n continue;\n }\n\n call.result = block.content;\n call.isError = block.is_error ?? false;\n this.pendingCalls.delete(block.tool_use_id);\n }\n }\n}\n\n/**\n * Convenience: drain an async iterable of events through a fresh builder.\n *\n * Suitable when you have the full event stream and just want the view.\n * For interactive/incremental scenarios (e.g. surfacing partial state in a UI)\n * instantiate {@link TrajectoryBuilder} directly and call `consume()` /\n * `build()` yourself.\n */\nexport async function buildTrajectory(\n events: AsyncIterable<StreamEvent>,\n): Promise<TrajectoryView> {\n const builder = new TrajectoryBuilder();\n for await (const event of events) {\n builder.consume(event);\n }\n return builder.build();\n}\n","/**\n * Line-buffered NDJSON parser for Claude Code's `--output-format stream-json`.\n *\n * Claude Code emits one JSON object per line on stdout. The parser:\n * - buffers across chunk boundaries (a single JSON line may arrive in two reads)\n * - skips empty lines (defensive — shouldn't occur, but harmless if it does)\n * - emits a discriminated `ParseResult` per line so callers can decide whether\n * a malformed line should abort the run or just be logged.\n *\n * Why a generator (and not a Transform stream)?\n * The eval adapter consumes events sequentially and synchronously updates a\n * builder. 
Async iteration is the simplest interface for that pattern and\n * composes cleanly with `for await` in the adapter. A Transform would force\n * the builder into event-handler style.\n */\n\nimport type { Readable } from \"node:stream\";\nimport type { StreamEvent } from \"../types/stream\";\n\n/**\n * Result of attempting to parse a single line.\n *\n * Successful parses yield `{ ok: true }` with the typed event and the raw line\n * (kept for diagnostics and OTel `events.attributes.raw`). Failed parses yield\n * `{ ok: false }` with the parse error and the raw line — callers can log,\n * skip, or fail the run as they see fit.\n */\nexport type ParseResult =\n | { ok: true; event: StreamEvent; rawLine: string }\n | { ok: false; error: Error; rawLine: string };\n\n/**\n * Parse a readable stream of NDJSON into a sequence of typed stream-json events.\n *\n * @example\n * const child = spawn(\"claude\", [\"-p\", prompt, \"--output-format\", \"stream-json\", \"--verbose\"]);\n * for await (const result of parseStreamJson(child.stdout)) {\n * if (result.ok) builder.consume(result.event);\n * else console.warn(\"malformed stream line:\", result.rawLine, result.error);\n * }\n */\nexport async function* parseStreamJson(\n stream: Readable,\n): AsyncGenerator<ParseResult, void, void> {\n let buffer = \"\";\n // The Node child_process stdout is a binary stream by default. Setting the\n // encoding here means `for await (const chunk of stream)` yields strings.\n stream.setEncoding(\"utf8\");\n\n for await (const chunk of stream) {\n buffer += chunk as string;\n\n // Drain every complete line currently in the buffer before reading more.\n // Multiple JSON objects can arrive in one chunk (e.g. 
when the harness\n // emits a burst of events at session start).\n let newlineIdx: number;\n while ((newlineIdx = buffer.indexOf(\"\\n\")) !== -1) {\n const line = buffer.slice(0, newlineIdx).trim();\n buffer = buffer.slice(newlineIdx + 1);\n if (line.length === 0) continue;\n yield tryParseLine(line);\n }\n }\n\n // Flush any trailing content that arrived without a final newline. Stream-json\n // typically ends with a newline-terminated `result` event, but a killed\n // process may not flush, so we still try to emit what we have.\n const trailing = buffer.trim();\n if (trailing.length > 0) {\n yield tryParseLine(trailing);\n }\n}\n\n/**\n * Parse a single line. Extracted as a helper so the generator stays readable.\n *\n * Note: we do not validate the event structure beyond `JSON.parse`. Runtime\n * validation (e.g. zod) is overkill here — the schema is stable enough at\n * runtime, and the TrajectoryBuilder is tolerant of missing fields. Adding\n * validation would be premature.\n */\nfunction tryParseLine(line: string): ParseResult {\n try {\n const event = JSON.parse(line) as StreamEvent;\n return { ok: true, event, rawLine: line };\n } catch (err) {\n return {\n ok: false,\n error: err instanceof Error ? err : new Error(String(err)),\n rawLine: line,\n };\n }\n}\n","/**\n * Generic harness adapter contract.\n *\n * Every harness adapter produces a {@link TrajectoryView} plus process\n * diagnostics. The runner and assertion engine depend only on these types —\n * not on any specific harness implementation.\n */\n\nimport type { TrajectoryView } from \"../types/trajectory\";\n\n/** Base config every adapter must accept. */\nexport interface BaseAdapterConfig {\n prompt: string;\n model?: string;\n timeoutMs?: number;\n signal?: AbortSignal;\n env?: Record<string, string>;\n cwd?: string;\n}\n\n/** Suite-level config: generic fields plus adapter-specific nested blocks. 
*/\nexport type SuiteConfig = Partial<BaseAdapterConfig> & {\n /** Claude Code adapter options (when `adapter` is `claude-code`). */\n claudeCode?: Record<string, unknown>;\n};\n\n/** Generic harness adapter interface. */\nexport interface HarnessAdapter<\n TConfig extends BaseAdapterConfig = BaseAdapterConfig,\n> {\n readonly id: string;\n run(config: TConfig): Promise<AdapterResult>;\n}\n\n/** Successful adapter run. */\nexport interface AdapterResult {\n view: TrajectoryView;\n diagnostics: AdapterDiagnostics;\n}\n\n/** Process-level diagnostics from any adapter. */\nexport interface AdapterDiagnostics {\n exitCode: number | null;\n signal: NodeJS.Signals | null;\n stderr: string;\n parseErrors: ParseErrorRecord[];\n timedOut: boolean;\n durationMs: number;\n}\n\nexport interface ParseErrorRecord {\n line: string;\n error: string;\n}\n\n/**\n * Thrown when the harness fails to produce a usable trajectory.\n *\n * Most commonly this means the process failed before emitting a usable\n * session init event. Inspect `diagnostics.stderr` for the cause.\n */\nexport class AdapterError extends Error {\n constructor(\n message: string,\n public readonly diagnostics: Partial<AdapterDiagnostics>,\n ) {\n super(message);\n this.name = \"AdapterError\";\n }\n}\n","/**\n * Build CLI args for Claude Code judge subprocesses (JSON output, not stream-json).\n */\n\nimport type { ClaudeCodeAdapterConfig, ClaudeCodeOptions } from \"./types\";\n\nfunction pushRepeatableFlag(args: string[], flag: string, values?: string[]): void {\n if (!values) return;\n for (const value of values) {\n args.push(flag, value);\n }\n}\n\nfunction pushOptionalFlag(\n args: string[],\n flag: string,\n value: string | number | boolean | undefined,\n): void {\n if (value === undefined) return;\n if (typeof value === \"boolean\") {\n if (value) args.push(flag);\n return;\n }\n args.push(flag, String(value));\n}\n\n/** Append Claude Code CLI flags shared by harness runs and grading judges. 
*/\nexport function appendClaudeCodeFlags(\n args: string[],\n config: ClaudeCodeOptions & { model?: string },\n): void {\n pushRepeatableFlag(args, \"--plugin-dir\", config.pluginDirs);\n pushRepeatableFlag(args, \"--plugin-url\", config.pluginUrls);\n pushRepeatableFlag(args, \"--add-dir\", config.addDirs);\n\n pushOptionalFlag(args, \"--mcp-config\", config.mcpConfig);\n pushOptionalFlag(args, \"--model\", config.model);\n pushOptionalFlag(args, \"--permission-mode\", config.permissionMode);\n pushOptionalFlag(args, \"--effort\", config.effort);\n pushOptionalFlag(args, \"--agent\", config.agent);\n pushOptionalFlag(args, \"--fallback-model\", config.fallbackModel);\n pushOptionalFlag(args, \"--tools\", config.tools);\n pushOptionalFlag(args, \"--settings\", config.settings);\n pushOptionalFlag(args, \"--setting-sources\", config.settingSources);\n pushOptionalFlag(args, \"--max-turns\", config.maxTurns);\n pushOptionalFlag(args, \"--max-budget-usd\", config.maxBudgetUsd);\n pushOptionalFlag(args, \"--system-prompt\", config.systemPrompt);\n pushOptionalFlag(args, \"--system-prompt-file\", config.systemPromptFile);\n pushOptionalFlag(args, \"--append-system-prompt\", config.appendSystemPrompt);\n pushOptionalFlag(\n args,\n \"--append-system-prompt-file\",\n config.appendSystemPromptFile,\n );\n pushOptionalFlag(args, \"--debug\", config.debug);\n pushOptionalFlag(args, \"--debug-file\", config.debugFile);\n\n if (config.allowedTools && config.allowedTools.length > 0) {\n args.push(\"--allowedTools\", config.allowedTools.join(\",\"));\n }\n\n if (config.disallowedTools && config.disallowedTools.length > 0) {\n args.push(\"--disallowedTools\", config.disallowedTools.join(\",\"));\n }\n\n pushOptionalFlag(args, \"--strict-mcp-config\", config.strictMcpConfig);\n pushOptionalFlag(args, \"--include-hook-events\", config.includeHookEvents);\n pushOptionalFlag(args, \"--no-session-persistence\", config.noSessionPersistence);\n pushOptionalFlag(args, 
\"--disable-slash-commands\", config.disableSlashCommands);\n pushOptionalFlag(args, \"--bare\", config.bare);\n pushOptionalFlag(args, \"--safe-mode\", config.safeMode);\n pushOptionalFlag(\n args,\n \"--allow-dangerously-skip-permissions\",\n config.allowDangerouslySkipPermissions,\n );\n pushOptionalFlag(\n args,\n \"--dangerously-skip-permissions\",\n config.dangerouslySkipPermissions,\n );\n}\n\n/**\n * Build the argument vector for spawning `claude`.\n *\n * Order matters only for flags that take values — value flags must come\n * after their flag name. Everything else is order-independent.\n */\nexport function buildArgs(config: ClaudeCodeAdapterConfig): string[] {\n const args: string[] = [\n \"-p\",\n config.prompt,\n \"--output-format\",\n \"stream-json\",\n \"--verbose\",\n ];\n\n appendClaudeCodeFlags(args, config);\n\n return args;\n}\n\n/** Build args for an LLM judge subprocess (`--output-format json`). */\nexport function buildJudgeArgs(\n prompt: string,\n config: ClaudeCodeOptions & { model?: string } = {},\n): string[] {\n const args: string[] = [\"-p\", prompt, \"--output-format\", \"json\"];\n const permissionMode = config.permissionMode ?? \"bypassPermissions\";\n appendClaudeCodeFlags(args, {\n ...config,\n permissionMode,\n });\n return args;\n}\n","/**\n * Process management for the Claude Code adapter.\n *\n * This module owns spawning, timeout, abort signal handling, and process-tree\n * teardown. The orchestrator (`index.ts`) consumes the returned handle —\n * reading stdout and waiting for completion — but doesn't worry about how\n * the process gets killed or how its config gets isolated.\n *\n * Why a separate module? Process management is the one part of the adapter\n * with real I/O complexity (process groups, signal escalation, temp-dir\n * lifecycle, env merging). 
Isolating it makes the orchestrator easy to read\n * and lets us swap the spawning logic if we later need to, e.g., wrap claude\n * in a sandbox runner.\n */\n\nimport { spawn, type ChildProcess } from \"node:child_process\";\nimport { mkdtemp, rm } from \"node:fs/promises\";\nimport { tmpdir } from \"node:os\";\nimport { join } from \"node:path\";\nimport type { Readable } from \"node:stream\";\n\nimport { buildArgs } from \"./flags\";\nimport type { ClaudeCodeAdapterConfig } from \"./types\";\n\n/** Default hard timeout per run. Tunable via config.timeoutMs. */\nconst DEFAULT_TIMEOUT_MS = 5 * 60 * 1000;\n\n/**\n * Grace period between SIGTERM and SIGKILL. Most processes shut down cleanly\n * within a few seconds; this gives them that chance while preventing CI from\n * hanging indefinitely on a stuck child.\n */\nconst KILL_GRACE_MS = 5_000;\n\n/**\n * Handle to a spawned `claude` process. The orchestrator drives it:\n * - Read `stdout` (typically via parseStreamJson).\n * - Await `done` to learn the exit state.\n * - Await `stderrCollected` for diagnostic stderr.\n * - Check `timedOut()` after exit to distinguish kill-by-timeout from\n * normal termination.\n * - Call `cleanup()` after all of the above to remove the temp config dir.\n */\nexport interface SpawnedClaude {\n stdout: Readable;\n done: Promise<{ exitCode: number | null; signal: NodeJS.Signals | null }>;\n stderrCollected: Promise<string>;\n timedOut: () => boolean;\n cleanup: () => Promise<void>;\n}\n\n/**\n * Spawn `claude` in headless mode with isolated config and a process-group\n * lifecycle. See {@link SpawnedClaude} for how to consume the result.\n */\nexport async function spawnClaude(\n config: ClaudeCodeAdapterConfig,\n): Promise<SpawnedClaude> {\n const binary = config.binary ?? \"claude\";\n const args = buildArgs(config);\n\n const isolateConfig = config.isolateConfig !== false;\n\n // Isolated runs use a fresh temp dir so plugins/settings don't leak between\n // reps. 
Non-isolated runs inherit the caller's Claude login and plugins.\n const tempConfigDir = isolateConfig\n ? await mkdtemp(join(tmpdir(), \"harness-eval-\"))\n : null;\n\n const env: Record<string, string | undefined> = {\n ...process.env,\n ...config.env,\n };\n if (tempConfigDir) {\n // Override after ...env so callers can't accidentally un-isolate.\n env.CLAUDE_CONFIG_DIR = tempConfigDir;\n }\n\n const child = spawn(binary, args, {\n cwd: config.cwd ?? process.cwd(),\n env,\n stdio: [\"ignore\", \"pipe\", \"pipe\"],\n // detached: true means the child becomes the leader of its own process\n // group. We exploit this to kill the entire group (including any MCP\n // server subprocesses and tool processes) on timeout/abort.\n detached: true,\n });\n\n\n let timedOut = false;\n let killEscalation: NodeJS.Timeout | null = null;\n const timeoutMs = config.timeoutMs ?? DEFAULT_TIMEOUT_MS;\n\n const scheduleKillEscalation = () => {\n if (killEscalation) clearTimeout(killEscalation);\n killEscalation = setTimeout(\n () => killTree(child, \"SIGKILL\"),\n KILL_GRACE_MS,\n );\n };\n\n const timeoutTimer = setTimeout(() => {\n timedOut = true;\n killTree(child, \"SIGTERM\");\n scheduleKillEscalation();\n }, timeoutMs);\n\n // External cancellation uses the same kill path.\n const onAbort = () => {\n killTree(child, \"SIGTERM\");\n scheduleKillEscalation();\n };\n config.signal?.addEventListener(\"abort\", onAbort, { once: true });\n\n\n // Drain stderr eagerly so the OS-level buffer never fills and stalls the\n // child (Node child processes will block on a full pipe).\n const stderrChunks: string[] = [];\n child.stderr?.setEncoding(\"utf8\");\n child.stderr?.on(\"data\", (chunk: string) => {\n stderrChunks.push(chunk);\n });\n\n const stderrCollected = new Promise<string>((resolve) => {\n const finalize = () => resolve(stderrChunks.join(\"\"));\n child.stderr?.on(\"end\", finalize);\n // Errors during stderr capture shouldn't fail the whole run; we just\n // return what we've 
buffered so far.\n child.stderr?.on(\"error\", finalize);\n });\n\n\n const done = new Promise<{\n exitCode: number | null;\n signal: NodeJS.Signals | null;\n }>((resolve) => {\n let settled = false;\n const finalize = (\n exitCode: number | null,\n signal: NodeJS.Signals | null,\n ) => {\n if (settled) return;\n settled = true;\n clearTimeout(timeoutTimer);\n if (killEscalation) clearTimeout(killEscalation);\n config.signal?.removeEventListener(\"abort\", onAbort);\n resolve({ exitCode, signal });\n };\n\n child.on(\"close\", (code, signal) => finalize(code, signal));\n // ENOENT and other spawn failures emit `error` — `close` may not follow.\n child.on(\"error\", () => finalize(null, null));\n });\n\n\n const cleanup = async () => {\n if (!tempConfigDir) return;\n try {\n await rm(tempConfigDir, { recursive: true, force: true });\n } catch {\n // Best-effort. A leftover temp dir is annoying but not catastrophic;\n // we don't want to fail the run for it.\n }\n };\n\n // stdout is guaranteed non-null because we passed `stdio: [..., \"pipe\", ...]`.\n // The `!` is safe; the alternative would be a redundant runtime check that\n // could never fire.\n return {\n stdout: child.stdout!,\n done,\n stderrCollected,\n timedOut: () => timedOut,\n cleanup,\n };\n}\n\n/**\n * Kill the child's process group, then fall back to the bare PID if the\n * group is already gone. This catches MCP server subprocesses and tool\n * processes spawned by claude.\n *\n * Why both? On some platforms the process group dies before we get here\n * (the child itself already cleaned up), in which case `kill(-pid)` throws\n * ESRCH. The fallback handles that edge case without leaking zombies in\n * the common case.\n */\nfunction killTree(child: ChildProcess, signal: NodeJS.Signals): void {\n if (child.pid === undefined) return;\n try {\n // Negative PID = process group. 
Requires the child was spawned with\n // detached=true (which we do).\n process.kill(-child.pid, signal);\n } catch {\n try {\n child.kill(signal);\n } catch {\n // Nothing left to kill.\n }\n }\n}\n","/**\n * Claude Code adapter — public API.\n */\n\nimport { parseStreamJson } from \"../../parsers/stream-json\";\nimport { TrajectoryBuilder } from \"../../trajectory/builder\";\nimport type { StreamEvent } from \"../../types/stream\";\n\nimport { AdapterError } from \"../types\";\nimport { spawnClaude } from \"./process\";\nimport type {\n AdapterDiagnostics,\n ClaudeCodeAdapterConfig,\n ClaudeCodeAdapterResult,\n ParseErrorRecord,\n} from \"./types\";\nimport type { HarnessAdapter } from \"../types\";\n\nexport { AdapterError } from \"../types\";\nexport type {\n AdapterDiagnostics,\n AdapterResult,\n ClaudeCodeAdapterConfig,\n ClaudeCodeAdapterResult,\n ClaudeCodeOptions,\n ParseErrorRecord,\n PermissionMode,\n} from \"./types\";\n\n/**\n * Run Claude Code in headless mode and return a trajectory.\n */\nexport async function runClaudeCode(\n config: ClaudeCodeAdapterConfig,\n): Promise<ClaudeCodeAdapterResult> {\n const startTs = Date.now();\n const spawned = await spawnClaude(config);\n\n const builder = new TrajectoryBuilder();\n const rawEvents: StreamEvent[] = [];\n const parseErrors: ParseErrorRecord[] = [];\n\n try {\n for await (const result of parseStreamJson(spawned.stdout)) {\n if (result.ok) {\n builder.consume(result.event);\n rawEvents.push(result.event);\n } else {\n parseErrors.push({\n line: result.rawLine,\n error: result.error.message,\n });\n }\n }\n\n const [{ exitCode, signal }, stderr] = await Promise.all([\n spawned.done,\n spawned.stderrCollected,\n ]);\n\n const diagnostics: AdapterDiagnostics = {\n exitCode,\n signal,\n stderr,\n parseErrors,\n timedOut: spawned.timedOut(),\n durationMs: Date.now() - startTs,\n };\n\n let view;\n try {\n view = builder.build();\n } catch (err) {\n const message = err instanceof Error ? 
err.message : String(err);\n throw new AdapterError(\n `harness produced no usable trajectory: ${message}`,\n diagnostics,\n );\n }\n\n return { view, diagnostics, rawEvents };\n } finally {\n await spawned.cleanup();\n }\n}\n\nexport const claudeCodeAdapter: HarnessAdapter<ClaudeCodeAdapterConfig> = {\n id: \"claude-code\",\n run: runClaudeCode,\n};\n"],"mappings":";;;;;;;AAuMA,SAAgB,aAAa,GAAsC;CACjE,OAAO,EAAE,SAAS,YAAa,EAAsB,YAAY;AACnE;AAEA,SAAgB,cAAc,GAAuC;CACnE,OAAO,EAAE,SAAS,YAAa,EAAuB,YAAY;AACpE;AAEA,SAAgB,mBAAmB,GAA4C;CAC7E,OAAO,EAAE,SAAS;AACpB;AAEA,SAAgB,cAAc,GAAuC;CACnE,OAAO,EAAE,SAAS;AACpB;AAEA,SAAgB,SAAS,GAAkC;CACzD,OAAO,EAAE,SAAS;AACpB;AAEA,SAAgB,YAAY,GAAiC;CAC3D,OAAO,EAAE,SAAS;AACpB;AAEA,SAAgB,eAAe,GAAoC;CACjE,OAAO,EAAE,SAAS;AACpB;AAEA,SAAgB,kBAAkB,GAAuC;CACvE,OAAO,EAAE,SAAS;AACpB;;;;;;;;;;;;;;AC9FA,SAAgB,YAAY,UAAiC;CAC3D,IAAI,CAAC,SAAS,WAAW,OAAO,GAAG,OAAO;CAC1C,MAAM,QAAQ,SAAS,MAAM,IAAI;CACjC,IAAI,MAAM,SAAS,GAAG,OAAO;CAC7B,OAAO,GAAG,MAAM,GAAG,IAAI,MAAM;AAC/B;;;;;;;;;;;;;;;;;;;;;;;;;;;;;AC7FA,IAAa,oBAAb,MAA+B;CAC7B,OAAmC;CACnC,iBAAwC;CAExC,QAAiC,CAAC;CAClC,eAAmC,CAAC;;;;;CAMpC,+BAA8C,IAAI,IAAI;CAEtD,UAAiC,CAAC;CAElC,aAAmC;CACnC,eAAuB;CACvB,kBAA0B;CAC1B,gBAAwB;CACxB,kBAA0B;CAC1B,iBAAyB;CACzB,gBAAwB;;;;;;;CAQxB,QAAQ,OAA0B;EAChC,IAAI,aAAa,KAAK,GAAG;GACvB,KAAK,OAAO;IACV,WAAW,MAAM;IACjB,OAAO,MAAM;IACb,KAAK,MAAM;IACX,gBAAgB,MAAM;IACtB,gBAAgB,MAAM,SAAS,CAAC;IAChC,aAAa,MAAM,eAAe,CAAC,EAAA,CAAG,KAAK,OAAO;KAChD,MAAM,EAAE;KACR,QAAQ,EAAE;IACZ,EAAE;GACJ;GACA,KAAK,iBAAiB,KAAK,IAAI;GAC/B;EACF;EAEA,IAAI,MAAM,SAAS,YAAY,MAAM,YAAY,aAAa;GAC5D,KAAK,QAAQ,KAAK;IAChB,UAAU,KAAK,iBAAiB,KAAK,IAAI,IAAI,KAAK,iBAAiB;IACnE,KAAK;GACP,CAAC;GACD;EACF;EAEA,IAAI,mBAAmB,KAAK,GAAG;GAC7B,KAAK,uBAAuB,KAAK;GACjC;EACF;EAEA,IAAI,cAAc,KAAK,GAAG;GACxB,KAAK,kBAAkB,KAAK;GAC5B;EACF;EAEA,IAAI,SAAS,KAAK,GAAG;GACnB,KAAK,iBAAiB;GACtB,KAAK,gBAAgB,MAAM;GAC3B,KAAK,aAAa,MAAM,SAAS;GACjC,KAAK,eAAe,MAAM,kBAAkB;GAC5C,KAAK,kBAAkB,MAAM,eAAe;GAC5C,KAAK,gBAAgB,MAAM,aAAa;GACxC,KAAK,kBAAkB,MAAM,UAAU;GACvC;EACF;CAGF;;;;
;;;;CASA,QAAwB;EACtB,IAAI,KAAK,SAAS,MAChB,MAAM,IAAI,MACR,4JAEF;EAGF,MAAM,WAAW,KAAK,MAAM,KAAK,MAAM,SAAS;EAKhD,MAAM,kBAAkB,KAAK,MAC1B,KAAK,MAAM,EAAE,IAAI,CAAC,CAClB,QAAQ,MAAM,EAAE,SAAS,CAAC,CAAC,CAC3B,KAAK,MAAM,CAAC,CACZ,KAAK;EAER,OAAO;GACL,MAAM,KAAK;GACX,WAAW,KAAK;GAChB,OAAO,KAAK;GACZ,eAAe,mBAAmB,KAAK;GACvC,iBAAiB,UAAU,cAAc;GACzC,OAAO;IACL,aAAa,KAAK,YAAY,gBAAgB;IAC9C,cAAc,KAAK,YAAY,iBAAiB;IAChD,cAAc,KAAK;IACnB,YAAY,KAAK;IAEjB,UAAU,KAAK,iBAAiB,KAAK,MAAM;GAC7C;GACA,SAAS,KAAK;GAGd,SAAS,KAAK,kBAAkB,CAAC,KAAK;EACxC;CACF;CAIA,uBACE,OACM;EACN,MAAM,YAAY,KAAK,MAAM;EAC7B,MAAM,aAAuB,CAAC;EAC9B,MAAM,oBAAgC,CAAC;EAEvC,KAAK,MAAM,SAAS,MAAM,QAAQ,SAAS;GACzC,IAAI,YAAY,KAAK,GAAG;IACtB,WAAW,KAAK,MAAM,IAAI;IAC1B;GACF;GACA,IAAI,eAAe,KAAK,GAAG;IACzB,MAAM,OAAiB;KACrB,MAAM,MAAM;KACZ,WAAW,YAAY,MAAM,IAAI;KACjC,QAAQ,MAAM;KACd,MAAM,MAAM;KACZ,QAAQ;KACR,SAAS;KACT;KACA,WAAW,KAAK,aAAa;IAC/B;IACA,KAAK,aAAa,KAAK,IAAI;IAC3B,KAAK,aAAa,IAAI,MAAM,IAAI,IAAI;IACpC,kBAAkB,KAAK,IAAI;IAC3B;GACF;EAIF;EAEA,KAAK,MAAM,KAAK;GACd;GACA,MAAM,WAAW,KAAK,EAAE,CAAC,CAAC,KAAK;GAC/B,WAAW;GACX,YAAY,MAAM,QAAQ,eAAe;EAC3C,CAAC;CACH;CAEA,kBACE,OACM;EACN,MAAM,UAAU,MAAM,QAAQ;EAK9B,IAAI,OAAO,YAAY,UAAU;EAEjC,KAAK,MAAM,SAAS,SAAS;GAC3B,IAAI,CAAC,kBAAkB,KAAK,GAAG;GAE/B,MAAM,OAAO,KAAK,aAAa,IAAI,MAAM,WAAW;GACpD,IAAI,CAAC,MAIH;GAGF,KAAK,SAAS,MAAM;GACpB,KAAK,UAAU,MAAM,YAAY;GACjC,KAAK,aAAa,OAAO,MAAM,WAAW;EAC5C;CACF;AACF;;;;;;;;;AAUA,eAAsB,gBACpB,QACyB;CACzB,MAAM,UAAU,IAAI,kBAAkB;CACtC,WAAW,MAAM,SAAS,QACxB,QAAQ,QAAQ,KAAK;CAEvB,OAAO,QAAQ,MAAM;AACvB;;;;;;;;;;;;;AC1NA,gBAAuB,gBACrB,QACyC;CACzC,IAAI,SAAS;CAGb,OAAO,YAAY,MAAM;CAEzB,WAAW,MAAM,SAAS,QAAQ;EAChC,UAAU;EAKV,IAAI;EACJ,QAAQ,aAAa,OAAO,QAAQ,IAAI,OAAO,IAAI;GACjD,MAAM,OAAO,OAAO,MAAM,GAAG,UAAU,CAAC,CAAC,KAAK;GAC9C,SAAS,OAAO,MAAM,aAAa,CAAC;GACpC,IAAI,KAAK,WAAW,GAAG;GACvB,MAAM,aAAa,IAAI;EACzB;CACF;CAKA,MAAM,WAAW,OAAO,KAAK;CAC7B,IAAI,SAAS,SAAS,GACpB,MAAM,aAAa,QAAQ;AAE/B;;;;;;;;;AAUA,SAAS,aAAa,MAA2B;CAC/C,IAAI;EAEF,OAAO;GAAE,IAAI;GAAM,OADL,KAAK,MAAM,IACF;GAAG,SAAS;EAAK;CAC1C,SAAS,KAAK;EACZ,OAAO;GAC
L,IAAI;GACJ,OAAO,eAAe,QAAQ,MAAM,IAAI,MAAM,OAAO,GAAG,CAAC;GACzD,SAAS;EACX;CACF;AACF;;;;;;;;;AC/BA,IAAa,eAAb,cAAkC,MAAM;CAGpB;CAFlB,YACE,SACA,aACA;EACA,MAAM,OAAO;EAFG,KAAA,cAAA;EAGhB,KAAK,OAAO;CACd;AACF;;;AC/DA,SAAS,mBAAmB,MAAgB,MAAc,QAAyB;CACjF,IAAI,CAAC,QAAQ;CACb,KAAK,MAAM,SAAS,QAClB,KAAK,KAAK,MAAM,KAAK;AAEzB;AAEA,SAAS,iBACP,MACA,MACA,OACM;CACN,IAAI,UAAU,KAAA,GAAW;CACzB,IAAI,OAAO,UAAU,WAAW;EAC9B,IAAI,OAAO,KAAK,KAAK,IAAI;EACzB;CACF;CACA,KAAK,KAAK,MAAM,OAAO,KAAK,CAAC;AAC/B;;AAGA,SAAgB,sBACd,MACA,QACM;CACN,mBAAmB,MAAM,gBAAgB,OAAO,UAAU;CAC1D,mBAAmB,MAAM,gBAAgB,OAAO,UAAU;CAC1D,mBAAmB,MAAM,aAAa,OAAO,OAAO;CAEpD,iBAAiB,MAAM,gBAAgB,OAAO,SAAS;CACvD,iBAAiB,MAAM,WAAW,OAAO,KAAK;CAC9C,iBAAiB,MAAM,qBAAqB,OAAO,cAAc;CACjE,iBAAiB,MAAM,YAAY,OAAO,MAAM;CAChD,iBAAiB,MAAM,WAAW,OAAO,KAAK;CAC9C,iBAAiB,MAAM,oBAAoB,OAAO,aAAa;CAC/D,iBAAiB,MAAM,WAAW,OAAO,KAAK;CAC9C,iBAAiB,MAAM,cAAc,OAAO,QAAQ;CACpD,iBAAiB,MAAM,qBAAqB,OAAO,cAAc;CACjE,iBAAiB,MAAM,eAAe,OAAO,QAAQ;CACrD,iBAAiB,MAAM,oBAAoB,OAAO,YAAY;CAC9D,iBAAiB,MAAM,mBAAmB,OAAO,YAAY;CAC7D,iBAAiB,MAAM,wBAAwB,OAAO,gBAAgB;CACtE,iBAAiB,MAAM,0BAA0B,OAAO,kBAAkB;CAC1E,iBACE,MACA,+BACA,OAAO,sBACT;CACA,iBAAiB,MAAM,WAAW,OAAO,KAAK;CAC9C,iBAAiB,MAAM,gBAAgB,OAAO,SAAS;CAEvD,IAAI,OAAO,gBAAgB,OAAO,aAAa,SAAS,GACtD,KAAK,KAAK,kBAAkB,OAAO,aAAa,KAAK,GAAG,CAAC;CAG3D,IAAI,OAAO,mBAAmB,OAAO,gBAAgB,SAAS,GAC5D,KAAK,KAAK,qBAAqB,OAAO,gBAAgB,KAAK,GAAG,CAAC;CAGjE,iBAAiB,MAAM,uBAAuB,OAAO,eAAe;CACpE,iBAAiB,MAAM,yBAAyB,OAAO,iBAAiB;CACxE,iBAAiB,MAAM,4BAA4B,OAAO,oBAAoB;CAC9E,iBAAiB,MAAM,4BAA4B,OAAO,oBAAoB;CAC9E,iBAAiB,MAAM,UAAU,OAAO,IAAI;CAC5C,iBAAiB,MAAM,eAAe,OAAO,QAAQ;CACrD,iBACE,MACA,wCACA,OAAO,+BACT;CACA,iBACE,MACA,kCACA,OAAO,0BACT;AACF;;;;;;;AAQA,SAAgB,UAAU,QAA2C;CACnE,MAAM,OAAiB;EACrB;EACA,OAAO;EACP;EACA;EACA;CACF;CAEA,sBAAsB,MAAM,MAAM;CAElC,OAAO;AACT;;AAGA,SAAgB,eACd,QACA,SAAiD,CAAC,GACxC;CACV,MAAM,OAAiB;EAAC;EAAM;EAAQ;EAAmB;CAAM;CAC/D,MAAM,iBAAiB,OAAO,kBAAkB;CAChD,sBAAsB,MAAM;EAC1B,GAAG;EACH;CACF,CAAC;CACD,OAAO;AACT;;;;;;;;;;;;;;;;;;AC1FA,MAAM,qBAAqB,MAAS;;;;;;AAOpC,MAAM,gBAAgB
;;;;;AAuBtB,eAAsB,YACpB,QACwB;CACxB,MAAM,SAAS,OAAO,UAAU;CAChC,MAAM,OAAO,UAAU,MAAM;CAM7B,MAAM,gBAJgB,OAAO,kBAAkB,QAK3C,MAAM,QAAQ,KAAK,OAAO,GAAG,eAAe,CAAC,IAC7C;CAEJ,MAAM,MAA0C;EAC9C,GAAG,QAAQ;EACX,GAAG,OAAO;CACZ;CACA,IAAI,eAEF,IAAI,oBAAoB;CAG1B,MAAM,QAAQ,MAAM,QAAQ,MAAM;EAChC,KAAK,OAAO,OAAO,QAAQ,IAAI;EAC/B;EACA,OAAO;GAAC;GAAU;GAAQ;EAAM;EAIhC,UAAU;CACZ,CAAC;CAGD,IAAI,WAAW;CACf,IAAI,iBAAwC;CAC5C,MAAM,YAAY,OAAO,aAAa;CAEtC,MAAM,+BAA+B;EACnC,IAAI,gBAAgB,aAAa,cAAc;EAC/C,iBAAiB,iBACT,SAAS,OAAO,SAAS,GAC/B,aACF;CACF;CAEA,MAAM,eAAe,iBAAiB;EACpC,WAAW;EACX,SAAS,OAAO,SAAS;EACzB,uBAAuB;CACzB,GAAG,SAAS;CAGZ,MAAM,gBAAgB;EACpB,SAAS,OAAO,SAAS;EACzB,uBAAuB;CACzB;CACA,OAAO,QAAQ,iBAAiB,SAAS,SAAS,EAAE,MAAM,KAAK,CAAC;CAKhE,MAAM,eAAyB,CAAC;CAChC,MAAM,QAAQ,YAAY,MAAM;CAChC,MAAM,QAAQ,GAAG,SAAS,UAAkB;EAC1C,aAAa,KAAK,KAAK;CACzB,CAAC;CAED,MAAM,kBAAkB,IAAI,SAAiB,YAAY;EACvD,MAAM,iBAAiB,QAAQ,aAAa,KAAK,EAAE,CAAC;EACpD,MAAM,QAAQ,GAAG,OAAO,QAAQ;EAGhC,MAAM,QAAQ,GAAG,SAAS,QAAQ;CACpC,CAAC;CAGD,MAAM,OAAO,IAAI,SAGb,YAAY;EACd,IAAI,UAAU;EACd,MAAM,YACJ,UACA,WACG;GACH,IAAI,SAAS;GACb,UAAU;GACV,aAAa,YAAY;GACzB,IAAI,gBAAgB,aAAa,cAAc;GAC/C,OAAO,QAAQ,oBAAoB,SAAS,OAAO;GACnD,QAAQ;IAAE;IAAU;GAAO,CAAC;EAC9B;EAEA,MAAM,GAAG,UAAU,MAAM,WAAW,SAAS,MAAM,MAAM,CAAC;EAE1D,MAAM,GAAG,eAAe,SAAS,MAAM,IAAI,CAAC;CAC9C,CAAC;CAGD,MAAM,UAAU,YAAY;EAC1B,IAAI,CAAC,eAAe;EACpB,IAAI;GACF,MAAM,GAAG,eAAe;IAAE,WAAW;IAAM,OAAO;GAAK,CAAC;EAC1D,QAAQ,CAGR;CACF;CAKA,OAAO;EACL,QAAQ,MAAM;EACd;EACA;EACA,gBAAgB;EAChB;CACF;AACF;;;;;;;;;;;AAYA,SAAS,SAAS,OAAqB,QAA8B;CACnE,IAAI,MAAM,QAAQ,KAAA,GAAW;CAC7B,IAAI;EAGF,QAAQ,KAAK,CAAC,MAAM,KAAK,MAAM;CACjC,QAAQ;EACN,IAAI;GACF,MAAM,KAAK,MAAM;EACnB,QAAQ,CAER;CACF;AACF;;;;;;;;;;;;;;ACxKA,eAAsB,cACpB,QACkC;CAClC,MAAM,UAAU,KAAK,IAAI;CACzB,MAAM,UAAU,MAAM,YAAY,MAAM;CAExC,MAAM,UAAU,IAAI,kBAAkB;CACtC,MAAM,YAA2B,CAAC;CAClC,MAAM,cAAkC,CAAC;CAEzC,IAAI;EACF,WAAW,MAAM,UAAU,gBAAgB,QAAQ,MAAM,GACvD,IAAI,OAAO,IAAI;GACb,QAAQ,QAAQ,OAAO,KAAK;GAC5B,UAAU,KAAK,OAAO,KAAK;EAC7B,OACE,YAAY,KAAK;GACf,MAAM,OAAO;GACb,OAAO,OAAO,MAAM;EACtB,CAAC;EAIL,MA
AM,CAAC,EAAE,UAAU,UAAU,UAAU,MAAM,QAAQ,IAAI,CACvD,QAAQ,MACR,QAAQ,eACV,CAAC;EAED,MAAM,cAAkC;GACtC;GACA;GACA;GACA;GACA,UAAU,QAAQ,SAAS;GAC3B,YAAY,KAAK,IAAI,IAAI;EAC3B;EAEA,IAAI;EACJ,IAAI;GACF,OAAO,QAAQ,MAAM;EACvB,SAAS,KAAK;GAEZ,MAAM,IAAI,aACR,0CAFc,eAAe,QAAQ,IAAI,UAAU,OAAO,GAAG,KAG7D,WACF;EACF;EAEA,OAAO;GAAE;GAAM;GAAa;EAAU;CACxC,UAAU;EACR,MAAM,QAAQ,QAAQ;CACxB;AACF;AAEA,MAAa,oBAA6D;CACxE,IAAI;CACJ,KAAK;AACP"}
+ {"version":3,"file":"claude-code-DZ4Vkgp6.js","names":[],"sources":["../src/types/stream.ts","../src/types/trajectory.ts","../src/trajectory/builder.ts","../src/parsers/stream-json.ts","../src/adapters/types.ts","../src/adapters/claude-code/flags.ts","../src/adapters/claude-code/process.ts","../src/adapters/claude-code/index.ts"],"sourcesContent":["/**\n * Discriminated union of events emitted by Claude Code's\n * `--output-format stream-json` mode.\n *\n * The format is NDJSON (one JSON object per line on stdout). Each line has\n * a required `type` field and often a `subtype` for further disambiguation.\n *\n * Source notes: the stream-json schema is not formally documented as of mid-2026.\n * These types are derived from:\n * - https://code.claude.com/docs/en/headless\n * - https://github.com/anthropics/claude-code/issues/24612 (event-types tracking issue)\n * - https://takopi.dev/reference/runners/claude/stream-json-cheatsheet/\n * - The `@anthropic-ai/claude-agent-sdk` TypeScript declaration files,\n * which are the de-facto source of truth.\n *\n * When adding new event types, prefer extending the union here rather than\n * branching on `any` in callers. Unknown events should be tolerated silently\n * by the builder (the schema evolves and we don't want CI to break on a new\n * event type we haven't modelled yet).\n */\n\n/** Top-level discriminated union of stream-json events. */\nexport type StreamEvent =\n | SystemInitEvent\n | SystemRetryEvent\n | SystemPluginInstallEvent\n | SystemCompactBoundaryEvent\n | SystemUnknownEvent\n | AssistantMessageEvent\n | UserMessageEvent\n | ResultEvent;\n\n// system events\n\n/** Emitted once at session start. Carries the session-level metadata. */\nexport interface SystemInitEvent {\n type: \"system\";\n subtype: \"init\";\n session_id: string;\n cwd: string;\n model: string;\n permissionMode?: string;\n apiKeySource?: string;\n /** Names of tools available in the session (built-in + MCP). 
*/\n tools: string[];\n /** MCP servers configured for this session, with connection status. */\n mcp_servers: McpServerStatus[];\n}\n\nexport interface McpServerStatus {\n name: string;\n status: \"connected\" | \"disconnected\" | \"error\" | string;\n}\n\n/** Emitted when the API rate-limits us or otherwise asks for a retry. */\nexport interface SystemRetryEvent {\n type: \"system\";\n subtype: \"api_retry\";\n session_id: string;\n /** Implementation-defined retry payload (delay, reason, etc). */\n [key: string]: unknown;\n}\n\n/** Emitted while marketplace plugins are installing pre-session. */\nexport interface SystemPluginInstallEvent {\n type: \"system\";\n subtype: \"plugin_install\";\n session_id: string;\n [key: string]: unknown;\n}\n\n/** Emitted when Claude Code compacts the context window mid-session. */\nexport interface SystemCompactBoundaryEvent {\n type: \"system\";\n subtype: \"compact_boundary\";\n session_id: string;\n [key: string]: unknown;\n}\n\n/**\n * Catch-all for `type: \"system\"` events we haven't modelled.\n *\n * Keeps the union exhaustive while tolerating schema evolution. Callers should\n * either explicitly handle a known subtype or fall through to ignore.\n */\nexport interface SystemUnknownEvent {\n type: \"system\";\n subtype: string;\n session_id?: string;\n [key: string]: unknown;\n}\n\n// conversational events\n\n/** One assistant turn. The `message` field mirrors the Anthropic Messages API shape. */\nexport interface AssistantMessageEvent {\n type: \"assistant\";\n session_id: string;\n message: AssistantMessage;\n}\n\nexport interface AssistantMessage {\n id: string;\n type: \"message\";\n role: \"assistant\";\n content: ContentBlock[];\n model?: string;\n stop_reason?: StopReason | null;\n usage?: Usage;\n}\n\n/**\n * A user-role message in the stream.\n *\n * In stream-json these are usually *synthetic* — the harness injects them to\n * feed tool results back into the conversation after dispatching a tool. 
The\n * very first user message (the prompt) is also emitted here for completeness.\n */\nexport interface UserMessageEvent {\n type: \"user\";\n session_id: string;\n message: UserMessage;\n}\n\nexport interface UserMessage {\n role: \"user\";\n /** String for the initial prompt, array of blocks when carrying tool results. */\n content: ContentBlock[] | string;\n}\n\n// content blocks\n\nexport type ContentBlock = TextBlock | ToolUseBlock | ToolResultBlock;\n\nexport interface TextBlock {\n type: \"text\";\n text: string;\n}\n\nexport interface ToolUseBlock {\n type: \"tool_use\";\n /** Unique id assigned by the model; used to match tool_result back to this call. */\n id: string;\n /** Tool name. MCP tools follow the convention `mcp__<server>__<tool>`. */\n name: string;\n /** Arguments the model passed. Schema is per-tool. */\n input: unknown;\n}\n\nexport interface ToolResultBlock {\n type: \"tool_result\";\n /** The id of the corresponding tool_use block. */\n tool_use_id: string;\n /** Tool output. May be plain text or further content blocks for richer tools. */\n content: string | ContentBlock[];\n is_error?: boolean;\n}\n\n// result envelope\n\n/** Emitted once at session end. Carries aggregate usage and cost. */\nexport interface ResultEvent {\n type: \"result\";\n subtype: \"success\" | \"error\";\n session_id: string;\n total_cost_usd: number;\n is_error: boolean;\n duration_ms: number;\n duration_api_ms?: number;\n num_turns: number;\n /** The final text the harness returned, if any. */\n result?: string;\n usage?: Usage;\n}\n\n// shared scalars\n\n/**\n * Reasons the model can stop a turn. 
Open-ended string union because new\n * stop reasons appear over time.\n */\nexport type StopReason =\n | \"end_turn\"\n | \"tool_use\"\n | \"max_tokens\"\n | \"stop_sequence\"\n | (string & {});\n\nexport interface Usage {\n input_tokens: number;\n output_tokens: number;\n cache_creation_input_tokens?: number;\n cache_read_input_tokens?: number;\n}\n\n// type guards\n\n/** Type guards. Prefer these over manual `e.type === \"...\"` checks at call sites. */\n\nexport function isSystemInit(e: StreamEvent): e is SystemInitEvent {\n return e.type === \"system\" && (e as SystemInitEvent).subtype === \"init\";\n}\n\nexport function isSystemRetry(e: StreamEvent): e is SystemRetryEvent {\n return e.type === \"system\" && (e as SystemRetryEvent).subtype === \"api_retry\";\n}\n\nexport function isAssistantMessage(e: StreamEvent): e is AssistantMessageEvent {\n return e.type === \"assistant\";\n}\n\nexport function isUserMessage(e: StreamEvent): e is UserMessageEvent {\n return e.type === \"user\";\n}\n\nexport function isResult(e: StreamEvent): e is ResultEvent {\n return e.type === \"result\";\n}\n\nexport function isTextBlock(b: ContentBlock): b is TextBlock {\n return b.type === \"text\";\n}\n\nexport function isToolUseBlock(b: ContentBlock): b is ToolUseBlock {\n return b.type === \"tool_use\";\n}\n\nexport function isToolResultBlock(b: ContentBlock): b is ToolResultBlock {\n return b.type === \"tool_result\";\n}\n","/**\n * TrajectoryView — the assertion-friendly projection of a Claude Code session.\n *\n * The view is derived from the stream of {@link StreamEvent} values produced by\n * the harness, but is optimized for the queries that the assertion DSL needs to\n * express:\n *\n * - did tool X get called? (look at `toolCalls`)\n * - did tool A come before tool B? (compare `turnIndex` / `callIndex`)\n * - was a tool called with arguments matching predicate P? (`toolCalls[i].args`)\n * - did the agent answer without using any tool? 
(`toolCalls.length === 0`)\n *\n * The view is reconstructable from the raw events (lossless w.r.t. assertions),\n * but operating on it directly is dramatically simpler than walking event\n * streams or OTel span trees.\n *\n * Design notes:\n * - `turnIndex` and `callIndex` are the right primitives for ordering.\n * Wall-clock timestamps from the stream are unreliable for sub-second\n * ordering and parallel tool dispatch.\n * - Parallel tool calls (multiple `tool_use` blocks in one assistant message)\n * share a `turnIndex` but have distinct `callIndex` values in emission order.\n * - `namespace` is precomputed so assertions like `called(pattern: \"mcp__api__*\")`\n * can do a cheap string check.\n */\n\nimport type { StopReason } from \"./stream\";\n\nexport interface TrajectoryView {\n /** Session metadata, captured from the `system/init` event. */\n meta: SessionMeta;\n\n /** Every tool call, in global emission order. */\n toolCalls: ToolCall[];\n\n /** Each assistant turn: text content + any tool calls emitted in that turn. */\n turns: AssistantTurn[];\n\n /** All assistant text concatenated across turns. Useful for `response_contains`. */\n finalResponse: string;\n\n /** Stop reason of the *last* assistant turn. */\n finalStopReason: StopReason | null;\n\n /** Aggregate usage and cost from the result event. */\n usage: UsageSummary;\n\n /** Retry events observed during the run (rate limits, transient errors). */\n retries: RetryRecord[];\n\n /** Whether the result envelope indicated success. */\n success: boolean;\n}\n\nexport interface SessionMeta {\n sessionId: string;\n model: string;\n cwd: string;\n permissionMode?: string;\n /** Tool names the harness reported as available at session start. */\n availableTools: string[];\n /** MCP servers configured for the session, with connection status. */\n mcpServers: { name: string; status: string }[];\n}\n\nexport interface ToolCall {\n /** Fully-qualified tool name, e.g. 
`\"mcp__api__search_skills\"` or `\"Bash\"`. */\n name: string;\n\n /**\n * Namespace prefix for MCP-style names (`\"mcp__api\"`), or null for built-ins.\n * Precomputed via {@link namespaceOf} for cheap pattern matching.\n */\n namespace: string | null;\n\n /** The `tool_use` block's `id`; matches a later `tool_result.tool_use_id`. */\n callId: string;\n\n /** Args the model emitted on this call. Tool-specific schema. */\n args: unknown;\n\n /** Tool result, or null if no result was observed (e.g. process killed). */\n result: unknown | null;\n\n /** Whether the tool reported an error in its result. */\n isError: boolean;\n\n /**\n * Which assistant turn produced this call. Parallel calls within a single\n * assistant message share a `turnIndex`.\n */\n turnIndex: number;\n\n /** Index in the global ordered tool-call sequence. */\n callIndex: number;\n}\n\nexport interface AssistantTurn {\n turnIndex: number;\n /** Text emitted in this turn (may be empty if turn was tool-only). */\n text: string;\n /** Tool calls emitted in this turn, in their block order. */\n toolCalls: ToolCall[];\n /** Stop reason reported by the model for this turn. */\n stopReason: StopReason | null;\n}\n\nexport interface UsageSummary {\n inputTokens: number;\n outputTokens: number;\n totalCostUsd: number;\n durationMs: number;\n numTurns: number;\n}\n\nexport interface RetryRecord {\n /** ms since session start (approximate; the stream doesn't include precise ts). */\n offsetMs: number;\n /** Raw payload from the `system/api_retry` event for diagnostics. */\n raw: unknown;\n}\n\n// helpers\n\n/**\n * Extract the MCP namespace prefix from a tool name.\n *\n * Claude Code formats MCP tool names as `mcp__<server>__<tool>`. The namespace\n * is the first two segments joined: `mcp__<server>`. 
Returns null for non-MCP\n * tool names (built-ins like `Bash`, `Read`, `Edit`).\n *\n * @example\n * namespaceOf(\"mcp__api__search_skills\") // \"mcp__api\"\n * namespaceOf(\"Bash\") // null\n */\nexport function namespaceOf(toolName: string): string | null {\n if (!toolName.startsWith(\"mcp__\")) return null;\n const parts = toolName.split(\"__\");\n if (parts.length < 3) return null;\n return `${parts[0]}__${parts[1]}`;\n}\n","/**\n * TrajectoryBuilder — consumes a stream of {@link StreamEvent} values and\n * produces a {@link TrajectoryView}.\n *\n * State machine: the builder is a small, tolerant state machine. Invariants:\n *\n * - Exactly one `system/init` event opens the session. The builder requires\n * it to be present before `build()`.\n * - Each `assistant` event begins a new turn. Text blocks accumulate into\n * the turn's text; `tool_use` blocks become `ToolCall` records.\n * - `user` events with `tool_result` blocks deliver tool results back. We\n * match them to pending calls by `tool_use_id`.\n * - One `result` event closes the session and carries aggregate usage.\n *\n * The builder is *tolerant of partial streams*: a process killed mid-run\n * produces a coherent (but flagged) view. Tool calls without matching results\n * keep `result: null`. The `success` flag reflects whether a successful result\n * event was actually observed.\n *\n * Why a class (not a reducer)?\n * The internal `pendingCalls` map is mutable by design — we modify ToolCall\n * objects in place when results arrive, so other parts of the view (which\n * hold references to the same objects) see the update for free. 
A reducer\n * would force a deep copy per result event, which is wasteful and would\n * complicate identity-based queries.\n */\n\nimport {\n isAssistantMessage,\n isResult,\n isSystemInit,\n isTextBlock,\n isToolResultBlock,\n isToolUseBlock,\n isUserMessage,\n type StreamEvent,\n type Usage,\n} from \"../types/stream\";\nimport {\n namespaceOf,\n type AssistantTurn,\n type RetryRecord,\n type SessionMeta,\n type ToolCall,\n type TrajectoryView,\n} from \"../types/trajectory\";\n\nexport class TrajectoryBuilder {\n private meta: SessionMeta | null = null;\n private sessionStartTs: number | null = null;\n\n private turns: AssistantTurn[] = [];\n private allToolCalls: ToolCall[] = [];\n\n /**\n * tool_use_id → ToolCall, for matching results back to calls.\n * Entries are removed once a result is observed.\n */\n private pendingCalls: Map<string, ToolCall> = new Map();\n\n private retries: RetryRecord[] = [];\n\n private finalUsage: Usage | null = null;\n private finalCostUsd = 0;\n private finalDurationMs = 0;\n private finalNumTurns = 0;\n private finalResultText = \"\";\n private sawResultEvent = false;\n private resultIsError = false;\n\n /**\n * Consume one event. Safe to call with events in stream order.\n *\n * Unknown event types are silently ignored — the schema evolves and we\n * don't want CI to break on a new event type we haven't modelled.\n */\n consume(event: StreamEvent): void {\n if (isSystemInit(event)) {\n this.meta = {\n sessionId: event.session_id,\n model: event.model,\n cwd: event.cwd,\n permissionMode: event.permissionMode,\n availableTools: event.tools ?? [],\n mcpServers: (event.mcp_servers ?? []).map((s) => ({\n name: s.name,\n status: s.status,\n })),\n };\n this.sessionStartTs = Date.now();\n return;\n }\n\n if (event.type === \"system\" && event.subtype === \"api_retry\") {\n this.retries.push({\n offsetMs: this.sessionStartTs ? 
Date.now() - this.sessionStartTs : 0,\n raw: event,\n });\n return;\n }\n\n if (isAssistantMessage(event)) {\n this.handleAssistantMessage(event);\n return;\n }\n\n if (isUserMessage(event)) {\n this.handleUserMessage(event);\n return;\n }\n\n if (isResult(event)) {\n this.sawResultEvent = true;\n this.resultIsError = event.is_error;\n this.finalUsage = event.usage ?? null;\n this.finalCostUsd = event.total_cost_usd ?? 0;\n this.finalDurationMs = event.duration_ms ?? 0;\n this.finalNumTurns = event.num_turns ?? 0;\n this.finalResultText = event.result ?? \"\";\n return;\n }\n\n // Unknown event: ignored. See class doc.\n }\n\n /**\n * Finalize the view. Call after consuming the last event from the stream.\n *\n * Throws if no `system/init` was observed — at that point we have no model,\n * no session id, and no available-tools list, which means assertions like\n * \"called any mcp__api__* tool\" can't even be evaluated meaningfully.\n */\n build(): TrajectoryView {\n if (this.meta === null) {\n throw new Error(\n \"TrajectoryBuilder.build() called before any system/init event was observed. \" +\n \"The harness may have failed to start, or the stream was truncated before init.\",\n );\n }\n\n const lastTurn = this.turns[this.turns.length - 1];\n\n // Prefer the assistant text we accumulated turn-by-turn over the\n // `result.result` field, because the latter is sometimes a summary\n // and the former is exactly what the model said.\n const accumulatedText = this.turns\n .map((t) => t.text)\n .filter((t) => t.length > 0)\n .join(\"\\n\\n\")\n .trim();\n\n return {\n meta: this.meta,\n toolCalls: this.allToolCalls,\n turns: this.turns,\n finalResponse: accumulatedText || this.finalResultText,\n finalStopReason: lastTurn?.stopReason ?? null,\n usage: {\n inputTokens: this.finalUsage?.input_tokens ?? 0,\n outputTokens: this.finalUsage?.output_tokens ?? 
0,\n totalCostUsd: this.finalCostUsd,\n durationMs: this.finalDurationMs,\n // Fall back to observed turn count if the result event was missing.\n numTurns: this.finalNumTurns || this.turns.length,\n },\n retries: this.retries,\n // Successful = saw a non-error result envelope. Streams that ended without\n // a result event are reported as unsuccessful regardless of tool outcomes.\n success: this.sawResultEvent && !this.resultIsError,\n };\n }\n\n // private handlers\n\n private handleAssistantMessage(\n event: Extract<StreamEvent, { type: \"assistant\" }>,\n ): void {\n const turnIndex = this.turns.length;\n const textChunks: string[] = [];\n const toolCallsThisTurn: ToolCall[] = [];\n\n for (const block of event.message.content) {\n if (isTextBlock(block)) {\n textChunks.push(block.text);\n continue;\n }\n if (isToolUseBlock(block)) {\n const call: ToolCall = {\n name: block.name,\n namespace: namespaceOf(block.name),\n callId: block.id,\n args: block.input,\n result: null,\n isError: false,\n turnIndex,\n callIndex: this.allToolCalls.length,\n };\n this.allToolCalls.push(call);\n this.pendingCalls.set(block.id, call);\n toolCallsThisTurn.push(call);\n continue;\n }\n // tool_result blocks don't appear in assistant messages — those arrive\n // via user messages. If one does appear, ignore it; we'd rather drop\n // an unexpected block than crash the eval.\n }\n\n this.turns.push({\n turnIndex,\n text: textChunks.join(\"\").trim(),\n toolCalls: toolCallsThisTurn,\n stopReason: event.message.stop_reason ?? null,\n });\n }\n\n private handleUserMessage(\n event: Extract<StreamEvent, { type: \"user\" }>,\n ): void {\n const content = event.message.content;\n\n // The very first user message carries the prompt as a plain string. 
We\n // already know the prompt (the caller passed it to the adapter), so we\n // ignore this case — there's nothing assertion-relevant in it.\n if (typeof content === \"string\") return;\n\n for (const block of content) {\n if (!isToolResultBlock(block)) continue;\n\n const call = this.pendingCalls.get(block.tool_use_id);\n if (!call) {\n // Unmatched result: ignore. Can happen if events arrive out of order\n // or the corresponding tool_use was emitted in an earlier run that\n // we're resuming. Either way, dropping is safer than throwing.\n continue;\n }\n\n call.result = block.content;\n call.isError = block.is_error ?? false;\n this.pendingCalls.delete(block.tool_use_id);\n }\n }\n}\n\n/**\n * Convenience: drain an async iterable of events through a fresh builder.\n *\n * Suitable when you have the full event stream and just want the view.\n * For interactive/incremental scenarios (e.g. surfacing partial state in a UI)\n * instantiate {@link TrajectoryBuilder} directly and call `consume()` /\n * `build()` yourself.\n */\nexport async function buildTrajectory(\n events: AsyncIterable<StreamEvent>,\n): Promise<TrajectoryView> {\n const builder = new TrajectoryBuilder();\n for await (const event of events) {\n builder.consume(event);\n }\n return builder.build();\n}\n","/**\n * Line-buffered NDJSON parser for Claude Code's `--output-format stream-json`.\n *\n * Claude Code emits one JSON object per line on stdout. The parser:\n * - buffers across chunk boundaries (a single JSON line may arrive in two reads)\n * - skips empty lines (defensive — shouldn't occur, but harmless if it does)\n * - emits a discriminated `ParseResult` per line so callers can decide whether\n * a malformed line should abort the run or just be logged.\n *\n * Why a generator (and not a Transform stream)?\n * The eval adapter consumes events sequentially and synchronously updates a\n * builder. 
Async iteration is the simplest interface for that pattern and\n * composes cleanly with `for await` in the adapter. A Transform would force\n * the builder into event-handler style.\n */\n\nimport type { Readable } from \"node:stream\";\nimport type { StreamEvent } from \"../types/stream\";\n\n/**\n * Result of attempting to parse a single line.\n *\n * Successful parses yield `{ ok: true }` with the typed event and the raw line\n * (kept for diagnostics and OTel `events.attributes.raw`). Failed parses yield\n * `{ ok: false }` with the parse error and the raw line — callers can log,\n * skip, or fail the run as they see fit.\n */\nexport type ParseResult =\n | { ok: true; event: StreamEvent; rawLine: string }\n | { ok: false; error: Error; rawLine: string };\n\n/**\n * Parse a readable stream of NDJSON into a sequence of typed stream-json events.\n *\n * @example\n * const child = spawn(\"claude\", [\"-p\", prompt, \"--output-format\", \"stream-json\", \"--verbose\"]);\n * for await (const result of parseStreamJson(child.stdout)) {\n * if (result.ok) builder.consume(result.event);\n * else console.warn(\"malformed stream line:\", result.rawLine, result.error);\n * }\n */\nexport async function* parseStreamJson(\n stream: Readable,\n): AsyncGenerator<ParseResult, void, void> {\n let buffer = \"\";\n // The Node child_process stdout is a binary stream by default. Setting the\n // encoding here means `for await (const chunk of stream)` yields strings.\n stream.setEncoding(\"utf8\");\n\n for await (const chunk of stream) {\n buffer += chunk as string;\n\n // Drain every complete line currently in the buffer before reading more.\n // Multiple JSON objects can arrive in one chunk (e.g. 
when the harness\n // emits a burst of events at session start).\n let newlineIdx: number;\n while ((newlineIdx = buffer.indexOf(\"\\n\")) !== -1) {\n const line = buffer.slice(0, newlineIdx).trim();\n buffer = buffer.slice(newlineIdx + 1);\n if (line.length === 0) continue;\n yield tryParseLine(line);\n }\n }\n\n // Flush any trailing content that arrived without a final newline. Stream-json\n // typically ends with a newline-terminated `result` event, but a killed\n // process may not flush, so we still try to emit what we have.\n const trailing = buffer.trim();\n if (trailing.length > 0) {\n yield tryParseLine(trailing);\n }\n}\n\n/**\n * Parse a single line. Extracted as a helper so the generator stays readable.\n *\n * Note: we do not validate the event structure beyond `JSON.parse`. Runtime\n * validation (e.g. zod) is overkill here — the schema is stable enough at\n * runtime, and the TrajectoryBuilder is tolerant of missing fields. Adding\n * validation would be premature.\n */\nfunction tryParseLine(line: string): ParseResult {\n try {\n const event = JSON.parse(line) as StreamEvent;\n return { ok: true, event, rawLine: line };\n } catch (err) {\n return {\n ok: false,\n error: err instanceof Error ? err : new Error(String(err)),\n rawLine: line,\n };\n }\n}\n","/**\n * Generic harness adapter contract.\n *\n * Every harness adapter produces a {@link TrajectoryView} plus process\n * diagnostics. The runner and assertion engine depend only on these types —\n * not on any specific harness implementation.\n */\n\nimport type { TrajectoryView } from \"../types/trajectory\";\n\n/** Base config every adapter must accept. */\nexport interface BaseAdapterConfig {\n prompt: string;\n model?: string;\n timeoutMs?: number;\n signal?: AbortSignal;\n env?: Record<string, string>;\n cwd?: string;\n}\n\n/** Suite-level config: generic fields plus adapter-specific nested blocks. 
*/\nexport type SuiteConfig = Partial<BaseAdapterConfig> & {\n /** Claude Code adapter options (when `adapter` is `claude-code`). */\n claudeCode?: Record<string, unknown>;\n};\n\n/** Generic harness adapter interface. */\nexport interface HarnessAdapter<\n TConfig extends BaseAdapterConfig = BaseAdapterConfig,\n> {\n readonly id: string;\n run(config: TConfig): Promise<AdapterResult>;\n}\n\n/** Successful adapter run. */\nexport interface AdapterResult {\n view: TrajectoryView;\n diagnostics: AdapterDiagnostics;\n}\n\n/** Process-level diagnostics from any adapter. */\nexport interface AdapterDiagnostics {\n exitCode: number | null;\n signal: NodeJS.Signals | null;\n stderr: string;\n parseErrors: ParseErrorRecord[];\n timedOut: boolean;\n durationMs: number;\n}\n\nexport interface ParseErrorRecord {\n line: string;\n error: string;\n}\n\n/**\n * Thrown when the harness fails to produce a usable trajectory.\n *\n * Most commonly this means the process failed before emitting a usable\n * session init event. Inspect `diagnostics.stderr` for the cause.\n */\nexport class AdapterError extends Error {\n constructor(\n message: string,\n public readonly diagnostics: Partial<AdapterDiagnostics>,\n ) {\n super(message);\n this.name = \"AdapterError\";\n }\n}\n","/**\n * Build CLI args for Claude Code judge subprocesses (JSON output, not stream-json).\n *\n * Shared flag assembly for harness runs (`buildArgs`) and LLM grading judges\n * (`buildJudgeArgs`).\n */\n\nimport type { ClaudeCodeAdapterConfig, ClaudeCodeOptions } from \"./types\";\n\n/** Append repeated `--flag value` pairs for array config fields. */\nfunction pushRepeatableFlag(args: string[], flag: string, values?: string[]): void {\n if (!values) return;\n for (const value of values) {\n args.push(flag, value);\n }\n}\n\n/**\n * Append an optional CLI flag. 
Boolean `true` emits the flag alone; other\n * scalars emit `--flag value`.\n */\nfunction pushOptionalFlag(\n args: string[],\n flag: string,\n value: string | number | boolean | undefined,\n): void {\n if (value === undefined) return;\n if (typeof value === \"boolean\") {\n if (value) args.push(flag);\n return;\n }\n args.push(flag, String(value));\n}\n\n/** Append Claude Code CLI flags shared by harness runs and grading judges. */\nexport function appendClaudeCodeFlags(\n args: string[],\n config: ClaudeCodeOptions & { model?: string },\n): void {\n pushRepeatableFlag(args, \"--plugin-dir\", config.pluginDirs);\n pushRepeatableFlag(args, \"--plugin-url\", config.pluginUrls);\n pushRepeatableFlag(args, \"--add-dir\", config.addDirs);\n\n pushOptionalFlag(args, \"--mcp-config\", config.mcpConfig);\n pushOptionalFlag(args, \"--model\", config.model);\n pushOptionalFlag(args, \"--permission-mode\", config.permissionMode);\n pushOptionalFlag(args, \"--effort\", config.effort);\n pushOptionalFlag(args, \"--agent\", config.agent);\n pushOptionalFlag(args, \"--fallback-model\", config.fallbackModel);\n pushOptionalFlag(args, \"--tools\", config.tools);\n pushOptionalFlag(args, \"--settings\", config.settings);\n pushOptionalFlag(args, \"--setting-sources\", config.settingSources);\n pushOptionalFlag(args, \"--max-turns\", config.maxTurns);\n pushOptionalFlag(args, \"--max-budget-usd\", config.maxBudgetUsd);\n pushOptionalFlag(args, \"--system-prompt\", config.systemPrompt);\n pushOptionalFlag(args, \"--system-prompt-file\", config.systemPromptFile);\n pushOptionalFlag(args, \"--append-system-prompt\", config.appendSystemPrompt);\n pushOptionalFlag(\n args,\n \"--append-system-prompt-file\",\n config.appendSystemPromptFile,\n );\n pushOptionalFlag(args, \"--debug\", config.debug);\n pushOptionalFlag(args, \"--debug-file\", config.debugFile);\n\n if (config.allowedTools && config.allowedTools.length > 0) {\n args.push(\"--allowedTools\", 
config.allowedTools.join(\",\"));\n }\n\n if (config.disallowedTools && config.disallowedTools.length > 0) {\n args.push(\"--disallowedTools\", config.disallowedTools.join(\",\"));\n }\n\n pushOptionalFlag(args, \"--strict-mcp-config\", config.strictMcpConfig);\n pushOptionalFlag(args, \"--include-hook-events\", config.includeHookEvents);\n pushOptionalFlag(args, \"--no-session-persistence\", config.noSessionPersistence);\n pushOptionalFlag(args, \"--disable-slash-commands\", config.disableSlashCommands);\n pushOptionalFlag(args, \"--bare\", config.bare);\n pushOptionalFlag(args, \"--safe-mode\", config.safeMode);\n pushOptionalFlag(\n args,\n \"--allow-dangerously-skip-permissions\",\n config.allowDangerouslySkipPermissions,\n );\n pushOptionalFlag(\n args,\n \"--dangerously-skip-permissions\",\n config.dangerouslySkipPermissions,\n );\n}\n\n/**\n * Build the argument vector for spawning `claude`.\n *\n * Order matters only for flags that take values — value flags must come\n * after their flag name. Everything else is order-independent.\n */\nexport function buildArgs(config: ClaudeCodeAdapterConfig): string[] {\n const args: string[] = [\n \"-p\",\n config.prompt,\n \"--output-format\",\n \"stream-json\",\n \"--verbose\",\n ];\n\n appendClaudeCodeFlags(args, config);\n\n return args;\n}\n\n/**\n * Build args for an LLM judge subprocess (`--output-format json`).\n *\n * Defaults permission mode to `bypassPermissions` so the judge does not\n * block on tool permission prompts during single-shot JSON grading.\n */\nexport function buildJudgeArgs(\n prompt: string,\n config: ClaudeCodeOptions & { model?: string } = {},\n): string[] {\n const args: string[] = [\"-p\", prompt, \"--output-format\", \"json\"];\n const permissionMode = config.permissionMode ?? 
\"bypassPermissions\";\n appendClaudeCodeFlags(args, {\n ...config,\n permissionMode,\n });\n return args;\n}\n","/**\n * Process management for the Claude Code adapter.\n *\n * This module owns spawning, timeout, abort signal handling, and process-tree\n * teardown. The orchestrator (`index.ts`) consumes the returned handle —\n * reading stdout and waiting for completion — but doesn't worry about how\n * the process gets killed or how its config gets isolated.\n *\n * Why a separate module? Process management is the one part of the adapter\n * with real I/O complexity (process groups, signal escalation, temp-dir\n * lifecycle, env merging). Isolating it makes the orchestrator easy to read\n * and lets us swap the spawning logic if we later need to, e.g., wrap claude\n * in a sandbox runner.\n */\n\nimport { spawn, type ChildProcess } from \"node:child_process\";\nimport { mkdtemp, rm } from \"node:fs/promises\";\nimport { tmpdir } from \"node:os\";\nimport { join } from \"node:path\";\nimport type { Readable } from \"node:stream\";\n\nimport { buildArgs } from \"./flags\";\nimport type { ClaudeCodeAdapterConfig } from \"./types\";\n\n/** Default hard timeout per run. Tunable via config.timeoutMs. */\nconst DEFAULT_TIMEOUT_MS = 5 * 60 * 1000;\n\n/**\n * Grace period between SIGTERM and SIGKILL. Most processes shut down cleanly\n * within a few seconds; this gives them that chance while preventing CI from\n * hanging indefinitely on a stuck child.\n */\nconst KILL_GRACE_MS = 5_000;\n\n/**\n * Handle to a spawned `claude` process. 
The orchestrator drives it:\n * - Read `stdout` (typically via parseStreamJson).\n * - Await `done` to learn the exit state.\n * - Await `stderrCollected` for diagnostic stderr.\n * - Check `timedOut()` after exit to distinguish kill-by-timeout from\n * normal termination.\n * - Call `cleanup()` after all of the above to remove the temp config dir.\n */\nexport interface SpawnedClaude {\n stdout: Readable;\n done: Promise<{ exitCode: number | null; signal: NodeJS.Signals | null }>;\n stderrCollected: Promise<string>;\n timedOut: () => boolean;\n cleanup: () => Promise<void>;\n}\n\n/**\n * Spawn `claude` in headless mode with isolated config and a process-group\n * lifecycle. See {@link SpawnedClaude} for how to consume the result.\n *\n * **Kill sequence:** timeout and abort both follow the same two-step path:\n * `SIGTERM` to the process group, then `SIGKILL` after {@link KILL_GRACE_MS}\n * if the group is still alive. This avoids leaving MCP/tool subprocesses\n * running while still giving claude a chance to flush stream-json output.\n *\n * @param config - Adapter options; `timeoutMs`, `signal`, and `isolateConfig`\n * control lifecycle and config isolation.\n */\nexport async function spawnClaude(\n config: ClaudeCodeAdapterConfig,\n): Promise<SpawnedClaude> {\n const binary = config.binary ?? \"claude\";\n const args = buildArgs(config);\n\n const isolateConfig = config.isolateConfig !== false;\n\n // Isolated runs use a fresh temp dir so plugins/settings don't leak between\n // reps. Non-isolated runs inherit the caller's Claude login and plugins.\n const tempConfigDir = isolateConfig\n ? await mkdtemp(join(tmpdir(), \"harness-eval-\"))\n : null;\n\n const env: Record<string, string | undefined> = {\n ...process.env,\n ...config.env,\n };\n if (tempConfigDir) {\n // Override after ...env so callers can't accidentally un-isolate.\n env.CLAUDE_CONFIG_DIR = tempConfigDir;\n }\n\n const child = spawn(binary, args, {\n cwd: config.cwd ?? 
process.cwd(),\n env,\n stdio: [\"ignore\", \"pipe\", \"pipe\"],\n // detached: true means the child becomes the leader of its own process\n // group. We exploit this to kill the entire group (including any MCP\n // server subprocesses and tool processes) on timeout/abort.\n detached: true,\n });\n\n\n // `timedOut` is set only by the hard timeout timer, not by abort — callers\n // use it to distinguish \"ran too long\" from user cancellation or normal exit.\n let timedOut = false;\n let killEscalation: NodeJS.Timeout | null = null;\n const timeoutMs = config.timeoutMs ?? DEFAULT_TIMEOUT_MS;\n\n /**\n * Arm (or re-arm) the SIGKILL fallback. Each SIGTERM attempt gets its own\n * grace window so a slow shutdown doesn't leave orphaned MCP servers.\n */\n const scheduleKillEscalation = () => {\n if (killEscalation) clearTimeout(killEscalation);\n killEscalation = setTimeout(\n () => killTree(child, \"SIGKILL\"),\n KILL_GRACE_MS,\n );\n };\n\n const timeoutTimer = setTimeout(() => {\n timedOut = true;\n killTree(child, \"SIGTERM\");\n scheduleKillEscalation();\n }, timeoutMs);\n\n // AbortSignal cancellation mirrors timeout kills but does not flip `timedOut`.\n const onAbort = () => {\n killTree(child, \"SIGTERM\");\n scheduleKillEscalation();\n };\n config.signal?.addEventListener(\"abort\", onAbort, { once: true });\n\n\n // Drain stderr eagerly so the OS-level buffer never fills and stalls the\n // child (Node child processes will block on a full pipe).\n const stderrChunks: string[] = [];\n child.stderr?.setEncoding(\"utf8\");\n child.stderr?.on(\"data\", (chunk: string) => {\n stderrChunks.push(chunk);\n });\n\n const stderrCollected = new Promise<string>((resolve) => {\n const finalize = () => resolve(stderrChunks.join(\"\"));\n child.stderr?.on(\"end\", finalize);\n // Errors during stderr capture shouldn't fail the whole run; we just\n // return what we've buffered so far.\n child.stderr?.on(\"error\", finalize);\n });\n\n\n // Resolve once the process exits or 
fails to spawn. Guard against double\n // settlement because both `close` and `error` can fire in edge cases.\n const done = new Promise<{\n exitCode: number | null;\n signal: NodeJS.Signals | null;\n }>((resolve) => {\n let settled = false;\n const finalize = (\n exitCode: number | null,\n signal: NodeJS.Signals | null,\n ) => {\n if (settled) return;\n settled = true;\n // Tear down timers/listeners so a late timeout cannot SIGKILL a reused PID.\n clearTimeout(timeoutTimer);\n if (killEscalation) clearTimeout(killEscalation);\n config.signal?.removeEventListener(\"abort\", onAbort);\n resolve({ exitCode, signal });\n };\n\n child.on(\"close\", (code, signal) => finalize(code, signal));\n // ENOENT and other spawn failures emit `error` — `close` may not follow.\n child.on(\"error\", () => finalize(null, null));\n });\n\n\n const cleanup = async () => {\n if (!tempConfigDir) return;\n try {\n await rm(tempConfigDir, { recursive: true, force: true });\n } catch {\n // Best-effort. A leftover temp dir is annoying but not catastrophic;\n // we don't want to fail the run for it.\n }\n };\n\n // stdout is guaranteed non-null because we passed `stdio: [..., \"pipe\", ...]`.\n // The `!` is safe; the alternative would be a redundant runtime check that\n // could never fire.\n return {\n stdout: child.stdout!,\n done,\n stderrCollected,\n timedOut: () => timedOut,\n cleanup,\n };\n}\n\n/**\n * Kill the child's process group, then fall back to the bare PID if the\n * group is already gone. This catches MCP server subprocesses and tool\n * processes spawned by claude.\n *\n * **Signal escalation:** callers typically invoke this first with `SIGTERM`,\n * then again with `SIGKILL` after {@link KILL_GRACE_MS}. The group kill is\n * essential — a bare `child.kill()` would leave MCP servers running.\n *\n * **Platform edge case:** when the group leader exits first, `kill(-pid)`\n * throws `ESRCH`. 
The single-PID fallback covers that without failing the\n * adapter run.\n *\n * @param child - Spawned process handle from {@link spawn}.\n * @param signal - POSIX signal to deliver (`SIGTERM` or `SIGKILL` in practice).\n */\nfunction killTree(child: ChildProcess, signal: NodeJS.Signals): void {\n if (child.pid === undefined) return;\n try {\n // Negative PID targets the entire process group (requires detached spawn).\n process.kill(-child.pid, signal);\n } catch {\n try {\n // Group already reaped — try the leader PID directly.\n child.kill(signal);\n } catch {\n // Process fully gone; nothing to do.\n }\n }\n}\n","/**\n * Claude Code adapter — public API.\n */\n\nimport { parseStreamJson } from \"../../parsers/stream-json\";\nimport { TrajectoryBuilder } from \"../../trajectory/builder\";\nimport type { StreamEvent } from \"../../types/stream\";\n\nimport { AdapterError } from \"../types\";\nimport { spawnClaude } from \"./process\";\nimport type {\n AdapterDiagnostics,\n ClaudeCodeAdapterConfig,\n ClaudeCodeAdapterResult,\n ParseErrorRecord,\n} from \"./types\";\nimport type { HarnessAdapter } from \"../types\";\n\nexport { AdapterError } from \"../types\";\nexport type {\n AdapterDiagnostics,\n AdapterResult,\n ClaudeCodeAdapterConfig,\n ClaudeCodeAdapterResult,\n ClaudeCodeOptions,\n ParseErrorRecord,\n PermissionMode,\n} from \"./types\";\n\n/**\n * Run Claude Code in headless mode and return a trajectory.\n */\nexport async function runClaudeCode(\n config: ClaudeCodeAdapterConfig,\n): Promise<ClaudeCodeAdapterResult> {\n const startTs = Date.now();\n const spawned = await spawnClaude(config);\n\n const builder = new TrajectoryBuilder();\n const rawEvents: StreamEvent[] = [];\n const parseErrors: ParseErrorRecord[] = [];\n\n try {\n for await (const result of parseStreamJson(spawned.stdout)) {\n if (result.ok) {\n builder.consume(result.event);\n rawEvents.push(result.event);\n } else {\n parseErrors.push({\n line: result.rawLine,\n error: 
result.error.message,\n });\n }\n }\n\n const [{ exitCode, signal }, stderr] = await Promise.all([\n spawned.done,\n spawned.stderrCollected,\n ]);\n\n const diagnostics: AdapterDiagnostics = {\n exitCode,\n signal,\n stderr,\n parseErrors,\n timedOut: spawned.timedOut(),\n durationMs: Date.now() - startTs,\n };\n\n let view;\n try {\n view = builder.build();\n } catch (err) {\n const message = err instanceof Error ? err.message : String(err);\n throw new AdapterError(\n `harness produced no usable trajectory: ${message}`,\n diagnostics,\n );\n }\n\n return { view, diagnostics, rawEvents };\n } finally {\n await spawned.cleanup();\n }\n}\n\n/** Registered {@link HarnessAdapter} for Claude Code headless runs. */\nexport const claudeCodeAdapter: HarnessAdapter<ClaudeCodeAdapterConfig> = {\n id: \"claude-code\",\n run: runClaudeCode,\n};\n"],"mappings":";;;;;;;AAuMA,SAAgB,aAAa,GAAsC;CACjE,OAAO,EAAE,SAAS,YAAa,EAAsB,YAAY;AACnE;AAEA,SAAgB,cAAc,GAAuC;CACnE,OAAO,EAAE,SAAS,YAAa,EAAuB,YAAY;AACpE;AAEA,SAAgB,mBAAmB,GAA4C;CAC7E,OAAO,EAAE,SAAS;AACpB;AAEA,SAAgB,cAAc,GAAuC;CACnE,OAAO,EAAE,SAAS;AACpB;AAEA,SAAgB,SAAS,GAAkC;CACzD,OAAO,EAAE,SAAS;AACpB;AAEA,SAAgB,YAAY,GAAiC;CAC3D,OAAO,EAAE,SAAS;AACpB;AAEA,SAAgB,eAAe,GAAoC;CACjE,OAAO,EAAE,SAAS;AACpB;AAEA,SAAgB,kBAAkB,GAAuC;CACvE,OAAO,EAAE,SAAS;AACpB;;;;;;;;;;;;;;AC9FA,SAAgB,YAAY,UAAiC;CAC3D,IAAI,CAAC,SAAS,WAAW,OAAO,GAAG,OAAO;CAC1C,MAAM,QAAQ,SAAS,MAAM,IAAI;CACjC,IAAI,MAAM,SAAS,GAAG,OAAO;CAC7B,OAAO,GAAG,MAAM,GAAG,IAAI,MAAM;AAC/B;;;;;;;;;;;;;;;;;;;;;;;;;;;;;AC7FA,IAAa,oBAAb,MAA+B;CAC7B,OAAmC;CACnC,iBAAwC;CAExC,QAAiC,CAAC;CAClC,eAAmC,CAAC;;;;;CAMpC,+BAA8C,IAAI,IAAI;CAEtD,UAAiC,CAAC;CAElC,aAAmC;CACnC,eAAuB;CACvB,kBAA0B;CAC1B,gBAAwB;CACxB,kBAA0B;CAC1B,iBAAyB;CACzB,gBAAwB;;;;;;;CAQxB,QAAQ,OAA0B;EAChC,IAAI,aAAa,KAAK,GAAG;GACvB,KAAK,OAAO;IACV,WAAW,MAAM;IACjB,OAAO,MAAM;IACb,KAAK,MAAM;IACX,gBAAgB,MAAM;IACtB,gBAAgB,MAAM,SAAS,CAAC;IAChC,aAAa,MAAM,eAAe,CAAC,EAAA,CAAG,KAAK,OAAO;KAChD,MAAM,EAAE;KACR,QAAQ,EAAE;IACZ,EAAE;GACJ;GACA,KAAK,iBAAiB,KAAK,IAAI;GA
C/B;EACF;EAEA,IAAI,MAAM,SAAS,YAAY,MAAM,YAAY,aAAa;GAC5D,KAAK,QAAQ,KAAK;IAChB,UAAU,KAAK,iBAAiB,KAAK,IAAI,IAAI,KAAK,iBAAiB;IACnE,KAAK;GACP,CAAC;GACD;EACF;EAEA,IAAI,mBAAmB,KAAK,GAAG;GAC7B,KAAK,uBAAuB,KAAK;GACjC;EACF;EAEA,IAAI,cAAc,KAAK,GAAG;GACxB,KAAK,kBAAkB,KAAK;GAC5B;EACF;EAEA,IAAI,SAAS,KAAK,GAAG;GACnB,KAAK,iBAAiB;GACtB,KAAK,gBAAgB,MAAM;GAC3B,KAAK,aAAa,MAAM,SAAS;GACjC,KAAK,eAAe,MAAM,kBAAkB;GAC5C,KAAK,kBAAkB,MAAM,eAAe;GAC5C,KAAK,gBAAgB,MAAM,aAAa;GACxC,KAAK,kBAAkB,MAAM,UAAU;GACvC;EACF;CAGF;;;;;;;;CASA,QAAwB;EACtB,IAAI,KAAK,SAAS,MAChB,MAAM,IAAI,MACR,4JAEF;EAGF,MAAM,WAAW,KAAK,MAAM,KAAK,MAAM,SAAS;EAKhD,MAAM,kBAAkB,KAAK,MAC1B,KAAK,MAAM,EAAE,IAAI,CAAC,CAClB,QAAQ,MAAM,EAAE,SAAS,CAAC,CAAC,CAC3B,KAAK,MAAM,CAAC,CACZ,KAAK;EAER,OAAO;GACL,MAAM,KAAK;GACX,WAAW,KAAK;GAChB,OAAO,KAAK;GACZ,eAAe,mBAAmB,KAAK;GACvC,iBAAiB,UAAU,cAAc;GACzC,OAAO;IACL,aAAa,KAAK,YAAY,gBAAgB;IAC9C,cAAc,KAAK,YAAY,iBAAiB;IAChD,cAAc,KAAK;IACnB,YAAY,KAAK;IAEjB,UAAU,KAAK,iBAAiB,KAAK,MAAM;GAC7C;GACA,SAAS,KAAK;GAGd,SAAS,KAAK,kBAAkB,CAAC,KAAK;EACxC;CACF;CAIA,uBACE,OACM;EACN,MAAM,YAAY,KAAK,MAAM;EAC7B,MAAM,aAAuB,CAAC;EAC9B,MAAM,oBAAgC,CAAC;EAEvC,KAAK,MAAM,SAAS,MAAM,QAAQ,SAAS;GACzC,IAAI,YAAY,KAAK,GAAG;IACtB,WAAW,KAAK,MAAM,IAAI;IAC1B;GACF;GACA,IAAI,eAAe,KAAK,GAAG;IACzB,MAAM,OAAiB;KACrB,MAAM,MAAM;KACZ,WAAW,YAAY,MAAM,IAAI;KACjC,QAAQ,MAAM;KACd,MAAM,MAAM;KACZ,QAAQ;KACR,SAAS;KACT;KACA,WAAW,KAAK,aAAa;IAC/B;IACA,KAAK,aAAa,KAAK,IAAI;IAC3B,KAAK,aAAa,IAAI,MAAM,IAAI,IAAI;IACpC,kBAAkB,KAAK,IAAI;IAC3B;GACF;EAIF;EAEA,KAAK,MAAM,KAAK;GACd;GACA,MAAM,WAAW,KAAK,EAAE,CAAC,CAAC,KAAK;GAC/B,WAAW;GACX,YAAY,MAAM,QAAQ,eAAe;EAC3C,CAAC;CACH;CAEA,kBACE,OACM;EACN,MAAM,UAAU,MAAM,QAAQ;EAK9B,IAAI,OAAO,YAAY,UAAU;EAEjC,KAAK,MAAM,SAAS,SAAS;GAC3B,IAAI,CAAC,kBAAkB,KAAK,GAAG;GAE/B,MAAM,OAAO,KAAK,aAAa,IAAI,MAAM,WAAW;GACpD,IAAI,CAAC,MAIH;GAGF,KAAK,SAAS,MAAM;GACpB,KAAK,UAAU,MAAM,YAAY;GACjC,KAAK,aAAa,OAAO,MAAM,WAAW;EAC5C;CACF;AACF;;;;;;;;;AAUA,eAAsB,gBACpB,QACyB;CACzB,MAAM,UAAU,IAAI,kBAAkB;CACtC,WAAW,MAAM,SAAS,QACxB,QAAQ,QAAQ,KAAK;CAEvB,OAAO,QAAQ,MAAM;AACvB;;;;;;;;;;
;;;AC1NA,gBAAuB,gBACrB,QACyC;CACzC,IAAI,SAAS;CAGb,OAAO,YAAY,MAAM;CAEzB,WAAW,MAAM,SAAS,QAAQ;EAChC,UAAU;EAKV,IAAI;EACJ,QAAQ,aAAa,OAAO,QAAQ,IAAI,OAAO,IAAI;GACjD,MAAM,OAAO,OAAO,MAAM,GAAG,UAAU,CAAC,CAAC,KAAK;GAC9C,SAAS,OAAO,MAAM,aAAa,CAAC;GACpC,IAAI,KAAK,WAAW,GAAG;GACvB,MAAM,aAAa,IAAI;EACzB;CACF;CAKA,MAAM,WAAW,OAAO,KAAK;CAC7B,IAAI,SAAS,SAAS,GACpB,MAAM,aAAa,QAAQ;AAE/B;;;;;;;;;AAUA,SAAS,aAAa,MAA2B;CAC/C,IAAI;EAEF,OAAO;GAAE,IAAI;GAAM,OADL,KAAK,MAAM,IACF;GAAG,SAAS;EAAK;CAC1C,SAAS,KAAK;EACZ,OAAO;GACL,IAAI;GACJ,OAAO,eAAe,QAAQ,MAAM,IAAI,MAAM,OAAO,GAAG,CAAC;GACzD,SAAS;EACX;CACF;AACF;;;;;;;;;AC/BA,IAAa,eAAb,cAAkC,MAAM;CAGpB;CAFlB,YACE,SACA,aACA;EACA,MAAM,OAAO;EAFG,KAAA,cAAA;EAGhB,KAAK,OAAO;CACd;AACF;;;;AC3DA,SAAS,mBAAmB,MAAgB,MAAc,QAAyB;CACjF,IAAI,CAAC,QAAQ;CACb,KAAK,MAAM,SAAS,QAClB,KAAK,KAAK,MAAM,KAAK;AAEzB;;;;;AAMA,SAAS,iBACP,MACA,MACA,OACM;CACN,IAAI,UAAU,KAAA,GAAW;CACzB,IAAI,OAAO,UAAU,WAAW;EAC9B,IAAI,OAAO,KAAK,KAAK,IAAI;EACzB;CACF;CACA,KAAK,KAAK,MAAM,OAAO,KAAK,CAAC;AAC/B;;AAGA,SAAgB,sBACd,MACA,QACM;CACN,mBAAmB,MAAM,gBAAgB,OAAO,UAAU;CAC1D,mBAAmB,MAAM,gBAAgB,OAAO,UAAU;CAC1D,mBAAmB,MAAM,aAAa,OAAO,OAAO;CAEpD,iBAAiB,MAAM,gBAAgB,OAAO,SAAS;CACvD,iBAAiB,MAAM,WAAW,OAAO,KAAK;CAC9C,iBAAiB,MAAM,qBAAqB,OAAO,cAAc;CACjE,iBAAiB,MAAM,YAAY,OAAO,MAAM;CAChD,iBAAiB,MAAM,WAAW,OAAO,KAAK;CAC9C,iBAAiB,MAAM,oBAAoB,OAAO,aAAa;CAC/D,iBAAiB,MAAM,WAAW,OAAO,KAAK;CAC9C,iBAAiB,MAAM,cAAc,OAAO,QAAQ;CACpD,iBAAiB,MAAM,qBAAqB,OAAO,cAAc;CACjE,iBAAiB,MAAM,eAAe,OAAO,QAAQ;CACrD,iBAAiB,MAAM,oBAAoB,OAAO,YAAY;CAC9D,iBAAiB,MAAM,mBAAmB,OAAO,YAAY;CAC7D,iBAAiB,MAAM,wBAAwB,OAAO,gBAAgB;CACtE,iBAAiB,MAAM,0BAA0B,OAAO,kBAAkB;CAC1E,iBACE,MACA,+BACA,OAAO,sBACT;CACA,iBAAiB,MAAM,WAAW,OAAO,KAAK;CAC9C,iBAAiB,MAAM,gBAAgB,OAAO,SAAS;CAEvD,IAAI,OAAO,gBAAgB,OAAO,aAAa,SAAS,GACtD,KAAK,KAAK,kBAAkB,OAAO,aAAa,KAAK,GAAG,CAAC;CAG3D,IAAI,OAAO,mBAAmB,OAAO,gBAAgB,SAAS,GAC5D,KAAK,KAAK,qBAAqB,OAAO,gBAAgB,KAAK,GAAG,CAAC;CAGjE,iBAAiB,MAAM,uBAAuB,OAAO,eAAe;CACpE,iBAAiB,MAAM,yBAAyB,OAAO,iBAAiB;CACxE,iBAAiB,MAAM,4BAA4B,OAAO,oBAAoB;CAC9E,iBAAiB,MAAM,4BAA4B,OA
AO,oBAAoB;CAC9E,iBAAiB,MAAM,UAAU,OAAO,IAAI;CAC5C,iBAAiB,MAAM,eAAe,OAAO,QAAQ;CACrD,iBACE,MACA,wCACA,OAAO,+BACT;CACA,iBACE,MACA,kCACA,OAAO,0BACT;AACF;;;;;;;AAQA,SAAgB,UAAU,QAA2C;CACnE,MAAM,OAAiB;EACrB;EACA,OAAO;EACP;EACA;EACA;CACF;CAEA,sBAAsB,MAAM,MAAM;CAElC,OAAO;AACT;;;;;;;AAQA,SAAgB,eACd,QACA,SAAiD,CAAC,GACxC;CACV,MAAM,OAAiB;EAAC;EAAM;EAAQ;EAAmB;CAAM;CAC/D,MAAM,iBAAiB,OAAO,kBAAkB;CAChD,sBAAsB,MAAM;EAC1B,GAAG;EACH;CACF,CAAC;CACD,OAAO;AACT;;;;;;;;;;;;;;;;;;ACvGA,MAAM,qBAAqB,MAAS;;;;;;AAOpC,MAAM,gBAAgB;;;;;;;;;;;;;AA+BtB,eAAsB,YACpB,QACwB;CACxB,MAAM,SAAS,OAAO,UAAU;CAChC,MAAM,OAAO,UAAU,MAAM;CAM7B,MAAM,gBAJgB,OAAO,kBAAkB,QAK3C,MAAM,QAAQ,KAAK,OAAO,GAAG,eAAe,CAAC,IAC7C;CAEJ,MAAM,MAA0C;EAC9C,GAAG,QAAQ;EACX,GAAG,OAAO;CACZ;CACA,IAAI,eAEF,IAAI,oBAAoB;CAG1B,MAAM,QAAQ,MAAM,QAAQ,MAAM;EAChC,KAAK,OAAO,OAAO,QAAQ,IAAI;EAC/B;EACA,OAAO;GAAC;GAAU;GAAQ;EAAM;EAIhC,UAAU;CACZ,CAAC;CAKD,IAAI,WAAW;CACf,IAAI,iBAAwC;CAC5C,MAAM,YAAY,OAAO,aAAa;;;;;CAMtC,MAAM,+BAA+B;EACnC,IAAI,gBAAgB,aAAa,cAAc;EAC/C,iBAAiB,iBACT,SAAS,OAAO,SAAS,GAC/B,aACF;CACF;CAEA,MAAM,eAAe,iBAAiB;EACpC,WAAW;EACX,SAAS,OAAO,SAAS;EACzB,uBAAuB;CACzB,GAAG,SAAS;CAGZ,MAAM,gBAAgB;EACpB,SAAS,OAAO,SAAS;EACzB,uBAAuB;CACzB;CACA,OAAO,QAAQ,iBAAiB,SAAS,SAAS,EAAE,MAAM,KAAK,CAAC;CAKhE,MAAM,eAAyB,CAAC;CAChC,MAAM,QAAQ,YAAY,MAAM;CAChC,MAAM,QAAQ,GAAG,SAAS,UAAkB;EAC1C,aAAa,KAAK,KAAK;CACzB,CAAC;CAED,MAAM,kBAAkB,IAAI,SAAiB,YAAY;EACvD,MAAM,iBAAiB,QAAQ,aAAa,KAAK,EAAE,CAAC;EACpD,MAAM,QAAQ,GAAG,OAAO,QAAQ;EAGhC,MAAM,QAAQ,GAAG,SAAS,QAAQ;CACpC,CAAC;CAKD,MAAM,OAAO,IAAI,SAGb,YAAY;EACd,IAAI,UAAU;EACd,MAAM,YACJ,UACA,WACG;GACH,IAAI,SAAS;GACb,UAAU;GAEV,aAAa,YAAY;GACzB,IAAI,gBAAgB,aAAa,cAAc;GAC/C,OAAO,QAAQ,oBAAoB,SAAS,OAAO;GACnD,QAAQ;IAAE;IAAU;GAAO,CAAC;EAC9B;EAEA,MAAM,GAAG,UAAU,MAAM,WAAW,SAAS,MAAM,MAAM,CAAC;EAE1D,MAAM,GAAG,eAAe,SAAS,MAAM,IAAI,CAAC;CAC9C,CAAC;CAGD,MAAM,UAAU,YAAY;EAC1B,IAAI,CAAC,eAAe;EACpB,IAAI;GACF,MAAM,GAAG,eAAe;IAAE,WAAW;IAAM,OAAO;GAAK,CAAC;EAC1D,QAAQ,CAGR;CACF;CAKA,OAAO;EACL,QAAQ,MAAM;EACd;EACA;EACA,gBAAgB;EAChB;CACF;AACF;;;;;;;;;;;;;;;;;AAkBA,SAA
S,SAAS,OAAqB,QAA8B;CACnE,IAAI,MAAM,QAAQ,KAAA,GAAW;CAC7B,IAAI;EAEF,QAAQ,KAAK,CAAC,MAAM,KAAK,MAAM;CACjC,QAAQ;EACN,IAAI;GAEF,MAAM,KAAK,MAAM;EACnB,QAAQ,CAER;CACF;AACF;;;;;;;;;;;;;;AC/LA,eAAsB,cACpB,QACkC;CAClC,MAAM,UAAU,KAAK,IAAI;CACzB,MAAM,UAAU,MAAM,YAAY,MAAM;CAExC,MAAM,UAAU,IAAI,kBAAkB;CACtC,MAAM,YAA2B,CAAC;CAClC,MAAM,cAAkC,CAAC;CAEzC,IAAI;EACF,WAAW,MAAM,UAAU,gBAAgB,QAAQ,MAAM,GACvD,IAAI,OAAO,IAAI;GACb,QAAQ,QAAQ,OAAO,KAAK;GAC5B,UAAU,KAAK,OAAO,KAAK;EAC7B,OACE,YAAY,KAAK;GACf,MAAM,OAAO;GACb,OAAO,OAAO,MAAM;EACtB,CAAC;EAIL,MAAM,CAAC,EAAE,UAAU,UAAU,UAAU,MAAM,QAAQ,IAAI,CACvD,QAAQ,MACR,QAAQ,eACV,CAAC;EAED,MAAM,cAAkC;GACtC;GACA;GACA;GACA;GACA,UAAU,QAAQ,SAAS;GAC3B,YAAY,KAAK,IAAI,IAAI;EAC3B;EAEA,IAAI;EACJ,IAAI;GACF,OAAO,QAAQ,MAAM;EACvB,SAAS,KAAK;GAEZ,MAAM,IAAI,aACR,0CAFc,eAAe,QAAQ,IAAI,UAAU,OAAO,GAAG,KAG7D,WACF;EACF;EAEA,OAAO;GAAE;GAAM;GAAa;EAAU;CACxC,UAAU;EACR,MAAM,QAAQ,QAAQ;CACxB;AACF;;AAGA,MAAa,oBAA6D;CACxE,IAAI;CACJ,KAAK;AACP"}